William Hersh

Overview of TREC 2025 Biomedical Generative Retrieval (BioGen) Track

Mar 23, 2026

Deepak Gupta, Dina Demner-Fushman, William Hersh, Steven Bedrick, Kirk Roberts

Abstract: Recent advances in large language models (LLMs) have made significant progress across multiple biomedical tasks, including biomedical question answering, lay-language summarization of the biomedical literature, and clinical note summarization. These models have demonstrated strong capabilities in processing and synthesizing complex biomedical information and in generating fluent, human-like responses. Despite these advancements, hallucinations or confabulations remain key challenges when using LLMs in biomedical and other high-stakes domains. Inaccuracies may be particularly harmful in high-risk situations, such as medical question answering, making clinical decisions, or appraising biomedical research. Studies on the evaluation of the LLMs' abilities to ground generated statements in verifiable sources have shown that models perform significantly

Bridge2AI: Building A Cross-disciplinary Curriculum Towards AI-Enhanced Biomedical and Clinical Care

May 20, 2025

John Rincon, Alexander R. Pelletier, Destiny Gilliland, Wei Wang, Ding Wang, Baradwaj S. Sankar, Lori Scott-Sheldon, Samson Gebreab, William Hersh, Parisa Rashidi(+9 more)

Abstract: Objective: As AI becomes increasingly central to healthcare, there is a pressing need for bioinformatics and biomedical training systems that are personalized and adaptable. Materials and Methods: The NIH Bridge2AI Training, Recruitment, and Mentoring (TRM) Working Group developed a cross-disciplinary curriculum grounded in collaborative innovation, ethical data stewardship, and professional development within an adapted Learning Health System (LHS) framework. Results: The curriculum integrates foundational AI modules, real-world projects, and a structured mentee-mentor network spanning Bridge2AI Grand Challenges and the Bridge Center. Guided by six learner personas, the program tailors educational pathways to individual needs while supporting scalability. Discussion: Iterative refinement driven by continuous feedback ensures that content remains responsive to learner progress and emerging trends. Conclusion: With over 30 scholars and 100 mentors engaged across North America, the TRM model demonstrates how adaptive, persona-informed training can build interdisciplinary competencies and foster an integrative, ethically grounded AI education in biomedical contexts.

Generative Artificial Intelligence: Implications for Biomedical and Health Professions Education

Jan 17, 2025

William Hersh

Abstract: Generative AI has had a profound impact on biomedicine and health, both in professional work and in education. Based on large language models (LLMs), generative AI has been found to perform as well as humans in simulated situations taking medical board exams, answering clinical questions, solving clinical cases, applying clinical reasoning, and summarizing information. Generative AI is also being used widely in education, performing well in academic courses and their assessments. This review summarizes the successes of LLMs and highlights some of their challenges in the context of education, most notably aspects that may undermine the acquisition of knowledge and skills for professional work. It then provides recommendations for best practices for overcoming the shortcomings of LLM use in education. Although there are challenges for use of generative AI in education, all students and faculty, in biomedicine and health and beyond, must understand generative AI and be competent in its use.

Overview of TREC 2024 Biomedical Generative Retrieval (BioGen) Track

Nov 27, 2024

Deepak Gupta, Dina Demner-Fushman, William Hersh, Steven Bedrick, Kirk Roberts

Abstract: With the advancement of large language models (LLMs), the biomedical domain has seen significant progress in multiple tasks such as biomedical question answering, lay language summarization of the biomedical literature, and clinical note summarization. However, hallucinations or confabulations remain one of the key challenges when using LLMs in the biomedical and other domains. Inaccuracies may be particularly harmful in high-risk situations, such as making clinical decisions or appraising biomedical research. Studies on the evaluation of the LLMs' abilities to ground generated statements in verifiable sources have shown that models perform significantly worse on lay-user generated questions, and often fail to reference relevant sources. This can be problematic when those seeking information want evidence from studies to back up the claims from LLMs [3]. Unsupported statements are a major barrier to using LLMs in any applications that may affect health. Methods for grounding generated statements in reliable sources along with practical evaluation approaches are needed to overcome this barrier. Towards this, in our pilot task organized at TREC 2024, we introduced the task of reference attribution as a means to mitigate the generation of false statements by LLMs answering biomedical questions.
