Adam Rodman

ORCID: 0000-0001-8452-0692
Research Areas
  • Artificial Intelligence in Healthcare and Education
  • Clinical Reasoning and Diagnostic Skills
  • Innovations in Medical Education
  • Social Media in Health Education
  • Machine Learning in Healthcare
  • Radiology practices and education
  • Innovations in Educational Methods
  • Radio, Podcasts, and Digital Media
  • Healthcare Systems and Technology
  • Electronic Health Records Systems
  • Healthcare cost, quality, practices
  • Cardiac, Anesthesia and Surgical Outcomes
  • Autopsy Techniques and Outcomes
  • Health and Medical Research Impacts
  • COVID-19 diagnosis using AI
  • Cardiac Arrest and Resuscitation
  • Surgical Simulation and Training
  • Primary Care and Health Outcomes
  • Obesity and Health Practices
  • Obesity, Physical Activity, Diet
  • Health Literacy and Information Accessibility
  • Healthcare Policy and Management
  • Biomedical and Engineering Education
  • Empathy and Medical Education
  • Organ Donation and Transplantation

Beth Israel Deaconess Medical Center
2017-2025

Hadassah Medical Center
2023-2025

Harvard University
2017-2025

George Washington University
2024

University of Colorado Hospital
2024

University of Colorado Denver
2024

University of Alberta
2024

Massachusetts General Hospital
2024

ORCID
2024

Boston University
2024

This study assesses the diagnostic accuracy of the Generative Pre-trained Transformer 4 (GPT-4) artificial intelligence (AI) model in a series of challenging cases.

10.1001/jama.2023.8288 article EN JAMA 2023-06-15

Interview with Adam Rodman on the potential effects of generative artificial intelligence on medical education and clinical practice. (09:51) Artificial intelligence could have broad implications for medical education. Educators lead the way when it comes to integrating this technology into...

10.1056/nejmp2304993 article EN New England Journal of Medicine 2023-07-29

Importance Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves physician diagnostic reasoning. Objective To assess the effect of an LLM on physicians’ diagnostic reasoning compared with conventional resources. Design, Setting, and Participants A single-blind randomized clinical trial was conducted from November 29 to December 29, 2023. Using remote video conferencing and in-person...

10.1001/jamanetworkopen.2024.40969 article EN cc-by-nc-nd JAMA Network Open 2024-10-28

This cross-sectional study assesses the ability of a large language model to process medical data and display clinical reasoning compared with attending physicians and residents.

10.1001/jamainternmed.2024.0295 article EN JAMA Internal Medicine 2024-04-01

The U.S. government recently took steps to ensure that clinical decision support algorithms are safe for use. The next and larger step will be teaching physicians how to use the algorithms effectively.

10.1056/nejmp2304839 article EN New England Journal of Medicine 2023-08-05

ABSTRACT Importance Diagnostic errors are common and cause significant morbidity. Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves diagnostic reasoning. Objective To assess the impact of the GPT-4 LLM on physicians’ diagnostic reasoning compared to conventional resources. Design Multi-center, randomized clinical vignette study. Setting The study was conducted using remote video...

10.1101/2024.03.12.24303785 preprint EN cc-by-nc-nd medRxiv (Cold Spring Harbor Laboratory) 2024-03-14

This comparative effectiveness research assesses the performance of newer open-source large language models (LLMs) with that of closed-source proprietary LLMs.

10.1001/jamahealthforum.2025.0040 article EN cc-by-nc-nd JAMA Health Forum 2025-03-14

This diagnostic study compares the performance of artificial intelligence (AI) with that of human clinicians in estimating the probability of diagnoses before and after testing.

10.1001/jamanetworkopen.2023.47075 article EN cc-by-nc-nd JAMA Network Open 2023-12-11

Purpose Medical podcasts have grown in popularity, but little is known about their didactic methods. This study sought to systematically describe the pedagogical approach employed by the 100 most popular medical podcasts in the United States. This study also aimed to assess factors related to quality control and conflicts of interest in podcasting. Methods The authors averaged the rank positions for podcasts in the Apple Medicine category in the United States from 06/01/18 to 09/30/20 to generate a list of the highest-ranked podcasts. They developed and validated a categorization...

10.1080/0142159x.2022.2071691 article EN Medical Teacher 2022-05-08

Background: While large language models (LLMs) are being increasingly deployed for clinical decision support, existing evaluation methods like medical licensing exams fail to capture critical aspects of clinical reasoning, including reasoning in dynamic clinical circumstances. Script Concordance Testing (SCT), a decades-old assessment tool, offers a nuanced way to assess how new information influences diagnostic and therapeutic decisions under uncertainty. Methods: We developed a comprehensive, publicly available benchmark...

10.1101/2025.02.11.25321822 preprint EN medRxiv (Cold Spring Harbor Laboratory) 2025-02-12

Improved performance of large language models (LLMs) on traditional reasoning assessments has led to benchmark saturation. This has spurred efforts to develop new benchmarks, including synthetic computational simulations of clinical practice involving multiple AI agents. We argue that it is crucial to ground such efforts in extensive human validation. We conclude by providing four recommendations for researchers to better evaluate LLMs in clinical practice.

10.1056/aie2500143 article EN NEJM AI 2025-03-25

Background: General-purpose large language models that utilize both text and images have not been evaluated on a diverse array of challenging medical cases. Methods: Using 934 cases from the NEJM Image Challenge published between 2005 and 2023, we evaluated the accuracy of the recently released Generative Pre-trained Transformer 4 with Vision model (GPT-4V) compared to human respondents overall and stratified by question difficulty, image type, and skin tone. We further conducted a physician evaluation of GPT-4V on 69...

10.48550/arxiv.2311.05591 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Abstract Background There has been a shift in postgraduate medical education towards digital educational resources (podcasts, videos, social media and other formats) consumed asynchronously and apart from formal curricula. It is unclear what drives residents to select and use these resources. Understanding how and why residents choose resources can aid programme directors and faculty in optimising residents' informal learning time. Method This focus group study was conducted with residents at two US internal medicine residency...

10.1111/tct.13722 article EN The Clinical Teacher 2024-01-17

This perspective summarizes the episode of NEJM AI Grand Rounds in which Dr. Adam Rodman joins cohosts Drs. Arjun Manrai and Andrew Beam for a wide-ranging conversation about the history and future of medical diagnosis. Drawing on his experience as a historian of the epistemology of clinical reasoning (that is, how doctors "know" things about diseases and their patients), Dr. Rodman places our current discussion of the diagnostic abilities of large language models (LLMs) into the century-long context of attempts to build artificial intelligence....

10.1056/aip2400707 article EN NEJM AI 2024-08-08

Large language model (LLM) artificial intelligence (AI) systems have shown promise in diagnostic reasoning, but their utility in management reasoning with no clear right answers is unknown. To determine whether LLM assistance improves physician performance on open-ended management reasoning tasks compared to conventional resources. Prospective, randomized controlled trial conducted from 30 November 2023 to 21 April 2024. Multi-institutional study from Stanford University, Beth Israel Deaconess Medical Center, and the...

10.1101/2024.08.05.24311485 preprint EN cc-by-nd medRxiv (Cold Spring Harbor Laboratory) 2024-08-07

This comparative effectiveness research study examines the association between racial differences in pain assessment and false beliefs about the biologization of race by large language models compared with a human baseline.

10.1001/jamanetworkopen.2024.37977 article EN cc-by-nc-nd JAMA Network Open 2024-10-07