IRP findings shed light on risks and benefits of integrating AI into medical decision-making
AI model scored well on medical diagnostic quiz, but made mistakes explaining answers
Researchers at the National Institutes of Health (NIH) found that an artificial intelligence (AI) model solved medical quiz questions with high accuracy. The questions are designed to test health professionals’ ability to diagnose patients based on clinical images and a brief text summary. However, physician-graders found that the AI model made mistakes when describing images and explaining how its decision-making led to the correct answer. The findings, which shed light on AI’s potential in the clinical setting, were published in npj Digital Medicine. The study was led by researchers from NIH’s National Library of Medicine (NLM) and Weill Cornell Medicine, New York City.
“Integration of AI into health care holds great promise as a tool to help medical professionals diagnose patients faster, allowing them to start treatment sooner,” said NLM Acting Director Stephen Sherry, Ph.D. “However, as this study shows, AI is not advanced enough yet to replace human experience, which is crucial for accurate diagnosis.”
The AI model and human physicians answered questions from the Image Challenge of the New England Journal of Medicine (NEJM). The challenge is an online quiz that provides real clinical images and a short text description that includes details about the patient’s symptoms and presentation, then asks users to choose the correct diagnosis from multiple-choice answers.
This page was last updated on Tuesday, July 23, 2024