AI is nearly on par with radiologists but is not yet ready to replace them (Photo courtesy of the British Medical Journal)
AI underperforms on UK radiology training exam
December 30, 2022
by John R. Fischer, Senior Reporter
Despite demonstrating significant accuracy and passing some mock versions of a standard radiology exam, AI is still not a viable replacement for radiologists.
British researchers tested an AI system in 10 mock tests to see if it could pass the “rapid reporting” part of the Fellowship of the Royal College of Radiologists exam. Across all 10, the system had an accuracy of 79.5%, compared to an average of 84.8% among actual human radiologists.
"The artificial intelligence candidate would still need further training to achieve the same level of performance and skill of an average recently FRCR-qualified radiologist," the researchers said in their study. They also said more training on "subtle musculoskeletal abnormalities" is required.
The exam, which U.K. radiologists must pass as part of their training, is made up of three parts. The rapid reporting section requires interpretations of 30 radiographs in 35 minutes, with participants needing to achieve at least 90% accuracy to pass.
The AI system was SmartUrgences, a commercially available tool developed by the French company Milvue. Its results were compared with those of 26 radiologists who had taken and passed the real FRCR exam within the previous year.
When noninterpretable images were excluded, the AI system scored higher than 90% on two of the 10 mock tests, while the radiologists passed four of the 10 on average. Despite its "relatively high accuracy," the AI outscored the radiologists on only one of the mock exams, according to the researchers.
The AI tool showed 83.6% sensitivity and 75.2% specificity, compared with 84.1% and 87.3%, respectively, among human radiologists.
On the radiographs that more than 90% of radiologists interpreted correctly (148 of 300), the AI system was also correct 91% of the time (134 of 148). On images that most radiologists interpreted incorrectly, the AI system was wrong about 50% of the time, performing best on hands, carpal bones and feet.
The researchers say that more training in analyzing areas considered noninterpretable, such as the abdomen and axial skeleton, is essential. They add that while AI offers "untapped potential" for diagnostic efficiency and accuracy, there is still a need to educate "physicians and the public better about the limitations of artificial intelligence, and making these more transparent."
The authors included clinicians from Great Ormond Street Hospital for Children; NIHR Great Ormond Street Hospital Biomedical Research Centre; St George’s Hospital; University Hospitals of Morecambe Bay NHS Trust, Royal Lancaster Infirmary; University of Cambridge; and Royal Papworth Hospital.
The findings were published in the British Medical Journal.