AI shows similar accuracy to human radiologists in detecting diseases

AI shows similar accuracy to humans in disease detection

September 27, 2019
by John R. Fischer, Senior Reporter
AI is about as accurate as healthcare professionals at detecting diseases from medical images.

That’s the conclusion of researchers at University Hospitals Birmingham NHS Foundation Trust in the U.K., who say they have conducted the first systematic review and meta-analysis of all available evidence in the scientific literature on the question. The group warns, however, that the finding should be treated with caution: only a few studies were of high enough quality to be included in the analysis, and the true potential of deep learning remains uncertain.

“We reviewed over 20,500 articles, but less than one percent of these were sufficiently robust in their design and reporting that independent reviewers had high confidence in their claims,” said professor Alastair Denniston from University Hospitals Birmingham NHS Foundation Trust, U.K., in a statement. “What’s more, only 25 studies validated the AI models externally (using medical images from a different population), and just 14 studies actually compared the performance of AI and health professionals using the same test sample.”

Deep learning lets computers learn patterns from thousands of medical images, with the aim of more accurate and faster diagnoses. Such capabilities have generated enthusiasm around the prospect of deep learning models outperforming humans in diagnostic exams, and have led to more than 30 AI algorithms gaining FDA approval in healthcare. Concerns have also arisen, however, about the objectivity of these studies, with many questioning whether they are biased in favor of machine learning and to what degree their findings apply to real-world clinical practice.

To address this, Denniston led a systematic review and meta-analysis of 82 studies published between January 2012 and June 2019 that compared the performance of deep learning models and healthcare professionals in detecting diseases from medical imaging. His team also assessed study design, reporting, and clinical value. Of the 82 studies, 69 provided enough information to calculate test performance accurately, and their data were analyzed. The meta-analysis pooled estimates from the 25 articles that validated their findings on an independent set of images.

Data from the 14 studies that tested AI and healthcare professionals on the same sample showed that, at best, deep learning algorithms correctly detected disease in 87 percent of cases, compared with 86 percent for healthcare professionals. The algorithms also showed 93 percent specificity, meaning they correctly excluded patients who did not have disease, compared with 91 percent for healthcare professionals.
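
For readers less familiar with the two measures, the sketch below shows how sensitivity (the share of diseased cases correctly detected) and specificity (the share of disease-free cases correctly excluded) are typically computed from a two-by-two table of test results. The counts are hypothetical, chosen only so the output matches the AI figures quoted above; they are not data from the study, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of patients with disease that the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of patients without disease that the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for 100 diseased and 100 disease-free patients.
# These are NOT study data; they only reproduce the quoted percentages.
true_pos, false_neg = 87, 13
true_neg, false_pos = 93, 7

print(f"sensitivity: {sensitivity(true_pos, false_neg):.0%}")
print(f"specificity: {specificity(true_neg, false_pos):.0%}")
```

Both measures depend on all four cells of the table, which is one reason the reviewers could only analyze studies that reported enough information to reconstruct them.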

While the results are promising, Dr. Xiaoxuan Liu, another author from the University of Birmingham, U.K., cautions that the AI algorithms did not “substantially outperform” humans. Liu adds that several limitations in the methodology and reporting of AI diagnostic studies, which carry over into the analysis itself, must be addressed. These include the assessment of deep learning in isolation, in a way that does not reflect clinical practice; the small number of prospective studies completed in real clinical environments; the need for high-quality comparisons in patients, not only in datasets, to determine diagnostic accuracy; and poor reporting, which is common, with most studies not reporting missing data.

“There is an inherent tension between the desire to use new, potentially lifesaving diagnostics and the imperative to develop high-quality evidence in a way that can benefit patients and health systems in clinical practice,” Liu said in a statement. “A key lesson from our work is that in AI, as with any other part of healthcare, good study design matters. Without it, you can easily introduce bias, which skews your results. These biases can lead to exaggerated claims of good performance for AI tools which do not translate into the real world.”

The findings were published in The Lancet Digital Health.