A new study finds that using AI to reconstruct medical images can introduce errors, leading to false positives or false negatives that may harm patients.

New research provides a sobering reality check for AI optimists

May 14, 2020
by John R. Fischer, Senior Reporter
For all the revolutionary promise that artificial intelligence holds for radiology, a new study has found that existing AI can alter image data in ways that introduce substantial errors and inaccuracies.

Researchers at the University of Cambridge and Simon Fraser University say medical image reconstruction algorithms can create myriad artifacts and other instabilities that lead to false positives or false negatives in diagnosis and can potentially harm patients.

"There is no immediate fix, and that is the problem," Dr. Anders Hansen, from Cambridge's department of applied mathematics and theoretical physics, told HCB News. "One can describe mathematically why these algorithms become unstable."

AI techniques could, in theory, improve image quality by reconstructing low-resolution scans into high-resolution ones. This would cut exam time, thereby reducing risks to individual patients and increasing the overall number of scans that can be performed daily. AI-based reconstruction relies on the prior datasets the algorithm is trained on, whereas classical reconstruction is based solely on mathematical theory and does not depend on previous data.
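To make the distinction concrete, here is a minimal sketch of the classical side of that comparison: recovering an image from undersampled MR-style measurements using only a fixed mathematical transform (a zero-filled inverse FFT), with no training data involved. The phantom image, sampling scheme, and function names are illustrative assumptions, not taken from the study; a learned method would replace the inverse transform with a network trained on prior images.

```python
import numpy as np

def undersample_kspace(image, keep_fraction=0.5, seed=0):
    """Simulate an accelerated MR scan by zeroing out a subset of k-space rows."""
    kspace = np.fft.fft2(image)
    rng = np.random.default_rng(seed)
    mask = rng.random(image.shape[0]) < keep_fraction  # rows to keep
    kspace[~mask, :] = 0
    return kspace, mask

def classical_reconstruction(kspace):
    """Model-based recovery: a fixed inverse transform, no training data."""
    return np.abs(np.fft.ifft2(kspace))

# Toy "phantom" with a single bright square as the detail of interest.
image = np.zeros((64, 64))
image[24:40, 24:40] = 1.0

kspace, mask = undersample_kspace(image)
recon = classical_reconstruction(kspace)

# Any reconstruction error here comes purely from the missing measurements.
print("k-space rows kept:", int(mask.sum()), "of 64")
print("reconstruction RMSE: %.3f" % np.sqrt(np.mean((recon - image) ** 2)))
```

Because the classical method is a fixed linear transform, its behavior is fully described by mathematical theory; the study's concern is that learned replacements for this step lack comparable guarantees.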

Hansen and his colleagues from Norway, Portugal, Canada and the U.K. applied a series of basic computational mathematical tools and tests to search for flaws within AI-based medical imaging systems, including MR, CT and NMR.

Among their findings were instabilities associated with tiny perturbations or movements, which appeared as myriad artifacts in the final images. They also encountered instabilities related to small structural changes, such as blurring or complete removal of details, and deterioration in reconstruction quality due to repeated subsampling.
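The perturbation test described above can be sketched in a few lines: reconstruct the same measurements with and without a tiny added perturbation and compare how much the outputs diverge. This toy version (an assumption for illustration, not the researchers' code) uses a zero-filled inverse FFT, which is stable by construction; the study's point is that learned reconstructions can amplify such perturbations far more dramatically.

```python
import numpy as np

def stability_ratio(reconstruct, measurements, perturbation):
    """Output change divided by input change: a local stability estimate."""
    baseline = reconstruct(measurements)
    perturbed = reconstruct(measurements + perturbation)
    out_change = np.linalg.norm(perturbed - baseline)
    in_change = np.linalg.norm(perturbation)
    return out_change / in_change

def zero_filled_ifft(kspace):
    """A classical, linear reconstruction map."""
    return np.fft.ifft2(kspace)

# Simulated measurements plus a tiny, imperceptible perturbation.
kspace = np.fft.fft2(np.random.default_rng(1).random((32, 32)))
perturbation = 1e-6 * np.random.default_rng(2).random((32, 32))

# For this linear transform the ratio stays well below 1: the
# perturbation is not amplified. An unstable learned method can turn
# the same tiny input change into visible artifacts.
print("stability ratio: %.3f" % stability_ratio(zero_filled_ifft, kspace, perturbation))
```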

The errors were widespread across different types of neural networks, indicating that fixing them will not be easy. The findings also raise the concern that the most troubling errors are those a radiologist could mistake for genuine medical issues, as opposed to ones that can simply be dismissed as technical artifacts.

"The problem is to get stability and optimal performance of the recovery algorithm, but these properties have to be balanced," said Hansen. "One can mathematically show that the new AI techniques overperform in certain cases and that is the reason why they become unstable. In particular, there is a limit to how well an algorithm can reconstruct an image from the undersampled MR data. In fact, if the AI algorithm does too well on only two images, the result is instability. The problem is that it is likely that the optimal stable method can never be learned although it theoretically exists. In particular, modern AI techniques will not be able to determine their own limitations. This we have not proved yet, however, it is work in progress."

Hansen led the research with Dr. Ben Adcock from Simon Fraser University. The team is now using trial-and-error research to decipher the fundamental limits of AI techniques, in order to show radiologists which problems can be solved with them.

The new AI techniques assessed in the paper are not FDA approved and are not in clinical use.

The findings were published in the Proceedings of the National Academy of Sciences.