A new study raises concerns about the influence that findings in AI trials have on patient safety.
In a study of their own, a team of British and American researchers argues that trials claiming AI matches or exceeds the accuracy of human experts in interpreting medical images are often of poor quality and arguably exaggerated, and that the design and reporting standards for these studies need to change.
“Many arguably exaggerated claims exist about equivalence with (or superiority over) clinicians, which presents a potential risk for patient safety and population health at the societal level,” they said in a statement.
While AI and deep learning hold the potential to improve both patient care and provider workflow, the researchers say that studies on their use have been hyped by the media, leading to rapid implementation with little assessment of the methods involved or of the risks of bias these studies may incur through flaws in their design and conduct.
The team reviewed the findings of studies published over the past decade that compared the performance of deep learning algorithms in medical imaging with that of expert clinicians.
They identified two eligible randomized clinical trials and 81 non-randomized studies. Of the non-randomized group, only nine were prospective and just six were tested in a "real world" clinical setting. The average number of human experts in the comparator group was four, and access to raw data and code for independent scrutiny of findings was severely limited.
Problems in study design that could influence results put 58 of the 81 non-randomized studies at high risk of bias, and adherence to recognized reporting standards was poor. In addition, 61 stated that the performance of AI was at least comparable to (or better than) that of clinicians, and only 31 noted that further prospective studies or trials were needed.
The authors claim that over-promising language "leaves studies susceptible to being misinterpreted by the media and the public, and as a result the possible provision of inappropriate care that does not necessarily align with patients' best interests. Maximising patient safety will be best served by ensuring that we develop a high quality and transparently reported evidence base moving forward.”
The researchers acknowledge some limitations, including the possibility of missed studies and the focus on deep learning in medical imaging, meaning the results may not apply to other types of AI.