Newswise — Artificial intelligence and deep learning models are powerful tools in cancer treatment. They can be used to analyze digital images of tumor biopsy samples, helping physicians quickly classify the type of cancer, predict prognosis and guide a course of treatment for the patient. However, unless these algorithms are properly calibrated, they can sometimes make inaccurate or biased predictions.
A new study led by researchers from the University of Chicago shows that deep learning models trained on large sets of cancer genetic and tissue histology data can easily identify the institution that submitted the images. The models, which use machine learning methods to “teach” themselves how to recognize certain cancer signatures, end up using the submitting site as a shortcut to predicting outcomes for patients, lumping them together with other patients from the same location instead of relying on the biology of each individual. This in turn may lead to bias and missed opportunities for treatment in patients from racial or ethnic minority groups, who may be more likely to be represented in certain medical centers and already struggle with access to care.
“We identified a glaring hole in the current methodology for deep learning model development which makes certain regions and patient populations more susceptible to be included in inaccurate algorithmic predictions,” said Alexander Pearson, MD, PhD, Assistant Professor of Medicine at UChicago Medicine and co-senior author. The study was published July 20 in Nature Communications.
One of the first steps in treatment for a cancer patient is taking a biopsy, or small tissue sample of a tumor. A very thin slice of the tumor is affixed to a glass slide, which is stained with multicolored dyes for review by a pathologist to make a diagnosis. Digital images can then be created for storage and remote analysis by using a scanning microscope. While these steps are mostly standard across pathology labs, minor variations in the color or amount of stain, in tissue processing techniques and in the imaging equipment can create unique signatures, like tags, on each image. These location-specific signatures aren’t visible to the naked eye, but are easily detected by powerful deep learning algorithms.
These algorithms have the potential to be a valuable tool for allowing physicians to quickly analyze a tumor and guide treatment options, but the introduction of this kind of bias means that the models aren’t always basing their analysis on the biological signatures they see in the images, but rather on the image artifacts generated by differences between submitting sites.
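To make the idea of a site-specific signature concrete, here is a minimal toy sketch in Python. It is not the study's method and uses no real histology data: the average slide color, the size of the stain shift between two hypothetical sites "A" and "B", and the noise level are all assumed values chosen for illustration. It shows only that a very simple classifier can tell two sites apart from a small, consistent color offset.

```python
import random

# Assumed average RGB color of a stained slide (toy value, not measured).
BASE_COLOR = (0.70, 0.45, 0.65)


def simulate_site(n_slides, stain_shift, rng, noise_sd=0.01):
    """Return one mean-RGB tuple per slide for a hypothetical site."""
    slides = []
    for _ in range(n_slides):
        color = tuple(
            base + shift + rng.gauss(0.0, noise_sd)
            for base, shift in zip(BASE_COLOR, stain_shift)
        )
        slides.append(color)
    return slides


def centroid(rows):
    """Average each color channel across a list of slides."""
    return tuple(sum(channel) / len(rows) for channel in zip(*rows))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def site_detection_accuracy(seed=0):
    """Train a nearest-centroid classifier to guess the site from color alone."""
    rng = random.Random(seed)
    # Site B differs from site A only by a small, assumed stain shift.
    data = [(c, "A") for c in simulate_site(200, (0.00, 0.00, 0.00), rng)]
    data += [(c, "B") for c in simulate_site(200, (0.03, -0.02, 0.01), rng)]
    rng.shuffle(data)
    train, test = data[:300], data[300:]

    centroids = {
        site: centroid([c for c, s in train if s == site]) for site in ("A", "B")
    }
    correct = 0
    for color, true_site in test:
        guess = min(centroids, key=lambda s: squared_distance(color, centroids[s]))
        correct += guess == true_site
    return correct / len(test)


if __name__ == "__main__":
    print(f"Held-out site detection accuracy: {site_detection_accuracy():.2f}")
```

In this toy setup the color shift is only a few percent per channel, yet the classifier recovers the site for most held-out slides. A deep learning model has far more capacity to pick up such cues, which is why the submitting site can become a shortcut.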