Addressing bias in radiology machine learning systems

by John R. Fischer, Senior Reporter | September 06, 2022

Suboptimal practices in the development of machine learning systems put them at risk of producing biased insights when applied in radiology.

But researchers at Mayo Clinic have come up with several strategies for addressing these developmental problems and eliminating the risk of biased information. The first report in their series focuses on the data handling process and the 12 suboptimal practices associated with it.

"If these systematic biases are unrecognized or not accurately quantified, suboptimal results will ensue, limiting the application of AI to real-world scenarios," Dr. Bradley Erickson, professor of radiology and director of the AI Lab at Mayo Clinic in Rochester, Minnesota, said in a statement.

The data handling process consists of data collection, data investigation, data splitting and data engineering. The issues afflicting this phase include:

Data collection – improper identification of the data set, single source of data, unreliable source of data
Data investigation – inadequate exploratory data analysis, exploratory data analysis with no domain expertise, failing to observe actual data
Data splitting – leakage between data sets, unrepresentative data sets, overfitting to hyperparameters
Data engineering – improper feature removal, improper feature rescaling, mismanagement of missing data
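
Leakage between data sets, one of the data splitting problems listed above, often arises when images from the same patient end up in both the training and test sets. The following is a minimal sketch of one common safeguard, splitting at the patient level rather than the image level. It is not taken from the Mayo Clinic reports; the `patient_id` field, the `split_by_patient` helper and the toy records are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split records so that every image from one patient lands in one subset.

    `records` is a list of dicts with a hypothetical "patient_id" key.
    """
    # Group all records by the patient they belong to.
    by_patient = defaultdict(list)
    for rec in records:
        by_patient[rec["patient_id"]].append(rec)

    # Shuffle patients (not images) reproducibly, then carve off the test patients.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])

    train = [r for pid in patient_ids if pid not in test_ids for r in by_patient[pid]]
    test = [r for pid in patient_ids if pid in test_ids for r in by_patient[pid]]
    return train, test


# Hypothetical toy data: 10 patients with 3 images each.
records = [
    {"patient_id": f"P{p:02d}", "image": f"P{p:02d}_img{i}.dcm"}
    for p in range(10)
    for i in range(3)
]

train, test = split_by_patient(records)
overlap = {r["patient_id"] for r in train} & {r["patient_id"] for r in test}
print(len(train), len(test), overlap)  # 24 6 set()
```

An empty overlap confirms that no patient contributes to both subsets. A random split over individual images would give no such guarantee.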

The researchers recommend in-depth reviews of clinical and technical literature and working with data science experts to plan data collection. To make a data set more diverse, they also say data should come from multiple institutions in different countries and regions, from different vendors and different time periods, or should include public data sets.
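
One simple way to spot a collection that leans too heavily on a single source is to tabulate how much of the data each institution or vendor contributes. The sketch below is illustrative only and is not described in the reports. The `institution` and `vendor` fields, the helper names and the 50% threshold are all assumptions, and a real project would pick a threshold that suits its target population.

```python
from collections import Counter


def source_shares(records, key):
    """Return the fraction of records contributed by each value of `key`."""
    counts = Counter(rec[key] for rec in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def dominant_sources(records, key, max_share=0.5):
    """Return sources whose share exceeds `max_share` (an arbitrary cutoff)."""
    return {v: s for v, s in source_shares(records, key).items() if s > max_share}


# Hypothetical metadata for 10 studies from three sites and two vendors.
records = (
    [{"institution": "site_a", "vendor": "vendor_x"}] * 8
    + [{"institution": "site_b", "vendor": "vendor_x"}] * 1
    + [{"institution": "site_c", "vendor": "vendor_y"}] * 1
)

print(dominant_sources(records, "institution"))  # {'site_a': 0.8}
print(dominant_sources(records, "vendor"))       # {'vendor_x': 0.9}
```

A flagged source does not prove the data set is biased, but it suggests where additional institutions, vendors or public data sets might be worth adding.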

"Creating a robust machine learning system requires researchers to do detective work and look for ways in which the data may be fooling you," said Erickson. "Before you put data into the training module, you must analyze it to ensure it's reflective of your target population. AI won't do it for you."

The second and third reports in the series discuss biases that occur when developing and evaluating the model, and when reporting findings.

The findings were published in Radiology: Artificial Intelligence, a journal of the Radiological Society of North America.


