by John R. Fischer, Senior Reporter | October 20, 2020
A group of researchers is critiquing a recent publication that touts a Google Health AI system, asserting that it demonstrates how concerns over data privacy lead to less transparency around the information needed to validate the efficiency of such solutions.
The international group claims that the study, which was published in Nature, exemplifies their concerns about the lack of transparency in published work on artificial intelligence algorithms for health applications. They note that restrictive data access procedures, a lack of published computer code, and unreported model parameters make it difficult for other researchers to validate or build upon the work, and say that sharing such information in an appropriate manner would still maintain patient privacy.
“For many kinds of data there is well-established precedent for what has been safely shared,” Levi Waldron, associate professor, department of epidemiology and biostatistics, CUNY Graduate School of Public Health and Health Policy, told HCB News. “Data that can't safely be made fully public is best shared through government databases that will independently handle access requests by researchers with a valid research use, such as the Database of Genotypes and Phenotypes (dbGaP) and the Sequence Read Archive (SRA).”
The AI model was released by Google Health in January and was trained on more than 90,000 mammogram X-rays, according to VentureBeat. A team of U.S. and British researchers evaluated the solution in a test that included 28,000 mammogram results: 25,000 from the U.K. and 3,000 from the U.S. They found that the AI was not only as accurate as human radiologists, but that it also cut false positives by 5.7 percent in U.S. results and 1.2 percent in those read by British physicians.
More than 19 co-authors, including Waldron, affiliated with McGill University, the City University of New York (CUNY), Harvard University, and Stanford University, assert that this claim lacks scientific value because the publication shares little about the detailed methods and code behind Google’s research. Missing are a description of model development, the data processing and training pipelines used, and the definitions of several hyperparameters for the model’s architecture (the settings, fixed before training, that govern how the model is structured and how it learns). The publication also did not state which variables were used to augment the data set on which the model was trained, which can “significantly” affect performance, according to the co-authors.