The model can be adapted to learn patterns of any disease or condition. But the ability to detect the daily voice-usage patterns associated with vocal cord nodules is an important step in developing improved methods to prevent, diagnose, and treat the disorder, the researchers say. That could include designing new ways to identify and alert people to potentially damaging vocal behaviors.
Joining Gonzalez Ortiz on the paper is John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering and head of CSAIL’s Data Driven Inference Group; Robert Hillman, Jarrad Van Stan, and Daryush Mehta, all of Massachusetts General Hospital’s Center for Laryngeal Surgery and Voice Rehabilitation; and Marzyeh Ghassemi, an assistant professor of computer science and medicine at the University of Toronto.

Forced feature learning
For years, the MIT researchers have worked with the Center for Laryngeal Surgery and Voice Rehabilitation to develop and analyze data from a sensor to track subject voice usage during all waking hours. The sensor is an accelerometer with a node that sticks to the neck and is connected to a smartphone. As the person talks, the smartphone gathers data from the displacements in the accelerometer.
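To give a rough sense of what such a stream looks like once it reaches software, the sketch below turns a raw accelerometer signal into a per-frame intensity estimate and flags frames that look like speech. The sampling rate, frame length, and threshold are illustrative assumptions, not details from the study.

```python
import numpy as np

def frame_rms(signal: np.ndarray, sr: int, frame_ms: float = 50.0) -> np.ndarray:
    """Root-mean-square amplitude per frame -- a crude proxy for vocal intensity."""
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sqrt((frames ** 2).mean(axis=1))

# Stand-in for one minute of neck-accelerometer samples (sampling rate is assumed).
sr = 8_000
signal = np.random.randn(sr * 60)
rms = frame_rms(signal, sr)
speaking = rms > rms.mean()   # frames above the mean intensity treated as "voiced"
print(f"{speaking.mean():.1%} of frames flagged as voiced")
```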
In their work, the researchers collected a week’s worth of this data — called “time-series” data — from 104 subjects, half of whom were diagnosed with vocal cord nodules. For each patient, there was also a matching control, meaning a healthy subject of similar age, sex, occupation, and other factors.
Traditionally, experts would need to manually identify features that may be useful for a model to detect various diseases or conditions. That helps prevent a common machine-learning problem in health care: overfitting. That’s when, in training, a model “memorizes” subject data instead of learning just the clinically relevant features. In testing, those models often fail to discern similar patterns in previously unseen subjects.
“Instead of learning features that are clinically significant, a model sees patterns and says, ‘This is Sarah, and I know Sarah is healthy, and this is Peter, who has a vocal cord nodule.’ So, it’s just memorizing patterns of subjects. Then, when it sees data from Andrew, which has a new vocal usage pattern, it can’t figure out if those patterns match a classification,” Gonzalez Ortiz says.
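A standard guard against this kind of subject memorization is to split the data by subject, so that no individual's recordings appear in both the training and test sets. The sketch below shows that idea with scikit-learn's GroupShuffleSplit; the toy arrays stand in for real feature windows and are not the study's data.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Placeholder data: 10 feature windows from each of 6 hypothetical subjects.
X = np.random.randn(60, 16)                   # one row per window
y = np.repeat([0, 1, 0, 1, 0, 1], 10)         # 0 = matched control, 1 = nodules
subjects = np.repeat(np.arange(6), 10)        # subject ID for every window

# Keep each subject's windows entirely in train OR test, never both, so a model
# cannot score well simply by recognizing individual subjects.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.33, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=subjects))
assert set(subjects[train_idx]).isdisjoint(subjects[test_idx])
```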
The main challenge, then, was preventing overfitting while automating manual feature engineering. To that end, the researchers forced the model to learn features without subject information. For their task, that meant capturing all moments when subjects speak and the intensity of their voices.
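One common way to learn features from raw windows without ever showing a model who produced them is an autoencoder trained only to reconstruct its input. The sketch below illustrates that general setup in PyTorch; the window size, layer widths, and training loop are placeholder assumptions, not the paper's architecture.

```python
import torch
from torch import nn

WINDOW = 128   # frames of the speaking/intensity signal per training example (assumed)

# Tiny autoencoder: compress a window to a short code and reconstruct it.
# No subject identity is ever given to the model, so the learned code can only
# describe when and how loudly someone speaks, not who they are.
encoder = nn.Sequential(nn.Linear(WINDOW, 32), nn.ReLU(), nn.Linear(32, 8))
decoder = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, WINDOW))
model = nn.Sequential(encoder, decoder)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

windows = torch.randn(256, WINDOW)            # stand-in for real intensity windows
for _ in range(100):                          # reconstruction-only training loop
    optimizer.zero_grad()
    loss = loss_fn(model(windows), windows)
    loss.backward()
    optimizer.step()

features = encoder(windows).detach()          # learned features for later analysis
```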