In a first for cancer pathology reports, the team developed a multitask convolutional neural network, or CNN--a deep learning model that learns to perform tasks, such as identifying key words in a body of text, by processing language as a two-dimensional numerical dataset.
"We use a common technique called word embedding, which represents each word as a sequence of numerical values," Alawad said.
Words that have a semantic relationship--or that together convey meaning--are close to each other in dimensional space as vectors (values that have magnitude and direction). This textual data is inputted into the neural network and filtered through network layers according to parameters that find connections within the data. These parameters are then increasingly honed as more and more data is processed.
Although some single-task CNN models are already being used to comb through pathology reports, each model can extract only one characteristic from the range of information in the reports. For example, a single-task CNN may be trained to extract just the primary cancer site, outputting the organ where the cancer was detected such as lungs, prostate, bladder, or others. But extracting information on the histological grade, or growth of cancer cells, would require training a separate deep learning model.
The research team scaled efficiency by developing a network that can complete multiple tasks in roughly the same amount of time as a single-task CNN. The team's neural network simultaneously extracts information for five characteristics: primary site (the body organ), laterality (right or left organ, if applicable), behavior, histological type (cell type), and histological grade (how quickly the cancer cells are growing or spreading).
The team's multitask CNN completed and outperformed a single-task CNN for all five tasks within the same amount of time--making it five times as fast. However, Alawad said, "It's not so much that it's five times as fast. It's that it's n-times as fast. If we had n different tasks, then it would take one-nth of the time per task."
The team's key to success was the development of a CNN architecture that enables layers to share information across tasks without draining efficiency or undercutting performance.
"It's efficiency in computing and efficiency in performance," Alawad said. "If we use single-task models, then we need to develop a separate model per task. However, with multitask learning, we only need to develop one model--but developing this one model, figuring out the architecture, was computationally time consuming. We needed a supercomputer for model development."