By Jane Z. Reed, Ph.D., head of life science strategy, Linguamatics
The process of getting a new drug through development and out into the market for patient use is known to be slow and expensive. In fact, a March 2016 analysis by the Tufts Center for the Study of Drug Development found the average cost to develop a new drug was over $2.5 billion. Furthermore, market approval is not the end of the process for drug companies. In order to maintain market access for a product, companies must continue making investments to demonstrate the value of drugs to patients, health care providers, payers, and regulatory authorities.
Real World Data (RWD) is essential for understanding the benefits and risks of a drug product after regulatory approval. The FDA defines RWD as any data relating to patient outcomes gathered outside rigorous clinical trials, and Real World Evidence (RWE) as the clinical evidence regarding the usage and potential benefits or risks of a medical product derived from analysis of RWD.
RWE can shed light on real-world clinical effectiveness and on the safety profiles of products across a broad patient community. It can also provide better insight into disease epidemiology, help assess patient-reported outcomes, inform product reputation management, and support engagement with opinion leaders.
Pharmaceutical and biotech companies need RWE to understand the impact of their new products. RWE can be used to update labelling if necessary, to understand long-term effectiveness, to further demonstrate safety, tolerability or outcome superiority, or to identify subpopulations where products work better.
RWD and the free-text challenge
There are many sources of RWD, including EHRs, adverse event reports, social media, and customer call transcripts. Much of the information is hidden in unstructured text, which often provides a level of detail and granularity not available from the structured fields. Information in free text can be hugely valuable, but extracting the buried data and mapping these to standards is a challenge.
Artificial intelligence technologies such as natural language processing (NLP)-based text mining offer a way to transform unstructured data into actionable intelligence for decision-making.
NLP-based text mining: transforming RWD into actionable RWE
With NLP-based text analytics, users can extract key details from unstructured documents using relevant ontologies and focused queries. For example, queries can be written to extract information on treatment patterns to identify drug switching or discontinuation. Numeric-based queries can search for lab values and dosage information, as well as patient-specific details such as history of disease, problem lists, demographics, social factors, and lifestyle. Ideally the technology is flexible enough to apply different business rules based on particular data sets, such as sentiments from tweets, or outcomes and treatment patterns from EHRs.
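To make the idea of a numeric query concrete, the sketch below pulls dosage and lab values out of a free-text note. It is a deliberately simplified illustration that uses plain regular expressions rather than an ontology-driven NLP engine; the patterns, the function names `extract_doses` and `extract_labs`, and the sample note are all hypothetical and are not taken from any vendor product.

```python
import re

# Hypothetical pattern: a word (assumed to be a drug name), a number, and a unit.
DOSE_PATTERN = re.compile(
    r"(?P<drug>[A-Za-z]+)\s+(?P<amount>\d+(?:\.\d+)?)\s*(?P<unit>mg|mcg|g|ml)\b",
    re.IGNORECASE,
)

# Hypothetical pattern: one of a few lab test names, an optional linking
# word or symbol, and a numeric result.
LAB_PATTERN = re.compile(
    r"(?P<test>HbA1c|LDL|eGFR)\s*(?:of|was|is|:|=)?\s*(?P<value>\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_doses(text):
    """Return (drug, amount, unit) tuples found in the text."""
    return [
        (m.group("drug").lower(), float(m.group("amount")), m.group("unit").lower())
        for m in DOSE_PATTERN.finditer(text)
    ]


def extract_labs(text):
    """Return (test name, value) tuples found in the text."""
    return [
        (m.group("test"), float(m.group("value")))
        for m in LAB_PATTERN.finditer(text)
    ]


# Invented sample note, for illustration only.
note = (
    "Patient switched from metformin 500 mg to metformin 850 mg. "
    "HbA1c was 7.2 at last visit; LDL: 130."
)

print(extract_doses(note))
print(extract_labs(note))
```

Two different doses of the same drug in one note could hint at a dose change, which is the kind of treatment-pattern signal described above. A production system would also need negation handling, unit normalization, and mapping of drug and test names to standard vocabularies, none of which this sketch attempts.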
Many pharmaceutical organizations already use NLP to transform RWD into RWE, drawing on real-world data sources such as patient-reported outcomes in web forums and customer call transcripts.
Comparing adverse event profiles from clinical trials with data from patient forums
AstraZeneca is using NLP text-mining to examine the differences in nausea adverse reaction (AR) frequencies in clinical trials versus AR frequencies in real-world occurrences, as noted in the patient forum PatientsLikeMe. Using NLP-based text mining, AstraZeneca researchers also extract nausea AR frequencies reported in clinical trials from FDA Drug Product Labels. They are then able to demonstrate the differences in reporting rates, including those due to differences in dosage and usage.
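Once adverse reaction mentions have been extracted from both sources, comparing reporting rates is simple arithmetic. The sketch below is one possible way to do it, not AstraZeneca's method: it computes each rate and a pooled two-proportion z statistic. The counts are invented purely for illustration, and the function names are hypothetical.

```python
import math


def reporting_rate(events, total):
    """Fraction of patients (or posts) reporting the adverse reaction."""
    return events / total


def two_proportion_z(events_a, total_a, events_b, total_b):
    """Pooled two-proportion z statistic for rate_b minus rate_a."""
    rate_a = events_a / total_a
    rate_b = events_b / total_b
    pooled = (events_a + events_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (rate_b - rate_a) / std_err


# Invented counts, for illustration only (not real trial or forum data).
label_events, label_total = 50, 1000    # nausea reports per drug label
forum_events, forum_total = 120, 1000   # nausea reports in forum posts

label_rate = reporting_rate(label_events, label_total)
forum_rate = reporting_rate(forum_events, forum_total)
z = two_proportion_z(label_events, label_total, forum_events, forum_total)

print(f"label rate: {label_rate:.3f}, forum rate: {forum_rate:.3f}, z: {z:.2f}")
```

Forum users are self-selected rather than randomly sampled, so a statistic like this is best read as a prompt for further investigation of dosage and usage differences, not as proof of a true difference in incidence.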
RWD from voice of the customer calls
Patient and customer call transcripts are rich with details on patient-reported outcomes, side effects, drug interactions, and other insights that greatly impact commercial business decisions and affect post-launch product marketing and planning.
In order to gain insights into the real-world use of their drugs, another large biopharma company currently uses NLP-based text mining technology to annotate and categorize “voice of the customer” (VoC) call feeds. Researchers in the company’s predictive analytics group have built an end-to-end workflow for processing call transcripts and making sense of the unstructured feeds.
Using agile text-mining technology, researchers categorize and tag calls for key metadata, such as caller demographics and call reasons. With NLP text-mining technology, the company has doubled the efficiency of its analysis and enabled longitudinal exploration of real-world patient concerns and outcomes.
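The tagging step can be pictured with a small rule-based sketch. This is not the company's workflow, which is not described in detail here; it is a minimal keyword matcher, and the call-reason categories, keyword lists, and function names are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical call-reason categories and trigger keywords.
CALL_REASON_KEYWORDS = {
    "side_effect": ("nausea", "rash", "dizzy", "headache"),
    "dosing_question": ("dose", "dosage", "how many", "missed"),
    "access_or_cost": ("insurance", "copay", "refill", "pharmacy"),
}


def tag_call(transcript):
    """Return every call reason whose keywords appear in the transcript."""
    text = transcript.lower()
    tags = [
        reason
        for reason, keywords in CALL_REASON_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]
    return tags or ["uncategorized"]


def summarize(transcripts):
    """Count how often each call reason occurs across transcripts."""
    counts = Counter()
    for transcript in transcripts:
        counts.update(tag_call(transcript))
    return counts


# Invented transcripts, for illustration only.
calls = [
    "I felt nausea after I missed a dose.",
    "My copay went up this month.",
    "Just calling to say thanks.",
]
print(summarize(calls))
```

Plain substring matching is crude (the keyword "rash" would also match "crash", for example), which is one reason real deployments rely on linguistic analysis and curated vocabularies. Running the same tagging over calls grouped by month would give the kind of longitudinal view mentioned above.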
Leveraging NLP-based text mining for better insights
The explosion of patient outcome data in many forms has created a trove of information that can be leveraged to advance the development and commercialization of drugs and therapies that ultimately improve the health of patients. Thanks to increasingly sophisticated technologies like NLP-based text mining, biopharma companies can take advantage of RWD to advance the innovation and delivery of their products.
Jane Reed is head of life science strategy at Linguamatics. She is responsible for developing the strategic vision for Linguamatics’ growing product portfolio and business development in the life science market. Jane has extensive experience in life science informatics. She has worked for more than 20 years in vendor companies supplying data products, data integration and analysis, and consultancy to pharma and biotech—with roles at Instem, BioWisdom, Incyte, and Hexagen. Before moving into the life science industry, Jane worked in academia with post-docs in genetics and genomics.