Journal of the American Medical Informatics Association (JAMIA)
Editorial. 2019 Dec 16;27(1):1–2. doi: 10.1093/jamia/ocz206

Advancing the state of the art in automatic extraction of adverse drug events from narratives

Özlem Uzuner 1,2,3, Amber Stubbs 4, Leslie Lenert 5
PMCID: PMC6913224  PMID: 31841150

Adverse drug events (ADEs), defined as “any injuries resulting from medication use, including physical harm, mental harm, or loss of function,”1 are reported to account for approximately 30% of all adverse events,2 with results that can include repeated hospital admission and fatality. Information about causes of ADEs can be found in data sources such as “charts, laboratory [data], prescription data” and “administrative data,”3 which document concurrent use of multiple medications, drug interactions, and possible allergies. However, much of the crucial information related to ADEs is detailed in free-text narratives and is not easily accessible by computerized systems, so it must be reviewed and identified manually. Natural language processing (NLP) holds potential for automatically extracting ADE-related information from narratives, to make it available for decision support systems that can alert clinicians to potential ADEs at the point of care.

To assess and advance the state of the art in NLP for extraction of ADEs, the National NLP Clinical Challenges (n2c2) shared task in 2018 included a track on this topic.4 This track required the identification of potential ADE mentions, along with their link to the medication that caused them, and the administration details such as the dosage, route, and frequency information related to the medication causing the ADE. The systems that tackled extraction of ADEs and related concepts primarily utilized recurrent deep neural networks consisting of bidirectional long short-term memory units, achieving performances that reached 94% in F-measure. In linking ADEs to their causes, the systems were more diverse in their methods, utilizing a range of machine learning approaches including both deep learning and more traditional methods and achieving performances that reached 96% in F-measure. These results indicate that while they are not perfect, NLP systems can successfully extract ADE information from narratives with impressive accuracy. In this editorial, we highlight 4 systems. Others are summarized in Henry et al.4
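
For readers less familiar with the metric, F-measure is the harmonic mean of precision and recall. The following is a minimal sketch, assuming exact-match comparison of (start, end, type) entity tuples; the function name and the tuple layout are illustrative assumptions, and the shared task's official scoring may differ in its matching criteria.

```python
def f_measure(gold, predicted):
    """Harmonic mean of precision and recall over exact-match entity tuples.

    `gold` and `predicted` are iterables of hashable annotations, here assumed
    to be (start, end, entity_type) tuples. This layout is hypothetical.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example: one of two predictions matches one of two gold entities.
gold_entities = {(0, 4, "ADE"), (10, 15, "Drug")}
predicted_entities = {(0, 4, "ADE"), (20, 25, "Drug")}
score = f_measure(gold_entities, predicted_entities)
```

In this invented example, precision and recall are both 0.5, so the F-measure is also 0.5.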

One such system, developed by Wei et al,5 incorporated deep learning and traditional machine learning approaches. The authors compared these approaches to each other and created ensembles from their combinations to benefit from their complementary strengths. They found that postprocessing the machine learning output with rules improved performance over the machine learning methods alone. Methods for jointly learning ADEs and their relationships to their causes improved performance over systems that learned ADEs and relations separately, especially for observations with smaller sample sizes.
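
One common way to build an ensemble of sequence labelers is token-level majority voting. The sketch below illustrates that general idea only; it is not taken from Wei et al, whose combination strategy is not described here, and the function name, label names, and tie-breaking rule are assumptions.

```python
from collections import Counter


def majority_vote(label_sequences):
    """Combine per-token labels from several models by majority vote.

    `label_sequences` holds one label list per model, all of equal length.
    Ties are broken in favor of the tied label proposed by the earliest
    model in the list (an arbitrary, assumed convention).
    """
    if not label_sequences:
        return []
    length = len(label_sequences[0])
    if any(len(seq) != length for seq in label_sequences):
        raise ValueError("all label sequences must have the same length")

    combined = []
    for token_labels in zip(*label_sequences):
        counts = Counter(token_labels)
        top_count = max(counts.values())
        tied = {label for label, count in counts.items() if count == top_count}
        # Pick the first tied label in model order.
        combined.append(next(label for label in token_labels if label in tied))
    return combined


# Invented outputs from three hypothetical models over three tokens.
model_outputs = [
    ["O", "B-ADE", "O"],
    ["O", "B-ADE", "B-Drug"],
    ["O", "O", "B-Drug"],
]
ensemble_labels = majority_vote(model_outputs)
```

Rule-based postprocessing of the kind mentioned above would then operate on the combined labels.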

Ensembles of individual systems were also explored by Dai et al6 and Ju et al.7 Their ensembles included conditional random fields and deep neural networks, focusing on “overlapping” entities that share part of their textual span,6 “nested” entities in which the span of one entity is subsumed in the span of the other, and “polysemous” entities in which an entity can participate in different relations depending on context.7 These systems showed that, consistent with the literature, both conditional random fields and neural networks continue to provide promising results on entity and relation extraction tasks. However, neural networks are more successful in identifying ADEs that are described in narrative passages rather than in succinct phrases.
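
The distinction between overlapping and nested entities can be made concrete in terms of character offsets. The sketch below is illustrative only, assuming half-open (start, end) spans; the function name and the example offsets are hypothetical and are not drawn from the cited systems.

```python
def span_relation(a, b):
    """Classify two (start, end) spans, with `end` exclusive.

    Returns "disjoint" when the spans share no characters, "nested" when one
    span lies entirely within the other (identical spans count as nested),
    and "overlapping" when they share only part of their extent.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_end <= b_start or b_end <= a_start:
        return "disjoint"
    a_contains_b = a_start <= b_start and b_end <= a_end
    b_contains_a = b_start <= a_start and a_end <= b_end
    if a_contains_b or b_contains_a:
        return "nested"
    return "overlapping"


# Invented offsets for illustration.
nested_case = span_relation((0, 10), (2, 5))
overlapping_case = span_relation((0, 5), (3, 8))
disjoint_case = span_relation((0, 3), (3, 6))
```

A sequence labeler that assigns one tag per token cannot represent the nested or overlapping cases directly, which is one reason such entities receive dedicated treatment.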

Yang et al’s8 solution to ADE extraction differed from other solutions in its incorporation of medical knowledge in the embedding layers of deep learning architectures. Their knowledge embeddings captured the semantics of concepts (ie, concept embeddings) based on a medical terminology.9 Yang et al8 found that adding knowledge embeddings to their ADE extraction system improved precision but hurt recall. This finding contradicts previous work on incorporating knowledge from the Unified Medical Language System10 for extraction of clinical concepts, and may indicate a shortfall of their knowledge source in its coverage of ADE-related concepts.

Overall, these approaches demonstrate the feasibility of automatic extraction of ADEs and related information from narratives in test settings. They hold promise for incorporation into clinical workflows to inform health care and prevent ADEs. However, further work is needed, particularly in epidemiologically representative samples, in which ADEs are infrequent. Additionally, the remaining system errors require further studies of “ambiguous language” and references that “require inference” to resolve.4 ADEs and reasons for medication administration (ie, indications) are 2 of the most frequently confused concepts. Linking ADEs to their causes and other related concepts is challenging, especially when multiple ADEs and multiple possible causes are discussed in the same context, and when the cause is separated from the ADE mention by long spans of intervening text.4 These errors can be alleviated by incorporating domain knowledge, such as knowledge bases that outline the default values for administration, indication, and side effects of medications. Such knowledge provides prior expectations that can be interpreted in the context of individual patients to determine the potential for an ADE in a specific case.4 These knowledge sources can give systems the boost they need for resolving ambiguities and for distinguishing between linguistically similar concepts (eg, indications and ADEs that are both medical problems but differ in their relation to a medication), provided that the knowledge sources are comprehensive in their coverage of the concepts of interest.

FUNDING

This work was supported by the National Library of Medicine of the National Institutes of Health under Award Numbers R13LM013127 (to OU) and R13LM011411 (to OU). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

AUTHOR CONTRIBUTIONS

ÖU was the primary author of the article, with AS and LL aiding in editing and proofreading.

CONFLICT OF INTEREST STATEMENT

None declared.

REFERENCES

  • 1. Zhu J, Weingart SN. Prevention of adverse drug events in hospitals. 2017. https://www.uptodate.com/contents/prevention-of-adverse-drug-events-in-hospitals Accessed October 12, 2019.
  • 2. Office of Disease Health and Prevention. Overview: adverse drug events. https://health.gov/hcq/ade.asp Accessed October 12, 2019.
  • 3. Morimoto T, Gandhi TK, Seger AC, et al. Adverse drug events and medication errors: detection and classification methods. Qual Saf Health Care 2004; 13 (4): 306–14.
  • 4. Henry S, Buchan K, Filannino M, Stubbs A, Uzuner O. 2018 n2c2 shared task on adverse drug events and medication extraction in electronic health records. J Am Med Inform Assoc 2020; 27 (1): 3–12.
  • 5. Wei Q, Ji Z, Li Z, et al. A study of deep learning approaches for medication and adverse drug event extraction from clinical text. J Am Med Inform Assoc 2020; 27 (1): 13–21.
  • 6. Dai HJ, Su CH, Wu CS. Adverse drug event and medication extraction in electronic health records via a cascading architecture with different sequence labeling models and word embeddings. J Am Med Inform Assoc 2020; 27 (1): 47–55.
  • 7. Ju M, Nguyen NTH, Miwa M, et al. An ensemble of neural models for nested adverse drug events and medication extraction with subwords. J Am Med Inform Assoc 2020; 27 (1): 22–30.
  • 8. Yang X, Bian J, Fang R, et al. Identifying relations of medications with adverse drug events using recurrent convolutional neural networks and gradient boosting. J Am Med Inform Assoc 2020; 27 (1): 65–72.
  • 9. Kuhn M, Campillos M, Letunic I, et al. A side effect resource to capture phenotypic effects of drugs. Mol Syst Biol 2010; 6 (1): 343.
  • 10. Bodenreider O. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res 2004; 32 (Database issue): D267–70.

Articles from Journal of the American Medical Informatics Association : JAMIA are provided here courtesy of Oxford University Press
