Large-scale drug safety surveillance and pharmacovigilance are key components of effective drug regulation systems, clinical practice, and public health programs[1]. Although a drug’s efficacy and safety must be demonstrated in a series of clinical trials prior to approval[2], many adverse drug events (ADEs) are detected only after a drug has been marketed, when it is used by a larger and more diverse population than during clinical trials. ADEs discovered after a drug is in broad use can be a significant cause of morbidity and mortality, so effective and accurate post-market drug surveillance is urgently needed to protect public health and to reduce healthcare expenditures due to ADE-related hospital complications[3–5].
Spontaneous reporting systems (SRSs)[6–9] have traditionally been used for pharmacovigilance. However, this type of data is inherently passive because, except for drug companies, SRS reporting is voluntary, and studies have shown that as many as 90% of serious ADEs go unreported[10]. Electronic health records (EHRs) contain real-time, real-world clinical data gathered during routine clinical care, offering a potentially more proactive approach to pharmacovigilance[2]. The use of EHRs for post-market surveillance therefore plays an important role in the new paradigm of drug regulation[11]. More importantly, compared with structured or coded data in EHRs, unstructured clinical narratives provide more information on ADE documentation. One study showed that only 9,020 (28.6%) of 31,531 patients with documented statin side effects had the relevant ADE recorded in a structured format[12]. Therefore, developing advanced natural language processing (NLP) techniques to uncover ADE information from EHR narratives will greatly facilitate proactive, accurate, and efficient post-market drug safety monitoring on a large scale.
In 2010, i2b2 partnered with the VA Salt Lake City Health Care System to organize an open NLP challenge[13], which supported community efforts in applying NLP to extract medications, their treatment targets, and the adverse events they cause from EHR narratives. However, the annotation schema defined for that challenge covers only a limited set of entities relevant to pharmacovigilance. To better assess current methodological progress in this research area, we organized the “NLP Challenge for Detecting Medication and Adverse Drug Events from Electronic Health Records (MADE 1.0)” in 2018, which offered a larger-scale set of expert-annotated clinical notes labeled with more fine-grained clinical named entities and relations related to drug safety surveillance. Fifteen teams from 7 countries registered for the challenge, and 11 teams submitted a total of 41 runs.
Part of this theme issue of Drug Safety presents recent advances in mining unstructured information from clinical narratives in the context of drug safety surveillance and pharmacovigilance. It includes 5 articles from the MADE 1.0 challenge: an overview paper and 4 research papers invited from top-performing teams that participated in the challenge.
The first paper, by Jagannatha et al.[14], provides an overview of the MADE 1.0 challenge. First, the article describes the MADE 1.0 corpus, including details of the annotation process and a comprehensive annotation schema. The authors report Fleiss’ kappa scores of 0.628 and 0.424 for inter-annotator agreement on named entity annotation and relation annotation, respectively. Second, they introduce the three subtasks defined in the challenge: Named Entity Recognition (NER), Relation Identification (RI), and Joint Relation Extraction (NER-RI), followed by a comprehensive report of the system submissions. Finally, they show that an ensemble-based aggregation of systems improved performance, suggesting that the top-performing systems learn in different but complementary ways.
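For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Fleiss’ kappa is computed from per-item annotator labels. It uses the standard textbook formula; the function name, the toy labels, and the handling of the degenerate single-category case are assumptions made for illustration and are not taken from the MADE 1.0 annotation pipeline.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Compute Fleiss' kappa.

    ratings: list of items; each item is a list of category labels,
    one label per annotator. Every item needs the same number of labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Fraction of annotator pairs that agree on this item.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        agreement_sum += agreeing_pairs / (n_raters * (n_raters - 1))

    observed = agreement_sum / n_items
    total = n_items * n_raters
    # Chance agreement from the overall share of each category.
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        # Degenerate case (one category only); returning 1.0 is a choice made here.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: 4 text spans, 3 annotators each.
toy_ratings = [
    ["ADE", "ADE", "ADE"],
    ["ADE", "ADE", "Indication"],
    ["Indication", "Indication", "Indication"],
    ["ADE", "Indication", "Indication"],
]
print(round(fleiss_kappa(toy_ratings), 3))
```

In this toy example the observed agreement is 2/3 and the chance agreement is 1/2, so kappa is 1/3.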
Wunnava et al.[15] present a three-layer deep learning architecture for the NER subtask, consisting of a bi-directional long short-term memory (BiLSTM) layer for character-level encoding, a BiLSTM layer for word-level encoding, and a conditional random field (CRF) layer for structured prediction. To better handle the noisy format of clinical notes, they built a rule-based sentence and word tokenizer, which led to better performance than the off-the-shelf NLTK toolkit[16]. Their system achieved the best micro F1 score of 0.829 for NER, and they found that character-level encoding and CRF sequence inference both contribute to the performance improvement.
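The micro F1 scores quoted in this editorial pool true positives, false positives, and false negatives across all entity types before computing precision and recall. A minimal sketch follows, assuming exact matching on document, span boundaries, and type; the matching criterion, the tuple layout, and the toy annotations are assumptions for illustration and are not the official challenge evaluation script.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over sets of (doc_id, start, end, entity_type) tuples."""
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    false_pos = len(predicted - gold)
    false_neg = len(gold - predicted)

    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for one note.
gold_entities = {
    ("note1", 0, 7, "Drug"),
    ("note1", 20, 26, "ADE"),
    ("note1", 30, 38, "Indication"),
}
predicted_entities = {
    ("note1", 0, 7, "Drug"),
    ("note1", 20, 26, "Indication"),  # right span, wrong type: counts as FP and FN
    ("note1", 30, 38, "Indication"),
}
print(round(micro_f1(gold_entities, predicted_entities), 3))
```

Here two of three predictions match, so precision, recall, and micro F1 are all 2/3.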
Yang et al.[17] applied a similar BiLSTM-CRF architecture for NER, combined with a support vector machine (SVM)-based relation extraction system, to address all three subtasks. They trained the BiLSTM-CRF in two stages: they first optimized parameters on validation data and then trained a final model on both the validation and training data, which led to better results than using the validation-optimized model directly. Their experiments demonstrate that developing separate classifiers for intra-sentence and inter-sentence relations obtained better performance (F1 score of 0.8466) than a single classifier for both (F1 score of 0.8304).
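The idea of using separate classifiers for intra-sentence and inter-sentence relations can be sketched as a simple routing step. The code below is illustrative only: the `Entity` type, the `classify_pairs` function, and the stand-in classifiers are hypothetical names, and the stand-ins merely tag which model was used rather than reproducing the SVM models of Yang et al.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    sentence_index: int  # index of the sentence that contains the entity


Pair = Tuple[Entity, Entity]
Classifier = Callable[[Pair], str]


def classify_pairs(
    pairs: Iterable[Pair], intra_clf: Classifier, inter_clf: Classifier
) -> List[str]:
    """Send each candidate pair to the classifier that matches its scope."""
    results = []
    for first, second in pairs:
        same_sentence = first.sentence_index == second.sentence_index
        clf = intra_clf if same_sentence else inter_clf
        results.append(clf((first, second)))
    return results


# Stand-in classifiers (assumptions): real systems would return relation labels.
def toy_intra(pair: Pair) -> str:
    return "intra-model"


def toy_inter(pair: Pair) -> str:
    return "inter-model"


drug = Entity("warfarin", "Drug", sentence_index=0)
bleed = Entity("bleeding", "ADE", sentence_index=0)
rash = Entity("rash", "ADE", sentence_index=2)
print(classify_pairs([(drug, bleed), (drug, rash)], toy_intra, toy_inter))
```

Keeping the routing separate from the classifiers lets each model be trained on features suited to its scope.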
Dandala et al.[18] employed two deep learning architectures in this challenge: a BiLSTM-CRF model for NER and a BiLSTM model with an attention mechanism for RI. In addition to character-level and word-level embeddings, part-of-speech embeddings were used for input encoding in both models. Based on the observation that “adverse drug events” and “indications” entities overlap semantically with “other sign and symptoms”, they experimented with a joint modeling method in which those three entity types are first merged into one category for the NER model, and their relations with medications, as determined by the RI model, are then used to distinguish the three types. Experimental results show that the joint modeling approach outperformed the standard sequential model for the integrated NER-RI subtask (micro F1 of 0.653 vs. 0.624).
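The final step of such a joint approach, using predicted relations to split a merged entity category back into its three types, might look like the sketch below. The label strings (`"ADE"`, `"Indication"`, `"SSLIF"`), the relation names (`"adverse"`, `"reason"`), and the rule that an adverse relation takes precedence are all assumptions made for illustration; they are not taken from the system of Dandala et al.

```python
def resolve_entity_types(merged_entity_ids, relations):
    """Assign a final type to each entity in the merged category.

    merged_entity_ids: ids of entities the NER model tagged with the merged label.
    relations: iterable of (drug_id, entity_id, relation_type) from the RI model.
    """
    # Default: no relation to any medication, so keep the generic type.
    resolved = {entity_id: "SSLIF" for entity_id in merged_entity_ids}
    for _drug_id, entity_id, relation_type in relations:
        if entity_id not in resolved:
            continue  # relation points at an entity outside the merged category
        if relation_type == "adverse":
            resolved[entity_id] = "ADE"
        elif relation_type == "reason" and resolved[entity_id] != "ADE":
            # Assumed precedence: an adverse relation is never overwritten.
            resolved[entity_id] = "Indication"
    return resolved


# Hypothetical predictions for one note.
toy_entities = ["e1", "e2", "e3"]
toy_relations = [("d1", "e1", "adverse"), ("d1", "e2", "reason")]
print(resolve_entity_types(toy_entities, toy_relations))
```

In the toy example, `e1` becomes an ADE, `e2` an indication, and `e3`, which has no relation to a medication, keeps the generic type.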
Chapman et al.[19] explore traditional machine learning models, CRF for NER and random forest (RF) for RI, which were shown to be more computationally efficient and thus more easily deployed in real-world applications without depending on specialized high-performance infrastructure. As part of the feature engineering effort, they included word embeddings as clustering features trained with Mini-batch K-means, and also examined multiple cluster sizes and compound cluster features. Compared with the counterpart deep learning models, their system achieved competitive overall results through effective feature engineering, yielding the best micro F1 of 0.8684 for the RI subtask.
While the performance reported in this challenge is promising, there is much room for further improvement, especially for the complex joint NER-RI task. The design of better learning algorithms and the availability of more labeled data are two important factors in improving system performance. Another future direction is to validate and increase the generalizability of existing systems on larger-scale datasets from diverse clinical subspecialties. This may require more effort in building annotated data as well as in exploring effective domain adaptation techniques for data-scarce subspecialties. Finally, it will be essential to investigate how to effectively integrate a large volume of diverse, dynamic, and distributed structured or unstructured data from different sources, such as SRS reports, EHRs, insurance claims, medical literature, and social media, for collective ADE signal detection.
Mining EHR data for drug safety surveillance, especially mining unstructured narratives through NLP, will remain an active research topic. The innovative approaches reported in this theme issue, which were motivated by the MADE 1.0 challenge, lay a solid foundation for further advancing methodological development and system deployment towards more intelligent drug safety surveillance.
We would like to thank the editors and the manuscript authors for their contributions to this issue. We would also like to thank all the reviewers for their comments and thoughtful suggestions for improving the submitted drafts.
Acknowledgments
Funding
This work was supported by the National Heart, Lung, and Blood Institute (NHLBI) of the National Institutes of Health (R01HL125089). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Footnotes
Compliance with Ethical Standards
Conflicts of Interest
Feifan Liu has no conflicts of interest that are directly relevant to the content of this study.
Abhyuday Jagannatha has no conflicts of interest that are directly relevant to the content of this study.
Hong Yu has no conflicts of interest that are directly relevant to the content of this study.
References
- 1. Jeetu G, Anusha G. Pharmacovigilance: A Worldwide Master Key for Drug Safety Monitoring. J Young Pharm 2010;2:315–20.
- 2. Coloma PM, Trifirò G, Patadia V, Sturkenboom M. Postmarketing safety surveillance: where does signal detection using electronic healthcare records fit into the big picture? Drug Saf 2013;36:183–97.
- 3. Bates DW, Spell N, Cullen DJ, Burdick E, Laird N, Petersen LA, et al. The costs of adverse drug events in hospitalized patients. JAMA: the journal of the American Medical Association 1997;277:307.
- 4. Nebeker JR, Hoffman JM, Weir CR, Bennett CL, Hurdle JF. High rates of adverse drug events in a highly computerized hospital. Arch Intern Med 2005;165:1111–6.
- 5. Hakkarainen KM, Hedna K, Petzold M, Hägg S. Percentage of patients with preventable adverse drug reactions and preventability of adverse drug reactions--a meta-analysis. PLoS ONE 2012;7:e33236.
- 6. Polepalli Ramesh B, Belknap SM, Li Z, Frid N, West DP, Yu H. Automatically Recognizing Medication and Adverse Event Information From Food and Drug Administration’s Adverse Event Reporting System Narratives. JMIR Medical Informatics 2014;2:e10.
- 7. Botsis T, Nguyen MD, Woo EJ, Markatou M, Ball R. Text mining for the Vaccine Adverse Event Reporting System: medical text classification using informative feature selection. Journal of the American Medical Informatics Association [Internet] 2011 [cited 2011 Oct 21]. Available from: http://jamia.bmj.com/content/early/2011/06/27/amiajnl-2010-000022.abstract
- 8. Lindquist M. VigiBase, the WHO Global ICSR Database System: Basic Facts. Drug Information Journal 2008;42:409–19.
- 9. Alvarez Y, Hidalgo A, Maignen F, Slattery J. Validation of statistical signal detection procedures in eudravigilance post-authorization data: a retrospective evaluation of the potential for earlier signalling. Drug Saf 2010;33:475–87.
- 10. Hazell L, Shakir SAW. Under-reporting of adverse drug reactions: a systematic review. Drug Safety 2006;29:385–396.
- 11. Moore TJ, Furberg CD. Electronic Health Data for Postmarket Surveillance: A Vision Not Realized. Drug Saf 2015;38:601–10.
- 12. Skentzos S, Shubina M, Plutzky J, Turchin A. Structured vs. unstructured: factors affecting adverse drug reaction documentation in an EMR repository. AMIA Annual Symposium Proceedings [Internet] 2011 [cited 2013 Nov 14]. p. 1270. Available from: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3243255/
- 13. Uzuner Ö, South BR, Shen S, DuVall SL. 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. Journal of the American Medical Informatics Association 2011;18:552–6.
- 14. Jagannatha Abhyuday, Liu Feifan, Liu Weisong, Yu Hong. Overview of the First Natural Language Processing Challenge for Extracting Medication, Indication, and Adverse Drug Events from Electronic Health Record Notes (MADE 1.0). Drug Safety 2018.
- 15. Wunnava Susmitha, Qin Xiao, Kakar Tabassum, Sen Cansu, Rundensteiner Elke A., Kong Xiangnan. Adverse Drug Event Detection from Electronic Health Records Using Hierarchical Recurrent Neural Networks with Dual-Level Embedding. Drug Safety.
- 16. Bird S, Klein E, Loper E. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. O’Reilly Media, Inc.; 2009.
- 17. Yang Xi, Bian Jiang, Gong Yan, Hogan William R., Wu Yonghui. MADEx: A System for Detecting Medications, Adverse Drug Events, and their Relations from Clinical Notes. Drug Safety 2018.
- 18. Dandala Bharath, Joopudi Venkata, Devarakonda Murthy. Adverse Drug Events Detection in Clinical Notes by Jointly Modeling Entities and Relations using Neural Networks. Drug Safety 2018.
- 19. Peterson Kelly S., Chapman Alec B., Alba Patrick R., DuVall Scott L., Patterson Olga V. Detecting Adverse Drug Events with Rapidly Trained Classification Models. Drug Safety 2018.