Abstract
Background:
Allergic drug reaction epidemiologic data are sparse because it remains difficult to identify true cases in large datasets using manual chart review.
Objective:
To develop and validate a novel informatics method based on natural language processing (NLP) in combination with ICD-9-CM codes that identifies allergic drug reactions in the electronic health record (EHR).
Methods:
Previously studied and high-yield ICD-9-CM codes were used to screen for possible allergic drug reactions among all inpatients admitted in 2007 and 2008. A random sample was selected for manual chart review to identify true cases of allergic drug reactions. A rule-based NLP algorithm was then developed to identify allergic drug reactions using free-text clinical notes and discharge summaries from the filtered cases. The performance of ICD-9-CM codes alone, as assessed by manual chart review, was compared with that of ICD-9-CM codes in combination with NLP.
Results:
Of 3,907 cases identified by ICD-9-CM codes, 725 (19%) were randomly selected for manual chart review; 335 were confirmed as allergic drug reactions, resulting in a positive predictive value (PPV) of 46% (range: 18%-79%) when using ICD-9-CM codes alone. Our NLP algorithm in combination with ICD-9-CM codes achieved a PPV of 86% (range: 69%-100%). Among the 335 confirmed positive cases, NLP identified 259 true cases, resulting in a recall/sensitivity of 77% (range: 26%-100%). Among the 390 negative cases, NLP achieved a specificity of 89% (range: 69%-100%).
Conclusion:
Using NLP with ICD-9-CM codes improved identification of allergic drug reactions. The resulting decrease in manual chart review effort will facilitate large epidemiology studies of this understudied area.
Keywords: Drug allergy, Drug, Adverse drug reactions, Epidemiology, Electronic health record, Natural language processing
INTRODUCTION
An accurate understanding of the epidemiology of allergic drug reactions is important in healthcare, both for improving patient care and for guiding public health and preventive medicine efforts. However, few studies to date have systematically assessed the epidemiology of allergic drug reactions because it is difficult to identify true cases in large datasets. Chart review is currently the most complete method for case identification, but it is time-consuming and resource-intensive. We previously demonstrated the utility of using a broad set of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes to capture true cases of allergic drug reactions in both the Emergency Department (ED) and inpatient settings.1,2 However, use of ICD-9-CM codes has limitations, including variability in coding among providers; even the most accurate code (dermatitis due to drug) has a sensitivity of just 80%.2 While chart review continues to be necessary for case confirmation, automated methods that use computer-based algorithms, such as natural language processing (NLP), may prove useful in reducing the amount of chart review needed. NLP refers to any computer system that processes, analyzes, or synthesizes human languages, in text and/or speech form.
The rapidly increasing use of Electronic Health Records (EHRs) in hospitals has launched a research discipline using large EHR datasets. Over 90% of United States hospitals have an EHR, primarily Epic, Cerner, or Athena.3,4 In the past decade, novel computational approaches and tools have allowed for large-scale and sophisticated analyses of EHR data,5–7 such as identification of rare events within large patient populations.8 NLP applications have accurately categorized medication-related events to help improve the efficiency of patient safety event review, and have been used to identify allergic reactions (including those due to drugs) in the ED setting.8,9 Accordingly, NLP may serve as a complementary method to identify and study large drug allergy cohorts within EHRs without requiring as much time- and resource-intensive manual chart review.
Building on our prior research experience using ICD-9-CM codes, we developed and validated a novel informatics method based on NLP to identify allergic drug reactions from unstructured clinical notes and discharge summaries in our EHR system.
METHODS
We previously used manual chart review to study specific ICD-9-CM codes from 2005 and 2010 to identify allergic drug reactions and better characterize the epidemiology of allergic drug reactions in the inpatient setting.2 Patient data were collected from the Massachusetts General Hospital (MGH)’s Research Patient Data Registry (RPDR), an EHR data warehouse, based on ICD-9-CM codes, inpatient status, and date of hospitalization; MGH is a large teaching hospital in Boston, MA. Records with a primary or secondary ICD-9-CM code for anaphylaxis, urticaria, angioedema, allergy, rash, pruritus, and eosinophilia were identified, and manually chart reviewed to identify true cases of allergic drug reactions. The specific ICD-9-CM codes included: dermatitis due to drugs and medicines taken internally (693.0), unspecified adverse effect due to unspecified drug (995.2), anaphylaxis (995.0), allergic urticaria (708.0), angioneurotic edema (995.1), allergy unspecified (995.3), specified pruritus (698.8), unspecified pruritus (698.9), rash (782.1), urticaria unspecified (708.9), and eosinophilia (288.3).
All selected medical records were reviewed by both a trained research assistant and a board-certified allergist/immunologist. We determined allergic drug reactions by the presence of allergic symptoms (including rash, angioedema, respiratory symptoms, or anaphylaxis) that were temporally related to the initiation of drug therapy and consistent with an immunologically-mediated response to a drug. If there was disagreement between the reviewers, the case was reviewed by a second allergist.
NLP Algorithm Training
To develop the new NLP algorithm, we used the Medical Text Extraction, Reasoning, and Mapping System (MTERMS) which includes a suite of NLP solutions.9–11 A rule-based approach was used to extract allergy information from clinical free-text documents. First, each document is pre-processed, separating it into sections, sentences, and word tokens. Then, MTERMS searches each document for key words or phrases related to allergic reactions, accounting for negations and other contextual information. Using the extracted terms, each encounter was classified as associated with an allergic drug reaction or not.
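The pre-processing step described above can be sketched as follows. This is a minimal illustration, not the MTERMS implementation: the section-header pattern, the sentence splitter, and the function names (`split_sections`, `split_sentences`, `tokenize`, `preprocess`) are all assumptions made for the example.

```python
import re

# Assumed header convention: a short capitalized label followed by a colon on
# its own line. Real clinical notes vary widely; this pattern is hypothetical.
SECTION_HEADER = re.compile(r"^([A-Z][A-Za-z /]{1,40}):\s*$", re.MULTILINE)


def split_sections(document):
    """Return a list of (section_name, section_text) pairs."""
    sections = []
    name, start = "NARRATIVE", 0  # text before the first header
    for match in SECTION_HEADER.finditer(document):
        text = document[start:match.start()].strip()
        if text:
            sections.append((name, text))
        name, start = match.group(1).strip().upper(), match.end()
    tail = document[start:].strip()
    if tail:
        sections.append((name, tail))
    return sections


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercased word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*", sentence.lower())


def preprocess(document):
    """Document -> sections -> sentences -> word tokens."""
    return [
        (name, [(sent, tokenize(sent)) for sent in split_sentences(text)])
        for name, text in split_sections(document)
    ]


if __name__ == "__main__":
    note = (
        "Hospital Course:\n"
        "Patient developed a rash. Vancomycin was stopped.\n"
        "Allergies:\n"
        "Penicillin.\n"
    )
    for section, sentences in preprocess(note):
        print(section, sentences)
```

Keeping the section name attached to every sentence is what later allows whole sections to be skipped during term lookup.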
Allergic drug reactions were defined as those with symptoms of rash, angioedema, respiratory symptoms, or anaphylaxis that were temporally related to the initiation of drug therapy and were consistent with an immunologically-mediated response to a drug. We created a lexicon of reaction terms (see Supplementary Table E1) using an iterative process via manual review of our training set (reviewed cases from 2005 and 2010).1,2 See Figure 1. When building the lexicon, we included hypersensitivity reactions, such as “anaphylaxis” or “rash”, and excluded side effects like “nausea” or “vomiting”. We ignored allergies listed in the “Allergy” sections because this section generally lists a patient’s allergy history rather than an allergy that occurred in the current encounter/visit. We also excluded terms found in other particular sections, including “Medications”, “Problems”, and sections of discharge instructions containing patient instructions; our goal was to consider only the clinical narrative of each note, rather than semi-structured lists, templates, or boilerplate patient instructions. We conducted an error analysis on the results to identify additional reaction terms to add to the lexicon. We repeated this process until there were no more reaction terms that could be identified.
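As a rough sketch of the lexicon lookup with section filtering described above: the term set below is a small illustrative subset (the full lexicon is in Supplementary Table E1), and the section names and the function `find_reactions` are hypothetical stand-ins rather than the study's actual configuration.

```python
import re

# Illustrative subset of hypersensitivity reaction terms. Side effects such as
# "nausea" or "vomiting" are deliberately absent, mirroring the text above.
REACTION_LEXICON = {"anaphylaxis", "rash", "angioedema", "hives"}

# Assumed names for the semi-structured sections that are skipped.
EXCLUDED_SECTIONS = {
    "ALLERGY",
    "ALLERGIES",
    "MEDICATIONS",
    "PROBLEMS",
    "DISCHARGE INSTRUCTIONS",
}


def find_reactions(sections):
    """Return sorted reaction terms found outside the excluded sections.

    `sections` is an iterable of (section_name, section_text) pairs.
    """
    found = set()
    for name, text in sections:
        if name.strip().upper() in EXCLUDED_SECTIONS:
            continue  # allergy history, medication lists, boilerplate
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        found |= tokens & REACTION_LEXICON
    return sorted(found)


if __name__ == "__main__":
    example = [
        ("Hospital Course", "She developed a diffuse rash and nausea after cefepime."),
        ("Allergies", "Penicillin: anaphylaxis"),
    ]
    print(find_reactions(example))  # ['rash']
```

In this example "anaphylaxis" is ignored because it appears only in the allergy list, and "nausea" is ignored because it is not in the lexicon.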
Figure 1: Training and Testing of the NLP Algorithm1,2.
(1) Manual chart review of randomly selected inpatient charts from 2005 and 2010 was conducted to identify allergic drug reactions. These charts (n=1,357) were used to train the NLP algorithm. (2) To test the NLP algorithm, we chart reviewed records of inpatients hospitalized in 2007 (n=359) and 2008 (n=366) at MGH excluding chemotherapy desensitizations (2007, n=27; 2008, n=61). This created a combined testing data set of 725 inpatient charts.
To detect negations (such as “absent”, “denies”, or “no”), we used the NegEx algorithm.12 In addition, we created a lexicon of “negative context”, also created by manual review of the training set. Terms in this lexicon can be divided into six general categories (see Supplementary Table E2). We included additional terms (e.g. “did not clearly have”) to capture negations that may not have been covered by NegEx; similarly, we included terms (e.g. “please note that”) that indicate instructions for the patient or the clinician, not found in an instructions section. Certain phrases are indicators that the reaction is not allergic in a certain case (e.g. “pseudoallergic”) or are part of a narrative differential diagnosis (e.g. “ddx”). Finally, certain phrases are temporal indicators, for reactions that happened in the remote past (e.g. “years ago”) or are at risk of occurring in the future (e.g. “can cause”). Each term was assigned a scope, signifying whether the phrase applies before, after, or in the same sentence as a reaction. Any reaction that was negated or had associated negative context was excluded from our analysis.
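One way to picture the scoped negative-context check is the sketch below. It does not implement NegEx, and the scope assigned to each phrase is an assumption made for illustration (the actual terms and scopes are in Supplementary Table E2). Here "after" is taken to mean that the phrase affects reactions that follow it, "before" that it affects reactions that precede it, and "sentence" that it affects any reaction in the same sentence.

```python
# Hypothetical phrase -> scope table using the example phrases from the text.
NEGATIVE_CONTEXT = {
    "did not clearly have": "after",   # negation that NegEx may not cover
    "please note that": "after",       # instruction to patient or clinician
    "pseudoallergic": "sentence",      # reaction judged not allergic
    "ddx": "sentence",                 # narrative differential diagnosis
    "years ago": "before",             # reaction in the remote past
    "can cause": "after",              # risk of a future reaction
}


def has_negative_context(sentence, reaction):
    """Return True if `reaction` falls within the scope of a negative-context phrase.

    Simplification: only the first occurrence of the reaction and of each
    phrase is considered, and matching is by plain substring.
    """
    text = sentence.lower()
    reaction_pos = text.find(reaction.lower())
    if reaction_pos < 0:
        return False
    for phrase, scope in NEGATIVE_CONTEXT.items():
        phrase_pos = text.find(phrase)
        if phrase_pos < 0:
            continue
        if scope == "sentence":
            return True
        if scope == "after" and phrase_pos < reaction_pos:
            return True
        if scope == "before" and phrase_pos > reaction_pos:
            return True
    return False


if __name__ == "__main__":
    print(has_negative_context("Lisinopril can cause angioedema.", "angioedema"))  # True
    print(has_negative_context("She developed a rash after cefepime.", "rash"))    # False
```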
In addition to not appearing within the scope of a negation term or negative context, each possible reaction also had to appear in the scope of “positive context” to be considered a true drug reaction. We found that appearing in the same sentence as the word “drug” was sufficient. Alternatively, the reaction could appear in the same sentence as a medication name, along with an indicator, such as “secondary to” or “in the setting of”, linking the medication to the reaction. The medication lexicon was generated by combining our institution’s local drug dictionary with a subset of terms from the standard terminology RxNorm.10,13 The positive context indicators were collected via manual review. If the clinical notes or discharge summary for an encounter contained at least one mention of a reaction, not negated, and with positive but not negative context, then that encounter was classified as an encounter with an allergic drug reaction.
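The positive-context rule and the encounter-level decision can be sketched as below. The reaction and medication sets are tiny stand-ins for the study's lexicons, `never_negated` is a placeholder for the NegEx and negative-context checks, and all function names are hypothetical.

```python
import re

# Small illustrative stand-ins for the reaction and medication lexicons.
REACTIONS = {"rash", "hives", "angioedema", "anaphylaxis", "itchy"}
MEDICATIONS = {"vancomycin", "cefepime", "morphine", "percocet"}
# Linking indicators quoted in the text above.
INDICATORS = ("secondary to", "in the setting of")


def never_negated(sentence):
    """Placeholder for the negation and negative-context checks."""
    return False


def sentence_is_positive(sentence, negated=never_negated):
    """Apply the sentence-level rule: reaction + positive context, not negated."""
    text = sentence.lower()
    tokens = set(re.findall(r"[a-z]+", text))
    if not tokens & REACTIONS:
        return False
    if negated(text):
        return False
    if "drug" in tokens:  # same sentence as the word "drug"
        return True
    has_medication = bool(tokens & MEDICATIONS)
    has_indicator = any(indicator in text for indicator in INDICATORS)
    return has_medication and has_indicator


def classify_encounter(documents, negated=never_negated):
    """An encounter is positive if any sentence in any document qualifies."""
    for document in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", document):
            if sentence_is_positive(sentence, negated):
                return True
    return False


if __name__ == "__main__":
    print(classify_encounter(["Diffuse rash secondary to vancomycin."]))
    print(classify_encounter([
        "After surgery, she received morphine and percocet for pain control. "
        "Since then, she has been extremely itchy."
    ]))
```

The second call returns `False` because the medication and the reaction sit in different sentences, so a purely sentence-level rule cannot link them.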
NLP Algorithm Testing
To test the NLP algorithm, using the same methodology as in 2005 and 2010, we chart reviewed records of inpatients hospitalized in 2007 and 2008 at MGH. See Figure 1. Medical records were randomly selected and reviewed by both a research assistant trained in identifying allergic drug reactions and a board-certified allergist/immunologist. The review sample size for each code depended on the total number of cases for that code. See Table 1. For most codes, at least 20% of the total cases were reviewed; when there were a large number of cases for a code (e.g., ICD-9-CM 693.0, 782.1, 995.2), approximately 10%-20% of cases were reviewed. We reviewed discharge summaries and notes from the primary team as well as notes from allergy/immunology and dermatology consultations when available.
Table 1:
Manually Reviewed Cases Identified by ICD-9-CM codes Used for NLP Algorithm Testing Set (2007 and 2008)
ICD-9-CM Code | 2007*: Total cases identified, n | 2007*: Randomly selected cases to review, n (%) | 2008*: Total cases identified, n | 2008*: Randomly selected cases to review, n (%) | 2007 and 2008*: Total cases identified, n | 2007 and 2008*: Randomly selected cases to review, n (%) |
---|---|---|---|---|---|---|
Eosinophilia (288.3) | 91 | 32 (35) | 178 | 47 (26) | 269 | 79 (29) |
Dermatitis due to drugs and medicines taken internally (693.0) | 262 | 52 (20) | 367 | 47 (13) | 629 | 99 (16) |
Specified Pruritus (698.8) | 30 | 21 (70) | 25 | 24 (96) | 55 | 45 (82) |
Unspecified Pruritus (698.9) | 120 | 42 (35) | 127 | 30 (24) | 247 | 72 (29) |
Allergic Urticaria (708.0) | 36 | 22 (61) | 102 | 19 (19) | 138 | 41 (30) |
Urticaria Unspecified (708.9) | 39 | 19 (49) | 38 | 23 (61) | 77 | 42 (55) |
Rash (782.1) | 556 | 87 (16) | 683 | 79 (12) | 1239 | 166 (13) |
Anaphylaxis (995.0) | 65 | 12 (19) | 47 | 19 (40) | 112 | 31 (28) |
Angioneurotic Edema (995.1) | 62 | 24 (39) | 77 | 18 (23) | 139 | 42 (30) |
Unspecified Adverse Effect due to Unspecified Drug (995.2) | 280 | 30 (11) | 489 | 41 (8) | 769 | 71 (9) |
Allergy Unspecified (995.3) | 87 | 18 (21) | 146 | 19 (13) | 233 | 37 (16) |
Total | 1628 | 359 (22) | 2279 | 366 (16) | 3907 | 725 (19) |
ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; NLP, Natural Language Processing
*Excluding chemotherapy desensitization cases
Because MGH has a large chemotherapy desensitization program that allows patients to receive a drug despite a history of allergy but requires repeat hospital admissions for each dose of the culprit drug, we did not consider drug desensitization when constructing the MTERMS algorithm and we manually removed desensitization-related visits prior to testing the sensitivity and specificity of the algorithm. This was justifiable given both the objective to identify new allergic drug reactions (not prior reactions with a desensitization plan) and that most hospitals in the United States do not perform drug desensitizations with comparable frequency.
We also compared our method with the algorithm described by Goss et al.9 Similar to our approach, if the clinical notes or discharge summary for an encounter contained at least one (non-negated) mention of an allergen, that algorithm classified the encounter as positive for an allergic drug reaction.
Statistical Analysis
We combined all manually reviewed results and assessment of MTERMS into a central SAS dataset. The number and percentage of cases reviewed for each year and each ICD-9-CM code were presented. The allergy status, determined by manual review, was treated as the gold standard. We calculated the positive predictive value (PPV) of all the ICD-9-CM codes together and each ICD-9-CM code individually in identifying allergic drug reactions. We then calculated the PPV (precision) of NLP in combination with ICD-9-CM codes in case identification. We further calculated sensitivity (recall), specificity, and F-measure (the weighted harmonic mean of the precision and recall) of the NLP algorithm.14 The definitions are as follows:
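Measures for evaluating ICD-9-CM codes:
PPV = chart-confirmed allergic drug reactions / reviewed cases
FPR = reviewed cases not confirmed as allergic drug reactions / reviewed cases
Measures for evaluating NLP:
PPV (precision) = true positives / (true positives + false positives)
Sensitivity (recall) = true positives / (true positives + false negatives)
Specificity = true negatives / (true negatives + false positives)
F-measure = 2 × precision × recall / (precision + recall)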
We determined these evaluation metrics for each of the ICD-9-CM codes. False positives were encounters that were classified as allergic drug reactions by MTERMS, but not by manual chart review. False negatives were encounters that were classified as allergic drug reactions by manual chart review, but not by MTERMS.
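As a worked illustration of these measures, the sketch below computes them from the four confusion-matrix counts, using the overall testing-set counts reported in Table 3 (259 true positives, 42 false positives, 76 false negatives, 348 true negatives). The function name `evaluate` is hypothetical, and the F-measure is assumed to weight precision and recall equally, which is consistent with the values reported in Table 3.

```python
def evaluate(tp, fp, fn, tn):
    """Return PPV, sensitivity, specificity and F-measure as fractions.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against manual chart review.
    """
    ppv = tp / (tp + fp)              # precision
    sensitivity = tp / (tp + fn)      # recall
    specificity = tn / (tn + fp)
    # Equal-weight harmonic mean of precision and recall (assumed weighting).
    f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "ppv": ppv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    overall = evaluate(tp=259, fp=42, fn=76, tn=348)
    for name, value in overall.items():
        print(f"{name}: {value:.0%}")
```

With these counts the rounded results are 86% PPV, 77% sensitivity, 89% specificity, and an F-measure of 81%. The sketch does not guard against empty denominators, which would need handling for codes with no positive predictions.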
RESULTS
Using ICD-9-CM Codes and Chart Review to Study Epidemiology of Allergic Drug Reactions
Using the validated drug allergy ICD-9-CM codes from our prior studies,1,2 we identified 1628 inpatient visits in 2007 and 2279 inpatient visits in 2008 with a possible allergic drug reaction. For 2007, excluding 27 chemotherapy desensitization cases, 359 of the 1628 (22.1%) possible allergic drug reactions during hospitalization were chart reviewed. See Table 1. Only 155 of these cases were confirmed as allergic drug reactions by chart review. In other words, using ICD-9-CM codes alone resulted in a PPV of 43% in identifying allergic drug reactions. See Table 2. The codes that identified the highest percentage of true allergic drug reactions by chart review were: Dermatitis due to drugs and medicines taken internally (693.0; 75%), angioneurotic edema (995.1; 71%), and unspecified adverse effect due to unspecified drug (995.2; 70%). For 2008, excluding 61 chemotherapy desensitization cases, 366 of the 2279 (16%) possible allergic drug reactions during hospitalization were chart reviewed. See Table 1. Allergic drug reactions were found in 180 (49%) of the reviewed charts. See Table 2. The codes that identified the highest percentage of true allergic drug reactions were: Dermatitis due to drugs and medicines taken internally (693.0; 83%), allergic urticaria (708.0; 74%), and anaphylaxis (995.0; 68%).
Table 2:
True Positive and Negative Cases Validated by Manual Chart Review of Selected Cases
ICD-9-CM Code | 2007*: Reviewed cases | 2007*: Positive cases (PPV; %) | 2008*: Reviewed cases | 2008*: Positive cases (PPV; %) | 2007 and 2008*: Reviewed cases | 2007 and 2008*: Positive cases (PPV; %) | 2007 and 2008*: Negative cases (FPR; %) |
---|---|---|---|---|---|---|---|
Eosinophilia (288.3) | 32 | 7 (22) | 47 | 18 (38) | 79 | 25 (32) | 54 (68) |
Dermatitis due to drugs and medicines taken internally (693.0) | 52 | 39 (75) | 47 | 39 (83) | 99 | 78 (79) | 21 (21) |
Specified Pruritus (698.8) | 21 | 2 (10) | 24 | 7 (29) | 45 | 9 (20) | 36 (80) |
Unspecified Pruritus (698.9) | 42 | 8 (19) | 30 | 5 (17) | 72 | 13 (18) | 59 (82) |
Allergic Urticaria (708.0) | 22 | 6 (27) | 19 | 14 (74) | 41 | 20 (49) | 21 (51) |
Urticaria Unspecified (708.9) | 19 | 5 (26) | 23 | 11 (48) | 42 | 16 (38) | 26 (62) |
Rash (782.1) | 87 | 30 (35) | 79 | 32 (41) | 166 | 62 (37) | 104 (63) |
Anaphylaxis (995.0) | 12 | 8 (67) | 19 | 13 (68) | 31 | 21 (68) | 10 (32) |
Angioneurotic Edema (995.1) | 24 | 17 (71) | 18 | 10 (56) | 42 | 27 (64) | 15 (36) |
Unspecified Adverse Effect due to Unspecified Drug (995.2) | 30 | 21 (70) | 41 | 21 (51) | 71 | 42 (59) | 29 (41) |
Allergy Unspecified (995.3) | 18 | 12 (67) | 19 | 10 (53) | 37 | 22 (60) | 15 (41) |
Total, Allergic Drug Reactions | 359 | 155 (43) | 366 | 180 (49) | 725 | 335 (46) | 390 (54) |
ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; PPV, positive predictive value; FPR, false positive rate
*Excluding chemotherapy desensitization cases
In total, 725 (19%) of the 3907 cases from 2007 and 2008 were chart reviewed. Allergic drug reactions were found in 335 of the reviewed charts, indicating that the PPV of ICD-9-CM codes was 46% (95% CI 43%, 50%). While the PPV of dermatitis due to drugs and medicines taken internally was high (693.0, 79%), along with anaphylaxis (995.0, 68%) and angioneurotic edema (995.1, 64%), many ICD-9-CM codes had a PPV below 40% when used alone. See Table 2.
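The confidence interval around the overall PPV can be checked with a short calculation. The text does not state which interval method was used; the sketch below assumes a normal-approximation (Wald) interval, which reproduces the reported bounds after rounding. The function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    Returns (point_estimate, lower_bound, upper_bound) as fractions.
    z=1.96 corresponds to a two-sided 95% interval.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    # 335 confirmed allergic drug reactions among 725 reviewed charts.
    estimate, lower, upper = wald_interval(335, 725)
    print(f"PPV {estimate:.0%} (95% CI {lower:.0%}, {upper:.0%})")
```

For small per-code samples (for example, 31 reviewed anaphylaxis cases) a Wilson or exact interval would generally be preferable to the Wald approximation.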
NLP System Performance
Overall, the NLP algorithm in combination with ICD-9-CM codes retrieved 301 possible positive cases from the testing set, of which 259 were confirmed by manual review as true positives, resulting in a PPV of 86% (range: 69%-100%). Among the 335 positive cases filtered by ICD-9-CM codes and confirmed by chart review, NLP identified 259 true cases, resulting in a recall/sensitivity of 77% (range: 26%-100%). NLP sensitivity was >70% for all ICD-9-CM codes except specified pruritus (698.8), anaphylaxis (995.0), and angioneurotic edema (995.1). Among the 390 negative cases, NLP achieved an overall specificity of 89% (range: 69%-100%), with a specificity below 85% for only three ICD-9-CM codes: eosinophilia (288.3), anaphylaxis (995.0), and unspecified adverse effect due to unspecified drug (995.2). See Table 3.
Table 3:
Sensitivity, Specificity and PPV by Specific ICD-9-CM Code using NLP Algorithm in combination with ICD codes
Code | Sensitivity/recall | Specificity | PPV (Precision) | F-measure |
---|---|---|---|---|
Overall | 77% (259/335) | 89% (348/390) | 86% (259/301) | 81% |
Eosinophilia (288.3) | 100% (25/25) | 80% (43/54) | 69% (25/36) | 82% |
Dermatitis due to drugs and medicines taken internally (693.0) | 90% (70/78) | 86% (18/21) | 96% (70/73) | 93% |
Specified Pruritus (698.8) | 56% (5/9) | 100% (36/36) | 100% (5/5) | 71% |
Unspecified Pruritus (698.9) | 77% (10/13) | 100% (59/59) | 100% (10/10) | 87% |
Allergic Urticaria (708.0) | 95% (19/20) | 86% (18/21) | 86% (19/22) | 91% |
Urticaria Unspecified (708.9) | 75% (12/16) | 85% (22/26) | 75% (12/16) | 75% |
Rash (782.1) | 73% (45/62) | 93% (97/104) | 87% (45/52) | 79% |
Anaphylaxis (995.0) | 57% (12/21) | 80% (8/10) | 86% (12/14) | 69% |
Angioneurotic Edema (995.1) | 26% (7/27) | 93% (14/15) | 88% (7/8) | 40% |
Unspecified Adverse Effect due to Unspecified Drug (995.2) | 86% (36/42) | 69% (20/29) | 80% (36/45) | 83% |
Allergy Unspecified (995.3) | 82% (18/22) | 87% (13/15) | 90% (18/20) | 86% |
PPV, positive predictive value; NLP, Natural Language Processing; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification
Although we specifically excluded chemotherapy desensitization encounters, other drug desensitization cases were not captured by our NLP algorithm. MTERMS was generally unable to distinguish between initial and subsequent admissions for desensitization. Other notable errors occurred in cases where the cause of the symptoms was unclear. In these cases, the documents contained a description of the clinician’s reasoning, presenting evidence for and against each option. As such, if the possibility of an allergic drug reaction is justified enough in a document, our algorithm will classify the encounter as positive for an allergic drug reaction, even if the clinician eventually settles on another option, and vice versa.
Additional false positives were due to our algorithm misidentifying reactions that occurred in the past, or otherwise not temporally related to the current admission; for example, “patient reportedly had developed a skin rash (mainly on lower extremity) after starting clindamycin and primaquine on [date] which had resolved after stopping those meds.” In addition, our algorithm occasionally missed negations of reactions; for example, “In response to Zosyn, she has never developed any itchiness, facial flushing, hives, rash, nausea, angioedema, difficulty breathing, chest or throat discomfort”, where “never” and “angioedema” are too far away in the sentence to have been considered related by NegEx.
Most other false negatives resulted from the reaction and culprit drug(s) appearing in separate sentences; e.g., “After surgery, she received morphine and percocet for pain control. Since then, she has been extremely itchy.” The algorithm processed each sentence individually and was unable to link allergens and reactions across sentence boundaries. Lastly, culprit drugs missing from the medication lexicon were another cause of false negatives. These included misspellings (“Gardisil” for “Gardasil”) and names of drug classes (“ACEI”, “antibiotics”).
DISCUSSION
Allergic drug reactions are a consequence of modern medical practice. Identification of allergic drug reactions is important to improve patient care and minimize patient risk.15 We developed and validated a novel NLP algorithm for extracting information from clinical free-text documents throughout the EHR, including full note search. We started with a lexicon of reaction terms and used encounters from the years 2005 and 2010 to train the algorithm. The algorithm was subsequently evaluated against the current gold standard, manual specialist chart review, and showed a high performance for identification of allergic drug reactions in cases stratified by ICD codes with an overall sensitivity of 77% and specificity of 89%. These results suggest that NLP has high utility in studying the epidemiology of allergic drug reactions, especially considering the expanding use of EHRs.
There are still challenges associated with using NLP to identify allergic drug reactions in free text. However, addressing these challenges is essential given that a large percentage of EHR data are in a free-text format.16 Our algorithm relates allergens and reactions at the sentence level to specifically target this challenge. While this is sufficient for processing clinical narrative related to allergic drug reactions, where allergens and reactions appear in close proximity, this is less suitable for processing large unstructured documents, which have more complex narrative structures. In addition, the NLP method that we used in this study was rule-based, which largely relied on lexical lookup for named entity recognition and relevant context information. However, lexicons for one task may not always be suitable for another. For example, when processing allergy sections for the purpose of encoding free-text allergy entries, an allergy to “antibiotics” may be too vague to be clinically useful, so it may not matter that such a term is not included in the lexicon, whereas in this study, it still counts as evidence for a drug reaction.
Our algorithm showed important differences when compared to previous work done by Goss et al. on identifying allergy information using urgent care notes without ICD-9 codes.9 While that study also used MTERMS, their algorithm focused on finding all allergens and reactions mentioned in all notes, including those in semi-structured allergy sections of the EHR, and they included side effect reactions that were not immune-mediated or allergic in nature. Because following this practice would have led to a falsely elevated estimation of allergic drug reactions, we excluded allergies listed in the allergy sections, as well as the other semi-structured lists and instructions, in developing our algorithm to more accurately capture new allergic drug reactions. However, if an allergen was identified in the unstructured portion of any note or discharge summary for an encounter, we classified the encounter as due to drug allergy. Although we treated our task as a binary classification problem (drug allergy: yes or no), in many cases, the answer is much more nuanced, and arriving at an answer requires complex clinical judgment. There were many cases where the etiology of the reaction was unclear, with varying levels of likelihood assigned to allergic drug reaction. Machine learning methods, which can assign probabilities to possible outcomes, may be more suitable than rule-based NLP for capturing the uncertainty in these cases. However, training a machine learning algorithm to gauge uncertainty requires significantly more training data than we were able to create for this study.
Drug allergy epidemiology data in the United States are lacking, largely because of the challenges with capturing drug allergy cases in large, representative datasets. Our work validating specific ICD-9-CM codes to identify allergic drug reactions in both the ED and inpatient settings provided tools to begin to address this important knowledge gap.1,2 The current study builds on the initial tools while at the same time balancing the effort required to gain this knowledge. Our NLP algorithm, validated by manual chart review, enables identification of allergic drug reactions quickly and accurately. Implementation of diagnostic coding and NLP to identify allergic drug reactions over time and across medical systems would greatly advance understanding of the epidemiology of allergic drug reactions.
The field of NLP is growing at a remarkably fast pace and becoming mature enough for practical applications to have a significant clinical impact.17 Our NLP algorithm to identify allergic drug reactions was associated with substantial time savings compared to manual chart review. While these methods can be easily applied to epidemiologic research, they can also be used to improve patient care by facilitating advanced queries to accurately determine patients’ allergies. Prior studies have shown the strength of using NLP to efficiently study large patient populations and improve patient safety.15,18 Computer algorithms like the one we described could be modified to find all relevant information within the EHR on allergic drug reactions for a specific patient. Providing immediate access to an accurate drug allergy history from a large EHR would enable safe medication prescribing practices in real time for patients, regardless of their ability to recall or communicate their histories before the encounter was complete.
Overall, allergic drug reactions were identified in 46% of the charts reviewed in 2007 and 2008, which is higher than the 30% rate we found in our prior work from 2005 and 2010.2 The difference could be explained by variability in billing code use among providers or by true year-to-year changes in the frequency of allergic drug reactions. While we identified variability in allergic drug reaction coding over time, the most accurate code remained dermatitis due to drug (693.0, 79%). Using NLP, the ICD-9 code for angioneurotic edema had the lowest sensitivity (26%) but high specificity (93%). This is likely explained by MTERMS missing cases for two main reasons: the note referred to an ACE inhibitor or ACEI allergy without naming a specific drug, or angioedema and the culprit drug were mentioned in different sentences. Similarly, the ICD-9 code for anaphylaxis had a low sensitivity because the cause of the reaction was unclear in the notes or because the reaction and the culprit drug were in different sentences. In general, ICD-9-CM codes are non-specific and coding variation is expected, but the accuracy of research using billing codes can be improved by using multiple codes or combining codes with other clinical information.19 The ICD-10 system was developed specifically to increase billing detail and increased the number of diagnostic codes from approximately 14,000 with ICD-9-CM coding to nearly 70,000 with ICD-10 coding.20 With ICD-10’s expanded codes, and an accurate NLP algorithm, national drug allergy epidemiologic studies using EHRs will soon be possible.
A limitation of this study is that the testing sample consisted only of inpatient records from a single large academic center, and thus our results may not be generalizable to other areas of the hospital or practice locations. Even within our own institution, findings were different in the ED compared to the inpatient setting1,2 and varied from year to year. While allergic reactions were defined by experts using a standard definition, causality was not systematically determined. Terms such as itching or pseudoallergic could lead to misclassification of allergic drug reactions. Although analysis of billing codes can be helpful in identifying allergic drug reactions, it is limited by its reliance on the provider correctly coding the event, and even the most accurate code had a sensitivity of just 80%. To compile large drug allergy cohorts using billing codes, therefore, extensive chart review continues to be necessary to supplement computer-based tools like NLP.
In summary, we present a novel research tool that offers investigators a way to effectively and accurately study the epidemiology of allergic drug reactions using the EHR while simultaneously decreasing the effort required for manual chart review. While a small number of cases were miscategorized, the NLP algorithm processed free text entry of allergy information and provided a high level of sensitivity and specificity when compared to manual chart review alone. Future studies can incorporate NLP with ICD-10 codes to improve identification of allergic drug reactions in large national datasets to advance knowledge of drug allergy epidemiology. As technology continues to grow rapidly, advanced artificial intelligence methods will improve drug allergy research to encompass the complexity of these reactions. Ideally, these algorithms could also be implemented in clinical settings to provide physicians readily accessible allergy information on a specific patient’s drug reactions during patient encounters.
Supplementary Material
Highlights:
What is already known about this topic?
The epidemiology of allergic drug reactions has been difficult to define because it is difficult to identify true cases in large datasets.
What does this article add to our knowledge?
We describe a novel informatics research tool, natural language processing in combination with diagnosis codes, that improved our ability to study the epidemiology of allergic drug reactions using the electronic health record (EHR).
How does this study impact current management guidelines?
Natural language processing can be used to identify allergic drug reactions. An accurate understanding of allergic drug reaction epidemiology is important to improve patient diagnosis and treatment.
Acknowledgement
We thank Emily Huebner, MS, for editorial assistance.
Funding
This study was supported by CRICO, the NIH, Partners Healthcare, and the Agency for Healthcare Research and Quality (grants R01HS024264 and R01HS025375).
Conflict of Interest: Dr. KG Blumenthal reports grants from the National Institutes of Health, Massachusetts General Hospital, and the American Academy of Allergy, Asthma and Immunology during the conduct of the study. Dr. Li Zhou reports grants from the National Institutes of Health, Massachusetts General Hospital, Partners Healthcare, and the Agency for Healthcare Research and Quality (AHRQ) during the conduct of the study. AB, KHL, YL, RRS, and CAC have no conflicts to disclose.
Abbreviations:
- ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification
- ED: Emergency Department
- NLP: Natural Language Processing
- EHR: Electronic Health Record
- RPDR: Research Patient Data Registry
- MTERMS: Medical Text Extraction, Reasoning, and Mapping System
- CI: Confidence Interval
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
REFERENCES
- 1. Saff RR, Camargo CA Jr., Clark S, Rudders SA, Long AA, Banerji A. Utility of ICD-9-CM Codes for Identification of Allergic Drug Reactions. J Allergy Clin Immunol Pract. 2016;4(1):114–9.e1.
- 2. Saff RR, Li Y, Santhanakrishnan N, Camargo CA, Blumenthal KG, Zhou L, et al. Identification of Inpatient Allergic Drug Reactions Using ICD-9-CM Codes. In submission 2018.
- 3. Adler-Milstein J, Holmgren AJ, Kralovec P, Worzala C, Searcy T, Patel V. Electronic health record adoption in US hospitals: the emergence of a digital “advanced use” divide. J Am Med Inform Assoc. 2017;24(6):1142–8.
- 4. News HM. Only 4% of Hospitals Don’t use EHRs. 2016. Accessed September 28, 2019.
- 5. Murff HJ, FitzHenry F, Matheny ME, Gentry N, Kotter KL, Crimin K, et al. Automated identification of postoperative complications within an electronic medical record using natural language processing. JAMA. 2011;306(8):848–55.
- 6. Zeng QT, Goryachev S, Weiss S, Sordo M, Murphy SN, Lazarus R. Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system. BMC Med Inform Decis Mak. 2006;6:30.
- 7. Denny JC, Choma NN, Peterson JF, Miller RA, Bastarache L, Li M, et al. Natural language processing improves identification of colorectal cancer testing in the electronic medical record. Med Decis Making. 2012;32(1):188–97.
- 8. Wang X, Hripcsak G, Markatou M, Friedman C. Active computerized pharmacovigilance using natural language processing, statistics, and electronic health records: a feasibility study. J Am Med Inform Assoc. 2009;16(3):328–37.
- 9. Goss FR, Plasek JM, Lau JJ, Seger DL, Chang FY, Zhou L. An evaluation of a natural language processing tool for identifying and encoding allergy information in emergency department clinical notes. AMIA Annu Symp Proc. 2014;2014:580–8.
- 10. Zhou L, Plasek JM, Mahoney LM, Karipineni N, Chang F, Yan X, et al. Using Medical Text Extraction, Reasoning and Mapping System (MTERMS) to process medication information in outpatient clinical notes. AMIA Annu Symp Proc. 2011;2011:1639–48.
- 11. Goss FR, Lai KH, Topaz M, Acker WW, Kowalski L, Plasek JM, et al. A value set for documenting adverse reactions in electronic health records. J Am Med Inform Assoc. 2018;25(6):661–9.
- 12. Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. A simple algorithm for identifying negated findings and diseases in discharge summaries. J Biomed Inform. 2001;34(5):301–10.
- 13. Zhou L, Plasek JM, Mahoney LM, Chang FY, DiMaggio D, Rocha RA. Mapping Partners Master Drug Dictionary to RxNorm using an NLP-based approach. J Biomed Inform. 2012;45(4):626–33.
- 14. Chinchor N. MUC-4 Evaluation Metrics. Proceedings of the 4th Conference on Message Understanding. 1992.
- 15. Wong A, Plasek JM, Montecalvo SP, Zhou L. Natural Language Processing and Its Implications for the Future of Medication Safety: A Narrative Review of Recent Advances and Challenges. Pharmacotherapy. 2018;38(8):822–41.
- 16. Griffon N, Charlet J, Darmoni SJ. Managing free text for secondary use of health data. Yearb Med Inform. 2014;9:167–9.
- 17. Neveol A, Zweigenbaum P. Clinical Natural Language Processing in 2014: Foundational Methods Supporting Efficient Healthcare. Yearb Med Inform. 2015;10(1):194–8.
- 18. Fong A, Harriott N, Walters DM, Foley H, Morrissey R, Ratwani RR. Integrating natural language processing expertise with patient safety event review committees to improve the analysis of medication events. Int J Med Inform. 2017;104:120–5.
- 19. Davis RL, Gallagher MA, Asgari MM, Eide MJ, Margolis DJ, Macy E, et al. Identification of Stevens-Johnson syndrome and toxic epidermal necrolysis in electronic health record databases. Pharmacoepidemiol Drug Saf. 2015;24(7):684–92.
- 20. Centers for Disease Control and Prevention. International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM). 2019. Accessed February 4, 2019.