Abstract
Objective: To evaluate the accuracy, validity, and clinical usefulness of medication error alerts generated by an alerting system that uses outlier detection screening.
Materials and Methods: Five years of clinical data were extracted from an electronic health record system for 747 985 patients who had at least one visit during 2012–2013 at practices affiliated with 2 academic medical centers. Data were screened using the system to detect outliers suggestive of potential medication errors. A sample of 300 charts was selected for review from the 15 692 alerts generated. A coding system was developed and codes assigned based on chart review to reflect the accuracy, validity, and clinical value of the alerts.
Results: Three-quarters (76.2%) of the chart-reviewed alerts generated by the screening system were found to be valid, identifying potential medication errors. Of these valid alerts, the majority (75.0%) were judged clinically useful in flagging potential medication errors or issues.
Discussion: A clinical decision support (CDS) system that used a probabilistic, machine-learning approach based on statistically derived outliers to detect medication errors generated potentially useful alerts with a modest rate of false positives. The performance of such a surveillance and alerting system is critically dependent on the quality and completeness of the underlying data.
Conclusion: The screening system was able to generate alerts that might otherwise be missed with existing CDS systems and did so with a reasonably high degree of alert usefulness when subjected to review of patients’ clinical contexts and details.
Keywords: medication alert systems, machine learning, electronic health records, patient safety, clinical decision support
BACKGROUND AND SIGNIFICANCE
Prescription drug errors cause substantial morbidity, mortality, and wasteful health care costs, estimated at more than $20 billion annually in the United States.1–4 Current approaches to minimizing such errors include various clinical decision support (CDS) alerting systems, but these often identify only a small fraction of errors and also suffer from high alerting rates, resulting in “alert fatigue.”5–7 In addition, they can overlook issues that were not anticipated or programmed into the decision support software rules.8 Some tragic errors fall into this category, such as prescribing one medication that sounds like another for a patient who does not have the condition targeted by the intended drug, or giving a neonate an adult dosage of a medication.
MedAware (Raanana, Israel) is a commercial software screening system developed for identification and prevention of prescription errors.9 It uses machine-learning algorithms to identify and alert for potential medication prescription errors.10 Analyzing historical electronic medical records, the system automatically generates, for each medication, a computational model that captures the population that is likely to be prescribed the medication and the clinical environment in which it is likely to be prescribed. This model can then be used to identify prescriptions that are significant statistical outliers given patients’ clinical situations, ie, medications that are rarely or never prescribed to patients in similar situations, such as birth control pills to a baby boy. Such prescriptions are flagged by the system as potential medication errors.
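MedAware’s models are proprietary and this paper does not specify the underlying algorithms. Purely as an illustration of the general outlier-detection idea described above, the following minimal Python sketch trains a hypothetical per-medication classifier on synthetic data and flags prescriptions whose estimated likelihood is extremely low; the features, labels, threshold, and choice of model are all assumptions, not MedAware’s method.

```python
# Illustrative sketch only: learn, for one medication, who typically
# receives it, then flag prescriptions whose estimated likelihood is tiny
# (echoing the digoxin-style example quoted later in the Methods).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 5000

# Hypothetical patient features: [age_years, is_male, has_heart_failure]
age = rng.uniform(0, 90, n)
is_male = rng.integers(0, 2, n)
heart_failure = rng.random(n) < 0.05
X = np.column_stack([age, is_male, heart_failure])

# Synthetic label: this drug is mostly given to older patients with heart failure
y = ((age > 50) & heart_failure & (rng.random(n) > 0.1)).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def outlier_prescription(patient_features, threshold=0.01):
    """Flag a prescription as an outlier when its learned likelihood is extreme."""
    p = model.predict_proba(np.asarray(patient_features).reshape(1, -1))[0, 1]
    return p < threshold, p

# Prescribing this drug to a healthy infant should be a marked outlier.
flagged, likelihood = outlier_prescription([0.5, 1, 0])
print(flagged, round(likelihood, 4))
```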
To evaluate MedAware’s performance in identifying medication errors, the Brigham and Women’s Hospital (BWH) Center for Patient Safety Research and Practice conducted a retrospective study of MedAware alerts using a large dataset from the Partners HealthCare BWH and Massachusetts General Hospital (MGH) homegrown outpatient electronic health record (EHR). Given well-described problems related to over-alerting and alert fatigue, we were particularly interested in evaluating the extent to which MedAware alerts represented valid and useful alerts, and we designed a study to systematically evaluate alerts generated by the MedAware system.
OBJECTIVE
The objective of this study was to evaluate the accuracy, validity, and clinical value of potential medication error alerts generated by the MedAware system.
MATERIALS AND METHODS
Study setting and patient population
The patient population of this study comprised outpatients of all ages who had at least one outpatient encounter with a clinician affiliated with BWH or MGH (2 large urban academic medical centers in Boston, with a total of 1700 beds) during the 2-year period from January 1, 2012 to December 31, 2013. The overall study, including MedAware access to the data and the chart review and analysis conducted by BWH research staff, was approved by the Partners Human Research Committee (Institutional Review Board protocol 2014P001678).
Data collection and transfer
Retrospective clinical and encounter data were extracted from existing databases for this cohort of patients for a 5-year period from January 1, 2009, to December 31, 2013. Data collected included demographics, diagnoses, problem lists, outpatient and inpatient encounters, encounter clinicians, clinician specialties, procedures, medications, allergies, vital signs, and selected blood tests. Patient and clinician names and medical record numbers were removed from the dataset, and a random study ID was assigned to each patient and each clinician. A limited data set was sent to MedAware through a secure transfer system (password-protected and encrypted) for analysis.
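The de-identification pipeline is not described beyond the steps stated above. A minimal sketch of those stated steps, assuming hypothetical identifiers (the actual Partners tooling is not published):

```python
# Sketch of the de-identification step: names and MRNs are dropped, each
# patient/clinician receives a random study ID, and the crosswalk is kept
# locally so approved staff can later re-identify charts for review.
import secrets

def build_crosswalk(record_numbers):
    """Assign a random, non-derivable study ID to each record number."""
    return {mrn: "S" + secrets.token_hex(8) for mrn in record_numbers}

patients = ["MRN001", "MRN002", "MRN003"]        # hypothetical MRNs
crosswalk = build_crosswalk(patients)            # retained locally at BWH
deidentified_ids = list(crosswalk.values())      # included in the limited data set
```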
MedAware alert system
The database was screened by the MedAware software, and alerts were generated. Each alert included a short textual description to provide a user-friendly explanation and enable clinicians to understand and track the reasoning underlying it (eg, an accompanying explanation could read as follows: “DIGOXIN is prescribed while patient is an adult or younger, doesn’t have cardiac dysrhythmias, ischemic heart disease, congestive heart failure, and similar drug wasn’t used before.”).
The MedAware system generates 3 distinct types of alerts (an illustrative sketch of a dosage-outlier check follows the list):
Clinical outliers. Medication is a marked outlier from patient’s characteristics (eg, prescribing birth control for an infant boy).
Time-dependent irregularities. Changes in blood test results indicate that a current medication is an outlier from a patient’s profile (eg, thrombocytopenia in a patient on anticoagulants).
Dosage outliers. Medication dosage is an outlier from the machine-learned dosage distribution of the medication in the population and/or the patient’s own history (eg, 180 mg dose of OxyContin as a high-dose outlier).
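The following minimal sketch illustrates the third category: a new dose is compared against the empirical dose distribution learned from historical prescriptions of the same medication. The percentile cutoff and example doses are illustrative assumptions, not MedAware’s parameters.

```python
# Illustrative dosage-outlier check (cutoff and data are assumptions).
import numpy as np

# Hypothetical historical doses (mg) for one medication, eg OxyContin
historical_doses = np.array([10, 20, 20, 40, 40, 40, 80, 80])

def dosage_alert(new_dose_mg, history, upper_pct=99):
    """Alert when a dose exceeds a high percentile of the learned distribution."""
    return new_dose_mg > np.percentile(history, upper_pct)

print(dosage_alert(180, historical_doses))  # True: 180 mg is a high-dose outlier
```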
Chart review analysis
The study’s patient cohort was randomly split into 2 cohorts of similar size. A training cohort was used to generate MedAware’s individual medication models, and a simulation cohort was used to test the performance of the models. For those areas where there were not sufficient numbers of patients to produce valid models, the BWH data were augmented by standard practice models from previous work done by MedAware with other institutions. The alerts generated by the models on the simulation cohort provided the alerts evaluated in this study. The alert results data were then securely transferred back from MedAware to the BWH research team.
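A minimal sketch of this random half-and-half split, assuming patients are identified by study IDs:

```python
# Sketch of the cohort split described above: one half trains the
# per-medication models, the other generates the alerts that were evaluated.
import random

def split_cohort(patient_ids, seed=42):
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]  # (training cohort, simulation cohort)

training_cohort, simulation_cohort = split_cohort(range(747_985))
```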
Analysis was conducted on the number and types of alerts generated. In order to assess the accuracy and clinical relevance of the alerts generated by MedAware, an enriched random sample of 300 alerts was selected for manual patient chart review by the BWH research team. The enriched random sample was selected to represent the distribution of alert category types (clinical outliers, time-dependent, dosage outliers), the most frequently occurring alerts, and a random selection across the full dataset.
Alerts were separated according to their category (clinical outliers, time-dependent, dosage outliers), and frequency counts were calculated for each unique alert type within each category. Every alert was assigned a random number. The sample was enriched by identifying the 10 most frequently occurring alert types within each category, for a total of 30 alert types; this was done so that the results would better reflect the types of alerts that clinicians typically receive from this system. For each of the 30 alert types, the 5 alerts with the lowest assigned random numbers were selected for chart review. Because some categories had fewer than 10 alert types and some types had fewer than 5 alerts, this process yielded 135 alerts selected on the basis of frequency. To reach the target of 300 charts for review, the remaining 165 cases were selected from the remaining alerts with the lowest assigned random numbers.
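A sketch of this sampling scheme, assuming each alert is represented as a dict with "category" and "alert_type" fields (an illustrative representation, not the study’s actual data structures):

```python
# Enriched sampling: random number per alert; 10 most frequent alert types
# per category contribute their 5 lowest-numbered alerts; remaining slots
# up to the target of 300 are filled by the lowest-numbered leftovers.
import random
from collections import Counter, defaultdict

def enriched_sample(alerts, per_type=5, top_types=10, target=300, seed=1):
    rng = random.Random(seed)
    order = sorted(alerts, key=lambda a: rng.random())  # random number per alert

    # The most frequently occurring alert types within each category.
    counts = defaultdict(Counter)
    for a in alerts:
        counts[a["category"]][a["alert_type"]] += 1
    frequent = {t for c in counts.values() for t, _ in c.most_common(top_types)}

    # Up to 5 lowest-numbered alerts for each frequent type.
    chosen, taken = [], Counter()
    for a in order:
        if a["alert_type"] in frequent and taken[a["alert_type"]] < per_type:
            taken[a["alert_type"]] += 1
            chosen.append(a)

    # Fill the remaining slots with the lowest-numbered leftover alerts.
    picked = set(map(id, chosen))
    chosen += [a for a in order if id(a) not in picked][: target - len(chosen)]
    return chosen
```

In the study, the first stage yielded 135 frequency-based alerts and the second stage added the remaining 165 randomly selected alerts.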
Once the random sample of charts was identified, selected BWH research staff had access to the crosswalk of study IDs and medical record numbers in order to identify the specific patient charts for review in the EHRs. The BWH research staff conducting chart reviews had participated in data protection training and had been approved by the Institutional Review Board. No one from MedAware had access to any patient medical record identifiers or charts.
Patient charts were reviewed to determine:
Whether the alert was accurate based on the structured and coded information that was available in the data provided to MedAware. We defined an alert as accurate if it correctly fired based on the data in the database provided to MedAware.
Whether the alert was clinically valid based on the clinical data in the patient’s EHR. We defined an alert as valid if it was accurate based on available structured/coded data, and more detailed manual chart review indicated that it appropriately captured actual clinical situations. An invalid alert was one in which additional information available upon manual chart review indicated that it was not appropriate (eg, it fired because the patient was not diabetic based on structured data, but chart review identified the patient as diabetic in the free-text notes).
Whether the alert was clinically useful, which we defined as contributing potentially useful additional information to the care of the patient that could influence the caregiver to change the drug or remind them to consider other important clinical information.
A coding scheme was developed through a process of expert opinion, discussion, and consensus by the research team to reflect the accuracy and validity of alerts given the clinical information presented in the patient charts upon manual review (Supplementary Appendix I). A detailed coding manual was created and used to support consistent coding of the alerts across the chart reviewers. Alerts were first categorized as eligible or ineligible for review. Eligible cases were then coded as accurate or inaccurate based on the structured clinical data available in the record. Accurate cases were assessed to determine whether they were clinically valid in reflecting a potential medication error. The valid alerts were then assessed for their level of usefulness: a case was coded as high value when the alert uncovered a significant clinical issue that was not otherwise detected; medium value when it was a true clinical outlier but a clinical rationale for selecting the medication was documented in the record (making the alert of questionable usefulness); and less value when the alert did not appear to contribute additional useful information for patient management. The decision flow is sketched below.
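This is a minimal sketch of the coding hierarchy just described; the full coding manual is in Supplementary Appendix I, and the function and field names here are illustrative assumptions.

```python
# Sketch of the alert-coding decision flow: eligibility -> accuracy ->
# clinical validity -> level of clinical value.
def code_alert(review):
    if not review["eligible"]:
        return "ineligible"
    if not review["accurate_on_structured_data"]:
        return "inaccurate"
    if not review["clinically_valid"]:
        return "not clinically valid"
    return review["value"]  # one of: "high", "medium", "less"

example = {"eligible": True, "accurate_on_structured_data": True,
           "clinically_valid": True, "value": "high"}
print(code_alert(example))  # 'high'
```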
Three reviewers (LAV, MV, LW) individually examined charts. Uncertainties or disagreements between reviewers were resolved through consensus discussion among all members of the research team and consultation with a clinician.
Because development of MedAware alerts is an iterative process (ie, based on learning from outliers and assessment of their relevance), the process of chart review and coding of the alerts was conducted using 3 versions of the MedAware system (versions 1.0, 2.0, 2.5). The system was refined across these versions based on what was learned through the chart review process regarding data quality and the MedAware algorithm (Figure 1). Version 1.0 was the original set of alerts, version 2.0 reflected changes as a result of data issue fixes and some algorithm refinement, and version 2.5 removed alerts that were generated while patients were inpatients. Inpatient cases were removed because there were not sufficient data about inpatient medications and care available in the outpatient EHR system to evaluate these alerts. Chart review results are presented for each of the 3 versions evaluated.
Figure 1.
Process of iterative development and assessment of MedAware system alerts
RESULTS
A total of 747 985 patients had at least one outpatient visit with a provider affiliated with BWH or MGH from January 1, 2012 to December 31, 2013. MedAware generated 15 692 alerts on the simulation cohort (373 992 patients) using patient data across the 5-year study period. These 15 692 alerts represented 1706 unique alert types. The overall distribution of alert categories was clinical outliers, 29.3%; time-dependent, 66.8%; and dosage outliers, 3.9%.
The research team’s assessment and categorization of these alerts reflect the numerous challenges and limitations identified in working with existing real-world structured clinical data residing in EHR systems. For example, medication start and stop dates did not always accurately reflect when patients had active prescriptions. Another issue was that medications could be added to medication lists through medication reconciliation even though they were not actually prescribed through this EHR system, making it difficult at times to determine accurate prescription dates. There were cases where the care provided within the Partners HealthCare system was limited to a single specialty referral visit (eg, orthopedic), making it difficult to assess the accuracy of an alert because additional clinical information likely resided in the record of a primary care physician outside our system. As noted above, some diagnoses were discussed in free-text notes but had not been added by providers to any structured data fields (eg, problem list or ICD-9 diagnosis); in such cases, the MedAware system could generate an alert that was technically accurate given the available structured data but was not valid for the clinical situation (as in the example above, an alert regarding insulin in the absence of a diabetes diagnosis, when in fact the diagnosis was mentioned in a free-text note).
Table 1 details the distribution of the 300 alerts with chart reviews across the different alert assessment codes for each MedAware version.
Table 1.
Comparison of alert distribution across alert assessment categories: version 1.0 (original), version 2.0 (data issues/revised algorithm), and version 2.5 (without inpatient cases)
| Alert Assessment Category and Code^a | Version 1.0 Alert Count (% of Total) | Version 2.0 Alert Count (% of Total) | Version 2.5 Alert Count (% of Total) |
|---|---|---|---|
| Data Issues | | | |
| Ineligible | | | |
| 1 Limited PHS care | 15 (5.0) | 6 (3.5) | 5 (4.0) |
| 2 Pre-2009 | 24 (8.0) | 0 (0) | 0 (0) |
| Total | 39 (13.0) | 6 (3.5) | 5 (4.0) |
| Inaccurate | | | |
| 3 MedAware data issues | 29 (9.7) | 0 (0) | 0 (0) |
| 4 BWH data issues | 9 (3.0) | 4 (2.4) | 4 (3.2) |
| Total | 38 (12.7) | 4 (2.4) | 4 (3.2) |
| Not Clinically Valid | | | |
| 5a Free text | 25 (8.3) | 17 (10.1) | 14 (11.1) |
| 5b Inaccurate start/stop | 19 (6.3) | 14 (8.3) | 7 (5.5) |
| Total | 44 (14.6) | 31 (18.4) | 21 (16.7) |
| Total Data Issues | 121 (40.3) | 41 (24.3) | 30 (23.8) |
| Clinically Valid Alerts | | | |
| Less Value | | | |
| 6a Actively managed | 37 (12.3) | 31 (18.3) | 17 (13.5) |
| 6b Small resolving issue | 5 (1.7) | 5 (2.9) | 5 (4.0) |
| 6c Planned titration | 2 (0.7) | 1 (0.6) | 1 (0.8) |
| 6d Rare event/SOC | 10 (3.3) | 3 (1.8) | 1 (0.8) |
| Total | 54 (18.0) | 40 (23.7) | 24 (19.1) |
| Medium Value | | | |
| 7 Clinical outlier; no doc | 1 (0.3) | 1 (0.6) | 1 (0.8) |
| 8a Off-label Rx | 20 (6.7) | 10 (5.9) | 10 (7.9) |
| 8b Extreme clinical | 11 (3.7) | 9 (5.3) | 7 (5.5) |
| 8c Concurrent doses | 2 (0.7) | 0 (0) | 0 (0) |
| Total | 34 (11.4) | 20 (11.8) | 18 (14.2) |
| High Value | | | |
| 9a Med contraindicated | 55 (18.3) | 45 (26.6) | 31 (24.6) |
| 9b Wrong dose/med/pt | 19 (6.3) | 8 (4.7) | 8 (6.4) |
| 9c IT system bug | 17 (5.7) | 15 (8.9) | 15 (11.9) |
| Total | 91 (30.3) | 68 (40.2) | 54 (42.9) |
| Total Clinically Valid | 179 (59.7) | 128 (75.7) | 96 (76.2) |
| Total Alerts | 300 (100) | 169 (100) | 126 (100) |
^a See Supplementary Appendix I for a complete list of code definitions and examples.
PHS: Partners HealthCare Systems; BWH: Brigham and Women’s Hospital; SOC: standard of care; doc: documentation; Rx: prescription; med: medication; pt: patient; IT: information technology.
Table 2 summarizes the chart review alerts that were classified as data-related issues vs those determined to be clinically valid, along with the perceived clinical value of the valid alerts across the 3 MedAware system versions. Clinically valid alerts generated by the latest version (2.5) represented 76.2% of total alerts. With further refinement of data quality and the MedAware system, the proportion of high-value valid alerts increased from 50.8% to 56.2%.
Table 2.
Percent of alerts classified as data issues or valid alerts, and value classification of valid alerts
| Alert Assessment Category | Version 1.0 (n = 300), n (%) | Version 2.0 (n = 169), n (%) | Version 2.5 (n = 126), n (%) |
|---|---|---|---|
| Data Issues | 121 (40.3) | 41 (24.3) | 30 (23.8) |
| Clinically Valid Alerts | 179 (59.7) | 128 (75.7) | 96 (76.2) |
| Clinically Valid Alerts | | | |
| Less Value | 54 (30.2) | 40 (31.3) | 24 (25.0) |
| Medium Value | 34 (19.0) | 20 (15.6) | 18 (18.8) |
| High Value | 91 (50.8) | 68 (53.1) | 54 (56.2) |
| Total | 179 (100) | 128 (100) | 96 (100) |
Figure 2 presents a flowchart of the MedAware version 2.5 findings.
Figure 2.
Alert code assessment results for MedAware Version 2.5
DISCUSSION
MedAware is a CDS system that uses statistically derived outliers to detect potential medication errors. This differs from typical rule-based CDS alerting systems, in which clinical criteria for screening for potential problems (eg, drug-drug or drug-lab interactions) are predefined, so the systems generate alerts only when triggering criteria are met. Although MedAware also uses predefined rules iteratively derived and refined from its prior data-mining analytics, its self-learning and self-adaptive capability allows it to automatically and continuously search for patient- and institution-specific novel outlier patterns that could represent medication errors or problems. In our study, the time-dependent alerts, which were essentially asynchronous drug-lab alerts, used predefined rules from earlier outlier detection data mining by MedAware, whereas the dosage and clinical outlier alerts relied entirely on outlier definition/detection data from the Partners data mining (ie, MedAware’s outlier-based approach to CDS, in contrast to the predefined-rule approach of typical CDS alerting; the distinction is sketched below). This approach is similar to other data-mining techniques that typically use a hypothesis-free approach to identify signals.
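The contrast can be made concrete with a toy sketch; both functions and their parameters are illustrative and are not drawn from either system.

```python
# Toy contrast between the two alerting paradigms (illustrative only).

def rule_based_alert(drug: str, potassium_mmol_l: float) -> bool:
    """Typical CDS: fires only when a predefined drug-lab criterion is met."""
    return drug == "spironolactone" and potassium_mmol_l > 5.0

def outlier_based_alert(learned_likelihood: float, threshold: float = 0.01) -> bool:
    """Outlier CDS: fires for any prescription whose learned likelihood is
    extreme, including error patterns no one anticipated in advance."""
    return learned_likelihood < threshold
```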
We tested the MedAware algorithms on a massive outpatient EHR database and demonstrated the value, limitations, and challenges of both applying the screening software and analyzing its performance. Three-quarters of the chart-reviewed alerts generated by the screening system were found to have clinical validity, representing situations where potential medication errors were identified in charts. Of these, the majority were found to be clinically useful in flagging potential medication errors or issues.
The performance of such surveillance and alerting systems is obviously critically dependent on the quality and completeness of the underlying data that are electronically screened. In the case of screening a large homegrown EHR database, we identified a variety of challenges, inconsistencies, and areas of incomplete structured data that could compromise the accuracy and validity of potential medication error alerts. The data issues complicating the evaluation of alerts in this study are likely general problems that could affect implementation of outlier alert systems in other health care settings.
Despite these challenges and limitations, 76.2% of the alerts generated by MedAware’s probabilistic, machine-learning approach were found to be valid, identifying potential medication errors. Of these, the majority (75.0%) were found, based on detailed chart review, to be clinically useful in flagging potential medication errors or issues, with 18.8% classified as having medium clinical value and 56.2% of the valid alerts (42.9% of the total alerts) as having high clinical value. Thus, this detection system is able to generate medication alerts for errors that might not otherwise be detected with existing CDS systems, and it does so with a reasonably high degree of alert usefulness when subjected to manual chart review of the clinical details and contexts for patients receiving them. This study did not permit us to measure the precise additional benefit beyond existing CDS systems, a question we will attempt to answer in future studies comparing these outlier-based alerts with existing rule-based alerts.
Limitations
In addition to the limitations due to imperfections in the EHR data screened by MedAware (noted above), several potential limitations of the evaluation study must be noted. Because the Partners HealthCare (BWH and MGH) EHR we studied is a unique homegrown system, it is unclear how the findings generalize to other EHR systems. However, the problems we identified, eg, those related to medication reconciliation, are widespread and not unique to our homegrown system. On the other hand, the quality of data from our mature EHR may be better than that from commercial systems at this point. Each of these factors has implications for the application and performance of a screening system such as MedAware.
Our chart reviewers were carefully trained, and a coding manual was developed with clear operational definitions; nonetheless, assessing each MedAware alert through manual chart review required a degree of judgment on the part of the reviewer and, when adjudication was needed, the overall research team. We developed, applied, and iteratively refined a novel rating system (defining and classifying alerts as “accurate” and “valid,” with levels of “clinical value”) but have not formally validated this tool, something that would be valuable for its use by other researchers in the future.
CONCLUSION
We evaluated a CDS system that uses a probabilistic, machine-learning approach based on statistically derived outliers to detect medication errors and found that it generated potentially useful alerts with a modest rate of false positives. The performance of such a surveillance and alerting system is critically dependent on the quality and completeness of the underlying data, which presents challenges in reliably screening patient data with systems such as this. The system was able to generate alerts that might otherwise be missed by existing CDS systems and did so with a reasonably high degree of alert usefulness when subjected to review of patients’ clinical contexts and details. This type of approach will likely complement more traditional rule-based approaches to generating alerts about potential medication safety issues.
ACKNOWLEDGEMENTS
The authors would like to acknowledge the indispensable contributions of Frank Chang and Chris Herrick to data-collection activities.
SUPPLEMENTARY MATERIAL
Supplementary material is available at Journal of the American Medical Informatics Association online.
FUNDING
This work was financially supported by MedAware, Ltd. MedAware was not involved in any of the coding scheme development, chart review, data analysis, data interpretation, or manuscript preparation.
COMPETING INTERESTS
All members of the research team, except SM, received some funding from MedAware to support this evaluation effort (see Funding section above). Becton Dickinson is providing additional funding for this research team to conduct a further evaluation of the MedAware system.
DB is a co-inventor on patent no. 6029138 held by Brigham and Women’s Hospital on the use of decision support software for medical management, licensed to the Medicalis Corporation. He holds a minority equity position in the privately held company Medicalis, which develops Web-based decision support for radiology test ordering. He serves on the board for SEA Medical Systems, which makes intravenous pump technology. He consults for EarlySense, which makes patient safety monitoring systems. He receives equity and cash compensation from QPID Inc., a company focused on intelligence systems for EHRs. He receives cash compensation from CDI (Negev), Ltd., a not-for-profit incubator for health IT startups. He receives equity from Enelgy, which makes software to support evidence-based clinical decisions. He receives equity from ValeraHealth, which makes software to help patients with chronic diseases. He receives equity from Intensix, which makes software to support clinical decision-making in intensive care. He receives equity from MDClone, which produces deidentified versions of clinical data.
RR is one of the founders of a startup called Hospitech Respiration Ltd., a medical device company not related in any way to the study described in this paper.
All other authors have no other competing interests to declare.
CONTRIBUTORS
GS, LV, MV, DB, RR contributed to the conception and design.
All authors contributed to data collection, analysis, and interpretation.
GS, LV, MV, LW, RR contributed to development of the coding scheme.
GS, LV, MV, LW, RR contributed to the chart review activities.
GS, LV, MV, SM, DB, RR contributed to the drafting of the manuscript.
All authors reviewed and approved the final manuscript.
REFERENCES
- 1. National Priorities Partnership. Preventing Medication Errors: A $21 Billion Opportunity. 2010.
- 2. Andel C, Davidow SL, Hollander M, Moreno DA. The economics of health care quality and medical errors. J Health Care Finance. 2012;39(1):39–50.
- 3. Institute of Medicine. To Err Is Human: Building a Safer Health System. Kohn LT, Corrigan JM, Donaldson MS, eds. Washington, DC: National Academy Press; 2000.
- 4. James JT. A new, evidence-based estimate of patient harms associated with hospital care. J Patient Saf. 2013;9:122–128.
- 5. McCoy AB, Thomas EJ, Krousel-Wood M, Sittig D. Clinical decision support alert appropriateness: a review and proposal for improvement. Ochsner J. 2014;14(2):195–202.
- 6. Hoffman S, Podgurski A. Drug-drug interaction alerts: emphasizing the evidence. J Health Law Policy. 2012;5:2012–22.
- 7. Glassman PA, Simon B, Belperio P, Lanto A. Improving recognition of drug interactions: benefits and barriers to using automated drug alerts. Med Care. 2002;40(12):1161–71.
- 8. van der Sijs H, Aarts J, van Gelder T, et al. Turning off frequently overridden drug alerts: limited opportunities for doing it safely. J Am Med Inform Assoc. 2008;15(4):439–48.
- 9. MedAware, Ltd. Eliminating Prescription Errors. Our Products: MedAware Alerting System. http://www.medaware.com/our-products/. Accessed August 4, 2016.
- 10. Stein G. Eliminating Prescription Errors: A Systematic Approach. HIMSS Annual Conference & Exhibition; Chicago, IL; 2015.