Author manuscript; available in PMC: 2021 Aug 1.
Published in final edited form as: Ann Emerg Med. 2019 Oct 14;76(2):230–240. doi: 10.1016/j.annemergmed.2019.07.032

The Emergency Department Trigger Tool: A Novel Approach To Screening for Quality And Safety Events

Richard T Griffey 1, Ryan M Schneider 1, Alexandre A Todorov 2
PMCID: PMC7153965  NIHMSID: NIHMS1544553  PMID: 31623935

Abstract

Objective

Trigger tools improve surveillance for harm by focusing reviews on records with “triggers,” whose presence increases the likelihood of an adverse event (AE). We refine and automate a previously developed ED Trigger Tool (EDTT) and present record selection strategies to further optimize yield.

Methods

We specified 97 triggers for extraction from our electronic record, identifying 76,264 ED visits with ≥1 trigger. We reviewed 1,726 records with ≥1 trigger following the standard trigger tool review process. We validated query performance against manual review and evaluated individual triggers, retaining only those associated with AEs in the ED. We explore two approaches to enhance record selection: selecting on the number of triggers present and weighting triggers using LASSO logistic regression.

Results

The automated query performed well compared with manual review (sensitivity >70% for 80 triggers; specificity >92% for all). Review yielded 374 AEs (21.7 AEs per 100 records). Thirty triggers were associated with risk of harm in the ED. An estimated 10.3% of records with ≥1 of these triggers would include an AE in the ED. Selecting only records with ≥4 or ≥9 triggers improves yield to 17.2% and 34.8%, respectively, while use of LASSO trigger weighting enhances yield to as high as 52%.

Conclusions

The ED trigger tool is a promising approach for improving the yield, scope, and efficiency of review for all-cause harm in emergency medicine. Beginning with a broad set of candidate triggers, we validated a computerized query that eliminates the need for manual trigger screening and identified a refined set of triggers associated with AEs in the ED. Review efficiency can be further improved through enhanced record selection.

INTRODUCTION

Background

Despite the expanding role of the emergency department (ED) in the US health system,1–3 relatively little is known about adverse events (AEs) – defined as “unintended physical injury resulting from or contributed to by medical care that requires additional monitoring, treatment or hospitalization, or that results in death”4 – or about the characterization of patient harm in the ED. Conditions such as increasing acuity, time pressures, frequent handoffs, and hospital crowding and boarding create an environment with a high potential for AEs. Previous studies have often focused on error rather than harm, or on specific AE types (e.g., drug events) or subpopulations (e.g., asthma, boarding patients).5–11 It is increasingly recognized that the focus of patient safety and quality initiatives should be on prevention of harm rather than of errors, which are widespread but rarely result in AEs.

Importance

Traditional methods for AE surveillance (deaths, 72-hour return visits with admission, upgrades in care from an inpatient floor to an ICU) are low yield, porous, and decades old, making it more likely that AEs go undiscovered, unreported, and thus unaddressed. Commonly used detection methods combined with event reporting systems miss ~90% of AEs,12 and of the 10–20% of errors that are reported, up to 95% caused no patient harm.13 Many EDs rely on predefined screening criteria (e.g., upgrades in care) to select cases for record review.14–16 Though EDs may feel compelled to review, in particular, deaths and “72-hour returns with admission,” these rarely involve AEs and are at best controversial indicators of quality.14,17–21 The yield using these typical criteria is just under 2%, reflecting an inefficient process that likely underestimates patient harm.14

Trigger tools, such as the Global Trigger Tool (GTT) pioneered by the Institute for Healthcare Improvement (IHI), offer significantly improved and more efficient surveillance, outperforming traditional methods for detecting and characterizing all-cause harm.12,22,23 These tools have been developed for a range of clinical settings4,24–31 and have been called “the premier measurement strategy for patient safety.”32,33 Trigger tools detect a broader scope of events, help establish a baseline against which to assess the results of improvement efforts, and help direct resources to improve quality and safety for patients. Trigger tools traditionally consist of a regular review of a random sample of records by a first-level reviewer (L1, typically a nurse) for the presence of predefined ‘triggers’ – events that increase the likelihood that an AE is present. Finding a trigger prompts an in-depth review for evidence of an AE. Putative AEs undergo confirmatory second-level physician review (L2).

Goals of this Investigation

The ED module of the GTT consists of just two blunt triggers (‘Time in the ED >6 hours’, ‘Readmissions to the ED within 48 hours’), prompting our interest in developing an ED-specific tool. Further, triggers are commonly selected by expert consensus, with little or no data supporting their utility. In 2013, we developed a consensus-derived ED Trigger Tool (EDTT).34 Following encouraging pilot testing,35 we set out to further evaluate and improve the EDTT. Trigger tools typically begin by selecting a random sample from the general pool of records, first screening for triggers and then reviewing only those records containing triggers for AEs. While the trigger screening step aids efficiency by preventing complete review of low-yield, trigger-negative records, it remains inefficient to screen records only to find they have no trigger present. We aimed to create a computerized query that automates the search for triggers, eliminating this manual screening step. This query would identify an enriched pool of records containing ≥1 trigger from which to then randomly sample for review. In the present study, we develop and validate a computerized query for triggers, empirically assess the performance of each trigger in detecting AEs, eliminate low-yield triggers, and apply techniques to further enhance the yield of reviews.

METHODS

Study Setting and Participants

This retrospective observational study was conducted at an urban, adult, academic medical center using data from 92,859 visit records (10/1/2014 – 10/31/2015). All ED patients aged ≥18 were eligible for inclusion. Visits in which patients left without being seen were excluded. We began reviews on 12/20/2017 and concluded on 8/02/2018. This study was approved by our university institutional review board.

Study Design

We specified candidate triggers for a computerized query and compared the performance of the query against manual screening. We performed a retrospective record review using the two-tiered trigger tool approach, with dual independent first level (L1) reviews and second level confirmatory review (L2) for cases with potential AEs. We looked for associations between triggers, clinical and sociodemographic data and AEs in order to characterize patient harm and arrive at a refined, automated trigger list. Figure 1 summarizes study flow.

Figure 1.

Study flow diagram. AE, adverse event.

Trigger Definition and Development of Query

To derive a data-driven set of triggers, we began with 104 of the 114 previously-identified candidate triggers34 that were felt to be potentially capturable, mapping these to structured fields and tables in a reporting copy of our electronic medical record (Allscripts ED, Allscripts Healthcare Solutions, Chicago, IL). We queried home medications, chief complaint, triage data such as acuity and trauma leveling, keywords in certain free-text documentation, lab results, vital signs, documentation of certain procedures, ED diagnoses, ED disposition and orders for medications, labs, radiology studies, and consults. We utilized the Healthcare Cost and Utilization Project Clinical Classification System (HCUP CCS) to map ICD-9 codes to diagnosis-related triggers, CPT codes to procedural triggers, and detailed parameters for operational, laboratory-related and other triggers.

Record Selection

The query was applied to all 92,859 visits. Records were selected for review using only the query results. First, we identified all visits with at least one trigger and selected 1,726 for review. Approximately 25% of these were deliberately selected to ensure that reviewers would see each trigger, including rare ones, at least 10 times (if common enough) to allow us to evaluate: 1) the query’s ability to detect triggers; and 2) trigger associations with AEs. Rare triggers would likely have been missed with random sampling. The remaining 75% of our sample was randomly selected from the remaining records with ≥1 trigger. Second, we randomly selected 60 additional visits with no triggers to determine the AE rate in trigger-negative records.
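The two-stage selection described above (deliberate coverage of rare triggers, then random fill) can be sketched in Python. This is an illustrative reconstruction, not the study's actual code; the visit identifiers and trigger codes are hypothetical.

```python
import random

def build_review_sample(visits, per_trigger=10, total=1726, seed=0):
    """Two-stage selection: deliberately cover each trigger up to
    `per_trigger` times (so reviewers see rare triggers), then fill the
    remainder of the sample at random from trigger-positive visits.
    `visits` maps visit id -> set of trigger codes."""
    rng = random.Random(seed)
    counts = {}          # trigger -> times seen among chosen visits
    chosen, chosen_set = [], set()
    all_triggers = sorted({t for ts in visits.values() for t in ts})
    # Pass 1: deliberate selection for trigger coverage (~25% of the sample).
    for trig in all_triggers:
        need = per_trigger - counts.get(trig, 0)
        if need <= 0:
            continue
        candidates = [v for v, ts in visits.items()
                      if trig in ts and v not in chosen_set]
        rng.shuffle(candidates)
        for v in candidates[:need]:
            chosen.append(v)
            chosen_set.add(v)
            for t in visits[v]:
                counts[t] = counts.get(t, 0) + 1
    # Pass 2: random fill from the remaining trigger-positive visits.
    rest = [v for v in visits if v not in chosen_set and visits[v]]
    rng.shuffle(rest)
    chosen.extend(rest[:max(0, total - len(chosen))])
    return chosen
```

Pass 1 guarantees that a trigger seen in only a handful of visits still appears in the review sample, which pure random sampling would likely miss.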

Outcomes

Our primary outcomes include: 1) performance of the computerized query in identifying triggers (sensitivity, specificity); 2) AE detection and description of AEs; 3) association of individual triggers with AEs, yielding a refined set of triggers; and 4) analyses to further optimize record selection using the refined set of triggers. We used a taxonomy scheme consistent with prior trigger tool studies, presenting the broad AE categories Patient Care, Medication, Surgery/Procedural, and Other events.36 The severity of each AE was rated according to the widely used Medication Error Reporting and Prevention (MERP) Index, which includes categories ranging from temporary harm (E) to death (I).37 Events were further categorized as being present on arrival (POA) or occurring in the ED, and as acts of omission or commission. Consistent with the GTT, two events in the same visit were characterized as “cascading” and counted as one event if one led to the other or both contributed to the same AE.

Training

All reviewers completed online training from the IHI, including test cases, as well as additional training developed for an IHI-sponsored project and modified to include events specific to the ED. L1 reviewers performed at least 50 “dry run” practice reviews. L2 reviewers performed complete record reviews of all cases (rather than truncated summary reviews typical of trigger tools) until L1s achieved 80% sensitivity for detection of AEs. We held periodic group meetings to clarify data definitions and approaches to reviews, which were codified for our general reference to facilitate consistency.38

Record Reviews and Query Performance Evaluation

The accuracy of the computerized query was determined by comparing it to the independent manual reviews. Three L1s (nurse reviewers) worked through lists of visits, assigned in pairs to each visit to complete dual independent L1 review. Using a custom web-based application, L1 reviewers first performed a complete trigger search, blind to the query results. Triggers were organized in modules (e.g., Medications, Laboratories, Procedures) to facilitate efficient manual reviews. The initial L1 ratings were saved, and the query results were then revealed to the L1 reviewers, who could agree or disagree with each. This allowed us to validate the query’s accuracy in identifying the presence or absence of each individual trigger. The L1s then entered narrative summaries for the AEs they identified, indicating whether events were acts of omission or commission and whether they occurred in the ED or were present on arrival (POA). Evaluations extended into the first 24 hours of inpatient stays for admitted patients to look for AEs attributable to ED care. All visits where L1s reported an AE were randomly assigned to one of two L2s (RTG, RMS), who could agree with or modify events, disagree and decline them, mark events as duplicate or redundant, and add missed AEs. To estimate the L1 false negative rate, L2s also reviewed over 500 randomly selected records where the L1 reviews had not identified an AE.

Data Analysis

We present sample demographics and report the computerized query’s sensitivity and specificity and the estimated trigger prevalences in the population. In these analyses, query results were compared to consensus ratings derived from the dual independent L1 reviews and second-level review ratings using latent class analysis. We then tested pairwise associations between triggers and events using Fisher’s exact test (“unadjusted”) and logistic regression (adjusting for age, gender, ethnicity, and Emergency Severity Index [ESI]), with a Benjamini-Hochberg correction for multiple testing. This was done separately for ED and POA AEs. Trigger selection for the ED AE trigger tool was based upon significant bivariate associations between individual triggers and AEs occurring in the ED (p < 0.05, adjusted for multiple comparisons). Our query validation provides estimates of the positive and negative predictive values. Since we have query results on the entire population, we can estimate the sensitivity and specificity of the triggers. Because we know the query-based trigger frequencies in the entire population, we can also estimate AE rates and event types.
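The multiple-testing step can be illustrated with a small, self-contained sketch of the Benjamini-Hochberg procedure in pure Python; the p-values in the usage example are invented, not the study's.

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg FDR control: return a parallel list of booleans,
    True where the null hypothesis (no trigger-AE association) is rejected."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    # Find the largest rank k with p_(k) <= (k/m) * alpha.
    max_k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            max_k = rank
    # Reject the hypotheses with the max_k smallest p-values.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_k:
            reject[i] = True
    return reject
```

For example, with p-values [0.01, 0.02, 0.03, 0.5] and alpha = 0.05, the first three associations survive correction and the fourth does not.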

Analyses to Optimize Record Selection Using the Refined Trigger Tool

The traditional trigger tool approach would lead us to review a random sample of records. Our first enhancement of this process was to review a random sample from among those records with at least one trigger present. Because of the brevity of ED encounters, the resulting lower ‘exposure time’ for the occurrence of an AE, and the related relatively low yield, we sought additional approaches to optimize yield based on how records are selected from the enriched pool of patients with one or more triggers. We compare the performance of three methods of visit selection for review: (1) random sampling from among records with ≥1 trigger (the baseline approach); (2) selecting records from among those with a nominal threshold number of triggers present (e.g., ≥5 triggers), the ‘trigger threshold’ approach; and (3) using trigger weights (the “LASSO” approach). Trigger weights were derived from a multivariable analysis using LASSO (least absolute shrinkage and selection operator) logistic regression. This machine learning algorithm extends logistic regression with a penalty term that forces a more parsimonious model to avoid overfitting (important given the number of triggers being evaluated). The resulting regression coefficients serve as trigger weights reflecting the strength of association of a trigger with AEs. Separate LASSO models were fitted for ED and POA AEs. Under this approach, records are distributed from low to high risk for an AE based upon their risk score, calculated by summing the LASSO weights for the triggers each record contains. Deliberately selecting records from the higher range of this distribution (e.g., records whose risk scores are in the top 10%) returns a smaller, enriched set of records with a higher probability of an AE.
To compare the LASSO and trigger threshold approaches, we set the LASSO threshold to flag 10% or 1% of visits, and we set corresponding thresholds for number of triggers as close as possible to 10% and 1% of visits (e.g., 10% of all visits contained 4 or more triggers). Under each of these selection rules and cutoffs, we then calculated yield (percentage of visits expected to have an AE, if selected in that manner) and negative predictive value (percentage that truly do not have an AE if not flagged by the selection rule). Analyses were conducted using SAS 9.4 and the Python scikit-learn library.39
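As a sketch of how LASSO weights translate into record selection, the snippet below scores visits with a subset of the ED AE weights reported in Table 3 and returns the top-scoring fraction. This is illustrative Python, not the study's code: the full model retains more triggers, and the visit data are hypothetical.

```python
# Subset of ED AE model weights from Table 3 (TPA, restraint use,
# pCO2 >60, ED LOS >6 hours); the full model has more retained triggers.
WEIGHTS = {"M7": 1.08, "C3": 0.88, "L6": 0.67, "C6": 0.43}

def risk_score(triggers):
    """Sum the LASSO weights of the triggers present in a visit;
    unweighted (non-retained) triggers contribute nothing."""
    return sum(WEIGHTS.get(t, 0.0) for t in triggers)

def select_top_fraction(visits, fraction=0.01):
    """visits: dict of visit id -> set of trigger codes.
    Return the ids whose risk scores fall in the top `fraction` of visits."""
    ranked = sorted(visits, key=lambda v: risk_score(visits[v]), reverse=True)
    n = max(1, round(len(ranked) * fraction))
    return ranked[:n]
```

A visit with tPA administration (M7) and restraint use (C3) scores 1.08 + 0.88 = 1.96 and would outrank a visit whose only retained trigger is a long length of stay (C6, 0.43).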

RESULTS

Record selection, sociodemographics and visit characteristics

We ran our query on 92,859 visits involving 58,497 patients (Figure 1). When applied to this population-representative data set, the query identified ≥1 of 97 candidate triggers in 76,894 records (83%). We reviewed 1,726 records with ≥1 trigger and 60 records without. There were no differences between our sample and the general population in terms of gender and ethnicity (Table 1). Our sample tended to be slightly older (median age 52 vs. 47) than the population with one or more triggers, though the interquartile ranges are comparable (32 to 65 vs. 31 to 61).

Table 1:

Sample demographics and visit characteristics

                           Population                            Study sample
                           w/ triggers       w/o triggers       w/ triggers       w/o triggers
Patient demographics(1)
  Sample size              44,138            9,190              1,726             60
  % Female                 46.76%            43.4%              46.8%             45.0%
  Median age (IQR)         46.9 (30.5–61.4)  32.0 (24.5–46.5)   52.1 (32.0–65.4)  30.2 (22.9–43.4)
  Ethnicity
    Black                  53.1%             70.1%              54.5%             67.8%
    White                  45.3%             28.3%              44.2%             30.5%
    Other                  1.6%              1.7%               1.3%              1.7%
Visit characteristics(2)   N=76,264          N=15,942           N=1,726           N=60
  Visit acuity
    1 – Resuscitation      2.07%             0.25%              8.1%              0%
    2 – Emergent           33.6%             12.5%              42.0%             10.2%
    3 – Urgent             53.6%             45.3%              44.5%             62.7%
    4 – Semi-urgent        10.1%             39.9%              5.3%              27.1%
    5 – Non-urgent         0.61%             1.97%              0.18%             0%
  Disposition
    Discharged             59.7%             85.5%              43.1%             76.7%
    Admitted               38.0%             13.8%              51.2%             21.7%
    Transferred            1.21%             0.03%              1.9%              ~0%
    Expired                0.32%             ~0%                2.7%              ~0%
    AMA                    0.76%             0.66%              1.0%              1.7%
    Other                  0.03%             0.02%              ~0%               ~0%

(1) Patient characteristics: distinct patients. If a patient has multiple encounters, age is age at first encounter.

(2) Visit characteristics: visits during 10/1/2014 – 10/31/2015. AMA, left against medical advice.

Query performance and Trigger Frequencies

We were able to map 97 of 104 candidate triggers to structured fields in our EMR to develop the query. Query sensitivity (compared to the manual review) exceeded 70% for 80 triggers and specificity was over 92% for all triggers (Table S1). Trigger frequencies ranged from very common (e.g., C6: ED length of stay >6 hours, 38%) to very rare (e.g., P7: Delivery in the ED, seen 16 times in ~92K visits). Most of the query triggers were rare (62 of 97 with frequency < 1%) and seven were not found at all. This reflects the trigger frequency proper, as well as imperfect sensitivity on the part of the query. Note that some triggers overlap exactly with some demographic variables (e.g., C25: Age 75+).

Adverse event detection and description

We confirmed 374 AEs in 346 visits in the 1,726 records with ≥1 trigger, for an overall yield of 21.7 AEs per 100 records (374/1726) affecting 20% of visits (346/1726; Figure 1). Our review of 60 records without triggers yielded only two events (3.3%, 2/60). The median L1 review time was 7.4 minutes (IQR 4.7 – 11.9). The specificity and sensitivity of the dual L1 review were very good. Of the 631 putative AEs identified on L1 review, 528 (84%) were accepted on L2 review. In-depth L2 reviews of 597 records with no AEs reported by the dual L1 review identified 44 missed AEs, for an L1 false negative rate of 7.3%. As we began with experienced L1 reviewers, we did not observe the learning curve seen in prior studies, including our own.35,40 Broad categories of events and proportions of events that were POA and acts of omission are provided in Table 2. We found no association between gender or ethnicity and event occurrence but found robust associations of AE occurrence with emergency severity index (ESI) and whether an event was POA. Overall, 58% of identified AEs were POA. Though POA events tended to occur among older patients (median age 58.1 vs. 50.0), the association of POA status with AEs remains significant after controlling for age.

Table 2:

Frequencies of adverse event type and categories

AEs (N=374)
Present on arrival (POA) 57.9% (52.9%–62.6%)
Omission 19.0% (16%–23%)
Category*
Patient Care 22.0% (18.0%–26.4%)
Medication 53.4% (48.3%–58.4%)
Surgery/Procedural 10.2% (7.4%–13.6%)
Other 14.5% (11.1 %–18.3%)

AE – adverse event. Category percentages do not sum exactly to 100% due to rounding

Associations of triggers with AEs

Overall, the presence of one or more of the 97 triggers was associated with a 19.1% risk of an AE (ED or POA) compared to 3.3% when no trigger is present (RR = 5.8, 95% CI: 1.5–23.0). We found thirty triggers with significant bivariate associations with AEs in the ED (Table 3). In the refined set of triggers, 60% (18/30) were conserved from the previously-determined consensus-based trigger set34 and 12 were identified from among the broader set of candidate triggers. Table S2 presents triggers associated with POA AEs. In our population, 57% of all records had at least one of the 30 triggers associated with AEs in the ED, and of these, 22.7% would be expected to have any type of AE (ED or POA) and 10.3% would be expected to contain an AE in the ED.
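The relative risks reported here and in Table 3 are ratios of the AE frequency with and without a trigger. A minimal sketch with a Wald-type confidence interval on the log scale follows; the counts in the usage example are invented, not the study's.

```python
import math

def relative_risk(events_pos, n_pos, events_neg, n_neg, z=1.96):
    """Relative risk of an AE given a trigger is present
    (events_pos AEs among n_pos visits) vs. absent (events_neg / n_neg),
    with a Wald confidence interval computed on the log-RR scale."""
    p1, p0 = events_pos / n_pos, events_neg / n_neg
    rr = p1 / p0
    # Standard error of log(RR) for independent binomial proportions.
    se = math.sqrt(1 / events_pos - 1 / n_pos + 1 / events_neg - 1 / n_neg)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi
```

With 20 AEs in 100 trigger-positive visits versus 5 in 100 trigger-negative visits, this returns RR = 4.0 with a confidence interval straddling it.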

Table 3:

EDTT model for detecting AEs in the ED

Trigger                                  RR (95% CI)     ED AE freq., trigger present (95% CI)   ED AE freq., trigger absent (95% CI)   LASSO weight
L6 pCO2 >60 5.3 (3.4–8.2) 41.2% (25.9–57.9) 7.8% (6.6– 9.1) 0.67
C7 BiPAP/CPAP 4.7 (3.0–7.4) 36.8% (22.9–52.7) 7.8% (6.6– 9.1) --
C3 Restraint use 4.6 (3.4–6.4) 31.6% (23.6–40.5) 6.8% (5.7– 8.1) 0.88
C39 Aspiration 4.5 (2.6–7.8) 36.0% (19.5–55.5) 8.0% (6.8– 9.3) 0.38
M7 TPA 4.3 (2.5–7.5) 34.6% (18.7–53.7) 8.0% (6.8– 9.3) 1.08
C4 SBP <90 x2 4.1 (3.0–5.7) 27.2% (20.7–34.5) 6.6% (5.4– 7.9) 0.69
C13 O2 <90% 4.0 (3.0–5.5) 25.2% (19.6–31.6) 6.3% (5.1– 7.5) 0.63
P6 Central line insertion 4.0 (2.8–5.8) 29.4% (20.5–39.7) 7.3% (6.2– 8.7) --
P2 Intubation 3.9 (2.8–5.4) 26.8% (20.0–34.6) 6.9% (5.7– 8.2) --
M5 D50 3.8 (2.3–6.3) 30.0% (17.6–45.2) 7.9% (6.7– 9.2) --
P8 US guided IV 3.8 (2.3–6.1) 29.5% (17.7–44.0) 7.9% (6.7– 9.2) 0.65
M3 Heparin 3.6 (2.3–5.6) 27.9% (17.8–40.0) 7.7% (6.5– 9.0) 0.53
C19 RR <10 or >24 3.5 (2.6–4.7) 17.9% (14.6–21.6) 5.2% (4.1– 6.5) 0.35
M9 Nitroglycerin / Nicardipine / Nitroprusside 3.4 (2.1–5.5) 26.5% (15.8–40.0) 7.9% (6.7– 9.2) 0.12
C42 Acute dialysis 3.3 (1.9–5.7) 26.3% (14.4–41.7) 8.0% (6.8– 9.4) --
L8 Cr >2.0 or BUN >30 3.1 (2.3–4.2) 20.3% (15.6–25.8) 6.6% (5.4– 7.9) 0.39
M18 Opiates + benzo 3.1 (2.2–4.4) 22.2% (16.2–29.1) 7.1% (5.9– 8.4) 0.22
C53 SIRS criteria 3.0 (2.2–4.2) 13.8% (11.4–16.4) 4.5% (3.4– 5.9) 0.08
M4 Benadryl 3.0 (2.0–4.6) 23.1% (15.3–32.5) 7.6% (6.4– 8.9) 1.32
L11 BNP >300 3.0 (1.9–4.8) 23.6% (15.0–34.3) 7.8% (6.6– 9.1) --
L2 Lactate >4 2.9 (1.9–4.3) 21.8% (14.6–30.6) 7.6% (6.4– 8.9) --
M17 IV Calcium 2.9 (1.8–4.6) 22.4% (13.7–33.4) 7.9% (6.7– 9.2) --
C27 SBP>180; DBP >120 2.8 (2.1–3.8) 18.0% (14.0–22.6) 6.4% (5.3– 7.7) 0.53
L7 Troponin >.03 2.8 (2.0–3.9) 19.9% (14.5–26.2) 7.1% (6.0– 8.5) --
L4 Glucose <60 or >300 2.8 (1.9–4.1) 20.7% (14.2–28.5) 7.5% (6.3– 8.8) 0.29
C20 Temp. <35 or >38 C 2.8 (1.8–4.4) 21.8% (14.2–31.4) 7.7% (6.5– 9.1) --
C8 HR>130 2.7 (1.8–3.9) 19.7% (13.9–26.7) 7.4% (6.2– 8.7) --
C48 IV anti-hypertensives 2.4 (1.6–3.6) 18.4% (12.1–26.3) 7.7% (6.5– 9.1) --
C30 ED Boarding >6hrs 2.0 (1.4–2.8) 14.7% (10.8–19.4) 7.3% (6.1– 8.7) --
C6 ED LOS >6hours 1.9 (1.4–2.7) 10.5% (8.8–12.5) 5.5% (4.0– 7.3) 0.43

The complete table is provided in Supplemental Table S2.

RR: relative risk, the ratio of columns (3) and (4), with 95% confidence interval.

Event frequencies: unadjusted risk of an ED AE given the presence (column 3) or absence (column 4) of that trigger. Other triggers may be present.

LASSO: weight from the LASSO predictive model, if the trigger was retained. These weights are used later to calculate an overall score indicative of the risk of occurrence of an ED AE.

Approaches to enhanced record selection

Both the threshold and the LASSO approaches can further enhance yield. Table 4 summarizes the positive predictive value (PPV) for detecting AEs when selecting records in three ways: 1) the baseline approach; 2) the LASSO approach (with cutoffs tuned to return the top 10% or 1% of visits); and 3) the threshold approach (using trigger number cutoffs corresponding to 10% and 1% of records). For ED AEs, the gain in yield with a LASSO score cutoff at 10% of visits is small (23.1%), while the 1% cutoff increases this to 51.7%. Using the threshold approach for detection of AEs in the ED, 11.8% of the population had 4 or more triggers, while 1% of visits had 9 or more of the 30 associated triggers. In these cohorts, 17.2% and 34.8%, respectively, would be expected to have an AE in the ED. These observations suggest a clear benefit of enhanced record selection. For example, if an ED averages ~6,200 visits per month, setting the LASSO score cutoff to flag 1% of visits would return 62 visits that month, of which 32 (51.7%) would be expected to contain an AE. Using a cutoff of 9+ triggers, without weighting, would again return 62 visits that month, of which about 22 (34.8%) would be expected to contain an AE.
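The back-of-envelope calculation above can be written out directly. A sketch, using the monthly visit volume from the example in the text:

```python
def review_load(monthly_visits, flag_fraction, expected_yield):
    """How many visits a selection rule flags per month, and how many of
    those flagged visits are expected to contain an AE (yield * flagged)."""
    flagged = round(monthly_visits * flag_fraction)
    return flagged, round(flagged * expected_yield)
```

For ~6,200 visits per month, flagging 1% at the Lasso-1 yield of 51.7% gives 62 flagged visits with about 32 expected AEs; at the 9+ trigger yield of 34.8%, the same 62 visits carry about 22 expected AEs.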

Table 4:

Model performance stratified by adverse event occurrence

Outcome   Selection rule   Population flagged   Yield   NPV
ED Any trigger 57% 10.3% 99.4%
4+ Triggers 11.8% 17.2% 96.9%
9+ Triggers 1% 34.8% 94.7%
Lasso-10 10% 23.1% 96.9%
Lasso-1 1% 51.7% 92.3%
POA Any trigger 27.3% 17.5% 93.6%
2+ Triggers 6.7% 31.4% 92.2%
3+ Triggers 1.3% 47.4% 90.1%
Lasso-10 10% 27.3% 92.9%
Lasso-1 1% 54.1% 90.9%

Performance of the ED Trigger Tool in detecting adverse events, by AE type: occurring in the ED or present on arrival (POA). Selection rule: different sets of triggers are used for each outcome (Tables 3 and S2): ED (up to 30 triggers), POA (up to 9). For ED events, “9+ Triggers” implies that only visits with 9 or more of the 30 triggers are eligible for review. LASSO scores are calculated using outcome-specific models and triggers (weights in Tables 3 and S2). Population flagged: the proportion of all visits that would be marked as eligible for review by each selection rule; a randomly selected subset of these would then be selected for actual review. LASSO scores were thresholded to return visits with either the top 10% (Lasso-10) or top 1% (Lasso-1) of risk scores in the population. The cutoffs for the numbers of triggers were selected to yield as close to 10% or 1% as possible. Yield: for each outcome and selection rule, the percentage of records expected to have an AE of that type (ED or POA). NPV (negative predictive value): the percentage of unselected records that do not have an AE. The drop in NPV with more restrictive selection reflects the known trade-off between sensitivity and specificity.

LIMITATIONS

This was a retrospective study, with the limitations inherent to that design. The single-center design could affect trigger selection, the AE types detected, the adverse event rate, and the performance of the tool. We attempted to be consistent within reviewers and within our research team, but this is likely imperfect. Our intentional selection of 25% of the cases reviewed may affect the overall estimate of performance but was necessary to evaluate triggers.

The computerized query worked very well for most but not all candidate triggers, which may have lowered power to detect their associations with AEs. The majority of the triggers consist of laboratory values, vital signs, medication orders and administrations, and procedures that are documented as part of clinical care, not requiring explicit documentation. For example, the trigger “oxygen saturation <90%” is based on data that are automatically uploaded from monitors into the vital signs portion of the chart. We were unable to map a handful of candidate triggers (e.g., “upgrades in trauma leveling”), as there were no structured fields available to capture the information in our EMR.

For a minority of triggers, we had to clarify our priorities in simultaneously evaluating 1) the performance of the query in detecting individual triggers and 2) associations between AEs and individual triggers. For example, for the trigger “diagnosis of ataxia,” if neither ataxia nor a related diagnosis from the HCUP CCS family is listed as an ED diagnosis, the query will not flag that record. However, if on manual review ataxia is found documented in other areas of the medical record, we wanted to capture this trigger as present. To address these situations, we made the a priori decision to prioritize evaluation of the trigger ‘concept’ over the trigger as specified in the query, even though this penalized query performance. Despite this, 71 triggers had sensitivities over 80%, and all had specificities above 90%.
Most trigger tools consist of about 40 triggers, with diminishing returns on additional triggers. A computerized query affords the ability to include rare but potent triggers. Our results are part of the derivation phase of this project and require validation in an independent sample, which is currently underway.

DISCUSSION

This study describes a new approach to ED quality and safety review. We set out to identify a data-driven set of triggers that did not rely on expert consensus alone to guide selection. We specified 97 candidate triggers for a computerized query that demonstrated high validity compared with manual review for the majority of triggers. We then used the standard trigger tool process to identify adverse events and determined which triggers are associated with AEs, arriving at a set of 30 triggers, only 60% of which had been identified in our original consensus-derived set of 46 triggers. As expected, use of all 97 candidate triggers results in a large number of records (83%) containing ≥1 trigger. The refined trigger set reduces this to 57%. While still a large proportion, this eliminates almost half the population of records from review. Selection of cases from this still relatively large pool of visits with ≥1 trigger improves the yield for identifying AEs to 10.3%, excluding AEs that were present on arrival. This basic selection process already has superior yield for finding an AE compared with the sample of records we reviewed with no triggers (3.3%) or the estimated yield (1.9%) of current screening methods used in many institutions.14 Randomly selecting cases for review from among records with ≥4 or ≥9 triggers improves yield to 17.2% and 34.8%, respectively, and use of LASSO trigger weights for record selection further improves yield to as high as 52%.

Trigger tools have been developed for use in various clinical settings and in many countries, largely with excellent performance as surveillance tools, but none has focused on harm in the ED. With limited activity in this space, there is an opportunity to re-assess how we think about and perform AE detection in the ED. Trigger tools are meant for AE surveillance; they are not targeted to capture the high-profile or catastrophic cases that are likely to be reported or referred for review through other channels. Trigger tools typically detect all-cause harm, which includes both preventable and non-preventable harm. This means they will capture non-preventable allergic reactions to medications as well as hypoxia from an iatrogenic pneumothorax. Most physicians are understandably interested in identifying preventable harm, but the determination of preventability can be challenging and problematic, most notably due to hindsight and outcome biases, as eloquently described in this journal.41 For this reason, all-cause harm is a useful and arguably better initial goal for review.

Emergency physicians will also be most interested in AEs attributable to ED care rather than those occurring prior to arrival and resulting in an ED visit. However, we felt it important to capture POA events for a more comprehensive understanding of AEs in the ED, including the role of the ED in addressing harm occurring elsewhere in the health system. This may also be useful in identifying problems in medications, facilities, prescribers, procedures, and patterns of care. What is considered an act of omission can be a slippery slope, and many trigger tools avoid these. However, as timely and life-saving interventions are defining features of ED care, failure to act (when it is compelled) is highly important from a quality and safety perspective. We were intentionally conservative, limiting these to cases where failure to complete specific actions violated standards of care (e.g., failure to give antibiotics in a case of diagnosed septic shock, or failure to enact ordered precautions, resulting in a fall with injury), excluding cases with diagnostic uncertainty. Consistency in defining acts of omission is important, and we considered this at length.38 While a full description of the nature of events is beyond the scope of this report, notable findings include the predominance of Medication-related events, whose proportion was more than double that of Patient Care events, which in turn was more than twice that of Surgery/Procedural events.

Identifying a set of triggers for inclusion based on performance was challenging. An “AE” is a highly heterogeneous outcome, which complicates modeling. Individual triggers tend to be associated with certain kinds of AEs and not with others: for example, INR >5 is associated with a bleeding event and not with an allergic reaction, while “diphenhydramine administration” is associated with an allergic reaction and not with a bleeding event. When we group different kinds of adverse events together under a single construct called “AEs” and try to identify predictors of this combined outcome, the results can be unintuitive or difficult to interpret. This is particularly evident in multivariable analysis. The LASSO, while it outperformed the other multivariable modeling approaches we considered, is agnostic in deriving parsimonious models: it may retain a trigger not only because of its own predictive ability but also because retaining it excludes another highly penetrant trigger. A trigger that is moderately associated with a prevalent AE may be retained, while triggers strongly associated with less prevalent AEs may be pushed out. For reasons of usability and interpretability, we used bivariate associations with AEs to select triggers and limited use of the LASSO weights to demonstrating one approach to enhanced record selection. While the use of weights to calculate a risk score and then review high-risk cases is conceptually fairly simple, alternative approaches, such as selecting records with some threshold number of triggers, are also effective and may be more readily generalizable.
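To illustrate the weighting approach described above, the sketch below fits an L1-penalized (LASSO) logistic regression with scikit-learn, the library used in this study, and scores records for review priority. The trigger names, the simulated data, and the penalty strength are hypothetical; this is not the study's actual model specification.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical data: rows are ED records, columns are binary trigger flags.
trigger_names = ["INR_gt_5", "diphenhydramine", "naloxone", "return_72h"]
X = rng.integers(0, 2, size=(500, len(trigger_names)))

# Simulated outcome: an AE is more likely when certain triggers fire.
logit = -2.0 + 1.5 * X[:, 0] + 1.0 * X[:, 1]
y = rng.random(500) < 1 / (1 + np.exp(-logit))

# The L1 penalty shrinks uninformative trigger weights toward zero,
# yielding a parsimonious weighting over the candidate triggers.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
model.fit(X, y)

weights = dict(zip(trigger_names, model.coef_[0]))

# Score records: those with the highest predicted AE risk are reviewed first.
scores = model.predict_proba(X)[:, 1]
top_records = np.argsort(scores)[::-1]
```

In practice the fitted `weights` would be inspected for face validity before use, since, as noted above, the LASSO may drop triggers that are strongly associated with rare AE types.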

The efficiency of trigger tools is achieved by first screening records for the presence of triggers in order to avoid low-yield record reviews. When present, triggers then help direct AE reviews (e.g., the trigger “INR >5” prompts a search for a bleeding event). This function may be less important for ED records, which are fairly easy to review in their entirety, than for lengthy inpatient records that rely on triggers to focus review on a particular section. The IHI truncates GTT record reviews at 20 minutes. Our median L1 review time, including the manual screening for 97 triggers, was 7.4 minutes, which improves further with the automated query. Using the 97 candidate triggers on our selected record sample yielded an AE rate (PPV) of 19.1%, including both POA and ED AEs. Reducing this set of triggers to the 30 in our “ED AE” model slightly improves AE detection (PPV) to 22.7% (ED or POA). Other approaches to improving yield and efficiency are possible but will need further testing, both for usability and to probe for potential tradeoffs between enhanced detection and the desire to capture a broad scope of AEs. Creating separate trigger sets to detect AEs that are closely related to one another (e.g., surgical/procedural AEs, medication-related AEs) may improve performance by reducing heterogeneity. In addition to setting thresholds on the number of triggers, or using trigger weighting, more complex prediction rules (e.g., cases with triggers A and B but not C) might also improve record selection.
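The threshold idea can be made concrete with a short calculation: given per-record trigger counts and review outcomes, the yield (PPV) of reviewing only records with at least k triggers is the AE rate among the selected records. The data below are simulated under the assumption that AE risk rises with trigger count; the numbers are illustrative, not study data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated screened records: trigger counts and whether manual
# review found an adverse event (AE). Every record has >= 1 trigger.
n = 2000
trigger_count = rng.poisson(2.0, size=n) + 1
p_ae = np.clip(0.03 * trigger_count, 0.0, 1.0)  # assumption: more triggers, more AEs
has_ae = rng.random(n) < p_ae

def yield_at_threshold(counts, outcomes, k):
    """PPV and review volume when reviewing only records with >= k triggers."""
    selected = counts >= k
    if not selected.any():
        return 0.0, 0
    return float(outcomes[selected].mean()), int(selected.sum())

for k in (1, 3, 5):
    ppv, n_review = yield_at_threshold(trigger_count, has_ae, k)
    print(f">= {k} triggers: review {n_review} records, yield {ppv:.1%}")
```

The tradeoff is visible directly: raising k shrinks the review workload and raises the yield, at the cost of missing AEs in the excluded low-count records, which is the detection-versus-scope tension noted above.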

We note that different profiles and problem areas of EDs may drive the utility of certain triggers. Operations-focused triggers may be higher yield in centers with more boarding and throughput problems. Other triggers may be more useful in centers with higher burdens of oncology-related AEs or those associated with an older population, etc. It is also possible that trigger utilities may change over time as improvements are made that obviate their associations with AEs. We expect there may be some core set of triggers that will be useful across institutions, while the utility of other triggers will vary across sites.

There are a number of potential applications and future developments for this work. Though it might be challenging in the fast-paced environment of the ED, some recent trigger tool approaches have focused on real-time AE detection and prevention.42,43 Knowing the test characteristics of triggers in identifying AEs and the distribution of triggers in a general population presents the potential to make broader estimates and inferences about the AE rate in that population and the types of AEs that may be present. This could allow for comparisons of observed versus expected AE rates or types, for example. Trigger sets can be tuned to look for specific event types (e.g., near misses, AEs that are present on arrival, medication events, surgical events). An ED trigger tool might also eventually be validated to compare AE rates across sites. It is not hard to imagine future tools that use natural language processing (NLP) and possibly artificial intelligence to screen for both triggers and AEs. We have performed some limited NLP of free-text data to enhance the use of automated triggers based on structured fields in the EMR, with promising results.
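As a toy illustration of how free-text screening can complement structured-field triggers, the sketch below flags candidate triggers by pattern-matching note text. The trigger names and patterns are hypothetical; a production NLP pipeline (including the authors') would be far more sophisticated than keyword matching.

```python
import re

# Hypothetical free-text patterns that could flag candidate triggers
# in ED notes alongside queries on structured EMR fields.
TEXT_TRIGGERS = {
    "allergic_reaction": re.compile(r"\b(anaphylaxis|hives|allergic reaction)\b", re.I),
    "fall_with_injury": re.compile(r"\bfall\b.*\b(fracture|laceration|injury)\b", re.I),
}

def screen_note(note: str) -> list[str]:
    """Return the names of text triggers whose pattern matches the note."""
    return [name for name, pattern in TEXT_TRIGGERS.items() if pattern.search(note)]

note = "Patient developed hives after ceftriaxone; diphenhydramine given."
print(screen_note(note))  # → ['allergic_reaction']
```

Records flagged this way would then enter the same second-level physician review as records flagged by structured-field triggers.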

We envision initial use of the EDTT in routine review (e.g., a monthly review of ~50 records), alongside external referrals of cases (found to be the highest-yield traditional screening criterion),14,19 with the goal of identifying AEs and areas for improvement in the ED. While it may be politically infeasible not to review certain events (e.g., deaths in the ED), use of the other traditional criteria is low-yield, and we see no other reason to continue routinely reviewing these. We are currently validating our findings on an independent sample, and we are specifying our EDTT query for the Epic ASAP platform to enhance generalizability. Next steps should involve multicenter testing for validity and usability.

CONCLUSION

The ED trigger tool is a promising new approach to improving the yield, scope, and efficiency of quality and safety review in emergency medicine by detecting all-cause harm. Beginning with a broad set of previously identified candidate triggers, we identified a refined set of triggers demonstrated to be associated with AEs. We created and validated a computerized query that eliminates the need for manual screening for triggers. While the yield of a random review of records with a trigger is superior to that of records without a trigger, we demonstrated that yield can be further enhanced using various approaches to selecting records from among the pool of those with one or more triggers. Work is underway to validate the ED trigger tool on an independent sample.

Supplementary Material

1
2

Acknowledgments:

We would like to acknowledge Dr. Phil Asaro and Aaron Papp for their assistance with query development, our first-level nurse reviewers, Christina Moran, Jaimie Rolando, Tim Fortney, and Michelle Hoelscher, and Dr. Lee Adler for his helpful guidance.

This work is supported by grant R18 HS025052-01 (Griffey, PI) from the Agency for Healthcare Research and Quality. Dr. Griffey is also supported by NIH/NIDDK grant #P30DK092950 through the Washington University Center for Diabetes Translation Research Pilot and Feasibility Program, and by grant #3767 from the Barnes Jewish Hospital Foundation. The contents of this work are solely the responsibility of the authors and do not necessarily represent the official views of the AHRQ, WU-CDTR, NIDDK, NIH, or the BJHF.

RTG, RMS, and AAT conceived the study, designed the trial, obtained research funding, supervised the conduct of the trial and data collection, and managed the data, including quality control. AAT provided statistical advice on study design and analyzed the data; RTG drafted the manuscript, and all authors contributed substantially to its revision. RTG takes responsibility for the paper as a whole.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

The authors have no conflicts of interest to report.

REFERENCES

1. Pitts SR, Carrier ER, Rich EC, Kellermann AL. Where Americans get acute care: increasingly, it’s not at their doctor’s office. Health Aff (Millwood) 2010;29:1620–9.
2. Morganti KG, Bauhoff S, Blanchard JC, Abir M, Iyer N. The evolving role of emergency departments in the United States. Santa Monica, California: Rand; 2013.
3. Schuur JD, Venkatesh AK. The growing role of emergency departments in hospital admissions. N Engl J Med 2012;367:391–3.
4. Griffin FA, Resar RK. IHI Global Trigger Tool for Measuring Adverse Events (Second Edition). Cambridge, Massachusetts: Institute for Healthcare Improvement; 2009.
5. Fordyce J, Blank FS, Pekow P, et al. Errors in a busy emergency department. Ann Emerg Med 2003;42:324–33.
6. Stang AS, Wingert AS, Hartling L, Plint AC. Adverse events related to emergency department care: a systematic review. PLoS One 2013;8:e74214.
7. Calder LA, Forster A, Nelson M, et al. Adverse events among patients registered in high-acuity areas of the emergency department: a prospective cohort study. CJEM 2010;12:421–30.
8. Hafner JW Jr., Belknap SM, Squillante MD, Bucheit KA. Adverse drug events in emergency department patients. Ann Emerg Med 2002;39:258–67.
9. Liu SW, Thomas SH, Gordon JA, Hamedani AG, Weissman JS. A pilot study examining undesirable events among emergency department-boarded patients awaiting inpatient beds. Ann Emerg Med 2009;54:381–5.
10. Camargo CA Jr., Tsai CL, Sullivan AF, et al. Safety climate and medical errors in 62 US emergency departments. Ann Emerg Med 2012;60:555–63.e20.
11. Liu SW, Chang Y, Weissman JS, et al. An empirical assessment of boarding and quality of care: delays in care among chest pain, pneumonia, and cellulitis patients. Acad Emerg Med 2011;18:1339–48.
12. Classen DC, Resar R, Griffin F, et al. ‘Global trigger tool’ shows that adverse events in hospitals may be ten times greater than previously measured. Health Aff (Millwood) 2011;30:581–9.
13. IHI Global Trigger Tool for Measuring Adverse Events. Institute for Healthcare Improvement, 2014. (Accessed September 2014, at http://www.ihi.org/resources/Pages/Tools/IHIGlobalTriggerToolforMeasuringAEs.aspx.)
14. Griffey RT, Schneider RM, Sharp BR, et al. Description and yield of current quality and safety review in selected US academic emergency departments. J Patient Saf 2017.
15. Aaronson EL, Wittels KA, Nadel ES, Schuur JD. Morbidity and mortality conference in emergency medicine residencies and the culture of safety. West J Emerg Med 2015;16:810–7.
16. Seigel TA, McGillicuddy DC, Barkin AZ, Rosen CL. Morbidity and mortality conference in emergency medicine. J Emerg Med 2010;38:507–11.
17. Rising KL, Victor TW, Hollander JE, Carr BG. Patient returns to the emergency department: the time-to-return curve. Acad Emerg Med 2014;21:864–71.
18. Pham JC, Kirsch TD, Hill PM, DeRuggerio K, Hoffmann B. Seventy-two-hour returns may not be a good indicator of safety in the emergency department: a national study. Acad Emerg Med 2011;18:390–7.
19. Griffey RT, Bohan JS. Healthcare provider complaints to the emergency department: a preliminary report on a new quality improvement instrument. Qual Saf Health Care 2006;15:344–6.
20. Cheng J, Shroff A, Khan N, Jain S. Emergency department return visits resulting in admission: do they reflect quality of care? Am J Med Qual 2016;31:541–51.
21. Sabbatini AK, Kocher KE, Basu A, Hsia RY. In-hospital outcomes and costs among patients hospitalized during a return visit to the emergency department. JAMA 2016;315:663–71.
22. Landrigan CP, Parry GJ, Bones CB, Hackbarth AD, Goldmann DA, Sharek PJ. Temporal trends in rates of patient harm resulting from medical care. N Engl J Med 2010;363:2124–34.
23. Office of the Inspector General. Adverse Events in Hospitals: Methods for Identifying Events. Department of Health and Human Services; 2010.
24. Suarez C, Menendez MD, Alonso J, Castano N, Alonso M, Vazquez F. Detection of adverse events in an acute geriatric hospital over a 6-year period using the Global Trigger Tool. J Am Geriatr Soc 2014;62:896–900.
25. Sharek PJ, Horbar JD, Mason W, et al. Adverse events in the neonatal intensive care unit: development, testing, and findings of an NICU-focused trigger tool to identify harm in North American NICUs. Pediatrics 2006;118:1332–40.
26. Najjar S, Hamdan M, Euwema MC, et al. The Global Trigger Tool shows that one out of seven patients suffers harm in Palestinian hospitals: challenges for launching a strategic safety plan. Int J Qual Health Care 2013;25:640–7.
27. Mattsson TO, Knudsen JL, Brixen K, Herrstedt J. Does adding an appended oncology module to the Global Trigger Tool increase its value? Int J Qual Health Care 2014.
28. Kalenderian E, Walji MF, Tavares A, Ramoni RB. An adverse event trigger tool in dentistry: a new methodology for measuring harm in the dental office. J Am Dent Assoc 2013;144:808–14.
29. Carnevali L, Krug B, Amant F, et al. Performance of the adverse drug event trigger tool and the global trigger tool for identifying adverse drug events: experience in a Belgian hospital. Ann Pharmacother 2013;47:1414–9.
30. Szekendi MK, Sullivan C, Bobb A, et al. Active surveillance using electronic triggers to detect adverse events in hospitalized patients. Qual Saf Health Care 2006;15:184–90.
31. Resar RK, Rozich JD, Simmonds T, Haraden CR. A trigger tool to identify adverse events in the intensive care unit. Jt Comm J Qual Patient Saf 2006;32:585–90.
32. Agency for Healthcare Research and Quality. AHRQ Web M&M: In conversation with…David C. Classen. In: Wachter RM, ed. Perspectives in Safety. Agency for Healthcare Research and Quality; 2012.
33. The emergence of the trigger tool as the premier measurement strategy for patient safety. AHRQ PSNet; 2012. (Available at https://psnet.ahrq.gov/perspectives/perspective/120/the-emergence-of-the-trigger-tool-as-the-premier-measurement-strategy-for-patient-safety.)
34. Griffey RT, Schneider RM, Adler LM, et al. Development of an emergency department trigger tool using a systematic search and modified Delphi process. J Patient Saf 2016.
35. Griffey RT, Schneider RM, Sharp BR, et al. Multicenter test of an emergency department trigger tool for detecting adverse events. J Patient Saf 2018.
36. Griffey RT, Schneider RM, Todorov AA, et al. Critical review, development, and testing of a taxonomy for adverse events and near misses in the emergency department. Acad Emerg Med 2019;26:670–9.
37. Hartwig SC, Denger SD, Schneider PJ. Severity-indexed, incident report-based medication error-reporting program. Am J Hosp Pharm 1991;48:2611–6.
38. Griffey RT, Schneider RM, Sharp BR, Vrablik MC, Adler L. Practical considerations in use of trigger tool methodology in the emergency department. J Patient Saf 2017.
39. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res 2011;12:2825–30.
40. Landrigan CP, Stockwell D, Toomey SL, et al. Performance of the Global Assessment of Pediatric Patient Safety (GAPPS) tool. Pediatrics 2016;137.
41. Wears RL, Nemeth CP. Replacing hindsight with insight: toward better understanding of diagnostic failures. Ann Emerg Med 2007;49:206–9.
42. Sammer C, Miller S, Jones C, et al. Developing and evaluating an automated all-cause harm trigger system. Jt Comm J Qual Patient Saf 2017;43:155–65.
43. Stockwell DC, Bisarya H, Classen DC, et al. Development of an electronic pediatric all-cause harm measurement tool using a modified Delphi method. J Patient Saf 2014.
