Abstract
Background:
Postoperative delirium is an important problem for surgical inpatients, and was the target of a multidisciplinary quality improvement project at our institution. We developed and tested a semi-automated delirium risk stratification instrument, AWOL-S, in three independent cohorts from our tertiary care hospital and describe its performance characteristics and impact on clinical care.
Methods:
The risk stratification instrument was derived with elective surgical patients who were admitted at least overnight and received at least one postoperative delirium screen (Nursing Delirium Screening Scale [NuDESC] or Confusion Assessment Method for the ICU [CAM-ICU]) and preoperative cognitive screening tests (orientation to place and ability to spell WORLD backward). Using data pragmatically collected between 12/7/2016 and 6/15/2017, we derived a logistic regression model predicting probability of delirium in the first 7 postoperative hospital days. A priori predictors included age, cognitive screening, illness severity or American Society of Anesthesiologists physical status, and surgical delirium risk. We applied model odds ratios to two subsequent cohorts (“validation” and “sustained performance”), and assessed performance using area under the receiver operator characteristic curves (AUC-ROC). A post hoc sensitivity analysis assessed performance in emergency and preadmitted patients. Finally, we retrospectively evaluated use of benzodiazepines and anticholinergic medications in patients who screened at high risk for delirium.
Results:
The logistic regression model used to derive odds ratios for the risk prediction tool included 2,091 patients. Model AUC-ROC was 0.71 [0.67–0.75], compared with 0.65 [0.58–0.72] in the validation (n=908) and 0.75 [0.71–0.78] in the sustained performance (n=3,168) cohorts. Sensitivity was approximately 75% in the derivation and sustained performance cohorts; specificity was approximately 59%. The AUC-ROC for emergency and preadmitted patients was 0.71 [0.67–0.75] (n=1,301). After AWOL-S was implemented clinically, patients at high risk for delirium (n=3,630) had 21% [3–36%] lower relative risk of receiving an anticholinergic medication perioperatively, after controlling for secular trends.
Conclusions:
The AWOL-S delirium risk stratification tool has moderate accuracy for delirium prediction in a cohort of elective surgical patients, and performance is largely unchanged in emergent/preadmitted surgical patients. Using AWOL-S risk stratification as a part of a multidisciplinary delirium reduction intervention was associated with significantly lower rates of perioperative anticholinergic, but not benzodiazepine, medications in those at high risk for delirium. AWOL-S offers a feasible starting point for electronic medical record-based postoperative delirium risk stratification, and may serve as a useful paradigm for other institutions.
Introduction:
Postoperative delirium is a common complication of major surgery, and is associated with injurious falls, prolonged hospitalization, institutional discharge, and death.1 Evidence-based methods of preventing postoperative delirium exist, although effective preventative methods such as the Hospital Elder Life Program2 tend to be resource-intensive, limiting their broad application across an unselected population. Many risk stratification tools have been published in the perioperative setting,3–7 but performance of published instruments is widely variable, and internal validation studies are rarely undertaken, making it difficult to confirm validity even in a single-center population. Furthermore, while predicting postoperative delirium is an important goal, implementation of a risk prediction tool in clinical practice offers a major opportunity for quality improvement,1 ideally including a mechanism to make the risk stratification results available to all care providers via the electronic medical record (EMR), and facilitating deployment of institutional resources to prevent delirium in those at high risk.
The University of California, San Francisco (UCSF) Delirium Reduction Campaign was a health-system-wide initiative to broadly apply the AWOL8 (Age, WORLD backwards, Orientation, iLlness severity) delirium risk stratification tool to medical inpatients. To extend this initiative to elective surgical patients, we adapted the AWOL tool – now termed AWOL-S – and describe its performance in 3 distinct epochs: the derivation period, a validation period, and finally, a cohort of patients following broad roll-out of postoperative delirium prevention and diagnosis protocols to assess sustained performance despite institutional practice changes. Because emergent surgical patients and those preadmitted for optimization prior to planned surgery were not included in the derivation of the tool, we also describe the incidental performance of the tool in these important high-risk patients. Finally, in a post-hoc analysis, we examine trends of benzodiazepine and anticholinergic use in patients who screened at high risk for delirium to assess whether use of these medications decreased.
Methods:
Setting & ethical approval:
UCSF is an academic medical center consisting of two inpatient hospitals and 706 inpatient beds within the city of San Francisco. This work was performed as a quality improvement initiative at UCSF Medical Center, and the creation and testing of the AWOL-S tool was deemed to be not human subjects research as defined by the UCSF Committee on Human Research, as they occurred entirely under the auspices of a quality improvement activity (IRB #16–21073). The Committee on Human Research also approved the research into the performance characteristics and clinical impact of this tool and waived the requirement for written informed consent from subjects (IRB #19–27578).
Local context:
At the time we began this work, the UCSF Delirium Reduction Campaign – an institutionally-funded initiative focusing on the prevention and identification of delirium at a hospital-system-wide level – had begun risk-stratifying all patients admitted to a rolling schedule of hospital floors using the AWOL delirium risk stratification tool,8 and testing at least once per nursing shift for the presence of delirium using the Nursing Delirium Screening Scale (NuDESC)9 or Confusion Assessment Method for the Intensive Care Unit (CAM-ICU),10 which have similar sensitivity and specificity11. The training program for the approximately 1000 acute care and 500 intensive care nurses at UCSF Health included 30 minutes of formal training sessions, one-on-one education and daily huddles for the first week of clinical delirium screening, and weekly feedback emails for the first year of screening. The perioperative Delirium Reduction Campaign efforts are described in further detail elsewhere18.
The AWOL risk stratification tool, described in greater detail below, was developed at this institution and validated in medically hospitalized patients (i.e., nonsurgical) with acceptable performance.8 We felt that the greatest threat to face validity of the AWOL tool in the perioperative population was the well-demonstrated fact that certain procedures are more likely to precipitate postoperative delirium.12 Thus, we provisionally planned to test predictive power of the AWOL components, add a term for the postoperative delirium risk associated with the surgical procedure itself, and consider a perioperative-specific adaptation of illness severity (i.e., American Society of Anesthesiologists physical status [ASAPS]) in our risk stratification tool, while rederiving the risk estimates conferred by each characteristic to best predict delirium risk in our perioperative population.
Patients:
All patients 18 years of age and older admitted to UCSF Medical Center for a surgical procedure requiring at least one night inpatient stay were considered. Patients undergoing emergency surgery, or those who were hospitalized on the day prior to their surgery, were excluded. Because pre-admitted patients presenting for surgery (elective or emergent) may have already undergone risk stratification using the AWOL tool upon admission, we did not include these patients in our derivation data. Non-English-speaking patients were largely excluded in the derivation cohort, because of the evolving ability to perform AWOL and NuDESC assessment in these patients; during the derivation period, adaptations of the WORLD assessment for non-English-speaking patients were being trialed at a hospital level independent of our perioperative work.
Timing:
The derivation and assessment of the risk score occurred concurrently with the gradual planned roll-out of the UCSF delirium pathway. On 12/7/2016, nurses in the preoperative area were instructed to perform the items of the AWOL screen (described further below) as part of the standardized check-in process. On 6/15/2017, data collection was closed and the resulting “derivation” dataset was used to estimate odds ratios for the model (described below). From 6/16/2017 to 9/6/2017, a “validation” dataset was collected. Then, on 9/7/2017, the surgical inpatient floors began routinized AWOL screening and mandatory NuDESC assessments, representing a change in practice for data collection and management of high-delirium-risk patients; patients treated between 9/7/2017 and 6/5/2018 made up the “sustained performance” dataset. Finally, the AWOL-S screening tool was made available in the EMR and preoperative nurses were trained in its use on 6/6/2018. Major events and timeline for the derivation/validation data are graphically described in Figure 1 (“Derivation/validation data”).
Figure 1.
Graphical depiction of hospital quality improvement (QI) initiative, data collection periods from this study (derivation/validation of AWOL-S and post-hoc assessment of clinical impact), and significant events in the evolution of delirium-related screening and risk stratification.
AWOL screen:
Briefly, the AWOL screen assigns points for characteristics associated with the development of delirium; a sum of points greater than or equal to 2 stratifies the patient into a high-risk group. The characteristics are: age ≥80 years, inability to spell WORLD backwards, lack of orientation to place (country, city, hospital name, hospital floor), and illness severity greater than or equal to “moderate”, as assessed by a registered nurse.8
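The scoring rule above can be sketched in a few lines. This is an illustrative sketch only: it assumes one point per characteristic (the text says only that points are assigned), and the class and function names are hypothetical rather than taken from any EMR build.

```python
from dataclasses import dataclass


@dataclass
class AwolScreen:
    """Inputs to the original AWOL screen (field names are hypothetical)."""
    age_years: int
    can_spell_world_backward: bool
    oriented_to_place: bool
    illness_severity_moderate_or_worse: bool  # as rated by a registered nurse


def awol_points(screen: AwolScreen) -> int:
    """Sum the AWOL points, assuming one point per characteristic present."""
    return sum([
        screen.age_years >= 80,
        not screen.can_spell_world_backward,
        not screen.oriented_to_place,
        screen.illness_severity_moderate_or_worse,
    ])


def awol_high_risk(screen: AwolScreen, cutoff: int = 2) -> bool:
    """A point total at or above the cutoff (2 in the text) is high risk."""
    return awol_points(screen) >= cutoff


if __name__ == "__main__":
    patient = AwolScreen(82, True, False, False)
    print(awol_points(patient), awol_high_risk(patient))
```

Keeping the point total separate from the dichotomous flag makes it easy to report both, as the AWOL≥2 rows of Table 1 do for the cutoff.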
Variables:
The primary predictors were determined a priori, and included age, nursing illness severity assigned at the day-of-surgery nursing assessment, and WORLD and orientation to place as components of the AWOL screen. We also evaluated the predictive power of ASAPS, as assigned in the routine provision of anesthesia care, as a potential alternative to nursing illness severity. ASAPS was assessed by a resident physician training in anesthesiology, a physician anesthesiologist, or a certified registered nurse anesthetist according to guidance from the American Society of Anesthesiologists.13
A priori, we stratified all surgical procedures into low, moderate, and high-risk based on the described potential to precipitate postoperative delirium using data from the National Surgical Quality Improvement Program12 and intrinsic UCSF rates of postoperative delirium by surgical booking code (i.e., an iterative process by which we identified unusually deliriogenic procedures in the derivation dataset and ensured that they were coded appropriately). A high-delirium-risk procedure was generally thought of as one with a >10% risk of postoperative delirium; moderate risk, 5–10%, and low risk, <5%, with some flexibility according to clinical judgment in acknowledgement that the literature rates may differ from institutional rates and experience, and depending on the patient composition of the study in question. For example, of general surgery cases, appendectomy and cholecystectomy have low literature risk of postoperative delirium, hence were designated as “low intrinsic risk,” but colectomy, esophagectomy, and pancreatectomy are typically associated with >10% risk of delirium, so were given a “high intrinsic risk” designation. While ventral hernia repair is typically a low-delirium-risk procedure, our institution performs an unusually high volume of complex hernia reconstructions, and thus, ventral hernia repair is a moderate-risk operation in our population. The full list of delirium risk determinations for surgical cases is in Supplemental Appendix 1.
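As a rough illustration of the tiering logic described above, the sketch below maps an expected delirium rate to a tier and then applies a local override. The thresholds come from the text; the function names and the override table are hypothetical, and the actual assignments (Supplemental Appendix 1) also relied on clinical judgment that a rule like this cannot capture.

```python
# Local overrides reflecting institutional experience. The single entry is the
# example given in the text; the dictionary itself is a hypothetical construct.
LOCAL_OVERRIDES = {"ventral hernia repair": "moderate"}


def procedure_risk_tier(delirium_rate: float) -> str:
    """Map an expected postoperative delirium rate (0-1) to a risk tier.

    >10% is high, 5-10% is moderate, and <5% is low, per the text.
    """
    if not 0.0 <= delirium_rate <= 1.0:
        raise ValueError("delirium_rate must be a proportion between 0 and 1")
    if delirium_rate > 0.10:
        return "high"
    if delirium_rate >= 0.05:
        return "moderate"
    return "low"


def assign_tier(procedure: str, literature_rate: float) -> str:
    """Use the local override when one exists, else the literature-based tier."""
    return LOCAL_OVERRIDES.get(procedure.lower(), procedure_risk_tier(literature_rate))


if __name__ == "__main__":
    # The rates passed here are placeholders for illustration, not study data.
    print(assign_tier("Ventral hernia repair", 0.02))
    print(assign_tier("Colectomy", 0.12))
```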
The primary outcome was development of delirium in the first 7 days postoperatively, as assessed by NuDESC delirium screens which were performed at least once per shift of nursing care on acute care wards, or CAM-ICU for patients in the ICU.
AWOL-S performance in exclusion population:
Patients undergoing nonelective surgical cases (“emergency cases” as defined by the American Society of Anesthesiologists’ “E” modifier)13 and those admitted for optimization prior to an elective/nonemergent surgery were deliberately excluded from rule derivation; however, this is an important subpopulation for delirium risk stratification. We also describe the performance of the AWOL-S tool in these patients for the period 12/7/2016–6/5/2018.
Clinical implementation:
The AWOL-S logistic regression equation was programmed into a nursing flowsheet in our Epic EMR, to be completed by the preoperative nurse during check-in on the day of surgery. The flowsheet automatically imported values for patient age, any ASAPS which had been entered, and the assigned procedure risk score; nursing staff entered the results of the WORLD and orientation screening. If an ASAPS had not yet been entered, the nursing staff rated illness severity on a 5-point scale, as for the original AWOL. Once the flowsheet rows were completed, the EMR automatically calculated and displayed the absolute predicted probability of developing delirium (i.e., output of logistic regression equation derived above). For clinical utility, a dichotomous cutoff differentiating between “high” and “low” delirium risk was set at a predicted probability of 5%. Those with a predicted probability of delirium >5% (i.e., those who were stratified as “high” risk for delirium during this perioperative encounter) received an orange cue (“flag”) on their EMR chart which was visible to care providers from anesthesiology, surgery, and nursing who were using a perioperative Epic context. Choices of medications administered intraoperatively and in the post-anesthesia care unit (PACU) could, therefore, explicitly be impacted by the AWOL-S risk stratification flag (high or low).
On 7/1/2018, clinical implementation of delirium best practice interventions, based on AWOL-S risk stratification results, was made the subject of a perioperative quality improvement initiative. Details of clinical implementation are described in Supplemental Appendix 2, but briefly, anesthesiology resident physicians, nurse anesthetists, and attending physicians were instructed to avoid deliriogenic medications and prescribe nonpharmacologic delirium prevention precautions for all patients assessed by AWOL-S to be at high risk of delirium, including those not anticipating an overnight hospital stay, in the PACU. A PACU delirium orderset facilitating these measures was implemented in December 2017 (Figure 1, “Significant events”), but its use was not emphasized until the start of the quality improvement period. The content of the orderset is described elsewhere18, but briefly, nonpharmacologic delirium prevention measures like frequent reorientation were offered as a nursing order, and haloperidol replaced prochlorperazine and metoclopramide as rescue antiemetic selections. It was recommended, though not mandatory, that the surgical team prescribe similar delirium prevention measures to be implemented on the inpatient floor after PACU discharge.
Post-hoc anticholinergic medication and benzodiazepine use analysis:
We evaluated the association between an AWOL-S positive screen and the perioperative (i.e., “anesthesia start” to PACU discharge) use of medications typically avoided in patients at high delirium risk. We compared rates of benzodiazepines (midazolam and lorazepam) and anticholinergic medications (diphenhydramine, meperidine, prochlorperazine, promethazine, and scopolamine) before (12/1/2017–6/30/2018) and after (7/1/2018–6/30/2019) the quality improvement initiative focusing on clinical implementation of delirium best practices based on the AWOL-S score (Figure 1, “Clinical impact data”), using rates in patients who did not screen high-risk to control for secular trends.
Statistical analysis:
All analyses were performed in Stata version 14.2 (College Station, TX), and a threshold of p<0.05 was considered to indicate statistical significance. The threshold for statistical significance was not adjusted for multiple comparisons because the impact of the intervention on deliriogenic medication prescribing was assumed to be strongly correlated among different medications, and the post hoc exploratory nature of the analysis means findings should be interpreted cautiously.
A priori, we planned to include age, illness severity (assessed on a 1–5 scale by nursing staff8,14 or by ASAPS), cognitive ability (as assessed by WORLD backwards and orientation), and an indicator of procedure-specific delirium risk. We evaluated performance of a logistic regression model to predict delirium incorporating different versions of an age coefficient (e.g., as a continuous variable; as an ordinal variable with cutoffs at 65 and 80 years; as a dichotomous variable with cutoff at 70 years; etc.) and illness severity. Items were evaluated in the model according to clinical judgment, ease of interpretability for a clinical audience, and impact on model fit and prediction accuracy. Model fit was assessed using the area under the receiver operator characteristic curve (ROC-AUC). All tested versions of coefficients were evaluated to ensure model assumptions were met (e.g., linearity in the logit for continuous predictors). To address nonindependence of multiple surgical cases in a single patient, variances were adjusted for clustering by patient identification number.
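For readers less familiar with the performance measures used here, the following sketch computes a ROC-AUC as the probability that a randomly chosen case with delirium receives a higher predicted probability than a randomly chosen case without it, and computes sensitivity and specificity at a dichotomous cutoff. It is a plain-Python illustration with made-up inputs, not the Stata routine used in the analysis, and it yields no confidence intervals or cluster adjustment.

```python
def roc_auc(scores, outcomes):
    """Rank-based AUC: share of (case, non-case) pairs ordered correctly.

    Ties count as half. Quadratic in sample size; fine for illustration.
    """
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, outcomes, cutoff):
    """Sensitivity and specificity when scores above the cutoff are flagged."""
    tp = sum(1 for s, y in zip(scores, outcomes) if y and s > cutoff)
    fn = sum(1 for s, y in zip(scores, outcomes) if y and s <= cutoff)
    tn = sum(1 for s, y in zip(scores, outcomes) if not y and s <= cutoff)
    fp = sum(1 for s, y in zip(scores, outcomes) if not y and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy predicted probabilities and outcomes (1 = delirium); not study data.
    toy_scores = [0.02, 0.04, 0.06, 0.08, 0.10, 0.30]
    toy_outcomes = [0, 0, 1, 0, 0, 1]
    print(roc_auc(toy_scores, toy_outcomes))
    print(sensitivity_specificity(toy_scores, toy_outcomes, cutoff=0.05))
```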
The final model, termed AWOL-S, was selected on the basis of model fit and alignment with AWOL components. Model odds ratios were then applied to the validation and sustained performance cohorts. Calibration of predicted probabilities in the cohorts not used to derive the AWOL-S model (i.e., the validation and sustained performance cohorts of elective patients) was assessed with predicted vs. observed probability plots using the user-defined Stata package “pmcalplot”.14
Univariate statistics were performed according to variable type and distribution to compare key characteristics among the derivation, validation, and sustained performance cohorts; the exclusion cohort was not included in the statistical comparison because of obvious and expected differences in the distribution of risk profiles.
Rates of benzodiazepine and anticholinergic medications during perioperative care – i.e., start of anesthesia team care until discharge from the PACU or, for patients who went directly to ICU, the end of anesthesia team care – were compared before and after the clinical implementation of AWOL-S using binomial regression with an interaction term (i.e., a term in the model to indicate whether high-risk patients were treated differentially after AWOL-S implementation, after controlling for secular trends in medication use in low-risk patients). Variances were adjusted for clustering by patient, since multiple surgical encounters per patient could be considered. All elective and emergent surgical procedures were included for this analysis.
Results:
There were 10,621 surgical cases performed on patients who stayed at least one night postoperatively between 12/7/2016 and 6/5/2018. Of these, 6,105 were elective procedures for patients who received preoperative risk stratification and at least one postoperative delirium assessment. An additional 1,301 cases, performed in preadmitted patients or on an emergent basis, also received pre- and postoperative screening (Figure 2; Table 1). There were modest but statistically significant differences in the distribution of ASA physical status, preoperative orientation, and procedure risk value among the 3 main cohorts. As expected, the exclusion cohort (emergent and pre-admitted cases) markedly differed from the target population for the derivation, validation, and sustained performance cohorts.
Figure 2.
Diagram of included and excluded patients in the derivation, validation, and sustained performance cohorts.
Table 1.
Description of the derivation, validation, and sustained performance elective surgical cohorts, and differential performance of the models for delirium risk stratification. The “exclusion” cohort was not included in the univariate statistical comparison for differences in cohort description among the derivation, validation, and sustained performance cohorts. AWOL-S model performance is reported as both the full regression model (“regression model AUC”) and the clinically-implemented cutoff of 5% (“5% cutoff”).
Description of cohort | Derivation | Validation | Sustained performance | Exclusion | p value
---|---|---|---|---|---
Number of encounters | 2091 | 908 | 3186 | 1301 |
Age (years) | 58 ± 16 | 59 ± 16 | 59 ± 15 | 60 ± 16 | 0.64
Unable to spell WORLD backward | 16.8% | 16.7% | 15.6% | 29.7% | 0.46
Not oriented to place | 6.7% | 6.7% | 5.0% | 10.4% | 0.019
ASA physical status: 1 | 8.3% | 5.3% | 6.0% | 2.6% | 0.004
ASA physical status: 2 | 53.8% | 52.4% | 52.7% | 32.6% |
ASA physical status: 3 | 35.5% | 39.3% | 39.0% | 53.4% |
ASA physical status: 4 | 2.5% | 3.0% | 2.4% | 11.4% |
ASA physical status: 5 | 0 | 0 | 0 | 0.1% |
Surgery-specific delirium risk: low | 16.7% | 19.9% | 18.6% | 18.5% | 0.001
Surgery-specific delirium risk: moderate | 63.8% | 60.4% | 65.3% | 55.9% |
Surgery-specific delirium risk: high | 19.5% | 19.7% | 16.0% | 25.6% |
Number of delirium assessments | 5 [2–9] | 5 [2–9] | 6 [3–11] | 10 [4–14] | <0.001
Delirium in first 7 postop days | 5.5% | 5.7% | 5.6% | 15.1% | 0.97
Model performance | Derivation | Validation | Sustained performance | Exclusion |
AWOL≥2 cutoff: AUC | 0.56 [0.52–0.60] | 0.56 [0.50–0.61] | 0.57 [0.54–0.60] | 0.67 [0.64–0.71] |
AWOL≥2 cutoff: sensitivity | 22.6% | 21.2% | 22.0% | 49.2% |
AWOL≥2 cutoff: specificity | 88.9% | 90.0% | 91.9% | 85.4% |
AWOL-S: regression model AUC | 0.71 [0.67–0.75] | 0.65 [0.58–0.72] | 0.75 [0.71–0.78] | 0.71 [0.67–0.75] |
AWOL-S 5% cutoff: sensitivity | 75.7% | 65.4% | 75.3% | 80.2% |
AWOL-S 5% cutoff: specificity | 58.9% | 59.2% | 59.6% | 44.8% |
Abbreviations: ASA, American Society of Anesthesiologists. AUC, area under the receiver operator characteristic curve.
Performance of the original AWOL cutoff for elevated delirium risk was poor in the elective perioperative population (ROC-AUC 0.56 [0.52–0.60] for the derivation cohort; Table 1). Odds ratios for the AWOL-S equation were obtained via logistic regression performed on 2,091 patients in the derivation cohort. Rounded odds ratios and baseline risk (intercept) for the prediction algorithm are in Table 3.
Table 2.
Clinical impact: description of cohort and medication administration. “Differential RR” is the relative risk of receiving a medication in the high-delirium-risk group after AWOL-S implementation, after controlling for secular trends using rates of medication use in the low-risk group.
Description of cohort | Baseline: low risk | Baseline: high risk | AWOL-S: low risk | AWOL-S: high risk | | p (before vs after AWOL-S)
---|---|---|---|---|---|---
Number of surgical encounters | 7,368 | 2,049 | 13,704 | 3,630 | |
Age (years) | 53 ± 16 | 67 ± 12 | 53 ± 16 | 68 ± 12 | | 0.85
Unable to spell WORLD backward | 10% | 35% | 13% | 38% | | <0.001
Not oriented to place | 2.5% | 15% | 1.7% | 13% | | <0.001
High illness severity (ASA 3+) | 29% | 77% | 29% | 76% | | 0.12
Surgery-specific delirium risk: low | 74% | 7.4% | 74% | 7.1% | | 0.32
Surgery-specific delirium risk: moderate | 23% | 70% | 23% | 70% | |
Surgery-specific delirium risk: high | 2.6% | 23% | 3.0% | 23% | |
Delirium in first 7 postop days | 1.3% | 10% | 1.3% | 11.4% | | 0.31
Perioperative medication use | Baseline: low risk | Baseline: high risk | AWOL-S: low risk | AWOL-S: high risk | Differential RR | p
Any benzodiazepine | 54% | 37% | 50% | 33% | 0.95 [0.88–1.03] | 0.24
Midazolam | 54% | 36% | 50% | 32% | 0.95 [0.87–1.03] | 0.19
Lorazepam | 1.8% | 2.0% | 1.7% | 1.8% | 0.98 [0.63–1.53] | 0.63
Any anticholinergic | 12% | 8.6% | 12% | 6.5% | 0.79 [0.64–0.97] | 0.02
Diphenhydramine | 1.7% | 1.5% | 2.2% | 1.4% | 0.74 [0.45–1.22] | 0.23
Meperidine | 2.8% | 2.0% | 3.9% | 1.9% | 0.68 [0.45–1.03] | 0.07
Prochlorperazine | 6.6% | 4.5% | 5.1% | 3.1% | 0.89 [0.67–1.19] | 0.44
Promethazine | 0.1% | 0.1% | 0.06% | 0.03% | 0.52 [0.04–7.01] | 0.63
Scopolamine | 2.1% | 1.1% | 1.7% | 0.6% | 0.72 [0.39–1.34] | 0.31
Table 3.
Rounded odds ratios for components in the AWOL-S prediction equation.
Component | Value
---|---
Baseline risk* | 0.28%
Age: per year over 65 | Odds ratio 1.02
Age: per year under 65 | Odds ratio 0.98
Unable to spell WORLD backward | Odds ratio 1.5
DisOriented to place | Odds ratio 1.7
ILlness severity: ASA 2 | Odds ratio 4.3
ILlness severity: ASA 3 or higher | Odds ratio 8.3
Surgical risk: moderate-risk | Odds ratio 3.4
Surgical risk: high-risk | Odds ratio 4.6
*Baseline risk refers to the predicted probability of delirium for a hypothetical patient at the reference value for all categories; that is, a 65-year-old who is able to spell WORLD backwards, is oriented to place, and is ASA 1 with a low-risk planned surgery. The baseline risk, expressed as odds, is multiplied by the relevant odds ratios and converted back to a probability to obtain an individual’s predicted probability of delirium.
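To make the use of Table 3 concrete, the sketch below applies the rounded odds ratios to a hypothetical patient and then applies the 5% cutoff described under Clinical implementation. Several things here are assumptions rather than the implemented EMR logic: the baseline risk is converted to odds before multiplication (as a logistic model implies), the age term is applied as the per-year odds ratio raised to the number of years from 65, and all names are hypothetical. Because the published odds ratios are rounded, results will not exactly match the flowsheet output, and the nurse-rated illness severity fallback is not modeled.

```python
BASELINE_RISK = 0.0028            # 0.28%, Table 3
OR_PER_YEAR_OVER_65 = 1.02
OR_PER_YEAR_UNDER_65 = 0.98
OR_WORLD_FAIL = 1.5
OR_DISORIENTED = 1.7
OR_ASA_2 = 4.3
OR_ASA_3_PLUS = 8.3
OR_SURGERY = {"low": 1.0, "moderate": 3.4, "high": 4.6}
HIGH_RISK_CUTOFF = 0.05           # predicted probability above 5% is "high" risk


def awol_s_probability(age_years, can_spell_world_backward,
                       oriented_to_place, asa_class, surgical_risk):
    """Predicted probability of delirium from the rounded Table 3 odds ratios."""
    if asa_class < 1:
        raise ValueError("asa_class must be 1 or higher")
    odds = BASELINE_RISK / (1.0 - BASELINE_RISK)  # probability -> odds
    if age_years >= 65:
        odds *= OR_PER_YEAR_OVER_65 ** (age_years - 65)
    else:
        odds *= OR_PER_YEAR_UNDER_65 ** (65 - age_years)
    if not can_spell_world_backward:
        odds *= OR_WORLD_FAIL
    if not oriented_to_place:
        odds *= OR_DISORIENTED
    if asa_class >= 3:
        odds *= OR_ASA_3_PLUS
    elif asa_class == 2:
        odds *= OR_ASA_2
    odds *= OR_SURGERY[surgical_risk]
    return odds / (1.0 + odds)                    # odds -> probability


def is_high_risk(probability):
    """Dichotomize at the clinically implemented 5% cutoff."""
    return probability > HIGH_RISK_CUTOFF


if __name__ == "__main__":
    # Hypothetical patient: 75 years old, cannot spell WORLD backward,
    # oriented to place, ASA 3, high-risk procedure.
    p = awol_s_probability(75, False, True, 3, "high")
    print(f"predicted probability: {p:.1%}, high risk: {is_high_risk(p)}")
```

For the reference patient (65 years old, intact screening, ASA 1, low-risk surgery) the function returns the baseline risk itself, which is a quick sanity check on the odds conversion.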
Odds ratios for each predictor variable from the derivation cohort were applied to the baseline risk for validation and sustained performance cohorts, and performance of AWOL-S (predicted probability model and performance of the 5% predicted probability cutoff used to dichotomize “high” versus “low” delirium risk) across the 3 time epochs is shown in Table 1, with ROC curves for the predicted probabilities in Figure 3A. Performance was slightly worse in the validation cohort, due to lower sensitivity in that cohort; specificity of the 5% cutoff was stable at approximately 59%. Model calibration in the combined validation and sustained performance cohorts was excellent; the calibration slope was 1.032 (ideal: 1.0) and the intercept was −0.023 (ideal: 0; Figure 3B). For the entire cohort of 10,621 patients, overall ROC-AUC of the regression model was 0.72 [0.70–0.75]. The 5% predicted probability threshold which differentiated “low-” from “high-risk” patients performed at a sensitivity and specificity in the entire cohort of 73.9% and 59.3%, respectively.
Figure 3.
A, Receiver operator characteristic (ROC) curves for the logistic regression model predicting postoperative delirium in the derivation, validation, and sustained performance cohorts. Area under the ROC curve is displayed in the figure legend. B, calibration curve for the combined validation and sustained performance cohorts.
Performance of the AWOL-S tool in emergency and preadmitted patients specifically excluded from the derivation population was also moderate, with slightly higher sensitivity but lower specificity (Table 1).
Changes in medication administration after AWOL-S entered clinical use as part of a multicomponent quality improvement program were evaluated in a total of 26,751 surgical encounters: 9,417 prior to the clinical implementation of AWOL-S screening (baseline), and 17,334 after implementation. We calculated a differential relative risk (interaction term; Table 2), which shows the additional change in medication use in high-risk patients after using low-risk patients to adjust for secular trends. The point estimate for all medications studied was less than 1, suggesting an overall lower rate of deliriogenic medications in patients who screened high-risk for delirium, although confidence intervals were typically wide (Table 2). There was a statistically significant decrease in anticholinergic use; patients at high risk for delirium had a 21% lower relative risk (95% confidence interval 3–36%; p=0.02) of receiving perioperative anticholinergics after AWOL-S screening was clinically implemented, after adjusting for rate of change in low-risk patients.
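The interaction term can be read as a ratio of two relative risks: the before-to-after change in high-risk patients divided by the same change in low-risk patients. The sketch below computes that crude ratio from the rounded percentages in Table 2. It is an illustration only; the function name is hypothetical, and because it uses rounded inputs and ignores clustering by patient, it approximates but does not reproduce the model-based estimates and confidence intervals reported in the table.

```python
def crude_differential_rr(baseline_low, baseline_high, after_low, after_high):
    """Ratio of relative risks (after vs. before) in high- vs. low-risk patients.

    All inputs are proportions of encounters in which the medication was given.
    """
    rr_high = after_high / baseline_high  # change over time in high-risk patients
    rr_low = after_low / baseline_low     # secular trend in low-risk patients
    return rr_high / rr_low


if __name__ == "__main__":
    # Rounded proportions from the "Any anticholinergic" and
    # "Any benzodiazepine" rows of Table 2.
    anticholinergic = crude_differential_rr(0.12, 0.086, 0.12, 0.065)
    benzodiazepine = crude_differential_rr(0.54, 0.37, 0.50, 0.33)
    print(f"anticholinergic: {anticholinergic:.2f}")
    print(f"benzodiazepine: {benzodiazepine:.2f}")
```

A value below 1 means medication use fell more (or rose less) in high-risk patients than the trend in low-risk patients would predict.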
Discussion:
We demonstrate moderate performance of an EMR-based postoperative delirium screening tool which incorporates both automatically-generated (e.g., age) and physician- and nursing-based (i.e., ASAPS or illness severity, and cognitive screening) components to calculate a summary probability of postoperative delirium and delirium risk stratification which is visible to all perioperative care providers via the EMR. We have implemented this tool, termed AWOL-S, for risk stratification in our perioperative care area, aligning our practice with recent recommendations from the American Society for Enhanced Recovery and Perioperative Quality Initiative.1 AWOL-S risk stratification was associated with a modest but statistically significant reduction in anticholinergic medication administration to high-delirium-risk patients, after adjusting for secular trends. The AWOL-S tool represents an improvement over the existing AWOL tool for predicting postoperative delirium and, we feel, offers a useful paradigm for EMR-based delirium risk stratification in a broad population of surgical patients.
While the performance of the AWOL-S tool is moderate, it remained roughly stable across 3 additional, independent cohorts from the same target population. Importantly, it represented a major improvement from the AWOL tool, which has similar, acceptable performance in a medical population,8,15 but proved poor at predicting postoperative delirium. One notable exception to the generally moderate performance of broadly-applied postoperative delirium risk stratification tools was published by Kim and colleagues, who demonstrated a ROC-AUC of 0.91 with their 7-factor model in a same-center validation cohort.6 However, their tool was tested only in general, vascular, and trauma surgery, and some of the factors in their model, particularly C-reactive protein and ICU admission, may not be universally assessed or reliably known at the start of surgery. In the medical population, an EMR-based clinical decision support tool for delirium achieved a ROC-AUC of 0.76 when applied to older adults, but surgical patients were not the focus of this tool,5 or of a similar EMR-based tool applied to the United States Veteran’s Affairs medical population.16 An EMR-based tool for predicting postoperative delirium in a semi-automated fashion was necessary to harmonize the perioperative departments with our institutional Delirium Reduction Campaign goals, prompting this work.
The AWOL-S tool was derived in a cohort which excluded emergency cases and patients who were preadmitted before their surgery; thus, the performance might be expected to be poorer in those patients. We found moderate performance in a post-hoc sensitivity analysis, which was not different from the performance in the target population of elective surgical patients. We had initially planned not to include patients undergoing nonelective surgery in the AWOL-S risk stratification quality improvement pathway, but after it became evident that the AWOL-S tool better predicted delirium than the unmodified AWOL tool, our practice is now to use AWOL-S risk stratification for every patient who enters the pre-anesthesia care unit, regardless of emergent/preadmitted status18.
Implementation of the AWOL-S tool was associated with an approximately 2% absolute risk reduction (from 8.6% to 6.5%) in the administration of anticholinergic medications to those at high delirium risk, translating to 73 high-risk patients who did not receive an anticholinergic medication during the yearlong AWOL-S implementation period. During the study period, rates of benzodiazepine administration decreased in our population, regardless of delirium risk. While this study was not designed to prove causality, the ability to adjust for concurrent local trends in medication administration (i.e., controlling for medication use rates in low-risk patients) makes it less likely that the finding was due to secular trends alone.
There are important limitations to this work. First, the incidence of delirium in the population of non-emergent surgical inpatients which formed the target population for this risk stratification tool was approximately 5% throughout the time period under study, which likely represents undercounting. Hypoactive delirium is particularly vulnerable to undercounting. The NuDESC tool may be one of the more sensitive options to detect delirium in the postoperative setting,17 although it, and the CAM-ICU, have elsewhere been criticized for low sensitivity.11 However, screening all postoperative patients for delirium in an institution of this size using the gold standard (a face-to-face interview with a qualified clinician using the Diagnostic and Statistical Manual of Mental Disorders) was infeasible. Second, the AWOL and AWOL-S tools were not consistently adapted for use in non-English-speaking patients; accordingly, data from these patients are mostly unavailable, although adapting the tools has been a focus for refinement of our ongoing quality improvement initiative. Third, delirium was defined as any positive delirium screen in the first 7 postoperative days, so the cases identified by this tool likely represent a heterogeneous mixture of delirium severity and duration. Fourth, the clinical impact of the intervention was assessed using a post hoc, exploratory analysis which should be interpreted conservatively; there is no established threshold for clinical significance of anticholinergic medication use, and further, we are unable to draw any conclusions about the impact of AWOL-S risk stratification on delirium because of evolving delirium interventions independent of the perioperative setting. Finally, delirium prediction tools are notorious for their broad failure to hold external validity.3
This work was designed as a quality improvement initiative, and thus external validity was not an explicit goal; if this tool is implemented in other settings, reweighting the odds ratios according to local needs should be strongly considered.
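To make concrete what reweighting the odds ratios would involve, the sketch below shows how a set of odds ratios from a logistic regression maps to a predicted probability. It is an illustration under stated assumptions only: the predictor names, baseline odds, and odds ratio values are hypothetical placeholders, not the AWOL-S coefficients, and the function is not the authors' implementation.

```python
import math


def predicted_probability(baseline_odds, odds_ratios, predictors):
    """Combine a baseline odds with per-predictor odds ratios.

    baseline_odds: odds of delirium when every predictor is 0
    odds_ratios:   mapping of predictor name -> odds ratio
    predictors:    mapping of predictor name -> value (0/1 or a count)
    """
    log_odds = math.log(baseline_odds)
    for name, value in predictors.items():
        # Each predictor shifts the log-odds by log(OR) per unit.
        log_odds += math.log(odds_ratios[name]) * value
    return 1.0 / (1.0 + math.exp(-log_odds))


# HYPOTHETICAL placeholder values, for illustration only.
placeholder_odds_ratios = {
    "older_age": 2.0,
    "failed_cognitive_screen": 2.0,
    "high_risk_surgery": 1.5,
}
patient = {"older_age": 1, "failed_cognitive_screen": 0, "high_risk_surgery": 1}

risk = predicted_probability(0.02, placeholder_odds_ratios, patient)
print(f"Predicted probability: {risk:.3f}")
```

In this framing, local reweighting amounts to refitting the model on local data and substituting the resulting baseline odds and odds ratios, while the predictors themselves stay the same.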
In summary, we describe the derivation, validation, and sustained performance of a semi-automated EMR-based postoperative delirium prediction tool which we implemented as part of a quality improvement initiative focusing on delirium. The tool has moderate performance across 3 independent elective surgical cohorts as well as a nonelective cohort in our population. Patient stratification to a high-delirium-risk group was associated with a reduction in perioperative use of delirium-precipitating medications, particularly anticholinergics, suggesting risk stratification impacted clinical care. AWOL-S may serve as a useful paradigm for other institutions interested in implementing broad perioperative delirium risk stratification.
Key points:
Question:
How can we accomplish electronic medical record-based postoperative delirium risk stratification to target resources toward high-risk patients?
Findings:
The AWOL-S semi-automated postoperative risk stratification tool has moderate, stable performance in a single-center cohort of surgical patients, and implementation was associated with a small decrease in anticholinergic, but not benzodiazepine, medication use in high-risk patients.
Meaning:
Although AWOL-S was developed in a single center and has its own limitations, it offers a useful paradigm for broad-based delirium stratification which may impact clinical care.
Acknowledgment:
The authors thank Jon Spinner, Rachelle Armstrong, Adam Jacobson, Amy Hephner, and Carly Deibler of UCSF Clinical Services and Anesthesia Information Technology, University of California, San Francisco, San Francisco, CA, USA, for their assistance in leveraging the electronic medical record to make this work possible.
Funding statement:
Support was provided from institutional and/or departmental sources.
Anne L. Donovan has received research support from the Network for the Investigation of Delirium: Unifying Scientists (Subcontract 91511 of R24AG054259).
Matthias R. Braehler has received funding support from the University of California, San Francisco Clinical and Translational Science Institute Pilot Awards Program.
Emily Finlayson has received research support from the National Institute on Aging (NIA R01 AG044425, NIA P30 AG04428, NIA R21AG054208).
Stephanie Rogers has received research support from the Centers for Disease Control and Prevention (CDC-OPOIOIDS-2017-001 and CDC-STEADI-2016-001).
Vanja C. Douglas has received funding support from the Sara & Evan Williams Foundation Endowed Neurohospitalist Chair.
Elizabeth L. Whitlock has received research support from the National Institutes of Health (R03AG059822, P30AG044281, and KL2TR001879) and the Foundation for Anesthesia Education and Research.
The remaining authors have no funding sources to disclose.
Glossary of Terms:
- NuDESC: Nursing Delirium Screening Scale
- ICU: Intensive care unit
- CAM-ICU: Confusion Assessment Method for the ICU
- AUC-ROC: Area under the receiver operator characteristic curve
- AWOL: Age, WORLD backwards, Orientation, iLlness severity delirium risk stratification tool
- AWOL-S: Abbreviation for AWOL-based surgery-specific delirium risk stratification tool under development
- EMR: Electronic medical record
- UCSF: University of California, San Francisco
- ASAPS: American Society of Anesthesiologists physical status
- PACU: Post-anesthesia care unit
Footnotes
Competing interests: Emily Finlayson is a founding shareholder of Ooney, Inc. The other authors declare no competing interests.
References:
- 1. Hughes CG, Boncyk CS, Culley DJ, et al. American Society for Enhanced Recovery and Perioperative Quality Initiative Joint Consensus Statement on Postoperative Delirium Prevention. Anesth Analg 2020; 130(6):1572–90.
- 2. Hshieh TT, Yang T, Gartaganis SL, Yue J, Inouye SK: Hospital Elder Life Program: Systematic Review and Meta-analysis of Effectiveness. Am J Geriatr Psychiatry 2018; 26: 1015–1033
- 3. Jansen CJ, Absalom AR, de Bock GH, van Leeuwen BL, Izaks GJ: Performance and agreement of risk stratification instruments for postoperative delirium in persons aged 50 years or older. PLoS One 2014; 9: e113946.
- 4. van Meenen LC, van Meenen DM, de Rooij SE, ter Riet G: Risk prediction models for postoperative delirium: a systematic review and meta-analysis. J Am Geriatr Soc 2014; 62: 2383–90
- 5. de Wit HA, Winkens B, Mestres Gonzalvo C, et al.: The development of an automated ward independent delirium risk prediction model. Int J Clin Pharm 2016; 38: 915–23
- 6. Kim MY, Park UJ, Kim HT, Cho WH: DELirium Prediction Based on Hospital Information (Delphi) in General Surgery Patients. Medicine (Baltimore) 2016; 95: e3072.
- 7. Choi NY, Kim EH, Baek CH, Sohn I, Yeon S, Chung MK: Development of a nomogram for predicting the probability of postoperative delirium in patients undergoing free flap reconstruction for head and neck cancer. Eur J Surg Oncol 2017; 43: 683–688
- 8. Douglas VC, Hessler CS, Dhaliwal G, et al.: The AWOL tool: derivation and validation of a delirium prediction rule. J Hosp Med 2013; 8: 493–9
- 9. Gaudreau JD, Gagnon P, Harel F, Tremblay A, Roy MA: Fast, systematic, and continuous delirium assessment in hospitalized patients: the nursing delirium screening scale. J Pain Symptom Manage 2005; 29: 368–75
- 10. Ely EW, Margolin R, Francis J, et al.: Evaluation of delirium in critically ill patients: validation of the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU). Crit Care Med 2001; 29: 1370–9
- 11. Neufeld KJ, Leoutsakos JS, Sieber FE, et al.: Evaluation of two delirium screening tools for detecting post-operative delirium in the elderly. Br J Anaesth 2013; 111: 612–8
- 12. Berian JR, Zhou L, Russell MM, et al.: Postoperative Delirium as a Target for Surgical Quality Improvement. Ann Surg 2018; 268: 93–99
- 13. American Society of Anesthesiologists: ASA Physical Status Classification System, Standards and Guidelines, 23 October 2019. https://www.asahq.org/standards-and-guidelines/asaphysical-status-classification-system, accessed 4/16/2020.
- 14. Ensor J, Kym IE: PMCALPLOT: Stata module to produce calibration plot of prediction model performance. Statistical Software Components, Boston College Department of Economics 2018; S458486
- 15. Brown EG, Josephson SA, Anderson N, Reid M, Lee M, Douglas VC: Predicting inpatient delirium: The AWOL delirium risk-stratification score in clinical practice. Geriatr Nurs 2017; 38: 567–572
- 16. Rudolph JL, Doherty K, Kelly B, Driver JA, Archambault E: Validation of a Delirium Risk Assessment Using Electronic Medical Record Information. J Am Med Dir Assoc 2016; 17: 244–8
- 17. Radtke FM, Franck M, Schust S, et al.: A comparison of three scores to screen for delirium on the surgical ward. World J Surg 2010; 34: 487–94
- 18. Donovan AL, Braehler MR, Robinowitz DL, Anesthesia Resident Quality Improvement Committee, Finlayson E, Rogers S, Douglas VC, Whitlock EL. Reduced perioperative use of potentially inappropriate medications in older adults after a delirium prevention initiative. Anesth Analg, in press.