Author manuscript; available in PMC: 2020 Jul 1.
Published in final edited form as: Int J Geriatr Psychiatry. 2019 Apr 23;34(7):1018–1028. doi: 10.1002/gps.5104

Predicting postoperative delirium severity in older adults: The role of surgical risk and executive function

Heidi Lindroth 1,2,3, Lisa Bratzke 2, Sara Twadell 1,4, Paul Rowley 1, Janie Kildow 1,5, Mara Danner 1, Lily Turner 1, Brandon Hernandez 1, Roger Brown 2, Robert D Sanders 1,*
PMCID: PMC6579704  NIHMSID: NIHMS1019646  PMID: 30907449

Abstract

Objectives:

Delirium is an important postoperative complication, yet predictive risk factors for postoperative delirium severity remain elusive. We hypothesized that the NSQIP risk calculation for serious complications (NSQIP-SC) or risk of death (NSQIP-D), and cognitive tests of executive function (Trail Making Test A and B [TMTA, TMTB]), would be predictive of postoperative delirium severity. Further, we demonstrate how advanced statistical techniques can be used to identify candidate predictors.

Methods/Design:

Data from an ongoing perioperative prospective cohort study of 100 adults (≥65 years old) undergoing non-cardiac surgery were analyzed. In addition to NSQIP-SC, NSQIP-D, TMTA, and TMTB, we collected participant age, sex, American Society of Anesthesiologists (ASA) score, tobacco use, surgery type, depression, Framingham risk score, and preoperative blood pressure. The Delirium Rating Scale-R-98 (DRS) measured delirium severity, and the Confusion Assessment Method (CAM) identified delirium. LASSO and Best Subsets linear regression were employed to identify predictive risk factors.

Results:

Ninety-seven participants with a mean age of 71.68±4.55, 55% male (31/97 CAM+, 32%) and a mean Peak DRS of 21.5±6.40 were analyzed. LASSO and Best Subsets regression identified NSQIP-SC and TMTB to predict postoperative delirium severity (p<0.001, Adj. R2: 0.30). NSQIP-SC and TMTB were also selected as predictors for postoperative delirium incidence (AUROC 0.81, 95%CI 0.72–0.90).

Conclusions:

In this cohort, we identified NSQIP Risk score for Serious Complications and a measure of executive function, TMT-B, to predict postoperative delirium severity using advanced modeling techniques. Future studies should investigate the utility of these variables in a formal delirium severity prediction model.

Keywords: perioperative, delirium, severity, risk, aging, executive function

Introduction

Delirium, a type of acute brain failure, is a common surgical complication experienced by approximately 50% of patients, incurring an estimated annual U.S. cost of $152 billion.1–3 It is a crucial public health concern because it is significantly associated with increased mortality4 and with morbidity in terms of cognitive decline5,6 and loss of independence.7 A recent study identified an association between delirium severity and subsequent 3-year cognitive decline; the rate of cognitive decline nearly tripled in those who experienced the most severe delirium.6 Another study demonstrated that delirium severity was associated with increased risk of in-hospital mortality and discharge to an institutional facility instead of returning home.8 Taken together, these studies demonstrate the detrimental impact of high delirium severity on patient outcomes. While there are well-established predictors for delirium incidence, few studies to date have outlined candidate predictors for delirium severity in older adults, and to our knowledge, a prediction model for postoperative delirium severity is not available.9,10 Further, prior delirium prediction modeling employed traditional statistical techniques in samples with limited events, leading to model overfitting.9,11–13 One way to overcome this limitation is to use advanced statistical methodologies such as Least Absolute Shrinkage and Selection Operator (LASSO) and Best Subsets Regression.14–18 Compared with traditional regression procedures, these methods model more efficiently, reduce the bias introduced by standard univariate selection, and limit model overfitting.18 Statistical shrinkage methods such as LASSO are recommended by the Transparent Reporting of a multivariable prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines, an evidence-based guide for prediction model development and validation.19,20

The ability to identify moderate to high-risk individuals prior to their surgery is critical to delirium prevention. To identify potential candidate predictors, we considered the pathogenesis of delirium and sought to identify both predisposing (age, depression, and medical comorbidities) and precipitating (surgery) variables. The Cognitive Disintegration Model21 posits that an individual with increasing risk, or vulnerability, to delirium will require less of a precipitating stimulus to cross over the “Delirium Threshold” and become delirious. This is illustrated in the cognitive trajectory in Figure 1. In contrast, an individual with fewer predisposing risk factors will require a large stimulus to precipitate delirium. Therefore, it is crucial to consider the future precipitating event in delirium severity prediction models when possible. The online surgical risk calculator built and copyrighted by the American College of Surgeons, National Surgical Quality Improvement Program (ACS NSQIP), may be ideal for delirium severity prediction as it includes several predisposing risk factors and the estimated magnitude of the precipitating event, the surgery.22–24 Built using data from 1,414,006 patients including 1,557 distinct surgical procedures, the NSQIP risk score has been widely validated and applied to predict outcomes in various surgical populations, but has not been applied in delirium severity risk assessment.25–29 We hypothesized that NSQIP risk of serious complications (NSQIP-SC) would be a stronger predictor of delirium than NSQIP risk of death (NSQIP-D), as the causal relationship between serious complications and delirium is likely stronger than the association between delirium and the risk of death.30 As delirium is a cognitive disorder, we further hypothesized that cognitive data (which are not included in the NSQIP calculation) could enhance the prediction of the surgical risk scores.
Our recent systematic review identified that current delirium prediction models do not evaluate specific cognitive domains, such as executive function.9 Executive function encompasses diverse higher order cognitive functions such as attention and problem-solving.31 Significant associations between preoperative executive function and postoperative delirium incidence have been reported, and such associations are plausible given the extensive impairment of executive functions in delirium.32–34

Figure 1.

illustrates the Cognitive Trajectory. The relationship between cognitive abilities (predisposing, y axis) and the precipitating event, i.e. surgery over time (x-axis) is shown, with each individual trajectory displayed with a horizontal line. The dashed line, situated above the x-axis of “time”, represents the “Delirium Threshold.” (A) Trajectory #1 (gray line, numbered 1) displays an individual with maximum cognitive abilities. They have a surgery, but do not cross the “Delirium Threshold.” Trajectory #2 (blue line, numbered 2) contrasts #1 by showing an individual with decreased cognitive abilities. This individual undergoes the same surgery and crosses over the “Delirium Threshold” to experience delirium. (B) Trajectory #3 (black-dashed line, numbered 3) returns to an individual with maximum cognitive abilities. A sufficiently large precipitating event will push this individual across the “Delirium Threshold”, inducing delirium. Trajectory #1 (gray line) is transposed onto this graph to show the difference in magnitude and impact of the precipitating event. (C) When developing a prediction model for delirium, it may be important to consider not only the predisposing risk factors, but also the influence of the precipitating event. A surgical risk score such as NSQIP combines both predisposing risk and the future-precipitating event into one score, which may be optimal for postoperative delirium severity prediction.

The purpose of this study is twofold. First, to examine the potential of the NSQIP risk scores and a measure of executive function to predict postoperative delirium severity among other potential candidate predictors. Second, to demonstrate the ability of advanced statistical modeling procedures, outlined by the TRIPOD guidelines, to identify candidate predictors in a sample with limited events. Our exploratory aim repeated these advanced modeling techniques in postoperative delirium incidence to identify candidate predictors. These findings will benefit individuals who are working towards identifying patients at moderate to high risk for postoperative delirium severity including researchers and clinicians. Results of this study, including the advanced modeling procedures, should be used to inform future studies focused on developing prediction models for postoperative delirium severity.

Methods

Source of Data and Participants

This analysis is a sub-study drawn from an ongoing prospective perioperative cohort study that is approved by the University of Wisconsin-Madison, Health Sciences Institutional Review Board (#2015–0374) and registered with ClinicalTrials.gov (ref: NCT03124303, NCT01980511).

Between August 2015 and May 2018, 1,054 potential participants were screened from vascular, urology, general and spine surgical clinics with the consent of their surgeons (HL, ST, LT, PR, JK, MD, BH, RDS). As shown in Figure 2, 100 subjects were recruited and 97 were included in the final analysis.

Figure 2.

displays the inclusion and exclusion criteria and a flowchart detailing study screening, recruitment, consent, and attrition numbers.

Outcome Measurement

The primary outcome was delirium severity. The exploratory outcome was delirium incidence. Data were collected by a trained research team composed of the principal investigator (physician), graduate students (nurse, neuroscience), and medical and undergraduate nursing students.

Delirium Severity and Incidence: Pre- and postoperatively, participants were formally assessed for delirium severity and incidence using the widely validated Confusion Assessment Method-long form (CAM),35 3D-CAM,36 and Delirium Rating Scale-R-98 (DRS),37 twice daily, between the hours of 0500–1000 and 1600–2200 on postoperative days one through four. The DRS-R-98 (DRS) is a 16-item assessment tool that measures delirium symptoms and severity. The maximum score is 44 points; a higher score indicates worse delirium.37 The CAM and 3D-CAM were administered concurrently to provide both a comprehensive view of delirium symptoms and a structured interview format. If the participant was CAM positive at the postoperative day-four PM assessment, the participant was followed until delirium resolved (delirium duration). If participants were ventilated in the intensive care unit (ICU), the CAM-ICU38 was administered. The Delirium Rating Scale-R-98 was applied in both settings due to its robust psychometric properties and was administered by trained research staff.37
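To make the outcome definitions concrete, the following is a minimal sketch of how a peak DRS score and a CAM-based incidence flag could be derived from repeated assessments. The record layout, participant IDs, and scores are hypothetical and purely illustrative; the study's actual data handling is not described at this level of detail.

```python
from collections import defaultdict

# Hypothetical records: (participant_id, postoperative_day, session, drs_total, cam_positive)
assessments = [
    ("P01", 1, "AM", 8, False),
    ("P01", 1, "PM", 19, True),
    ("P01", 2, "AM", 23, True),
    ("P01", 2, "PM", 12, False),
    ("P02", 1, "AM", 4, False),
    ("P02", 1, "PM", 6, False),
]


def summarize(records):
    """Return {participant: (peak DRS total, ever CAM-positive)}."""
    peak = defaultdict(int)
    delirious = defaultdict(bool)
    for pid, _day, _session, drs, cam in records:
        peak[pid] = max(peak[pid], drs)          # severity outcome: highest score observed
        delirious[pid] = delirious[pid] or cam   # incidence outcome: any positive CAM
    return {pid: (peak[pid], delirious[pid]) for pid in peak}


summary = summarize(assessments)
for pid, (peak_drs, had_delirium) in summary.items():
    print(pid, peak_drs, had_delirium)
```

Taking the maximum over all assessments mirrors the "Peak DRS Total Score" used as the severity outcome, while the incidence flag is set by any single positive CAM.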

Predictors

Preoperatively, participants underwent an interview and completed assessments of executive function, functional status, depression, and delirium using the Confusion Assessment Method (CAM).35 Executive function was assessed through two well-validated and widely used measures, Trail Making Test A (TMTA) and Trail Making Test B (TMTB).39 These tests are quick to administer and require participants to connect a series of circles in ascending order. Scoring is based on time to completion; a longer completion time indicates worse executive function. Functional ability was assessed using the Instrumental Activities of Daily Living (iADL).40 Depression was assessed with the Geriatric Depression Scale-15 (GDS).41 Demographics, vitals, comorbidities, outpatient medications and American Society of Anesthesiologists (ASA) classification score were collected. Vascular surgery was selected a priori to be included as a covariate to examine whether surgical type was sufficient to predict delirium or if the surgical risk score provided important information for the prediction model. Preoperative comorbidities as predisposing risk factors were assessed through composite measures including the ASA score, the Framingham Cardiovascular Disease Ten-year Risk Calculator (Framingham CVD), and the NSQIP surgical risk calculator.22,42–44 Composite measures provide the ability to combine several risk factors into a single score, which is advantageous to statistical modeling and clinical application. The ACS NSQIP online surgical risk calculator45 (http://riskcalculator.facs.org) was used to obtain the risk scores for serious complications (NSQIP-SC) and death (NSQIP-D). This calculator employs twenty patient preoperative risk factors and pairs these with the Current Procedural Terminology (CPT) code, providing a risk score specific to each procedure. Risk is calculated from both predisposing factors and precipitating factors (Figure 1).
The input variables are age, sex, functional status (independent, partially dependent, dependent), emergency case, ASA classification, steroid use for chronic condition, ascites within 30 days prior to surgery, systemic sepsis within 48 hours prior to surgery, ventilator dependency, disseminated cancer, diabetes, hypertension with medications, congestive heart failure (within 30 days prior to surgery), dyspnea, current smoker (within 1 year), history of severe COPD, dialysis, acute renal failure, height and weight, as well as surgical procedure. There are 1,557 distinct CPT codes, ranging from minor surgeries such as a cholecystectomy to major surgeries such as thoracoabdominal aortic aneurysm repair.

Sample Size

Sample size was based on linear regression and determined using the rule of 8–10 outcome events (delirium) per variable.46 The decision to analyze was made after 100 participants were recruited with a delirium incidence rate of 32%.
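As a rough illustration of the events-per-variable rule, the sketch below converts a cohort size and event rate into the number of predictors the rule would support. The function name is hypothetical, and the calculation is a back-of-the-envelope reading of the rule rather than the authors' formal sample size computation.

```python
def max_candidate_predictors(n_participants, event_rate, events_per_variable):
    """Return (expected events, predictors supported) under an events-per-variable rule."""
    events = round(n_participants * event_rate)
    return events, events // events_per_variable


# 100 recruited participants with a 32% delirium incidence, as reported in the text.
for epv in (8, 10):
    events, n_predictors = max_candidate_predictors(100, 0.32, epv)
    print(f"{events} events at {epv} events per variable -> up to {n_predictors} predictors")
```

Under these assumptions the rule supports only three to four predictors, which is consistent with the small final model reported later.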

Missing Data

Missing data were identified in the following variables (number missing): TMTB (1), GDS15 (1), TMTA (6), and Tobacco Pack Years (10). Little’s test of missing completely at random (MCAR) was not significant, indicating that the data were consistent with being missing completely at random and likely did not influence the analysis. Multiple imputation for these missing values was completed using single value regression analysis. As recommended by Jakobsen et al. (2017), the analysis was completed using data generated from the multiple imputations and a sensitivity analysis was completed using list-wise deletion.47–49
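The contrast between regression-based imputation and list-wise deletion can be sketched as follows. This is a simplified illustration on synthetic data with hypothetical variable names: it fills each gap with a single regression prediction and does not reproduce a full multiple-imputation procedure or the software used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 97

# Synthetic stand-ins for two correlated timed tests (values are not study data).
score_a = rng.normal(42, 16, n)
score_b = 2.0 * score_a + rng.normal(15, 20, n)

# Knock out six values of score_b to mimic missing assessments.
score_b_obs = score_b.copy()
score_b_obs[rng.choice(n, size=6, replace=False)] = np.nan
missing = np.isnan(score_b_obs)

# Option 1: list-wise deletion keeps only complete cases.
complete_a = score_a[~missing]
complete_b = score_b_obs[~missing]

# Option 2: regression imputation fits score_b ~ score_a on complete cases
# and fills each gap with the fitted value.
slope, intercept = np.polyfit(complete_a, complete_b, deg=1)
score_b_imputed = score_b_obs.copy()
score_b_imputed[missing] = intercept + slope * score_a[missing]

print("complete cases:", complete_b.shape[0], "of", n)
print("imputed values:", np.round(score_b_imputed[missing], 1))
```

Running the main analysis on the imputed data and repeating it on complete cases only is the kind of sensitivity check the paragraph above describes.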

Statistical Analysis

Patient characteristics were described using means ± standard deviations for continuous variables and frequency counts with percentages for categorical variables. Depending on the distribution of the data, continuous variables were compared using Student’s t-test or the Mann-Whitney U-test. Categorical variables were compared using the χ2 test. The primary outcome variable, delirium severity, was measured using the Peak DRS Total Score (DRS) for linear regression. Statistical significance was defined as a p-value ≤0.05. NCSS v12.0, Stata/IC v15.0 and R v1.1453 were used for statistical analysis. HL and RB conducted the statistical analysis.

The DRS score was positively skewed and was therefore transformed using the Box-Cox Method50 with the optimal lambda value; Figure 3-A and B show the raw and transformed distributions. The independent variables NSQIP-SC, NSQIP-D, TMTA, TMTB, tobacco pack years, and GDS15 also demonstrated a positive skew. These were not transformed, as the distribution of the independent variables does not violate modeling assumptions.51,52 The errors of the regression model were examined and were normally distributed, and the remaining assumptions of the regression model were examined and met.

First, to identify candidate predictors for delirium severity and demonstrate the development of a prediction model, we employed LASSO and Best Subsets regression. We did not use univariate statistics to select candidate predictors, as this may lead to poorly performing predictors and overfitting.53 To counter the effects of small sample sizes and reduce bias within the data, we employed a statistical shrinkage regression technique, the Least Absolute Shrinkage and Selection Operator (LASSO).54 This technique reduces the noise within the data, allowing true signals to be detected, and avoids common problems such as model overfitting. Further, this technique has been shown to outperform traditional statistical modeling procedures such as forward stepwise regression, leading to an efficient and precise modeling procedure.18 Candidate variables demonstrating the smallest Mallows’ Cp value,55 indicating precise predictors, were then applied in Best Subsets regression. Best Subsets regression is an automated regression approach that evaluates all possible combinations of candidate predictors.56 The output provides a set of models with model fit statistics. Model selection was based on assessment of model fit using the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and adjusted R2.57

Second, to evaluate the predictive ability of NSQIP-SC over NSQIP-D (as composites of the predisposing and precipitating factors), linear regression models were completed, nested, and compared to ASA classification and Framingham risk (as measures of predisposing factors only).
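The two-stage idea (shrinkage to screen candidates, then exhaustive comparison of the survivors) can be sketched in a few lines. This is a minimal illustration on synthetic data, not the authors' NCSS/Stata/R workflow: the variable names, the fixed penalty `alpha=0.5`, the hand-written coordinate-descent LASSO, and ranking by BIC alone are all assumptions made for the example.

```python
import itertools
import numpy as np

rng = np.random.default_rng(42)
n = 97
names = ["risk_score", "exec_time", "age", "pulse_pressure", "depression"]  # hypothetical
X = rng.normal(size=(n, len(names)))
# Synthetic outcome driven only by the first two columns.
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(scale=1.0, size=n)


def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly zero when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + alpha*||b||_1 (X standardized, y centered)."""
    n_obs, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            partial = y - X @ beta + X[:, j] * beta[j]  # residual ignoring predictor j
            rho = X[:, j] @ partial / n_obs
            beta[j] = soft_threshold(rho, alpha) / (X[:, j] @ X[:, j] / n_obs)
    return beta


def fit_stats(X_sub, y):
    """Ordinary least squares fit; returns adjusted R2, AIC, and BIC (up to a constant)."""
    n_obs, k = X_sub.shape
    design = np.column_stack([np.ones(n_obs), X_sub])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    adj_r2 = 1.0 - (rss / (n_obs - k - 1)) / (tss / (n_obs - 1))
    aic = n_obs * np.log(rss / n_obs) + 2 * (k + 1)
    bic = n_obs * np.log(rss / n_obs) + np.log(n_obs) * (k + 1)
    return adj_r2, aic, bic


# Stage 1: LASSO screening on standardized predictors.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
beta = lasso_cd(X_std, y - y.mean(), alpha=0.5)
kept = [j for j in range(len(names)) if beta[j] != 0.0]

# Stage 2: best subsets over every combination of the retained predictors.
candidates = []
for size in range(1, len(kept) + 1):
    for subset in itertools.combinations(kept, size):
        adj_r2, aic, bic = fit_stats(X[:, list(subset)], y)
        candidates.append({"vars": [names[j] for j in subset],
                           "adj_r2": adj_r2, "aic": aic, "bic": bic})

best = min(candidates, key=lambda m: m["bic"])
print("LASSO kept:", [names[j] for j in kept])
print("Best subset by BIC:", best["vars"], "adj R2 =", round(best["adj_r2"], 2))
```

In practice the penalty would be tuned (for example by cross-validation) rather than fixed, and AIC, BIC, and adjusted R2 would be weighed together as the text describes.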

Figure 3.

illustrates the postoperative delirium symptom severity prediction model. Box A is a histogram showing the data distribution of the Peak Delirium Rating Scale Score (DRS). This value was transformed using the Box-Cox Method with an optimal lambda value of 0.35 achieving a near Gaussian distribution and is shown on the histogram in Box B. Boxes C-E display the predicted burden of delirium symptoms based on the NSQIP-SC and TMTB prediction model (Box C) and univariate analysis of NSQIP-SC (Box D) and TMTB (Box E). The statistics from each regression model are shown in the upper left hand corner of each box. The univariate NSQIP-SC regression model was analyzed with 97 participants. Due to one missing assessment of TMTB, Box C and E are analyzed with 96 participants.
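For reference, the standard one-parameter Box-Cox transformation is written out below, followed by a worked value at the reported lambda of 0.35. The input score of 21.5 is chosen only for illustration, and the formula assumes strictly positive scores; whether any offset was applied to the raw DRS values is not stated in the text.

```latex
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\[6pt]
\ln y, & \lambda = 0
\end{cases}
\qquad
\text{e.g. } \lambda = 0.35,\; y = 21.5:\quad
y^{(0.35)} = \frac{21.5^{0.35} - 1}{0.35} \approx \frac{2.927 - 1}{0.35} \approx 5.50
```

Because the exponent is below one, large scores are compressed more than small ones, which is how the transformation reduces positive skew.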

The regression modeling procedures outlined in the paragraph above for linear regression were repeated for the logistic regression model to select candidate predictors for delirium incidence. The area under the receiver operating characteristic curve (AUROC) with 95% CI was calculated. Calibration was assessed through goodness-of-fit tests calculated by the Hosmer-Lemeshow statistic. Sensitivity, specificity, positive predictive and negative predictive values were calculated and reported.
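The discrimination and classification metrics named above can be computed directly from predicted probabilities and observed outcomes. The sketch below uses a small hypothetical set of predictions and an assumed 0.5 decision threshold; it illustrates the definitions only and does not reproduce the study's results or its Hosmer-Lemeshow calibration test.

```python
def auroc(scores, labels):
    """Probability that a random case outranks a random non-case (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(scores, labels, threshold):
    """Sensitivity, specificity, PPV, and NPV at a given probability threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical predicted probabilities and observed delirium status (1 = delirium).
predicted = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2, 0.1, 0.4]
observed = [1, 1, 1, 0, 0, 0, 0, 0]

area = auroc(predicted, observed)
metrics = classification_metrics(predicted, observed, threshold=0.5)
print(round(area, 3), metrics)
```

The pairwise definition of the AUROC used here is equivalent to the area under the ROC curve and avoids having to trace the curve explicitly.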

Results

Thirty-one participants (32%) experienced postoperative delirium with a mean peak DRS severity total score of 21.48 (±SD 6.40). Forty-two percent of delirium cases were hypoactive. The median delirium duration was one day (24 hours). Participant characteristics are summarized in Table 1. Delirious patients had higher preoperative NSQIP risk scores of serious complications (NSQIP-SC) and death (NSQIP-D), worse executive function tests, were more likely to have had a vascular surgery, and higher ASA status (univariate, p<0.05). Significant pairwise correlations were demonstrated between the DRS and NSQIP-SC, NSQIP-D, TMTA and TMTB (univariate, p<0.05).

Table 1.

Description of sample and significant differences between no delirium and delirium

| Variable | Total (N=97), Mean (SD) or n (%) | No Delirium (N=66) | Delirium (N=31) |
|---|---|---|---|
| Age | 71.68 (4.55) | 71.71 (4.79) | 71.61 (4.09) |
| Sex | 55 (male, 55%) | 39 (59%) | 16 (52%) |
| Years of Education | 3 (<12yrs, 3%) | 1 (1%) | 2 (6%) |
| | 29 (12yrs, 30%) | 19 (29%) | 10 (32%) |
| | 65 (>12yrs, 67%) | 46 (70%) | 19 (61%) |
| NSQIP-SC | 17.56 (11.66) | 13.93 (9.34) | 25.27 (12.49)*** |
| NSQIP-D | 2.52 (3.87) | 1.85 (3.45) | 3.95 (4.38)** |
| Framingham CVD | 35.86 (19.6) | 35.99 (20.74) | 26.52 (16.98) |
| ASA | 2.64 (3.87) | 2.56 (0.61) | 2.84 (0.52)* |
| Preoperative SBP | 133 (17) | 133 (17) | 135 (17) |
| Preoperative DBP | 74 (10) | 74 (11) | 74 (10) |
| Preoperative PP | 59 (16) | 59 (16) | 61 (17) |
| Preoperative MAP | 94 (11) | 94 (12) | 94 (10) |
| Tobacco Pack Years | 19 (24) | 17 (22) | 25 (29) |
| Current tobacco user | 19 (yes, 16%) | 10 (15%) | 6 (19%) |
| Past tobacco user | 65 (yes, 67%) | 43 (65%) | 22 (71%) |
| Type of Surgery-Vascular | 38 (vascular, 39%) | 22 (33%) | 16 (51%)* |
| Other | 12 (general, 12%) | 44 (66%) | 15 (48%) |
| | 36 (Spine, 37%) | | |
| | 11 (Urology, 11%) | | |
| GDS15 | 2.47 (2.49) | 2.35 (2.48) | 2.74 (2.53) |
| Preoperative TMTA | 42.10 (16.67) | 39.53 (13.785) | 47.58 (20.67) |
| Preoperative TMTB | 98.73 (52.4) | 89.92 (46.45) | 117.48 (60.02)* |
| Peak DRS Total Score | 11.5 (8.3) | 6.82 (3.74) | 21.48 (6.40)*** |
| Delirium duration (median) | | | 1 day |
| Delirium subtypes (%) | | | Hypoactive: 42 |
| | | | Mixed: 32 |
| | | | Hyperactive: 19 |
| | | | RASS 0: 6 |

Significance levels: * p<0.05; ** p<0.001; *** p<0.0001

Abbreviations: ASA=American Society of Anesthesiologists classification score, DBP=Diastolic Blood Pressure, DRS=Delirium Rating Scale-R-98, Framingham CVD=Framingham Cardiovascular Risk Score, GDS15=Geriatric Depression Scale-15, MAP=Mean Arterial Pressure, NSQIP-SC=National Surgical Quality Improvement Program Risk for Serious Complications, NSQIP-D=National Surgical Quality Improvement Program Risk for Death, PP=Pulse Pressure, SBP=Systolic Blood Pressure, SD=Standard Deviation, TMTA=Trail Making Test A, TMTB=Trail Making Test B

Predictor Selection, Model Development, and Performance

LASSO identified NSQIP-SC, vascular surgery, Framingham CVD, preoperative pulse pressure and mean arterial pressure, tobacco pack years, TMTA, and TMTB as predictors for postoperative delirium severity; these are displayed in Supplemental Figure S-2. These variables were applied to Best Subsets regression. The model demonstrating optimal fit statistics was a two-factor linear regression model containing preoperative NSQIP-SC and TMTB. This two-factor model had an adjusted R2 of 0.30 (p<0.001), thus explaining 30% of the variability in observed delirium symptoms. For every 1-percent increase in the NSQIP-SC score, the peak DRS total score increased by 0.29 points. Equivalently, a 10% increase in the NSQIP-SC score corresponded to a 2.9-point increase in the Peak DRS total score. Further model details are displayed in supporting information, Table S-1 and Figure 3-C-E. Age, sex, NSQIP-D, ASA, tobacco pack years, vascular surgery, Framingham CVD, GDS15, TMTA, and Pre-BP were not identified as significant predictors in Best Subsets modeling procedures. As a sensitivity analysis, these statistical procedures were completed following list-wise deletion due to missing values (n=79). This analysis identified the same two-factor model containing NSQIP-SC and TMTB.

LASSO and Best Subsets regression identified NSQIP-SC and TMT-B as predictors for postoperative delirium severity. To further examine how NSQIP-SC compared to the other surgical/vascular composite risk scores of NSQIP-D, Framingham CVD, and ASA status, we used standard linear regression models. Preoperative NSQIP-SC was confirmed as a predictor of postoperative peak DRS using simple linear regression models for NSQIP-SC (p<0.0001, AdjR2: 0.184). NSQIP-D was also significantly associated with DRS (p=0.04, AdjR2: 0.03). However, when nested, the NSQIP-SC model demonstrated higher adjusted R2, and lower AIC (1.525) and BIC (−290.718) metrics, providing support for that predictor over NSQIP-D. Similar to logistic regression for postoperative delirium incidence, ASA (p=0.04, AdjR2: 0.03) and Framingham CVD score (p=0.69, AdjR2: 0.01) did not demonstrate a strong predictive relationship with DRS. Further model comparison statistics are shown in Table 2.

Table 2.

Linear and Logistic Regression Model Statistics for NSQIP-SC, NSQIP-D, ASA and Framingham 10-year Cardiovascular Risk. These models were nested to compare performance.

Linear Regression with Transformed DRS

| F-Stat | Model p-value | Variable | Coef. | Std. Error | (95%CI) | p-value | Std. Betas | Adj R-Squared | AIC | BIC |
|---|---|---|---|---|---|---|---|---|---|---|
| 22.68 | <0.0001 | NSQIP-SC | 0.02 | 0.004 | (0.01–0.03) | <0.001 | 0.44 | 0.18 | 1.525 | −16.19 |
| 4.25 | <0.05 | NSQIP-D | 0.03 | 0.01 | (0.001–0.06) | <0.05 | 0.21 | 0.03 | 1.695 | 0.33 |
| 4.18 | <0.05 | ASA | 0.20 | 0.096 | (0.01–0.39) | <0.05 | 0.21 | 0.03 | 1.696 | 0.40 |
| 0.16 | P=0.69 | Framing. | 0.001 | 0.003 | (−0.004–0.007) | P=0.69 | 0.04 | 0.002 | 1.737 | 4.14 |

Logistic Regression with Delirium Incidence

| LR chi2 | Model p-value | Variable | Odds Ratio | Std. Error | (95%CI) | p-value | AUROC (95%CI) | McKelvey & Zavoina’s Adj R2 | AIC | BIC’ |
|---|---|---|---|---|---|---|---|---|---|---|
| −50.67 | <0.0001 | NSQIP-SC | 1.09 | 0.03 | (1.05–1.15) | <0.001 | 0.76 (0.66–0.87) | 0.25 | 1.086 | −15.63 |
| −57.80 | <0.05 | NSQIP-D | 1.15 | 0.07 | (1.02–1.29) | <0.05 | 0.73 (0.62–0.84) | 0.08 | 1.233 | −1.38 |
| −58.44 | <0.05 | ASA | 2.24 | 0.86 | (1.06–4.76) | <0.05 | 0.63 (0.53–0.73) | 0.07 | 1.246 | −0.09 |
| −60.74 | P=0.82 | Framing. | 1.00 | 0.11 | (0.98–1.02) | P=0.82 | 0.53 (0.40–0.65) | −0.03 | 1.294 | 4.52 |

Abbreviations: Adj.=adjusted, AIC=Akaike information criterion, ASA=American Society of Anesthesiologists, AUROC=Area Under the Receiver Operating Characteristic Curve, BIC=Bayesian Information Criterion, CI=Confidence Interval, DRS=Delirium Rating Scale-R-98, Framing.=Framingham Cardiovascular Risk Score, LR=Likelihood ratio, NSQIP-SC=National Surgical Quality Improvement Program-Risk of Serious Complications, NSQIP-D=National Surgical Quality Improvement Program-Risk of Death, Std.=Standard.

This study was designed in 2013–2014, prior to the publication of the CAM-S severity tool.58 To examine the reproducibility of these results with different severity tools, we crosswalked the Peak DRS Total Scores to both CAM-S long form and short form scores using the Network for Investigating Delirium: Unifying Scientists (NIDUS) BASIL harmonization tool.59,60 These results are provided in the supplemental section, Table S-3, Figure S-4, and Table S-5.

Exploratory outcome: Candidate predictors and Model Development for Delirium Incidence

A two-factor logistic regression model containing preoperative NSQIP-SC and TMTB was identified by LASSO and Best Subsets regression to predict postoperative delirium incidence. The Hosmer-Lemeshow goodness-of-fit test was not significant (p=0.37), providing no evidence of poor model calibration. The model demonstrated moderate predictive ability (AUROC 0.81, 95% CI: 0.72–0.90); a 5-percent increase in the NSQIP-SC score increased the probability of delirium incidence by 10% (Figure 4). Age, sex, NSQIP-D, ASA, tobacco pack years, vascular surgery, Framingham CVD, GDS15, TMTA, and Pre-BP were not identified as significant predictors. Table 2 displays the nested model comparison between NSQIP-SC, NSQIP-D, ASA, and Framingham CVD. Logistic regression model classification metrics are shown in supporting information, in Table S-1.

Figure 4.

illustrates the predictive ability of the NSQIP-SC and TMTB model for postoperative delirium incidence. (A) Displays the receiver operating characteristic curve and the area under it (AUROC). (B) Demonstrates the predicted probability of postoperative delirium incidence based on the % NSQIP-SC score, holding TMTB constant at zero.

Discussion

Our analysis of a prospective perioperative cohort using the advanced statistical methods of LASSO and Best Subsets regression identified NSQIP-SC to be a predictor of postoperative delirium severity and incidence. The preoperative NSQIP-SC score is a composite variable, combining data on both predisposing risk factors for delirium and surgical severity (i.e., the precipitating event for delirium), and is well positioned to contribute information on the risk of delirium severity. This analysis demonstrates that as the risk for surgical complications increases, so does the risk for severe delirium. Further, NSQIP-SC out-performed age, ASA classification, Framingham Cardiovascular Disease Risk, and type of surgery alone when modeling procedures were applied. This suggests that NSQIP-SC may provide a more accurate estimation of delirium risk compared to age, comorbidities, and acute illness in surgical patients. The executive function measure, TMT-B, was also identified as a predictor for both postoperative delirium severity and incidence. Previous research has identified executive function to be significantly associated with delirium incidence.32–34 However, as identified by our recent systematic review of delirium prediction models in older adults, an executive function measure has not previously been applied in such models.9 Given the breakdown in executive function during delirium and prior data on the predisposition to delirium by impaired cognition, incorporating a cognitive variable appears biologically important; our data show that it is statistically important too. This study expands current knowledge by examining the utility of NSQIP risk scores and executive function in predicting delirium severity and incidence. Further, this study demonstrates the use of advanced statistical modeling procedures to select predictors and to develop a prediction model in a sample with low event numbers.

The NSQIP-D score was significantly associated with both postoperative delirium severity and incidence; however, it was not selected as a predictor in our variable selection, whereas NSQIP-SC was. Delirium often results from a complicated perioperative course, particularly following a major surgery, hence a relationship with NSQIP-SC is plausible. Furthermore, a recent systematic review questioned the strength of the association between postoperative delirium and mortality, hence a priori we hypothesized that NSQIP-SC would perform better than NSQIP-D for predicting postoperative delirium severity and incidence.30 This was supported by several statistical measures in our dataset. The recommended modeling procedures, LASSO and Best Subsets regression,19 did not identify NSQIP-D, vascular burden, comorbidities, smoking history, or depression as important predictors, although these variables have been identified as significant risk factors for postoperative delirium incidence in prior research. This may be due to a number of factors. First, their prevalence in this study population may not be sufficient for prediction. For a risk factor to also be an accurate and useful predictor, it must be sufficiently prevalent in the at-risk population.61 This prevalence is key because, when predicting the risk of an individual surgical patient, each factor selected in any model should exert meaningful influence on the prediction. Second, late-life depression, vascular burden, and tobacco use often co-occur, leading to overlapping data capture. Variables that capture similar information fail to contribute important information during modeling procedures.9

The strengths of this study include its prospective perioperative design, the statistical methods chosen, and rigorous delirium assessments, including outcomes based on severity and incidence of delirium. This study represents a novel application of two statistical methods to select candidate predictors and develop a prediction model in a sample with a low number of outcome events. Further, NSQIP-SC is a readily available online tool that has potential for broad application in delirium-focused clinical care. The identification of at-risk individuals prior to surgery would provide an opportunity to develop a targeted plan of care centered on delirium prevention.62,63 The NSQIP-SC score combines several potential preoperative risk factors for delirium (age, functional status, current tobacco use, vascular burden) with the precipitating event, the planned surgery, and provides a single risk score that is easy to interpret, i.e., a 10% increase in NSQIP-SC corresponds to a 2.9-point increase in delirium severity. Given that the patient becomes delirious postoperatively, quantifying the potential impact of this precipitating event is clearly a key feature of a delirium prediction model.

This study has several limitations to consider. While these models were built using statistical methods optimized for modeling a limited number of events, the small sample size may still have had an effect on the results. However, in this study we were able to model 97 events for delirium severity. Nonetheless, the recommendations of the TRIPOD Statement,19 a guide for prediction modeling and transparent reporting, were followed in using statistical shrinkage procedures to minimize model overfitting. The NSQIP online risk tool is copyrighted and the American College of Surgeons does not currently allow this tool to be incorporated into electronic health records. Nevertheless, it is available online, providing free access to clinicians. The population is largely homogeneous in terms of years of education and ethnicity. Patients with a dementia diagnosis were excluded from participation. This exclusion limits the generalizability of our findings to patients with dementia. In larger and more diverse populations, additional factors may enhance model performance.

Conclusion

In summary, this analysis of a prospective perioperative cohort study identified NSQIP-SC and a measure of executive function, TMTB, as predictors of postoperative delirium severity. Two advanced statistical procedures, LASSO and Best Subsets, were used to select candidate predictors and develop a prediction model in a small sample with limited outcome events. Future studies should focus on the broad external validation of these models, following the statistical methodology demonstrated in this study and outlined by the TRIPOD guidelines.

Supplementary Material

Supp info

Key points:

  • This study addresses an important knowledge gap by examining how surgical risk and executive function predict postoperative delirium severity. To our knowledge, no studies to date have investigated these associations.

  • This study used advanced statistical methods to identify candidate predictors of postoperative delirium severity: the National Surgical Quality Improvement Program Risk for Serious Complications (NSQIP-SC) and an executive function measure, TMTB.

  • A ten percent increase in the NSQIP-SC score corresponded to a 2.9-point increase in the peak Delirium Rating Scale score. This is a straightforward interpretation that could be readily implemented in clinical practice.

  • Both variables, NSQIP-SC and TMTB, were also identified as robust predictors of postoperative delirium incidence.

Acknowledgements:

Much gratitude to those who provided guidance and support throughout the conduct of this study. Supported and promoted study recruitment: Drs. Daniel Abbott, Charles Acher, Paul Anderson, Tracy Downs, and Clifford Tribus. Collected data: Mitch Whalen, Michelle Prihoda, Maggie Schmit, Casandra Stanfield, and Daniel Wayer. Provided statistical guidance: Wesley Chang. Supported study: Staff and clinicians at University of Wisconsin Hospital and Clinics. Critically reviewed the proposal: Drs. Tonya Roberts, Kirk Hogan, Kris Kwekkeboom, and David Dwyer.

Source of Funding: This work was supported by the University of Wisconsin School of Medicine and Public Health, Department of Anesthesiology. Robert D. Sanders received support from the National Institute on Aging, 1K23AG055700–01A1. Heidi Lindroth receives support from the National Institutes of Health, T32 NHLBI 5T32HL091816–07.

Role of the Funder/Sponsor: The funding agency had no role in the study design, data collection, analysis, data interpretation, or the decision to submit the paper for publication.

Clinical Trials: NCT03124303, NCT01980511

Footnotes

Work performed at: University of Wisconsin, School of Medicine and Public Health

Conflict of Interest:

The authors have no conflicts with this project.

References

  • 1. Inouye SK, Westendorp RG, Saczynski JS. Delirium in elderly people. Lancet 2014;383(9920):911–922.
  • 2. Leslie DL, Inouye SK. The importance of delirium: economic and societal costs. Journal of the American Geriatrics Society 2011;59 Suppl 2:S241–243.
  • 3. Diagnostic and Statistical Manual of Mental Disorders-5. 5th ed. American Psychiatric Association; 2013.
  • 4. Ha A, Krasnow RE, Mossanen M, et al. A contemporary population-based analysis of the incidence, cost, and outcomes of postoperative delirium following major urologic cancer surgeries. Urologic oncology 2018.
  • 5. Inouye SK, Marcantonio ER, Kosar CM, et al. The short-term and long-term relationship between delirium and cognitive trajectory in older surgical patients. Alzheimers Dement 2016.
  • 6. Vasunilashorn SM, Fong TG, Albuquerque A, et al. Delirium Severity Post-Surgery and its Relationship with Long-Term Cognitive Decline in a Cohort of Patients without Dementia. J Alzheimers Dis 2018;61(1):347–358.
  • 7. Hshieh TT, Saczynski J, Gou RY, et al. Trajectory of Functional Recovery After Postoperative Delirium in Elective Surgery. Annals of surgery 2017;265(4):647–653.
  • 8. Khan BA, Perkins AJ, Gao S, et al. The Confusion Assessment Method for the ICU-7 Delirium Severity Scale: A Novel Delirium Severity Instrument for Use in the ICU. Crit Care Med 2017;45(5):851–857.
  • 9. Lindroth H, Bratzke L, Purvis S, et al. Systematic review of prediction models for delirium in the older adult inpatient. BMJ Open 2018;8(4):e019223.
  • 10. Sanders RD, Pandharipande PP, Davidson AJ, Ma D, Maze M. Anticipating and managing postoperative delirium and cognitive decline in adults. BMJ (Clinical research ed) 2011;343:d4331.
  • 11. Pavlou M, Ambler G, Seaman SR, et al. How to develop a more accurate risk prediction model when there are few events. BMJ (Clinical research ed) 2015;351:h3868.
  • 12. Debray TPA, Moons KGM, Ahmed I, Koffijberg H, Riley RD. A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis. Statistics in medicine 2013;32(18):3158–3180.
  • 13. Mallett S, Royston P, Dutton S, Waters R, Altman DG. Reporting methods in studies developing prognostic models in cancer: a review. BMC medicine 2010;8:20.
  • 14. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ (Clinical research ed) 2015;350:g7594.
  • 15. Debray TP, Damen JA, Snell KI, et al. A guide to systematic review and meta-analysis of prediction model performance. BMJ (Clinical research ed) 2017;356:i6460.
  • 16. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. European heart journal 2014;35(29):1925–1931.
  • 17. Adams ST, Leveson SH. Clinical prediction rules. BMJ (Clinical research ed) 2012;344.
  • 18. Hastie T. The elements of statistical learning: data mining, inference, and prediction. Second edition. New York: Springer; 2009.
  • 19. Moons KG, Altman DG, Reitsma JB, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Annals of internal medicine 2015;162(1):W1–73.
  • 20. Moons KG, Altman DG, Reitsma JB, Collins GS. New Guideline for the Reporting of Studies Developing, Validating, or Updating a Multivariable Clinical Prediction Model: The TRIPOD Statement. Advances in anatomic pathology 2015;22(5):303–305.
  • 21. Sanders RD. Hypothesis for the pathophysiology of delirium: role of baseline brain network connectivity and changes in inhibitory tone. Medical Hypotheses 2011;77(1):140–143.
  • 22. Bilimoria KY, Liu Y, Paruch JL, et al. Development and evaluation of the universal ACS NSQIP surgical risk calculator: a decision aid and informed consent tool for patients and surgeons. Journal Of The American College Of Surgeons 2013;217(5):833–842.e831–833.
  • 23. Guillamondegui OD, Gunter OL, Hines L, et al. Using the National Surgical Quality Improvement Program and the Tennessee Surgical Quality Collaborative to improve surgical outcomes. Journal Of The American College Of Surgeons 2012;214(4):709–714; discussion 714–716.
  • 24. Davenport DL, Holsapple CW, Conigliaro J. Assessing surgical quality using administrative and clinical data sets: a direct comparison of the University HealthSystem Consortium Clinical Database and the National Surgical Quality Improvement Program data set. Am J Med Qual 2009;24(5):395–402.
  • 25. Bohnen JD, Mavros MN, Ramly EP, et al. Intraoperative Adverse Events in Abdominal Surgery: What Happens in the Operating Room Does Not Stay in the Operating Room. Annals of surgery 2017;265(6):1119–1125.
  • 26. Mogal H, Vermilion SA, Dodson R, et al. Modified Frailty Index Predicts Morbidity and Mortality After Pancreaticoduodenectomy. Ann Surg Oncol 2017;24(6):1714–1721.
  • 27. Helman SN, Brant JA, Moubayed SP, Newman JG, Cannady SB, Chai RL. Predictors of length of stay, reoperation, and readmission following total laryngectomy. The Laryngoscope 2017;127(6):1339–1344.
  • 28. Kubasiak JC, Landin M, Schimpke S, et al. The effect of tobacco use on outcomes of laparoscopic and open ventral hernia repairs: a review of the NSQIP dataset. Surg Endosc 2017;31(6):2661–2666.
  • 29. Cohen ME, Liu Y, Huffman KM, Ko CY, Hall BL. On-demand Reporting of Risk-adjusted and Smoothed Rates for Quality Profiling in ACS NSQIP. Annals of surgery 2016.
  • 30. Hamilton GM, Wheeler K, Di Michele J, Lalu MM, McIsaac DI. A Systematic Review and Meta-analysis Examining the Impact of Incident Postoperative Delirium on Mortality. Anesthesiology 2017;127(1):78–88.
  • 31. Diamond A. Executive functions. Annu Rev Psychol 2013;64:135–168.
  • 32. Smith PJ, Attix DK, Weldon BC, Greene NH, Monk TG. Executive function and depression as independent risk factors for postoperative delirium. Anesthesiology 2009;110(4):781–787.
  • 33. Rudolph JL, Jones RN, Grande LJ, et al. Impaired executive function is associated with delirium after coronary artery bypass graft surgery. Journal of the American Geriatrics Society 2006;54(6):937–941.
  • 34. Fong TG, Hshieh TT, Wong B, et al. Neuropsychological Profiles of an Elderly Cohort Undergoing Elective Surgery and the Relationship Between Cognitive Performance and Delirium. Journal of the American Geriatrics Society 2015;63(5):977–982.
  • 35. Inouye SK, van Dyck CH, Alessi CA, Balkin S, Siegal AP, Horwitz RI. Clarifying confusion: the confusion assessment method. A new method for detection of delirium. Annals of internal medicine 1990;113(12):941–948.
  • 36. Marcantonio ER, Ngo LH, O’Connor M, et al. 3D-CAM: derivation and validation of a 3-minute diagnostic interview for. Annals of internal medicine 2014;161(8):554–561.
  • 37. Trzepacz PT, Mittal D, Torres R, Kanary K, Norton J, Jimerson N. Validation of the Delirium Rating Scale-revised-98: comparison with the delirium rating scale and the cognitive test for delirium. The Journal of neuropsychiatry and clinical neurosciences 2001;13(2):229–242.
  • 38. Ely EW, Margolin R, Francis J, et al. Evaluation of delirium in critically ill patients: validation of the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU). Critical Care Medicine 2001;29(7):1370–1379.
  • 39. Strauss E. A compendium of neuropsychological tests: administration, norms, and commentary. 3rd ed. Oxford; New York: Oxford University Press; 2006.
  • 40. Graf C. The Lawton Instrumental Activities of Daily Living (iADL) Scale. Try this: Best Practices in Nursing Care to Older Adults 2015; http://consultgerirn.org/uploads/File/trythis/try_this_23.pdf. Accessed April 7, 2015.
  • 41. Dennis M, Kadri A, Coffey J. Depression in older people in the general hospital: a systematic review of screening instruments. Age Ageing 2012;41:148–154.
  • 42. D’Agostino V, Pencina, Wolf, Cobain, Massaro, Kannel. Framingham Heart Study; Cardiovascular Disease (10-year risk) 2008; https://www.framinghamheartstudy.org/fhs-risk-functions/cardiovascular-disease-10-year-risk/#. Accessed June 3, 2018.
  • 43. ASA House of Delegates. ASA Physical Status Classification System. 2014; https://www.asahq.org/resources/clinical-information/asa-physical-status-classification-system. Accessed June 3, 2018.
  • 44. Brouquet A, Cudennec T, Benoist S, et al. Impaired mobility, ASA status and administration of tramadol are risk factors for postoperative delirium in patients aged 75 years or more after major abdominal surgery. Annals of surgery 2010;251(4):759–765.
  • 45. American College of Surgeons NSQIP. Surgical Risk Calculator 2018; https://riskcalculator.facs.org/RiskCalculator/. Accessed December 2017 to May 15, 2018.
  • 46. Green SB. How many subjects does it take to do a regression analysis. Multivariate Behav Res 1991;26(3):499–510.
  • 47. Little RJA. A Test of Missing Completely at Random for Multivariate Data with Missing Values. Journal of the American Statistical Association 1988;83(404):1198–1202.
  • 48. Zhu X. Comparison of Four Methods for Handing Missing Data in Longitudinal Data Analysis through a Simulation Study. Open Journal of Statistics 2014;4(11):12.
  • 49. Potthoff RF, Tudor GE, Pieper KS, Hasselblad V. Can one assess whether missing data are missing at random in medical studies? Stat Methods Med Res 2006;15(3):213–234.
  • 50. Arzideh F, Wosniok W, Haeckel R. Indirect reference intervals of plasma and serum thyrotropin (TSH) concentrations from intra-laboratory data bases from several German and Italian medical centres. Clinical chemistry and laboratory medicine 2011;49(4):659–664.
  • 51. Ernst AF, Albers CJ. Regression assumptions in clinical psychology research practice-a systematic review of common misconceptions. PeerJ 2017;5:e3323.
  • 52. Williams MN, Grajales CAG, Kurkiewicz D. Assumptions of multiple regression: Correcting two misconceptions. Practical Assessment, Research & Evaluation 2013;18(11):1–14.
  • 53. Ogundimu EO, Altman DG, Collins GS. Adequate sample size for developing prediction models is not simply related to events per variable. J Clin Epidemiol 2016;76:175–182.
  • 54. Tibshirani R, Bien J, Friedman J, et al. Strong rules for discarding predictors in lasso-type problems. Journal of the Royal Statistical Society Series B, Statistical methodology 2012;74(2):245–266.
  • 55. Mallows CL. Some Comments on CP. Technometrics 1973;15(4):661–675.
  • 56. King JE. Running a Best-Subsets Logistic Regression: An Alternative to Stepwise Methods. Educational and Psychological Measurement 2003;63(3):392–403.
  • 57. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology (Cambridge, Mass) 2010;21(1):128–138.
  • 58. Inouye SK, Kosar CM, Tommet D, et al. The CAM-S: development and validation of a new scoring system for delirium severity in 2 cohorts. Annals of internal medicine 2014;160(8):526–533.
  • 59. Gross AL, Tommet D, D’Aquila M, et al. Harmonization of delirium severity instruments: a comparison of the DRS-R-98, MDAS, and CAM-S using item response theory. BMC Med Res Methodol 2018;18(1):92.
  • 60. Network for Investigation of Delirium: Unifying Scientists (NIDUS). Crosswalk for Delirium Severity Measures. Measurement 2018; https://deliriumnetwork.org/measurement/delirium-severity-crosswalk-tool/. Accessed January 15, 2019.
  • 61. Steyerberg EW, Eijkemans MJ, Harrell FE Jr., Habbema JD. Prognostic modeling with logistic regression analysis: in search of a sensible strategy in small data sets. Medical decision making : an international journal of the Society for Medical Decision Making 2001;21(1):45–56.
  • 62. Freter S, Koller K, Dunbar M, MacKnight C, Rockwood K. Translating Delirium Prevention Strategies for Elderly Adults with Hip Fracture into Routine Clinical Care: A Pragmatic Clinical Trial. Journal of the American Geriatrics Society 2017;65(3):567–573.
  • 63. Hebert C. Evidence-Based Practice in Perianesthesia Nursing: Application of the American Geriatrics Society Clinical Practice Guideline for Postoperative Delirium in Older Adults. Journal of perianesthesia nursing : official journal of the American Society of PeriAnesthesia Nurses 2018;33(3):253–264.
