Abstract
OBJECTIVE:
Early warning scores (EWSs) were developed to identify high-risk patients on the hospital wards. Research on EWSs has focused on patients in short-term acute care hospitals, but there are other settings, such as long-term acute care hospitals (LTACHs), where these tools could be useful. However, the accuracy of EWSs in LTACHs is unknown.
DESIGN:
Observational cohort study
SETTING:
Two LTACHs in Illinois from January 2002 until September 2017
PATIENTS:
Admitted adult LTACH patients
INTERVENTIONS:
None
MEASUREMENTS AND MAIN RESULTS:
Demographic characteristics, vital signs, laboratory values, nursing flowsheet data, and outcomes data were collected from the electronic health record. The accuracy of individual variables, the Modified Early Warning Score (MEWS), the National Early Warning Score version 2 (NEWS2), and our previously developed eCART score were compared for predicting the need for acute hospital transfer or death using the area under the receiver operating characteristic curve (AUC). A total of 12,497 patient admissions were included, with 3,550 experiencing the composite outcome. The median age was 65 (IQR 54–74), 46% were female, and the median length of stay in the LTACH was 27 days (IQR 17–40), with an 8% in-hospital mortality. Laboratory values were the best predictors, with blood urea nitrogen being the most accurate (AUC 0.63) followed by albumin, bilirubin, and white blood cell count (AUCs of 0.61). Systolic blood pressure was the most accurate vital sign (AUC 0.60). eCART (AUC 0.72) was significantly more accurate than NEWS2 (AUC 0.66) and MEWS (AUC 0.65; p<0.01 for all pairwise comparisons).
CONCLUSIONS:
In this retrospective cohort study, we found that eCART was significantly more accurate than MEWS and NEWS2 for predicting acute hospital transfer and mortality. Because laboratory values were more predictive than vital signs and the average length of stay in an LTACH is much longer than in short-term acute care hospitals, developing a score specific to the LTACH population would likely further improve accuracy, thus allowing earlier identification of high-risk patients for potentially life-saving interventions.
Keywords: early warning scores, risk prediction, machine learning, clinical deterioration, long-term acute care hospital
INTRODUCTION
Early warning scores have been developed and implemented across the United States and other countries around the world in order to improve the identification and treatment of clinical deterioration1. These scores include the commonly used Modified Early Warning Score (MEWS)1–2, which was developed using clinical opinion and only utilizes vital signs, as well as more complex scores developed using machine learning3. For example, our group developed the electronic Cardiac Arrest Risk Triage (eCART) score3 using a large, multicenter dataset that included vital signs, demographics, and laboratory results in a machine learning model. Although the literature is rapidly evolving, studies suggest that more complex scores outperform simpler ones, and pairing early warning scores to clinical actions in the acute hospital setting, such as the deployment of rapid response teams, may improve outcomes4–5.
To date, studies investigating the accuracy of early warning scores have focused on patients hospitalized in acute care hospitals. However, an increasing number of patients with chronic illnesses are transferred to long-term acute care hospitals (LTACHs) following their discharge from acute care settings6–7. These patients have high morbidity and mortality6, and some of them develop new acute conditions that cause clinical deterioration and the need for transfer back to the acute hospital setting8–9. Because early intervention can improve outcomes in many critical care conditions, including sepsis and cardiogenic shock10–12, it is possible that early warning scores could play a role in the LTACH setting as well in order to prompt potentially life-saving interventions. However, the accuracy of early warning scores in patients admitted to an LTACH is unknown. Therefore, we aimed to compare the accuracy of early warning scores, individual demographic characteristics, vital signs, and laboratory values for predicting clinical deterioration in a multicenter LTACH database.
MATERIALS AND METHODS
This study was deemed exempt by The University of Chicago’s Institutional Review Board (IRB#19–0974).
Study setting
All patients admitted to two RML Specialty Hospitals in Illinois (Chicago, with approximately 70 beds, and Hinsdale, with approximately 90 beds) from January 2002 until September 2017 were included in this retrospective study of prospectively collected data. Based on administrative data, up to two-thirds of patients are ventilator dependent at some point during their stay, while the remainder are admitted for complex medical issues or wound care. Demographic characteristics (age and sex), vital signs (temperature, blood pressure, heart rate, respiratory rate, oxygen saturation, mental status), laboratory values (electrolytes, renal function, complete blood count, liver function tests), nursing flowsheet data (Braden Scale and amount of oxygen delivered), and outcomes data were collected from the electronic health record (Meditech; Westwood, MA). ICD-9 and ICD-10 billing codes were collected from administrative data and used to determine the number of Elixhauser comorbidities for each patient admission.
Outcomes
The primary outcome of the study was the composite outcome of either death (unplanned or comfort care) or transfer to an acute care hospital. Both of these individual outcomes were investigated as secondary outcomes.
Statistical analysis
The MEWS, National Early Warning Score version 2 (NEWS2)13, and the eCART score were calculated for each observation in the dataset during a patient’s admission. The MEWS is based on vital signs, with abnormalities scored on a 0–3 scale and then summed to calculate a total MEWS score. The NEWS2 is a recently updated version of the NEWS, which is used throughout the UK, and is similar in format to the MEWS but with modified weightings. The eCART score was previously developed using logistic regression with linear splines and includes demographics, vital signs, and laboratory values as predictors. All scores were calculated based on prior publications and not modified for this study with the exception that the “prior intensive care unit stay” and “time on the general wards” variables were omitted from the eCART calculation as they are not relevant to this population. As per prior work in this area, if a variable was missing during an individual time point, then the most recent prior value for that variable was used, and if no prior value was available, then the median value for that variable was imputed.
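The imputation rule described above (carry the most recent prior value forward, otherwise fall back to the median) can be sketched as follows. This is a minimal illustration, not the study's Stata code: the function names, the dictionary layout of an observation, and the toy values are all assumptions made for the example.

```python
from statistics import median

def compute_medians(all_observations, variables):
    """Median of each variable over every non-missing value (hypothetical helper)."""
    return {
        var: median(o[var] for o in all_observations if o.get(var) is not None)
        for var in variables
    }

def impute(observations, variables, medians):
    """Fill missing values (None) in the time-ordered observations of ONE admission.

    Rule from the text: use the most recent prior value for that variable;
    if there is no prior value, use the median for that variable.
    """
    last_seen = {}  # most recent non-missing value per variable
    filled = []
    for obs in observations:
        row = dict(obs)  # copy so the input is left untouched
        for var in variables:
            if row.get(var) is None:
                row[var] = last_seen.get(var, medians[var])
            else:
                last_seen[var] = row[var]
        filled.append(row)
    return filled

# Toy admission with made-up values; None marks a missing measurement.
admission = [
    {"hour": 0, "heart_rate": None, "bun": 30},
    {"hour": 4, "heart_rate": 88, "bun": None},
    {"hour": 8, "heart_rate": None, "bun": None},
]
toy_medians = {"heart_rate": 80, "bun": 20}  # assumed medians for the example
filled = impute(admission, ["heart_rate", "bun"], toy_medians)
print(filled)
```

In the toy admission, the first heart rate has no prior value and so takes the assumed median of 80, while the later gaps are filled with the carried-forward values (88 for heart rate, 30 for blood urea nitrogen).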
Accuracy was calculated and compared between the individual variables, MEWS, NEWS2, and eCART for predicting death or acute hospital transfer within 24 hours using the area under the receiver operating characteristic curve (AUC) for all observations in the dataset. Analyses were performed using Stata version 15.1 (StataCorp; College Station, Texas), and a two-tailed p-value <0.05 was used to denote statistical significance.
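To make the observation-level AUC concrete, the sketch below labels each observation by whether the outcome occurs within the following 24 hours and then computes the AUC from mid-ranks (the Mann-Whitney formulation). It is an illustration under stated assumptions rather than the analysis code used in the study: the function names, the hour-based timestamps, and the exact handling of the window boundaries are hypothetical.

```python
def label_within_window(obs_hours, event_hour, window=24.0):
    """Return 1 for each observation followed by the outcome within `window` hours.

    `event_hour` is None for admissions that never had the outcome
    (an assumed encoding for this sketch).
    """
    if event_hour is None:
        return [0] * len(obs_hours)
    return [1 if 0 <= event_hour - h <= window else 0 for h in obs_hours]

def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count one half; computed from average (mid) ranks.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current group of tied scores
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mid_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mid_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy admission: observations at these hours, outcome at hour 48 (made-up values).
hours = [0, 12, 30, 40, 47]
labels = label_within_window(hours, event_hour=48)  # [0, 0, 1, 1, 1]
scores = [2, 5, 4, 6, 7]                            # hypothetical score values
print(labels, round(auc(scores, labels), 3))        # [0, 0, 1, 1, 1] 0.833
```

In the toy data, five of the six positive-negative pairs are ranked correctly, giving an AUC of 5/6. Because every observation contributes a row, admissions with longer stays carry more weight in an observation-level AUC.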
RESULTS
A total of 12,497 admissions were included in the study, with 3,550 (28%) experiencing an acute hospital transfer (n=2,544; 20%) or death (n=1,006; 8%). The median Braden Scale score of the cohort was 13 (IQR 11–15), and patients had a median of three (IQR 2–5) Elixhauser comorbidities. Patients who experienced the composite outcome were older (66 vs. 64 years; p<0.001), more likely to be male (55% vs. 53%; p=0.03), and had a shorter length of stay (median 18 (IQR 8–34) vs. 29 (IQR 20–41) days; p<0.001) than those who did not experience an event.
For the individual predictors of the composite outcome, laboratory values were the most accurate, with blood urea nitrogen being the most accurate (AUC 0.63) followed by albumin, bilirubin, and white blood cell count (AUCs of 0.61); Figure 1. Systolic blood pressure was the most accurate vital sign for predicting the composite outcome (AUC 0.60). Findings were similar for the individual outcomes, with blood urea nitrogen being the most accurate predictor of both death and acute hospital transfer (Table 1).
Table 1.
Variable | Hospital Transfer (n=2,544): AUC (95% CI) | Death (n=1,006): AUC (95% CI) |
---|---|---|
eCART | 0.67 (0.67, 0.68) | 0.83 (0.82, 0.83) |
NEWS2 | 0.62 (0.62, 0.63) | 0.75 (0.75, 0.76) |
MEWS | 0.62 (0.62, 0.63) | 0.72 (0.71, 0.73) |
BUN | 0.61 (0.60, 0.61) | 0.69 (0.69, 0.70) |
Bilirubin | 0.60 (0.59, 0.60) | 0.65 (0.64, 0.65) |
Albumin | 0.59 (0.58, 0.59) | 0.65 (0.65, 0.66) |
WBC | 0.58 (0.58, 0.59) | 0.67 (0.66, 0.67) |
SBP | 0.57 (0.57, 0.57) | 0.67 (0.67, 0.68) |
Anion gap | 0.58 (0.57, 0.58) | 0.63 (0.63, 0.64) |
Creatinine | 0.58 (0.57, 0.58) | 0.62 (0.62, 0.63) |
Respiratory rate | 0.57 (0.56, 0.57) | 0.64 (0.63, 0.65) |
CO2 | 0.57 (0.57, 0.58) | 0.61 (0.61, 0.62) |
Total protein | 0.57 (0.57, 0.58) | 0.61 (0.61, 0.62) |
DBP | 0.55 (0.54, 0.55) | 0.66 (0.65, 0.66) |
Heart rate | 0.57 (0.57, 0.58) | 0.58 (0.58, 0.59) |
Calcium | 0.56 (0.56, 0.57) | 0.59 (0.58, 0.59) |
O2 saturation | 0.54 (0.54, 0.54) | 0.64 (0.64, 0.65) |
Platelet count | 0.56 (0.56, 0.56) | 0.59 (0.58, 0.59) |
Hemoglobin | 0.57 (0.56, 0.57) | 0.57 (0.56, 0.58) |
FiO2 | 0.55 (0.55, 0.56) | 0.60 (0.60, 0.61) |
Glucose | 0.56 (0.55, 0.56) | 0.56 (0.55, 0.56) |
AST | 0.54 (0.54, 0.55) | 0.59 (0.58, 0.59) |
Braden scale | 0.53 (0.53, 0.53) | 0.61 (0.60, 0.61) |
Abbreviations: eCART = electronic Cardiac Arrest Risk Triage; NEWS2 = National Early Warning Score version 2; MEWS = Modified Early Warning Score; BUN = blood urea nitrogen; WBC = white blood cell count; SBP = systolic blood pressure; CO2 = carbon dioxide; DBP = diastolic blood pressure; O2 Saturation = oxygen saturation; FiO2 = fraction of inspired oxygen; AST = aspartate aminotransferase
eCART (AUC 0.72) was significantly more accurate than NEWS2 (AUC 0.66) and MEWS (AUC 0.65) for predicting the composite outcome (p<0.01 for all pairwise comparisons). Findings were similar for the secondary outcome of need for acute hospital transfer (AUC 0.67 for eCART vs. 0.62 for both NEWS2 and MEWS; p<0.01) and death (AUC 0.83 for eCART vs. 0.75 for NEWS2 and 0.72 for MEWS; p<0.01).
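The text does not state which test produced the pairwise p-values. As an illustrative stand-in, and not necessarily the method used in the study, the sketch below estimates the difference in AUC between two scores evaluated on the same observations with a paired percentile bootstrap. All names and the toy data are hypothetical, and resampling individual observations ignores the clustering of observations within admissions, which a real analysis would need to address (for example, by resampling whole admissions).

```python
import random

def auc(scores, labels):
    """Pairwise AUC: share of positive-negative pairs ranked correctly, ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_difference(score_a, score_b, labels, n_boot=2000, seed=0):
    """Point estimate and 95% percentile interval for AUC(a) - AUC(b).

    Both scores are resampled with the SAME indices so the comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # skip resamples that lack one of the two classes
            a = [score_a[i] for i in idx]
            b = [score_b[i] for i in idx]
            diffs.append(auc(a, y) - auc(b, y))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    point = auc(score_a, labels) - auc(score_b, labels)
    return point, (lower, upper)

# Toy data: two hypothetical scores on the same 12 observations.
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
score_a = [1, 2, 3, 4, 6, 5, 7, 8, 6, 9, 10, 11]
score_b = [3, 1, 4, 6, 2, 5, 7, 3, 8, 9, 6, 10]
point, interval = bootstrap_auc_difference(score_a, score_b, labels)
print(point, interval)
```

An interval that excludes zero would be consistent with a difference in discrimination between the two scores, although with so few toy observations the interval here is wide.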
CONCLUSION
In this multicenter study, we found that the eCART score was significantly more accurate than MEWS and NEWS2 for predicting the combined outcome of need for acute hospital transfer or death as well as for each individual outcome. This relative improvement in accuracy was similar to the difference observed in the original development study, which used a multicenter acute hospital database4. In addition, we found that laboratory results were the best predictors of the composite outcome, with systolic blood pressure being the most accurate vital sign predictor. These results suggest that early warning scores may have a role in identifying critically ill patients in the LTACH setting.
Our finding that laboratory values were the most accurate predictors of clinical deterioration in the LTACH setting differs from prior work on the acute hospital wards, where vital signs, especially respiratory rate, have been shown to be the most accurate variables4,14. Interestingly, this finding mirrors our prior work investigating predictors of intensive care unit readmission, where we also found that blood urea nitrogen was the most accurate predictor and laboratory results generally outperformed vital signs14. This suggests that in the LTACH setting, measures of chronic illness and comorbidity, such as albumin and blood urea nitrogen, are better predictors than traditional vital signs. The fact that respiratory rate and oxygen saturation were less predictive in this population than on the general wards may also be because many LTACH patients have some component of respiratory compromise at baseline6, with up to two-thirds of the patient population in this study requiring mechanical ventilation, making it harder to determine whether abnormalities are new. This suggests that future LTACH-specific models should incorporate trends in order to account for baseline abnormalities in these vital signs, similar to what we have previously found in ward models, where the addition of trends significantly improved accuracy15. As suggested by prior studies, the inclusion of additional laboratory biomarkers may also further improve accuracy16–17.
Similar to the acute hospital setting, we found that eCART was a significantly better predictor of clinical deterioration than MEWS in the LTACH setting4. Although the AUCs were lower in this study (0.72 vs. 0.77 for eCART and 0.65 vs. 0.70 for MEWS), the improvement in accuracy for the more complex eCART score was similar4. This suggests that eCART would likely detect more events with fewer false alarms if implemented in real time to alert clinicians to high-risk patients. Such alerts could be used to direct attention and resources to the highest-risk patients, which may improve their outcomes (e.g., through earlier interventions in the LTACH or swifter transfer to an acute hospital if needed). This hypothesis deserves further study in future interventional trials.
The current study has several limitations. First, the study cohort was from two LTACHs in the same system in Illinois, and the results may not be generalizable to other centers. In addition, mental status documentation was available in less than 5% of patients. However, because MEWS, NEWS2, and eCART all include mental status as a predictor variable, it is unlikely that this altered their comparative accuracies. Furthermore, manually collected variables, such as respiratory rate, are more prone to human error than laboratory biomarkers, which could have affected the results of this study. Finally, we only compared MEWS, NEWS2, and eCART in this study; there are over 100 different early warning scores in the literature, and investigating the accuracy of all of these scores was beyond the scope of this work.
In conclusion, we found that eCART was significantly more accurate than MEWS and NEWS2 in this multicenter study of early warning scores in the LTACH setting. In addition, laboratory results were the best predictors of clinical deterioration, and most vital signs had fair to poor accuracy. These results suggest that early warning scores may have a role in the LTACH setting, thus allowing for earlier identification of high-risk patients and potentially life-saving interventions.
Acknowledgments
Copyright form disclosure: Dr. Churpek’s institution received funding from the National Heart, Lung, and Blood Institute (K08HL121080) and the National Institute of General Medical Sciences (R01 GM123193), and he received support for article research from the National Institutes of Health. Dr. Churpek and Dr. Edelson received research support from EarlySense (Tel Aviv, Israel), and they disclosed that they have a patent pending for risk stratification algorithms for hospitalized patients (ARCD. P0535US.P2). Dr. Prister disclosed that after the study was conducted, RML chose to pursue implementation of the eCART system at their LTACHs. Dr. Edelson received funding from Quant HC (ownership interest), which is developing products for risk stratification of hospitalized patients, and she received research support and honoraria from Philips Healthcare. The remaining authors have disclosed that they do not have any potential conflicts of interest.
Conflicts of Interest and Source of Funding: Drs. Churpek and Edelson have a patent pending (ARCD. P0535US.P2) for risk stratification algorithms for hospitalized patients and have received research support from EarlySense (Tel Aviv, Israel). Dr. Churpek was supported by a career development award from the NHLBI (K08HL121080) and an R01 from NIGMS (R01 GM123193). Dr. Edelson has received research support and honoraria from Philips Healthcare (Andover, MA) as well as research support from the American Heart Association (Dallas, TX) and Laerdal Medical (Stavanger, Norway). She has ownership interest in Quant HC (Chicago, IL), which is developing products for risk stratification of hospitalized patients.
REFERENCES
- 1. Churpek MM, Yuen TC, Edelson DP: Risk stratification of hospitalized patients on the wards. Chest 2013;143(6):1758–1765.
- 2. Subbe CP, Kruger M, Rutherford P, et al.: Validation of a modified Early Warning Score in medical admissions. QJM 2001;94(10):521–526.
- 3. Mao Y, Chen Y, Hackmann G, et al.: Medical Data Mining for Early Deterioration Warning in General Hospital Wards. IEEE 11th International Conference on Data Mining Workshops, Vancouver, BC 2011;1042–1049.
- 4. Churpek MM, Yuen TC, Winslow C, et al.: Multicenter development and validation of a risk stratification tool for ward patients. Am J Respir Crit Care Med 2014;190(6):649–655.
- 5. Subbe CP, Duller B, Bellomo R: Effect of an automated notification system for deteriorating ward patients on clinical outcomes. Crit Care 2017;21(1):52.
- 6. Scheinhorn DJ, Hassenpflug MS, Votto JJ, et al.: Ventilation Outcomes Study Group. Post-ICU mechanical ventilation at 23 long-term care hospitals: a multicenter outcomes study. Chest 2007;131(1):85–93.
- 7. Kahn JM, Le T, Angus DC, et al.: The epidemiology of chronic critical illness in the United States*. Crit Care Med 2015;43(2):282–287.
- 8. Kahn JM, Benson NM, Appleby D, et al.: Long-term acute care hospital utilization after critical illness. JAMA 2010;303(22):2253–2259.
- 9. Burke RE, Whitfield EA, Hittle D, et al.: Hospital readmission from post-acute care facilities: risk factors, timing, and outcomes. J Am Med Dir Assoc 2016;17(3):249–255.
- 10. Cardoso LT, Grion CM, Matsuo T, et al.: Impact of delayed admission to intensive care units on mortality of critically ill patients: a cohort study. Crit Care 2011;15(1):R28.
- 11. Seymour CW, Gesten F, Prescott HC, et al.: Time to treatment and mortality during mandated emergency care for sepsis. N Engl J Med 2017;376(23):2235–2244.
- 12. Hochman JS, Sleeper LA, White HD, et al.: One-year survival following early revascularization for cardiogenic shock. JAMA 2001;285(2):190–192.
- 13. Royal College of Physicians: National Early Warning Score (NEWS) 2: standardising the assessment of acute-illness severity in the NHS. London, Royal College of Physicians; 2017.
- 14. Rojas JC, Carey KA, Edelson DP, et al.: Predicting intensive care unit readmission with machine learning using electronic health record data. Ann Am Thorac Soc 2018;15(7):846–853.
- 15. Churpek MM, Adhikari R, Edelson DP: The Value of Vital Sign Trends for Detecting Clinical Deterioration on the Wards. Resuscitation 2016;102:1–5.
- 16. Nickel CH, Kellett J, Cooksley T, Bingisser R, Henriksen DP, Brabrand M: Combined use of the National Early Warning Score and D-dimer levels to predict 30-day and 365-day mortality in medical patients. Resuscitation 2016;106:49–52.
- 17. Rasmussen LJH, Ladelund S, Haupt TH, Ellekilde GE, Eugen-Olsen J, Andersen O: Combining National Early Warning Score with Soluble Urokinase Plasminogen Activator Receptor (suPAR) improves risk prediction in acute medical patients: a registry-based cohort study. Crit Care Med 2018;46(12):1961–1968.