Using machine learning tools to predict outcomes for emergency department intensive care unit patients

Qiangrong Zhai; Zi Lin; Hongxia Ge; Yang Liang; Nan Li; Qingbian Ma; Chuyang Ye

Sci Rep. 2020 Dec 1;10:20919. doi: 10.1038/s41598-020-77548-3

PMCID: PMC7708467 PMID: 33262471

Abstract

The number of critically ill patients has increased globally along with the rise in emergency visits. Mortality prediction for critical patients is vital for emergency care because it affects the distribution of emergency resources. Traditional scoring systems are designed for all emergency patients using classic mathematical methods, but risk factors in critically ill patients interact in complex ways, so traditional scores do not readily apply to them. Because an accurate model for predicting the mortality of critically ill emergency department patients is lacking, this study’s objective was to develop a scoring system using machine learning optimized for critical patients in emergency departments. We conducted a retrospective cohort study in a tertiary medical center in Beijing, China. Patients over 16 years old were included if they were alive when they entered the emergency department intensive care unit system between February 2015 and December 2015. Mortality up to 7 days after admission to the emergency department was the primary outcome, and 1624 cases were included to derive the models. Candidate predictors included previous diseases, physiologic parameters, and laboratory results. Several machine learning models were built for 7-day mortality using these factors, and their predictive accuracy (sensitivity and specificity) was summarized by the area under the curve (AUC). The AUCs were 0.794, 0.840, 0.849 and 0.822 for the SVM, GBDT, XGBoost and logistic regression models, respectively. Compared with the SAPS 3 model (AUC = 0.826), the gradient boosting methods, XGBoost in particular, showed better discrimination for predicting outcomes in emergency department intensive care unit patients.

Subject terms: Outcomes research, Translational research

Introduction

As the number of critically ill patients has increased in emergency departments (ED) globally¹, the demand for critical care has also increased substantially in the last decade, and exceeds capacity in many systems². Capacity is challenged because critical care is an expensive and limited resource, and critically ill patients should be admitted to the intensive care unit (ICU) without delay³. However, as the number of critically ill ED patients continues to increase, their stay in the ED has become even longer⁴. Because of the scarcity of ICU beds, many EDs are moving toward providing units capable of delivering critical care, known as ED intensive care units (ED-ICU). After resuscitation and stabilization, critical care of the patients is continued in the ED-ICU⁵. However, staffing and resources are also limited in emergency departments, and the ED-ICU patient population features diseases that differ in proportion from the general intensive care population. There are certain clinical scenarios in which the ED intensivist’s unique skill set can positively affect patient care, but studies have shown that ED-ICU patients have a higher unadjusted mortality than non-ED ICU patients. Traditional ICU scoring systems do not take emergency patients’ characteristics into consideration. Accurately assessing the severity of critically ill patients and predicting adverse outcomes are therefore important for initial triage and treatment in the ED⁶. Identifying non-survivors precisely is important not only because it provides the information needed to avoid unnecessary invasive care, but also because it gives physicians credible evidence that could aid in the timely start of palliative care. Moreover, the early prediction of these cases helps the patients’ relatives prepare mentally and financially.

Several studies report scoring systems that can be applied in the ED, such as the DAVROS project⁷. However, the project was designed for all emergency department patients, not just the critically ill and injured. Both the derivation and validation cohorts in the project had an average mortality ranging from 4.2 to 6.9%, which is lower than that of the ED-ICU population. On the other hand, the Simplified Acute Physiology Score (SAPS) model was developed specifically for ICU scoring, and was revised to SAPS 3 in 2005 to provide a new, improved model for risk adjustment⁸. Although robust illness severity scoring systems such as the SAPS have been developed and validated in the ICU setting, their predictive value is substantially degraded when applied to the rapidly changing physiology of an ED patient during the first several hours of resuscitation and critical care management. Given these limitations, an outcome prediction tool optimized for the unique ED-ICU patient population is an essential foundation for future clinical research and practice.

In recent years, the clinical use of machine learning has evolved rapidly, in contrast to typical clinical algorithms, which often consist of handcrafted rules with numerous exceptions. Machine learning has been identified as a robust and reliable tool for predicting outcomes, whereas established methods become numerically unstable with large sets of predictors and their interactions⁹. Our objectives were to construct four machine learning models to predict the mortality of ED-ICU patients and to compare their predictive performance.

Results

As shown in Fig. 1, 1624 cases were included in the final analysis. Of the included patients, 60% were male, and the mean age was 64.7 ± 18.1 years. Table 1 details the demographics, physiological characteristics, and laboratory results of the studied population, divided according to survival status. Significant differences were observed in age and laboratory results. Univariate analysis showed that non-survivors were more likely to be older and to have lower Glasgow coma scores, higher respiratory rates, and lower systolic blood pressure (all p ≤ 0.002). Class imbalance was analyzed and showed no effect on the results (Supplementary Material, Fig. 1).

Table 1.

Baseline characteristics of study population.

Characteristic	Total (n = 1624)	Survivors (n = 1413)	Non-survivors (n = 211)	p
Demographics
Male, n (%)	969 (60.0)	844 (59.7)	125 (59.2)	0.91
Age, years	64.7 ± 18.1	63.7 ± 18.3	71.2 ± 14.9	< 0.001*
Physiologic characteristics
Glasgow coma scale	15 (15–15)	15 (15–15)	15 (7–15)	< 0.001*
Respiratory rate (breaths/min)	22.0 ± 7.0	21.6 ± 6.6	24.3 ± 6.6	< 0.001*
Heart rate (beats/min)	96.8 ± 30.5	96.4 ± 30.6	99.7 ± 32.5	0.147
Systolic blood pressure (mm Hg)	133.4 ± 31.1	134.4 ± 29.6	127.1 ± 39.8	0.002*
Laboratory results
White cell count (10⁹/L)	9.4 (7.0–13.0)	9.1 (6.9–12.3)	12.6 (8.8–17.0)	0.007*
Platelet count (10⁹/L)	213.7 ± 93.5	217.2 ± 91.5	189.9 ± 102.7	< 0.001*
Hemoglobin (g/L)	124.7 ± 31.3	126.5 ± 30.6	112.1 ± 33.3	< 0.001*
Serum potassium	4.2 ± 0.8	4.1 ± 1.8	4.4 ± 1.1	< 0.001*

*The difference between the survivor and non-survivor groups was statistically significant.

In the feature selection phase of the experiment, the dataset was divided into three groups. First, it was divided into a test set (20%) and a feature selection set (80%); then the feature selection set was randomly divided into a training set (50% of the full dataset) and a validation set (30% of the full dataset), used to train and to evaluate the XGBoost model, respectively. AUC values were used to measure the influence of each feature in each feature group on the model, and the feature with the least influence in each group was removed from the group. This step was repeated until all features were ranked by their importance. Figure 2 shows the process for determining the importance score of a feature. In a single experiment, each of the 75 features was scored on a scale of 0 to 75 in order of importance, with 75 being the most important and 0 being the least. After ten experiments, we obtained the final ranking by summing the scores from all experiments. The importance ranking of all the features is shown in Table 2.
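
The score aggregation described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names `scores_from_elimination` and `aggregate_scores` and the toy elimination orders are hypothetical, and the sketch assumes that the feature eliminated last (the most influential) receives the highest score, which is consistent with the totals in Table 2.

```python
from collections import defaultdict


def scores_from_elimination(elimination_order):
    """Score one experiment: the first feature removed (least influential)
    gets 1 point, the last one remaining gets len(elimination_order) points."""
    return {name: rank for rank, name in enumerate(elimination_order, start=1)}


def aggregate_scores(experiments):
    """Sum per-experiment scores and return (feature, total) pairs sorted
    from most to least important."""
    totals = defaultdict(int)
    for elimination_order in experiments:
        for name, score in scores_from_elimination(elimination_order).items():
            totals[name] += score
    # Sort by descending total; break ties alphabetically for determinism.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical elimination orders from three repeated experiments
# (least influential feature first); the study used ten experiments.
experiments = [
    ["Headache", "Age", "Hemoglobin", "GCS score first"],
    ["Headache", "Hemoglobin", "Age", "GCS score first"],
    ["Age", "Headache", "GCS score first", "Hemoglobin"],
]

ranking = aggregate_scores(experiments)
for name, total in ranking:
    print(name, total)
```

With ten experiments and 75 features, the same summation would give a maximum possible total of 750 under this assumed scoring.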

Figure 2. Feature selection steps adopted by machine learning.

Table 2.

Importance ranking of all features.

Rank	Feature name	Total score
1	GCS score first	683
2	Hemoglobin	670
3	Glucose	639
4	FiO2 first	622
5	Planned admission	600
6	BUN	591
7	Septic shock	584
8	Shock first	577
9	White Blood Cell	569
10	Systolic blood pressure first	553
11	Cancer therapy	551
12	Respiratory rate first	543
13	Disch dx cerebrovas	534
14	Metastatic cancer	505
15	Sodium	505
16	SpO₂ first	503
17	pO₂ first	501
18	Platelet	498
19	Disch dx neoplasms	471
20	Tbil	471
21	Agitation	470
22	pH first	467
23	Coma	463
24	Disch dx digestive disease	450
25	Palpitation	445
26	Altered mental status	434
27	Potassium	434
28	Acute abdomen	425
29	Disch dx circulatory disease	422
30	Creatinine	421
31	Heart rate first	420
32	Active malignancy	403
33	Dyspnea	393
34	Severe acute pancreatitis	392
35	O₂ flow rate first	391
36	Fever	389
37	Age	385
38	Hypovolemic hemorrhagic shock	367
39	Focal neurologic deficit	356
40	Intracranial effect	356
41	Obtunded	350
42	Disch dx flu pneumonia	347
43	Disch dx gu disease	347
44	Confusion	342
45	Arrhythmia	337
46	Cirrhosis	328
47	Stupor	325
48	Vigilance disturbance	325
49	Anaphylactic shock	317
50	Disch dx resp	312
51	Disch dx other disease	307
52	Hypovolemic non-hemorrhagic shock	305
53	Disch dx chronic lower resp	304
54	Seizures	303
55	Liver failure	299
56	O₂ device first	299
57	Vomiting	293
58	Mix shock	289
59	Use vasoactive drugs	278
60	Hematemesis	272
61	Infection	268
62	Disch dx aids	258
63	Steroid therapy	248
64	Chronic heart failure iv	231
65	Chest pain	208
66	Fatigue	192
67	Disch dx abnormal nos	187
68	Hematologic cancer	174
69	Disch dx injury	169
70	Trauma	147
71	Syncope	143
72	Bloody stools	138
73	Headache	124
74	Abdominal pain	120
75	Chest tightness	86

To verify the accuracy of the feature importance table, we used the feature selection set to train the model and tested it on the test set. We then sorted the features according to their importance in Table 2, successively deleted the features with the lowest scores, and recorded the changes in AUC on the test set. The AUC variation curve with feature selection is shown in Fig. 3, and Fig. 4 shows the ROC curves of the model with and without feature selection. Figures 3 and 4 show that our XGBoost model remained stable on the test set after the deletion of 50 features, and in some cases became more accurate because redundant information had been removed. Its predictive value was better than that of SAPS 3. As further features were deleted, the model based on just 13–15 features still performed relatively reliably. By combining medical evaluation with machine learning, we obtained a lightweight, high-quality predictive model. The AUC values of the four models are shown in Fig. 5, and the other indicators in Table 3. According to the table, XGBoost had the best performance on our dataset in the three aggregate metrics (overall accuracy, AUC, and Youden index). The XGBoost algorithm was therefore used to further analyze the selected features.

Figure 3. AUC curve for feature selection. The curve of the test set shows that the obtained feature importance table can play a role in optimizing the results for a lightweight model.

Figure 4. ROC curves for the three methods. The ROC curves show that the machine learning method performs better than the traditional SAPS 3 scoring method, and that the machine learning model after feature selection is superior.

Figure 5. ROC curves for the four predictive models. The ROC curve of XGBoost is superior to those of the other methods and has the best AUC.

Table 3.

Predictive accuracy for the four predictive models.

Method	Sensitivity (Se)	Specificity (Sp)	Overall accuracy	AUC	Youden index (Se + Sp − 1)
LR	0.742	0.772	0.832	0.822 (0.72–0.85)	0.514
SVM	0.712	0.777	0.802	0.794 (0.72–0.86)	0.489
GBDT	0.790	0.754	0.834	0.840 (0.76–0.88)	0.543
XGBoost	0.756	0.806	0.837	0.849 (0.81–0.89)	0.562

Discussion

Our primary goal in this study was to accurately identify high-mortality patients so as to make the best use of emergency department resources. ED-ICU patients, as a unique subgroup of the ICU population, are characterized by uncertainty and rapid deterioration. The situation calls for an automated system that generates risk predictions in real time. Even though a number of risk adjustment algorithms are available, their utilization rate remains low for ICU patients¹⁰. Heterogeneous variables combined with complicated mathematical calculations make implementation of traditional algorithms impractical. Previous studies have focused on using machine learning methods to predict outcomes¹¹⁻¹³, and have shown that a machine learning approach could potentially outperform existing traditional analytic techniques for predicting in-hospital mortality of ED patients with sepsis¹⁴, patients with coronary artery disease¹⁵, and critical care patients¹⁶.

It has been suggested that many aspects of patient care and assessment are ‘‘pattern recognition’’ tasks that could benefit from machine learning-based prediction models and application¹⁷. Machine learning models have several advantages over other methods. For example, they are adept at handling high-order interactions and non-linear relationships between the predictors and the outcome¹⁸. Machine learning techniques may be optimized and combined into a multimethod solution given the right parameters and performance criteria.

In this study, we tested several machine learning algorithms for predicting outcomes in critically ill emergency patients. Our models were built on original data that required little customization, so any medical institution that has access to electronic clinical and administrative systems can employ the method readily. Traditional predictive systems such as the APACHE risk adjustment algorithm focus on the most serious physiological parameters; in comparison, our models are fully automated, so they can consider all information to generate results. We found machine learning to be a favorable prediction tool, especially the XGBoost method; as shown in other research¹⁹,²⁰, predictive tools using XGBoost showed excellent performance in comparison to other algorithms. XGBoost combines weak learners (decision trees) to achieve stronger overall class discrimination²¹. Compared with gradient boosting, XGBoost can be more efficiently parallelized, and incorporates regularization and tree pruning to reduce overfitting²².

The feature selection process is important in developing the model because building a machine learning tool based on all the clinical variables is not feasible. Critical disease prediction involves many more variables than do predictive systems for other diseases. Current approaches incorporate only the essential variables based on physicians’ experience, while the machine learning methods used in this study were not limited to a small set of risk factors. However, expanding the spectrum of variables does not mean machine learning methods require all details to yield results. The XGBoost included 89 features to reach a conclusion, but it is also acceptable to input a dozen essential variables to obtain a reasonable result. Given the breadth of options, the first feature screening reduced the number to 75 features. The ranking order showed many similarities with other well-recognized scoring systems, such as the SAPS system. The presence of shock and altered mental status, for example, was heavily weighted in both systems, showing the machine learning method coinciding with physicians’ experience. With the help of a developed electronic health record database, it is feasible for our tool to generate real-time predictive value, which may help physicians to evaluate the effects of treatment. In comparison, traditional scoring systems often require data to be complete and restrictive in order to achieve fixed outcomes.

To our knowledge, this is the first published study that has applied machine learning methods to predict outcomes in critically ill emergency patients. Machine learning ensured the appropriate evaluation of variables as well as accurate results. Nevertheless, this study has several limitations. It was a single-center, retrospective study. External validation is still needed, and validation in other centers is planned for the future. Extended follow-up is also needed to improve predictive applicability. In addition, machine learning techniques have been criticized as black boxes, and thus may be viewed with suspicion by clinicians. Chen and Asch²³ observed that while machine learning identifies factors predictive of mortality, such as palliative care, ending such care to reduce mortality would be irrational. This study addressed this concern by adding a physician feature screening process to avoid building a model with such irrational variables. With the help of experienced clinicians, we have established a sound foundation for deriving a useful model, and hope that it will be applied in clinical practice in the future.

Methods

Setting, participants, and data collection

A retrospective cohort study was performed in the Peking University Third Hospital. The facility’s emergency system consists of an 18-bed resuscitation unit and a 15-bed ED intensive care unit. Patients were eligible for inclusion in the study if they were alive on emergency medical services (EMS) arrival and were admitted to the resuscitation unit or the ED-ICU between February 2015 and December 2015. The exclusion criteria are presented in Fig. 1. The institutional Ethics Committee (Peking University Third Hospital Medical Science Research Ethics Committee) approved this study and issued a waiver of informed consent, since all examinations were part of standard patient care and, the study being retrospective, there was no interference with patient treatment.

An expert panel consisting of emergency physicians and epidemiologists was established to design the study. Before initiation of the study, the panel developed a standard searching strategy to be applied by researchers to avoid inconsistency. Data from the electronic medical records was extracted by several researchers who are physicians well-trained in resuscitation. Extracted data included patients’ demographics, comorbidities, physiological data, laboratory tests, diagnosis, and length of stay. The variables were restricted to those collected within 6 h of medical contact. All data were entered into a secure database managed by the research team who, from the medical records, identified death within seven days as the primary outcome.

Imputation of missing data

In the data set used for this study, some data were incomplete, even after we had excluded patients with too much missing data. Since the missing proportion of each collected feature was less than 5%, we replaced the missing values with the mean of that variable, computed from the patients for whom it was recorded.
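
Mean imputation of this kind can be sketched as follows. This is a minimal, standard-library illustration rather than the authors' implementation; the function name `impute_column_means` and the toy values are hypothetical, and the sketch assumes every column has at least one observed value. The source does not state whether the means were computed before or after the data were split into folds.

```python
import math


def _is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute_column_means(rows):
    """Replace missing values in each column with the mean of the observed
    values in that column. `rows` is a list of equal-length lists."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not _is_missing(row[j])]
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if _is_missing(value) else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy data: columns are hemoglobin (g/L) and serum potassium.
rows = [
    [120.0, 4.0],
    [None, 5.0],
    [100.0, float("nan")],
]
imputed = impute_column_means(rows)
print(imputed)
```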

Feature screening process

A large number of clinical variables are available in the Peking University Third Hospital health records database, and many of them are repetitive and not well organized. To build the optimum machine learning model, candidate variables were examined and discussed extensively in team meetings before they were entered into the four models. Existing risk adjustment algorithms and published ICU admission criteria (mainly DAVROS and SAPS 3) provided the basis for screening the variables. Additionally, the Delphi method and a literature review were used to determine the effect that every variable, or feature, would have on the models. Each variable was evaluated based on its significance, representativeness, and accessibility. At the end of this initial screening process, 75 features were selected.

Model development

This study used the Python 2.7 (Anaconda) platform to train the model with the scikit-learn 0.19.1 framework. Python 2.7 makes it easy to create experiments and debug different machine learning frameworks. Four model types were compared:

Logistic regression⁷. Logistic regression (LR) is a generalized linear regression model that is often used in data mining, automatic disease diagnosis, economic forecasting, and other applications. For example, it can be used to explore the risk factors for a particular disease and to predict the probability of the disease occurring given the presence of those risk factors. In this study, the dependent variable was whether the patient was alive 7 days after admission, which divided the subjects into two groups: those who died within seven days and those who survived. The 75 independent variables covered various types of patient features, either continuous or categorical. Using binary logistic regression, univariate analyses were first performed on all 75 variables to explore their association with death. Variables with a p value of less than 0.05 in the univariate analysis were considered significant, and multivariate analyses were then performed on that group of variables to further explore their association with death. The logistic regression analysis yielded a weight for each independent variable, which indicated which features were risk factors and allowed the probability of a patient's death to be predicted.
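
In symbols, a fitted logistic regression maps a patient's feature vector to a probability of death. The notation below is ours and not the authors': $x_j$ is the $j$-th of the $p$ variables retained for the multivariate model, $\beta_j$ is its weight, and $\beta_0$ is the intercept.

```latex
P(\text{death within 7 days} \mid x)
  = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{j=1}^{p} \beta_j x_j\right)\right)}
```

A positive weight $\beta_j$ marks a risk factor: each unit increase in $x_j$ multiplies the odds of death by $\exp(\beta_j)$.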

Support vector machine⁷. The support vector machine (SVM) is a data mining method based on statistical learning theory. Its principle is to find an optimal separating hyperplane that meets the classification requirements²¹, maximizing the margin on both sides of the hyperplane while ensuring classification accuracy. In theory, support vector machines can achieve optimal classification of linearly separable data.
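
For linearly separable data, the maximum-margin idea corresponds to the standard hard-margin formulation below. The symbols are our notation, not the authors': $w$ and $b$ define the hyperplane, $x_i$ is the feature vector of patient $i$, and $y_i \in \{-1, +1\}$ encodes the outcome.

```latex
\min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1, \qquad i = 1, \dots, n
```

The margin between the two classes is $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes it. Practical implementations usually solve a soft-margin variant that tolerates some misclassified points; the source does not say which penalty setting was used.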

Gradient boosting decision tree. The gradient boosting decision tree (GBDT) is one of the best-performing machine learning algorithms for fitting real distributions, as it combines boosting with decision trees. The base learner is the classification and regression tree (CART), which has the advantages of fast classification and easy visualization of the model. Boosting learns multiple classifiers by increasing and reducing the weights of the training samples, and combines these classifiers linearly to improve performance. The main idea of gradient boosting is that each time a new learner is built, it follows the negative gradient of the loss function of the current model, so that the loss decreases along the gradient direction. In summary, the GBDT algorithm uses gradient descent to train multiple learners for complex tasks, and then combines their results to obtain a better classification than a single learner could achieve.
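
One boosting round can be written compactly as follows. This is the textbook form of gradient boosting in our own notation (loss $L$, current model $F_{m-1}$, new tree $h_m$, learning rate $\nu$), given as a simplified sketch rather than a description of the exact scikit-learn implementation the authors ran.

```latex
r_{im} = -\left[ \frac{\partial L\big(y_i, F(x_i)\big)}{\partial F(x_i)} \right]_{F = F_{m-1}},
\qquad
h_m \approx \arg\min_{h} \sum_{i=1}^{n} \big( r_{im} - h(x_i) \big)^{2},
\qquad
F_m(x) = F_{m-1}(x) + \nu \, h_m(x)
```

Each tree $h_m$ is fitted to the negative gradients $r_{im}$ (the pseudo-residuals), and the final model is the sum of all trees.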

XGBoost²¹. XGBoost, which stands for extreme gradient boosting, is an improved form of GBDT. Whereas the traditional GBDT algorithm uses CART as the base classifier, XGBoost also supports linear classifiers. In addition, the traditional GBDT algorithm uses only first-order derivative information in optimization, whereas XGBoost applies a second-order Taylor expansion to the cost function and uses both first-order and second-order derivatives. To control the complexity of the model, XGBoost adds to the cost function a regularization term that contains both the number of leaf nodes of the tree and the squared L2 norm of the leaf scores. The regularization term thus reduces the variance of the model, making the learned model simpler and preventing overfitting. XGBoost also supports column subsampling, as used in random forests, which not only reduces overfitting but also reduces computational overhead. These characteristics make XGBoost an improvement over the traditional GBDT algorithm and have made it one of the most popular machine learning algorithms.
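
The second-order approximation and the regularization term described above take the following form in the XGBoost paper²¹. Here $f_t$ is the tree added at round $t$, $T$ is its number of leaves, $w_j$ is the score of leaf $j$, and $\gamma$ and $\lambda$ are regularization hyperparameters.

```latex
\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \right] + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^{2}
```

```latex
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\qquad
h_i = \frac{\partial^{2} l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial \big(\hat{y}_i^{(t-1)}\big)^{2}}
```

The $g_i$ are the first-order terms that plain GBDT also uses; the $h_i$ are the additional second-order terms.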

The LR, SVM and GBDT algorithms were implemented directly with the scikit-learn 0.19.1 package⁷. To implement XGBoost, the XGBoost 0.82 framework was integrated with scikit-learn 0.19.1. For parameter settings, we chose L1 regularization for the LR and a linear kernel for the SVM. The parameters for the GBDT and XGBoost algorithms were left at their defaults.
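
A configuration along these lines might look like the sketch below. It is written against current scikit-learn and the XGBoost scikit-learn wrapper, not the exact 0.19.1 and 0.82 versions named above, and the `models` dictionary is our own naming. The explicit `solver="liblinear"` is an assumption: it was the default solver in scikit-learn 0.19 and supports the L1 penalty, whereas newer defaults do not.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

models = {
    # L1-regularized logistic regression; liblinear supports the L1 penalty.
    "LR": LogisticRegression(penalty="l1", solver="liblinear"),
    # Support vector machine with a linear kernel.
    "SVM": SVC(kernel="linear"),
    # Gradient boosting decision trees with default parameters.
    "GBDT": GradientBoostingClassifier(),
}

try:
    # XGBoost ships a scikit-learn compatible wrapper; default parameters.
    from xgboost import XGBClassifier

    models["XGBoost"] = XGBClassifier()
except ImportError:
    # xgboost is an optional dependency for this sketch.
    pass

for name, model in models.items():
    print(name, type(model).__name__)
```

Each estimator exposes the same `fit` and `predict` methods, which is what allows the four algorithms to be compared under one cross-validation loop.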

Model comparisons

To compare the performance of the LR, SVM, GBDT, and XGBoost algorithms, we used ten-fold cross-validation; that is, we randomly divided the patient data (the 75 selected features) into ten folds and conducted ten experiments. In the i-th (i = 1, …, 10) experiment, we used the i-th fold of data as the test set, and the remaining data as the training set. The mean value of the results of all experiments was computed to measure the accuracy of the algorithm. The average accuracy (ACC), sensitivity (Se), specificity (Sp), Youden index, and area under the curve (AUC) of each of the models were established by the 10-fold cross-validation.
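
The evaluation loop can be sketched in plain Python as follows. The helper names (`k_fold_indices`, `confusion_metrics`, `cross_validate`, `threshold_fit_predict`) and the toy data are hypothetical, and the threshold classifier merely stands in for the real models. The demo uses four hand-made folds so that its result is deterministic; `k_fold_indices` shows how random folds would be drawn. AUC is omitted because it requires predicted scores rather than labels.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and Youden index for binary labels
    (1 = died within 7 days, 0 = survived). Both classes must be present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return {"Se": se, "Sp": sp, "ACC": (tp + tn) / len(y_true), "Youden": se + sp - 1}


def cross_validate(X, y, fit_predict, folds):
    """Use each fold once as the test set and average the metrics over folds.
    `fit_predict(X_train, y_train, X_test)` returns predicted labels."""
    per_fold = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        predictions = fit_predict(
            [X[j] for j in train_idx], [y[j] for j in train_idx], [X[j] for j in test_idx]
        )
        per_fold.append(confusion_metrics([y[j] for j in test_idx], predictions))
    return {name: sum(r[name] for r in per_fold) / len(folds) for name in per_fold[0]}


def threshold_fit_predict(X_train, y_train, X_test):
    """Toy stand-in model: threshold halfway between the two class means."""
    pos = [x[0] for x, t in zip(X_train, y_train) if t == 1]
    neg = [x[0] for x, t in zip(X_train, y_train) if t == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return [1 if x[0] > threshold else 0 for x in X_test]


# Hypothetical one-feature data set with hand-made folds.
X = [[1], [2], [3], [4], [6], [7], [8], [9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
folds = [[0, 4], [1, 5], [2, 6], [3, 7]]
demo_result = cross_validate(X, y, threshold_fit_predict, folds)
print(demo_result)
```

With an outcome as imbalanced as the one in this cohort (211 deaths among 1624 patients), purely random folds can differ noticeably in their number of deaths; the source does not say whether the folds were stratified.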

Statement

We confirm that all research was performed in accordance with relevant regulations. The institutional Ethics Committee approved the study and issued a waiver of consent.

Acknowledgements

We thank our colleagues in the emergency department at the Peking University Third Hospital.

Author contributions

Q.Z. and Z.L. are first authors. Q.M. and C.Y. are corresponding authors. Q.Z., Z.L. and N.L. contributed to the acquisition of data, or analysis and interpretation of data, Y.L. and H.G. contributed to drafting the article or revising it critically for important intellectual content, Q.M. and C.Y. contributed to the conception and design of the study.

Funding

This work was supported by the UMHS-PUHSC Joint Institute for Translational and Clinical Research Project “Risk Adjusted Outcome Prediction for Emergency Department Intensive Care Unit Patients (BMU20160561)”.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Qiangrong Zhai and Zi Lin.

Contributor Information

Qingbian Ma, Email: maqingbian@126.com.

Chuyang Ye, Email: chuyang.ye@bit.edu.cn.

Supplementary information

Supplementary information is available for this paper at doi:10.1038/s41598-020-77548-3.

References

1.Herring AA, et al. Increasing critical care admissions from U.S. emergency departments, 2001–2009. Critical Care Med. 2013;41:1197–1204. doi: 10.1097/CCM.0b013e31827c086f. [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Halpern NA, Pastores SM. Critical care medicine in the United States 2000–2005: an analysis of bed numbers, occupancy rates, payer mix, and costs. Crit. Care Med. 2010;38:65. doi: 10.1097/CCM.0b013e3181b090d0. [DOI] [PubMed] [Google Scholar]
3.Physicians, A. C. O. E. Boarding of admitted and intensive care patients in the emergency department. Ann. Emerg. Med. 2008;52:188–189. doi: 10.1016/j.annemergmed.2008.05.030. [DOI] [PubMed] [Google Scholar]
4.Goldstein RS. Management of the critically ill patient in the emergency department: focus on safety issues. Crit. Care Clin. 2005;21:81–89. doi: 10.1016/j.ccc.2004.10.001. [DOI] [PubMed] [Google Scholar]
5.Aslaner MA, Akkaş M, Eroğlu S, Aksu NM, Özmen MM. Admissions of critically ill patients to the ED intensive care unit. Am. J. Emerg. Med. 2015;33:501–505. doi: 10.1016/j.ajem.2014.12.006. [DOI] [PubMed] [Google Scholar]
6.Hong KJ, Sang DS, Ro YS, Song KJ, Singer AJ. Development and validation of the excess mortality ratio-based Emergency Severity Index. Am. J. Emerg. Med. 2012;30:1491–1500. doi: 10.1016/j.ajem.2011.12.011. [DOI] [PubMed] [Google Scholar]
7.Goodacre S, Wilson R, Shephard N, Nicholl J. Derivation and validation of a risk adjustment model for predicting seven day mortality in emergency medical admissions: mixed prospective and retrospective cohort study. BMJ. 2012;344:e2904. doi: 10.1136/bmj.e2904. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Metnitz PGH, et al. SAPS 3: from evaluation of the patient to evaluation of the intensive care unit. Part 1: objectives, methods and cohort description. intensive Care Med. 2005;31:1336–1344. doi: 10.1007/s00134-005-2762-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Mahajan SM, Mahajan AS, King R, Negahban S. Predicting risk of 30-day readmissions using two emerging machine learning methods. Stud. Health Technol. Inform. 2018;250:250–255. [PubMed] [Google Scholar]
10.Glance LG, Dick AW, Osler TM. ICU scoring systems: after 30 years of reinventing the wheel, isn't it time to build the cart? Crit. Care Med. 2014;42:732–734. doi: 10.1097/01.ccm.0000435695.23586.04. [DOI] [PubMed] [Google Scholar]
11.Churpek MM, et al. Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards. Crit. Care Med. 2016;44:368. doi: 10.1097/CCM.0000000000001571. [DOI] [PMC free article] [PubMed] [Google Scholar]
12.Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS ONE. 2017;12:e0174944. doi: 10.1371/journal.pone.0174944. [DOI] [PMC free article] [PubMed] [Google Scholar]
13.Raita Y, et al. Emergency department triage prediction of clinical outcomes using machine learning models. Crit. Care. 2019;23:64. doi: 10.1186/s13054-019-2351-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
14.Taylor RA, et al. Prediction of in-hospital mortality in emergency department patients with sepsis: a local big data-driven machine learning approach. Acad. Emerg. Med. 2016;23:269–278. doi: 10.1111/acem.12876. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Motwani M, Dey D, Berman DS, Germano G, Achenbach S, Al-Mallah MH, Andreini D, Budoff MJ, Cademartiri F, Callister TQ, Chang HJ. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur. Heart J. 2016;38:188. doi: 10.1093/eurheartj/ehw188. [DOI] [PMC free article] [PubMed] [Google Scholar]
16.Delahanty RJ, Kaufman D, Jones SS. Development and evaluation of an automated machine learning algorithm for in-hospital mortality risk adjustment among critical care patients. Crit. Care Med. 2018;1:e484–e488. doi: 10.1097/CCM.0000000000003011. [DOI] [PubMed] [Google Scholar]
17.Liu NT, Salinas J. Machine learning for predicting outcomes in trauma. Shock Injury Inflamm Seps. Lab. Clin. Approaches. 2017;48:504–510. doi: 10.1097/SHK.0000000000000898. [DOI] [PubMed] [Google Scholar]
18.Kuhn, M. & Johnson, K. Discriminant analysis and other linear classification models (2013).
19.Taylor RA, Moore CL, Cheung KH, Brandt C. Predicting urinary tract infections in the emergency department with machine learning. PLoS ONE. 2018;13:e0194085. doi: 10.1371/journal.pone.0194085. [DOI] [PMC free article] [PubMed] [Google Scholar]
20.Xiao J, et al. Comparison and development of machine learning tools in the prediction of chronic kidney disease progression. J. Transl. Med. 2019;17:119. doi: 10.1186/s12967-019-1860-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
21. Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 785–794 (ACM, 2016).
22. Wang J, Gribskov M. IRESpy: an XGBoost model for prediction of internal ribosome entry sites. BMC Bioinformat. 2019;20:409. doi: 10.1186/s12859-019-2999-7.
23. Chen JH, Asch SM. Machine learning and prediction in medicine: beyond the peak of inflated expectations. N. Engl. J. Med. 2017;376:2507. doi: 10.1056/NEJMp1702071.

Associated Data

Supplementary Materials

Supplementary Information (47.6 KB, docx)
