2022 Jun 17;15(7):826–834. doi: 10.1016/j.jiph.2022.06.008

Machine learning decision tree algorithm role for predicting mortality in critically ill adult COVID-19 patients admitted to the ICU

Alyaa Elhazmi a,c, Awad Al-Omari b,c, Hend Sallam d, Hani N Mufti e,f, Ahmed A Rabie g, Mohammed Alshahrani h, Ahmed Mady g,i, Adnan Alghamdi j, Ali Altalaq j, Mohamed H Azzam k, Anees Sindi l, Ayman Kharaba m, Zohair A Al-Aseri n,o, Ghaleb A Almekhlafi j, Wail Tashkandi p,q, Saud A Alajmi j, Fahad Faqihi g, Abdulrahman Alharthy g, Jaffar A Al-Tawfiq r,s, Rami Ghazi Melibari t, Waleed Al-Hazzani u,v, Yaseen M Arabi w,1
PMCID: PMC9212964  PMID: 35759808

Abstract

Background

Coronavirus disease-19 (COVID-19) is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and is currently a major cause of intensive care unit (ICU) admissions globally. The role of machine learning in the ICU is evolving but currently limited to diagnostic and prognostic values. A decision tree (DT) algorithm is a simple and intuitive machine learning method that provides sequential nonlinear analysis of variables. It is simple and might be a valuable tool for bedside physicians during COVID-19 to predict ICU outcomes and help in critical decision-making like end-of-life decisions and bed allocation in the event of limited ICU bed capacities. Herein, we utilized a machine learning DT algorithm to describe the association of a predefined set of variables and 28-day ICU outcome in adult COVID-19 patients admitted to the ICU. We highlight the value of utilizing a machine learning DT algorithm in the ICU at the time of a COVID-19 pandemic.

Methods

This was a prospective and multicenter cohort study involving 14 hospitals in Saudi Arabia. We included critically ill COVID-19 patients admitted to the ICU between March 1, 2020, and October 31, 2020. The predictors of 28-day ICU mortality were identified using two predictive models: conventional logistic regression and DT analyses.

Results

There were 1468 critically ill COVID-19 patients included in the study. The 28-day ICU mortality was 540 (36.8 %), and the 90-day mortality was 600 (40.9 %). The DT algorithm identified five variables that were integrated into the algorithm to predict 28-day ICU outcomes: need for intubation, need for vasopressors, age, gender, and PaO2/FiO2 ratio.

Conclusion

DT is a simple tool that might be utilized in the ICU to identify critically ill COVID-19 patients who are at high risk of 28-day ICU mortality. However, further studies and external validation are still required.

Keywords: COVID-19, SARS-CoV-2, Decision tree, ICU, Predictors

Introduction

Background

Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and was first discovered in Wuhan City in late 2019 [1]. The World Health Organization (WHO) declared the disease a pandemic on March 11, 2020 [2]. Since then, extensive publications on the epidemiology, characteristics, and outcomes of the disease from different regions and populations have shown variable results [3].

The role of machine learning in the intensive care unit (ICU) is evolving. It has been explored over the last few years, typically in oncological and cardiovascular pathologies [4], [5], yet its use remains limited to diagnostic and prognostic applications. During the COVID-19 pandemic, few reports used it as a predictive tool for mortality to identify risk factors. Classical statistical methods used to identify such risk factors are limited by their inability to capture how interactions among these factors affect the outcome.

Machine learning includes many methods that could be utilized in the ICU, and these vary in their complexity [6]. Several studies used different complex machine learning models to predict ICU admission and mortality, especially during the COVID-19 pandemic [7], [8]. Decision tree (DT) is a simple and intuitive machine learning method that provides sequential nonlinear analysis of the algorithmic relationship among combined risk factors to produce a quantitative percentage of sensitivity for mortality [9]. It might be a useful tool for bedside physicians during the COVID-19 pandemic to identify critically ill patients and guide important decisions such as ICU resource utilization and clinical management. This study evaluated the predictors of 28-day ICU mortality in COVID-19 adults admitted to the ICU using a machine learning DT algorithm.

Objectives

We highlight the role of machine learning DT algorithms in the ICU at the time of a COVID-19 pandemic.

Methods

Study design

This was a prospective, multicenter national cohort study conducted in 14 hospitals in Saudi Arabia. We included COVID-19 patients admitted to the ICU at the participating centers between March 1, 2020, and October 31, 2020. Institutional review board (IRB) approvals were obtained from the Central Institutional Review Board at the Saudi Ministry of Health and the ethical boards of each participating center. The study was designed as a research platform for COVID-19 patients that could address many questions raised during the pandemic.

Setting

The participating ICUs were in accredited governmental and non-governmental tertiary hospitals. The multidisciplinary treatment team included critical care physicians (consultants, specialists, and residents), registered ICU nurses, respiratory therapists, clinical pharmacists, and other ICU care providers who practiced according to national and international published protocols and guidelines. Non-ICU physicians from different specialties joined the critical care team during the COVID-19 surge; they participated under the supervision of intensivists after receiving basic ICU management training.

Patients

We included adult patients (older than 18 years) admitted to the ICU of participating hospitals with SARS-CoV-2 infection confirmed by real-time polymerase chain reaction (RT-PCR) from nasopharyngeal swabs or tracheal aspirate specimens. Patients who had a Do-Not-Resuscitate code prior to ICU admission were not included in the study. Immunocompromised status was defined as solid organ malignancy, leukemia, current use of steroids (prednisone > 7 mg daily for > 2 weeks), post-organ transplantation at any time, or rheumatological disease on immunomodulators (azathioprine, methotrexate, infliximab, mycophenolate mofetil, or others). Infection was defined by a positive culture in the blood or tracheal aspirate.

Data collection

The data were collected manually using the clinical record form (CRF) and entered into the electronic database Research Electronic Data Capture (REDCap, Vanderbilt University, Nashville, TN) [10]. The data then underwent secondary validation. The collected data included patient demographics, comorbidities, signs and symptoms of COVID-19 illness, laboratory values, mechanical ventilation (MV) utilization, adjunctive interventions, medications, complications, and outcomes. FiO2 was calculated for all spontaneously breathing patients by converting O2 supply flow to estimated FiO2 [11].

Measures of outcomes

The primary outcome was 28-day ICU mortality.

Statistical analysis

General analysis

Patient characteristics were summarized for the entire patient cohort using frequencies and percentages for categorical variables. The distribution of continuous variables was first assessed using the Shapiro-Wilk test of normality, histograms, and quantile-quantile plots. Normally distributed (Gaussian) continuous variables were summarized using the mean and standard deviation (SD); non-normally distributed (non-Gaussian) continuous variables were summarized using the median and interquartile range (IQR). Categorical variables were compared using the chi-square or Fisher's exact test. For continuous variables, the Student's t-test was applied to normally distributed variables, and the Wilcoxon rank-sum test was used for non-normally distributed variables. We constructed Kaplan–Meier curves to assess cumulative mortality during the initial 60 days of ICU admission. Risk factors for mortality in the first 28 days of ICU stay were evaluated for the entire cohort using stepwise logistic regression analysis. Variables included in the stepwise logistic regression model were identified based on clinical interest and literature review, and were used to generate the predictive models. These included demographics, comorbid conditions, laboratory data on ICU admission, the respiratory component of the SOFA score [11], and the need for intubation or vasopressors. The regression analysis used the need for intubation within the first 48 hours and the need for vasopressors during the first five days of ICU admission. The respiratory component of the SOFA score on ICU admission was classified as category 4 (PaO2/FiO2 < 100 with respiratory support) or category 0–3 (PaO2/FiO2 ≥ 100).
Continuous variables were categorized using cut-off points based on either previous literature or optimal cut-off points identified statistically with the cutpointr library in R software. This approach selects the cut-off point that maximizes the Youden index, that is, the point that best separates survivors from non-survivors. The logistic regression model results were reported as odds ratios (OR) with 95 % confidence intervals (95 % CI). All statistical tests were two-tailed, and p-values < 0.05 were considered significant. All statistical analyses were done with R software version 4.0.2 (2020-06-22) from the R Foundation for Statistical Computing, Vienna, Austria.
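
As an illustration of the cut-off selection described above, the following minimal Python sketch scans the observed values of a continuous variable and keeps the threshold that maximizes the Youden index, J = sensitivity + specificity − 1. It is not the R code used in the study; the function name `youden_cutoff` and the ages and outcomes are hypothetical and serve only to show the calculation.

```python
def youden_cutoff(values, died):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is classified as a predicted death when value >= cutoff.
    `died` holds 1 for non-survivors and 0 for survivors.
    """
    positives = sum(died)
    negatives = len(died) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        # True positives: non-survivors at or above the candidate cutoff.
        tp = sum(1 for v, d in zip(values, died) if d == 1 and v >= cutoff)
        # True negatives: survivors below the candidate cutoff.
        tn = sum(1 for v, d in zip(values, died) if d == 0 and v < cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical ages and 28-day outcomes (1 = died); illustration only.
ages = [34, 41, 47, 52, 58, 63, 66, 71, 75, 80]
died = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]

cutoff, j = youden_cutoff(ages, died)
print(f"Optimal cutoff: {cutoff} years (Youden J = {j:.2f})")
```

In this toy data set the best split is at 58 years, where all non-survivors and four of the five survivors fall on the expected side of the threshold.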

DT analysis

Machine learning DT analysis used the demographic and clinical variables on ICU admission to identify characteristics of COVID-19 patients that were predictive of 28-day ICU outcome. The model was generated using the standard settings of an open-source software library, the Waikato Environment for Knowledge Analysis (WEKA, University of Waikato) [12], using the C4.5 classification algorithm (J48) with 20 cases as the minimum number of cases at the leaf of each branch (end of the tree). The C4.5 classifier uses an information gain ratio split criterion to reduce bias toward attributes with many values [13].
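
To make the split criterion concrete, the sketch below computes the gain ratio of a candidate attribute, defined as the information gain divided by the split information (the entropy of the attribute itself). It is a simplified Python illustration, not WEKA's J48 implementation, and the toy data (an intubation flag and a unique patient identifier) are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(attribute, labels):
    """Gain ratio = information gain / split information."""
    total = len(labels)
    groups = {}
    for a, y in zip(attribute, labels):
        groups.setdefault(a, []).append(y)
    # Weighted entropy of the outcome after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(attribute)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data; illustration only.
died = [1, 1, 1, 1, 0, 0, 0, 0]
intubated = [1, 1, 1, 1, 0, 0, 0, 0]   # binary attribute
patient_id = list(range(8))            # one distinct value per patient

print(gain_ratio(intubated, died))
print(gain_ratio(patient_id, died))
```

Both attributes separate the outcome perfectly and therefore have the same information gain (1 bit), but the identifier splits the data into eight single-patient groups, so its split information is 3 bits and its gain ratio drops to one third. This is the bias toward many-valued attributes that the gain ratio is meant to counter.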

We used algorithm accuracy as a general measure of classifier performance; accuracy is a common metric that represents the overall correctness of the algorithm. The DT analysis used ten-fold cross-validation to assess the model's generalizability and avoid over-fitting. This approach also yields the accuracy, the area under the receiver operating characteristic curve (AUROC), and confidence intervals. The models were tuned with ten-fold cross-validation, fitted on a 75 % split of the derivation set, and assessed on the remaining 25 %. To ascertain the model's stability, this training and testing split was randomly repeated 100 times (bootstraps). Finally, model selection was based on the mean AUROC value. The AUROC summarizes the predictive performance of the model across classification thresholds by plotting sensitivity (true positive rate, TPR) against 1 − specificity (false positive rate) [14]. Here, TPR is the number of true positive cases identified by the algorithm divided by the total number of actual positive cases (true positives + false negatives).
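
The performance measures named above can be sketched in a few lines of Python. The example computes accuracy, TPR, and false positive rate from a confusion matrix, and the AUROC as the probability that a randomly chosen non-survivor receives a higher predicted risk than a randomly chosen survivor (the rank-based formulation). The outcomes, predicted risks, and the 0.5 threshold are hypothetical; this is not the evaluation code used in the study.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def auroc(y_true, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died) and predicted risks; illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
tpr = tp / (tp + fn)   # sensitivity
fpr = fp / (fp + tn)   # 1 - specificity
auc = auroc(y_true, scores)
print(accuracy, tpr, fpr, auc)
```

Accuracy depends on the chosen threshold, whereas the AUROC does not, which is why the two measures can rank models differently.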

Missing data

Missing data was treated as follows: Variables that had more than 25 % missing were excluded from the analysis unless deemed clinically necessary by the authors. Variables that had fewer than 25 % missing values were treated as missing not at random (MNAR), where the probability of missing depends on unobserved information (e.g., a test/measurement is only performed when the doctor decided that the patient was in a severe condition that justified ordering the test/measurement; however, the severity of the disease can be based on the subjective assessment of the ordering doctor). Based on the above and to avoid the complexity associated with the imputation of missing values, they were treated as unobserved values (also known as missing values).
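
The variable-selection rule above can be expressed as a short filter. The following Python sketch drops any variable with more than 25 % missing values unless it is listed as clinically necessary. The variable names, the values, and the choice of `pao2_fio2` as a clinically necessary variable are hypothetical and are not taken from the study data.

```python
def missing_fraction(column):
    """Fraction of entries recorded as None (missing)."""
    return sum(1 for v in column if v is None) / len(column)


def select_variables(table, max_missing=0.25, keep_always=()):
    """Keep variables with at most `max_missing` missing values.

    Variables named in `keep_always` are retained regardless of
    missingness, mirroring the 'clinically necessary' exception.
    """
    return [name for name, column in table.items()
            if name in keep_always or missing_fraction(column) <= max_missing]


# Hypothetical records; None marks a missing measurement.
table = {
    "age": [55, 61, None, 48],            # 25 % missing: kept
    "lactate": [1.5, None, None, 2.0],    # 50 % missing: dropped
    "pao2_fio2": [None, None, 92, 140],   # 50 % missing, kept by exception
}

kept = select_variables(table, keep_always=("pao2_fio2",))
print(kept)
```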

Results

Patient characteristics and ICU admission data

A total of 1468 patients were admitted to the ICUs of the 14 participating hospitals during the study period. Table 1 shows the patients' demographics and data over the first 24 hours of ICU admission for 28-day ICU survivors and non-survivors. The mean age was 55.9 (SD ± 15.1) years; 74 % of the patients were male, and 69 patients (4.8 %) were healthcare workers. Hypertension and ischemic heart disease were significantly more common among non-survivors (p = 0.0187 and 0.0016, respectively), whereas smoking was significantly more common among survivors (p = 0.0333). The SOFA score was significantly higher in patients who died within the first 28 days of ICU admission [median 7 (IQR 4–10) vs. 4 (IQR 2–6), p < 0.001]. Survivors had a higher PaO2/FiO2 ratio on the day of ICU admission than non-survivors [median 142 (IQR 72–176) vs. 92 (IQR 66–138), p < 0.001] (Table 1).

Table 1.

Baseline general characteristics and ICU admission data of 1468 patients according to their 28-day survival status.

| Patient characteristic | All patients (n = 1468)/denominator | 28-day non-survivors (n = 540) | 28-day survivors (n = 928) | p-value |
| --- | --- | --- | --- | --- |
| Age (years), mean (± SD) | 55.9 (15.1)/1423 | 58.4 (15.2) | 54.3 (14.9) | < 0.001 * |
| Gender, n (%) | | | | |
| Male gender, n (%) | 1085 (74)/1467 | 413 (76.6) | 672 (72.4) | 0.0765 |
| Female gender, n (%) | 382 (26)/1467 | 126 (23.4) | 256 (27.6) | |
| Pregnancy, n (%) | 19 (5.1)/372 | 2 (1.6) | 17 (6.8) | 0.0425 * |
| Healthcare worker, n (%) | 69 (4.8)/1436 | 15 (2.9) | 54 (5.9) | 0.009 * |
| BMI (kg/m²), mean (± SD) | 30.1 (6.8)/1369 | 29.5 (6.5) | 30.5 (7) | 0.009 * |
| Comorbidity | | | | |
| Diabetes mellitus, n (%) | 770 (54.8)/1405 | 297 (58.1) | 473 (52.9) | 0.0589 |
| Hypertension, n (%) | 676 (48.6)/1391 | 267 (52.8) | 409 (46.2) | 0.0187 * |
| Ischemic heart disease, n (%) | 184 (13.8)/1333 | 85 (17.8) | 99 (11.6) | 0.0016 * |
| Bronchial asthma, n (%) | 128 (9.6)/1333 | 37 (7.7) | 91 (10.6) | 0.0845 |
| Chronic kidney disease, n (%) | 123 (9.2)/1339 | 47 (9.7) | 76 (8.9) | 0.656 |
| Smoker, n (%) | 85 (7.2)/1180 | 21 (5) | 64 (8.4) | 0.0333 * |
| Left ventricular failure, n (%) | 74 (5.6)/1331 | 32 (6.7) | 42 (4.9) | 0.181 |
| Immunocompromised status, n (%) | 72 (5.4)/1332 | 19 (4) | 53 (6.2) | 0.079 |
| Renal replacement therapy, n (%) | 54 (4.1)/1332 | 25 (5.2) | 29 (3.4) | 0.12 |
| Cancer, n (%) | 48 (3.6)/1333 | 17 (3.5) | 31 (3.6) | 0.913 |
| Chronic lung disease, n (%) | 38 (2.9)/1327 | 14 (2.9) | 24 (2.8) | 0.891 |
| Solid-organ transplant, n (%) | 29 (2.2)/1334 | 8 (1.7) | 21 (2.5) | 0.337 |
| COPD, n (%) | 26 (2)/1327 | 12 (2.5) | 14 (1.6) | 0.27 |
| Chronic liver disease, n (%) | 24 (1.8)/1332 | 10 (2.1) | 14 (1.6) | 0.557 |
| Chronic hematological disease, n (%) | 12 (0.9)/1337 | 3 (0.6) | 9 (1.1) | 0.553 |
| ICU admission data (first 24 hours) | | | | |
| Use of inotropes, n (%) | 189 (15.3)/1235 | 120 (29.4) | 69 (8.3) | < 0.001 * |
| New AKI on ICU admission, n (%) | 98 (8.6)/1137 | 60 (14.4) | 38 (5.3) | < 0.001 * |
| MAP (mmHg), mean (± SD) | 86.1 (16.1)/1299 | 84.7 (16.7) | 86.8 (16.3) | 0.0246 * |
| HR (beats/min), mean (± SD) | 91.2 (20.4)/1294 | 95.1 (20.7) | 89.3 (20) | < 0.001 * |
| RR (per min), mean (± SD) | 27.9 (7.2)/1274 | 28.3 (7.1) | 27.8 (7.3) | 0.196 |
| GCS, median (IQR) | 15 (14–15)/1259 | 15 (9–15) | 15 (15–15) | < 0.001 * |
| SOFA, median (IQR) | 4 (3–8)/1333 | 7 (4–10) | 4 (2–6) | < 0.001 * |
| PaO2/FiO2 ratio, median (IQR) | 134 (71–163)/1217 | 92 (66–138) | 142 (72–176) | < 0.001 * |
| PaO2/FiO2 ratio, n (%) | | | | < 0.001 * |
| < 100 | 550 (50)/1099 | 210 (57.9) | 340 (46.2) | |
| 100 to < 200 | 361 (32.8)/1099 | 110 (30.3) | 251 (34.1) | |
| 200–300 | 116 (10.6)/1099 | 27 (7.4) | 89 (12.1) | |
| > 300 | 72 (6.6)/1099 | 16 (4.4) | 56 (7.6) | |
| ICU admission laboratory data (first 24 hours) | | | | |
| WBC (× 10⁹/L), mean (± SD) | 10.7 (6.3)/1378 | 12.3 (7.8) | 9.7 (5) | < 0.001 * |
| NL ratio, mean (± SD) | 10.3 (8.7)/1126 | 11.8 (9.4) | 9.4 (8.1) | < 0.001 * |
| Creatinine (µmol/L), median (IQR) | 83 (63–130)/1289 | 106 (71–187) | 75 (60–105) | < 0.001 * |
| Lactate (mmol/L), median (IQR) | 1.5 (1.1–2.2)/652 | 1.8 (1.2–3) | 1.5 (1.1–2) | < 0.001 * |
| Procalcitonin (ng/mL), median (IQR) | 0.36 (0.15–1.5)/669 | 0.96 (0.27–3.6) | 0.3 (0.13–0.64) | < 0.001 * |
| LDH (IU/L), median (IQR) | 504 (363–706)/1028 | 584 (411–826) | 467 (353–641) | < 0.001 * |
| D-dimer (mcg/mL), median (IQR) | 1.51 (0.8–2.8)/1060 | 2.38 (1.2–5.4) | 1.2 (0.7–2.8) | < 0.001 * |
| Ferritin (ng/mL), median (IQR) | 802 (396–1295)/850 | 915 (486–1166) | 772 (370–1295) | < 0.001 * |
| CRP (mg/L), median (IQR) | 104 (33–196)/937 | 117 (40–198) | 99 (30–194) | 0.0524 |

AKI, acute kidney injury. BMI, body mass index. COPD, chronic obstructive pulmonary disease. CRP, C-reactive protein. GCS, Glasgow coma scale. HR, heart rate. LDH, lactate dehydrogenase. MAP, mean arterial pressure. NL ratio, neutrophil-to-lymphocyte ratio. RR, respiratory rate. SOFA, Sequential Organ Failure Assessment. WBC, white blood cells.

Interventions during the ICU stay

Of the study cohort, 778 patients (52.9 %) required invasive mechanical ventilation (IMV) during ICU admission; of these, 128 were intubated before transfer to the ICUs of the participating centers. High-flow nasal cannula (HFNC) was used in 446 patients (35.3 %) for a median of three days (IQR 2–6). Among non-intubated patients, awake prone positioning was used in 350 patients (27.7 %), of whom 194 (57.4 %) maintained it for longer than 4 h/day. Of the patients who required IMV, 506 (75.3 %) received neuromuscular blockade and 319 (47.8 %) received prone positioning (Table 2).

Table 2.

Interventions, respiratory support modalities, respiratory data following invasive mechanical ventilation, and medications during ICU stay.

| Patient characteristic | All patients (n = 1468)/denominator | 28-day non-survivors (n = 540) | 28-day survivors (n = 928) | p-value |
| --- | --- | --- | --- | --- |
| Vasopressors, n (%) | 395 (26.9)/1468 | 246 (45.5) | 149 (16) | < 0.001 * |
| Oxygen delivery modes | | | | |
| HFNC, n (%) | 446 (35.3)/1264 | 105 (25.1) | 341 (40.4) | < 0.001 * |
| HFNC days, median (IQR) | 3 (2–6)/429 | 2 (1–4) | 4 (2–6) | < 0.001 * |
| NIPPV, n (%) | 205 (16.2)/1267 | 96 (22.8) | 109 (12.9) | < 0.001 * |
| NIPPV days, median (IQR) | 2 (1–4)/198 | 2 (1–4) | 2 (2–6) | 0.527 |
| Awake prone positioning, n (%) | 350 (27.7)/1263 | 86 (20.6) | 264 (31.2) | < 0.001 * |
| Awake prone days, median (IQR) | 3 (2–5)/316 | 2 (1–4) | 4 (2–6) | < 0.001 * |
| Awake prone > 4 h/day, n (%) | 194 (57.4)/338 | 49 (60.5)/81 | 145 (56.4) | 0.518 |
| IMV, n (%) | 778 (52.9)/1468 | 454 (83.9) | 324 (34.9) | < 0.001 * |
| First 24 h of intubation, mean (± SD) | | | | |
| PaO2/FiO2 | 125.5 (80)/535 | 115 (76) | 137.7 (83) | 0.001 * |
| PCO2 (mmHg) | 46.6 (14.7)/558 | 48 (15.5) | 45 (13.5) | 0.015 * |
| Static compliance (mL/cmH2O) | 27.8 (11.2)/152 | 27 (11.7) | 28.6 (10.6) | 0.382 |
| Dynamic compliance (mL/cmH2O) | 22.2 (14.6)/324 | 20.6 (9.4) | 24.1 (19.2) | 0.045 * |
| Peak airway pressure (cmH2O) | 31.2 (6.7)/335 | 31.7 (7.1) | 30.6 (6.2) | 0.131 |
| Plateau pressure (cmH2O) | 27.3 (5.7)/156 | 27 (5.4) | 27.5 (5.9) | 0.549 |
| Tidal volume (mL per IBW) | 6.94 (1.34)/601 | 6.87 (1.2) | 7.04 (1.51) | 0.135 |
| Interventions during IMV, n (%) | | | | |
| Neuromuscular blockade infusion | 562 (74.8)/751 | 332 (76.1) | 230 (73) | 0.329 |
| Recruitment maneuvers use | 91 (12.3)/737 | 50 (11.6) | 41 (13.4) | 0.482 |
| iNO use | 64 (8.6)/742 | 37 (8.5) | 27 (8.7) | 0.927 |
| Prone positioning during MV | 350 (47)/745 | 194 (44.6) | 156 (50.3) | 0.123 |
| Rescue APRV use | 20 (2.7)/739 | 10 (2.3) | 10 (3.2) | 0.444 |
| Rescue HFOV use | 12 (1.6)/740 | 8 (1.8) | 4 (1.3) | 0.563 |
| Tracheostomy | 60 (7.7)/778 | 8 (1.5) | 52 (5.6) | < 0.001 * |
| ECMO | 71 (9.1)/778 | 40 (7.5) | 31 (3.4) | < 0.001 * |
| Medications and interventions, n (%) | | | | |
| Azithromycin | 1069 (74.2)/1440 | 366 (68.9) | 703 (77.3) | < 0.001 * |
| Corticosteroids | 1048 (73.1)/1433 | 401 (75.8) | 647 (71.6) | 0.081 |
| Chloroquine | 429 (30.5)/1406 | 152 (29.2) | 277 (31.3) | 0.403 |
| Tocilizumab | 426 (30.1)/1414 | 126 (24) | 300 (33.7) | < 0.001 * |
| Favipiravir | 316 (22.4)/1441 | 111 (21.2) | 205 (23.1) | 0.401 |
| Ribavirin | 241 (17.2)/1402 | 79 (15.3) | 162 (18.3) | 0.141 |
| Convalescent plasma | 53 (3.8)/1409 | 12 (2.3) | 41 (4.6) | 0.026 * |
| IVIG | 51 (3.6)/1401 | 18 (3.5) | 33 (3.7) | 0.792 |
| Plasmapheresis | 26 (1.8)/1409 | 17 (3.2) | 9 (1) | 0.002 * |
| Remdesivir | 13 (0.9)/1399 | 5 (1) | 8 (0.9) | 0.91 |

HFNC, high-flow nasal cannula. IVIG, intravenous immunoglobulin. NIPPV, non-invasive positive pressure ventilation. IMV, invasive mechanical ventilation. PaO2/FiO2, ratio of partial pressure of oxygen to fraction of inspired oxygen. PCO2, partial pressure of CO2. FiO2, fraction of inspired oxygen. MV, mechanical ventilation. APRV, airway pressure release ventilation. ECMO, extracorporeal membrane oxygenation. HFOV, high-frequency oscillatory ventilation. iNO, inhaled nitric oxide.

Outcomes

Of the 1468 patients, 540 (36.8 %) died within 28 days of ICU admission, and 757 (51.6 %) were discharged alive from the hospital. The 90-day mortality was 600 patients (40.9 %), and the median ICU length of stay was nine days (IQR 5–16). The median length of hospitalization was 15 days (IQR 9–24). Blood cultures were positive in 267 patients (24.3 %) and respiratory cultures in 227 patients (33.4 %) (Table 3). The Kaplan–Meier curve for COVID-19 cumulative incidence of mortality showed 40 % mortality at day 60 of ICU admission (Fig. 1).

Table 3.

Patients' clinical outcomes.

| Outcome | All patients (n = 1468)/denominator | 28-day non-survivors (n = 540) | 28-day survivors (n = 928) | p-value |
| --- | --- | --- | --- | --- |
| ICU mortality at 28 days, n (%) | 540 (36.8) | | | |
| 90-day mortality, n (%) | 600 (40.9) | | | |
| Discharged from ICU alive, n (%) | 869 (59.1) | | | |
| Discharged from hospital alive, n (%) | 757 (51.6) | | | |
| Transfer to another institution, n (%) | 94 (6.5) | | | |
| Mortality subgroup analysis according to time of death | | | | |
| 0–28 days, n (%) | 540 (90.05) | | | |
| 29–60 days, n (%) | 54 (8.9) | | | |
| 61–90 days, n (%) | 6 (0.06) | | | |
| ICU length of stay (days), median (IQR) | 9 (5–16)/1468 | 10 (5–16) | 9 (5–17) | 0.356 |
| Hospital length of stay (days), median (IQR) | 15 (9–24)/1468 | 13 (7–18) | 18 (11–31) | < 0.001 * |
| Mechanical ventilation duration (days), median (IQR) | 7 (0–14)/1082 | 10 (4–15) | 3 (0–11) | < 0.001 * |
| Infection (positive culture, respiratory), n (%) | 227 (33.4)/679 | 131 (46.6) | 96 (24.1) | < 0.001 * |
| Infection (positive culture, blood), n (%) | 267 (24.3)/1100 | 151 (34.2) | 116 (17.6) | < 0.001 * |
| AKI requiring RRT, n (%) | 178 (14.1)/1259 | 130 (28.6) | 48 (6) | < 0.001 * |
| Pneumothorax, n (%) | 89 (6.2)/1440 | 55 (10.4) | 34 (3.7) | < 0.001 * |
| Myocardial infarction, n (%) | 64 (4.5)/1433 | 40 (7.6) | 24 (2.6) | < 0.001 * |
| GI bleeding, n (%) | 52 (3.6)/1438 | 27 (5.1) | 25 (2.7) | 0.019 * |
| Pulmonary embolism, n (%) | 44 (3.1)/1435 | 20 (3.8) | 24 (2.6) | 0.215 |
| Deep vein thrombosis, n (%) | 33 (2.3)/1435 | 17 (3.2) | 16 (1.8) | 0.070 |
| Intracranial hemorrhage, n (%) | 33 (2.3)/1432 | 15 (2.9) | 18 (2) | 0.277 |
| Ischemic stroke, n (%) | 32 (2.2)/1438 | 14 (2.7) | 18 (2) | 0.39 |

AKI, acute kidney injury. ICU, intensive care unit. GI, gastrointestinal. RRT, renal replacement therapy.

Fig. 1.

Kaplan–Meier curve for COVID-19 cumulative incidence of mortality.
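
For readers unfamiliar with how such a curve is obtained, the following Python sketch implements the Kaplan–Meier product-limit estimator: at each time with at least one death, the running survival probability is multiplied by (1 − deaths / patients still at risk), and the cumulative incidence of mortality is 1 minus that survival. The follow-up times and event indicators are hypothetical and do not come from the study cohort.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with a death.

    `events` holds 1 for death and 0 for censoring (for example,
    a patient still alive at the end of follow-up).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths:
            # Product-limit step: survive past t given at risk at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
        i += leaving
    return curve


# Hypothetical follow-up in days; illustration only.
times = [5, 8, 8, 12, 20, 28, 35, 60, 60, 60]
events = [1, 1, 0, 1, 0, 1, 1, 0, 0, 0]

curve = kaplan_meier(times, events)
for day, surv in curve:
    print(f"day {day}: cumulative mortality = {1 - surv:.3f}")
```

Censored patients contribute to the denominator only up to the time they leave follow-up, which is why the estimate differs from a simple proportion of deaths.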

Predictors of 28-day ICU mortality

The results of the decision tree analysis

Five variables were identified and used to allocate patients to the final binary outcome (survival versus mortality). Arranged according to their significance, these variables were the need for intubation, the need for vasopressors, gender, PaO2/FiO2 on ICU admission, and age. The resulting DT assigned the root node (the start of the tree, or first splitting criterion) to the need for intubation. As the tree grew, patients were assigned sequentially to their respective groups using the four other variables. Model discrimination, that is, the DT model's ability to correctly assign patients to their respective groups, was assessed using the AUROC and was 75.42 % (95 % CI = 74.84–78.95). The DT model accuracy was 73.1 % (number of patients retained in the model: 1043 of 1468) (Fig. 2).

Fig. 2.

Decision tree (DT) algorithm for predictors of mortality.

The results of the logistic regression

The stepwise logistic regression analysis retained the following variables as potential predictors of 28-day ICU mortality: age group, gender, the respiratory component of the SOFA score (category 4), need for intubation, need for vasopressors, and neutrophil-to-lymphocyte (NL) ratio (Fig. 3).
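
The odds ratios in Fig. 3 come from the multivariable model and cannot be reproduced from the tables alone. As a rough illustration of how an odds ratio and its 95 % confidence interval are formed, the Python sketch below computes an unadjusted odds ratio from the vasopressor counts in Table 2 (246 of 540 non-survivors and 149 of 928 survivors), using the standard error of the log odds ratio (Woolf method). The result is a crude estimate, not the adjusted value reported by the regression model, and the function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, total_cases,
                  exposed_controls, total_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    a = exposed_cases                       # non-survivors on vasopressors
    b = total_cases - exposed_cases         # non-survivors without
    c = exposed_controls                    # survivors on vasopressors
    d = total_controls - exposed_controls   # survivors without
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts taken from Table 2 (vasopressors row).
or_vaso, ci_low, ci_high = odds_ratio_ci(246, 540, 149, 928)
print(f"Unadjusted OR = {or_vaso:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```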

Fig. 3.

Stepwise logistic regression of 28-day mortality.

Discussion

We utilized the DT analysis and identified the interaction of five features predictive of 28-day ICU outcomes: the need for intubation, vasopressors, age, gender, and PaO2/FiO2 ratio. The COVID-19 pandemic overwhelmed the health care system and led to constrained medical resources, especially in terms of critical care unit capacity; there was even a shortage of mechanical ventilators [15], [16], [17].

Many hospitals utilized machine learning analyses by combining clinical, radiological, and laboratory data for the prognostication and rapid risk stratification of PCR-confirmed COVID-19 patients [18], [19], [20]. The severity of illness among ICU patients was stratified via different general scoring methods such as the Acute Physiology and Chronic Health Evaluation (APACHE) II and IV [21], [22], the Simplified Acute Physiology Score (SAPS) [23], and the SOFA score [11], or via COVID-19-specific scores such as the 4C Mortality Score [24], [25].

Machine-learning models have been increasingly utilized in the medical field, especially for cancer outcome predictions [27], [28], [29]. Random Forest classifiers, decision trees, and artificial neural networks (ANNs) were among the earliest used techniques in medical research [30], [31]. DT analysis is an effective classifier and has been applied in many domains [32], [33]. DTs are an intuitive nonlinear approach and can automatically detect independent variables that predict outcomes as well as the interactions between these variables. DTs also offer an easy-to-understand visual representation of the relationships between the variables and the primary outcome [34].

The standard logistic regression analysis can predict outcomes of interest, but it does not model nonlinear relationships in multidimensional data [26]. The DT analysis was built using the same predefined set of variables used for stepwise logistic regression, and the retained variables were comparable between both models. Our research emphasizes the benefits of DT analysis in terms of providing simple rules-based algorithmic prediction rather than merely identifying associations and relationships between variables, as conventional regression models offer [35], [36].

The use of machine learning in the ICU is evolving. It is currently limited to diagnostic and prognostic values. However, DT analysis offers a simple method for the sequential analysis of variables. For example, among patients in this cohort who were neither intubated nor required vasopressor support early in ICU admission, those younger than 40 years had a true positive rate of survival of 88 % (Fig. 2). This DT provides a simple, valuable tool for bedside physicians during COVID-19 to guide critical decisions, such as end-of-life decisions and bed allocation. The algorithmic relationship of combined risk factors offers a quantitative percentage of sensitivity to outcomes [9].

Predictors of mortality in COVID-19 are widely reported in many studies with different settings and designs. These include laboratory and radiological variables [37], [38]. However, there are limited reports on clinical variables on ICU admission as predictors of mortality. Such variables can facilitate the early identification of critically ill COVID-19 patients at a higher risk of 28-day mortality [39]. A meta-analysis by Du et al. addressed the predictors of mortality using classic logistic regression analysis; it showed that advanced age, male gender, and comorbidities including chronic respiratory disease, diabetes mellitus, hypertension, and chronic kidney or cardiovascular disease were associated with severe illness or death among COVID-19 patients [40].

Studies that report predictors of mortality utilizing DT analysis in critically ill COVID-19 patients are quite limited [41], [42], [43]. One of these analyses, by Yang et al., presented a rapid, simple, and easy-to-interpret DT model built from three biochemical markers on ICU admission (LDH, NLR, and CRP). The model showed high sensitivity for predicting death in severe COVID-19 disease [41].

The strengths of this study include its multicenter nature and the inclusion of different nationalities, which improve generalizability. In addition, unlike earlier reported experiences from the Middle East [44], the 28-day ICU mortality of 36.8 % in this cohort was comparable to reported experiences during the pandemic [45], [46], [47]. We used stepwise logistic regression to evaluate the results of the DT analysis (Fig. 3). To our knowledge, the number of patients enrolled here is the largest in the Middle East. Thus, the results offer a valuable analysis to explain the disease and its effects in the Gulf and Middle East regions. The performance of DT analysis was comparable to that of stepwise logistic regression: both had an AUROC in the acceptable range of 70–80 % [75.42 % (95 % CI = 74.84–78.95) for DT and 79.96 % (95 % CI = 76.91–83.02) for stepwise logistic regression], and both retained similar variables as predictors of outcome. DT analysis, in addition, used ten-fold cross-validation to assess the model's generalizability and avoid over-fitting. DT analysis also provides an algorithmic visualization of non-linear interactions between variables that standard logistic regression cannot.

Our study does have some limitations, including the lack of external validation for the proposed model of predictors. We did not include center as a preset variable and did not adjust for centers in the logistic regression or DT models, for the following reasons: (1) the variability of resources available in each center might have an unobserved yet strong effect on 28-day mortality; (2) the number of cases varied significantly between centers, which can significantly influence confidence intervals and the interpretation of the results; and (3) it was difficult to identify a reference center because of the variability of the patient populations presenting to more prominent hospitals in different regions. Finally, the DT model accuracy was 73.1 % (the number of patients retained in the model was 1043 of 1468); although comparable to that of the logistic regression analysis, this accuracy is not reassuring, and further research is needed to confirm the model's performance.

Conclusion

Five clinical predictors of 28-day ICU outcomes were identified using DT algorithmic analysis of COVID-19 patients admitted to the ICU. DT is a simple tool that might be utilized in the ICU for early identification of critically ill COVID-19 patients who are at high risk of 28-day mortality. However, further studies are required to validate these results and evaluate the role of DT analysis in the ICU.

Ethics approval

Approval was obtained from the Central Institutional Review Board at the Saudi Ministry of Health [20-80E]. Individual ethical board approvals of the participating centers were also obtained.

Funding

No funding was provided for this study.

CRediT authorship contribution statement

Design conception: AE, HS, AR. Data collection: AE, HS, AO, AM, AG, AT, M.H.A, AS, AK, ZA, GM, WT, S.A.A, FF, AH, JT, R.G.M, and the Saudi COVID working group. Data validation and cleaning: AE, HS, and HM. Statistical analysis, model development, and validation: HM. Acquisition, analysis, or interpretation of data: AE, AR, and HM. Drafting and writing of the manuscript: AE and AR. Editing and reviewing: YA, WH, MS, ZA, AO, and HM. Critical revision of the manuscript for important intellectual content, final review, and approval: all authors.

Declaration of Competing Interest

None of the authors have a conflict of interest related to this work.

Acknowledgement

The authors would like to thank the Saudi Critical Care Trials Group for their kind guidance as well as the Saudi COVID working group for their marvelous help in data sharing and collection, Mohamed Ali Al-Odat, Huda A Mhawish, Bobby Rose Marasigan, Armalene Cabuyadao Bautista, Katrina Baguisa, Zahra Alfrdan, Bernadette Daher, Mohammed H. Lhmdi, Waleed Alatreeby, Mahdi Othman, Ashraf Shawat, Ahmed Y Zakarya, Mohamed Abdulzaher, Saleh Samir Saleh Ashgar, Jumana Khalid Nabeeh Alansari, Mohammed Amin, Mohammed Rio, Duaa Jihad Zain Basrawi, Hussam Mahmoud Omar Baeissa, Bashayer Mohammed Alsaadi, Ibrahim Mohamed Ibrahim Nasr, Mahdi Alajaj, Ozaz Hamza, Saud A Alghamdi, Hamza S Ghazal, Omar S Elrabi, Sumaih A Alomran, Saleh S Alomari, Taghred M Shwaish, Afaf S Alotaibi, Omar Alrabi, Amani Alharbi, Mohammed Alzanbaqi, Fahad Alsallum, Mohammed Bashanfar, Mohammed Jastaniah, Hamdan Alguydi, Ahmed Fakhri, and Sheryl S. Lugue. Their willingness to give their time so generously is very much appreciated.

Consent to participate

Consent was waived because all data are de-identified.

Consent for publication

All authors consent to publication of this work.

Data Availability

Alyaa Elhazmi, Hend Sallam, and Hani Mufti had full access to all the data in the study and took responsibility for the integrity of the data and the accuracy of the analysis. They are willing to submit the data for an external review upon request.

Articles from Journal of Infection and Public Health are provided here courtesy of Elsevier
