BMC Medical Research Methodology. 2022 Jul 4;22:183. doi: 10.1186/s12874-022-01664-z

Machine learning approach for the prediction of 30-day mortality in patients with sepsis-associated encephalopathy

Liwei Peng 1,#, Chi Peng 2,#, Fan Yang 3,#, Jian Wang 1, Wei Zuo 1, Chao Cheng 1, Zilong Mao 1, Zhichao Jin 2, Weixin Li 1
PMCID: PMC9252033  PMID: 35787248

Abstract

Objective

Our study aimed to identify predictors as well as develop machine learning (ML) models to predict the risk of 30-day mortality in patients with sepsis-associated encephalopathy (SAE).

Materials and methods

ML models were developed and validated using the public Medical Information Mart for Intensive Care (MIMIC)-IV database. Models were compared by the area under the curve (AUC), accuracy, sensitivity, specificity, positive and negative predictive values, and the Hosmer–Lemeshow goodness-of-fit test.

Results

Of 6994 patients in MIMIC-IV included in the final cohort, 1232 (17.62%) died following SAE. Recursive feature elimination (RFE) selected 15 variables: acute physiology score III (APSIII), Glasgow coma score (GCS), sepsis-related organ failure assessment (SOFA), Charlson comorbidity index (CCI), red blood cell volume distribution width (RDW), blood urea nitrogen (BUN), age, respiratory rate, PaO2, temperature, lactate, creatinine (CRE), malignant cancer, metastatic solid tumor, and platelet count (PLT). In the validation cohort, all other ML approaches had higher discriminative ability than the bagged trees (BT) model, although the difference was not statistically significant. In terms of calibration, the artificial neural network (NNET), logistic regression (LR), and adaptive boosting (Ada) models were well calibrated, with Hosmer–Lemeshow P-values of 0.831, 0.119, and 0.129, respectively.

Conclusions

The ML models developed in our study can be used to evaluate the prognosis of SAE patients in the intensive care unit (ICU). An online calculator could facilitate the sharing of predictive models.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12874-022-01664-z.

Keywords: Machine learning, Model interpretation, Sepsis-associated encephalopathy, SAE, Web-based calculator

Introduction

Sepsis-associated encephalopathy (SAE) is characterized by diffuse cerebral dysfunction resulting from a dysregulated host response without central nervous system (CNS) infection [1]. It develops in 8–70% of septic patients, depending on sepsis severity, patient profile, and SAE diagnostic criteria [2–5]. Symptoms in the acute stage include sickness behavior, delirium, and coma. Survivors of the acute stage tend to develop persistent neurocognitive impairment, including cognitive alterations and even overt dementia [6–8]. SAE has been reported to be associated with longer duration of mechanical ventilation (MV) and prolonged length of stay (LOS) in the intensive care unit (ICU), as well as poor overall prognosis [9, 10].

It is also associated with worse scores on severity scoring systems, including the Glasgow coma score (GCS), the sequential organ failure assessment (SOFA) score, and the Acute Physiology and Chronic Health Evaluation II (APACHE II) [7, 10]. Moreover, with a mortality rate of up to 63% [3], SAE is detrimental to patients’ health and imposes a heavy financial burden. Accordingly, early identification, especially through individualized and quantitative prediction models, and prompt management are of vital importance for the survival and prognosis of SAE patients [11]. Recently, the advent of machine learning (ML) algorithms has made it possible to predict disease events dynamically from complex clinical information. ML, an artificial intelligence method, develops models by “learning” from existing data [12]. Moreover, because it does not rely on particular model assumptions, ML may be adept at handling intricate interactions between variables [13]. The present study aimed to identify independent predictors and then develop models to quantitatively predict the likelihood of 30-day mortality in patients with SAE.

Methods

Data source

This retrospective study was conducted on the Medical Information Mart for Intensive Care (MIMIC)-IV version 1.0 [14]. The MIMIC-IV database contains comprehensive, de-identified data on patients admitted to the ICUs at the Beth Israel Deaconess Medical Center in Boston, Massachusetts, between 2008 and 2019, covering 383,220 admissions at a single center. One author (CP) obtained access to the database and was responsible for data extraction (Certification number: 41657645). This study was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA). The requirement for individual patient consent was waived because all protected health information was de-identified.

Participant selection

Patients were included if they had a diagnosis of sepsis in accordance with the Third International Consensus Definitions for Sepsis (Sepsis-3) [15]. Patients were excluded if they were younger than 16 years, had an ICU stay of less than 48 h, or had primary brain injury (traumatic brain injury, ischemic stroke, hemorrhagic stroke, epilepsy, or intracranial infection), pre-existing liver or kidney failure affecting consciousness, chronic alcohol or drug abuse, or severe electrolyte imbalances. In addition, for patients with multiple ICU admissions, only data from the first ICU admission of the first hospitalization were included in the analysis.

Predictors of 30-day mortality in SAE patients

The data extracted from MIMIC-IV included age, gender, race, and coexisting disorders. The Charlson comorbidity index (CCI) was then calculated from its component variables [myocardial infarction, congestive heart failure, peripheral vascular disease, cerebrovascular disease, dementia, chronic pulmonary disease, rheumatic disease, peptic ulcer disease, diabetes, paraplegia, renal disease, malignant cancer, severe liver disease, metastatic solid tumor, and acquired immunodeficiency syndrome (AIDS)]. Additionally, we retrospectively extracted the following data from the first day of ICU admission: vital signs, laboratory findings, injury types, therapy strategies, and severity scores. Because values were considered to be missing at random, we used multiple imputation to handle missing data. Details of the missing data are shown in Supplementary Table 1 (Table S1).

Statistical analysis

Continuous variables were presented as means with standard deviations (if normally distributed) or medians with interquartile ranges (IQR) (if non-normally distributed), and categorical variables as counts (%). Continuous variables were compared using the t test or the Wilcoxon rank sum test, and proportions were compared using the χ² test or Fisher's exact test, as appropriate.

Recursive feature elimination (RFE), a feature selection method, was used to select the most relevant variables. In brief, RFE repeatedly fits a model on progressively smaller feature sets until a specified termination criterion is reached. In each iteration, features are ranked by their importance in the trained model. Highly correlated and collinear variables were ultimately eliminated. Features were then evaluated in subsets of 15, 25, 35, 45, and all 56 variables (as represented in Fig. 1), ordered by the ranks obtained from feature selection. To find the optimal hyperparameters, fivefold cross-validation was used as the resampling method. In each iteration, four folds were used as the training subset and the remaining fold was used to tune the hyperparameters. This training-test process was repeated thirty times. Thus, each sample was used for both training and testing, so that the data were used as fully as possible.
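The resampling scheme described above (fivefold cross-validation repeated thirty times) can be sketched as follows. This is a minimal standard-library Python illustration, not the authors' R code; the function name `repeated_kfold`, the random seed, and the toy sample size are assumptions introduced here.

```python
import random


def repeated_kfold(n_samples, n_folds=5, n_repeats=30, seed=0):
    """Yield (repeat, fold, train_idx, held_out_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repeat
        # Deal the shuffled indices into n_folds groups of near-equal size.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for k, held_out in enumerate(folds):
            # The other four folds form the training subset.
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield repeat, k, sorted(train), sorted(held_out)


# Toy usage: 20 samples, 5 folds, 30 repeats.
splits = list(repeated_kfold(n_samples=20))
print(len(splits))  # number of train/held-out pairs produced
```

Within each repeat every sample is held out exactly once, which is why each sample contributes to both training and tuning.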

Fig. 1.

Overview of the methods used for data extraction, training, and testing. ICU, intensive care unit; MIMIC, Medical Information Mart for Intensive Care; ML, machine learning; NNET, artificial neural network; NB, naïve Bayes; LR, logistic regression; GBM, gradient boosting machine; Ada, adaptive boosting; RF, random forest; BT, bagged trees; XGB, eXtreme Gradient Boosting

In this study, we employed nine ML algorithms to develop models: artificial neural network (NNET), naïve Bayes (NB), logistic regression (LR), gradient boosting machine (GBM), adaptive boosting (Ada), random forest (RF), bagged trees (BT), eXtreme Gradient Boosting (XGB), and CatBoost. First, the population was divided into a development set and a validation set. For internal validation, bootstrap resampling with 100 iterations was employed, and the median and 95% confidence interval of the area under the curve (AUC) were calculated. Other evaluation indicators, such as accuracy, sensitivity, specificity, negative predictive value, and positive predictive value, were also calculated. Calibration was assessed with calibration curves and the Hosmer–Lemeshow goodness-of-fit test: the chi-square value was calculated from the observed and predicted values in each group, and the corresponding P value was then obtained. Finally, the “shiny” package in R was used to build a visual data analysis platform. All analyses were performed in R version 4.0.2 (http://www.R-project.org, The R Foundation) using the “caret” and “shiny” packages. P values less than 0.05 (two-sided) were considered statistically significant.
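As an illustration of the bootstrap step, the sketch below computes an AUC and its bootstrap median and 95% percentile interval in standard-library Python. It is not the authors' R implementation: the function names, the seed, the percentile method, and the toy labels and scores are assumptions made for this example.

```python
import random
import statistics


def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(y_true, y_score, n_boot=100, seed=0):
    """Return (median, lower, upper) of the AUC over n_boot bootstrap resamples."""
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class; draw again
        values.append(auc(ys, [y_score[i] for i in idx]))
    cuts = statistics.quantiles(values, n=40)  # cut points every 2.5%
    return statistics.median(values), cuts[0], cuts[-1]


# Toy data (hypothetical): 1 = died within 30 days, score = predicted risk.
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.05, 0.20, 0.30, 0.35, 0.40, 0.60, 0.15, 0.80, 0.55, 0.70]
print(auc(y_true, y_score))
print(bootstrap_auc(y_true, y_score))
```

The pairwise definition of the AUC is quadratic in the sample size; for a cohort of several thousand patients a rank-based formula would be the more practical choice.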

Results

Baseline characteristics

In accordance with the inclusion and exclusion criteria, 6994 patients were included in the final dataset. The process of data extraction, training preparation, and testing with the different ML algorithms is shown in Fig. 1. The characteristics of the participants are presented in Table 1. Patients who died were more likely to be older and to have more comorbidities (myocardial infarction, congestive heart failure, chronic pulmonary disease, rheumatic disease, mild liver disease, renal disease, malignant cancer, severe liver disease, metastatic solid tumor), higher heart rate, higher respiratory rate, higher white blood cell count (WBC), higher mean corpuscular volume (MCV), higher red blood cell volume distribution width (RDW), longer activated partial thromboplastin time (APTT), longer prothrombin time (PT), higher international normalized ratio (INR), higher lactate, higher anion gap, higher potassium, higher creatinine (CRE), higher blood urea nitrogen (BUN), more frequent vasopressor use, and higher sepsis-related organ failure assessment (SOFA), acute physiology score III (APSIII), and systemic inflammatory response syndrome (SIRS) scores. Furthermore, they were more likely to have lower temperature, lower mean arterial pressure (MAP), lower red blood cell count (RBC), lower mean corpuscular hemoglobin concentration (MCHC), lower platelet count (PLT), lower hematocrit (HCT), lower pH, lower bicarbonate, lower buffer excess (BE), lower PaO2, lower chloride, lower sodium, and lower Glasgow coma score (GCS).

Table 1.

Baseline characteristics of the MIMIC-IV cohort

| Variables | Survival (n = 5762) | Death (n = 1232) | P value |
| --- | --- | --- | --- |
| Demographics | | | |
| Age (y), median [Q1, Q3] | 70.00 (58.00,81.00) | 77.00 (66.00,85.25) | < 0.001 |
| Male, n (%) | 3181 (55.21) | 659 (53.49) | 0.286 |
| Race, n (%) | | | 0.001 |
| Black | 490 (8.50) | 93 (7.55) | |
| White | 4000 (69.42) | 824 (66.88) | |
| Hispanic | 174 (3.02) | 31 (2.52) | |
| Asian | 193 (3.35) | 31 (2.52) | |
| Others | 905 (15.71) | 253 (20.54) | |
| Coexisting disorders, n (%) | | | |
| Myocardial infarction | 1006 (17.46) | 265 (21.51) | 0.001 |
| Congestive heart failure | 1977 (34.31) | 540 (43.83) | < 0.001 |
| Peripheral vascular disease | 696 (12.08) | 174 (14.12) | 0.054 |
| Cerebrovascular disease | 259 (4.49) | 51 (4.14) | 0.636 |
| Dementia | 300 (5.21) | 81 (6.57) | 0.064 |
| Chronic pulmonary disease | 1728 (29.99) | 421 (34.17) | 0.004 |
| Rheumatic disease | 234 (4.06) | 79 (6.41) | < 0.001 |
| Peptic ulcer disease | 229 (3.97) | 64 (5.19) | 0.063 |
| Mild liver disease | 709 (12.30) | 242 (19.64) | < 0.001 |
| Diabetes without complication | 1308 (22.70) | 255 (20.70) | 0.135 |
| Diabetes with complication | 507 (8.80) | 105 (8.52) | 0.798 |
| Paraplegia | 124 (2.15) | 17 (1.38) | 0.101 |
| Renal disease | 1349 (23.41) | 387 (31.41) | < 0.001 |
| Malignant cancer | 839 (14.56) | 342 (27.76) | < 0.001 |
| Severe liver disease | 225 (3.90) | 93 (7.55) | < 0.001 |
| Metastatic solid tumor | 343 (5.95) | 205 (16.64) | < 0.001 |
| AIDS | 41 (0.71) | 10 (0.81) | 0.849 |
| CCI, median [Q1, Q3] | 6.00 (4.00,8.00) | 7.00 (6.00,9.00) | < 0.001 |
| Vital signs (1st 24 h) | | | |
| Temperature (°C), median [Q1, Q3] | 36.90 (36.60,37.30) | 36.70 (36.40,37.10) | < 0.001 |
| MAP (mmHg), median [Q1, Q3] | 75.00 (70.00,82.00) | 73.00 (68.00,80.00) | < 0.001 |
| Heart rate (beats/min), median [Q1, Q3] | 88.00 (77.00,100.00) | 91.00 (80.00,104.00) | < 0.001 |
| Respiratory rate (breaths/min), median [Q1, Q3] | 20.00 (17.00,23.00) | 22.00 (19.00,25.00) | < 0.001 |
| Laboratory findings (1st 24 h) | | | |
| RBC (× 10¹²/L), median [Q1, Q3] | 3.41 (3.02,3.92) | 3.28 (2.88,3.80) | < 0.001 |
| WBC (× 10⁹/L), median [Q1, Q3] | 11.80 (8.60,16.03) | 12.62 (8.77,17.50) | 0.002 |
| MCH (pg), median [Q1, Q3] | 30.20 (28.73,31.50) | 30.12 (28.70,31.63) | 0.925 |
| MCHC (%), median [Q1, Q3] | 33.00 (31.85,34.00) | 32.40 (31.30,33.50) | < 0.001 |
| MCV (fL), median [Q1, Q3] | 91.00 (87.00,95.00) | 92.75 (88.00,97.33) | < 0.001 |
| PLT (× 10⁹/L), median [Q1, Q3] | 197.00 (139.00,268.50) | 189.33 (116.71,272.75) | 0.001 |
| RDW (%), median [Q1, Q3] | 14.77 (13.73,16.27) | 16.00 (14.58,18.00) | < 0.001 |
| HCT (%), median [Q1, Q3] | 31.30 (27.78,35.70) | 30.38 (26.80,35.07) | < 0.001 |
| APTT (s), median [Q1, Q3] | 31.90 (27.80,39.80) | 35.70 (29.30,49.50) | < 0.001 |
| PT (s), median [Q1, Q3] | 14.27 (12.85,16.50) | 15.70 (13.40,20.00) | < 0.001 |
| INR, median [Q1, Q3] | 1.30 (1.15,1.50) | 1.40 (1.20,1.85) | < 0.001 |
| pH, median [Q1, Q3] | 7.37 (7.32,7.42) | 7.36 (7.31,7.42) | < 0.001 |
| Bicarbonate (mmol/L), median [Q1, Q3] | 23.00 (20.50,26.00) | 21.90 (19.00,25.50) | < 0.001 |
| Lactate (mmol/L), median [Q1, Q3] | 1.65 (1.20,2.28) | 1.95 (1.40,2.89) | < 0.001 |
| BE (mEq/L), median [Q1, Q3] | -0.50 (-3.50,1.43) | -1.33 (-5.00,1.00) | < 0.001 |
| Anion gap (mmol/L), median [Q1, Q3] | 14.00 (12.00,16.50) | 15.50 (13.00,18.21) | < 0.001 |
| PaO2 (mmHg), median [Q1, Q3] | 112.00 (76.00,173.00) | 92.00 (68.00,131.00) | < 0.001 |
| PaCO2 (mmHg), median [Q1, Q3] | 41.00 (37.00,47.00) | 41.00 (35.00,48.00) | 0.014 |
| Chloride (mmol/L), median [Q1, Q3] | 104.50 (100.50,108.20) | 103.00 (98.75,107.50) | < 0.001 |
| Calcium (mg/dL), median [Q1, Q3] | 8.20 (7.73,8.63) | 8.10 (7.63,8.65) | 0.097 |
| Sodium (mmol/L), median [Q1, Q3] | 138.60 (136.00,141.00) | 138.00 (134.82,141.43) | 0.001 |
| Potassium (mmol/L), median [Q1, Q3] | 4.15 (3.83,4.55) | 4.28 (3.88,4.78) | < 0.001 |
| Glucose (mg/dL), median [Q1, Q3] | 128.50 (108.33,156.24) | 131.00 (107.00,163.54) | 0.130 |
| CRE (mg/dL), median [Q1, Q3] | 1.05 (0.75,1.68) | 1.35 (0.85,2.30) | < 0.001 |
| BUN (mg/dL), median [Q1, Q3] | 22.29 (15.00,37.67) | 32.58 (20.67,52.35) | < 0.001 |
| Therapy (1st 24 h), n (%) | | | |
| Vasopressor | 1799 (31.22) | 538 (43.67) | < 0.001 |
| Scoring system | | | |
| GCS | 13.00 (9.00,14.00) | 8.00 (3.00,12.00) | < 0.001 |
| SOFA | 6.00 (4.00,9.00) | 9.00 (6.00,12.00) | < 0.001 |
| APSIII | 55.00 (41.00,72.00) | 80.50 (63.00,102.00) | < 0.001 |
| SIRS | 3.00 (2.00,3.00) | 3.00 (2.75,4.00) | < 0.001 |

AIDS Acquired Immunodeficiency Syndrome, CCI Charlson Comorbidity Index, MAP Mean Arterial Pressure, RBC Red Blood Cell, WBC White Blood Cell, MCH Mean Corpuscular Hemoglobin, MCHC Mean Corpuscular Hemoglobin Concentration, MCV Mean Corpuscular Volume, PLT Platelet, RDW Red blood cell volume Distribution Width, HCT Hematocrit, APTT Activated Partial Thromboplastin Time, PT Prothrombin Time, INR International Normalized Ratio, pH potential of Hydrogen, BE Buffer Excess, CRE Creatinine, BUN Blood Urea Nitrogen, GCS Glasgow Coma Score, SOFA Sepsis-related Organ Failure Assessment, APSIII Acute Physiology Score III, SIRS Systemic Inflammatory Response Syndrome

Variable importance

Based on the importance threshold, a total of 15 important predictors were selected by the RFE algorithm (Fig. 2). These variables were APSIII, GCS, SOFA, CCI, RDW, BUN, age, respiratory rate, PaO2, temperature, lactate, CRE, malignant cancer, metastatic solid tumor, and PLT. These 15 variables were then used in all subsequent analyses for all models in both the training and testing sets.

Fig. 2.

Association between the number of variables allowed to be considered at each split and the prediction accuracy in the RFE algorithm. RFE, recursive feature elimination

Comparisons among different ML models

The discriminatory abilities of all models for predicting 30-day mortality in SAE patients are shown in Fig. 3 and Table 2. The NNET, NB, LR, GBM, Ada, RF, BT, XGB, and CatBoost models were built on the training set and achieved AUCs of 0.833, 0.816, 0.833, 0.824, 0.834, 0.825, 0.804, 0.830, and 0.830, respectively, in the testing set. BT had the lowest discriminative ability (AUC 0.804, 95% CI 0.786 to 0.820), while the other eight models had relatively higher discriminative ability (Table S2). In terms of calibration, the NNET, LR, and Ada models were well calibrated, with P-values of 0.831, 0.119, and 0.129, respectively (Fig. 4).
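For readers unfamiliar with the calibration P-values quoted above, the following standard-library Python sketch shows one common form of the Hosmer–Lemeshow statistic. It is an illustration only: the number of risk groups (ten), the function name, and the toy data are assumptions, since the paper does not report how patients were grouped. A P-value above 0.05 means the test finds no evidence of miscalibration.

```python
import math


def hosmer_lemeshow(y_true, y_prob, n_groups=10):
    """Return (chi2, p_value), grouping patients by sorted predicted risk.

    Assumes len(y_true) >= n_groups and probabilities strictly between 0 and 1.
    """
    if n_groups < 4 or n_groups % 2:
        raise ValueError("this sketch needs an even n_groups >= 4")
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    chi2 = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        size = len(group)
        expected = sum(p for p, _ in group)  # expected number of deaths
        observed = sum(y for _, y in group)  # observed number of deaths
        # Combined contribution of deaths and survivors in this risk group.
        chi2 += (observed - expected) ** 2 / (expected * (1 - expected / size))
    # Closed-form chi-square survival function for even df = n_groups - 2.
    half = chi2 / 2
    terms = (n_groups - 2) // 2
    p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(terms))
    return chi2, p_value


# Toy data (hypothetical): 10 risk groups of 20 patients whose observed
# deaths match the predicted risk exactly, so calibration is perfect.
y_prob, y_true = [], []
for g in range(1, 11):
    y_prob += [0.05 * g] * 20
    y_true += [1] * g + [0] * (20 - g)
print(hosmer_lemeshow(y_true, y_prob))
```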

Fig. 3.

AUC of the ROC curves of the ML models in the validation cohort. AUC, area under the curve; ROC, receiver operating characteristic; ML, machine learning; NNET, artificial neural network; NB, naïve Bayes; LR, logistic regression; GBM, gradient boosting machine; Ada, adaptive boosting; RF, random forest; BT, bagged trees; XGB, eXtreme Gradient Boosting

Table 2.

Analysis of sensitivity and specificity

| Model | Accuracy | Sensitivity | Specificity | PPV | NPV | AUC | Operating threshold | 95% CI of AUC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NNET | 0.840 | 0.802 | 0.733 | 0.391 | 0.946 | 0.833 | 0.164 | (0.816, 0.849) |
| NB | 0.833 | 0.767 | 0.800 | 0.450 | 0.941 | 0.816 | 0.058 | (0.799, 0.833) |
| LR | 0.843 | 0.808 | 0.731 | 0.391 | 0.947 | 0.833 | 0.162 | (0.816, 0.848) |
| GBM | 0.844 | 0.805 | 0.699 | 0.360 | 0.944 | 0.824 | 0.141 | (0.807, 0.840) |
| Ada | 0.846 | 0.786 | 0.737 | 0.390 | 0.942 | 0.834 | 0.148 | (0.817, 0.849) |
| RF | 0.840 | 0.856 | 0.642 | 0.338 | 0.954 | 0.825 | 0.150 | (0.808, 0.841) |
| BT | 0.836 | 0.715 | 0.745 | 0.375 | 0.925 | 0.804 | 0.240 | (0.786, 0.820) |
| XGB | 0.844 | 0.808 | 0.712 | 0.374 | 0.945 | 0.830 | 0.157 | (0.814, 0.846) |
| CatBoost | 0.842 | 0.789 | 0.741 | 0.394 | 0.943 | 0.830 | 0.165 | (0.813, 0.846) |

PPV Positive Predictive Value, NPV Negative Predictive Value, AUC Area Under the Curve, CI Confidence Interval, NNET artificial Neural Network, NB Naïve Bayes, LR Logistic Regression, GBM Gradient Boosting Machine, Ada Adaptive boosting, RF Random Forest, BT Bagged Trees, XGB eXtreme Gradient Boosting
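The relationship between the columns of Table 2 can be checked with Bayes' rule: given sensitivity, specificity, and the prevalence of death, the PPV and NPV follow directly. The Python sketch below uses the NNET row. The prevalence of death in the validation cohort is not reported, so the overall cohort prevalence (1232/6994) is used here as an assumption, and the function name is introduced only for this example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumption: validation-cohort prevalence equals the overall cohort prevalence.
prevalence = 1232 / 6994
ppv, npv = predictive_values(sensitivity=0.802, specificity=0.733, prevalence=prevalence)
print(round(ppv, 3), round(npv, 3))
```

Under this assumption the computed values land close to the PPV (0.391) and NPV (0.946) reported for NNET. The low PPV alongside a high NPV is what one expects when roughly one patient in six dies: most positive predictions are false alarms, while negative predictions are usually correct.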

Fig. 4.

Calibration curves in the validation cohort. NNET, artificial neural network; NB, naïve Bayes; LR, logistic regression; GBM, gradient boosting machine; Ada, adaptive boosting; RF, random forest; BT, bagged trees; XGB, eXtreme Gradient Boosting

Fig. 5 shows the importance of the fifteen predictor variables in each ML model. The importance of each variable for predicting mortality in SAE varied with the ML approach. In general, APSIII was the most important variable across all ML algorithms, followed by GCS and RDW.

Fig. 5.

Variable importance in nine different ML models. ML, machine learning; NNET, artificial neural network; NB, naïve Bayes; LR, logistic regression; GBM, gradient boosting machine; Ada, adaptive boosting; RF, random forest; BT, bagged trees; XGB, eXtreme Gradient Boosting

Application of model

The Shiny application was built on the entire training set and shows the impact of each variable on the predicted mortality of SAE patients (Fig. 6). For example, the following information for one patient was entered into the model: no metastatic solid tumor, no malignant cancer, APSIII (121), GCS (3), CCI (6), SOFA (16), age (92), temperature (32 °C), respiratory rate (19 breaths/min), RDW (17.5%), PLT (158 × 10⁹/L), lactate (4.6 mmol/L), BUN (20 mg/dL), CRE (1.1 mg/dL), PaO2 (85 mmHg). The model estimated the risk of in-hospital mortality for this patient at 84.20%, indicating that the 30-day mortality risk for this SAE patient was relatively high and that precautionary measures were recommended. To make the model easier to apply, we also built a web-based calculator (https://pengchi2009.shinyapps.io/Mortality_of_sepsis_associated_encephalopathy/).

Fig. 6.

Example of website usage. Entering the input values returns the predicted mortality and displays how each value contributes to the prediction. CCI, Charlson Comorbidity Index

Discussion

Herein, nine ML models were developed and validated to predict 30-day mortality in SAE patients. In terms of discrimination and calibration, the NNET, LR, and Ada models outperformed the remaining models. To make the model easier for surgeons to use, a web-based calculator was developed. Users need only enter the variable values to obtain the predicted 30-day mortality. Both physicians and patients can thus obtain an individualized prediction of 30-day mortality in SAE, which is consistent with the trend toward personalized medicine. This calculator may support clinical decision-making and, more importantly, timely treatment.

A study of 69 patients with sepsis reported mortality rates of 0%, 35%, and 53% in patients with no encephalopathy (n = 20), mild encephalopathy (n = 17), and severe encephalopathy (n = 32), respectively, showing that mortality was correlated with the severity of SAE [16]. In this study, fifteen variables were identified as risk factors: APSIII, GCS, SOFA, CCI, RDW, BUN, age, respiratory rate, PaO2, temperature, lactate, CRE, malignant cancer, metastatic solid tumor, and PLT.

A study by Chen et al. [17] indicated that APACHE II and SOFA were independent risk factors for 28-day mortality in SAE patients, which is similar to our findings. Several previous studies have also found that mortality in sepsis patients is related to worse GCS, SOFA, and APACHE II scores [10, 18, 19]. As an established method of summarizing patient severity of illness on admission to the ICU, APSIII is part of the APACHE system of equations for predicting outcomes in ICU patients [20, 21]. In our study, APSIII and SOFA had relatively high weights in the importance plot, demonstrating that they had strong power to predict 30-day mortality in SAE patients. SAE patients with multiple organ dysfunction syndrome (MODS) clearly have an increased risk of mortality. A cascading immune response, circulatory abnormalities, mitochondrial dysfunction, hypoxia, and increased endothelial permeability may be responsible for this complicated pathophysiological process [22–25]. Consequently, the treatment of SAE is based both on the management of sepsis and on the correction of potential neurotoxic factors.

Consistent with a previous study by Yang et al. [26], RDW was an important predictor of 30-day death in SAE patients. Although the mechanism remains largely unclear, inflammation and oxidative stress may increase RDW values, and these same mechanisms may play a pivotal role in the poor prognosis of SAE [27–29]. In keeping with a previous study, we also found that age was independently associated with 30-day mortality [17]. This is probably because elderly patients have a higher risk of, and mortality from, sepsis [30]. Our study also identified renal function (BUN, CRE), respiratory rate, PaO2, and PLT as predictors of 30-day death in SAE patients. A previous study also found that sepsis patients with renal or multi-organ failure were more frequently affected than those without organ complications [3].

Additionally, altered renal function is associated not only with biological alterations, including severe acidosis and uremia, but also with the accumulation of neurotoxic substances such as antibiotics and hypnotics [10]. Notably, platelet abnormalities, caused by enhanced cytokine expression and activation of vascular endothelial cells, may progress to disseminated intravascular coagulation (DIC) [31, 32]. Accordingly, it is advisable to improve respiratory and circulatory status and to correct coagulation function in order to reduce the mortality of SAE patients.

A previous study indicated that temperature and lactate were significantly correlated with mortality in SAE patients [33], and our results confirm these findings. Lactate is a well-known indicator of prognosis in sepsis patients [34, 35]. Serum lactate is used to evaluate disease severity and guide treatment [34], and elevated levels suggest that SAE patients experience microcirculatory disturbances, which may induce tissue ischemia and hypoxia. Accordingly, patients with hyperlactatemia or lactic acidosis need timely rehydration and other treatments.

The strengths of this study lie in its application of modern ML approaches to predict 30-day mortality, which can help surgeons triage at-risk patients in a timely manner. In addition, the use of cross-validation helps reduce potential overfitting. Further, the study was based on real-world data with a relatively large sample (n = 6994) and underwent rigorous statistical testing.

This study has several limitations. First, although the cohort was divided into a training set and a validation set (70%:30%), external validation is still necessary. In addition, compared with traditional models, Ada had only a slight edge in evaluation indicators such as AUC and accuracy. Second, MIMIC-IV is an administrative database with inherent limitations; for example, neuroimaging data were not available. Third, as with all retrospective studies, there was potential for unmeasured confounders. Fourth, because it was based on ICU patients, this study may not be generalizable to other populations. Lastly, although the shiny package was used to help visualize the results, a more applicable model is still needed in clinical practice.

Conclusions

On the whole, ML models can provide individualized predictions of 30-day mortality in SAE patients, thereby assisting in the early screening of SAE patients who are at risk. This is particularly crucial because early treatment may improve neurocognitive outcomes. Future studies should concentrate on the long-term prognosis of SAE patients and the underlying mechanisms of SAE.

Supplementary Information

Additional file 1: Table S1. Missing number (%) for included variables in the dataset. Table S2. Comparison of the different ROC curves by DeLong's method.

Acknowledgements

We would like to thank the Massachusetts Institute of Technology and the Beth Israel Deaconess Medical Center for the MIMIC project.

Authors’ contributions

(I) Concept and design: L Peng, C Peng; (II) Acquisition of data: F Yang, C Peng; (III) Data analysis and interpretation: C Peng, J Wang, W Zuo, C Cheng, Z Mao; (IV) Manuscript writing: L Peng; (V) Revising it critically for important intellectual content: All authors; (VI) Final approval of manuscript: All authors.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Availability of data and materials

Publicly available datasets were analyzed in this study. These data can be found on PhysioNet (https://physionet.org/content/mimiciv/1.0/).

Declarations

Ethics approval and consent to participate

The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. Because this de-identified database is publicly available, the requirements for institutional review board approval and written informed consent were waived by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA). The study was reported in accordance with the Transparent Reporting of a multivariable Prediction model for Individual Prognosis or Diagnosis (TRIPOD) statement [36]. All methods were performed in accordance with the relevant guidelines and regulations.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Liwei Peng, Chi Peng, Fan Yang contributed equally to this work and are co-first authors.

Contributor Information

Zhichao Jin, Email: jinzhichao@smmu.edu.cn.

Weixin Li, Email: tangdunaowai@163.com.

References

1. Gofton TE, Young GB. Sepsis-associated encephalopathy. Nat Rev Neurol. 2012;8(10):557–566. doi: 10.1038/nrneurol.2012.183.
2. Fraser GL, Riker RR, Coursin DC. Long-term cognitive impairment after critical illness. N Engl J Med. 2014;370(2):184. doi: 10.1056/NEJMc1313886.
3. Eidelman LA, Putterman D, Putterman C, Sprung CL. The spectrum of septic encephalopathy. Definitions, etiologies, and mortalities. JAMA. 1996;275(6):470–473. doi: 10.1001/jama.1996.03530300054040.
4. Young GB, Bolton CF, Austin TW, Archibald YM, Gonder J, Wells GA. The encephalopathy associated with septic illness. Clin Invest Med. 1990;13(6):297–304.
5. Ely EW, Shintani A, Truman B, Speroff T, Gordon SM, Harrell FE Jr, Inouye SK, Bernard GR, Dittus RS. Delirium as a predictor of mortality in mechanically ventilated patients in the intensive care unit. JAMA. 2004;291(14):1753–1762. doi: 10.1001/jama.291.14.1753.
6. Chung HY, Wickel J, Brunkhorst FM, Geis C. Sepsis-associated encephalopathy: from delirium to dementia? J Clin Med. 2020;9(3):703. doi: 10.3390/jcm9030703.
7. Iwashyna TJ, Ely EW, Smith DM, Langa KM. Long-term cognitive impairment and functional disability among survivors of severe sepsis. JAMA. 2010;304(16):1787–1794. doi: 10.1001/jama.2010.1553.
8. Widmann CN, Heneka MT. Long-term cerebral consequences of sepsis. Lancet Neurol. 2014;13(6):630–636. doi: 10.1016/S1474-4422(14)70017-1.
9. Salluh JI, Wang H, Schneider EB, Nagaraja N, Yenokyan G, Damluji A, Serafim RB, Stevens RD. Outcome of delirium in critically ill patients: systematic review and meta-analysis. BMJ. 2015;350:h2538. doi: 10.1136/bmj.h2538.
10. Sonneville R, de Montmollin E, Poujade J, Garrouste-Orgeas M, Souweine B, Darmon M, Mariotte E, Argaud L, Barbier F, Goldgran-Toledano D, et al. Potentially modifiable factors contributing to sepsis-associated encephalopathy. Intensive Care Med. 2017;43(8):1075–1084. doi: 10.1007/s00134-017-4807-z.
11. Barr J, Fraser GL, Puntillo K, Ely EW, Gélinas C, Dasta JF, Davidson JE, Devlin JW, Kress JP, Joffe AM, et al. Clinical practice guidelines for the management of pain, agitation, and delirium in adult patients in the intensive care unit. Crit Care Med. 2013;41(1):263–306. doi: 10.1097/CCM.0b013e3182783b72.
12. Kuhle S, Maguire B, Zhang H, Hamilton D, Allen AC, Joseph KS, Allen VM. Comparison of logistic regression with machine learning methods for the prediction of fetal growth abnormalities: a retrospective cohort study. BMC Pregnancy Childbirth. 2018;18(1):333. doi: 10.1186/s12884-018-1971-2.
13. Goecks J, Jalili V, Heiser LM, Gray JW. How Machine Learning Will Transform Biomedicine. Cell. 2020;181(1):92–101. doi: 10.1016/j.cell.2020.03.022.
14. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody GB, Peng CK, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation. 2000;101(23):E215–220. doi: 10.1161/01.CIR.101.23.e215.
15. Singer M, Deutschman CS, Seymour CW, Shankar-Hari M, Annane D, Bauer M, Bellomo R, Bernard GR, Chiche JD, Coopersmith CM, et al. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016;315(8):801–810. doi: 10.1001/jama.2016.0287.
16. Young GB, Bolton CF, Archibald YM, Austin TW, Wells GA. The electroencephalogram in sepsis-associated encephalopathy. J Clin Neurophysiol. 1992;9(1):145–152. doi: 10.1097/00004691-199201000-00016.
17. Chen J, Shi X, Diao M, Jin G, Zhu Y, Hu W, Xi S. A retrospective study of sepsis-associated encephalopathy: epidemiology, clinical features and adverse outcomes. BMC Emerg Med. 2020;20(1):77. doi: 10.1186/s12873-020-00374-3.
18. Sprung CL, Peduzzi PN, Shatney CH, Schein RM, Wilson MF, Sheagren JN, Hinshaw LB. Impact of encephalopathy on mortality in the sepsis syndrome. The Veterans Administration Systemic Sepsis Cooperative Study Group. Crit Care Med. 1990;18(8):801–806. doi: 10.1097/00003246-199008000-00001.
19. Zhang LN, Wang XT, Ai YH, Guo QL, Huang L, Liu ZY, Yao B. Epidemiological features and risk factors of sepsis-associated encephalopathy in intensive care unit patients: 2008–2011. Chin Med J. 2012;125(5):828–831.
20. LeGall JR, Loirat P, Alpérovitch A. APACHE II–a severity of disease classification system. Crit Care Med. 1986;14(8):754–755. doi: 10.1097/00003246-198608000-00027.
  • 21.Le Gall JR, Lemeshow S, Saulnier F. A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicenter study. JAMA. 1993;270(24):2957–2963. doi: 10.1001/jama.1993.03510240069035. [DOI] [PubMed] [Google Scholar]
  • 22.Nolt B, Tu F, Wang X, Ha T, Winter R, Williams DL, Li C. Lactate and immunosuppression in sepsis. Shock. 2018;49(2):120–125. doi: 10.1097/SHK.0000000000000958. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Vincent JL, De Backer D. Circulatory shock. N Engl J Med. 2013;369(18):1726–1734. doi: 10.1056/NEJMra1208943. [DOI] [PubMed] [Google Scholar]
  • 24.Sun J, Zhang J, Tian J, Virzì GM, Digvijay K, Cueto L, Yin Y, Rosner MH, Ronco C. Mitochondria in sepsis-induced AKI. J Am Soc Nephrol. 2019;30(7):1151–1161. doi: 10.1681/ASN.2018111126. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Opal SM, van der Poll T. Endothelial barrier dysfunction in septic shock. J Intern Med. 2015;277(3):277–293. doi: 10.1111/joim.12331. [DOI] [PubMed] [Google Scholar]
  • 26.Yang Y, Liang S, Geng J, Wang Q, Wang P, Cao Y, Li R, Gao G, Li L. Development of a nomogram to predict 30-day mortality of patients with sepsis-associated encephalopathy: a retrospective cohort study. J Intensive Care. 2020;8:45. doi: 10.1186/s40560-020-00459-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Lippi G, Targher G, Montagnana M, Salvagno GL, Zoppini G, Guidi GC. Relation between red blood cell distribution width and inflammatory biomarkers in a large cohort of unselected outpatients. Arch Pathol Lab Med. 2009;133(4):628–632. doi: 10.5858/133.4.628. [DOI] [PubMed] [Google Scholar]
  • 28.Vayá A, Sarnago A, Fuster O, Alis R, Romagnoli M. Influence of inflammatory and lipidic parameters on red blood cell distribution width in a healthy population. Clin Hemorheol Microcirc. 2015;59(4):379–385. doi: 10.3233/CH-141862. [DOI] [PubMed] [Google Scholar]
  • 29.Perlstein TS, Weuve J, Pfeffer MA, Beckman JA. Red blood cell distribution width and mortality risk in a community-based prospective cohort. Arch Intern Med. 2009;169(6):588–594. doi: 10.1001/archinternmed.2009.55. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Martin GS, Mannino DM, Moss M. The effect of age on the development and outcome of adult sepsis. Crit Care Med. 2006;34(1):15–21. doi: 10.1097/01.CCM.0000194535.82812.BA. [DOI] [PubMed] [Google Scholar]
  • 31.Claushuis TA, van Vught LA, Scicluna BP, Wiewel MA, Klein Klouwenberg PM, Hoogendijk AJ, Ong DS, Cremer OL, Horn J, Franitza M, et al. Thrombocytopenia is associated with a dysregulated host response in critically ill sepsis patients. Blood. 2016;127(24):3062–3072. doi: 10.1182/blood-2015-11-680744. [DOI] [PubMed] [Google Scholar]
  • 32.Levi M, Toh CH, Thachil J, Watson HG. Guidelines for the diagnosis and management of disseminated intravascular coagulation. British Committee for Standards in haematology. Br J Haematol. 2009;145(1):24–33. doi: 10.1111/j.1365-2141.2009.07600.x. [DOI] [PubMed] [Google Scholar]
  • 33.Zhao L, Li Y, Wang Y, Gao Q, Ge Z, Sun X, Li Y. Development and validation of a nomogram for the prediction of hospital mortality of patients with encephalopathy caused by microbial infection: a retrospective cohort study. Front Microbiol. 2021;12:737066. doi: 10.3389/fmicb.2021.737066. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Suetrong B, Walley KR. Lactic acidosis in sepsis: it’s not all anaerobic: implications for diagnosis and management. Chest. 2016;149(1):252–261. doi: 10.1378/chest.15-1703. [DOI] [PubMed] [Google Scholar]
  • 35.Liu Z, Meng Z, Li Y, Zhao J, Wu S, Gou S, Wu H. Prognostic accuracy of the serum lactate level, the SOFA score and the qSOFA score for mortality among adults with Sepsis. Scand J Trauma Resusc Emerg Med. 2019;27(1):51. doi: 10.1186/s13049-019-0609-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350:g7594. doi: 10.1136/bmj.g7594. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

12874_2022_1664_MOESM1_ESM.docx (19.2KB, docx)

Additional file 1: Table S1. Number (%) of missing values for the included variables in the dataset. Table S2. Comparison of ROC curves by DeLong’s method.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found on PhysioNet (https://physionet.org/content/mimiciv/1.0/).


Articles from BMC Medical Research Methodology are provided here courtesy of BMC
