Development and validation of machine learning models to predict in-hospital mortality in ICU patients with sepsis and chronic kidney disease

Shuoyan An; Zixiang Ye; Wuqiang Che; Yanxiang Gao; Jiahui Li; Jingang Zheng

doi:10.1186/s12879-025-11949-5

2025 Nov 5;25:1504. doi: 10.1186/s12879-025-11949-5

PMCID: PMC12587723 PMID: 41194045

Abstract

Background

Sepsis is a life-threatening condition, particularly in intensive care unit (ICU) patients with chronic kidney disease (CKD). However, accurate prediction of in-hospital mortality in this high-risk population remains a clinical challenge. This study aimed to develop and validate machine learning (ML) models to predict in-hospital mortality among ICU patients with sepsis and CKD.

Methods

Patients diagnosed with both sepsis and CKD were identified from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database. Feature selection was performed using the Boruta algorithm. Multiple ML models were developed, including logistic regression (LR), decision tree, k-nearest neighbors (KNN), random forest (RF), support vector machine (SVM), neural network (NN), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost), along with the Sequential Organ Failure Assessment (SOFA) score for comparison. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and average precision (AP). The best-performing model was externally validated in an independent cohort from the eICU Collaborative Research Database (eICU-CRD) and further interpreted using Shapley Additive Explanations (SHAP).

Results

A total of 4,686 ICU patients with sepsis and CKD were included in the development cohort. Among the models, XGBoost demonstrated the best performance with an AUC of 0.911, AP of 0.771, specificity of 96%, and sensitivity of 62%. In the external validation cohort of 3,718 patients, XGBoost also achieved excellent predictive performance with an AUC of 0.855. Model calibration and decision curve analysis confirmed its clinical utility. The top 20 predictors were visualized and ranked based on SHAP values.

Conclusions

Machine learning models, particularly XGBoost, can accurately predict in-hospital mortality in ICU patients with sepsis and CKD. These models may assist clinicians in risk stratification and decision-making for this vulnerable patient population.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12879-025-11949-5.

Keywords: Sepsis, Chronic kidney disease, Machine learning, Mortality

Introduction

Sepsis is a dysregulated, overwhelming systemic host response to severe infection. It is a major reason for hospitalization and one of the primary causes of admission to the intensive care unit (ICU) [1]. Despite significant efforts to improve early detection and timely treatment, sepsis still accounts for approximately 5.3 million deaths worldwide, resulting in a substantial economic and health burden [2]. Extensive previous research has attempted to identify susceptibility factors and predictors of poor outcomes in sepsis [1, 3, 4]. Among these factors, renal function has a complicated relationship with sepsis and has received increasing attention. Chronic kidney disease (CKD), defined as abnormalities of kidney structure or function present for more than 3 months [5], triples the risk of sepsis and doubles the risk of mortality compared with non-CKD individuals [3]. This increased risk is due to multiple factors, including etiological factors (diabetes, autoimmune diseases treated with immunosuppressants), alterations in immune function (uremic toxins and cytokines, malnutrition, chronic inflammation) and increased exposure to pathogens (frequent hospitalizations, vascular access, impaired gastrointestinal barrier function) [6]. More than 50% of patients with sepsis develop acute kidney injury (AKI), and this percentage is higher in those with pre-existing CKD, who carry a high risk of persistent kidney impairment and death [7, 8]. The mechanisms may include hemodynamic instability leading to ischemia, inflammation, metabolic reprogramming, complement activation, mitochondrial dysfunction, microcirculatory abnormalities [9] and genetic susceptibility [10]. In other words, CKD heightens the risk of developing sepsis, and sepsis further deteriorates kidney function, creating a vicious cycle that leads to adverse clinical outcomes.

Since the introduction of the “Sepsis Six Bundle,” fluid resuscitation and antibiotics have made sepsis more treatable [11]. However, these treatments are more complicated in patients with CKD because of the challenges in maintaining fluid balance and determining the appropriate antibiotic dosage. Therefore, it is crucial to promptly identify high-risk CKD patients with sepsis and administer effective treatment without delay. Several scoring systems are used in clinical practice for prognosis prediction, such as the Sequential Organ Failure Assessment (SOFA), the Acute Physiology and Chronic Health Evaluation II (APACHE II) score, the National Early Warning Score (NEWS) and the Modified Early Warning Score (MEWS). However, their accuracy and sensitivity are often unsatisfactory [1], partly due to the heterogeneity of sepsis [12]. In the complex setting of sepsis with CKD, predictive and prognostic enrichment that combines clinical data with traditional and novel biomarkers is necessary, as are specific data analysis methods.

With the rapid progress in artificial intelligence, machine learning (ML)-derived models have been developed to improve clinical diagnosis, workflows and prognosis assessment in critically ill patients with sepsis [2]. ML excels in handling large-scale data and can uncover intrinsic relationships without linear constraints. This approach has been reported to outperform traditional statistical methods in terms of predictive accuracy [13]. ML is regarded as a promising tool in both critical care and nephrology, where it has demonstrated remarkable accuracy in forecasting the progression of CKD, predicting AKI incidence and prognosis in sepsis, and distinguishing subphenotypes of complex clinical conditions [14–16]. However, despite extensive research on sepsis prognosis, patients with concomitant CKD remain largely underrepresented in previous models, even though this subgroup is of particular importance because CKD patients are highly vulnerable to infections, have impaired immune and metabolic responses, and experience disproportionately higher morbidity and mortality in the ICU. In addition, most existing studies have either focused on single-center datasets or lacked external validation, and model interpretability has rarely been addressed. To bridge these gaps, the present study aimed to develop and externally validate an ML-based algorithm for predicting in-hospital mortality in ICU patients with both sepsis and CKD, while also applying SHAP analysis to enhance interpretability and clinical applicability.

Methods

Data sources

Data for the present study were extracted from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database [17]. MIMIC-IV is a freely accessible database containing more than 50,000 ICU admissions between 2008 and 2019 at Beth Israel Deaconess Medical Center (Boston, Massachusetts). Each admission includes the following information: demographics, vital signs, laboratory results, diagnoses coded according to the International Classification of Diseases, Ninth Revision (ICD-9), treatments and follow-up results. Data from the Telehealth Intensive Care Unit Collaborative Research Database (eICU-CRD) were used for external validation [18]. The eICU-CRD is a publicly available multicenter database containing more than 200,000 admissions from more than 200 hospitals across the United States between 2014 and 2015. Certification to access both databases was obtained by one author (ASY, certification number: 39674606), who also performed the data extraction for this study. Because patient data in these databases are de-identified, individual consent was not required.

Study population and data extraction

All patients admitted for the first time and diagnosed with both CKD and sepsis were included in this study. Sepsis was defined according to the Sepsis-3 definition, which requires a suspected or confirmed infection and an acute increase of at least 2 points in the Sequential Organ Failure Assessment (SOFA) score [19]. Because pre-ICU baseline SOFA is not available in MIMIC-IV or eICU-CRD, we assumed a baseline SOFA of zero, consistent with prior database-based studies. The SOFA score was calculated from the first 24 h after ICU admission. Patients with CKD were identified from ICD-9/10 diagnosis codes. If the ICD code explicitly included a CKD stage, that stage was adopted. For patients without stage-specific ICD codes, we calculated the estimated glomerular filtration rate (eGFR) using the CKD-EPI equation from the earliest available serum creatinine value after ICU admission and assigned the CKD stage accordingly. We acknowledge that admission creatinine may be influenced by acute illness, so some degree of misclassification cannot be excluded. To minimize this bias, the earliest creatinine measurement was used [20]. Patients were excluded if they stayed in hospital for less than 6 h, were under 18 years old, lacked follow-up information, or had more than 30% missing data.

Demographic information, comorbidities, laboratory results, vital signs, and in-hospital mortality were collected from the MIMIC-IV and eICU-CRD databases using pgAdmin PostgreSQL (version 1.22.1). The extracted variables comprised demographic variables, comorbidities, vital signs, laboratory findings, medical treatments and the SOFA score. These variables were selected because they are routinely measured in clinical practice and have been reported as potential prognostic indicators in critically ill patients. Demographic variables were age and gender. Comorbidities were the stage of CKD, chronic pulmonary disease, coronary artery disease, atrial fibrillation, heart failure, hypertension, and cerebrovascular disease. For vital signs, we selected the maximal, minimal and mean values during ICU admission of the following variables: heart rate, systolic blood pressure (SBP), diastolic blood pressure (DBP), mean arterial pressure (MBP), and pulse oximeter readings of oxygen saturation (SpO₂). For laboratory findings, we selected the maximal, minimal and mean values of the following variables: white blood cell count (WBC), hemoglobin, platelet count, alanine aminotransferase (ALT), aspartate aminotransferase (AST), serum creatinine (Scr), blood urea nitrogen (BUN), serum calcium, serum sodium, serum potassium, serum magnesium, serum phosphate, international normalized ratio (INR), prothrombin time (PT), partial thromboplastin time (PTT), and lactate. Medical treatments were ventilation, renal replacement therapy (RRT), and vasopressor use.

Candidate predictors included demographic characteristics, comorbidities, laboratory tests, and clinical scores. All predictors were extracted from the first 24 h after ICU admission. For SOFA, we used the admission SOFA score as the predictor, whereas MEWS and NEWS were calculated using the worst values within the first 24 h according to the standard equation [21]. The prediction target was in-hospital mortality, defined as death occurring during the index hospitalization. Thus, the prediction horizon was from ICU admission to hospital discharge.
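The eGFR-based staging step can be illustrated with a minimal sketch. The text does not state which version of the CKD-EPI equation was used, so the 2021 race-free creatinine equation is assumed here; the function names are hypothetical, and the stage cut-offs are the standard KDIGO GFR categories. Dialysis status, which Table 1 reports as a separate category, is not handled.

```python
def egfr_ckd_epi_2021(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """eGFR in mL/min/1.73 m^2 from the 2021 CKD-EPI creatinine equation.

    Assumption: the 2021 race-free version; the paper only says "CKD-EPI".
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (
        142.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.200
        * 0.9938 ** age_years
    )
    return egfr * 1.012 if female else egfr


def ckd_stage_from_egfr(egfr: float) -> int:
    """Map eGFR to KDIGO GFR category G1-G5, returned as an integer 1-5."""
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


# Illustrative patient (values chosen for the example, not taken from the data):
# a 75-year-old man whose earliest ICU creatinine is 1.96 mg/dL.
example_egfr = egfr_ckd_epi_2021(1.96, 75, female=False)
example_stage = ckd_stage_from_egfr(example_egfr)
print(f"eGFR = {example_egfr:.1f} mL/min/1.73 m^2, CKD stage {example_stage}")
```

Because the creatinine value is measured during acute illness, a stage derived this way is only an approximation of the chronic baseline, which is the misclassification risk acknowledged above.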

Statistical analysis

Data preprocessing and feature selection

Patients were divided into two groups according to whether they died during hospitalization. Categorical variables were presented as numbers with percentages and compared using Fisher’s exact test or the chi-square test. Continuous variables, expressed as medians with interquartile ranges, were compared using the Wilcoxon rank-sum test.
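As a worked example of the categorical comparison, the sketch below applies a Pearson chi-square test to the RRT counts reported in Table 1 (254 of 874 non-survivors versus 775 of 3812 survivors). It is an illustration only: whether the authors applied a continuity correction is not stated, so none is used here, and the helper name is hypothetical. For one degree of freedom the p-value can be obtained from the complementary error function, so only the standard library is needed.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# RRT use by outcome, taken from Table 1.
died_rrt, died_total = 254, 874
survived_rrt, survived_total = 775, 3812

stat, p_value = chi_square_2x2(
    died_rrt, died_total - died_rrt,
    survived_rrt, survived_total - survived_rrt,
)
print(f"chi-square = {stat:.2f}, p = {p_value:.2e}")
```

The resulting p-value is well below 0.001, in line with the value listed for RRT in Table 1.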

Predictors with more than 30% missingness were excluded. For the remaining variables, missing values were imputed using the MICE algorithm with a random forest method (m = 5, maxit = 5) in R, separately for the MIMIC-IV and eICU cohorts. The outcome variable was not imputed. Additional details, including missingness rates and convergence diagnostics, are provided in the Supplementary Materials. Outliers were identified using the interquartile range (IQR) method: data points below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR were flagged as outliers and replaced by the mean value. Patients from the MIMIC-IV database were randomly divided into a training set and an internal validation set in a 7:3 ratio. The Boruta algorithm was applied for feature selection. For each original feature, a shadow feature was created by permuting its values, yielding a new dataset with twice the number of features. A random forest classifier was then trained on this dataset, and the importance of each feature (original and shadow) was calculated as the mean decrease in Gini impurity. Only original features whose importance score exceeded the highest importance score among the shadow features were considered important [22].
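The shadow-feature comparison at the core of Boruta can be sketched as a single round in Python. This is a simplified illustration, not the authors' pipeline: the study ran Boruta in R, and the full algorithm repeats this comparison over many iterations with a statistical test before confirming or rejecting a feature. The function name and the synthetic data are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def boruta_single_round(X: np.ndarray, y: np.ndarray, seed: int = 0) -> np.ndarray:
    """Return a boolean mask of features that beat the best shadow feature once."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Shadow features: every original column permuted independently,
    # which destroys any relationship with the outcome.
    shadows = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(n_features)]
    )
    X_augmented = np.hstack([X, shadows])  # twice the number of features
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_augmented, y)
    importances = forest.feature_importances_  # mean decrease in Gini impurity
    shadow_max = importances[n_features:].max()
    return importances[:n_features] > shadow_max


# Synthetic data: only the first two columns carry signal.
data_rng = np.random.default_rng(42)
X_demo = data_rng.normal(size=(500, 6))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

selected = boruta_single_round(X_demo, y_demo)
print("features beating the best shadow:", np.flatnonzero(selected))
```

A noise column can occasionally beat the best shadow in a single round by chance, which is why the full algorithm aggregates evidence across repeated rounds.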

Model development and validation

After feature selection, seven ML models, namely K-nearest neighbors (KNN), decision tree, random forest (RF), gradient boosting decision tree (GBDT), support vector machine (SVM), neural network (NN) and extreme gradient boosting (XGBoost), were developed using the training set from the MIMIC-IV database. For XGBoost, we set the scale_pos_weight parameter to the ratio of negative to positive cases to mitigate class imbalance. In addition, traditional logistic regression analysis and the SOFA, NEWS, and MEWS scores were used for mortality prediction. All models were then assessed in the internal validation set from the MIMIC-IV database. Tenfold cross-validation was performed to obtain the average accuracy. Receiver operating characteristic (ROC) curves and precision/recall (P-R) curves were used to assess model performance in the validation set. The model with the highest area under the ROC curve (AUC) and average precision (AP) was regarded as the best. Hyperparameter tuning was conducted by grid search using the “GridSearchCV” function of the “scikit-learn” library in Python, and the optimal hyperparameter settings were determined by 5-fold cross-validation. The model with the best performance in the internal validation set was then tested in the external validation cohort from the eICU-CRD database using AUC, calibration curves and decision curve analysis (DCA). The risk factors most related to in-hospital mortality in the best model were visualized using the Shapley Additive Explanations (SHAP) method. Sensitivity analyses were performed on the best-performing model to assess the robustness of findings under different strategies for handling missing data, including complete-case analysis and alternative imputation settings.
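The tuning and evaluation workflow described above can be sketched as follows. Several parts are assumptions made to keep the example self-contained: the data are synthetic, scikit-learn's GradientBoostingClassifier stands in for XGBoost, the parameter grid is hypothetical (the grid actually searched is not reported), and the split is stratified for stability. The negative-to-positive ratio is computed the way scale_pos_weight is described; here it is applied through sample weights because the stand-in model has no such parameter. For the full development cohort in Table 1 this ratio would be about 3812/874 ≈ 4.4, although the value actually used is not reported.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced data standing in for the cohort (about 19% positives).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.81, 0.19], random_state=0
)

# 7:3 split into training and internal validation sets.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Ratio of negative to positive cases, as used for scale_pos_weight.
n_pos = int(y_train.sum())
n_neg = len(y_train) - n_pos
scale_pos_weight = n_neg / n_pos
sample_weight = np.where(y_train == 1, scale_pos_weight, 1.0)

# Hypothetical grid, tuned by 5-fold cross-validation on the training set.
param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3]}
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",
    cv=5,
)
search.fit(X_train, y_train, sample_weight=sample_weight)

# Discrimination on the held-out internal validation set.
probs = search.predict_proba(X_valid)[:, 1]
auc = roc_auc_score(y_valid, probs)
ap = average_precision_score(y_valid, probs)
print(search.best_params_, f"AUC = {auc:.3f}", f"AP = {ap:.3f}")
```

Tuning only on the training folds and reporting AUC and AP on the untouched validation split keeps the reported performance separate from the hyperparameter search.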

The Boruta algorithm was run in R (version 4.1.3, Austria). The ML algorithms and SHAP analysis were run in Python (version 3.9.12). A two-sided P-value < 0.05 was regarded as statistically significant.

Results

Baseline characteristics

The study flow chart is shown in Fig. 1. According to the inclusion and exclusion criteria, 4686 septic patients with CKD from the MIMIC-IV database were included, of whom 874 (18.65%) died during hospitalization and 3812 (81.35%) survived to discharge. Baseline characteristics of the two groups are shown in Table 1. Patients who died were older [78 (68, 85) vs. 74 (64, 83) years, p < 0.001], had more advanced kidney disease, and had more comorbidities, including chronic pulmonary disease, coronary artery disease, atrial fibrillation and heart failure. More patients in the death group received RRT (29.1% vs. 20.3%, p < 0.001), vasoactive agents (70.1% vs. 44.0%, p < 0.001) and ventilation therapy (69.2% vs. 48.8%, p < 0.001), and their ICU stay was longer [3.83 (1.75, 7.87) days vs. 2.79 (1.53, 5.17) days, p < 0.001]. Among the laboratory results, patients in the death group had a higher white blood cell (WBC) count, lower platelet count, worse hepatic and renal function and a higher lactate level. Systolic blood pressure (SBP) and diastolic blood pressure (DBP) were lower in this group, while heart rate and SOFA score [4 (3, 6) vs. 3 (2, 5), p < 0.001] were higher.

Table 1.

Baseline characteristics of MIMIC-IV patients with sepsis and CKD

	Overall	Survival during hospitalization	Death during hospitalization	p
n	4686	3812	874
age	75.00 [65.00, 83.00]	74.00 [64.00, 83.00]	78.00 [68.00, 85.00]	< 0.001
Male (%)	2869 (61.2)	2340 (61.4)	529 (60.5)	0.666
CKD stage (%)				< 0.001
1	52 (1.1)	46 (1.2)	6 (0.7)
2	342 (7.3)	305 (8.0)	37 (4.2)
3	2190 (46.7)	1829 (48.0)	361 (41.3)
4	970 (20.7)	753 (19.8)	217 (24.8)
5	560 (12.0)	414 (10.9)	146 (16.7)
dialysis	572 (12.2)	465 (12.2)	107 (12.2)
CPD (%)	893 (19.1)	689 (18.1)	204 (23.3)	< 0.001
CAD (%)	1766 (37.7)	1405 (36.9)	361 (41.3)	0.016
Atrial Fibrillation (%)	1988 (42.4)	1555 (40.8)	433 (49.5)	< 0.001
Heart Failure(%)	2411 (51.5)	1911 (50.1)	500 (57.2)	< 0.001
Diabetes (%)	2331 (49.7)	1919 (50.3)	412 (47.1)	0.095
Hypertension (%)	4088 (87.2)	3343 (87.7)	745 (85.2)	0.057
CVD (%)	240 (5.1)	195 (5.1)	45 (5.1)	> 0.99
RRT (%)	1029 (22.0)	775 (20.3)	254 (29.1)	< 0.001
vasoactive_agents (%)	2291 (48.9)	1678 (44.0)	613 (70.1)	< 0.001
ventilation (%)	2466 (52.6)	1861 (48.8)	605 (69.2)	< 0.001
los_hospital (days)	9.00 [5.00, 15.00]	9.00 [6.00, 15.00]	7.00 [3.00, 14.00]	< 0.001
los_icu (days)	2.92 [1.58, 5.71]	2.79 [1.53, 5.17]	3.83 [1.75, 7.87]	< 0.001
wbc_max (K/uL)	15.20 [11.00, 21.00]	14.50 [10.70, 19.80]	19.10 [13.60, 26.15]	< 0.001
wbc_min (K/uL)	6.80 [5.10, 8.80]	6.60 [5.00, 8.40]	8.40 [5.70, 11.90]	< 0.001
wbc_mean (K/uL)	10.35 [7.93, 13.31]	9.95 [7.73, 12.57]	12.81 [9.53, 17.40]	< 0.001
hemoglobin_max (g/dL)	11.12 (1.79)	11.15 (1.78)	11.02 (1.84)	0.061
hemoglobin_min (g/dL)	8.37 (1.80)	8.39 (1.75)	8.26 (1.97)	0.042
hemoglobin_mean (g/dL)	9.66 (1.54)	9.68 (1.51)	9.60 (1.66)	0.191
platelet_max (K/uL)	256.00 [186.00, 350.00]	263.00 [193.00, 355.00]	226.00 [151.50, 319.00]	< 0.001
platelet_min (K/uL)	132.00 [88.00, 189.00]	135.00 [95.00, 190.25]	115.00 [57.00, 178.00]	< 0.001
platelet_mean (K/uL)	188.57 [136.46, 253.52]	192.42 [143.05, 256.00]	166.55 [105.51, 236.95]	< 0.001
alt_max (IU/L)	30.00 [17.00, 78.00]	28.00 [16.00, 62.00]	52.50 [23.00, 193.75]	< 0.001
alt_min (IU/L)	18.00 [11.00, 30.00]	17.00 [11.00, 28.00]	22.00 [12.00, 42.00]	< 0.001
alt_mean (IU/L)	24.67 [15.00, 52.25]	22.67 [14.50, 44.00]	37.00 [18.90, 112.21]	< 0.001
ast_max (IU/L)	45.50 [25.00, 118.00]	40.00 [24.00, 90.00]	95.50 [40.00, 370.25]	< 0.001
ast_min (IU/L)	25.00 [17.00, 38.00]	23.50 [17.00, 34.00]	33.00 [20.00, 65.25]	< 0.001
ast_mean (IU/L)	35.55 [23.00, 70.62]	32.22 [21.67, 58.50]	61.48 [32.00, 196.59]	< 0.001
creatinine_max (mg/dL)	2.60 [1.70, 4.60]	2.40 [1.70, 4.50]	3.50 [2.30, 5.00]	< 0.001
creatinine_min (mg/dL)	1.40 [1.00, 2.10]	1.30 [1.00, 2.00]	1.60 [1.10, 2.60]	< 0.001
creatinine_mean (mg/dL)	1.96 [1.39, 3.24]	1.86 [1.34, 3.11]	2.41 [1.74, 3.60]	< 0.001
bun_max (mg/dL)	57.00 [37.00, 83.00]	54.00 [36.00, 79.00]	71.00 [48.00, 97.00]	< 0.001
bun_min (mg/dL)	23.00 [16.00, 36.00]	22.00 [16.00, 33.00]	31.00 [19.75, 49.00]	< 0.001
bun_mean (mg/dL)	39.33 [27.31, 55.75]	37.48 [26.00, 52.75]	49.08 [34.38, 69.51]	< 0.001
potassium_max (mEq/L)	5.23 (0.93)	5.18 (0.91)	5.44 (0.97)	< 0.001
potassium_min (mEq/L)	3.57 (0.53)	3.55 (0.47)	3.67 (0.72)	< 0.001
potassium_mean (mEq/L)	4.27 (0.47)	4.23 (0.42)	4.41 (0.61)	< 0.001
sodium_max (mEq/L)	143.63 (5.47)	143.50 (4.97)	144.22 (7.23)	< 0.001
sodium_min (mEq/L)	133.87 (5.41)	133.94 (5.03)	133.55 (6.81)	0.052
sodium_mean (mEq/L)	138.82 (4.31)	138.80 (3.89)	138.90 (5.80)	0.532
total_calcium_max (mg/dL)	9.10 [8.60, 9.60]	9.10 [8.60, 9.60]	9.00 [8.50, 9.70]	0.324
total_calcium_min (mg/dL)	7.80 [7.30, 8.30]	7.90 [7.40, 8.30]	7.60 [7.00, 8.10]	< 0.001
total_calcium_mean (mg/dL)	8.47 (0.65)	8.49 (0.60)	8.36 (0.79)	< 0.001
magnesium_max (mg/dL)	2.50 [2.30, 2.80]	2.50 [2.30, 2.80]	2.50 [2.30, 2.80]	0.001
magnesium_min (mg/dL)	1.80 [1.60, 2.00]	1.80 [1.60, 1.90]	1.80 [1.60, 2.00]	0.015
magnesium_mean (mg/dL)	2.11 [1.98, 2.29]	2.10 [1.97, 2.28]	2.15 [2.01, 2.34]	< 0.001
phosphate_max (mg/dL)	5.10 [4.10, 6.60]	4.90 [4.00, 6.30]	6.30 [5.00, 8.00]	< 0.001
phosphate_min (mg/dL)	2.60 [2.00, 3.30]	2.60 [2.00, 3.20]	2.80 [1.90, 4.00]	< 0.001
phosphate_mean (mg/dL)	3.76 [3.20, 4.55]	3.67 [3.14, 4.38]	4.24 [3.49, 5.44]	< 0.001
PT_max (seconds)	23.39 ± 17.19	22.30 ± 15.94	28.09 ± 21.16	< 0.001
PT_min (seconds)	13.70 ± 4.92	13.20 ± 2.99	15.86 ± 9.19	< 0.001
PT_mean (seconds)	16.88 ± 7.39	16.14 ± 5.76	20.09 ± 11.61	< 0.001
PTT_max (seconds)	63.62 ± 43.01	61.06 ± 42.20	74.68 ± 44.74	< 0.001
PTT_min (seconds)	29.26 ± 7.55	28.61 ± 6.16	32.10 ± 11.36	< 0.001
PTT_mean (seconds)	40.73 ± 16.04	39.23 ± 14.51	47.20 ± 20.17	< 0.001
INR_max	2.19 ± 1.72	2.08 ± 0.60	2.65 ± 2.07	< 0.001
INR_min	1.24 ± 0.46	1.20 ± 0.29	1.45 ± 0.84	< 0.001
INR_mean	1.55 ± 0.70	1.48 ± 0.57	1.85 ± 1.05	< 0.001
lactate_max (mmol/L)	2.30 [1.50, 3.90]	2.10 [1.40, 3.30]	3.90 [2.20, 7.73]	< 0.001
lactate_min (mmol/L)	1.10 [0.90, 1.50]	1.10 [0.80, 1.40]	1.30 [1.00, 1.90]	< 0.001
lactate_mean (mmol/L)	1.70 [1.25, 2.35]	1.60 [1.20, 2.14]	2.31 [1.59, 3.95]	< 0.001
sbp_max (mmHg)	159.00 [142.00, 178.00]	160.00 [144.00, 179.00]	155.00 [137.00, 174.00]	< 0.001
sbp_min (mmHg)	81.93 (19.73)	85.41 (17.35)	66.72 (22.16)	< 0.001
sbp_mean (mmHg)	119.65 (16.53)	121.54 (15.93)	111.41 (16.56)	< 0.001
dbp_max (mmHg)	96.00 [83.00, 113.00]	97.00 [83.00, 114.00]	95.00 [81.00, 112.00]	0.015
dbp_min (mmHg)	37.75 (11.33)	39.09 (10.89)	31.92 (11.39)	< 0.001
dbp_mean (mmHg)	59.76 (10.15)	60.51 (10.10)	56.50 (9.76)	< 0.001
mbp_max (mmHg)	112.00 [98.00, 131.00]	112.00 [98.00, 131.00]	112.00 [97.00, 135.00]	0.669
mbp_min (mmHg)	51.00 [43.00, 58.00]	52.00 [45.00, 59.00]	41.75 [28.00, 51.00]	< 0.001
mbp_mean (mmHg)	74.78 [69.14, 81.78]	75.60 [70.00, 82.49]	70.94 [65.99, 77.39]	< 0.001
hr_max (beats/minute)	109.00 [94.00, 128.00]	106.00 [92.00, 124.00]	122.00 [105.00, 140.00]	< 0.001
hr_min (beats/minute)	63.17 (14.36)	63.92 (12.95)	59.87 (19.00)	< 0.001
hr_mean (beats/minute)	83.76 (13.64)	82.58 (13.05)	88.93 (14.88)	< 0.001
spo₂_max	100.00 [100.00, 100.00]	100.00 [100.00, 100.00]	100.00 [100.00, 100.00]	> 0.99
spo₂_min	90.00 [84.00, 92.00]	90.00 [86.00, 93.00]	83.00 [71.00, 90.00]	< 0.001
spo₂_mean	96.78 [95.54, 97.90]	96.83 [95.65, 97.90]	96.52 [94.89, 97.88]	< 0.001
SOFA_score	3.00 [2.00, 5.00]	3.00 [2.00, 5.00]	4.00 [3.00, 6.00]	< 0.001

MIMIC, Medical Information Mart for Intensive Care; CKD, chronic kidney disease; CPD, chronic pulmonary disease; CAD, coronary artery disease; CVD, cerebrovascular disease; RRT, renal replacement therapy; los_hospital, length of hospital stay; los_icu, length of stay in the intensive care unit; max, maximal value; min, minimal value; mean, average value; wbc, white blood cell; alt, alanine transaminase; ast, aspartate aminotransferase; bun, blood urea nitrogen; PT, prothrombin time; PTT, partial thromboplastin time; INR, international normalized ratio; sbp, systolic blood pressure; dbp, diastolic blood pressure; mbp, mean arterial pressure; hr, heart rate; spo₂, pulse oximeter readings of oxygen saturation; SOFA, sequential organ failure assessment

Feature selection

The results of the feature selection process using the Boruta algorithm are shown in Fig. 2. A total of 82 features correlated with in-hospital mortality were selected. According to the Z-values, the top 20 variables associated with in-hospital mortality were minimum SpO₂, minimum SBP, maximal lactate, mean lactate, length of hospital stay, minimum WBC count, minimum heart rate, mean WBC count, mean SBP, minimum mean blood pressure (MBP), minimum lactate, minimum phosphate, length of ICU stay, minimum platelet count, maximal platelet count, maximal sodium, mean MBP, mean phosphate, mean sodium and maximal heart rate. The models were trained to predict in-hospital mortality based on data available within the first 24 h of ICU admission.

Model development and performance assessment

Seven ML models, along with traditional logistic regression analysis and the SOFA score, were developed to predict in-hospital mortality in patients with sepsis and CKD. The ROC and P-R curves of these prediction models are shown in Fig. 3. Three ML models, XGBoost (AUC = 0.911, AP = 0.766), GBDT (AUC = 0.905, AP = 0.751) and SVM (AUC = 0.907, AP = 0.742), performed better than traditional logistic regression analysis (AUC = 0.885, AP = 0.725), the SOFA score (AUC = 0.574, AP = 0.209), the NEWS score (AUC = 0.683, AP = 0.187) and the MEWS score (AUC = 0.741, AP = 0.186). Among all predictive models, XGBoost had the best predictive accuracy for mortality in septic patients with CKD and was therefore chosen for further analysis. Detailed performance is presented in Table 2. The predictive performance of XGBoost remained stable across sensitivity analyses (AUC and AP differences within ± 0.01), indicating that the main findings were not materially influenced by the handling of missing data (Supplementary Table 3). With class-imbalance handling applied, XGBoost showed stable discrimination (AUC 0.911–0.920; AP 0.769–0.796). At the Youden index threshold, the model favored specificity (sensitivity 62%; specificity 96%). When the threshold was adjusted to achieve sensitivity ≥ 95%, sensitivity increased to 95% at the expense of specificity (62%), reflecting the expected trade-off (Supplementary Table 4).
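The two operating points mentioned above (the Youden index threshold and a threshold chosen to reach a target sensitivity) can be illustrated on a toy example. The labels and scores below are invented for illustration and are not study data; the function names are hypothetical. A prediction counts as positive when the score is at or above the threshold.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when score >= threshold predicts death."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(labels, scores):
    """Threshold maximising J = sensitivity + specificity - 1."""
    def youden_j(threshold):
        sens, spec = sensitivity_specificity(labels, scores, threshold)
        return sens + spec - 1.0
    return max(sorted(set(scores)), key=youden_j)


def threshold_for_sensitivity(labels, scores, target):
    """Highest threshold whose sensitivity is at least the target."""
    for threshold in sorted(set(scores), reverse=True):
        if sensitivity_specificity(labels, scores, threshold)[0] >= target:
            return threshold
    raise ValueError("target sensitivity cannot be reached")


# Toy predicted risks and outcomes (1 = died), invented for this example.
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

t_youden = youden_threshold(labels, scores)
t_sens95 = threshold_for_sensitivity(labels, scores, target=0.95)
print("Youden:", t_youden, sensitivity_specificity(labels, scores, t_youden))
print("Sens >= 0.95:", t_sens95, sensitivity_specificity(labels, scores, t_sens95))
```

In this toy data, lowering the threshold from 0.40 to 0.20 raises sensitivity from 0.75 to 1.00 while specificity falls from about 0.83 to 0.50, the same kind of trade-off reported for the model.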

Fig. 3 — ROC and P-R curves of the predictive models. A: ROC curves of the predictive models. B: P-R curves of the predictive models. ROC, receiver operating characteristic; P-R curve, precision/recall curve; SVM, support vector machine; GBDT, gradient boosting decision tree; KNN, k-nearest neighbors; NN, neural network; XGBoost, extreme gradient boosting; SOFA, Sequential Organ Failure Assessment; NEWS, National Early Warning Score; MEWS, Modified Early Warning Score; AUC, area under the receiver operating characteristic curve

Table 2.

The performance of machine learning models

	AUC	Precision	Specificity	Sensitivity
XGBOOST	0.911	0.769	0.960	0.621
SVM	0.907	0.744	0.960	0.511
GBDT	0.904	0.722	0.949	0.576
Logistic Regression	0.885	0.762	0.961	0.550
Random Forest	0.847	0.724	0.949	0.580
NN	0.756	0.617	0.920	0.565
Decision Tree	0.697	0.459	0.860	0.519
KNN	0.488	0.259	0.965	0.053

XGBoost, Extreme Gradient Boosting; SVM, Support Vector Machine; GBDT, Gradient Boosting Decision Tree Machine; NN, Neural Network; KNN, K-nearest neighbors

External validation of XGBoost model

According to the inclusion and exclusion criteria, 3718 patients from the eICU-CRD database were used as the external validation cohort (see Fig. 1). Baseline characteristics are shown in Supplementary Table 1. The average age was 71 years, and women accounted for more than half of the patients (54.6%). More than 90% had CKD stage 3 or above. In this external cohort, 3.7% of patients received RRT and 28.0% received ventilation. During hospitalization, 709 patients (19.07%) died. AUC, the calibration plot and DCA were used to assess the predictive ability of the XGBoost model (Fig. 4). As shown in Fig. 4A, the AUC of the model was 0.855. The calibration curve in Fig. 4B shows that the XGBoost model performed well in the probability range of 10–70%. In Fig. 4C, the DCA curve shows that the XGBoost model provided good net benefit in the threshold probability range of 8–69%, indicating favorable clinical utility.
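The net benefit plotted in a decision curve follows the standard definition: at threshold probability p_t, net benefit = TP/n − (FP/n) × p_t/(1 − p_t), where patients with predicted risk at or above p_t are treated as high risk. The sketch below computes it for a model and for the "treat all" reference strategy; the toy risks and outcomes are invented for illustration and are not study data.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of acting on patients with predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient as high risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy predicted risks and outcomes (1 = died), invented for this example.
probs = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

nb_model = net_benefit(labels, probs, threshold=0.20)
nb_all = net_benefit_treat_all(labels, threshold=0.20)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, treat none: 0.000")
```

A decision curve repeats this calculation over a range of thresholds; the model is considered useful over the range where its net benefit exceeds both the "treat all" and "treat none" (zero) strategies.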

Fig. 4 — ROC, calibration plot and DCA curve of XGBoost in external validation cohort (eICU-CRD). ROC, receiver operating characteristic; AUC, area under the receiver operating characteristic curve; DCA, decision curve analysis; XGBoost, Extreme Gradient Boosting; eICU-CRD, Telehealth Intensive Care Unit Collaborative Research Database

Model explanation by SHAP

The top 20 risk factors associated with in-hospital mortality in septic patients with CKD identified by the XGBoost model are visualized by the SHAP algorithm in Fig. 5A in descending order. The minimum SpO₂ ranked first in predictive power, followed by the minimum SBP and age. Figure 5B shows whether each feature was positively or negatively associated with in-hospital mortality. Minimum SpO₂ was negatively associated with in-hospital mortality, whereas age, maximal sodium and minimum WBC were positively associated with it. Notably, the minimum and maximal phosphate values showed a strong positive correlation with in-hospital mortality in septic patients with CKD. For a more specific explanation of the prediction model, SHAP force plots for a single sample drawn at random from the validation cohort are shown in Fig. 6A and B, illustrating how each variable influenced the predicted outcome during hospitalization. Figure 6C shows SHAP dependence plots for the top 9 features in the XGBoost model. The study population was then divided into dialysis and non-dialysis subgroups. SHAP summary plots of the dialysis subgroup are shown in Supplementary Fig. 1A and Supplementary Fig. 1B. In the dialysis subgroup, minimum WBC, average and maximal lactate, maximal phosphate, minimum SpO₂ and age were the top features for in-hospital mortality prediction. Among them, minimum SpO₂ was negatively associated with in-hospital mortality, and the other five features were positively associated with it. SHAP summary plots of the non-dialysis subgroup are shown in Supplementary Fig. 2A and Supplementary Fig. 2B. In the non-dialysis subgroup, minimum SpO₂, minimum SBP, age, minimum and average WBC, and maximal phosphate were important features; the first two were negatively associated and the latter four positively associated with in-hospital mortality.
In addition, to cross-check SHAP-derived patterns, we conducted univariate restricted cubic spline (RCS) analyses for key predictors (SpO₂, SBP, sodium, and phosphate). These analyses confirmed their significant associations with in-hospital mortality and suggested potential non-linear relationships for these indicators, consistent with SHAP results (Supplementary Fig. 3).
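The additive property that SHAP plots rely on can be shown with a brute-force Shapley calculation on a toy function. This is only an illustration of the underlying idea: the function, inputs and single reference point below are invented, and SHAP libraries use much faster, model-specific algorithms with a background dataset rather than enumerating feature orderings.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a single baseline.

    Each feature's value is its average marginal contribution over all
    orderings in which features are switched from baseline to x.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for j in order:
            current[j] = x[j]
            updated = predict(current)
            phi[j] += updated - previous
            previous = updated
    return [value / len(orderings) for value in phi]


def toy_model(features):
    """Hypothetical score with one main effect and one interaction."""
    return 2.0 * features[0] + features[1] * features[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print("Shapley values:", phi)
print("sum:", sum(phi), "prediction gap:", toy_model(x) - toy_model(baseline))
```

The contributions sum to the difference between the prediction and the baseline prediction, and the interaction term is split equally between the two features involved. The sign of each contribution is what a summary plot such as Fig. 5B displays per patient.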

Fig. 5 — SHAP summary plot for the top 20 features of the XGBoost model. A: Bar charts that rank the top 20 significant variables most correlated to in-hospital mortality; B: Impact of each feature on the in-hospital mortality. XGBoost, Extreme Gradient Boosting; SHAP, Shapley Additive Explanations; min, minimum value; max, maximum value; mean, average value; spo₂, oxyhemoglobin saturation; sbp, systolic blood pressure; wbc, white blood cell; los_hospital, length of hospital stay; hr, heart rate; los_icu, length of stay in intensive care unit

Fig. 6 — SHAP force and dependence plot for the XGBoost model. A and B: SHAP force plot of a sample in the validation cohort. The color represents the contributions of each feature, with red being positive and blue being negative. C: SHAP dependence plot for the top 9 features in the XGBoost model. XGBoost, Extreme Gradient Boosting; SHAP, Shapley Additive Explanations; min, minimum value; max, maximum value; mean, average value; spo2, oxyhemoglobin saturation; sbp, systolic blood pressure; wbc, white blood cell; los_hospital, length of hospital stay; hr, heart rate; los_icu, length of stay in intensive care unit

Discussion

Despite advances in therapy, in-hospital mortality from sepsis remains high, posing a major economic and health challenge. The prevalence of CKD has been increasing rapidly in recent decades. Patients with CKD are more prone to develop sepsis than those without CKD, and CKD is also an important prognostic risk factor in sepsis [23, 24]. Although the prognosis of sepsis has been widely discussed in previous studies, risk factors and prediction models for CKD patients with sepsis have rarely been addressed. In the present study, we focused on ICU patients with sepsis and CKD. We developed and compared seven ML algorithms (XGBoost, GBDT, SVM, NN, KNN, decision tree, and RF) against traditional logistic regression analysis and the SOFA score. Our results showed that XGBoost outperformed the other models. Its performance was then confirmed in an external validation cohort.

Clinical outcomes in CKD patients and septic patients have been discussed separately in several studies using the MIMIC-IV database. Hu C et al. used the MIMIC database and reported an in-hospital mortality of 12.56% in septic patients [1]. In the same database, the in-hospital mortality rate for ICU patients with CKD was reported to be 16.5% [25]. However, mortality when the two conditions coexist has rarely been reported. Two previous studies explored mortality in septic patients with CKD (stage 3 or higher), finding a 90-day mortality rate of 25.8%-36.8% [26, 27]. Our study differs from these previous studies by including patients with all stages of CKD and focusing on in-hospital mortality as the primary outcome. We reported in-hospital mortality rates from two separate ICU databases, 18.65% in the MIMIC-IV database and 19.07% in the eICU database. To our knowledge, this is the first study to report in-hospital mortality for septic patients with a full range of CKD stages.

Our machine learning models identified several key predictors of in-hospital mortality among ICU patients with sepsis and CKD. According to the SHAP analysis, the top four features contributing to in-hospital mortality were SpO₂, SBP, age, and lactate level, all of which are well-recognized prognostic indicators. SHAP patterns and subsequent logistic regression analysis suggested that the minimal value of SBP may have a non-linear association with mortality risk. This observation aligns with previous studies that reported a non-linear or U-shaped relationship between blood pressure and sepsis outcomes, although most prior work focused on MAP rather than SBP [28–30].

Patients with CKD are often affected by internal environment disorders, acid-base imbalances, and electrolyte disturbances. Dysnatremias, which reflect an imbalance between extracellular water and sodium, are common among hospitalized patients, occurring in 10% to 20% of cases [31]. The relationship between dysnatremias and clinical outcomes has been studied in a variety of clinical settings [32]. In CKD patients from a large national cohort of US veterans, both hyper- and hyponatremia were associated with worse clinical outcomes [33, 34]. In the present study, the SHAP results and logistic regression analysis indicated that both hypo- and hypernatremia were associated with higher mortality risk, which is consistent with prior studies reporting worse outcomes in sepsis and CKD patients with dysnatremias [31, 34]. Likewise, phosphate abnormalities were highlighted by SHAP as relevant predictors and were then supported by the logistic regression analysis. While earlier studies primarily emphasized hyperphosphatemia [35–37], our model suggested that both high and low phosphate levels may carry adverse prognostic implications in patients with CKD and sepsis. Nevertheless, these associations should be considered exploratory and hypothesis-generating. Recent expert opinion has even suggested that SHAP should not be considered a reliable tool for explainability in high-risk clinical domains [38]. Similarly, Bienefeld, N. et al. [39] reported that while developers viewed SHAP-based interpretability as useful, clinicians interacting with an AI decision-support prototype found that SHAP explanations did not provide meaningful guidance in practice. These findings underscore that SHAP should not be overstated as a source of clinical interpretability. In our study, we therefore present SHAP results as hypothesis-generating and supplementary to established clinical knowledge, rather than as conclusive explanations. 
The associations between the top indicators and the primary outcome were then analyzed using logistic regression analysis.

Among the top predictors identified by the model, some factors such as SBP and SpO₂ are potentially modifiable physiological variables, and clinicians are inherently interested in their causal effect on patient outcomes. To address such causal questions, predictive models alone are insufficient due to confounding biases. The framework of Target Trial Emulation (TTE) provides a robust methodological paradigm for using observational data to estimate causal effects by explicitly emulating the design of a randomized controlled trial [32]. Future research could build upon our predictive study by designing such an emulated trial.

ML has emerged as a powerful tool for rapid prognosis assessment and has demonstrated its strengths across numerous domains. Its ability to analyze vast amounts of data and make accurate predictions rapidly has transformed various industries. In healthcare, ML aids in early disease detection and personalized treatment planning, which can improve patient outcomes [8, 9, 22]. This is particularly true in critical care medicine, where patients often present with complex multiorgan dysfunction, rendering traditional methods insufficient for prognosis prediction. Numerous studies have applied ML to address this issue using public databases, especially in sepsis. Moor M et al. [21] used four databases to develop a deep learning model for predicting the onset of sepsis. This model was able to detect 80% of sepsis cases three hours before onset, providing a crucial window for intervention. Zhuang, J. et al. [4] developed an XGBoost model using the MIMIC database and validated it in three external cohorts to stratify in-hospital mortality risk in sepsis patients. Hou, N. et al. also used XGBoost to predict 30-day mortality in sepsis [40]. In the study by Kong G et al., RF, least absolute shrinkage and selection operator (LASSO), and gradient boosting machine (GBM) were used to predict in-hospital mortality in sepsis, also using the MIMIC database [41]. Given that sepsis can result in multiorgan dysfunction, several studies have tried to identify or predict organ failure complicating sepsis. The kidney, one of the most commonly affected organs, has received significant attention. Yue S et al. developed and compared seven ML algorithms to predict AKI after sepsis using the MIMIC database, and XGBoost performed best [8]. A similar result was reported by Zhang L et al. [42]. 
Additionally, early recognition and mortality prediction models have been developed for sepsis complicated by acute respiratory distress syndrome (ARDS) [43], delirium [44], encephalopathy [45] and coagulopathy [46].

There are many types of algorithms in ML, each with its own strengths. Among the fundamental algorithms, KNN, decision tree, RF, GBDT, SVM, NN and XGBoost are commonly used and have shown high accuracy in disease detection and prognosis prediction. KNN is an easily understandable algorithm for classification problems and has shown some accuracy in disease detection, but its computation becomes expensive as the number of attributes increases [47]. A decision tree is a tree-like model that partitions data items into branches and can be used for classification and regression tasks. It has been applied to electrocardiographic (ECG) signal recognition and disease management [48]. However, decision trees are prone to overfitting, and their performance depends on the dataset features. RF is an ensemble of multiple decision trees that offers greater stability with less overfitting, at the cost of more complex and expensive computation [49]. SVM can efficiently perform non-linear classification and works well in high-dimensional spaces. NN, inspired by the neural networks of the human brain, is more suitable for complicated, non-linear variables and has shown good ability in Alzheimer’s disease diagnosis and early detection of other chronic diseases [50]. For complex problems, however, its computation is expensive. GBDT and XGBoost are known for their high accuracy in handling complex, non-linear relationships, especially in large datasets, and for their speed. Several studies have demonstrated the reliability of their prognostic predictions [22, 40, 51, 52]. However, these two algorithms behave to some extent like a “black box”. Although tools for feature importance and model interpretation are available (such as SHAP, used in the present study), the results remain less interpretable. These models have proved to be effective and reliable tools for prognostic prediction and clinical decision-making in many medical scenarios [53]. 
In the present study, we chose these ML algorithms to determine which one performed best in CKD patients with sepsis, a more complicated and life-threatening condition. Our study showed that the XGBoost model performed best among the traditional and newer ML methods.

When CKD patients are admitted to the ICU due to sepsis, ML, especially XGBoost, can help physicians predict in-hospital mortality (with a specificity of 96% and a sensitivity of 62%). In clinical practice, the model can be used during hospitalization once sepsis is diagnosed, aiding clinicians in assessing mortality risk and tailoring management strategies. This approach provides doctors with a more accurate tool to assess patient risk and make timely interventions that may improve patient outcomes. By harnessing the power of ML, healthcare professionals can benefit from advanced algorithms that analyze multiple variables and patterns to provide valuable insights and predictions. This innovative approach has the potential to enhance clinical decision-making and improve patient care in critical care settings.
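As a rough aid to reading the reported sensitivity and specificity, the sketch below converts them into positive and negative predictive values using Bayes' rule. The resulting values are not reported in this study; they are derived here under the assumption that the reported operating point (sensitivity 62%, specificity 96%) applies to a population with the 18.65% in-hospital mortality observed in the MIMIC-IV cohort. The function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test at a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(death | predicted death)
    npv = true_neg / (true_neg + false_neg)  # P(survival | predicted survival)
    return ppv, npv


# Assumed inputs: reported sensitivity/specificity and MIMIC-IV mortality rate
ppv, npv = predictive_values(sensitivity=0.62, specificity=0.96, prevalence=0.1865)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under these assumptions, a high-risk prediction would be correct in roughly three out of four cases, while about 38% of deaths would still be missed, so the threshold would need to be reconsidered if the model were intended for screening rather than confirmation.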

Sepsis is a heterogeneous syndrome with diverse pathogens, immune responses, and organ dysfunction patterns, which complicates both treatment and predictive modeling. Although restricting our cohort to CKD patients adds some homogeneity, unrecognized subphenotypes (e.g., hyperinflammatory vs. immunoparalysis) likely remain, and model performance may differ across them. This study should therefore be seen as a foundational step. Future work will use approaches such as latent class analysis or clustering to identify subphenotypes in sepsis with CKD, validate the model within these groups, and potentially develop tailored models to advance precision medicine [54].

Although mortality prediction in sepsis has been extensively studied, our work provides several distinct contributions. First, unlike most prior studies that examined the general sepsis population, we specifically focused on critically ill patients with concomitant CKD, a subgroup with particularly poor outcomes but limited evidence for risk stratification. Second, we adopted a comprehensive feature selection strategy using the Boruta algorithm, which is more suitable for machine learning approaches than traditional logistic-based methods. Third, we systematically compared multiple machine learning models and incorporated SHAP analysis to improve interpretability, thus bridging the gap between predictive performance and clinical applicability.

Limitations

Our study had a few limitations. First, its retrospective and observational design may have introduced selection bias, although external validation partly mitigated this concern. Second, some variables with high missingness were excluded, despite the use of statistical imputation to retain as many predictors as possible. Third, class imbalance was present (∼19% mortality), but discrimination metrics that are robust to class imbalance, including the AUC and the PR curve, were reported and scale_pos_weight was tuned in tree-based models. Fourth, baseline SOFA scores were assumed to be zero due to lack of pre-ICU data, and CKD chronicity could not be fully confirmed, introducing possible misclassification. Although we used the earliest creatinine measurement to reduce the impact of acute changes related to sepsis, some degree of misclassification bias is possible, and our findings should be interpreted in this context. Fifth, our study was based on publicly available databases, which may limit the generalizability of the findings to other healthcare systems. To address this, our team has already initiated a prospective study focusing on patients with CKD complicated by cardiovascular disease and critical illness in our institution, which is currently undergoing ethical approval and registration. Future work will incorporate data from this study and other multicenter cohorts to further validate and refine the proposed models.
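The handling of class imbalance mentioned above can be illustrated with a short, dependency-free sketch. It shows the negative-to-positive ratio that the XGBoost documentation suggests as a typical starting value for `scale_pos_weight`, together with plain-Python versions of the AUC and average precision (AP). The tuned value used in this study is not stated here, and the toy labels and scores are invented for the example.

```python
def scale_pos_weight_from_rate(event_rate):
    """Negative-to-positive ratio, a common starting point before tuning."""
    return (1 - event_rate) / event_rate


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Starting weight implied by the 18.65% mortality in the development cohort
spw = scale_pos_weight_from_rate(0.1865)

# Hypothetical labels (1 = died) and predicted risks, for illustration only
y = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
risk = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.35, 0.2, 0.1]
auc = roc_auc(y, risk)
ap = average_precision(y, risk)
print(f"scale_pos_weight start = {spw:.2f}, AUC = {auc:.3f}, AP = {ap:.3f}")
```

AP is anchored to the event rate (a random ranking yields an AP near the prevalence), whereas the AUC of a random ranking is 0.5 regardless of prevalence, which is one reason to report both metrics when outcomes are imbalanced.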

Conclusion

ML algorithms, especially XGBoost, can accurately predict in-hospital mortality for ICU patients with sepsis and CKD. Several variables were identified as mortality predictors, which may assist clinicians in tailoring early and precise management to reduce mortality.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (712.6 KB, docx)

Acknowledgements

The authors acknowledge all participants in the MIMIC-IV and eICU-CRD teams for survey design and data collection.

Abbreviations

CKD: Chronic kidney disease
MIMIC-IV: Medical Information Mart for Intensive Care IV
eICU-CRD: Telehealth Intensive Care Unit Collaborative Research Database
LR: Logistic regression
RF: Random forest
GBDT: Gradient boosting decision tree
KNN: K-nearest neighbors
SVM: Support Vector Machine
NN: Neural Network
XGBoost: Extreme Gradient Boosting
SOFA: Sequential Organ Failure Assessment
APACHE II: Acute Physiology and Chronic Health Evaluation II
NEWS: National Early Warning Score
MEWS: Modified Early Warning Score
PR curve: Precision/Recall curve
AP: Average precision
AUC: Area under the receiver operating characteristic curve
SHAP: Shapley Additive Explanations
ML: Machine learning
ICD-9: International Classification of Diseases, Ninth Revision
ICU: Intensive care unit
DCA: Decision curve analysis
los_icu: Length of stay in intensive care unit
los_hospital: Length of hospital stay
bun: Blood urea nitrogen
eGFR: Estimated glomerular filtration rate
CPD: Chronic pulmonary disease
CAD: Coronary artery disease
CVD: Cerebrovascular disease
RRT: Renal replacement therapy
max: Maximum
min: Minimum
wbc: White blood cell
alt: Alanine aminotransferase
ast: Aspartate aminotransferase
alp: Alkaline phosphatase
PT: Prothrombin time
PTT: Partial thromboplastin time
INR: International normalized ratio
sbp: Systolic blood pressure
dbp: Diastolic blood pressure
mbp: Mean blood pressure
hr: Heart rate
spo₂: Oxyhemoglobin saturation

Author contributions

Jingang Zheng, Jiahui Li, and Shuoyan An contributed to the study design. Shuoyan An and Zixiang Ye contributed to data collection and manuscript writing. Zixiang Ye and Wuqiang Che contributed to data processing. Yanxiang Gao contributed to the figure mapping. Jiahui Li reviewed and edited the manuscript. All the authors have read and agreed to the published version of the manuscript.

Funding

This study was financially supported by grants from the Capital’s Funds for Health Improvement and Research (no. 2022-1-4062), National High Level Hospital Clinical Research Funding (no. 2022-NHLHCRF-YSPY-01), Beijing Research Ward Construction Clinical Research Project (no. 2022-YJXBF-04-03), National Key Clinical Specialty Construction Project (no. 2020-QTL-009), and Chinese Society of Cardiology’s Foundation (No. CSCF2021B02).

Data availability

The data used in the present study are available from MIMIC-IV and eICU-CRD, which can be freely accessed via PhysioNet at https://physionet.org/content/mimiciv/2.2/ and https://physionet.org/content/eicu-crd/2.0/, respectively. Data are available from the author Shuoyan An (anshuoyan@126.com) upon reasonable request and with permission of MIMIC-IV and eICU-CRD.

Declarations

Ethics approval and consent to participate

We used the MIMIC-IV and eICU-CRD databases in the present study. Data in the two databases were publicly available and all patient data were de-identified. The use of the MIMIC-IV and eICU has been approved by the Institutional Review Board of Beth Israel Deaconess Medical Center (BIDMC) and Massachusetts Institute of Technology, and an informed consent waiver has been obtained. One of the authors (ASY) completed the web-based course and obtained permission for the database (certificate number: 39674606). All procedures performed in this study were in accordance with the Declaration of Helsinki and relevant guidelines.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Jiahui Li, Email: veighlee@163.com.

Jingang Zheng, Email: mdjingangzheng@yeah.net.

References

1. Hu C, et al. Interpretable machine learning for early prediction of prognosis in sepsis: A discovery and validation study. Infect Dis Ther. 2022;11(3):1117–32.
2. Bomrah S, et al. A scoping review of machine learning for sepsis prediction- feature engineering strategies and model performance: a step towards explainability. Crit Care. 2024;28(1):180.
3. Liyanarachi KV, et al. Chronic kidney disease and risk of bloodstream infections and sepsis: a 17-year follow-up of the population-based Trøndelag Health Study in Norway. Infection. 2024.
4. Zhuang J, et al. A generalizable and interpretable model for mortality risk stratification of sepsis patients in intensive care unit. BMC Med Inf Decis Mak. 2023;23(1):185.
5. Saran R, et al. US Renal Data System 2017 Annual Data Report: Epidemiology of Kidney Disease in the United States. Am J Kidney Dis. 2018;71(3 Suppl 1):A7.
6. Ganz T, et al. Iron Administration, infection, and anemia management in CKD: untangling the effects of intravenous iron therapy on immunity and infection risk. Kidney Med. 2020;2(3):341–53.
7. Donaldson LH, et al. Quantifying the impact of alternative definitions of sepsis-associated acute kidney injury on its incidence and outcomes: A systematic review and meta-analysis. Crit Care Med. 2024.
8. Yue S, et al. Machine learning for the prediction of acute kidney injury in patients with sepsis. J Transl Med. 2022;20(1):215.
9. Zarbock A, et al. Sepsis-associated acute kidney injury: consensus report of the 28th acute disease quality initiative workgroup. Nat Rev Nephrol. 2023;19(6):401–17.
10. Yang S, et al. Unraveling the genetic and molecular landscape of sepsis and acute kidney injury: A comprehensive GWAS and machine learning approach. Int Immunopharmacol. 2024;137:112420.
11. Haley M, et al. Fluid resuscitation and sepsis management in patients with chronic kidney disease or End-Stage renal disease: scoping review. Am J Crit Care. 2024;33(1):45–53.
12. Biebelberg B, et al. Heterogeneity of sepsis presentations and mortality rates. Ann Intern Med. 2024.
13. G A, K LN. Improving sepsis classification performance with artificial intelligence algorithms: A comprehensive overview of healthcare applications. J Crit Care. 2024;83:154815.
14. Hu C, et al. Application of machine learning for clinical subphenotype identification in sepsis. Infect Dis Ther. 2022;11(5):1949–64.
15. Lee KH, et al. Artificial intelligence for risk prediction of end-stage renal disease in sepsis survivors with chronic kidney disease. Biomedicines. 2022;10(3).
16. Fan Z, et al. Construction and validation of prognostic models in critically ill patients with sepsis-associated acute kidney injury: interpretable machine learning approach. J Transl Med. 2023;21(1):406.
17. Johnson A, et al. Mimic-iv. PhysioNet. 2020. Available online at: https://physionet.org/content/mimiciv/1.0/. (Accessed August 23, 2021)
18. Pollard TJ, et al. The eICU collaborative research Database, a freely available multi-center database for critical care research. Sci Data. 2018;5:180178.
19. Singer M, et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA. 2016;315(8):801–10.
20. KDIGO 2024 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease. Kidney Int. 2024;105(4s):S117–314.
21. Moor M, et al. Predicting sepsis using deep learning across international sites: a retrospective development and validation study. EClinicalMedicine. 2023;62:102124.
22. Ye Z, et al. The prediction of in-hospital mortality in chronic kidney disease patients with coronary artery disease using machine learning models. Eur J Med Res. 2023;28(1):33.
23. Leelahavanichkul A, et al. Chronic kidney disease worsens sepsis and sepsis-induced acute kidney injury by releasing high mobility group box Protein-1. Kidney Int. 2011;80(11):1198–211.
24. Yang HY, et al. Reduced risk of sepsis and related mortality in chronic kidney disease patients on Xanthine oxidase inhibitors: A National cohort study. Front Med (Lausanne). 2021;8:818132.
25. Li X, et al. Machine learning algorithm to predict the in-hospital mortality in critically ill patients with chronic kidney disease. Ren Fail. 2023;45(1):2212790.
26. Neyra JA, et al. Impact of acute kidney injury and CKD on adverse outcomes in critically ill septic patients. Kidney Int Rep. 2018;3(6):1344–53.
27. Rimes-Stigare C, et al. Long-term mortality and risk factors for development of end-stage renal disease in critically ill patients with and without chronic kidney disease. Crit Care. 2015;19:383.
28. Asfar P, et al. High versus low blood-pressure target in patients with septic shock. N Engl J Med. 2014;370(17):1583–93.
29. Khanna AK, et al. Association of systolic, diastolic, mean, and pulse pressure with morbidity and mortality in septic ICU patients: a nationwide observational study. Ann Intensiv Care. 2023;13(1):9.
30. Zhu JL, et al. Influence of systolic blood pressure trajectory on in-hospital mortality in patients with sepsis. BMC Infect Dis. 2023;23(1):90.
31. Arzhan S, et al. Dysnatremias in chronic kidney disease: Pathophysiology, Manifestations, and treatment. Front Med (Lausanne). 2021;8:769287.
32. Yang J, et al. A comprehensive step-by-step approach for the implementation of target trial emulation: evaluating fluid resuscitation strategies in post-laparoscopic septic shock as an example. Laparosc Endoscopic Robotic Surg. 2025;8(1):28–44.
33. Kovesdy CP, et al. Hyponatremia, hypernatremia, and mortality in patients with chronic kidney disease with and without congestive heart failure. Circulation. 2012;125(5):677–84.
34. Castello LM, et al. Hypernatremia and moderate-to-severe hyponatremia are independent predictors of mortality in septic patients at emergency department presentation: A sub-group analysis of the need-speed trial. Eur J Intern Med. 2021;83:21–7.
35. Al Harbi SA, et al. Association between phosphate disturbances and mortality among critically ill patients with sepsis or septic shock. BMC Pharmacol Toxicol. 2021;22(1):30.
36. Wang H, et al. Hyperphosphatemia rather than hypophosphatemia indicates a poor prognosis in patients with sepsis. Clin Biochem. 2021;91:9–15.
37. Li Z, Shen T, Han Y. Effect of serum phosphate on the prognosis of septic patients: A retrospective study based on MIMIC-IV database. Front Med (Lausanne). 2022;9:728887.
38. Huang X, Marques-Silva J. On the failings of Shapley values for explainability. Int J Approximate Reasoning. 2024;171:109112.
39. Bienefeld N, et al. Solving the explainable AI conundrum by bridging clinicians’ needs and developers’ goals. Npj Digit Med. 2023;6(1):94.
40. Hou N, et al. Predicting 30-days mortality for MIMIC-III patients with sepsis-3: a machine learning approach using XGboost. J Transl Med. 2020;18(1):462.
41. Kong G, Lin K, Hu Y. Using machine learning methods to predict in-hospital mortality of sepsis patients in the ICU. BMC Med Inf Decis Mak. 2020;20(1):251.
42. Zhang L, et al. Developing an ensemble machine learning model for early prediction of sepsis-associated acute kidney injury. iScience. 2022;25(9):104932.
43. Bai Y, et al. Using machine learning for the early prediction of sepsis-associated ARDS in the ICU and identification of clinical phenotypes with differential responses to treatment. Front Physiol. 2022;13:1050849.
44. Zhang Y, et al. Development of a machine learning-based prediction model for sepsis-associated delirium in the intensive care unit. Sci Rep. 2023;13(1):12697.
45. Lu X, et al. Clinical phenotypes of sepsis-associated encephalopathy: a retrospective cohort study. Shock. 2023;59(4):583–90.
46. Zhao QY, et al. A Machine-Learning approach for dynamic prediction of sepsis-Induced coagulopathy in critically ill patients with sepsis. Front Med (Lausanne). 2020;7:637434.
47. Zaar O, et al. Evaluation of the diagnostic accuracy of an online artificial intelligence application for skin disease diagnosis. Acta Derm Venereol. 2020;100(16):adv00260.
48. Domingues R, et al. Evaluation of the responsiveness pattern to caffeine through a smart data-driven ECG non-linear multi-band analysis. Heliyon. 2024;10(11):e31721.
49. Shi T, Horvath S. Unsupervised learning with random forest predictors. J Comput Graphical Stat. 2006;15(1):118–38.
50. Zhang Y, et al. Multi-modal graph neural network for early diagnosis of alzheimer’s disease from sMRI and PET scans. Comput Biol Med. 2023;164:107328.
51. Hsiao YW, et al. A risk prediction model of gene signatures in ovarian cancer through bagging of GA-XGBoost models. J Adv Res. 2021;30:113–22.
52. Lv H, et al. Machine Learning-Driven models to predict prognostic outcomes in patients hospitalized with heart failure using electronic health records: retrospective study. J Med Internet Res. 2021;23(4):e24996.
53. Uddin S, et al. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inf Decis Mak. 2019;19(1):281.
54. Yang J, et al. Identification of clinical subphenotypes of sepsis after laparoscopic surgery. Laparosc Endoscopic Robotic Surg. 2024;7(1):16–26.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Material 1^{(712.6KB, docx)}

Data Availability Statement

[CR1] 1.Hu C, et al. Interpretable machine learning for early prediction of prognosis in sepsis: A discovery and validation study. Infect Dis Ther. 2022;11(3):1117–32. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR2] 2.Bomrah S, et al. A scoping review of machine learning for sepsis prediction- feature engineering strategies and model performance: a step towards explainability. Crit Care. 2024;28(1):180. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR3] 3.Liyanarachi KV et al. Chronic kidney disease and risk of bloodstream infections and sepsis: a 17-year follow-up of the population-based Trøndelag Health Study in Norway. Infection. 2024. [DOI] [PMC free article] [PubMed]

[CR4] 4.Zhuang J, et al. A generalizable and interpretable model for mortality risk stratification of sepsis patients in intensive care unit. BMC Med Inf Decis Mak. 2023;23(1):185. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR5] 5.Saran R, et al. Epidemiology of Kidney Disease in the United States. Am J Kidney Dis. 2018;71(3 Suppl 1):A7. US Renal Data System 2017 Annual Data Report. [DOI] [PMC free article] [PubMed]

[CR6] 6.Ganz T, et al. Iron Administration, infection, and anemia management in CKD: untangling the effects of intravenous iron therapy on immunity and infection risk. Kidney Med. 2020;2(3):341–53. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR7] 7.Donaldson LH, et al. Quantifying the impact of alternative definitions of sepsis-associated acute kidney injury on its incidence and outcomes: A systematic review and meta-analysis. Crit Care Med. 2024. [DOI] [PubMed]

[CR8] 8.Yue S, et al. Machine learning for the prediction of acute kidney injury in patients with sepsis. J Transl Med. 2022;20(1):215. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR9] 9.Zarbock A, et al. Sepsis-associated acute kidney injury: consensus report of the 28th acute disease quality initiative workgroup. Nat Rev Nephrol. 2023;19(6):401–17. [DOI] [PubMed] [Google Scholar]

[CR10] 10.Yang S, et al. Unraveling the genetic and molecular landscape of sepsis and acute kidney injury: A comprehensive GWAS and machine learning approach. Int Immunopharmacol. 2024;137:112420. [DOI] [PubMed] [Google Scholar]

[CR11] 11.Haley M, et al. Fluid resuscitation and sepsis management in patients with chronic kidney disease or End-Stage renal disease: scoping review. Am J Crit Care. 2024;33(1):45–53. [DOI] [PubMed] [Google Scholar]

[CR12] 12.Biebelberg B, et al. Heterogeneity of sepsis presentations and mortality rates. Ann Intern Med. 2024. [DOI] [PubMed]

[CR13] 13.G A, K LN. Improving sepsis classification performance with artificial intelligence algorithms: A comprehensive overview of healthcare applications. J Crit Care. 2024;83:154815. [DOI] [PubMed] [Google Scholar]

[CR14] 14.Hu C, et al. Application of machine learning for clinical subphenotype identification in sepsis. Infect Dis Ther. 2022;11(5):1949–64. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR15] 15.Lee KH, et al. Artificial intelligence for risk prediction of end-stage renal disease in sepsis survivors with chronic kidney disease. Biomedicines. 2022;10(3). [DOI] [PMC free article] [PubMed]

[CR16] 16.Fan Z, et al. Construction and validation of prognostic models in critically ill patients with sepsis-associated acute kidney injury: interpretable machine learning approach. J Transl Med. 2023;21(1):406. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR17] 17.Johnson A, et al. MIMIC-IV. PhysioNet. 2020. Available online at: https://physionet.org/content/mimiciv/1.0/. (Accessed August 23, 2021)

[CR18] 18.Pollard TJ, et al. The eICU collaborative research Database, a freely available multi-center database for critical care research. Sci Data. 2018;5:180178. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR19] 19.Singer M, et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA. 2016;315(8):801–10. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR20] 20.KDIGO 2024 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease. Kidney Int. 2024;105(4s):S117–314. [DOI] [PubMed] [Google Scholar]

[CR21] 21.Moor M, et al. Predicting sepsis using deep learning across international sites: a retrospective development and validation study. EClinicalMedicine. 2023;62:102124. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR22] 22.Ye Z, et al. The prediction of in-hospital mortality in chronic kidney disease patients with coronary artery disease using machine learning models. Eur J Med Res. 2023;28(1):33. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR23] 23.Leelahavanichkul A, et al. Chronic kidney disease worsens sepsis and sepsis-induced acute kidney injury by releasing high mobility group box Protein-1. Kidney Int. 2011;80(11):1198–211. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR24] 24.Yang HY, et al. Reduced risk of sepsis and related mortality in chronic kidney disease patients on Xanthine oxidase inhibitors: A National cohort study. Front Med (Lausanne). 2021;8:818132. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR25] 25.Li X, et al. Machine learning algorithm to predict the in-hospital mortality in critically ill patients with chronic kidney disease. Ren Fail. 2023;45(1):2212790. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR26] 26.Neyra JA, et al. Impact of acute kidney injury and CKD on adverse outcomes in critically ill septic patients. Kidney Int Rep. 2018;3(6):1344–53. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR27] 27.Rimes-Stigare C, et al. Long-term mortality and risk factors for development of end-stage renal disease in critically ill patients with and without chronic kidney disease. Crit Care. 2015;19:383. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR28] 28.Asfar P, et al. High versus low blood-pressure target in patients with septic shock. N Engl J Med. 2014;370(17):1583–93. [DOI] [PubMed] [Google Scholar]

[CR29] 29.Khanna AK, et al. Association of systolic, diastolic, mean, and pulse pressure with morbidity and mortality in septic ICU patients: a nationwide observational study. Ann Intensiv Care. 2023;13(1):9. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR30] 30.Zhu JL, et al. Influence of systolic blood pressure trajectory on in-hospital mortality in patients with sepsis. BMC Infect Dis. 2023;23(1):90. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR31] 31.Arzhan S, et al. Dysnatremias in chronic kidney disease: Pathophysiology, Manifestations, and treatment. Front Med (Lausanne). 2021;8:769287. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR32] 32.Yang J, et al. A comprehensive step-by-step approach for the implementation of target trial emulation: evaluating fluid resuscitation strategies in post-laparoscopic septic shock as an example. Laparosc Endoscopic Robotic Surg. 2025;8(1):28–44. [Google Scholar]

[CR33] 33.Kovesdy CP, et al. Hyponatremia, hypernatremia, and mortality in patients with chronic kidney disease with and without congestive heart failure. Circulation. 2012;125(5):677–84. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR34] 34.Castello LM, et al. Hypernatremia and moderate-to-severe hyponatremia are independent predictors of mortality in septic patients at emergency department presentation: A sub-group analysis of the need-speed trial. Eur J Intern Med. 2021;83:21–7. [DOI] [PubMed] [Google Scholar]

[CR35] 35.Al Harbi SA, et al. Association between phosphate disturbances and mortality among critically ill patients with sepsis or septic shock. BMC Pharmacol Toxicol. 2021;22(1):30. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR36] 36.Wang H, et al. Hyperphosphatemia rather than hypophosphatemia indicates a poor prognosis in patients with sepsis. Clin Biochem. 2021;91:9–15. [DOI] [PubMed] [Google Scholar]

[CR37] 37.Li Z, Shen T, Han Y. Effect of serum phosphate on the prognosis of septic patients: A retrospective study based on MIMIC-IV database. Front Med (Lausanne). 2022;9:728887. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR38] 38.Huang X, Marques-Silva J. On the failings of Shapley values for explainability. Int J Approximate Reasoning. 2024;171:109112. [Google Scholar]

[CR39] 39.Bienefeld N, et al. Solving the explainable AI conundrum by bridging clinicians’ needs and developers’ goals. Npj Digit Med. 2023;6(1):94. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR40] 40.Hou N, et al. Predicting 30-days mortality for MIMIC-III patients with sepsis-3: a machine learning approach using XGboost. J Transl Med. 2020;18(1):462. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR41] 41.Kong G, Lin K, Hu Y. Using machine learning methods to predict in-hospital mortality of sepsis patients in the ICU. BMC Med Inf Decis Mak. 2020;20(1):251. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR42] 42.Zhang L, et al. Developing an ensemble machine learning model for early prediction of sepsis-associated acute kidney injury. iScience. 2022;25(9):104932. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR43] 43.Bai Y, et al. Using machine learning for the early prediction of sepsis-associated ARDS in the ICU and identification of clinical phenotypes with differential responses to treatment. Front Physiol. 2022;13:1050849. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR44] 44.Zhang Y, et al. Development of a machine learning-based prediction model for sepsis-associated delirium in the intensive care unit. Sci Rep. 2023;13(1):12697. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR45] 45.Lu X, et al. Clinical phenotypes of sepsis-associated encephalopathy: a retrospective cohort study. Shock. 2023;59(4):583–90. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR46] 46.Zhao QY, et al. A Machine-Learning approach for dynamic prediction of sepsis-Induced coagulopathy in critically ill patients with sepsis. Front Med (Lausanne). 2020;7:637434. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR47] 47.Zaar O, et al. Evaluation of the diagnostic accuracy of an online artificial intelligence application for skin disease diagnosis. Acta Derm Venereol. 2020;100(16):adv00260. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR48] 48.Domingues R, et al. Evaluation of the responsiveness pattern to caffeine through a smart data-driven ECG non-linear multi-band analysis. Heliyon. 2024;10(11):e31721. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR49] 49.Shi T, Horvath S. Unsupervised learning with random forest predictors. J Comput Graphical Stat. 2006;15(1):118–38. [Google Scholar]

[CR50] 50.Zhang Y, et al. Multi-modal graph neural network for early diagnosis of alzheimer’s disease from sMRI and PET scans. Comput Biol Med. 2023;164:107328. [DOI] [PubMed] [Google Scholar]

[CR51] 51.Hsiao YW, et al. A risk prediction model of gene signatures in ovarian cancer through bagging of GA-XGBoost models. J Adv Res. 2021;30:113–22. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR52] 52.Lv H, et al. Machine Learning-Driven models to predict prognostic outcomes in patients hospitalized with heart failure using electronic health records: retrospective study. J Med Internet Res. 2021;23(4):e24996. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR53] 53.Uddin S, et al. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inf Decis Mak. 2019;19(1):281. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR54] 54.Yang J, et al. Identification of clinical subphenotypes of sepsis after laparoscopic surgery. Laparosc Endoscopic Robotic Surg. 2024;7(1):16–26. [Google Scholar]
