ESC Heart Failure. 2024 Sep 19;12(1):353–368. doi: 10.1002/ehf2.15062

Explainable machine learning and online calculators to predict heart failure mortality in intensive care units

An‐Tian Chen 1,2,3, Yuhui Zhang 2, Jian Zhang 2,4
PMCID: PMC11769656  PMID: 39300773

Abstract

Aims

This study aims to develop explainable machine learning models and clinical tools for predicting mortality in patients in the intensive care unit (ICU) with heart failure (HF).

Methods

Patients diagnosed with HF who experienced their first ICU stay lasting between 24 h and 28 days were selected from the Medical Information Mart for Intensive Care IV (MIMIC‐IV) database. The primary outcome was all‐cause mortality within 28 days. Data analysis was performed using Python and R, with feature selection conducted via least absolute shrinkage and selection operator (LASSO) regression. Fifteen models were evaluated, and the most effective model was rendered explainable through the Shapley additive explanations (SHAP) approach. A nomogram was developed based on logistic regression to facilitate interpretation. For external validation, the eICU database was utilized.

Results

After selection, the study included 2343 records, with 1808 surviving and 535 deceased patients. The median age of the study population was 70.00 years, and 60.31% were male. The median length of stay in the ICU was 6.00 days. The survival group was younger than the non‐survival group (median age 69.00 vs. 73.00 years), and non‐survival patients stayed longer in the ICU. Seventy‐five features were initially selected, including basic information, vital signs, laboratory tests, haemodynamics and oxygen status. LASSO regression determined the shrinkage parameter α = 0.020, and 44 features were chosen for model construction. The linear discriminant analysis (LDA) model showed the best performance, with an accuracy of 0.8354 in the training cohort and 0.8563 in the testing cohort. It also showed satisfactory area under the curve (AUC), recall, precision, F1 score, Cohen's kappa score and Matthews correlation coefficient. The concordance index (c‐index) reached 0.7972 in the training cohort and 0.8125 in the testing cohort. In external validation, the LDA model achieved approximately 0.9 in accuracy, precision, recall and F1 score, with an AUC of 0.79. Univariable analysis was performed in the training cohort. Features that differed significantly between the survival and non‐survival groups were subjected to multiple logistic regression. The nomogram built on multiple logistic regression included 14 features. Its AUC was 0.852 in the training cohort, 0.855 in the internal validation cohort and 0.770 in the external validation cohort. The calibration curve showed good consistency.

Conclusions

The study developed an LDA and a nomogram model for predicting mortality in HF patients in the ICU. The SHAP approach was employed to elucidate the LDA model, enhancing its utility for clinicians. These models were made accessible online for clinical application.

Keywords: heart failure, machine learning, nomogram, predictive model

Introduction

Heart failure (HF) is a devastating end‐stage condition for patients with heart disease and has been recognized as a growing epidemic since 1997. 1 Globally, an estimated 64.3 million people are living with HF, and known HF accounts for 1%–2% of the general adult population in developed countries. 2 In the United States, the National Health and Nutrition Examination Survey (NHANES), a well‐known database containing data on the health and nutritional status of the population, estimated the prevalence of HF to be 2.5%. 3 , 4 HF substantially increases the risk of all‐cause death, and this risk is more acute if the condition appears early. 5 In this study, we aim to conduct a retrospective analysis using a publicly available database to predict the 28 day in‐hospital all‐cause mortality of HF patients in the intensive care unit (ICU).

HF is a complex disorder affecting multiple systems and characterized by significant disturbances in circulatory homeostasis and a range of myocardial structural and functional changes that impair cardiac pumping capacity in the systolic stage and heart filling in the diastolic stage. 6 Patients in the ICU are critically ill and face a high risk of mortality, with an all‐cause mortality rate of up to 52.3%, and HF has been identified in 32.4% of ICU patients. 7

Machine learning is a powerful and increasingly popular approach in many fields, and medicine is no exception. 8 Predictive machine learning models have been developed to forecast outcomes in acute stroke, 9 acute kidney injury 10 and malaria. 11 Predictive deep learning models have also been developed for HF diagnosis. 12 Combining machine learning with clinical data can improve understanding of the disease and yield predictive tools that support clinical practice.

Data

This retrospective cohort study is based on the Medical Information Mart for Intensive Care IV (MIMIC‐IV, Version 2.0, released on 12 June 2022) 13 database for analysis and the eICU Collaborative Research Database (eICU, Version 2.0.1, published on 6 May 2021) 14 for external validation. Both are available through PhysioNet. 15 The MIMIC‐IV database contains over 70 000 records of patients admitted to ICUs between 2008 and 2019. The eICU Collaborative Research Database is a freely available multi‐centre database for critical care research. 16 One of the authors (An‐Tian Chen) has credentialed access to both MIMIC‐IV and eICU data (record ID: 51101140).

Methods

Study patients

In this study, we selected ICU patients with a diagnosis of HF from the MIMIC‐IV database. HF diagnosis was defined using both the International Classification of Diseases (ICD)‐9 and ICD‐10 codes. Because a patient may be admitted to the hospital multiple times, only the first‐admission record was used for the study.

Eligibility criteria and primary outcome

  • a) Inclusion criteria
    • Patients who are diagnosed with HF and admitted to the ICU for the first time.
  • b) Exclusion criteria
    1. Patients who stay <24 h or over 28 days in the ICU are excluded from the study.
    2. Patients with over 30% missing features (after feature extraction) are also excluded from the study.
  • c) Primary outcome
    • The primary outcome is all‐cause mortality within 28 days of HF patients being admitted to the ICU.

Software

We used Python (Version 3.9.12) for data preprocessing, feature selection, statistical analysis, machine learning model building and evaluation. R (Version 4.2.2) was used for logistic regression (LR), nomogram construction and evaluation.

Statistical analysis

Patients who satisfied the above criteria were divided into two groups based on their survival or death status during their 28 day ICU stay. Categorical variables were described using percentages, while continuous variables were presented as mean ± standard deviation (SD) or as median and interquartile range (IQR), depending on whether the data followed a normal distribution. We tested normality using the normaltest function from the scipy package in Python, which is based on D'Agostino and Pearson's test. 17 As none of the features followed a normal distribution, we used the two‐sided Mann–Whitney U rank test for intergroup comparison.
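The procedure described above can be sketched roughly as follows. This is a minimal illustration on synthetic, right‐skewed data, not the study's code; the helper name `describe_and_compare` and the 0.05 threshold for the normality test are assumptions made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic, right-skewed stand-ins for one laboratory feature (not study data).
survivors = rng.lognormal(mean=0.6, sigma=0.4, size=300)
non_survivors = rng.lognormal(mean=0.9, sigma=0.5, size=100)


def describe_and_compare(a, b, alpha=0.05):
    """Choose a summary by normality, then compare the two groups."""
    pooled = np.concatenate([a, b])
    # D'Agostino and Pearson's omnibus test of normality.
    _, p_normal = stats.normaltest(pooled)
    is_normal = bool(p_normal >= alpha)
    if is_normal:
        summary = (pooled.mean(), pooled.std(ddof=1))      # mean, SD
    else:
        q1, median, q3 = np.percentile(pooled, [25, 50, 75])
        summary = (median, q1, q3)                         # median (Q1, Q3)
    # Two-sided Mann-Whitney U rank test between the groups.
    _, p_group = stats.mannwhitneyu(a, b, alternative="two-sided")
    return is_normal, summary, p_group


is_normal, summary, p_group = describe_and_compare(survivors, non_survivors)
print(is_normal, summary, p_group)
```

For skewed features such as these, the function falls back to the median and quartiles, which is the format used throughout Table 1.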

Feature extraction

To limit overfitting and reduce the dimensionality of the prediction model, we first excluded features with <1000 cases. We then selected 75 features based on clinical experience and published articles, 18 , 19 including basic information, vital signs and laboratory tests. We applied least absolute shrinkage and selection operator (LASSO) regression for variable selection and used 10‐fold cross‐validation to determine the penalty factor (alpha).
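A minimal sketch of LASSO‐based selection with a cross‐validated penalty is given below. It assumes scikit‐learn's `LassoCV` applied to standardized features and a 0/1 outcome; the article does not state which implementation was used, and the data here are synthetic.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_patients, n_features = 400, 20
X = rng.normal(size=(n_patients, n_features))
# Synthetic outcome: only the first three features carry signal.
signal = 1.5 * X[:, 0] - 1.2 * X[:, 1] + 1.0 * X[:, 2]
y = (signal + rng.normal(size=n_patients) > 0).astype(float)

# Standardize so that the L1 penalty treats all features on the same scale.
X_scaled = StandardScaler().fit_transform(X)

# 10-fold cross-validation picks the penalty (alpha) along a regularization path.
lasso = LassoCV(cv=10, random_state=0).fit(X_scaled, y)

# Features whose coefficients were not shrunk to exactly zero are retained.
selected = np.flatnonzero(lasso.coef_)
print("chosen alpha:", lasso.alpha_)
print("selected feature indices:", selected)
```

The retained indices play the role of the reduced feature set passed on to model construction.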

Missing data processing, model construction and evaluation

In this study, missing data were imputed using the K‐nearest neighbour (KNN) algorithm to minimize the impact of missing values on classification. Patients were then divided into training and test subgroups in a 7:3 ratio. Models were constructed using 15 machine learning algorithms: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), LR, ridge classifier, light gradient boosting machine (LightGBM), extreme gradient boosting (XGBoost), gradient boosting, decision tree (DT), random forest (RF), extremely randomized trees (Extra Trees), adaptive boosting (AdaBoost), naive Bayes (NB), dummy classifier, KNN and linear kernel support vector machine (SVM). The performance of each model was evaluated using accuracy, recall (sensitivity), precision (positive predictive value), F1 score, area under the curve (AUC) of the receiver operating characteristic (ROC) curve, Cohen's kappa score and Matthews correlation coefficient (MCC). Accuracy was used to determine the optimal model. Additionally, the Shapley additive explanations (SHAP) approach was adopted to make the final optimized model more interpretable. A nomogram was established using LR, and the AUC of the ROC curve and the calibration curve were used to evaluate the nomogram model.
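The workflow above (KNN imputation, a 7:3 split, model fitting and metric calculation) might look roughly like this for the LDA model. It is a sketch on synthetic data using scikit‐learn; the number of neighbours, the stratified split and the random seeds are assumptions, not settings reported in the article.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.impute import KNNImputer
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             matthews_corrcoef, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.normal(size=(600, 8))
y = (X[:, 0] - X[:, 1] + rng.normal(size=600) > 0.8).astype(int)
X[rng.random(X.shape) < 0.10] = np.nan       # knock out about 10% of values

# Impute first, then split, following the order described in the text.
X_imputed = KNNImputer(n_neighbors=5).fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_imputed, y, test_size=0.3, stratify=y, random_state=0)

model = LinearDiscriminantAnalysis().fit(X_train, y_train)
pred = model.predict(X_test)
prob = model.predict_proba(X_test)[:, 1]     # probability of the positive class

metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "auc": roc_auc_score(y_test, prob),
    "recall": recall_score(y_test, pred),
    "precision": precision_score(y_test, pred),
    "f1": f1_score(y_test, pred),
    "kappa": cohen_kappa_score(y_test, pred),
    "mcc": matthews_corrcoef(y_test, pred),
}
print(metrics)
```

Swapping `LinearDiscriminantAnalysis` for another scikit‐learn classifier gives the same table of metrics for that model, which is how a side‐by‐side comparison can be assembled.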

External validation

Patients with HF who stayed between 24 h and 28 days in the ICU were selected from the eICU database regardless of their admission times. Lab and physical examination results were extracted based on features of the chosen machine learning model. Patients with over 30% missing features were removed. Features that are in the chosen model but not in the validation cohort are either (1) replaced with values of relevant features or (2) filled with the median value from the MIMIC‐IV database.
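The two fallback rules for features absent from the validation cohort can be illustrated as follows. This is a hedged sketch: the column names, the substitute mapping and the helper `align_features` are hypothetical, while the median values are the cohort medians listed in Table 1.

```python
import pandas as pd

# Hypothetical feature names expected by the chosen model.
model_features = ["lactate", "bun", "ptt", "fibrinogen"]
# Hypothetical mapping from a model feature to a related external column.
relevant_substitutes = {"ptt": "aptt"}
# Medians of the development cohort (total column of Table 1).
development_medians = pd.Series(
    {"lactate": 1.90, "bun": 31.33, "ptt": 38.66, "fibrinogen": 276.17})

# Toy external cohort: no "ptt" column (only "aptt") and no fibrinogen.
external = pd.DataFrame(
    {"lactate": [2.4, 1.1], "bun": [40.0, 18.0], "aptt": [52.0, 31.0]})


def align_features(external, model_features, substitutes, medians):
    """Build a frame with exactly the model's columns, in the model's order."""
    aligned = pd.DataFrame(index=external.index)
    for name in model_features:
        if name in external.columns:                        # available as is
            aligned[name] = external[name]
        elif substitutes.get(name) in external.columns:     # rule (1)
            aligned[name] = external[substitutes[name]]
        else:                                               # rule (2)
            aligned[name] = medians[name]
    return aligned[model_features]


aligned = align_features(external, model_features,
                         relevant_substitutes, development_medians)
print(aligned)
```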

Results

Characteristics of ICU patients with HF

A total of 12 254 ICU admission records were extracted from the MIMIC‐IV database, as shown in Figure 1. Only the first ICU admission record was included for patients with multiple records. When a feature was measured or examined more than once, the mean of all recorded measurements was used. After excluding patients who stayed in the ICU for <24 h or over 28 days, or with over 30% missing features, outliers were replaced with null values, and missing values were checked again. Finally, the study included 2343 first‐admission records of HF patients in the ICU, with 1808 surviving and 535 deceased patients. Table 1 presents the characteristics of ICU patients with HF and the differences between the survival and non‐survival groups.

Figure 1.

Patients' selection flowchart of the Medical Information Mart for Intensive Care IV database. HF, heart failure; ICU, intensive care unit.

Table 1.

Characteristics of intensive care unit patients with heart failure from the MIMIC‐IV database.

Feature(s) Total (N = 2343) Survival (N = 1808) Non‐survival (N = 535) P value
Basic information
Gender (N = 2343) 0.284
Male 60.31% 60.90% 58.32%
Female 39.69% 39.10% 41.68%
Age (years, N = 2343) 70.00 (61.00, 79.00) 69.00 (59.00, 77.00) (N = 1808) 73.00 (66.00, 81.00) (N = 535) <0.001
Length of stay (days, N = 2343) 6.00 (3.29, 10.23) 5.63 (3.22, 9.79) (N = 1808) 6.95 (3.66, 11.87) (N = 535) <0.001
Vital sign
Temperature (°C, N = 2316) 36.86 (36.61, 37.16) 36.86 (36.63, 37.13) (N = 1793) 36.82 (36.44, 37.25) (N = 523) 0.237
HR (b.p.m., N = 2343) 85.69 (77.77, 95.72) 84.30 (76.49, 93.85) (N = 1808) 90.68 (81.70, 101.58) (N = 535) <0.001
ABPm (mmHg, N = 1880) 74.21 (69.33, 79.62) 75.00 (70.50, 80.50) (N = 1430) 71.24 (66.26, 76.56) (N = 450) <0.001
ABPs (mmHg, N = 1885) 111.32 (104.21, 119.57) 112.64 (106.10, 121.03) (N = 1434) 106.21 (98.37, 114.22) (N = 451) <0.001
ABPd (mmHg, N = 1884) 55.66 (50.39, 61.58) 56.03 (50.98, 61.97) (N = 1433) 54.38 (48.48, 59.97) (N = 451) <0.001
NBPm (mmHg, N = 2307) 74.10 (68.26, 80.47) 75.21 (69.46, 81.54) (N = 1794) 69.89 (64.49, 75.72) (N = 513) <0.001
NBPs (mmHg, N = 2303) 111.57 (102.90, 121.28) 113.61 (104.55, 122.82) (N = 1790) 105.21 (98.13, 113.24) (N = 513) <0.001
NBPd (mmHg, N = 2298) 61.45 (54.85, 68.11) 62.23 (56.17, 69.00) (N = 1786) 57.78 (52.00, 64.52) (N = 512) <0.001
EtCO2 (N = 805) 34.80 (30.83, 38.40) 35.53 (32.00, 39.50) (N = 593) 31.95 (27.36, 35.83) (N = 212) <0.001
Laboratory test
Blood routine
Haemoglobin (g/dL, N = 2341) 9.48 (8.53, 10.67) 9.58 (8.56, 10.77) (N = 1806) 9.28 (8.43, 10.21) (N = 535) <0.001
Haematocrit (serum) (%, N = 2340) 29.11 (26.69, 32.69) 29.25 (26.77, 32.94) (N = 1806) 28.77 (26.38, 31.79) (N = 534) 0.010
WBC (K/μL, N = 2341) 12.95 (10.15, 16.16) 12.62 (9.93, 15.46) (N = 1807) 14.66 (11.08, 19.02) (N = 534) <0.001
Basophil (absolute) (K/μL, N = 1208) 0.03 (0.02, 0.05) 0.03 (0.02, 0.05) (N = 994) 0.03 (0.02, 0.05) (N = 214) 0.007
Eosinophil (absolute) (K/μL, N = 1174) 0.11 (0.04, 0.22) 0.12 (0.05, 0.22) (N = 972) 0.09 (0.02, 0.19) (N = 202) 0.001
Lymphocyte (absolute) (K/μL, N = 1379) 1.31 (0.78, 1.95) 1.39 (0.86, 2.05) (N = 1110) 0.91 (0.58, 1.48) (N = 269) <0.001
Monocyte (absolute) (K/μL, N = 1376) 0.76 (0.49, 1.12) 0.73 (0.47, 1.07) (N = 1105) 0.90 (0.61, 1.30) (N = 271) <0.001
Neutrophil (absolute) (K/μL, N = 1382) 10.81 (7.52, 14.73) 10.49 (7.33, 14.35) (N = 1111) 12.04 (9.20, 17.06) (N = 271) <0.001
Basophil (differential) (%, N = 1889) 0.22 (0.14, 0.40) 0.25 (0.16, 0.40) (N = 1502) 0.20 (0.10, 0.30) (N = 387) <0.001
Eosinophil (differential) (%, N = 1871) 0.90 (0.33, 1.75) 1.00 (0.40, 1.85) (N = 1494) 0.60 (0.20, 1.40) (N = 377) <0.001
Lymphocyte (differential) (%, N = 2186) 9.54 (5.93, 14.70) 10.48 (6.70, 15.70) (N = 1690) 6.50 (4.00, 10.70) (N = 496) <0.001
Monocyte (differential) (%, N = 2180) 5.00 (3.54, 7.20) 5.06 (3.60, 7.10) (N = 1683) 5.00 (3.50, 7.45) (N = 497) 0.840
Neutrophil (differential) (%, N = 2190) 81.18 (74.66, 86.29) 80.30 (73.90, 85.50) (N = 1691) 84.25 (77.75, 88.44) (N = 499) <0.001
Platelet count (K/μL, N = 2343) 162.08 (117.54, 223.97) 163.94 (121.24, 224.09) (N = 1808) 156.20 (102.27, 223.57) (N = 535) 0.004
Acid–base balance
PH (arterial) (N = 2238) 7.38 (7.34, 7.42) 7.39 (7.36, 7.42) (N = 1715) 7.35 (7.30, 7.40) (N = 523) <0.001
PH (venous) (N = 1647) 7.38 (7.32, 7.42) 7.39 (7.34, 7.43) (N = 1273) 7.32 (7.27, 7.38) (N = 374) <0.001
PH (dipstick) (N = 1647) 5.75 (5.50, 6.00) 5.75 (5.50, 6.00) (N = 1220) 5.70 (5.40, 6.00) (N = 427) 0.063
Arterial base excess (mEq/L, N = 2170) −1.00 (−3.27, 1.35) −0.57 (−2.53, 1.60) (N = 1653) −2.80 (−6.36, 0.12) (N = 517) <0.001
Anion gap (mEq/L, N = 2343) 14.56 (12.50, 17.00) 13.89 (12.04, 16.00) (N = 1808) 17.03 (14.75, 19.92) (N = 535) <0.001
Lactic acid (mmol/L, N = 2329) 1.90 (1.40, 2.70) 1.79 (1.36, 2.46) (N = 1795) 2.54 (1.70, 4.17) (N = 534) <0.001
TCO2 (arterial) (mEq/L, N = 2240) 24.50 (21.83, 27.17) 24.88 (22.50, 27.40) (N = 1716) 22.78 (19.08, 26.12) (N = 524) <0.001
TCO2 (venous) (mEq/L, N = 1128) 26.00 (22.00, 30.58) 26.42 (23.00, 31.00) (N = 798) 24.90 (21.00, 29.63) (N = 330) <0.001
Liver function
ALT (IU/L, N = 2010) 35.89 (17.53, 102.90) 31.00 (16.50, 78.94) (N = 1479) 59.20 (22.92, 255.65) (N = 531) <0.001
AST (IU/L, N = 2010) 56.00 (30.67, 163.63) 47.40 (28.82, 117.20) (N = 1479) 106.67 (42.13, 474.29) (N = 531) <0.001
ALP (IU/L, N = 1998) 87.80 (65.00, 129.00) 83.00 (62.23, 121.00) (N = 1470) 103.18 (76.13, 152.88) (N = 528) <0.001
Albumin (g/dL, N = 1673) 2.93 (2.55, 3.33) 3.00 (2.60, 3.40) (N = 1215) 2.75 (2.35, 3.17) (N = 458) <0.001
LDH (IU/L, N = 1755) 372.40 (268.13, 594.67) 345.58 (261.00, 497.00) (N = 1274) 491.67 (312.00, 950.10) (N = 481) <0.001
Total bilirubin (mg/dL, N = 2010) 0.80 (0.45, 1.46) 0.72 (0.44, 1.29) (N = 1480) 1.09 (0.54, 2.38) (N = 530) <0.001
Renal function
BUN (mg/dL, N = 2339) 31.33 (19.99, 48.30) 28.18 (18.50, 44.57) (N = 1805) 42.57 (29.03, 63.90) (N = 534) <0.001
Creatinine (serum) (mg/dL, N = 2343) 1.42 (0.97, 2.26) 1.26 (0.90, 2.03) (N = 1808) 2.02 (1.34, 2.87) (N = 535) <0.001
Electrolyte
Potassium (serum) (mEq/L, N = 2343) 4.19 (3.95, 4.48) 4.16 (3.93, 4.43) (N = 1808) 4.31 (4.04, 4.67) (N = 535) <0.001
Sodium (serum) (mEq/L, N = 2343) 138.16 (135.40, 141.22) 138.19 (135.67, 141.09) (N = 1808) 138.00 (134.81, 141.70) (N = 535) 0.207
Chloride (serum) (mEq/L, N = 2342) 102.47 (98.72, 105.78) 102.67 (99.21, 105.81) (N = 1808) 101.45 (97.20, 105.74) (N = 534) <0.001
Calcium (ionized) (mmol/L, N = 2131) 1.12 (1.08, 1.16) 1.12 (1.09, 1.16) (N = 1630) 1.09 (1.05, 1.14) (N = 501) <0.001
Calcium (non‐ionized) (mg/dL, N = 2329) 8.35 (8.00, 8.73) 8.37 (8.03, 8.73) (N = 1794) 8.28 (7.84, 8.74) (N = 535) 0.001
HCO3− (serum) (mEq/L, N = 2343) 23.63 (21.08, 26.29) 24.00 (21.77, 26.67) (N = 1808) 21.40 (18.20, 24.64) (N = 535) <0.001
Magnesium (mg/dL, N = 2342) 2.18 (2.05, 2.37) 2.19 (2.05, 2.39) (N = 1807) 2.17 (2.04, 2.35) (N = 535) 0.083
Phosphorous (mg/dL, N = 2333) 3.74 (3.17, 4.48) 3.61 (3.10, 4.28) (N = 1798) 4.20 (3.52, 5.28) (N = 535) <0.001
Cardiac markers
BNP (pg/mL, N = 394) 7605.00 (2639.00, 19 481.00) 6963.00 (2491.50, 18 278.50) (N = 287) 9261.50 (3367.00, 28 732.00) (N = 107) 0.066
CK (IU/L, N = 1349) 198.00 (73.00, 632.00) 187.00 (73.00, 605.79) (N = 931) 238.50 (74.38, 698.54) (N = 418) 0.148
CK‐MB (ng/mL, N = 1442) 7.00 (3.00, 20.67) 6.17 (3.00, 19.17) (N = 1007) 8.86 (4.00, 23.33) (N = 435) 0.001
CK‐MB fraction (%, N = 595) 5.50 (2.88, 8.59) 5.53 (2.95, 8.48) (N = 395) 5.37 (2.70, 8.85) (N = 200) 0.896
Troponin T (ng/mL, N = 1498) 0.19 (0.04, 0.97) 0.17 (0.03, 0.97) (N = 1052) 0.22 (0.06, 1.02) (N = 446) 0.004
Coagulation
PT (s, N = 2332) 15.28 (13.64, 18.68) 14.94 (13.44, 17.48) (N = 1799) 17.41 (14.49, 23.44) (N = 533) <0.001
PTT (s, N = 2340) 38.66 (30.43, 56.54) 35.76 (29.70, 53.08) (N = 1806) 49.62 (36.06, 66.63) (N = 534) <0.001
Fibrinogen (mg/dL, N = 1470) 276.17 (205.00, 400.58) 266.50 (201.25, 376.13) (N = 1134) 323.33 (218.54, 466.50) (N = 336) <0.001
INR (N = 2332) 1.40 (1.24, 1.71) 1.37 (1.22, 1.60) (N = 1799) 1.59 (1.32, 2.17) (N = 533) <0.001
Metabolism
Glucose (serum) (mg/dL, N = 2341) 135.63 (118.90, 166.54) 132.80 (117.37, 158.90) (N = 1807) 151.71 (126.65, 190.13) (N = 534) <0.001
Triglyceride (mg/dL, N = 547) 131.00 (90.00, 196.50) 134.00 (92.25, 199.88) (N = 398) 124.00 (83.50, 189.33) (N = 149) 0.466
Haemodynamics
CVP (mmHg, N = 1650) 12.34 (9.46, 16.33) 11.81 (9.03, 15.25) (N = 1286) 14.57 (11.41, 20.30) (N = 364) <0.001
PA line cm mark (cm, N = 896) 54.00 (50.00, 57.91) 53.33 (49.50, 57.00) (N = 767) 57.48 (52.00, 62.60) (N = 129) <0.001
PAPm (mmHg, N = 1044) 27.44 (23.50, 31.97) 26.50 (23.11, 31.31) (N = 890) 31.77 (27.49, 37.72) (N = 154) <0.001
PAPs (mmHg, N = 1049) 39.50 (33.48, 46.58) 38.61 (32.82, 45.51) (N = 894) 45.56 (38.61, 54.50) (N = 155) <0.001
PAPd (mmHg, N = 1048) 19.96 (16.92, 23.42) 19.40 (16.56, 22.57) (N = 893) 23.89 (19.97, 27.81) (N = 155) <0.001
Respiration
Mixed venous oxygen saturation (%, N = 1279) 65.88 (60.33, 72.33) 66.17 (60.96, 72.33) (N = 1005) 64.00 (56.78, 72.00) (N = 274) 0.002
PO2 (arterial) (mmHg, N = 2238) 123.29 (100.50, 162.56) 129.04 (102.34, 171.45) (N = 1714) 112.53 (95.01, 131.57) (N = 524) <0.001
PO2 (mixed venous) (mmHg, N = 638) 38.00 (34.00, 44.00) 38.00 (33.80, 43.35) (N = 460) 38.00 (34.26, 46.00) (N = 178) 0.229
PO2 (venous) (mmHg, N = 1130) 44.00 (37.00, 57.00) 44.00 (37.00, 57.46) (N = 799) 44.00 (37.50, 55.10) (N = 331) 0.426
SaO2 (%, N = 1733) 96.20 (94.76, 97.33) 96.35 (95.00, 97.47) (N = 1297) 95.84 (93.67, 97.00) (N = 436) <0.001
SvO2 (%, N = 727) 64.55 (60.00, 69.53) 65.11 (61.00, 69.96) (N = 604) 61.00 (54.00, 67.78) (N = 123) <0.001
PCO2 (arterial) (mmHg, N = 2238) 39.60 (35.50, 43.90) 39.68 (36.00, 43.73) (N = 1714) 39.24 (34.00, 44.53) (N = 524) 0.155
PCO2 (venous) (mmHg, N = 1128) 45.55 (39.00, 52.33) 45.00 (39.00, 52.00) (N = 798) 46.00 (39.52, 55.00) (N = 330) 0.101
Urine
Specific gravity (N = 1611) 1.02 (1.01, 1.02) 1.02 (1.01, 1.02) (N = 1192) 1.02 (1.01, 1.02) (N = 419) 0.093

Note: (1) All statistics are in this format: median (Q1, Q3). (2) The unit of measurement is extracted without changes. 1 mEq/L = 1 mmol/L. (3) In the Medical Information Mart for Intensive Care IV (MIMIC‐IV) database, all patients more than or equal to 89 years old have their age set to 91 for privacy.

Abbreviations: ABPd, arterial blood pressure diastolic; ABPm, arterial blood pressure mean; ABPs, arterial blood pressure systolic; ALP, alkaline phosphatase; ALT, alanine transaminase; AST, aspartate transaminase; BNP, brain natriuretic peptide; BUN, blood urea nitrogen; CK, creatine kinase; CK‐MB, creatine kinase‐MB; CVP, central venous pressure; EtCO2, end‐tidal CO2; HCO3−, bicarbonate ion; HR, heart rate; INR, international normalized ratio; LDH, lactate dehydrogenase; NBPd, non‐invasive blood pressure diastolic; NBPm, non‐invasive blood pressure mean; NBPs, non‐invasive blood pressure systolic; PA, pulmonary artery; PAPd, pulmonary artery pressure diastolic; PAPm, pulmonary artery pressure mean; PAPs, pulmonary artery pressure systolic; PCO2, partial pressure of carbon dioxide; PH, potential of hydrogen; PO2, partial pressure of oxygen; PT, prothrombin time; PTT, partial thromboplastin time; SaO2, arterial oxygen saturation; SvO2, venous oxygen saturation; TCO2, total carbon dioxide; WBC, white blood cell.

Feature selection

We applied LASSO regression and determined the shrinkage parameter α to be 0.020 (Figure 2A,B). Subsequently, we selected 44 features using the training cohort for further machine learning model building (Figure 3).

Figure 2.

Feature selection using least absolute shrinkage and selection operator (LASSO) binary logistic regression. (A) Tuning parameter optimization using 10‐fold cross‐validation. The optimal alpha is chosen to be 0.020. (B) LASSO coefficients of all 75 features.

Figure 3.

Features after selection. There are 75 features in total before the selection process, and 44 features remain after using least absolute shrinkage and selection operator regression.

Machine learning model construction

We constructed a total of 15 models based on the 44 selected features using 10‐fold cross‐validation in the training set and evaluated them in the testing cohort by various metrics, such as accuracy, AUC, recall, precision, F1 score, Cohen's kappa score and MCC (Table 2). In the training cohort, the LDA model demonstrated the highest accuracy, Cohen's kappa score and MCC of all models, and it also had the highest accuracy in the testing cohort. The LDA model was chosen as the best model after considering all evaluation parameters together. The ROC curve, precision–recall curve and performance on the testing cohort of the chosen LDA model are shown in Figure 4. The concordance index (c‐index) is a metric used to evaluate the performance of predictive models, especially in the context of survival analysis and binary classification problems. For the LDA model, the c‐index reached 0.7972 in the training cohort and 0.8125 in the testing cohort, indicating good discrimination.

Table 2.

Evaluation of different machine learning algorithms.

Model Accuracy AUC Recall Precision F1 Kappa MCC
LDA Linear discriminant analysis
Training 0.8354 0.8555 0.4897 0.7130 0.5762 0.4785 0.4942
Testing 0.8563 0.8784 0.5590 0.7500 0.6406 0.5532 0.5625
GB Gradient boosting
Training 0.8311 0.8605 0.4680 0.7184 0.5562 0.4585 0.4806
Testing 0.8321 0.8796 0.4969 0.6838 0.5755 0.4742 0.4836
LR Logistic regression
Training 0.8305 0.8433 0.4683 0.6939 0.5559 0.4566 0.4718
Testing 0.8478 0.8728 0.5155 0.7411 0.6081 0.5174 0.5305
Ridge Ridge classifier
Training 0.8293 0.0000 0.4043 0.7466 0.5167 0.4248 0.4582
Testing 0.8435 0.7042 0.4472 0.7742 0.5669 0.4797 0.5066
LightGBM Light gradient boosting machine
Training 0.8287 0.8611 0.4546 0.7082 0.5455 0.4469 0.4679
Testing 0.8393 0.8789 0.4907 0.7182 0.5830 0.4878 0.5014
RF Random forest
Training 0.8274 0.8626 0.3482 0.7881 0.4737 0.3904 0.4404
Testing 0.8250 0.8801 0.3789 0.7262 0.4980 0.4044 0.4359
XGBoost Extreme gradient boosting
Training 0.8220 0.8516 0.4576 0.6695 0.5378 0.4332 0.4483
Testing 0.8350 0.8825 0.4969 0.6957 0.5797 0.4806 0.4911
Extra Trees Extremely randomized trees
Training 0.8207 0.8563 0.3022 0.7911 0.4311 0.3500 0.4097
Testing 0.8279 0.8756 0.3416 0.7857 0.4762 0.3918 0.4406
AdaBoost Adaptive boosting
Training 0.7915 0.8028 0.4493 0.5726 0.4936 0.3665 0.3761
Testing 0.8506 0.8668 0.6087 0.7000 0.6512 0.5567 0.5589
QDA Quadratic discriminant analysis
Training 0.7890 0.7962 0.3826 0.5561 0.4486 0.3253 0.3359
Testing 0.7980 0.8151 0.4224 0.5812 0.4892 0.3672 0.3745
NB Naive Bayes
Training 0.7884 0.8029 0.4201 0.5540 0.4731 0.3449 0.3525
Testing 0.8065 0.8298 0.4783 0.5969 0.5310 0.4110 0.4150
Dummy Dummy classifier
Training 0.7720 0.5000 0.0000 0.0000 0.0000 0.0000 0.0000
Testing 0.7710 0.5000 0.0000 0.0000 0.0000 0.0000 0.0000
KNN K‐nearest neighbour
Training 0.7591 0.5699 0.1311 0.4189 0.1981 0.0998 0.1242
Testing 0.7440 0.5727 0.1429 0.3538 0.2035 0.0827 0.0948
DT Decision tree
Training 0.7463 0.6444 0.4571 0.4419 0.4471 0.2832 0.2847
Testing 0.7596 0.6651 0.4907 0.4759 0.4832 0.3266 0.3267
SVM Linear kernel support vector machine
Training 0.6085 0.0000 0.4489 0.3498 0.2718 0.0977 0.1302
Testing 0.7795 0.5361 0.0870 0.6364 0.1530 0.1036 0.1742

Note: Highlights indicate best performance: yellow for the training cohort and green for the testing cohort.

Abbreviations: AUC, area under the curve; MCC, Matthews correlation coefficient.
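To make the relationship between the columns of Table 2 concrete, the sketch below computes them from a single confusion matrix. The counts are an assumption: they are back‐calculated to be consistent with the LDA testing row under a 703‐patient testing cohort (30% of 2343) and are not reported in the article.

```python
import math

# Back-calculated confusion counts (assumed, not reported in the article):
# true positives, false positives, false negatives, true negatives.
tp, fp, fn, tn = 90, 30, 71, 512


def classification_metrics(tp, fp, fn, tn):
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    recall = tp / (tp + fn)                  # sensitivity
    precision = tp / (tp + fp)               # positive predictive value
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - chance) / (1 - chance)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"accuracy": accuracy, "recall": recall, "precision": precision,
            "f1": f1, "kappa": kappa, "mcc": mcc}


lda_test = classification_metrics(tp, fp, fn, tn)
for name, value in lda_test.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts the six values agree with the LDA testing row to four decimals, which illustrates why kappa and MCC sit well below accuracy when only about a quarter of patients belong to the positive class.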

Figure 4.

(A) Receiver operating characteristic (ROC) curve, (B) precision–recall (PR) curve of the linear discriminant analysis model and (C) its performance on the testing cohort. AUC, area under the curve.

Model explanation

We first examined the feature importance of the LDA model, which revealed that lactic acid, bicarbonate ion (HCO3−) (serum) and partial pressure of carbon dioxide (PCO2) (arterial) were the top three most important features. Additionally, white blood cell (WBC), length of stay, total bilirubin, non‐invasive blood pressure mean (NBPm), neutrophil (absolute), arterial oxygen saturation (SaO2) and lymphocyte (differential) were also among the top 10 features (Figure 5A). Using a radar plot, we displayed the top five predictors with different relative importance (Figure 5B).

Figure 5.

Feature importance. (A) Feature importance plot of the top 10 features. (B) Radar plot for the five most important predictors of death.

Furthermore, we used the SHAP approach to make the model more explainable. The top 20 features are listed in Figure 6, which shows that older patients, patients with longer ICU stays and those with higher levels of PCO2 (both arterial and venous), lactic acid, partial thromboplastin time (PTT), blood urea nitrogen (BUN), WBC, heart rate (HR), glucose (serum) and fibrinogen are less likely to survive. Conversely, lower levels of HCO3−, NBPm, arterial blood pressure systolic (ABPs), non‐invasive blood pressure systolic (NBPs), partial pressure of oxygen (PO2) (arterial), lymphocyte (differential), neutrophil (absolute), platelet count and alanine transaminase (ALT) indicate a higher possibility of death.
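For a linear model such as LDA, SHAP values have a simple closed form when features are treated as independent: each feature's contribution is its coefficient times the feature's deviation from the background mean. The sketch below illustrates this on synthetic data; it is not the authors' code (which is not shown in the article), and the contributions are on the scale of the discriminant score rather than probability.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
# Synthetic outcome driven mainly by feature 0, and to a lesser extent feature 2.
y = (2.0 * X[:, 0] - X[:, 2] + rng.normal(size=300) > 0).astype(int)

lda = LinearDiscriminantAnalysis().fit(X, y)
coef = lda.coef_[0]                          # one weight per feature
background_mean = X.mean(axis=0)

# Linear SHAP under feature independence: phi_ij = w_j * (x_ij - mean_j).
shap_values = coef * (X - background_mean)
base_value = background_mean @ coef + lda.intercept_[0]

# Local accuracy: base value plus contributions recovers the model output.
reconstructed = base_value + shap_values.sum(axis=1)
assert np.allclose(reconstructed, lda.decision_function(X))

# Global ranking by mean absolute contribution, as in a SHAP summary plot.
ranking = np.argsort(-np.abs(shap_values).mean(axis=0))
print("features ranked by mean |SHAP|:", ranking)
```

A positive contribution pushes the score towards the positive class for that patient, so the sign of each coefficient determines whether high or low feature values raise the predicted risk.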

Figure 6.

Shapley additive explanations (SHAP) contribution value in the linear discriminant analysis model.

Web‐based calculator

A web‐based calculator was developed using the LDA model. The calculator is accessible through https://tal-cat-28-day-all-cause-mortality-prediction-hf-predict-7rbxqk.streamlit.app/ (Figure 7). It utilizes 44 features to provide predictions on patient survival status. The columns can be edited by typing values or clicking on the +/− signs. After clicking on the submit button at the bottom, a prediction is generated.

Figure 7.

Calculator using the linear discriminant analysis model for predicting 28 day all‐cause mortality of heart failure (HF) patients in the intensive care unit (ICU): a web‐based, publicly accessible calculator.

External validation

Following a selection process similar to that used for MIMIC‐IV, the eICU external validation dataset included 2040 admission records (not restricted to the first admission) of HF patients in the ICU, with 1871 survivors and 169 non‐survivors (Figure S1). Forty features were selected (Table S1). The current HR and blood pressure were used for external validation instead of the maximum and minimum values for better generality. Features that were not available in the eICU database were either replaced with relevant features or filled with the median of the MIMIC‐IV database. Finally, missing data were imputed using the KNN algorithm. After evaluation, the LDA model achieved 0.9064 in accuracy, 0.8947 in precision, 0.9064 in recall and an F1 score of 0.8997. The AUC (0.79) indicated acceptable predictive ability of the established LDA model in external cohorts (Figure 8).

Figure 8.

Receiver operating characteristic (ROC) curve of the linear discriminant analysis model in external eICU dataset validation. AUC, area under the curve.

Nomogram using LR

A nomogram, which is a visualization tool, was constructed using LR. To identify the features contributing to the 28 day in‐hospital all‐cause mortality of HF patients in the ICU, univariable analysis was performed between the survival and non‐survival groups in the training cohort (Table 3). Features that differed significantly between groups (P value < 0.05) were then subjected to multiple LR, and the results were reported as odds ratios and 95% confidence intervals (CIs). Table 4 shows that age, length of stay, HR, ABPs, WBC, lymphocyte (differential), lactic acid, ALT, lactate dehydrogenase (LDH), BUN, HCO3− (serum), PTT, fibrinogen and PO2 (arterial) are independent predictors of 28 day in‐hospital mortality in critically ill patients with HF.

Table 3.

Univariate logistic regression analysis of all‐cause mortality in the training set.

Feature(s) Odds ratio 95% CI P value
Basic information
Age 1.025 1.015–1.035 <0.001
Length of stay 1.033 1.013–1.053 0.001
Vital sign
HR 1.032 1.023–1.041 <0.001
ABPm 0.949 0.934–0.964 <0.001
ABPs 0.954 0.944–0.964 <0.001
NBPm 0.939 0.926–0.952 <0.001
NBPs 0.954 0.944–0.963 <0.001
NBPd 0.961 0.950–0.973 <0.001
EtCO2 0.948 0.926–0.970 <0.001
Laboratory test
Blood routine
Haematocrit (serum) 0.977 0.954–1.000 0.051
WBC 1.068 1.048–1.089 <0.001
Neutrophil (absolute) 1.038 1.019–1.058 <0.001
Lymphocyte (differential) 0.937 0.919–0.956 <0.001
Monocyte (differential) 1.013 0.982–1.044 0.420
Platelet count 0.998 0.997–1.000 0.019
Acid–base balance
Arterial base excess 0.872 0.847–0.898 <0.001
Anion gap 1.262 1.219–1.308 <0.001
Lactic acid 1.525 1.411–1.649 <0.001
Liver function
ALT 1.001 1.000–1.001 <0.001
AST 1.000 1.000–1.001 <0.001
ALP 1.003 1.001–1.004 <0.001
LDH 1.000 1.000–1.001 <0.001
Total bilirubin 1.110 1.066–1.157 <0.001
Renal function
BUN 1.026 1.021–1.031 <0.001
Electrolyte
Sodium (serum) 0.989 0.963–1.014 0.821
HCO3− (serum) 0.865 0.840–0.891 <0.001
Cardiac markers
BNP 1.000 1.000–1.000 0.556
CK 1.000 1.000–1.000 0.012
CK‐MB 1.002 1.000–1.003 0.070
Coagulation
PT 1.072 1.054–1.089 <0.001
PTT 1.032 1.026–1.039 <0.001
Fibrinogen 1.001 1.001–1.002 0.001
Metabolism
Glucose (serum) 1.009 1.007–1.012 <0.001
Triglyceride 1.000 0.999–1.002 0.430
Haemodynamics
CVP 1.007 1.002–1.012 0.009
PA line cm mark 1.041 1.018–1.064 <0.001
PAPs 1.031 1.018–1.043 <0.001
Respiration
Mixed venous O2 saturation 0.977 0.964–0.991 0.082
PO2 (arterial) 0.992 0.990–0.995 <0.001
PO2 (mixed venous) 1.002 0.995–1.008 0.651
PO2 (venous) 0.996 0.991–1.000 0.065
SaO2 0.970 0.947–0.994 0.013
PCO2 (arterial) 0.996 0.982–1.010 0.977
PCO2 (venous) 1.005 0.994–1.016 0.112

Abbreviations: ABPm, arterial blood pressure mean; ABPs, arterial blood pressure systolic; ALP, alkaline phosphatase; ALT, alanine transaminase; AST, aspartate transaminase; BNP, brain natriuretic peptide; BUN, blood urea nitrogen; CI, confidence interval; CK, creatine kinase; CK‐MB, creatine kinase‐MB; CVP, central venous pressure; EtCO2, end‐tidal CO2; HCO3−, bicarbonate ion; HR, heart rate; LDH, lactate dehydrogenase; NBPd, non‐invasive blood pressure diastolic; NBPm, non‐invasive blood pressure mean; NBPs, non‐invasive blood pressure systolic; PA, pulmonary artery; PAPs, pulmonary artery pressure systolic; PCO2, partial pressure of carbon dioxide; PO2, partial pressure of oxygen; PT, prothrombin time; PTT, partial thromboplastin time; SaO2, arterial oxygen saturation; WBC, white blood cell.

Table 4.

Multivariate analysis of independent predictors.

Feature(s) Odds ratio 95% CI P value
Basic information
Age 1.029 1.017–1.041 <0.001
Length of stay 1.056 1.029–1.083 <0.001
Vital sign
HR 1.014 1.003–1.025 0.009
ABPs 0.967 0.955–0.979 <0.001
Laboratory test
Blood routine
WBC 1.037 1.018–1.057 <0.001
Lymphocyte (differential) 0.973 0.956–0.991 0.003
Acid–base balance
Lactic acid 1.421 1.290–1.566 <0.001
Liver function
ALT 1.000 0.999–1.000 0.034
LDH 1.000 1.000–1.000 0.001
Renal function
BUN 1.019 1.013–1.025 <0.001
Electrolyte
HCO3− (serum) 0.929 0.897–0.962 <0.001
Coagulation
PTT 1.020 1.013–1.028 <0.001
Fibrinogen 1.001 1.000–1.002 0.046
Respiration
PO2 (arterial) 0.992 0.989–0.996 <0.001

Abbreviations: ABPs, arterial blood pressure systolic; ALT, alanine transaminase; BUN, blood urea nitrogen; CI, confidence interval; HCO3−, bicarbonate ion; HR, heart rate; LDH, lactate dehydrogenase; PO2, partial pressure of oxygen; PTT, partial thromboplastin time; WBC, white blood cell.
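As a reading aid for Table 4, the following minimal sketch shows how a per‐unit odds ratio scales with the size of a change in a feature. It assumes the standard logistic regression relationship OR = exp(β). The function name and the example changes are illustrative and are not part of the study's code; only the odds ratios are taken from Table 4.

```python
import math

# Per-unit odds ratios copied from Table 4 (multivariate logistic regression).
TABLE4_ODDS_RATIOS = {
    "age": 1.029,
    "lactic_acid": 1.421,
    "abps": 0.967,
    "hco3_serum": 0.929,
}


def scaled_odds_ratio(odds_ratio_per_unit: float, delta: float) -> float:
    """Odds ratio for a change of `delta` units, assuming OR = exp(beta)."""
    beta = math.log(odds_ratio_per_unit)  # log-odds change per one unit
    return math.exp(beta * delta)


# Illustrative (hypothetical) changes: +10 units of age, +2 units of lactic acid
# and +20 units of ABPs, each in the unit used by the study.
for feature, delta in [("age", 10), ("lactic_acid", 2), ("abps", 20)]:
    ratio = scaled_odds_ratio(TABLE4_ODDS_RATIOS[feature], delta)
    print(f"{feature}: change of {delta} units -> odds ratio {ratio:.3f}")
```

Odds ratios close to 1.000, such as those of ALT and LDH, can still correspond to a meaningful effect when the feature varies over hundreds of units.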

Finally, a nomogram was established using the features in the multivariate LR model (Figure 9). The AUC of the nomogram was 0.852 in the training cohort, indicating strong predictive power (Figure 10A), and 0.855 in the testing cohort (Figure 10C). The calibration curves showed good consistency in both the training and the testing (internal validation) cohorts (Figure 10B,D). Discrimination and calibration were further supported in the eICU external validation dataset, with an AUC of 0.770 (Figure 10E) and a consistent calibration curve (Figure 10F).
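For readers less familiar with the AUC reported above, a minimal sketch of its rank‐based definition follows: the AUC is the probability that a randomly chosen non‐survivor receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. For a binary outcome this quantity equals the c‐index. The data below are toy values, not study data, and the function is an illustration rather than the routine used by the authors.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    y_true holds 1 for the event (death) and 0 otherwise; ties count as 0.5.
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_outcome = [0, 0, 1, 1]
toy_risk = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_outcome, toy_risk))  # 0.75
```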

Figure 9.

Nomogram using multivariate logistic regression (*P < 0.05, **P < 0.01 and ***P < 0.001). HF, heart failure; ICU, intensive care unit.

Figure 10.

Receiver operating characteristic curve in (A) the training cohort, (C) the testing cohort and (E) the external eICU validation cohort, and calibration curve in (B) the training cohort, (D) the testing cohort and (F) the external eICU validation cohort. AUC, area under the curve.

A dynamic nomogram for predicting 28 day all‐cause mortality of HF patients in the ICU was also established. It is a publicly accessible web‐based calculator, available at https://at‐c.shinyapps.io/DynNomapp/ (Figure 11). When the user selects variable values interactively, the prediction and its 95% CI are displayed in the ‘Graphical Summary’, and the exact values can be read from the ‘Numerical Summary’.

Figure 11.

Dynamic nomogram for predicting 28 day all‐cause mortality of heart failure patients in the intensive care unit: a web‐based, publicly accessible calculator.

Discussion

HF receives significant attention because of its severity, particularly in ICU patients. The ability to predict mortality in these patients is valuable for both physicians and family members. Patients with ICU stays between 24 h and 28 days were selected for several reasons: (1) patients who stay <24 h often represent extreme cases that either recover rapidly or deteriorate rapidly, and such cases may not be suitable for predictive model construction, so they were excluded; (2) patients who stay more than 28 days may be too complex, with numerous complications, which makes prediction less generalizable, and some of them may be ‘stable’ but still require mechanical ventilation, which makes prediction less meaningful; and (3) publications with similar designs used comparable criteria. 20 , 21 In this study, we aimed to analyse the clinical characteristics of HF patients in the ICU using the MIMIC‐IV database and to build machine learning models and a nomogram for mortality prediction. Intensivists can use these tools for general critically ill HF patients.

Statistical analysis between the survival and non‐survival groups

The statistical analysis presented in Table 1 demonstrates that older patients and those with longer ICU stays are at higher risk of mortality. This finding is plausible because severely ill patients tend to stay longer in hospital and are more likely to die. In terms of vital signs, patients in the survival group had significantly higher blood pressure, for both arterial and non‐invasive measurements and for systolic, diastolic and mean values. Blood pressure reflects cardiac function, and HF patients with lower blood pressure are at higher risk of mortality. The higher end‐tidal CO2 (EtCO2) levels observed in survivors are consistent with previous studies that suggest a prognostic value of resting EtCO2. 15 Published research on HF patients during exercise also indicates that reduced EtCO2 reflects impaired cardiac performance. 22

Regarding laboratory tests, several routine blood test items were lower in non‐survival patients, including haemoglobin, haematocrit (serum), basophil (absolute and differential), eosinophil, lymphocyte and platelet count. Meanwhile, WBC, monocyte (absolute) and neutrophil (absolute and differential) were elevated. Haemoglobin and haematocrit reflect the status of red blood cells and the oxygen‐carrying capacity of the blood. The Vericiguat Global Study in Patients with Heart Failure and Reduced Ejection Fraction (VICTORIA) found that lower haemoglobin levels are associated with a higher frequency of clinical events, including cardiovascular death or HF hospitalization. 23 Lower haemoglobin levels are more common in advanced HF (>50%) than in patients with only mild symptoms (<10%). 24 A previous study also found that a 1% decline in haematocrit is associated with a 3% increase in mortality risk in a linear manner. 25 These findings are consistent with our results.

White blood cell subtypes changed in different directions in non‐survival patients. An elevation in WBC and neutrophil levels usually indicates infection, which is a significant problem in the ICU. Therefore, it is reasonable that mortality is more often observed in patients with higher WBC and neutrophil levels. The monocyte/macrophage system regulates inflammation and extracellular matrix remodelling in heart disease and is closely related to HF. Hence, elevated monocyte levels may indicate a poor prognosis. Lower absolute eosinophil and lymphocyte counts may predict major cardiovascular events in patients with acute decompensated HF with reduced ejection fraction (HFrEF), as reported in previous studies. However, only differential values of these cells were included in this study, and a decline in a differential value may simply reflect an elevation of other blood cells, so further study is needed.

The study also reveals that acidosis is a vital factor contributing to mortality. Non‐survival patients have lower pH (both arterial and venous), total carbon dioxide (TCO2) (both arterial and venous), serum HCO3− and arterial base excess, together with a higher anion gap and lactic acid, all of which indicate a disturbed acid–base balance. End‐stage HF results in poor perfusion, which impairs the transport and clearance of metabolic waste and leads to its accumulation and acidosis. Respiratory features also differ significantly, with non‐survival patients exhibiting lower arterial PO2, mixed venous oxygen saturation (SvO2) and SaO2. These features reflect a low oxygen supply, which also causes lactic acid to accumulate and contributes to acidosis. Previous studies have reported that acidosis is associated with mortality in high‐risk acute HF patients and that the serum anion gap is associated with all‐cause mortality among critically ill congestive HF patients. 26 , 27

Electrolyte levels also vary between survival and non‐survival patients. Non‐survival patients have lower serum chloride and calcium levels, both ionized and non‐ionized, while their serum potassium and phosphorus levels are higher than those in the survival group. Higher potassium results from both cell damage and acidosis and indicates a poor prognosis. Calcium is vital for the normal function of the heart because calcium cycling through the sarcoplasmic reticulum plays a central role in myocardial contraction. Altered sarcoplasmic reticulum calcium cycling is a target for the treatment of HF. 28 , 29 Omecamtiv mecarbil, which aims to achieve cardiac myosin activation, has already been developed. 30 In contrast, chloride is often neglected in HF. Hypochloraemia is an independent predictor of adverse outcomes in both acute and chronic HF and can lead to diuretic resistance, resulting in a vicious circle. 31

The functions of several organs, including the liver, kidney and heart, are worse in the non‐survival group. Increased glucose in the non‐survival group is consistent with a previous study. 32 Elevated levels of ALT, aspartate transaminase (AST), alkaline phosphatase (ALP), lactate dehydrogenase (LDH) and total bilirubin, along with a decrease in albumin, indicate liver dysfunction. Total bilirubin was also reported to predict poor prognosis in non‐liver disease patients. 33 Interestingly, low ALT was one of the factors associated with worse survival, although higher ALT would usually be expected to predict worse survival. The ALT assay requires vitamin B6 as a catalytic cofactor, so vitamin B6 deficiency can produce falsely low ALT results; low ALT may therefore be a sign of malnutrition, as seen in the elderly and in alcoholics, which makes it an unexpected indicator of a worse prognosis. 34 Low ALT has also been reported to be independently associated with mortality risk in atrial fibrillation patients. 35 High levels of BUN indicate poor renal function. In addition, although all patients were diagnosed with HF, surviving patients had less cardiac injury, as indicated by lower levels of creatine kinase‐MB (CK‐MB) and troponin T than in the non‐survival group. Coagulation is impaired in non‐survival patients, as evidenced by prolonged prothrombin time (PT) and PTT, a higher international normalized ratio (INR) and elevated fibrinogen. Although brain natriuretic peptide (BNP) is a well‐established predictor in HF, it was not retained in the final model. One possible reason is that HF patients in the ICU tend to have high BNP levels, which makes the difference between the survival and non‐survival groups smaller and would require a larger sample size to detect.

Haemodynamics also shows significant differences between the survival and non‐survival groups. Elevated central venous pressure (CVP) in non‐survival patients suggests severe congestion, which contributes to higher mortality rates. Pulmonary artery pressure (PAP), whether systolic, diastolic or mean, is also higher in non‐survival patients, which may be caused by congestion or underlying disease, and indicates poor prognosis.

Interestingly, there is a significant difference in the PA line cm mark between the groups. The PA line cm mark refers to the depth marking on the PA catheter, also known as the Swan‐Ganz catheter. Each thin dash on the catheter represents 10 cm, and the thick dash marks 50 cm. 36 No guideline specifies a numerical insertion depth. The catheter should be placed under visual guidance of pressure curves. 37 Although direct fluoroscopy guidance is an alternative, it is not available in most cases. 38 Recently, a formula has been established to estimate a safe insertion depth via the right internal jugular vein: height (cm)/2.35 − 23.5. 39 In this study, the catheter was deeper in non‐survival patients. However, it is not appropriate to conclude that a shallow placement is preferred, because deeper placement may result from poor haemodynamic status. Further research is needed to investigate this finding.
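The insertion‐depth formula quoted above can be written as a short calculation. The sketch below assumes that the dash in the quoted expression is a minus sign, that is, depth = height (cm)/2.35 − 23.5; the function name and the example height are illustrative. It only restates the arithmetic of the cited formula and is not a placement protocol.

```python
def estimated_pac_depth_cm(height_cm: float) -> float:
    """Estimated PA catheter insertion depth (cm) via the right internal jugular vein.

    Assumes the cited formula reads: height (cm) / 2.35 - 23.5.
    """
    if height_cm <= 0:
        raise ValueError("height must be positive")
    return height_cm / 2.35 - 23.5


# Illustrative height of 170 cm (hypothetical patient).
print(f"{estimated_pac_depth_cm(170.0):.1f} cm")  # 48.8 cm
```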

Model explanation

This study aimed to develop predictive models using machine learning techniques based on the MIMIC‐IV database to predict 28 day in‐hospital all‐cause mortality in HF patients in the ICU. Predictive modelling is a process of creating or selecting a model to predict an event's probability as accurately as possible. 40 The LDA model was selected for predicting prognosis, as it demonstrated the highest accuracy, recall, F1 score, kappa and MCC among the 15 models evaluated in this study.
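The metrics used to compare the models can all be derived from a 2 × 2 confusion matrix. The following minimal sketch uses the standard definitions of accuracy, precision, recall, F1 score, Cohen's kappa and MCC. The counts are hypothetical and are not taken from the study, and the function assumes that no denominator is zero.

```python
import math


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    chance_positive = ((tp + fp) / total) * ((tp + fn) / total)
    chance_negative = ((fn + tn) / total) * ((fp + tn) / total)
    expected = chance_positive + chance_negative
    kappa = (accuracy - expected) / (1 - expected)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts for a cohort of 200 patients.
for name, value in classification_metrics(tp=40, fp=10, fn=20, tn=130).items():
    print(f"{name}: {value:.3f}")
```

Because survivors outnumber non‐survivors in this kind of cohort, kappa and MCC are useful complements to accuracy, which can look high even for a model that rarely predicts death.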

To provide explanations for the LDA model's predictions, the SHAP method was used to generate local explanations using the classic Shapley values and related extensions. 41 Every patient in the model is allocated one dot for each feature, with red representing higher feature values and blue representing lower feature values. 42 The top 20 features associated with prognosis in the built model were listed. The features with the highest SHAP values were serum HCO3−, arterial PCO2 and lactic acid, indicating their significant contribution to predicting prognosis in HF patients in the ICU. These features point to acidosis, an acid–base imbalance that contributes to mortality. A web‐based, publicly accessible calculator based on the model was also developed.
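To illustrate what a SHAP value means for a model with a linear decision function, as is the case for binary LDA, the following minimal sketch uses the closed form that holds when features are treated as independent: the contribution of feature i is w_i × (x_i − mean_i). The weights, intercept, cohort means and patient values are hypothetical and are not taken from the study, and the sketch does not reproduce the SHAP library calls used by the authors.

```python
def linear_score(weights, intercept, x):
    """Linear decision score f(x) = intercept + sum_i w_i * x_i."""
    return intercept + sum(weights[name] * x[name] for name in weights)


def linear_shap_values(weights, background_means, x):
    """Shapley values of a linear score under feature independence:
    phi_i = w_i * (x_i - mean_i)."""
    return {
        name: weights[name] * (x[name] - background_means[name])
        for name in weights
    }


# Hypothetical weights and cohort means, NOT taken from the study.
weights = {"hco3_serum": -0.10, "pco2_arterial": 0.03, "lactic_acid": 0.40}
intercept = -1.5
background_means = {"hco3_serum": 24.0, "pco2_arterial": 40.0, "lactic_acid": 2.0}
patient = {"hco3_serum": 18.0, "pco2_arterial": 35.0, "lactic_acid": 5.0}

phi = linear_shap_values(weights, background_means, patient)
base_value = linear_score(weights, intercept, background_means)

# Additivity: the base value plus all contributions recovers the patient's score.
for name, value in sorted(phi.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {value:+.2f}")
print(f"base {base_value:.2f} + contributions = {base_value + sum(phi.values()):.2f}")
```

In a summary plot, each dot corresponds to one such per‐patient contribution, and features are ranked by the overall size of their contributions across patients.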

Visualization tool for clinical decision: nomogram

A nomogram was also generated because LR performed well and a visualization tool is often preferred in clinical practice. In the nomogram, a reference line is provided for reading the points of each feature (usually 0–100). The points are summed manually, after which the predicted probability can be read at the bottom. 43
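The point‐reading procedure described above can be sketched for a logistic regression model as follows. The coefficients are the natural logarithms of three odds ratios from Table 4, whereas the intercept, the axis ranges and the patient values are hypothetical, so the resulting probability is illustrative only. The scaling convention (the feature with the widest effect spans 0–100 points) is a common one and is assumed here; it is not confirmed to match the published nomogram.

```python
import math

# Coefficients are ln(odds ratio) from Table 4; the intercept and the axis
# ranges are hypothetical and chosen only for illustration.
COEFS = {
    "age": math.log(1.029),
    "lactic_acid": math.log(1.421),
    "abps": math.log(0.967),
}
RANGES = {"age": (20.0, 100.0), "lactic_acid": (0.0, 15.0), "abps": (60.0, 200.0)}
INTERCEPT = -1.0  # hypothetical


def zero_point_value(name):
    """Feature value that scores 0 points (the lowest-risk end of its axis)."""
    low, high = RANGES[name]
    return low if COEFS[name] >= 0 else high


# Log-odds represented by one point: the widest feature axis spans 0-100 points.
LOG_ODDS_PER_POINT = max(
    abs(COEFS[n]) * (RANGES[n][1] - RANGES[n][0]) for n in COEFS
) / 100.0


def feature_points(name, value):
    """Points read from the reference line for one feature value."""
    return COEFS[name] * (value - zero_point_value(name)) / LOG_ODDS_PER_POINT


def probability_from_points(total_points):
    """Map the summed points back to a predicted probability."""
    baseline = INTERCEPT + sum(COEFS[n] * zero_point_value(n) for n in COEFS)
    linear_predictor = baseline + total_points * LOG_ODDS_PER_POINT
    return 1.0 / (1.0 + math.exp(-linear_predictor))


patient = {"age": 70.0, "lactic_acid": 3.0, "abps": 110.0}  # hypothetical
total = sum(feature_points(n, v) for n, v in patient.items())
print(f"total points: {total:.1f}")
print(f"predicted probability: {probability_from_points(total):.3f}")
```

Summing points and reading the probability scale is therefore equivalent to evaluating the logistic regression equation directly, which is what the web‐based dynamic nomogram does automatically.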

The nomogram developed in this study included 14 features selected from multivariate LR, including age, length of stay, ABPs, ALT, BUN, lymphocyte (differential), fibrinogen, serum HCO3−, HR, LDH, lactic acid, arterial PO2, PTT and WBC. These features can be easily obtained in clinical settings as they are routinely measured and examined because of their significance, making the prediction accessible. A dynamic nomogram, which is also a web‐based, publicly accessible calculator, was set up to predict the 28 day all‐cause mortality of HF patients in the ICU in real‐life usage.

Our study has certain limitations. Both models require complete data input, making accurate predictions challenging when data are missing. Therefore, any missing data must be preprocessed before usage. Additionally, given the continuous medical processes involved in treating ICU patients, the models were constructed based on the mean values of all recorded measurements when measurements were taken multiple times. As a result, users must calculate these mean values before entering data into the models.
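The preprocessing implied by these limitations can be sketched as follows: repeated measurements of a feature during one ICU stay are averaged, and any required feature without a usable value is reported so that it can be handled before the calculators are used. The feature names, record layout and values are hypothetical, and how missing values should be filled is left open because the text above does not specify it.

```python
from statistics import mean


def aggregate_stay(measurements, required_features):
    """Average repeated measurements and list required features that are missing.

    `measurements` is a list of (feature, value) pairs recorded during one ICU
    stay; a value of None marks a measurement that was not recorded.
    """
    values_by_feature = {}
    for feature, value in measurements:
        if value is None:
            continue
        values_by_feature.setdefault(feature, []).append(value)
    means = {feature: mean(values) for feature, values in values_by_feature.items()}
    missing = [feature for feature in required_features if feature not in means]
    return means, missing


# Hypothetical records for one stay.
records = [
    ("hr", 80),
    ("hr", 100),
    ("lactic_acid", 2.0),
    ("lactic_acid", None),
]
means, missing = aggregate_stay(records, ["hr", "lactic_acid", "bun"])
print(means)    # {'hr': 90, 'lactic_acid': 2.0}
print(missing)  # ['bun']
```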

Conclusions

In conclusion, this study demonstrates the potential of machine learning in predicting 28 day all‐cause mortality for HF patients in the ICU. Unlike traditional statistical analysis, which relies on explicit assumptions, machine learning approaches can be less dependent on such assumptions, which may result in higher accuracy. The model used in this study includes numerous clinical features, making the predictions practical and easily accessible. 44 The SHAP analysis supports the model's explainability, and a nomogram using LR provides a visualization tool for predicting prognosis. Additionally, two web‐based, publicly accessible calculators are provided, based on the LDA model and the dynamic nomogram, respectively. However, the models do not include all features that may benefit clinical decision‐making, such as echocardiography or other clinical scoring systems.

Overall, the findings of this study contribute to the clinical prediction of outcomes for HF patients in the ICU. The web‐based, publicly accessible calculators may assist clinicians in predicting 28 day all‐cause mortality for HF patients in the ICU, which may ultimately help improve patient outcomes. Future studies may focus on expanding the features to further improve applicability.

Conflict of interest statement

The authors declare no conflict of interest.

Funding

This work was supported by grants from the Beijing Natural Science Foundation (Grant Number 7222143), the Chinese Academy of Medical Sciences Initiative for Innovative Medicine (CAMS Innovation Fund for Medical Science) (Grant Number 2020‐I2M‐1‐002) and the High‐level Hospital Clinical Research Fund of Fuwai Hospital, Chinese Academy of Medical Sciences (Grant Number 2022‐GSP‐GG‐9).

Supporting information

Figure S1. Patients selection flowchart of eICU database.

EHF2-12-353-s002.png (47.6KB, png)

Table S1. Unit of measurement in MIMIC‐IV and eICU database.

EHF2-12-353-s001.docx (58.6KB, docx)

Chen, A.‐T., Zhang, Y., and Zhang, J. (2025) Explainable machine learning and online calculators to predict heart failure mortality in intensive care units. ESC Heart Failure, 12: 353–368. doi: 10.1002/ehf2.15062.

Contributor Information

An‐Tian Chen, Email: chenantian1@163.com.

Yuhui Zhang, Email: yuhuizhangjoy@163.com.

Jian Zhang, Email: fwzhangjian62@126.com.

References

  • 1. Roger VL. Epidemiology of heart failure: A contemporary perspective. Circ Res 2021;128:1421‐1434. doi: 10.1161/CIRCRESAHA.121.318172
  • 2. Groenewegen A, Rutten FH, Mosterd A, Hoes AW. Epidemiology of heart failure. Eur J Heart Fail 2020;22:1342‐1356. doi: 10.1002/ejhf.1858
  • 3. Paulose‐Ram R, Graber JE, Woodwell D, Ahluwalia N. The National Health and Nutrition Examination Survey (NHANES), 2021–2022: Adapting data collection in a COVID‐19 environment. Am J Public Health 2021;111:2149‐2156. doi: 10.2105/AJPH.2021.306517
  • 4. Benjamin EJ, Virani SS, Callaway CW, Chamberlain AM, Chang AR, Cheng S, et al. Heart disease and stroke statistics—2018 update: A report from the American Heart Association. Circulation 2018;137: doi: 10.1161/CIR.0000000000000558
  • 5. Li W, Gao H, Zhao X, Wang Y, Yang J, Zheng H, et al. Associations of heart failure onset age with all‐cause mortality: The Kailuan study. CVIA 2024;9: doi: 10.3389/fcvm.2021.706999
  • 6. Diwan A, Hill JA. 1—Molecular basis of heart failure. In: Felker GM, Mann DL, eds. Heart Failure: A Companion to Braunwald's Heart Disease (Fourth Edition). Philadelphia: Elsevier; 2020:1‐27.e3. doi: 10.1016/B978-1-4160-5895-3.10002-6
  • 7. Unal AU, Kostek O, Takir M, Caklili O, Uzunlulu M, Oguz A. Prognosis of patients in a medical intensive care unit. North Clin Istanb 2015;2:189‐195. doi: 10.14744/nci.2015.79188
  • 8. Deo RC. Machine learning in medicine. Circulation 2015;132:1920‐1930. doi: 10.1161/CIRCULATIONAHA.115.001593
  • 9. Heo J, Yoon JG, Park H, Kim YD, Nam HS, Heo JH. Machine learning‐based model for prediction of outcomes in acute stroke. Stroke 2019;50:1263‐1265. doi: 10.1161/STROKEAHA.118.024293
  • 10. Kate RJ, Pearce N, Mazumdar D, Nilakantan V. A continual prediction model for inpatient acute kidney injury. Comput Biol Med 2020;116:103580. doi: 10.1016/j.compbiomed.2019.103580
  • 11. Lee YW, Choi JW, Shin EH. Machine learning model for predicting malaria using clinical information. Comput Biol Med 2021;129:104151. doi: 10.1016/j.compbiomed.2020.104151
  • 12. Liu Z, Huang Y, Li H, Li W, Zhang F, Ouyang W, et al. A generalized deep learning model for heart failure diagnosis using dynamic and static ultrasound. J Transl Int Med 2023;11:138‐144. doi: 10.2478/jtim-2023-0088
  • 13. Johnson A, Bulgarelli L, Pollard T, Horng S, Celi LA, Roger M. MIMIC‐IV (version 2.0). PhysioNet 2022; doi: 10.13026/7vcr-e114
  • 14. Johnson A, Pollard T, Badawi O, Raffa J. eICU Collaborative Research Database demo (version 2.0.1). PhysioNet 2021; doi: 10.13026/4mxk-na84
  • 15. Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PC, Mark RG, et al. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation 2000;101:e215‐e220. doi: 10.1161/01.cir.101.23.e215
  • 16. Pollard TJ, Johnson AEW, Raffa JD, Celi LA, Mark RG, Badawi O. The eICU Collaborative Research Database, a freely available multi‐center database for critical care research. Sci Data 2018;5:180178. doi: 10.1038/sdata.2018.178
  • 17. scipy.stats.normaltest. 2022. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.normaltest.html. Accessed 18 November 2022
  • 18. Peng S, Huang J, Liu X, Deng J, Sun C, Tang J, et al. Interpretable machine learning for 28‐day all‐cause in‐hospital mortality prediction in critically ill patients with heart failure combined with hypertension: A retrospective cohort study based on medical information mart for intensive care database‐IV and eICU databases. Front Cardiovasc Med 2022;9: doi: 10.3389/fcvm.2022.994359
  • 19. Deshmukh F, Merchant SS. Explainable machine learning model for predicting GI bleed mortality in the intensive care unit. Am J Gastroenterol 2020;115:1657‐1668. doi: 10.14309/ajg.0000000000000632
  • 20. Li R, Chen Y, Liang Q, Zhou S, An S. Lower serum chloride concentrations are associated with an increased risk of death in ICU patients with acute kidney injury: An analysis of the MIMIC‐IV database. Minerva Anestesiol 2023;89:166‐174. doi: 10.23736/S0375-9393.22.16686-1
  • 21. Liu Q, Zheng HL, Wu MM, Wang QZ, Yan SJ, Wang M, et al. Association between lactate‐to‐albumin ratio and 28‐days all‐cause mortality in patients with acute pancreatitis: A retrospective analysis of the MIMIC‐IV database. Front Immunol 2022;13:1076121. doi: 10.3389/fimmu.2022.1076121
  • 22. Myers J, Gujja P, Neelagaru S, Hsu L, Vittorio T, Jackson‐Nelson T, et al. End‐tidal CO2 pressure and cardiac performance during exercise in heart failure. Med Sci Sports Exerc 2009;41:19‐25. doi: 10.1249/MSS.0b013e318184c945
  • 23. Ezekowitz JA, Zheng Y, Cohen‐Solal A, Melenovský V, Escobedo J, Butler J, et al. Hemoglobin and clinical outcomes in the Vericiguat Global Study in Patients With Heart Failure and Reduced Ejection Fraction (VICTORIA). Circulation 2021;144:1489‐1499. doi: 10.1161/CIRCULATIONAHA.121.056797
  • 24. Groenveld HF, Januzzi JL, Damman K, van Wijngaarden J, Hillege HL, van Veldhuisen DJ, et al. Anemia and mortality in heart failure patients: A systematic review and meta‐analysis. J Am Coll Cardiol 2008;52:818‐827. doi: 10.1016/j.jacc.2008.04.061
  • 25. Desai AS. Hemoglobin concentration in acute decompensated heart failure: A marker of volume status? J Am Coll Cardiol 2013;61:1982‐1984. doi: 10.1016/j.jacc.2013.02.021
  • 26. Park JJ, Choi DJ, Yoon CH, Oh IY, Lee JH, Ahn S, et al. The prognostic value of arterial blood gas analysis in high‐risk acute heart failure patients: An analysis of the Korean Heart Failure (KorHF) registry. Eur J Heart Fail 2015;17:601‐611. doi: 10.1002/ejhf.276
  • 27. Tang Y, Lin W, Zha L, Zeng X, Zeng X, Li G, et al. Serum anion gap is associated with all‐cause mortality among critically ill patients with congestive heart failure. Dis Markers 2020;2020:8833637. doi: 10.1155/2020/8833637
  • 28. Kho C, Lee A, Hajjar RJ. Altered sarcoplasmic reticulum calcium cycling—Targets for heart failure therapy. Nat Rev Cardiol 2012;9:717‐733. doi: 10.1038/nrcardio.2012.145
  • 29. Marks AR. Calcium cycling proteins and heart failure: Mechanisms and therapeutics. J Clin Invest 2013;123:46‐52. doi: 10.1172/JCI62834
  • 30. Teerlink JR, Diaz R, Felker GM, McMurray J, Metra M, Solomon SD, et al. Cardiac myosin activation with omecamtiv mecarbil in systolic heart failure. N Engl J Med 2021;384:105‐116. doi: 10.1056/NEJMoa2025797
  • 31. Zandijk AJL, van Norel MR, Julius FEC, Sepehrvand N, Pannu N, McAlister FA, et al. Chloride in heart failure: The neglected electrolyte. JACC Heart Fail 2021;9:904‐915. doi: 10.1016/j.jchf.2021.07.006
  • 32. Yang X, Su G, Zhang T, Yang H, Tao H, du X, et al. Comparison of admission glycemic variability and glycosylated hemoglobin in predicting major adverse cardiac events among type 2 diabetes patients with heart failure following acute ST‐segment elevation myocardial infarction. J Transl Internal Med 2024;12:188‐196. doi: 10.2478/jtim-2024-0006
  • 33. Li Y, Yao Z, Li Y, Yang Z, Li M, Chen Z, et al. Prognostic value of serum ammonia in critical patients with non‐hepatic disease: A prospective, observational, multicenter study. J Transl Int Med 2023;11:401‐409. doi: 10.2478/jtim-2022-0021
  • 34. Maeda D, Sakane K, Kanzaki Y, Okuno T, Nomura H, Hourai R, et al. Relation of aspartate aminotransferase to alanine aminotransferase ratio to nutritional status and prognosis in patients with acute heart failure. Am J Cardiol 2021;139:64‐70. doi: 10.1016/j.amjcard.2020.10.036
  • 35. Saito Y, Okumura Y, Nagashima K, Fukamachi D, Yokoyama K, Matsumoto N, et al. Low alanine aminotransferase levels are independently associated with mortality risk in patients with atrial fibrillation. Sci Rep 2022;12:12183. doi: 10.1038/s41598-022-16435-5
  • 36. Rishi B. Pulmonary artery catheter/Swan‐Ganz structure, waveforms, and interpreting the numbers. 2017. https://rk.md/2017/pulmonary‐artery‐catheter‐structure‐waveforms/. Accessed 2 December 2022
  • 37. Shah MR, Hasselblad V, Stevenson LW, Binanay C, O'Connor CM, Sopko G, et al. Impact of the pulmonary artery catheter in critically ill patients: Meta‐analysis of randomized clinical trials. JAMA 2005;294:1664‐1670. doi: 10.1001/jama.294.13.1664
  • 38. Wiener RS, Welch HG. Trends in the use of the pulmonary artery catheter in the United States, 1993–2004. JAMA 2007;298:423‐429. doi: 10.1001/jama.298.4.423
  • 39. Walz R, Roth S, Hollmann MW, Huhn R. Formula for safe insertion depth of a pulmonary artery catheter. Br J Anaesth 2021;127:e25‐e27. doi: 10.1016/j.bja.2021.04.012
  • 40. Malek S, Hui C, Aziida N, Cheen S, Toh S, Milow P. Ecosystem monitoring through predictive modeling. 2019.
  • 41. SHAP documentation. https://shap.readthedocs.io/en/latest/index.html. Accessed 12 April 2022
  • 42. Tseng PY, Chen YT, Wang CH, Chiu KM, Peng YS, Hsu SP, et al. Prediction of the development of acute kidney injury following cardiac surgery by machine learning. Crit Care 2020;24:478. doi: 10.1186/s13054-020-03179-9
  • 43. nomogram: Draw a nomogram representing a regression fit. https://www.rdocumentation.org/packages/rms/versions/6.3‐0/topics/nomogram. Accessed 12 April 2022
  • 44. Johnson KW, Torres Soto J, Glicksberg BS, Shameer K, Miotto R, Ali M, et al. Artificial intelligence in cardiology. J Am Coll Cardiol 2018;71:2668‐2679. doi: 10.1016/j.jacc.2018.03.521

Articles from ESC Heart Failure are provided here courtesy of Oxford University Press
