Lupus Science & Medicine. 2025 Feb 13;12(1):e001397. doi: 10.1136/lupus-2024-001397

Prediction of mortality risk in critically ill patients with systemic lupus erythematosus: a machine learning approach using the MIMIC-IV database

Zhihan Chen 1,2,3,0,0, Yunfeng Dai 1,2,3,0,0, Yilin Chen 4,0,0, Han Chen 2,3,5,6, Huiping Wu 4,7,8,✉,1, Li Zhang 2,3,9,*,1
PMCID: PMC11831314  PMID: 39947742

Abstract

Objective

Early prediction of long-term outcomes in patients with systemic lupus erythematosus (SLE) remains a great challenge in clinical practice. Our study aims to develop and validate predictive models for the mortality risk.

Methods

This observational study identified patients with SLE requiring hospital admission from the Medical Information Mart for Intensive Care (MIMIC-IV) database. We downloaded data from Fujian Provincial Hospital as an external validation set. Variable selection was performed using the Least Absolute Shrinkage and Selection Operator (LASSO) regression. Then, we constructed two predictive models: a traditional nomogram based on logistic regression and a machine learning model employing a stacking ensemble approach. The predictive ability of the models was evaluated by the areas under the receiver operating characteristic curve (AUC) and the calibration curve.

Results

A total of 395 patients from the MIMIC-IV database and 100 patients from the validation cohort were enrolled. The LASSO regression identified 18 significant variables. Both models demonstrated good discrimination, with AUCs above 0.8. The machine learning model outperformed the nomogram in terms of precision and specificity, highlighting its potential superiority in risk prediction. The SHapley additive explanations analysis further elucidated the contribution of each variable to the model’s predictions, emphasising the importance of factors such as urine output, age, weight and alanine aminotransferase.

Conclusions

The machine learning model provides a superior tool for predicting mortality risk in patients with SLE, offering a basis for clinical decision-making and potential improvements in patient outcomes.

Keywords: Systemic Lupus Erythematosus, Risk Factors, Mortality


WHAT IS ALREADY KNOWN ON THIS TOPIC

  • Developing an effective mortality risk prediction model is crucial for enhancing the prognosis of patients with systemic lupus erythematosus (SLE).

WHAT THIS STUDY ADDS

  • We constructed two predictive models: a traditional nomogram based on logistic regression and a machine learning model employing a stacking ensemble approach. By comparison, the machine learning model was found to be superior in terms of accuracy and specificity.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

  • We identified the ensemble learning model based on SMOTEENN-SVM (Synthetic Minority Over-Sampling Technique and Edited Nearest Neighbors - Support Vector Machine) and multilayer perceptron as the effective model for predicting mortality risk in SLE.

Introduction

Systemic lupus erythematosus (SLE) is a pervasive chronic autoimmune disorder characterised by a spectrum of clinical presentations, affecting women of childbearing age at a rate of 20 to 150 cases per 100 000 population.1 The chief causes of mortality in SLE are atherosclerotic complications, malignancy and infection, with active disease contributing to a lesser extent.2 While systemic therapy with corticosteroids and immunosuppressants has been shown to enhance the prognosis for many patients with SLE, treatment outcomes can be highly variable.3 A subset of patients with severe SLE may require intensive care unit intervention, yet they still face a mortality rate two to five times higher than those with milder forms of the disease.4 Therefore, developing an effective mortality risk prediction model is crucial for enhancing the prognosis of patients with SLE.

A nomogram, akin to an ancient slide rule, is a graphical calculator that illustrates the outcomes of logistic or Cox regression models.5 This tool is extensively applied across various medical disciplines for epidemiological purposes, such as diagnosing diseases, assessing patient prognoses and predicting the likelihood of disease recurrence.6,8 Its visual approach simplifies the interpretation of complex statistical data, enhancing its utility in clinical practice. Furthermore, machine learning (ML) techniques are emerging as robust and dependable tools for outcome assessment. Compared with traditional statistical modelling methods, ML methods are capable of processing a larger number of variables and tend to output more accurate and precise results.9 A variety of ML models, encompassing hybrid, ensemble, and deep learning approaches, have been increasingly integrated into the medical field.10 This technological advancement is significantly enhancing the predictive capabilities and decision-making processes in medical practice.

Both traditional and ML models may serve as useful tools for improving prognostic judgement by clinicians. Thus, this study is designed to investigate the independent risk factors associated with all-cause mortality in patients with SLE and to construct predictive models using a combination of these methodologies. By leveraging the extensive dataset from the Medical Information Mart for Intensive Care IV (MIMIC-IV), our analysis has successfully pinpointed high-risk patients, thereby providing a more detailed and nuanced understanding of SLE.

Materials and methods

Data sources and study design

This retrospective study used data sourced from the MIMIC-IV database (V.2.2), a widely recognised repository developed and curated by the MIT Computational Physiology Laboratory. This database is comprised of extensive and high-quality medical records of patients who were admitted to the intensive critical care units of the Beth Israel Deaconess Medical Center.11 Consent was obtained when the database was established and the original data were collected. Therefore, the Institutional Review Board of Fujian Provincial Hospital waived the informed consent for the utilisation of the database. Dr. Han Chen has been authorised to extract the data from the MIMIC database (certification number: 53297811). The study was designed and conducted under the Declaration of Helsinki.

The study population within the MIMIC-IV database comprised 410 individuals, aged 18 years and above, diagnosed with SLE and admitted to the hospital for the first time. To maintain the integrity of the analysis, patients were excluded if they had a length of stay of less than 24 hours or had incomplete follow-up data. This selection process resulted in an analytical cohort of 395 patients. Based on the same inclusion and exclusion criteria, we collected a cohort of 100 critically ill patients with SLE admitted to Fujian Provincial Hospital from January 2023 to October 2024, which served as an external validation set. These patients were subsequently categorised into a survival group and a non-survival group. The flow diagram of the study is detailed in figure 1.

Figure 1. Overall flowchart of this study. LASSO, Least Absolute Shrinkage and Selection Operator; LightGBM, Light Gradient Boosting Machine; MIMIC-IV, Medical Information Mart for Intensive Care IV; MLP, multilayer perceptron; RF, Random Forest; SLE, systemic lupus erythematosus; SVM, Support Vector Machine; XGBoost, eXtreme Gradient Boosting.

Data collection

PostgreSQL (V.13.7.2) and Navicat Premium (V.16) were used to extract the data with Structured Query Language (SQL) queries. The candidate variables fell into five main groups: (1) demographics, such as age, gender, race, height and weight; (2) comorbidities, including diabetes, hypertension, coronary heart disease, chronic obstructive pulmonary disease (COPD), heart failure, chronic kidney disease, dialysis and cerebrovascular disease; (3) laboratory indicators, including white blood cells, neutrophils, lymphocytes, monocytes, red blood cells, haemoglobin, platelet, albumin, globulin, total bilirubin, indirect bilirubin, alanine aminotransferase (ALT), aspartate aminotransferase, lactate dehydrogenase, total cholesterol, high-density lipoprotein, low-density lipoprotein, triglycerides, creatine kinase, creatine kinase-myocardial band, serum creatinine, blood urea nitrogen (BUN), uric acid, potassium, sodium, iron, ferritin, transferrin, glucose and prothrombin time (PT); (4) vital signs, including heart rate (HR), systolic blood pressure (SBP), diastolic blood pressure, mean blood pressure, temperature, oxygen saturation (SpO2) and urine output; and (5) severity of illness scores at admission, including the Sepsis-related Organ Failure Assessment Score (SOFA), the Glasgow Coma Scale Score, the Simplified Acute Physiology Score II (SAPSII), the Model for End-stage Liver Disease (MELD), the Acute Physiology Score III (APSIII) and Systemic Inflammatory Response Syndrome (SIRS).

To avoid possible bias, variables were excluded if they had more than 20% missing values. Variables with missing data <20% were processed by multiple imputation using a Random Forest (RF) algorithm (trained by other non-missing variables) by the ‘mice’ package of R software.12
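The 20% missingness rule can be illustrated with a small sketch. The records, variable names and the helper `drop_sparse_variables` are hypothetical; the study carried out this step and the subsequent Random Forest imputation in R with the ‘mice’ package, and imputation is not shown here.

```python
def missing_fraction(records, variable):
    """Fraction of records in which `variable` is absent or None."""
    missing = sum(1 for r in records if r.get(variable) is None)
    return missing / len(records)


def drop_sparse_variables(records, variables, threshold=0.20):
    """Keep variables whose missing fraction does not exceed `threshold`."""
    return [v for v in variables if missing_fraction(records, v) <= threshold]


# Hypothetical toy records: ferritin is missing in 3 of 5 patients (60%),
# albumin in 1 of 5 (20%, which is not "more than 20%" and is therefore kept).
records = [
    {"age": 53, "albumin": 3.1, "ferritin": None},
    {"age": 65, "albumin": None, "ferritin": None},
    {"age": 40, "albumin": 2.6, "ferritin": 210.0},
    {"age": 72, "albumin": 3.4, "ferritin": None},
    {"age": 58, "albumin": 3.0, "ferritin": 95.0},
]

kept = drop_sparse_variables(records, ["age", "albumin", "ferritin"])
print(kept)
```

Variables that survive this filter would then be passed to the imputation step.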

Clinical outcomes

The primary outcome was all-cause mortality at 1 year after admission. Survival data were extracted from the MIMIC-IV database.

Feature selection

Least Absolute Shrinkage and Selection Operator (LASSO) regression is a regression method for high-dimensional predictor variables that retains valuable variables, estimates parameters simultaneously and avoids overfitting.13 Guided by clinical significance and corroborated by the relevant literature, we incorporated candidate variables into the LASSO regression model, with patient outcome as the dependent variable. The model was constructed using the R package ‘glmnet’ with fivefold cross-validation. For the ‘family’ parameter, we specified ‘binomial’, and the optimal lambda value was determined by the ‘lambda.1se’ criterion (the 1-SE rule), that is, the largest value of lambda whose cross-validated error lies within one SE of the minimum cross-validated error. We plotted the LASSO coefficient profiles and the cross-validated deviance curve against the logarithm of lambda, and identified the optimal lambda value on these plots.
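The 1-SE rule can be sketched in a few lines. The lambda grid, the cross-validated errors and their SEs below are hypothetical numbers, and `select_lambda_1se` is an illustrative helper rather than part of ‘glmnet’; it only mirrors how the ‘lambda.min’ and ‘lambda.1se’ values are defined.

```python
def select_lambda_1se(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation results.

    lambda_min minimises the mean cross-validated error; lambda_1se is the
    largest lambda whose mean error is within one SE of that minimum.
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return lambdas[i_min], max(eligible)


# Hypothetical cross-validation output (mean deviance and its SE per lambda).
lambdas = [0.001, 0.005, 0.01, 0.02, 0.05, 0.1]
cv_mean = [0.92, 0.88, 0.86, 0.87, 0.91, 0.99]
cv_se = [0.04, 0.04, 0.03, 0.03, 0.04, 0.05]

lambda_min, lambda_1se = select_lambda_1se(lambdas, cv_mean, cv_se)
print(lambda_min, lambda_1se)
```

Because a larger lambda applies a stronger penalty, the 1-SE choice yields a sparser model than the error-minimising lambda at a small cost in cross-validated error.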

Statistical analysis

Categorical variables were described using frequencies (n) and percentages (%). Continuous variables were tested for normality using the Shapiro-Wilk test. Data conforming to a normal distribution were described using the mean and SD, while non-normally distributed data were presented by the median and IQR (25th percentile, 75th percentile). χ2 tests for independence were employed to compare categorical variables between groups. Independent samples t-tests were used to analyse differences in normally distributed continuous variables between two groups, while the Mann-Whitney U test was used for non-normally distributed continuous variables. One-way analysis of variance was applied to assess differences in normally distributed continuous variables across multiple groups, with post hoc pairwise comparisons conducted using the Student-Newman-Keuls method when significant differences were observed. The Kruskal-Wallis H test was used to compare non-normally distributed continuous variables across groups, followed by post hoc pairwise comparisons using the Bonferroni method when applicable.

Nomogram prediction model

The feature variables identified through LASSO regression analysis were integrated into the model to construct a logistic regression-based nomogram model, aimed at predicting the mortality rate of patients with SLE. The nomogram was drawn with the ‘rms’ package to improve the operability and practicability of the risk model. The model’s discriminatory power was assessed utilising the area under the curve (AUC) of the receiver operating characteristic (ROC) curve and the Concordance Index, while its calibration performance was evaluated through the calibration curve analysis.
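For a binary outcome, the Concordance Index and the AUC are the same quantity: the proportion of (event, non-event) pairs in which the event case receives the higher predicted risk. A minimal sketch of that pairwise definition follows; the scores and labels are hypothetical, and the function is an illustration rather than the routine used in the study.

```python
def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a concordant pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical predicted risks for three deaths (1) and three survivors (0).
scores = [0.90, 0.80, 0.35, 0.60, 0.20, 0.10]
labels = [1, 1, 1, 0, 0, 0]

print(round(auc(scores, labels), 4))
```

Eight of the nine pairs are ordered correctly here, so the AUC is 8/9.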

Machine learning model

In this study, we employed two distinct sampling strategies for the construction of binary classification models: one without data augmentation and the other utilising SMOTEENN. SMOTEENN, an advanced hybrid of the SMOTE and ENN algorithms, was specifically designed to address class imbalance by generating synthetic samples for the minority class—representing death cases—with subsequent refinement to eliminate potential mislabelling.14 These strategies were employed to enhance the overall model performance.
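The two stages of SMOTEENN can be sketched on toy data. This is a simplified illustration under stated assumptions, not the implementation used in the study: the two-dimensional points are hypothetical, the oversampling step interpolates between a minority point and one of its nearest minority neighbours, and the cleaning step uses a plain majority vote among the k nearest neighbours.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, pool, k):
    """Indices of the k points in `pool` closest to `point`."""
    order = sorted(range(len(pool)), key=lambda i: dist2(point, pool[i]))
    return order[:k]


def smote(minority, n_new, k=2, rng=None):
    """Create n_new synthetic points on segments between minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbour = others[rng.choice(nearest(base, others, k))]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def enn(X, y, k=3):
    """Drop samples whose label disagrees with most of their k neighbours."""
    keep_X, keep_y = [], []
    for i, (xi, yi) in enumerate(zip(X, y)):
        rest = sorted((j for j in range(len(X)) if j != i),
                      key=lambda j: dist2(xi, X[j]))
        votes = [y[j] for j in rest[:k]]
        if votes.count(yi) * 2 > len(votes):
            keep_X.append(xi)
            keep_y.append(yi)
    return keep_X, keep_y


# Hypothetical data: 6 survivors (label 0), one of which lies inside the
# cluster of the 3 deaths (label 1).
survivors = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3), (1.1, 1.1), (0.8, 0.9), (3.0, 3.1)]
deaths = [(3.0, 3.0), (3.2, 2.9), (2.9, 3.3)]

new_deaths = smote(deaths, n_new=len(survivors) - len(deaths))
X = survivors + deaths + new_deaths
y = [0] * len(survivors) + [1] * (len(deaths) + len(new_deaths))
X_clean, y_clean = enn(X, y)
print(len(X), len(X_clean), y_clean.count(0), y_clean.count(1))
```

In this toy run the oversampling step balances the classes at six each, and the cleaning step then removes the single survivor that sits among the death cases, which is the kind of ambiguous sample the ENN stage targets.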

The dataset was meticulously partitioned into a training subset for model development and a testing subset for evaluation, adhering to an 8:2 ratio to ensure balanced representation of both positive and negative samples. The datasets, both in their original form and after undergoing SMOTEENN resampling, were analysed using five sophisticated ML algorithms: eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Naive Bayes, RF and Support Vector Machine (SVM). To refine the predictive accuracy, a comprehensive grid search approach was employed for hyperparameter optimisation across all models. The classifiers’ generalisation efficacy was meticulously evaluated using a suite of performance metrics, including AUC, precision, recall, F1 score, specificity and accuracy.
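The threshold-based metrics listed above all derive from the four cells of a confusion matrix, and the sketch below shows the arithmetic. The counts are illustrative and are not reported in the article: they were chosen for a 79-patient test set (20% of 395) because they are arithmetically consistent with the SMOTEENN-SVM row of table 3. The function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the threshold-based metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Illustrative counts (an assumption, not taken from the article).
metrics = classification_metrics(tp=52, fp=3, fn=10, tn=14)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With an imbalanced test set, accuracy and recall can stay high while specificity is low, which is why the comparison in table 3 reports specificity alongside the other metrics.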

Stacking ensemble model

Stacking ensemble learning is an ensemble learning method that trains a new model based on the combined predictions of two (or more) previous ML models. Stacking ensemble learning often performs better than individual ML techniques.15 In the context of this study, the primary layer of our stacking algorithm leverages the models with superior predictive performance as identified in the preceding analysis. For the meta-learner, we experiment with both traditional logistic regression and multilayer perceptron (MLP) to explore potential enhancements in model performance and to assess the risk of overfitting.
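The second stage of stacking, in which a meta-learner is trained on the predictions of the first-layer models, can be sketched as follows. This is a minimal illustration under stated assumptions: the base-model probabilities are hypothetical numbers standing in for held-out predictions from two first-layer models, and the logistic regression meta-learner is fitted by plain gradient descent rather than by the software used in the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_logistic(base_preds, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression meta-learner on base-model probabilities."""
    n_features = len(base_preds[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, y in zip(base_preds, labels):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
            for j, x in enumerate(row):
                grad_w[j] += (p - y) * x   # gradient of the log-loss
            grad_b += p - y
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_meta(weights, bias, row):
    """Final stacked probability for one patient."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


# Hypothetical held-out probabilities from two base models, one row per patient.
base_preds = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.9), (0.6, 0.7),
              (0.3, 0.4), (0.2, 0.1), (0.4, 0.2), (0.1, 0.3)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_meta_logistic(base_preds, labels)
stacked = [predict_meta(weights, bias, row) for row in base_preds]
print([round(p, 2) for p in stacked])
```

Training the meta-learner on held-out rather than in-sample base predictions limits the overfitting risk mentioned above; swapping the logistic regression for an MLP changes only the second-stage model.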

SHapley Additive Explanations analysis

SHapley Additive Explanations (SHAP) is an explainable artificial intelligence technique that helps clinicians understand the results of ML models.16 We integrate the optimal predictive outcomes with the SHAP model to offer a nuanced interpretation of sample predictions based on feature importance. The SHAP approach assigns a global importance to each feature, conceptualised as the average absolute value of the feature’s contribution across all samples. Consequently, the higher the importance assigned to a feature, the greater its influence on the model’s predictions.
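The global importance described above is the mean absolute SHAP value per feature. The sketch below shows only that aggregation step; the per-patient SHAP values are hypothetical numbers, and computing them from a fitted model is outside the scope of the example.

```python
def global_importance(shap_values, feature_names):
    """Mean absolute SHAP value per feature, most important first."""
    n = len(shap_values)
    means = {
        name: sum(abs(row[j]) for row in shap_values) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: one row per patient, one column per feature.
feature_names = ["urine_output", "age", "weight"]
shap_values = [
    [-0.30, 0.10, 0.05],
    [0.25, -0.20, 0.02],
    [-0.35, 0.15, -0.08],
    [0.30, 0.15, 0.05],
]

ranking = global_importance(shap_values, feature_names)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Taking absolute values means a feature that pushes predictions strongly in either direction ranks highly; the sign of each individual contribution is discarded in the global ranking.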

Statistical significance was set at α=0.05, with p<0.05 indicating a significant result. All statistical analyses were performed by the R software (V.4.1.2) and Statistical Product and Service Solutions (SPSS) 27.0.

Results

Baseline characteristics

A total of 395 patients with SLE from the MIMIC-IV database met the predefined inclusion and exclusion criteria. The 360-day mortality was 21.0%. Gender distribution included 81 men (20.5%) and 314 women (79.5%). Table 1 highlights significant baseline differences between survivors and non-survivors among patients with SLE. Non-survivors tended to be older, with a median age of 65 compared with 53 years for survivors, indicating that age significantly impacts the prognosis of patients with SLE. Renal function indicators such as creatinine and BUN were also significantly elevated in non-survivors, implying that renal insufficiency may be associated with a poorer prognosis. Additionally, lower albumin levels in non-survivors may reflect malnutrition or an inflammatory state. Notably, the SOFA, SAPSII, APSIII and SIRS scores were significantly higher in the non-survivor group compared with the survivor group.

Table 1. Baseline characteristics of the survivors and non-survivors groups.

Characteristics Survivor Non-survivor P value*
n 312 83
Age, years, median (IQR) 53 (40, 65.25) 65 (52.5, 72.5) < 0.001
Sex, n (%) 0.356
 Female 245 (62%) 69 (17.5%)
 Male 67 (17%) 14 (3.5%)
Weight, kg, median (IQR) 78.15 (63.8, 95.55) 74 (59.25, 89.7) 0.127
Neutrophils, %, median (IQR) 80 (72.9, 86.7) 82.7 (72.2, 87.55) 0.281
Lymphocytes, %, median (IQR) 12.1 (6.5, 17.5) 8.4 (5.6, 11.9) 0.002
Monocytes, %, median (IQR) 4.7 (3, 6.7) 4.9 (3.35, 8) 0.217
RBC, m/uL, mean±SD 3.6442±0.75726 3.4248±0.76528 0.020
Haemoglobin, g/dL, mean±SD 10.632±2.166 10.18±2.221 0.093
Platelet, K/uL, median (IQR) 201 (143, 270.25) 189 (117, 289.5) 0.526
Albumin, g/dL, median (IQR) 3.1 (2.6, 3.7) 3 (2.3, 3.4) 0.003
Globulin, mg/dL, median (IQR) 0.4 (0.3, 0.725) 0.6 (0.4, 1.4) < 0.001
TBIL, mg/dL, median (IQR) 24 (16, 41.25) 22 (15, 45) 0.587
IBIL, mg/dL, median (IQR) 29 (22, 58.75) 36 (23, 72) 0.132
LDH, IU/L, median (IQR) 257.5 (197, 343.5) 331 (271, 502.5) < 0.001
Cr, mg/dL, median (IQR) 0.9 (0.7, 1.625) 1.2 (0.8, 1.9) 0.016
BUN, mg/dL, median (IQR) 17.5 (11, 30.25) 26 (15, 42) < 0.001
Potassium, mEq/L, median (IQR) 4 (3.7, 4.5) 4.3 (3.75, 4.7) 0.112
Sodium, mEq/L, median (IQR) 139 (136, 141) 138 (134, 141.5) 0.227
PT, sec, median (IQR) 13.25 (11.675, 15.525) 15.9 (13, 18.5) < 0.001
NLR, median (IQR) 7.075 (4.05, 13.377) 9.39 (5.49, 16.255) 0.034
PLR, median (IQR) 17.325 (8.74, 30.805) 24.91 (11.545, 48.32) 0.020
LMR, median (IQR) 2.585 (1.4825, 4.64) 1.78 (1, 2.975) 0.002
First-time heart rate, beats/min, median (IQR) 90 (77, 104) 89 (75.5, 103.5) 0.542
Maximal heart rate, beats/min, median (IQR) 107 (93, 124) 119 (99, 136.5) 0.002
Minimum heart rate, beats/min, median (IQR) 68 (59, 77) 63 (53.5, 74) 0.014
Mean heart rate, beats/min, mean±SD 85.836±14.007 88.168±14.696 0.183
First-time SBP, mm Hg, median (IQR) 126 (109, 144) 119 (101, 137.5) 0.091
Minimum SBP, mm Hg, mean±SD 91.343±17.41 75.928±22.931 < 0.001
First-time DBP, mm Hg, median (IQR) 70 (58, 83.25) 64 (52, 75.5) 0.008
First-time MBP, mm Hg, median (IQR) 85 (72, 100) 80 (68, 94.5) 0.032
First-time RR, breaths/min, median (IQR) 19 (15, 23) 20 (16, 23.5) 0.155
Minimum RR, breaths/min, median (IQR) 11 (9, 13) 11 (7, 14) 0.582
First-time temperature, ℃, median (IQR) 36.805 (36.56, 37.11) 36.56 (36.305, 37.085) 0.006
Maximal temperature, ℉, median (IQR) 99.3 (98.6, 100.43) 99.5 (98.6, 101.55) 0.191
First-time SPO2, %, median (IQR) 98 (96, 100) 98 (96, 100) 0.930
Minimum SPO2, %, median (IQR) 92 (88, 94) 88 (76.5, 92) < 0.001
Glucose, mg/dL, median (IQR) 117 (102.75, 155.25) 119 (88, 148.5) 0.200
Urine output, mL, median (IQR) 1470 (932.5, 2331.2) 995 (579, 1732.5) < 0.001
SOFA, median (IQR) 3 (1, 5) 5 (4, 8) < 0.001
GCS, median (IQR) 15 (15, 15) 15 (15, 15) 0.006
SAPSII, median (IQR) 28 (21, 37) 42 (35.5, 50) < 0.001
APSIII, median (IQR) 37 (27, 49) 53 (41, 66.5) < 0.001
MELD, median (IQR) 10 (7, 20) 18.518 (10, 25.82) < 0.001
SIRS, median (IQR) 2 (2, 3) 3 (2, 4) 0.010
Diabetes, n (%) 0.056
 Yes 93 (23.5%) 16 (4.1%)
 No 219 (55.4%) 67 (17%)
Hypertension, n (%) 0.188
 Yes 183 (46.3%) 42 (10.6%)
 No 129 (32.7%) 41 (10.4%)
Coronary heart disease, n (%) 0.209
 No 231 (58.5%) 67 (17%)
 Yes 81 (20.5%) 16 (4.1%)
COPD, n (%) 0.095
 No 278 (70.4%) 79 (20%)
 Yes 34 (8.6%) 4 (1%)
Dialysis, n (%) 0.620
 No 309 (78.2%) 81 (20.5%)
 Yes 3 (0.8%) 2 (0.5%)
Heart failure, n (%) 0.020
 No 222 (56.2%) 48 (12.2%)
 Yes 90 (22.8%) 35 (8.9%)
LN, n (%) 0.214
 Yes 261 (66.1%) 74 (18.7%)
 No 51 (12.9%) 9 (2.3%)
CKD, n (%) 0.111
 No 166 (42%) 36 (9.1%)
 Yes 146 (37%) 47 (11.9%)
Cerebrovascular disease, n (%) 0.012
 No 293 (74.2%) 71 (18%)
 Yes 19 (4.8%) 12 (3%)
*The significant baseline differences between survivors and non-survivors among patients with SLE are labelled in red.

APSIII, Acute Physiology Score III; BUN, blood urea nitrogen; CKD, chronic kidney disease; COPD, chronic obstructive pulmonary disease; Cr, serum creatinine; DBP, diastolic blood pressure; GCS, Glasgow Coma Scale Score; IBIL, indirect bilirubin; LDH, lactate dehydrogenase; LMR, lymphocyte to monocyte ratio; LN, lupus nephritis; MBP, mean blood pressure; MELD, Model for End-stage Liver Disease; NLR, neutrophil to lymphocyte ratio; PLR, platelet to lymphocyte ratio; PT, prothrombin time; RBC, red blood cells; RR, respiratory rate; SAPSII, Simplified Acute Physiology Score II; SBP, systolic blood pressure; SIRS, Systemic Inflammatory Response Syndrome; SOFA, Sepsis-related Organ Failure Assessment Score; SpO2, oxygen saturation; TBIL, total bilirubin.

Feature selection

To identify the most discriminative diagnostic variables capable of significantly distinguishing between survival and non-survival groups, we conducted LASSO analysis on the initial set of candidate variables. The selection of the optimal tuning parameter, λ, was guided by the LASSO coefficient profiles and the selection map, which indicated λ=0.0221 as the threshold (figure 2A,B). Utilising this criterion, we ultimately extracted 18 variables that demonstrated robust predictive power for the risk of mortality in severe SLE. The selected variables were as follows: age, weight, ALT, PT, platelet to lymphocyte ratio, lymphocyte to monocyte ratio, HR, SBP, SpO2, glucose, urine output, SAPSII, MELD and the presence of comorbidities such as coronary heart disease, COPD, dialysis, lupus nephritis and cerebrovascular disease (table 2, figure 3).

Figure 2. LASSO parameter selection and coefficient profiles. (A) The optimal tuning parameter selection map for the LASSO analysis, highlighting the chosen lambda (λ) value. (B) The coefficient profiles of the variables, indicating their relative importance and contribution to the model, which led to the final selection of diagnostic variables. LASSO, Least Absolute Shrinkage and Selection Operator.

Table 2. The results of logistic regression analyses.

Variable Coef SE Wald Z Pr(>|Z|)
Intercept 0.1245 1.9848 0.06 0.95
Age, years 0.0419 0.0137 3.05 0.0023
Weight, kg −0.0097 0.0086 −1.13 0.2583
ALT, IU/L −0.0011 0.0007 −1.74 0.0823
PT, s 0.0259 0.0208 1.25 0.213
PLR 0.0053 0.0046 1.15 0.2494
LMR −0.0925 0.0752 −1.23 0.2184
Heart rate, beats/min −0.014 0.0111 −1.26 0.2068
SpO2,% −0.0433 0.0131 −3.31 0.0009
SBP, mm Hg −0.0105 0.0103 −1.01 0.3104
Glucose, mg/dL −0.0073 0.0034 −2.13 0.0329
Urine output, mL 0 0.0002 0.09 0.9318
SAPSII 0.0666 0.0185 3.6 0.0003
MELD 0.0079 0.0275 0.29 0.7743
Coronary heart disease −1.3551 0.4689 −2.89 0.0039
COPD −0.7911 0.713 −1.11 0.2672
Dialysis 1.8494 1.3607 1.36 0.1741
Lupus nephritis 0.8481 0.5447 1.56 0.1195
Cerebrovascular 1.4244 0.548 2.6 0.0093

ALT, alanine aminotransferase; COPD, chronic obstructive pulmonary disease; LMR, lymphocyte to monocyte ratio; MELD, Model for End-stage Liver Disease; PLR, platelet to lymphocyte ratio; PT, prothrombin time; SAPSII, Simplified Acute Physiology Score II; SBP, systolic blood pressure.
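The coefficients in table 2 define a linear predictor that the logistic function converts into a predicted probability, which is the calculation a nomogram performs graphically. The sketch below illustrates that arithmetic only and is not a validated calculator. Several points are assumptions: the patient values are hypothetical, comorbidities are taken to be coded 1 for present and 0 for absent, the units follow table 2, and the article does not state which heart rate, SBP and SpO2 measurements entered the model. The urine output coefficient is printed as 0 in table 2 after rounding and is used as such.

```python
import math

# Coefficients copied from table 2.
COEFFICIENTS = {
    "age": 0.0419,
    "weight": -0.0097,
    "alt": -0.0011,
    "pt": 0.0259,
    "plr": 0.0053,
    "lmr": -0.0925,
    "heart_rate": -0.014,
    "spo2": -0.0433,
    "sbp": -0.0105,
    "glucose": -0.0073,
    "urine_output": 0.0,
    "sapsii": 0.0666,
    "meld": 0.0079,
    "coronary_heart_disease": -1.3551,
    "copd": -0.7911,
    "dialysis": 1.8494,
    "lupus_nephritis": 0.8481,
    "cerebrovascular": 1.4244,
}
INTERCEPT = 0.1245


def predicted_risk(patient):
    """Logistic-model probability for a dict of predictor values."""
    linear_predictor = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in patient.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical patient (assumed coding: 1 = comorbidity present, 0 = absent).
patient = {
    "age": 65, "weight": 74, "alt": 30, "pt": 15.9, "plr": 24.9, "lmr": 1.8,
    "heart_rate": 89, "spo2": 98, "sbp": 119, "glucose": 119,
    "urine_output": 995, "sapsii": 42, "meld": 18.5,
    "coronary_heart_disease": 0, "copd": 0, "dialysis": 0,
    "lupus_nephritis": 1, "cerebrovascular": 0,
}

print(f"predicted 1-year mortality risk: {predicted_risk(patient):.3f}")
```

Each coefficient is a change in log-odds per unit of the predictor, so a positive coefficient (for example age or SAPSII) raises the predicted risk as the value increases and a negative one lowers it.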

Figure 3. The nomogram for prediction of 1-year mortality risk in patients with SLE. SAPSII, Simplified Acute Physiology Score II; SLE, systemic lupus erythematosus; SpO2, oxygen saturation.

Prediction by using nomogram prediction model

To assess the performance of the nomogram model, we calculated the C-index values for both the training and testing datasets. The C-index for the training set was 0.841 (95% CI 0.778 to 0.889), and for the testing set, it was 0.882 (95% CI 0.785 to 0.946), indicating that the model possesses stability and reliability. Additionally, the ROC curves depicted in figure 4A,B demonstrate AUC values exceeding 0.8, with a particularly high AUC of 0.9231 in the testing set, further substantiating the model’s high predictive accuracy. Figure 4C,D shows that the model’s predicted outcomes closely align with the actual outcomes for both the training and testing sets, signifying the model’s robust predictive performance.

Figure 4. The receiver operating characteristic (ROC) curves and calibration curves of the nomogram. (A,B) The ROC curves of the nomogram in the training set and test set. (C,D) The calibration curves of the nomogram in the training set and test set.

Prediction by using ML models

The datasets, both untreated and subjected to SMOTEENN resampling, were used to conduct predictions using five distinct ML algorithms. Table 3 presents a quantitative assessment of the generalisation capabilities of each ML model. To facilitate a direct comparison among the models, the AUC and specificity for each model under different sampling conditions were visualised, as depicted in figure 5.

Table 3. The metrics of different machine learning models.

Model Sampling method AUC Precision Recall F1 score Specificity Accuracy
XGBoost No-sampling 0.8416 0.8429 0.9516 0.8939 0.3529 0.8228
XGBoost SMOTEENN 0.8425 0.8929 0.8065 0.8475 0.6471 0.7722
LightGBM No-sampling 0.8510 0.8310 0.9516 0.8872 0.2941 0.8101
LightGBM SMOTEENN 0.8605 0.9286 0.8387 0.8814 0.7647 0.8228
Naive Bayes No-sampling 0.7789 0.8571 0.9677 0.9091 0.4118 0.8481
Naive Bayes SMOTEENN 0.8805 0.8852 0.8710 0.8780 0.5882 0.8101
SVM No-sampling 0.8487 0.8676 0.9516 0.9077 0.4706 0.8481
SVM SMOTEENN 0.8634 0.9455 0.8387 0.8889 0.8235 0.8354
RF No-sampling 0.7566 0.8841 0.9839 0.9313 0.5294 0.8861
RF SMOTEENN 0.8340 0.9333 0.9032 0.9180 0.7647 0.8734

LightGBM, Light Gradient Boosting Machine; RF, Random Forest; SMOTEENN, Synthetic Minority Over-Sampling Technique and Edited Nearest Neighbors; SVM, Support Vector Machine; XGBoost, eXtreme Gradient Boosting.

Figure 5. Comparison of AUC and specificity across models and sampling techniques. AUC, areas under the receiver operating characteristic curve; gbm, Gradient Boosting Machine; RF, Random Forest; SMOTEENN, Synthetic Minority Over-Sampling Technique and Edited Nearest Neighbors; SVM, Support Vector Machine; XGB, eXtreme Gradient Boosting.

Before undergoing resampling, the majority of classification models exhibited high accuracy but relatively low specificity, likely due to the imbalance in the dataset that inclined the models towards the majority class. The models showed good performance across AUC, F1 score and accuracy metrics, with the RF and XGBoost models performing particularly well. The RF model had an accuracy of 0.8861, an F1 score of 0.9313 and an AUC of 0.7566; the XGBoost model achieved an AUC of 0.8416 and an F1 score of 0.8939.

After applying the SMOTEENN sampling technique to the dataset, the ratio of deceased to surviving patients was balanced, resulting in an enhancement of predictive performance for positive samples across all models. The AUC and specificity of the five models were both improved, demonstrating good performance under the balanced data condition. Notably, the SVM model showed exceptional results after SMOTEENN resampling, with an AUC of 0.8634, an F1 score of 0.8889, a specificity of 0.8235 and an accuracy of 0.8354. Considering all metrics, the SMOTEENN-SVM model exhibited the best performance in predicting mortality risk and was recommended as the optimal model. The RF and LightGBM models also performed well across multiple indicators and could serve as alternative models. This indicated that appropriate sampling techniques and model selection on imbalanced datasets can significantly enhance predictive performance.

Prediction by using stacking ensemble model

In this study, the stacking model, leveraging SMOTEENN-SVM as the component learner, was analysed with logistic regression and MLP as the meta-classifiers. As shown in table 4, the stacking model utilising MLP as the meta-classifier exhibited a slight decrease in specificity of 0.14 percentage points compared with the model with logistic regression as the meta-classifier. However, it demonstrated superior performance across other critical metrics, including accuracy, precision and recall. Specifically, the MLP yielded a precision of 0.9072, recall of 0.9196, F1-score of 0.9128, specificity of 0.9209 and accuracy of 0.9200. In contrast, the logistic regression model achieved a precision of 0.8905, recall of 0.8926, F1-score of 0.8915, specificity of 0.9223 and accuracy of 0.9019. Moreover, when compared with the five classification models previously discussed, the integrated stacking model showed a further enhancement in predictive performance.

Table 4. Comparison of prediction performance of stacking models.

Component learner Meta-classifier Precision Recall F1-score Specificity Accuracy
SMOTEENN-SVM LR 0.8905 0.8926 0.8915 0.9223 0.9019
SMOTEENN-SVM MLP 0.9072 0.9196 0.9128 0.9209 0.9200

LR, logistic regression; MLP, multilayer perceptron; SMOTEENN, Synthetic Minority Over-Sampling Technique and Edited Nearest Neighbors; SVM, Support Vector Machine.

Validation of model predictive performance

The predictive outcomes elucidated that the stacking model, with SMOTEENN-SVM as the component learner and MLP as the meta-classifier, exhibited optimal performance in 1-year mortality prediction. To further substantiate our findings, we conducted a comprehensive evaluation of the model’s performance on a validation cohort. The model demonstrated an accuracy of 0.8000, a precision of 0.7500 and a recall of 0.8500. With an F1 score of 0.7972 and an AUC of 0.8750 (see online supplemental figure 1 for details), the model showed a strong ability to discriminate between classes. Collectively, these metrics indicate that the model has robust classification performance.

SHAP analysis

This section merges the SHAP model with the predictive results to offer a nuanced interpretation based on feature importance. As illustrated in figure 6, the 18 features are sorted by their significance for predicting survival and death, with urine output, age, weight and ALT emerging as pivotal for assessing death risk. Patients exhibiting advanced age, elevated body weight and compromised hepatic and renal function are associated with an increased risk of mortality. This prioritisation of feature importance not only enhances our understanding of the model’s decision-making process but also provides clinicians with a valuable reference to better evaluate patient prognoses and devise treatment strategies.

Figure 6. SHAP analysis plot of the early identification model for SLE. ALT, alanine aminotransferase; PLR, platelet to lymphocyte ratio; SAPSII, Simplified Acute Physiology Score II; SHAP, SHapley additive explanations; SLE, systemic lupus erythematosus.

Discussion

SLE is a complex, multisystem autoimmune disorder with an aetiology that remains elusive.17 Despite significant advancements in treatment that have led to improved long-term survival rates for individuals with SLE, the disease continues to necessitate frequent hospitalisations. Over the past two decades, SLE has been recognised as a serious autoimmune condition that can lead to ICU admission in severe cases, with a high risk of death associated with such complications.18 19 The development of predictive models and the identification of characteristic variables for SLE are pivotal for enhancing our capacity to detect high-risk patients early, thereby improving their clinical outcomes and reducing the associated healthcare burden.

In this study, which encompassed 395 patients with SLE from the MIMIC-IV database, we developed predictive models for mortality risk among patients with SLE. Initially, we employed LASSO regression to select significant feature variables associated with the mortality risk in patients with SLE. Following this, we constructed a nomogram model based on logistic regression and an ensemble learning model based on SMOTEENN-SVM and MLP. By comparing the performance of both models, we identified the optimal model, and used the SHAP method to elucidate the key influential factors.

When evaluating the nomogram and stacking ensemble learning models, we performed a comprehensive analysis. Both models exhibited strong predictive capabilities, with AUC values above 0.8. The nomogram model offers a probabilistic 1-year survival estimate for patients with SLE, requiring clinicians to apply their experience to interpret these probabilities. In contrast, the stacking ensemble learning model provides a clear binary classification of survival outcomes. Moreover, the stacking model’s integration of multiple algorithms enhances its adaptability to complex datasets, potentially outperforming the nomogram in scenarios with significant data variability. Therefore, despite the simplicity of the nomogram’s operation, the stacking ensemble learning model may overall prove to be superior in performance.

In comparison to Su et al’s study,3 which used the MIMIC database to predict 30-day survival rates for critically ill SLE patients, our research diverges in two primary aspects: the temporal scope of mortality analysis and the complexity of the predictive models employed. First, our study broadens the scope to encompass long-term mortality, providing insights into extended patient outcomes beyond the initial month. Second, while Su et al’s RSM-LDA model excels in handling high-dimensional data with a simple yet effective approach, our adoption of a stacking ensemble learning model is designed for a more sophisticated and tailored prediction system. This model is particularly advantageous for dissecting the intricate patterns associated with long-term survival probabilities in ICU settings. Consequently, our study not only complements the existing literature but also enhances the predictive framework for clinical decision-making and patient management in SLE.

Additionally, the SHAP analysis found that urine output, age, weight and ALT were the most important predictors. Clinically, urine volume is an indicator of renal physiology, and a decreased amount might indicate renal insufficiency.20 Kidney damage is a considerable cause of mortality and disability in patients with SLE and includes glomerular, tubular, renal interstitial and blood vessel destruction.21 22 A large multicentre international cohort study found that young patients with SLE remained at increased risk of premature death.2 However, our study found that non-survivors tended to be older, with a median age of 65 years compared with 53 years for survivors. This may be because, with advancing age, patients incur a higher risk of mortality from accumulated organ damage and the adverse effects of long-term pharmacological treatment.23 Several studies have shown that obesity is an independent risk factor associated with worse SLE disease activity23 and cumulative organ damage (eg, nephritis).24 In addition, patients who are underweight may be malnourished and have vitamin deficiencies that affect immune function.25 Weight management therefore deserves attention in patients with SLE. Elevated serum ALT levels are not merely an indicator of hepatic status: Yamada et al showed that they are also an independent marker of systemic inflammation and oxidative stress.26 In patients with SLE, systemic inflammation can accelerate the progression of atherosclerosis,26 thereby increasing the risk of mortality.
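For readers unfamiliar with how SHAP attributes a prediction to individual variables such as urine output or age, the sketch below computes exact Shapley values for a toy three-variable score by enumerating all subsets of variables. Everything here is an assumption made for illustration: toy_risk is an invented function, not the fitted model, and absent variables are replaced by a single reference patient, whereas SHAP implementations typically average over a background dataset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values: variables outside a subset take the baseline value."""
    n = len(x)

    def value(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of every subset of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Invented score (not a calibrated risk): older age, low urine output and
    weight far from 70 kg raise it; age and low urine output also interact."""
    urine_ml, age_years, weight_kg = z
    low_urine = max(0.0, 1000.0 - urine_ml)
    return (0.01 * age_years + 0.0002 * low_urine
            + 0.000005 * age_years * low_urine
            + 0.002 * abs(weight_kg - 70.0))


x = [600.0, 68.0, 48.0]          # hypothetical patient: urine (mL), age, weight
baseline = [1800.0, 53.0, 70.0]  # hypothetical reference patient
phi = shapley_values(toy_risk, x, baseline)

for name, contribution in zip(["urine output", "age", "weight"], phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the patient's score and the reference score, and the interaction between age and low urine output is shared between those two variables.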

Our research has developed predictive models and identified significant variables associated with mortality in patients with SLE within a US cohort. However, there are several limitations to our study. First, while ML methods excel in handling high-dimensional and non-linear data, the relatively small sample size included in this study might constrain the models’ performance. Second, the incomplete data in the MIMIC database, particularly the lack of SLICC Damage Index, Systemic Lupus Erythematosus Disease Activity Index, treatment history and common clinical indicators like erythrocyte sedimentation rate, C reactive protein and complement levels, may bias our analysis and limit the model’s accuracy and generalisability. Additionally, although our study focused on 360-day mortality as a long-term prognostic indicator, it is also crucial to consider short-term outcomes, such as 28-day or 90-day mortality rates, for immediate risk assessment and evaluating treatment effectiveness. Finally, prospective cohort studies are essential for validating our findings. Therefore, we plan to expand the scope of our study to include samples from different regions for external validation and to enhance the generalisability of our results.

Conclusion

In summary, we constructed two models to predict all-cause mortality in patients with SLE. By comparing the performance, we identified the ensemble learning model based on SMOTEENN-SVM and MLP as the optimal model. We also found that urine output, age, weight and ALT levels were key prognostic factors.

Supplementary material

online supplemental file 1
lupus-12-1-s001.pdf (84.1KB, pdf)
DOI: 10.1136/lupus-2024-001397

Footnotes

Funding: This work was supported by the Natural Science Foundation of Fujian province [Grant No. 2023J011199] and Joint Funds for the Innovation of Science and Technology, Fujian province [Grant No. 2023Y9305].

Provenance and peer review: Not commissioned; externally peer-reviewed.

Patient consent for publication: Not applicable.

Ethics approval: The data used in this study from the MIMIC-IV database were publicly accessible and did not require approval from an ethics committee. However, for the validation set comprising patients from Fujian Provincial Hospital, we ensured ethical conduct by obtaining approval from the Ethics Committee of Fujian Provincial Hospital (Approval Number: K2024-09-051). Because of the retrospective nature of this study, in-person informed consent was exempted.

Data availability free text: The data that support the findings of this study are available from https://physionet.org/content/mimiciv/2.0/ but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of PhysioNet.

Patient and public involvement: Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

Data availability statement

Data are available upon reasonable request.

References

  • 1. Tsokos GC. Systemic lupus erythematosus. N Engl J Med. 2011;365:2110–21. doi: 10.1056/NEJMra1100359.
  • 2. Bernatsky S, Boivin J-F, Joseph L, et al. Mortality in systemic lupus erythematosus. Arthritis Rheum. 2006;54:2550–7. doi: 10.1002/art.21955.
  • 3. Su B, Li H. Development and validation of models for risk of death in patients with systemic lupus erythematosus admitted to the intensive care unit: a retrospective study. Clin Rheumatol. 2023;42:2987–99. doi: 10.1007/s10067-023-06701-w.
  • 4. Aragón CC, Ruiz-Ordoñez I, Quintana JH, et al. Clinical characterization, outcomes, and prognosis in patients with systemic lupus erythematosus admitted to the intensive care unit. Lupus (Los Angel) 2020;29:1133–9. doi: 10.1177/0961203320935176.
  • 5. Grimes DA. The nomogram epidemic: resurgence of a medical relic. Ann Intern Med. 2008;149:273–5. doi: 10.7326/0003-4819-149-4-200808190-00010.
  • 6. Gold JS, Gönen M, Gutiérrez A, et al. Development and validation of a prognostic nomogram for recurrence-free survival after complete surgical resection of localised primary gastrointestinal stromal tumour: a retrospective analysis. Lancet Oncol. 2009;10:1045–52. doi: 10.1016/S1470-2045(09)70242-6.
  • 7. Yuan H-L, Zhang X, Li Y, et al. A Nomogram for Predicting Risk of Thromboembolism in Gastric Cancer Patients Receiving Chemotherapy. Front Oncol. 2021;11:598116. doi: 10.3389/fonc.2021.598116.
  • 8. Jeong S-H, Kim RB, Park SY, et al. Nomogram for predicting gastric cancer recurrence using biomarker gene expression. Eur J Surg Oncol. 2020;46:195–201. doi: 10.1016/j.ejso.2019.09.143.
  • 9. Smith JB, Shew M, Karadaghy OA, et al. Predicting salvage laryngectomy in patients treated with primary nonsurgical therapy for laryngeal squamous cell carcinoma using machine learning. Head Neck. 2020;42:2330–9. doi: 10.1002/hed.26246.
  • 10. Aldhyani THH, Alshebami AS, Alzahrani MY. Soft Clustering for Enhancing the Diagnosis of Chronic Diseases over Machine Learning Algorithms. J Healthc Eng. 2020;2020:4984967. doi: 10.1155/2020/4984967.
  • 11. Johnson AEW, Bulgarelli L, Shen L, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. 2023;10:1. doi: 10.1038/s41597-022-01899-x.
  • 12. Blazek K, van Zwieten A, Saglimbene V, et al. A practical guide to multiple imputation of missing data in nephrology. Kidney Int. 2021;99:68–74. doi: 10.1016/j.kint.2020.07.035.
  • 13. Chen W, Ou M, Tang D, et al. Identification and Validation of Immune-Related Gene Prognostic Signature for Hepatocellular Carcinoma. J Immunol Res. 2020;2020:5494858. doi: 10.1155/2020/5494858.
  • 14. Bai Z, Lu J, Li T, et al. Clinical Feature-Based Machine Learning Model for 1-Year Mortality Risk Prediction of ST-Segment Elevation Myocardial Infarction in Patients with Hyperuricemia: A Retrospective Study. Comput Math Methods Med. 2021;2021:7252280. doi: 10.1155/2021/7252280.
  • 15. Xu X, Yu Z, Ge Z, et al. Web-Based Risk Prediction Tool for an Individual’s Risk of HIV and Sexually Transmitted Infections Using Machine Learning Algorithms: Development and External Validation Study. J Med Internet Res. 2022;24:e37850. doi: 10.2196/37850.
  • 16. Hung P-S, Lin P-R, Hsu H-H, et al. Explainable Machine Learning-Based Risk Prediction Model for In-Hospital Mortality after Continuous Renal Replacement Therapy Initiation. Diagnostics (Basel) 2022;12:1496. doi: 10.3390/diagnostics12061496.
  • 17. Oud L. The Epidemiology and Outcomes of Mental Disorders in Critically Ill Patients With Systemic Lupus Erythematosus: A Population-Based Study. J Clin Med Res. 2020;12:508–16. doi: 10.14740/jocmr4269.
  • 18. Quintero OL, Rojas-Villarraga A, Mantilla RD, et al. Autoimmune diseases in the intensive care unit. An update. Autoimmun Rev. 2013;12:380–95. doi: 10.1016/j.autrev.2012.06.002.
  • 19. Hsu C-L, Chen K-Y, Yeh P-S, et al. Outcome and prognostic factors in critically ill patients with systemic lupus erythematosus: a retrospective study. Crit Care. 2005;9:R177–83. doi: 10.1186/cc3481.
  • 20. Hazarika A, Singla K, Patel G, et al. Does Arterial Blood Gas (ABG) Provide a Safety Net for Extubation in Surgical Patients? Cureus. 2023;15:e33561. doi: 10.7759/cureus.33561.
  • 21. D’Cruz DP, Khamashta MA, Hughes GRV. Systemic lupus erythematosus. Lancet. 2007;369:587–96. doi: 10.1016/S0140-6736(07)60279-7.
  • 22. Seshan SV, Jennette JC. Renal disease in systemic lupus erythematosus with emphasis on classification of lupus glomerulonephritis: advances and implications. Arch Pathol Lab Med. 2009;133:233–48. doi: 10.5858/133.2.233.
  • 23. Lertkovit S, Siriussawakul A, Suraarunsumrit P, et al. Polypharmacy in Older Adults Undergoing Major Surgery: Prevalence, Association With Postoperative Cognitive Dysfunction and Potential Associated Anesthetic Agents. Front Med (Lausanne) 2022;9:811954. doi: 10.3389/fmed.2022.811954.
  • 24. Kang J-H, Xu H, Choi S-E, et al. Obesity increases the incidence of new-onset lupus nephritis and organ damage during follow-up in patients with systemic lupus erythematosus. Lupus (Los Angel) 2020;29:578–86. doi: 10.1177/0961203320913616.
  • 25. Meltzer-Bruhn AT, Esper GW, Herbosa CG, et al. The Role of Smoking and Body Mass Index in Mortality Risk Assessment for Geriatric Hip Fracture Patients. Cureus. 2022;14:e26666. doi: 10.7759/cureus.26666.
  • 26. Yamada J, Tomiyama H, Yambe M, et al. Elevated serum levels of alanine aminotransferase and gamma glutamyltransferase are markers of inflammation and oxidative stress independent of the metabolic syndrome. Atherosclerosis. 2006;189:198–205. doi: 10.1016/j.atherosclerosis.2005.11.036.

Articles from Lupus Science & Medicine are provided here courtesy of BMJ Publishing Group
