PLOS One. 2025 Aug 20;20(8):e0330197. doi: 10.1371/journal.pone.0330197

Predicting in-hospital mortality in ICU patients with lymphoma using machine learning models

Ling Xu 1, Guang Tu 2, Zhonglan Cai 2, Tianbi Lan 3,*
Editor: Chiara Lazzeri 4
PMCID: PMC12367167  PMID: 40833967

Abstract

Background

Lymphoma is a severe condition with high mortality rates, often requiring ICU admission. Traditional risk stratification tools like SOFA and APACHE scores struggle to capture complex clinical interactions. Machine learning (ML) models offer a more accurate alternative for predicting outcomes by analyzing large datasets. However, their application in predicting in-hospital mortality for lymphoma patients remains limited.

Objective

This study aims to develop and validate machine learning models to predict in-hospital mortality in ICU patients with lymphoma using data from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database, thereby enhancing risk stratification and clinical decision-making.

Methods

We conducted a retrospective cohort study using data from the MIMIC-IV database, which includes detailed clinical data from adult patients admitted to the ICU. Patients with a primary diagnosis of lymphoma were included. Baseline characteristics, laboratory parameters, and clinical outcomes were extracted. Lasso regression was employed to screen for significant risk factors associated with in-hospital mortality. Fifteen machine learning models, including logistic regression, random forest, gradient boosting, and neural networks, were developed and compared using receiver operating characteristic (ROC) curves and area under the curve (AUC) analysis. Model performance was evaluated through cross-validation and SHapley Additive exPlanation (SHAP) values to interpret variable importance.

Results

A total of 1591 patients were included, with 342 (21.5%) in-hospital deaths. Lasso regression identified significant predictors of mortality, including blood urea nitrogen (BUN), platelets, prothrombin time (PT), heart rate, systolic blood pressure, activated partial thromboplastin time (APTT), SpO₂, and bicarbonate. The CatBoost Classifier demonstrated the highest predictive performance with an AUC of 0.7766. SHAP analysis highlighted the critical role of BUN as the most important factor in mortality prediction, followed by platelets and PT. The SHAP waterfall plot provided individualized risk assessments for patients, demonstrating the model’s ability to identify high-risk subgroups.

Conclusion

Machine learning models, particularly the CatBoost Classifier, effectively predict in-hospital mortality in ICU patients with lymphoma. These models outperform traditional statistical methods and provide valuable insights into risk stratification. Future work should focus on external validation and clinical implementation to improve patient outcomes in this high-risk population.

Introduction

Lymphoma is a severe condition that often requires admission to the intensive care unit (ICU) due to its associated high mortality rate and complex clinical presentation [1]. The pathophysiology of lymphoma involves a combination of immune dysregulation, metabolic disturbances, and potential complications such as infections and organ failure [2–5]. Early identification of patients at high risk of in-hospital mortality is crucial for optimizing treatment strategies and improving outcomes. Traditional risk stratification tools, such as the Sequential Organ Failure Assessment (SOFA) score and the Acute Physiology and Chronic Health Evaluation (APACHE) score, have limitations in capturing the complex interactions among multiple clinical variables [6–8]. Machine learning (ML) models offer a promising alternative by leveraging large datasets to identify patterns and predict outcomes more accurately [9]. ML has been increasingly applied in various medical fields, including disease diagnosis, treatment optimization, and prognosis prediction [10,11]. However, its application in predicting in-hospital mortality for patients with lymphoma remains limited. This study aims to develop and validate ML models to predict in-hospital mortality in ICU patients with lymphoma using data from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database, thereby enhancing risk stratification and clinical decision-making.

Compared to the general ICU population, lymphoma patients present distinct mortality determinants that traditional risk scores (e.g., SOFA, APACHE) fail to capture. Rapid tumor lysis syndrome (TLS) precipitates acute kidney injury and electrolyte derangements [12], while profound chemotherapy-induced immunosuppression predisposes to opportunistic infections such as Pneumocystis jirovecii pneumonia and invasive aspergillosis [13]. Direct organ infiltration by lymphoma (e.g., renal or hepatic parenchyma) and cardiotoxicity from anthracycline-based regimens further complicate clinical trajectories [14]. These lymphoma-specific pathophysiological features create complex, non-linear interactions among variables, rendering generic ML models inadequate. Yet, ML tailored specifically to lymphoma ICU patients remains unexplored, highlighting a critical gap this study aims to address.

Methods

Data source and study design

This study is a retrospective cohort study using data from the MIMIC-IV database, a publicly available, de-identified electronic health records database that contains comprehensive clinical data from adult patients admitted to the ICU at Beth Israel Deaconess Medical Center in Boston, USA [15]. Author Guang Tu completed the CITI Data or Specimens Only Research course, obtained approval for database access, and assumed responsibility for data extraction (certification number 65828445). The study included adult ICU patients with a primary diagnosis of lymphoma. The diagnosis was based on International Classification of Diseases, 10th Revision (ICD-10) codes. Exclusion criteria were missing key predictor or outcome variables, a hospital stay of less than 24 hours, and age under 18 years. The included patient data covered hospitalizations from 2008 to 2019 (Fig 1). The study utilized fully de-identified data from the publicly available MIMIC-IV database. Because the dataset lacks direct patient identifiers, the institutional review boards of the authors’ institutions determined that this research is exempt from IRB review and informed consent (Helsinki Declaration, 1964 and its later amendments).

Fig 1. Flowchart of patient inclusion.


Data collection and processing

Baseline characteristics, laboratory parameters and clinical outcomes were extracted from MIMIC-IV. Prior to any modelling, the proportion of missing values for each variable was quantified (S1 Table). Across the 1591 included patients, missingness ranged from 0% (gender, age, comorbidities, hospital-mortality flag) to 11.1% (APTT). Laboratory variables with the highest missing proportions were INR (9.9%), PT (9.9%), calcium (4.3%) and temperature (2.2%). The missing-data pattern was examined using Little’s MCAR test (χ² = 2184.7, df = 1959, p = 0.002), suggesting data were not missing completely at random. To minimise bias, we performed multivariate imputation by chained equations (MICE) with predictive mean matching for continuous variables and logistic/polytomous regression for categorical variables, generating 20 imputed datasets (m = 20) under the missing-at-random (MAR) assumption. Each machine-learning model was then fitted independently on every imputed dataset; Rubin’s rules were applied to pool performance metrics (AUC, accuracy, F1-score). Sensitivity analyses restricted to complete cases yielded similar AUCs (< 2% difference), indicating that the imputation procedure did not materially distort model discrimination.
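
The pooling step can be illustrated with a minimal Python sketch of Rubin's rules for a single scalar metric. It is not the authors' code: the function name `pool_rubin`, the four AUC values, and their squared standard errors are hypothetical, and m = 4 is used only to keep the example short (the study used m = 20). Pooling on the raw AUC scale is also an assumption; pooling after a logit transform is a common alternative.

```python
from statistics import mean, variance

def pool_rubin(estimates, within_variances):
    """Pool one scalar metric across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least 2 estimates and one variance per estimate")
    q_bar = mean(estimates)              # pooled point estimate
    w = mean(within_variances)           # mean within-imputation variance
    b = variance(estimates)              # between-imputation variance (n - 1 denominator)
    total = w + (1 + 1 / m) * b          # total variance of the pooled estimate
    return q_bar, total

# Hypothetical per-imputation AUCs and squared standard errors (illustrative only)
aucs = [0.775, 0.778, 0.776, 0.779]
ses_squared = [0.0004, 0.0004, 0.0004, 0.0004]

pooled_auc, pooled_var = pool_rubin(aucs, ses_squared)
print(f"pooled AUC = {pooled_auc:.4f}, pooled SE = {pooled_var ** 0.5:.4f}")
```

The between-imputation term grows when the imputed datasets disagree, so the pooled standard error reflects uncertainty due to missing data as well as sampling variability.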

Outcome definition

In this study, in-hospital mortality is defined as death occurring at any time between ICU admission and hospital discharge, consistent with the standard definition adopted by the MIMIC-IV database and prior ICU outcome studies [15–17]. This endpoint was selected to capture the full spectrum of acute and subacute mortality risk during the hospital stay, aligning with clinical decision-making contexts where ICU prognostication directly influences treatment intensity and goals-of-care discussions.

Risk factor selection

To identify risk factors significantly associated with in-hospital mortality, we used Lasso regression for variable selection. Lasso regression is a regularized linear regression method whose L1 penalty term shrinks the coefficients of weakly informative variables to exactly zero, so that only variables with a meaningful effect on the dependent variable are retained. We determined the regularization parameter (λ) for Lasso regression through cross-validation and visualized the variable selection process using coefficient path and cross-validation plots. Ultimately, Lasso regression identified key risk factors including blood urea nitrogen, platelets, PT, heart rate, systolic blood pressure, APTT, SpO₂, and bicarbonate.
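
How an L1 penalty removes variables can be shown with a small coordinate-descent sketch in Python. This is an illustration under stated assumptions, not the authors' pipeline: it fits a squared-error lasso on a hypothetical four-row design with two standardized predictors, whereas a binary outcome such as mortality would normally use an L1-penalized logistic model, and the names `soft_threshold` and `lasso_cd` are invented for the example.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # prediction from every feature except j (partial residual)
                pred_wo_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_wo_j)
            beta[j] = soft_threshold(rho / n, lam) / col_sq[j]
    return beta

# Hypothetical standardized predictors: the outcome depends strongly on the
# first column and only weakly on the second (y = 2.0*x1 + 0.1*x2).
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2.1, -1.9, 1.9, -2.1]

sparse = lasso_cd(X, y, lam=0.5)    # strong penalty: weak predictor is dropped
dense = lasso_cd(X, y, lam=0.05)    # weak penalty: both predictors are kept
print("lam=0.5 :", sparse)
print("lam=0.05:", dense)
```

Scanning λ from large to small values and recording the coefficients at each step produces a coefficient path of the kind shown in the coefficient path plot, and cross-validation selects the λ at which to read off the retained variables.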

Machine learning model development

Based on the key risk factors selected by Lasso regression, we developed 15 machine learning models, including logistic regression, random forest, gradient boosting, neural networks, and CatBoost Classifier. To ensure an unbiased estimate of model performance, the entire dataset was first subjected to a single stratified split into a 70% training set and a 30% hold-out test set based on the in-hospital mortality label. LASSO-based feature selection and Bayesian hyper-parameter tuning (100 trials maximizing 5-fold cross-validated AUC) were performed exclusively within the training set using nested 5-fold cross-validation to prevent information leakage. The final CatBoost classifier, incorporating the selected features and optimized hyper-parameters, was then retrained on the full 70% training set and evaluated only once on the untouched 30% test set.
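
The stratified 70/30 split described above can be sketched in plain Python. The class counts (342 deaths, 1249 survivors) come from the study; the function `stratified_split`, the seed, and the ordering of the synthetic label vector are assumptions made for illustration and do not reproduce the authors' actual partition.

```python
import random

def stratified_split(labels, test_fraction=0.30, seed=0):
    """Split indices so each class contributes the same fraction to the test set."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Synthetic label vector with the study's class counts (1 = in-hospital death)
labels = [1] * 342 + [0] * 1249
train_idx, test_idx = stratified_split(labels)

train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), round(train_rate, 3), round(test_rate, 3))
```

Feature selection and hyper-parameter tuning would then see only `train_idx`, and `test_idx` would be used once for the final evaluation, which is what keeps the reported test metrics free of leakage.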

To address the class imbalance (21.5% positive class), we combined class weighting and resampling. Models supporting class weights (CatBoost, Logistic, Ridge, SVM, LightGBM, XGBoost, Gradient Boosting) received a minority-class weight of w = N_majority/N_minority ≈ 3.65 in their loss functions. The remaining models (Random Forest, Extra Trees, AdaBoost, MLP, Naive Bayes, KNN, Decision Tree) were trained after RandomOverSampler expanded the minority class to match the majority within each cross-validation fold, which prevents data leakage. All reported metrics (AUC, accuracy, F1-score) were computed on the untouched, imbalanced test set.
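
Both imbalance strategies can be checked with a short Python sketch. The weight uses the cohort counts reported in this study; the helper `random_oversample` is a simplified, hypothetical stand-in for a random oversampler, and the training fold built here (every index not divisible by 5) is arbitrary.

```python
import random

n_minority, n_majority = 342, 1249          # deaths and survivors in the cohort
weight = n_majority / n_minority            # minority-class weight, about 3.65
print(f"minority-class weight = {weight:.2f}")

def random_oversample(indices, labels, seed=0):
    """Duplicate minority-class indices at random until both classes are equal."""
    rng = random.Random(seed)
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(indices) + extra

# Synthetic labels with the study's class counts; the fold definition is arbitrary
labels = [1] * n_minority + [0] * n_majority
train_fold = [i for i in range(len(labels)) if i % 5 != 0]
held_out = [i for i in range(len(labels)) if i % 5 == 0]

balanced = random_oversample(train_fold, labels)
n_pos = sum(labels[i] == 1 for i in balanced)
n_neg = sum(labels[i] == 0 for i in balanced)
print(n_pos, n_neg)
```

Because only indices from `train_fold` are duplicated, no held-out patient can appear in the resampled training data, which is the leakage safeguard the paragraph describes.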

Model interpretation and validation

To enhance the clinical interpretability of the models, we used SHAP values for model interpretation. SHAP values, based on game theory, quantify the contribution of each variable to model predictions and reveal complex interactions among variables. Through SHAP value analysis, we not only validated the key variables identified by Lasso regression but also further evaluated their specific impacts on individual predictions. Moreover, SHAP waterfall plots decomposed the prediction results for individual patients, clearly showing the positive or negative contributions of each variable to the prediction outcomes, thereby providing a tool for clinicians to conduct individualized risk assessments.
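
The game-theoretic idea behind SHAP can be made concrete with an exact Shapley computation on a toy model. Everything in this Python sketch is hypothetical: the risk function `toy_risk`, its coefficients, the patient, and the single reference patient used as the baseline. SHAP implementations for tree models average over a background dataset rather than one baseline, so this illustrates the principle, not the study's procedure.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    names = list(x)
    n = len(names)

    def value(subset):
        # features in `subset` take the patient's values, the rest the baseline's
        return model({k: (x[k] if k in subset else baseline[k]) for k in names})

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for r in range(n):
            for subset in combinations(others, r):
                w = factorial(r) * factorial(n - r - 1) / factorial(n)
                total += w * (value(set(subset) | {name}) - value(set(subset)))
        phi[name] = total
    return phi

def toy_risk(f):
    """Hypothetical risk score with one interaction term (not the fitted model)."""
    risk = 0.20 + 0.004 * (f["bun"] - 20) - 0.0005 * (f["platelets"] - 150)
    risk -= 0.01 * (f["spo2"] - 92)
    if f["bun"] > 30 and f["platelets"] < 100:
        risk += 0.05                     # extra risk when both are abnormal
    return risk

patient = {"bun": 40, "platelets": 50, "spo2": 88}
baseline = {"bun": 20, "platelets": 150, "spo2": 92}
phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's predicted risk and the baseline risk, which is the property a waterfall plot visualizes; the interaction term is shared equally between the two variables that jointly trigger it.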

Statistical analysis

All statistical analyses were performed using R Statistical Software (Version 4.2.2) and the Free Statistics Analysis Platform (Version 2.1) [15]. A P-value less than 0.05 was considered statistically significant. Data processing and model training were completed on a local server to ensure data security and privacy.

Results

Baseline characteristics

A total of 1591 patients with lymphoma admitted to the ICU were included in this study, with 342 (21.5%) in-hospital deaths. The baseline characteristics of the patients are detailed in Table 1. Significant differences were observed between the survivor and non-survivor groups across multiple variables. The mean age of non-survivors (70.1 years) was slightly higher than that of survivors (69.1 years, P = 0.220), though not significantly. Non-survivors exhibited significantly higher heart rate, lower systolic and diastolic blood pressure, lower SpO2, lower hematocrit and hemoglobin levels, lower platelet counts, higher anion gap, lower bicarbonate levels, higher BUN, lower calcium levels, higher INR, longer PT, and longer APTT compared to survivors. The frequency and percentage of missing values for all variables are reported in S1 Table.

Table 1. Baseline characteristics of the patients.

Variables Total (n = 1591) Survivors (n = 1249) Non-survivors (n = 342) t/Chi-square value P value
gender, n (%) 0.006 0.940
female 696 (43.7) 547 (43.8) 149 (43.6)
male 895 (56.3) 702 (56.2) 193 (56.4)
age (year), mean (SD) 69.3 ± 13.6 69.1 ± 13.6 70.1 ± 13.3 1.505 0.220
heart rate(beats/min), mean (SD) 74.7 ± 16.6 73.4 ± 15.6 79.5 ± 19.0 37.154 < 0.001
sbp (mmHg), mean (SD) 89.8 ± 17.6 91.5 ± 16.6 83.5 ± 19.6 57.315 < 0.001
dbp (mmHg), mean (SD) 46.9 ± 12.0 47.7 ± 11.4 43.9 ± 13.6 27.686 < 0.001
Temperature(°C), Mean ± SD 36.3 ± 0.7 36.4 ± 0.7 36.2 ± 0.8 13.769 < 0.001
spo2, Mean ± SD 90.8 ± 7.0 91.5 ± 5.8 88.5 ± 9.9 49.763 < 0.001
hematocrit(%), Mean ± SD 28.7 ± 6.6 29.1 ± 6.7 27.2 ± 6.0 22.287 < 0.001
hemoglobin(g/dL), mean (SD) 9.4 ± 2.2 9.6 ± 2.3 8.8 ± 2.0 31.559 < 0.001
platelets(×10⁹/L), Median (IQR) 148.0 (84.0, 231.0) 158.0 (99.0, 239.0) 101.0 (38.0, 188.0) 58.955 < 0.001
wbc(×10⁹/L), Median (IQR) 8.1 (4.9, 12.2) 8.1 (5.0, 11.8) 8.1 (4.1, 13.9) 0.032 0.859
aniongap(mmol/L), mean (SD) 12.9 ± 3.7 12.7 ± 3.4 13.7 ± 4.5 20.433 < 0.001
bicarbonate(mmol/L), Mean ± SD 21.7 ± 4.9 22.2 ± 4.6 20.0 ± 5.6 57.076 < 0.001
bun(mg/dL), Median (IQR) 20.0 (13.0, 31.0) 18.0 (12.0, 28.0) 28.5 (19.0, 43.8) 105.37 < 0.001
calcium(mg/dL), mean (SD) 8.2 ± 0.9 8.3 ± 0.9 8.0 ± 1.0 27.475 < 0.001
chloride(mmol/L), mean (SD) 101.1 ± 6.4 101.0 ± 6.0 101.6 ± 7.8 2.201 0.138
creatinine(mg/dL), Median (IQR) 0.9 (0.7, 1.3) 0.9 (0.7, 1.3) 1.1 (0.7, 1.6) 35.995 < 0.001
glucose(mg/dL), mean (SD) 120.4 ± 46.8 118.7 ± 42.5 126.5 ± 59.6 7.328 0.007
sodium(mmol/L), mean (SD) 136.4 ± 5.2 136.4 ± 5.0 136.6 ± 5.8 0.720 0.396
potassium(mmol/L), mean (SD) 3.9 ± 0.6 3.9 ± 0.6 3.9 ± 0.7 0.176 0.675
inr, mean (SD) 1.4 ± 0.6 1.3 ± 0.5 1.6 ± 0.7 42.486 < 0.001
pt (s), mean (SD) 15.0 ± 5.8 14.4 ± 5.1 16.9 ± 7.5 46.924 < 0.001
aptt (s), mean (SD) 32.8 ± 14.8 31.8 ± 13.7 36.4 ± 18.0 23.519 < 0.001
myocardial_infarct, n (%) 2.503 0.114
no 1320 (83.0) 1046 (83.7) 274 (80.1)
yes 271 (17.0) 203 (16.3) 68 (19.9)
heart_failure, n (%) 9.328 0.002
no 1050 (66.0) 848 (67.9) 202 (59.1)
yes 541 (34.0) 401 (32.1) 140 (40.9)
peripheral_vascular, n (%) 2.698 0.100
no 1454 (91.4) 1149 (92) 305 (89.2)
yes 137 (8.6) 100 (8) 37 (10.8)
dementia, n (%) 0.041 0.840
no 1547 (97.2) 1215 (97.3) 332 (97.1)
yes 44 (2.8) 34 (2.7) 10 (2.9)
cerebrovascular, n (%) 1.454 0.228
no 1389 (87.3) 1097 (87.8) 292 (85.4)
yes 202 (12.7) 152 (12.2) 50 (14.6)
chronic pulmonary disease, n (%) 0.061 0.805
no 1253 (78.8) 982 (78.6) 271 (79.2)
yes 338 (21.2) 267 (21.4) 71 (20.8)
rheumatic disease, n (%) 0.001 0.970
no 1544 (97.0) 1212 (97) 332 (97.1)
yes 47 (3.0) 37 (3) 10 (2.9)
peptic ulcer disease, n (%) 0.061 0.804
no 1546 (97.2) 1213 (97.1) 333 (97.4)
yes 45 (2.8) 36 (2.9) 9 (2.6)
mild liver disease, n (%) 21.356 < 0.001
no 1428 (89.8) 1144 (91.6) 284 (83)
yes 163 (10.2) 105 (8.4) 58 (17)
diabetes, n (%) 4.098 0.043
no 1293 (81.3) 1028 (82.3) 265 (77.5)
yes 298 (18.7) 221 (17.7) 77 (22.5)
paraplegia, n (%) 2.047 0.153
no 1506 (94.7) 1177 (94.2) 329 (96.2)
yes 85 (5.3) 72 (5.8) 13 (3.8)
renal disease, n (%) 2.338 0.126
no 1257 (79.0) 997 (79.8) 260 (76)
yes 334 (21.0) 252 (20.2) 82 (24)
malignant cancer, n (%) 9.899 0.002
no 293 (18.4) 250 (20) 43 (12.6)
yes 1298 (81.6) 999 (80) 299 (87.4)
severe liver disease, n (%) 6.554 0.010
no 1527 (96.0) 1207 (96.6) 320 (93.6)
yes 64 (4.0) 42 (3.4) 22 (6.4)
metastatic solid tumor, n (%) 0.082 0.775
no 1512 (95.0) 1188 (95.1) 324 (94.7)
yes 79 (5.0) 61 (4.9) 18 (5.3)
aids, n (%) 0.684 0.408
no 1543 (97.0) 1209 (96.8) 334 (97.7)
yes 48 (3.0) 40 (3.2) 8 (2.3)

Risk factor identification

Lasso regression was used to identify significant risk factors associated with in-hospital mortality. The screening process is illustrated in Fig 2, with the coefficient path plot (Panel A) and cross-validation plot (Panel B). The identified risk factors included blood urea nitrogen, platelets, PT, heart rate, systolic blood pressure, APTT, SpO₂, and bicarbonate. These factors were subsequently used as key input variables for the development of machine learning models.

Fig 2. Lasso regression screening results for in-hospital mortality risk factors in patients with lymphoma.


A: Coefficient path plot of Lasso regression; B: Cross-validation plot of Lasso regression.

Model performance

The performance of 15 machine learning models in predicting in-hospital mortality is summarized in Table 2. The CatBoost Classifier achieved the highest area under the receiver operating characteristic curve (AUC) of 0.7766, with an accuracy of 79.24% and an F1 score of 0.344. The Random Forest Classifier (AUC, 0.7691) and Extra Trees Classifier (AUC, 0.7667) also demonstrated strong performance. The ROC curves for all models are shown in Fig 3, intuitively reflecting their discriminatory power. The CatBoost Classifier exhibited superior performance in distinguishing between high-risk and low-risk patients.

Table 2. Performance of each model for prediction.

Algorithm AUC Accuracy F1 score Predictive value
CatBoost Classifier 0.7766 0.7924 0.344 0.5318
Random Forest Classifier 0.7691 0.8005 0.3451 0.5814
Extra Trees Classifier 0.7667 0.7987 0.307 0.5831
Ridge Classifier 0.765 0.7835 0.2087 0.5
Linear Discriminant Analysis 0.765 0.7862 0.28 0.5227
Logistic Regression 0.7622 0.7871 0.261 0.5252
MLP Classifier 0.7538 0.7808 0.2687 0.4893
Naive Bayes 0.7518 0.7727 0.3504 0.4617
Gradient Boosting Classifier 0.7482 0.7861 0.369 0.5025
Light Gradient Boosting Machine 0.7352 0.7871 0.367 0.5053
SVM – Linear Kernel 0.7157 0.7862 0.2489 0.5826
Extreme Gradient Boosting 0.7157 0.7718 0.3448 0.4426
Ada Boost Classifier 0.6999 0.7646 0.3389 0.4316
K Neighbors Classifier 0.6569 0.7655 0.262 0.4186
Decision Tree Classifier 0.6108 0.7251 0.3922 0.3769
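
For readers less familiar with the metrics in Table 2, the following Python sketch shows one way AUC and F1 can be computed from first principles. The five labels and scores are hypothetical, and the function names are illustrative. The last lines use the cohort counts from Table 1 to compute the accuracy obtained by always predicting survival, which is a useful reference point when classes are imbalanced; in a stratified test set this fraction is approximately the same.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)  # ties count half
    return wins / (len(pos) * len(neg))

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical predictions for five patients (1 = died in hospital)
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(y_true, scores)
print(f"toy AUC = {auc:.3f}")

# Accuracy of always predicting "survivor", from the cohort counts in Table 1
majority_baseline_accuracy = 1249 / 1591
print(f"majority-class accuracy = {majority_baseline_accuracy:.3f}")
```

Because AUC depends only on how patients are ranked, it is unaffected by the class ratio, whereas accuracy and F1 depend on the chosen decision threshold and on the prevalence of the outcome.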

Fig 3. Receiver Operating Characteristic curves of 15 models for in-hospital mortality in patients with lymphoma.


Variable importance and model interpretation

To evaluate potential multicollinearity among the eight variables retained by LASSO, we computed a Pearson correlation matrix (S1 Fig). The strongest absolute correlations were 0.60 between BUN and creatinine and 0.40 between platelet count and PT; all remaining pairwise correlations were below 0.30. These low-to-moderate values indicate that multicollinearity is not a concern after LASSO regularisation, supporting the stability of the final model.
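
A pairwise Pearson correlation matrix of the kind summarized above can be computed as in the following Python sketch. The three short columns of values are hypothetical and serve only to make the example runnable; they are not taken from the study data, and the function names are illustrative.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical values for three of the selected predictors (illustrative only)
columns = {
    "bun": [10, 20, 30, 40, 50],
    "platelets": [200, 150, 180, 90, 60],
    "pt": [12.0, 13.0, 12.5, 15.0, 17.0],
}
matrix = correlation_matrix(columns)
for a in columns:
    print(a, [round(matrix[a][b], 2) for b in columns])
```

Scanning the off-diagonal entries for large absolute values is a quick screen for collinear predictors before interpreting variable importance.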

The importance weights of each variable in the models are displayed in Fig 4. Variables such as blood urea nitrogen, platelets, and PT were assigned higher weights, indicating their critical roles in predicting mortality. SHAP value analysis, shown in Fig 5, quantified the contribution of each variable to model predictions, validating the importance of key variables identified by Lasso regression. The SHAP waterfall plot (Fig 6) decomposed the prediction results for individual patients, clearly showing the positive or negative contributions of each variable to the prediction outcomes.

Fig 4. Variable importance weights.

Fig 5. The SHapley Additive exPlanation (SHAP) values.


Fig 6. SHapley Additive exPlanations (SHAP) waterfall plot.

Discussion

Our study developed and internally validated machine learning (ML) models to predict in-hospital mortality among ICU patients with lymphoma. This achievement provides support for optimizing treatment strategies and improving patient outcomes. Traditional risk stratification tools are limited in their ability to handle the complex interactions among multiple clinical variables, whereas ML models, with their powerful data processing and pattern recognition capabilities, can provide more accurate predictions [9]. In this study, the CatBoost Classifier showed moderate discrimination (AUC 0.7766), suggesting potential incremental value over traditional scores and offering a new tool for clinical decision-making.

Compared with general medical or even oncologic ICU cohorts, lymphoma patients exhibit distinctive biological and therapeutic determinants of mortality risk. First, rapid tumour-lysis syndrome and hyper-metabolic states can precipitate acute kidney injury, directly elevating BUN independent of hypovolaemia or sepsis [16]. Second, marrow infiltration or ongoing cytotoxic chemotherapy frequently induces severe thrombocytopenia that is both more sudden and profound than in other malignancies [17]. Third, anthracycline-based regimens and immune checkpoint inhibitors increase the incidence of chemotherapy-associated cardiomyopathy, arrhythmias, and capillary leak, leading to haemodynamic instability (lower systolic blood pressure, higher heart rate) and coagulopathy (prolonged PT/APTT) [18,19]. Finally, opportunistic infections common in lymphoma (e.g., Pneumocystis jirovecii, invasive moulds) generate hypoxaemia (lower SpO₂) and lactic acidosis (lower bicarbonate) that may be disproportionate to the apparent severity of organ failure [20,21]. These lymphoma-specific factors collectively confound traditional scores such as APACHE or SOFA, which were calibrated on mixed ICU populations without accounting for tumour burden or chemotherapy-related toxicities.

Previous research has primarily focused on the application of machine learning in disease diagnosis and treatment optimization, with limited exploration into predicting in-hospital mortality for specific conditions like lymphoma in the ICU [22–25]. Our study addresses this gap. Unlike previous studies that relied on a single model, we compared 15 different ML models and identified the CatBoost Classifier as the best performer. Moreover, through SHAP value analysis, we not only validated the key variables identified by Lasso regression but also revealed their specific impacts on individual predictions, providing clinicians with a more intuitive risk assessment tool. SHAP waterfall plots translate these individual explanations into immediate bedside actions: elevated BUN or severe thrombocytopenia prompts urgent nephrology consultation for renal-replacement therapy ± rasburicase and haematology review for platelet transfusion or chemotherapy dose adjustment; a large positive SHAP contribution from low SpO₂ triggers intensified respiratory monitoring and early bronchoscopy with targeted anti-Pneumocystis or antifungal therapy; and when cumulative SHAP values exceed a predefined threshold (e.g., > 0.6), the ICU team can initiate goals-of-care discussions. In this way, the waterfall plot becomes a real-time, pathophysiology-based action checklist. In this study, several variables significantly associated with in-hospital mortality were identified through Lasso regression and SHAP value analysis, including blood urea nitrogen, platelets, PT, heart rate, systolic blood pressure, APTT, SpO₂, and bicarbonate. These variables are not only statistically significant but also potentially related to the pathophysiological mechanisms of lymphoma.
Elevated blood urea nitrogen levels may indicate renal dysfunction, a common complication in lymphoma patients that can be related to tumor burden, chemotherapy-induced nephrotoxicity, or infection [21,32,33]. In the lymphoma context, BUN elevation is often multifactorial: tumour-lysis–induced urate nephropathy [26], cisplatin or ifosfamide nephrotoxicity [27], and sepsis-related acute tubular necrosis all contribute [28]. Similarly, thrombocytopenia reflects not only bone marrow suppression but also immune-mediated platelet destruction (e.g., Evans syndrome) and chemotherapy-induced myelosuppression [29,30]. It may also signal disseminated intravascular coagulation (DIC); both marrow suppression and DIC are common in lymphoma patients and associated with disease severity and poor prognosis [34,35]. Prolonged PT and APTT mirror both disseminated intravascular coagulation triggered by tumour tissue factor expression and drug-induced coagulopathy (l-asparaginase, anthracyclines) [31]. These mechanistic links may help explain why the CatBoost model, trained on lymphoma-specific data, could add value over generic scores that weight these variables identically across all ICU patients. Additionally, changes in heart rate and blood pressure may reflect circulatory instability, while decreased SpO₂ may indicate respiratory failure, both of which are critical conditions frequently encountered in ICU patients [36–38]. By leveraging ML models, we can link these clinical variables to biological mechanisms, further elucidating the underlying causes of mortality risk in lymphoma patients admitted to the ICU. Future external validation using multicenter cohorts (e.g., eICU or regional ICU networks) is warranted to confirm the generalizability of our findings beyond the MIMIC-IV setting.
Subsequent studies should incorporate longitudinal tumour-burden markers (LDH, PET-CT metabolic tumour volume) and detailed chemotherapy histories to refine lymphoma-specific mortality prediction beyond the current model.

Our study has several strengths. First, the data were sourced from the MIMIC-IV database, which contains a wealth of detailed clinical information, providing a solid foundation for model development and validation. Second, we employed multiple ML models and comprehensively evaluated their performance through cross-validation and SHAP value analysis, supporting the reliability and interpretability of the results. Finally, our study not only focused on overall model performance but also assessed individual patient risks through SHAP waterfall plots, offering more specific guidance for clinical applications.

Limitations

Despite the positive outcomes, our study has some limitations. First, MIMIC-IV is a single-center database derived from Beth Israel Deaconess Medical Center (Boston, USA), which may introduce selection bias and limit generalizability to other ICUs with different patient demographics, clinical practices, or resource availability. Future research should conduct external validation on multiple independent datasets to confirm the generalizability of the models. Second, although ML models can identify key risk factors, their predictive mechanisms remain complex and difficult to fully explain using traditional medical theories. Additionally, our study did not consider the impact of treatments and interventions during the ICU stay on patient outcomes, which may limit the precision of the model’s predictions. Because MIMIC-IV lacks lymphoma-specific variables such as histological subtype, Ann Arbor stage, recent chemotherapy exposure, tumour-lysis syndrome, or neutrophil nadir, these high-risk features could not be included in the analysis; future multicenter studies with dedicated oncology-ICU databases are needed to capture these predictors. Given the absence of granular treatment data (e.g., specific chemotherapy agents, immunotherapies, or evolving ICU bundles) in MIMIC-IV, we could not control for temporal changes in lymphoma management or critical-care practices. Future multicenter studies that incorporate longitudinal drug and intervention data are warranted to address this limitation. Additionally, because MIMIC-IV lacks detailed, time-stamped data on ICU interventions (mechanical ventilation, vasopressors, dialysis, or lymphoma-specific treatments administered after admission), our models are restricted to admission/early-ICU variables and cannot perform dynamic risk reassessment. Future linkage with high-resolution treatment logs is required to develop longitudinal prediction frameworks.

Conclusion

In summary, ML models, particularly the CatBoost Classifier, may assist in estimating the risk of in-hospital mortality among ICU patients with lymphoma. These models not only offer additional insights alongside traditional approaches but also provide interpretable risk assessments through SHAP value analysis. Future research should focus on external validation and clinical implementation to improve outcomes for this high-risk patient population.

Supporting information

S1 Data. The complete, de-identified raw dataset for all patients who met the study inclusion and exclusion criteria. (XLS; pone.0330197.s001.xls, 522 KB)

S1 Table. Extent of missing data before imputation (n = 1591). (DOCX; pone.0330197.s002.docx, 14.9 KB)

S1 Fig. Correlation matrix of the final selected variables. (TIF; pone.0330197.s003.tif, 1.4 MB)

Abbreviations

AIDS: Acquired Immunodeficiency Syndrome
AUC: Area Under the Curve
BUN: Blood Urea Nitrogen
CatBoost: Categorical Boosting
DBP: Diastolic Blood Pressure
ICU: Intensive Care Unit
IQR: Interquartile Range
ML: Machine Learning
ROC: Receiver Operating Characteristic
SD: Standard Deviation
SBP: Systolic Blood Pressure
SHAP: SHapley Additive exPlanations
SpO₂: Oxygen Saturation
WBC: White Blood Cell Count

Data Availability

The datasets analyzed during this study are derived from the publicly available MIMIC-IV v3.1 database (https://physionet.org/content/mimiciv/3.1/). Credentialed researchers can obtain access by completing the CITI Data or Specimens Only Research course and submitting a data use agreement via PhysioNet. All data are de-identified and HIPAA-compliant, and the analysis code along with variable extraction scripts have been deposited in the public GitHub repository (https://github.com/tianbilan/MIMIC-IV-Lymphoma-Mortality-ML).

Funding Statement

The author(s) received no specific funding for this work.

References

1. Mustafa M, Gladston Chelliah E, Hughes M. Patients with systemic rheumatic diseases admitted to the intensive care unit: what the rheumatologist needs to know. Rheumatol Int. 2018;38(7):1163–8. doi: 10.1007/s00296-018-4008-2
2. Bhavsar T, Crane GM. Immunodeficiency-Related Lymphoid Proliferations: New Insights With Relevance to Practice. Curr Hematol Malig Rep. 2020;15(4):360–71. doi: 10.1007/s11899-020-00594-1
3. Smedby KE, Ponzoni M. The aetiology of B-cell lymphoid malignancies with a focus on chronic inflammation and infections. J Intern Med. 2017;282(5):360–70. doi: 10.1111/joim.12684
4. Mancuso S, Mattana M, Santoro M, Carlisi M, Buscemi S, Siragusa S. Host-related factors and cancer: Malnutrition and non-Hodgkin lymphoma. Hematol Oncol. 2022;40(3):320–31. doi: 10.1002/hon.3002
5. Friman V, Winqvist O, Blimark C, Langerbeins P, Chapel H, Dhalla F. Secondary immunodeficiency in lymphoproliferative malignancies. Hematol Oncol. 2016;34(3):121–32. doi: 10.1002/hon.2323
6. Basile-Filho A, Lago AF, Menegueti MG, Nicolini EA, Rodrigues LA de B, Nunes RS, et al. The use of APACHE II, SOFA, SAPS 3, C-reactive protein/albumin ratio, and lactate to predict mortality of surgical critically ill patients: A retrospective cohort study. Medicine (Baltimore). 2019;98(26):e16204. doi: 10.1097/MD.0000000000016204
7. Lee KS, Sheen SS, Jung YJ, Park RW, Lee YJ, Chung WY, et al. Consideration of additional factors in Sequential Organ Failure Assessment score. J Crit Care. 2014;29(1):185.e9-185.e12. doi: 10.1016/j.jcrc.2013.10.006
8. Namendys-Silva SA, Silva-Medina MA, Vásquez-Barahona GM, Baltazar-Torres JA, Rivero-Sigarroa E, Fonseca-Lazcano JA, et al. Application of a modified sequential organ failure assessment score to critically ill patients. Braz J Med Biol Res. 2013;46(2):186–93. doi: 10.1590/1414-431x20122308
9. Katsaouni N, Tashkandi A, Wiese L, Schulz MH. Machine learning based disease prediction from genotype data. Biol Chem. 2021;402(8):871–85. doi: 10.1515/hsz-2021-0109
10. Battineni G, Sagaro GG, Chinatalapudi N, Amenta F. Applications of machine learning predictive models in the chronic disease diagnosis. J Pers Med. 2020;10(2).
11. Eckardt J-N, Bornhäuser M, Wendt K, Middeke JM. Application of machine learning in the management of acute myeloid leukemia: current practice and future prospects. Blood Adv. 2020;4(23):6077–85. doi: 10.1182/bloodadvances.2020002997
12. Gupta A, Moore JA. Tumor Lysis Syndrome. JAMA Oncol. 2018;4(6):895. doi: 10.1001/jamaoncol.2018.0613
13. Pillarisetti K, Edavettal S, Mendonça M, Li Y, Tornetta M, Babich A, et al. A T-cell-redirecting bispecific G-protein-coupled receptor class 5 member D x CD3 antibody to treat multiple myeloma. Blood. 2020;135(15):1232–43. doi: 10.1182/blood.2019003342
14. Rodrigues AG, Sales ARK, Faria D, Fonseca SMR, Bond MMK, Jordão CP, et al. Sympathetic neural overdrive and diminished exercise capacity in reduced ejection fraction heart failure related to anthracycline-based chemotherapy. Am J Physiol Heart Circ Physiol. 2023;325(5):H1126–32. doi: 10.1152/ajpheart.00476.2023
15. Johnson AE, Stone DJ, Celi LA, Pollard TJ. The MIMIC Code Repository: enabling reproducibility in critical care research. J Am Med Inform Assoc. 2018;25(1):32–9. doi: 10.1093/jamia/ocx084
16. Noritomi DT, Ranzani OT, Ferraz LJR, Dos Santos MC, Cordioli E, Albaladejo R, et al. TELE-critical Care verSus usual Care On ICU PErformance (TELESCOPE): protocol for a cluster-randomised clinical trial on adult general ICUs in Brazil. BMJ Open. 2021;11(6):e042302. doi: 10.1136/bmjopen-2020-042302
  • 17.Rajkomar A, Oren E, Chen K, Dai AM, Hajaj N, Hardt M, et al. Scalable and accurate deep learning with electronic health records. NPJ Digit Med. 2018;1:18. doi: 10.1038/s41746-018-0029-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Han K-Y, Gu J, Wang Z, Liu J, Zou S, Yang C-X, et al. Association Between METS-IR and Prehypertension or Hypertension Among Normoglycemia Subjects in Japan: A Retrospective Study. Front Endocrinol (Lausanne). 2022;13:851338. doi: 10.3389/fendo.2022.851338 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Esposito R, Fedele T, Orefice S, Cuomo V, Prastaro M, Canonico ME. An emergent form of cardiotoxicity: acute myocarditis induced by immune checkpoint inhibitors. Biomolecules. 2021;11(6). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Pardo E, Lemiale V, Mokart D, Stoclin A, Moreau A-S, Kerhuel L, et al. Invasive pulmonary aspergillosis in critically ill patients with hematological malignancies. Intensive Care Med. 2019;45(12):1732–41. doi: 10.1007/s00134-019-05789-6 [DOI] [PubMed] [Google Scholar]
  • 21.Atamna B, Rozental A, Haj Yahia M, Itchaki G, Gurion R, Yeshurun M, et al. Tumor-Associated Lactic Acidosis and Early Death in Patients With Lymphoma. Cancer Med. 2025;14(7):e70824. doi: 10.1002/cam4.70824 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Guan C, Gong A, Zhao Y, Yin C, Geng L, Liu L, et al. Interpretable machine learning model for new-onset atrial fibrillation prediction in critically ill patients: a multi-center study. Crit Care. 2024;28(1):349. doi: 10.1186/s13054-024-05138-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Fernandes J, Cardoso V, Comesaña-Campos A, Pinheira A. Comprehensive review: Machine and deep learning in brain stroke diagnosis. Sensors (Basel). 2024;24(13). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Xu J, Chen T, Fang X, Xia L, Pan X. Prediction model of pressure injury occurrence in diabetic patients during ICU hospitalization--XGBoost machine learning model can be interpreted based on SHAP. Intensive Crit Care Nurs. 2024;83:103715. doi: 10.1016/j.iccn.2024.103715 [DOI] [PubMed] [Google Scholar]
  • 25.Le NQK, Tran T-X, Nguyen P-A, Ho T-T, Nguyen V-N. Recent progress in machine learning approaches for predicting carcinogenicity in drug development. Expert Opin Drug Metab Toxicol. 2024;20(7):621–8. doi: 10.1080/17425255.2024.2356162 [DOI] [PubMed] [Google Scholar]
  • 26.Mouri Y, Natsumeda M, Okubo N, Sato T, Saito T, Shibuya K, et al. Successful Treatment of Acute Uric Acid Nephropathy with Rasburicase in a Primary Central Nervous System Lymphoma Patient Showing a Dramatic Response to Methotrexate-Case Report. J Clin Med. 2022;11(19):5548. doi: 10.3390/jcm11195548 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Chayama S, Sato H, Takase K, Hayashi K, Miyake T, Kin S. Severe Hypophosphatemia Potentially Associated with Intracellular Phosphate Shift Concomitant with Acute Kidney Injury in a Patient with Rapidly Proliferating Diffuse Large B-cell Lymphoma. Intern Med. 2025;64(12):1872–6. doi: 10.2169/internalmedicine.3892-24 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Kayataş M, Yıldız G, Timuçin M, Candan F, Yıldız E, Sencan M. A case of acute renal failure caused by Hodgkin’s lymphoma: concurrent membranous glomerulonephritis and interstitial HL-CD 20 lymphoid infiltration. Ren Fail. 2011;33(3):363–6. doi: 10.3109/0886022X.2011.560986 [DOI] [PubMed] [Google Scholar]
  • 29.Zeng C, Han M, Fan J, He X, Jia R, Li L, et al. Anemia and Bone Marrow Suppression After Intra-Arterial Chemotherapy in Children With Retinoblastoma: A Retrospective Analysis. Front Oncol. 2022;12:848877. doi: 10.3389/fonc.2022.848877 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Yue J, Shui H, Li G, Yang B, Tang D. Chemotherapy-Induced Bone Marrow Suppression: A Bibliometrics Analysis. J Multidiscip Healthc, 2025,18:1895–911. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Bowyer A, Smith J, Woolley AM, Kitchen S, Hampton KK, Maclean RM, et al. The investigation of a prolonged APTT with specific clotting factor assays is unnecessary if an APTT with Actin FS is normal. Int J Lab Hematol. 2011;33(2):212–8. doi: 10.1111/j.1751-553X.2010.01266.x [DOI] [PubMed] [Google Scholar]
  • 32.Bogari MH, Munshi A, Almuntashiri S, Bogari A, Abdullah AS, Albadri M, et al. Acute gastroenteritis-related acute kidney injury in a tertiary care center. Ann Saudi Med. 2023;43(2):82–9. doi: 10.5144/0256-4947.2023.82 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Sindhu G, Nishanthi E, Sharmila R. Nephroprotective effect of vanillic acid against cisplatin induced nephrotoxicity in wistar rats: a biochemical and molecular study. Environ Toxicol Pharmacol. 2015;39(1):392–404. doi: 10.1016/j.etap.2014.12.008 [DOI] [PubMed] [Google Scholar]
  • 34.Levi M, Scully M. How I treat disseminated intravascular coagulation. Blood. 2018;131(8):845–54. doi: 10.1182/blood-2017-10-804096 [DOI] [PubMed] [Google Scholar]
  • 35.Levi M. Disseminated Intravascular Coagulation in Cancer: An Update. Semin Thromb Hemost. 2019;45(4):342–7. doi: 10.1055/s-0039-1687890 [DOI] [PubMed] [Google Scholar]
  • 36.Jaychandran R, Chaitanya G, Satishchandra P, Bharath RD, Thennarasu K, Sinha S. Monitoring peri-ictal changes in heart rate variability, oxygen saturation and blood pressure in epilepsy monitoring unit. Epilepsy Res. 2016;125:10–8. doi: 10.1016/j.eplepsyres.2016.05.013 [DOI] [PubMed] [Google Scholar]
  • 37.Sharma KP. Temporary hypoxemia at high altitude in an intensive care unit physician. SAGE Open Med Case Rep. 2023;11:2050313X231153526. doi: 10.1177/2050313X231153526 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Thijssen M, Janssen L, le Noble J, Foudraine N. Facing SpO2 and SaO2 discrepancies in ICU patients: is the perfusion index helpful? J Clin Monit Comput. 2020;34(4):693–8. doi: 10.1007/s10877-019-00371-3 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Chiara Lazzeri

24 Jun 2025

PONE-D-25-18876

Predicting In-Hospital Mortality in ICU Patients with lymphoma Mellitus Using Machine Learning Models

PLOS ONE

Dear Dr. Lan,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Aug 08 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols .

We look forward to receiving your revised manuscript.

Kind regards,

Chiara Lazzeri

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. When completing the data availability statement of the submission form, you indicated that you will make your data available on acceptance. We strongly recommend all authors decide on a data sharing plan before acceptance, as the process can be lengthy and hold up publication timelines. Please note that, though access restrictions are acceptable now, your entire data will need to be made freely accessible if your manuscript is accepted for publication. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If you are unable to adhere to our open data policy, please kindly revise your statement to explain your reasoning and we will seek the editor's input on an exemption. Please be assured that, once you have provided your new statement, the assessment of your exemption will not hold up the peer review process.

4. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: 1. Regarding the outcome measure, while the study designates "in-hospital mortality" as the primary endpoint, further clarification on the temporal boundary of this definition would enhance the clinical interpretability of results. Given that mortality at different time points in ICU patients carries distinct clinical implications (e.g., 7-day mortality reflecting acute intervention efficacy, 28-day mortality indicating comprehensive treatment outcomes, and long-term mortality associated with underlying disease progression), it is recommended to reference the standard definitions in comparable studies and supplement the methodological rationale for time frame delineation.

2. Potential collinearity exists among variables selected by Lasso regression (e.g., BUN and Cr as renal function markers, platelet count and PT in coagulation function). Although Lasso regularization mitigates this issue, adding a correlation matrix would strengthen the argument for model robustness. Additionally, incorporating SHAP dependence plots or interaction value analyses to explore nonlinear interactions between key variables (e.g., BUN and platelet count) could provide richer evidence for clinical interpretation of model prediction mechanisms.

3. While baseline characteristics acknowledge the impact of multiple comorbidities, integrating a standardized comorbidity index (e.g., Charlson Index) to quantify comorbidity burden would systemize the risk factor analysis. As different comorbidities contribute differently to mortality, supplementing correlation analyses between comorbidity indices and death outcomes would enhance the clinical relevance of findings.

Reviewer #2: It's my pleasure to review the manuscript titled "Predicting In-Hospital Mortality in ICU Patients with lymphoma Mellitus Using Machine Learning Models". The study develops and validates ML models for predicting in-hospital mortality in ICU patients with presumed lymphoma. While the methodology is generally sound and addresses a clinically relevant problem, critical terminology errors undermine the foundation of the work. Significant revisions are required before consideration for PLOS ONE. The AUC of 0.7766 is modest, and clinical applicability needs stronger justification.

________________________________________

The major issues of the research include:

1. "Lymphoma Mellitus" is incorrect and non-existent. "Mellitus" specifically refers to diabetes (e.g., Diabetes Mellitus). The correct term is simply "Lymphoma".

2. The introduction could better emphasize the specific challenges of predicting mortality in lymphoma patients compared to the general ICU population (e.g., unique complications like tumor lysis syndrome, immunosuppression-related infections, specific organ involvement). The gap regarding ML for this specific subgroup is adequately stated.

3. The exclusion of 10,054 patients due to "missing values for all variables" (Fig 1) is unusual and concerning. It suggests potential selection bias. Were these patients truly missing every single variable collected? Clarification or rephrasing is needed (e.g., "missing key predictor or outcome variables").

4. The long timeframe (2008-2019) introduces potential confounding from evolving ICU practices and lymphoma treatments over 11 years. This isn't addressed.

5. MIMIC-IV, while large, is single-center data (Beth Israel Deaconess MC). Generalizability to other settings may be limited, appropriately noted as a limitation.

6. While LASSO identified predictors, the justification for the initial set of variables extracted from MIMIC-IV is somewhat brief. Were lymphoma-specific variables (e.g., disease stage, type [Hodgkin/Non-Hodgkin], recent chemotherapy, presence of tumor lysis syndrome, neutropenia) considered or available? These are highly relevant to mortality risk in lymphoma patients.

7. The models only use admission/early ICU data. Interventions during the ICU stay (e.g., mechanical ventilation, vasopressors, dialysis, specific lymphoma treatments) are not included as potential predictors or confounders, significantly limiting the model's potential clinical applicability for dynamic risk assessment. This is a major omission.

8. The P-values in Table 1 require context. With 1591 patients, very small differences can become statistically significant. Emphasis should be on clinically meaningful differences. Reporting effect sizes (e.g., mean difference, Cohen's d for continuous; odds ratio for categorical) alongside P-values would be beneficial. Some variables (e.g., platelets, BUN) show large and clinically meaningful differences.

9. Details on hyperparameter tuning for the ML models (especially complex ones like CatBoost, NN) are lacking. Was tuning performed? How? This impacts performance and reproducibility.

10. The best model's AUC (0.7766) is modest for a clinical prediction model. While potentially better than traditional scores (though direct comparison isn't rigorously made), an AUC <0.8 often indicates limited clinical utility for individual prediction. This needs careful interpretation and tempering of claims about "high predictive performance" or "significant outperforming".

11. The description of data splitting for training/validation is unclear. Was a strict hold-out test set used after feature selection (LASSO) and hyperparameter tuning? Or was everything done within cross-validation folds? Preventing data leakage is crucial; the methodology section needs clarification.

12. The class imbalance (21.5% mortality) is acknowledged but the specific techniques used to handle it during model training (e.g., class weighting, sampling methods) are not described. This can significantly impact model performance, especially for metrics like F1-score.

13. Table 1 Presentation: Mixing "mean (SD)" and "Median (IQR)" formats for continuous variables without a clear rationale based on distribution (e.g., normality) is inconsistent. Variables like platelets and BUN are correctly presented as median (IQR) as they are skewed, but others (e.g., heart rate) presented as mean (SD) should be checked for normality or also presented as median (IQR) for consistency. Statistical tests (presumably t-tests/Wilcoxon, Chi-square/Fisher) used for Table 1 are not explicitly named.

14. P-value Interpretation: Reliance on P-values <0.05 in Table 1 without adjustment for multiple comparisons (e.g., Bonferroni, FDR) risks false positives. Given the large number of comparisons, discussing clinically significant differences is more important than purely statistical significance.

15. Missing Data: While median/mode imputation is common, the potential bias introduced by imputing missing values (especially if not missing at random - MAR) isn't discussed. The extent of missingness per variable before imputation isn't reported.

16. The discussion has several shortcomings.

(1) Overstatement of Performance: The modest AUC (0.7766) is not sufficiently critically discussed. Claims of "outstanding performance" or "significantly outperforming traditional methods" are not fully supported by the data presented (no direct comparison to SOFA/APACHE scores is shown). The clinical utility of an AUC of 0.7766 needs realistic appraisal.

(2) Lymphoma Specificity: The discussion doesn't deeply engage with why lymphoma might pose unique prediction challenges compared to other ICU populations, or how the identified predictors might relate specifically to lymphoma pathophysiology beyond general critical illness (e.g., tumor burden impacting BUN/platelets, chemotherapy effects).

(3) SHAP Utility: While SHAP is highlighted, the discussion could better elaborate on the concrete clinical value of the individual risk assessments (SHAP waterfall plots) shown in Fig 6. How would this directly change management?

(4)
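Comments 8 and 14 above ask for effect sizes and for adjustment of P-values across many comparisons. The following is a minimal sketch of both ideas, assuming Cohen's d with a pooled standard deviation and the Benjamini-Hochberg false discovery rate procedure. The function names and the example numbers are hypothetical illustrations and are not taken from the manuscript or its data.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented values for illustration only (not study data).
    non_survivors = [42.0, 55.0, 38.0, 61.0, 47.0]
    survivors = [21.0, 30.0, 18.0, 26.0, 24.0]
    print("Cohen's d:", round(cohens_d(non_survivors, survivors), 2))

    raw_p = [0.001, 0.04, 0.03, 0.20]
    print("BH-adjusted:", [round(p, 3) for p in benjamini_hochberg(raw_p)])
```

Reporting the adjusted P-values next to an effect size lets a reader separate differences that are merely detectable in a cohort of 1591 patients from those large enough to matter clinically.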

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes:  Feng SHEN

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step.

PLoS One. 2025 Aug 20;20(8):e0330197. doi: 10.1371/journal.pone.0330197.r002

Author response to Decision Letter 1


25 Jul 2025

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Response:

Thank you for reminding us of PLOS ONE’s formatting and file-naming standards.

We have revised the manuscript and supplementary files to fully comply with PLOS ONE’s style requirements, including file naming, formatting, figure preparation, and ethical/data-availability statements.

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

Response:

Thank you very much for your guidance regarding PLOS ONE’s code-sharing requirements.

We have carefully reviewed PLOS ONE’s code-sharing policy and confirm that all author-generated code that underpins the findings of this study is publicly available without restrictions.

3. When completing the data availability statement of the submission form, you indicated that you will make your data available on acceptance. We strongly recommend all authors decide on a data sharing plan before acceptance, as the process can be lengthy and hold up publication timelines. Please note that, though access restrictions are acceptable now, your entire data will need to be made freely accessible if your manuscript is accepted for publication. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If you are unable to adhere to our open data policy, please kindly revise your statement to explain your reasoning and we will seek the editor's input on an exemption. Please be assured that, once you have provided your new statement, the assessment of your exemption will not hold up the peer review process.

Response:

Thank you for the editorial reminder. We have now finalized our data-sharing plan and confirm that all datasets analysed in the study will be made freely accessible upon acceptance. Specifically:

• The fully de-identified MIMIC-IV v3.1 database can be obtained by any credentialed researcher who completes the CITI “Data or Specimens Only Research” course and signs the PhysioNet Data Use Agreement (https://physionet.org/content/mimiciv/3.1/).

• No identifiable patient information is included; therefore, public release complies with the protocol approved by our institutional review boards and does not breach any ethics requirements.

We have updated the “Data Availability” statement in the revised manuscript accordingly and confirm that no exemption from open-data sharing is requested.

4. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript.

Response:

We thank the editors for pointing this out. In the revised manuscript we have moved the complete ethics statement into the Methods section under the subheading “Data Source and Study Design,” and we have removed any duplicate or redundant ethics language from all other sections (including the Acknowledgements). The manuscript now contains a single, concise ethics statement that confirms the use of de-identified public data (MIMIC-IV), exemption from IRB review, and compliance with the 1964 Helsinki Declaration and its later amendments.

Reviewer #1:

1. Regarding the outcome measure, while the study designates "in-hospital mortality" as the primary endpoint, further clarification on the temporal boundary of this definition would enhance the clinical interpretability of results. Given that mortality at different time points in ICU patients carries distinct clinical implications (e.g., 7-day mortality reflecting acute intervention efficacy, 28-day mortality indicating comprehensive treatment outcomes, and long-term mortality associated with underlying disease progression), it is recommended to reference the standard definitions in comparable studies and supplement the methodological rationale for time frame delineation.

Response:

We thank the reviewer for this valuable comment. In the revised Methods section, under “Outcome Definition,” we have now added explicit temporal boundaries:

In-hospital mortality is defined as death occurring at any time between ICU admission and hospital discharge (index hospitalisation), consistent with the standard definition adopted by the MIMIC-IV database and prior large-scale ICU outcome studies (Noritomi et al., 2021; Rajkomar et al., 2018). This endpoint was chosen because (1) it captures the full spectrum of acute and subacute mortality risk directly pertinent to ICU prognostication, and (2) it aligns with clinical decision-making contexts where treatment intensity and goals-of-care discussions are re-evaluated throughout the entire hospital stay. Although 7-day or 28-day mortality are alternative metrics, they were not selected because ICU length of stay in lymphoma patients is often prolonged by chemotherapy schedules and infectious complications, making in-hospital mortality the most clinically relevant and patient-centred outcome for this cohort.
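As a minimal illustration of the temporal boundary described above, the sketch below flags a death as in-hospital when it falls between ICU admission and hospital discharge, and shows how a fixed-window endpoint such as 28-day mortality would classify the same patient. The argument names mirror the MIMIC-IV `intime`, `dischtime`, and `deathtime` fields, but the functions and the example timestamps are hypothetical and are not taken from the study code.

```python
from datetime import datetime, timedelta
from typing import Optional


def died_in_hospital(
    icu_intime: datetime, dischtime: datetime, deathtime: Optional[datetime]
) -> bool:
    """True if death occurred between ICU admission and hospital discharge."""
    return deathtime is not None and icu_intime <= deathtime <= dischtime


def died_within_days(
    icu_intime: datetime, deathtime: Optional[datetime], days: int
) -> bool:
    """True if death occurred within a fixed window after ICU admission."""
    if deathtime is None:
        return False
    elapsed = deathtime - icu_intime
    return timedelta(0) <= elapsed <= timedelta(days=days)


if __name__ == "__main__":
    # Invented timestamps: a long stay with death on hospital day 40.
    icu_in = datetime(2150, 1, 1, 8, 0)
    discharge = datetime(2150, 2, 15, 12, 0)
    death = datetime(2150, 2, 10, 3, 30)

    print("in-hospital death:", died_in_hospital(icu_in, discharge, death))
    print("28-day death:", died_within_days(icu_in, death, 28))
```

The example shows why the two endpoints can disagree for prolonged admissions: a death late in a long stay counts toward in-hospital mortality but not toward a 28-day window.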

2. Potential collinearity exists among variables selected by Lasso regression (e.g., BUN and Cr as renal function markers, platelet count and PT in coagulation function). Although Lasso regularization mitigates this issue, adding a correlation matrix would strengthen the argument for model robustness. Additionally, incorporating SHAP dependence plots or interaction value analyses to explore nonlinear interactions between key variables (e.g., BUN and platelet count) could provide richer evidence for clinical interpretation of model prediction mechanisms.

Response:

We thank the reviewer for highlighting the importance of demonstrating the absence of harmful collinearity. In the revised manuscript we have added a full Pearson correlation matrix for the eight LASSO-selected variables (Supplementary Figure S1). The highest absolute correlation observed is 0.60 (BUN vs. creatinine), followed by 0.40 (platelets vs. PT); all remaining pairwise correlations are < 0.30. These low-to-moderate values indicate that multicollinearity is not a concern after LASSO regularisation, and the model’s coefficients remain stable and interpretable.

Given the limited magnitude of these correlations and the fact that the LASSO penalty already shrinks highly collinear predictors, we did not deem SHAP interaction analyses essential. Instead, the correlation matrix and the robust performance metrics (AUC = 0.7766, stable across cross-validation folds) jointly support the model’s reliability.
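For readers who wish to reproduce this kind of collinearity check, the following is a minimal sketch of a pairwise Pearson correlation matrix with a simple flag for strongly correlated pairs. The variable names, the toy values, and the 0.6 cut-off are assumptions chosen for illustration; they are not the study data or the code used to produce Supplementary Figure S1.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(ss_x * ss_y)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a {name: values} mapping."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def flag_pairs(matrix, threshold=0.6):
    """List each variable pair whose absolute correlation meets the threshold."""
    names = list(matrix)
    return [
        (a, b, matrix[a][b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(matrix[a][b]) >= threshold
    ]


if __name__ == "__main__":
    # Invented values for three predictors (not study data).
    toy = {
        "bun": [18, 25, 40, 32, 55, 21],
        "platelets": [210, 180, 95, 150, 60, 230],
        "heart_rate": [88, 95, 92, 110, 101, 84],
    }
    matrix = correlation_matrix(toy)
    for a, b, r in flag_pairs(matrix):
        print(f"{a} vs {b}: r = {r:.2f}")
```

A correlation matrix only captures linear pairwise association, so it complements rather than replaces interaction analyses such as SHAP dependence plots.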

3. While baseline characteristics acknowledge the impact of multiple comorbidities, integrating a standardized comorbidity index (e.g., Charlson Index) to quantify comorbidity burden would systemize the risk factor analysis. As different comorbidities contribute differently to mortality, supplementing correlation analyses between comorbidity indices and death outcomes would enhance the clinical relevance of findings.

Response:

We appreciate the reviewer’s suggestion. After LASSO-based feature selection, none of the individual comorbidity indicators were retained in the final eight-variable model. Because the Charlson Comorbidity Index (CCI) itself is a composite of those same comorbidities, including it would introduce redundancy and violate the parsimony principle already enforced by LASSO. Furthermore, when we compared the predictive performance of (a) the LASSO-selected model versus (b) the same model augmented with the CCI, the AUC remained essentially unchanged (ΔAUC = +0.003). Therefore, we elected to keep the original, more parsimonious set of variables and did not integrate the Charlson Index.
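The comparison described in this response, a baseline model versus the same model augmented with one extra feature, reduces to a difference of two AUC values computed on the same held-out labels. The sketch below computes AUC from its rank-based definition (the probability that a randomly chosen death is scored higher than a randomly chosen survivor). The labels and scores are hypothetical toy values, so the resulting difference is unrelated to the ΔAUC reported above.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical held-out labels (1 = died in hospital) and predicted risks
y_true = [0, 0, 1, 1]
scores_baseline = [0.10, 0.40, 0.35, 0.80]   # baseline feature set
scores_augmented = [0.10, 0.30, 0.35, 0.80]  # baseline plus one extra feature

auc_baseline = roc_auc(y_true, scores_baseline)
auc_augmented = roc_auc(y_true, scores_augmented)
delta_auc = auc_augmented - auc_baseline
```

For a fair comparison, both sets of scores should come from the same cross-validation folds or test set, so that the difference reflects the added feature rather than a different data split.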

Reviewer #2: It's my pleasure to review the manuscript titled "Predicting In-Hospital Mortality in ICU Patients with lymphoma Mellitus Using Machine Learning Models". The study develops and validates ML models for predicting in-hospital mortality in ICU patients with presumed lymphoma. While the methodology is generally sound and addresses a clinically relevant problem, critical terminology errors undermine the foundation of the work. Significant revisions are required before consideration for PLOS ONE. The AUC of 0.7766 is modest, and clinical applicability needs stronger justification.

________________________________________

The major issues with the research include:

1. "Lymphoma Mellitus" is incorrect and non-existent. "Mellitus" specifically refers to diabetes (e.g., Diabetes Mellitus). The correct term is simply "Lymphoma".

Response:

We sincerely thank the reviewer for pointing out the terminology error. We fully acknowledge that “Lymphoma Mellitus” is an incorrect and non-existent term; “mellitus” is reserved for diabetes mellitus. In the revised manuscript we have carefully replaced every instance of “Lymphoma Mellitus” with “Lymphoma” throughout the title, abstract, main text, figures, tables, and supplementary materials to ensure accuracy and clarity.

2. The introduction could better emphasize the specific challenges of predicting mortality in lymphoma patients compared to the general ICU population (e.g., unique complications like tumor lysis syndrome, immunosuppression-related infections, specific organ involvement). The gap regarding ML for this specific subgroup is adequately stated.

Response:

We appreciate the reviewer’s helpful suggestion. In the revised Introduction (paragraph 2, lines 45–58) we have added a concise paragraph that explicitly contrasts lymphoma ICU patients with the general ICU population. We now highlight lymphoma-specific mortality drivers such as acute tumor lysis syndrome, profound immunosuppression-related opportunistic infections (e.g., Pneumocystis jirovecii, invasive moulds), chemotherapy-induced cardiotoxicity, and direct renal or hepatic infiltration. These unique pathophysiological features render traditional scores (SOFA, APACHE) less reliable for this subgroup, underscoring the need for tailored machine-learning models.

3. The exclusion of 10,054 patients due to "missing values for all variables" (Fig 1) is unusual and concerning. It suggests potential selection bias. Were these patients truly missing every single variable collected? Clarification or rephrasing is needed (e.g., "missing key predictor or outcome variables").

Response:

We thank the reviewer for raising this important point. The wording in the original flow-chart was indeed imprecise. The 10,054 patients were not missing every variable; rather, they were excluded because they lacked one or more of the key predictor or outcome variables required for modelling (e.g., vital signs at ICU admission, essential laboratory values such as BUN or platelets, or the in-hospital mortality flag). These missing data rendered them non-evaluable by our modelling pipeline. We have updated the legend of Figure 1 and the Methods section (Data Collection and Processing) to clarify this exclusion criterion as “missing key predictor or outcome variables” instead of the overly broad phrase “missing values for all variables.” This change more accurately reflects the selection process and mitigates concern for undue selection bias.
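The exclusion rule described in this response (drop a patient when any key predictor or the outcome is missing) can be sketched as a simple filter. The field names in `REQUIRED` and the three example records are hypothetical; the actual list of required variables is not enumerated in the response.

```python
# Hypothetical set of key predictor and outcome fields
REQUIRED = ("heart_rate", "bun", "platelets", "died_in_hospital")

def is_evaluable(record, required=REQUIRED):
    """True only if every required field is present and not None."""
    return all(record.get(field) is not None for field in required)

def split_cohort(records, required=REQUIRED):
    """Separate records into (included, excluded) by the missingness rule."""
    included = [r for r in records if is_evaluable(r, required)]
    excluded = [r for r in records if not is_evaluable(r, required)]
    return included, excluded

# Hypothetical records (not patient data)
records = [
    {"id": 1, "heart_rate": 92, "bun": 30, "platelets": 110, "died_in_hospital": 0},
    {"id": 2, "heart_rate": None, "bun": 22, "platelets": 200, "died_in_hospital": 0},
    {"id": 3, "heart_rate": 105, "bun": 48, "platelets": 60},  # outcome missing
]

included, excluded = split_cohort(records)
included_ids = [r["id"] for r in included]
excluded_ids = [r["id"] for r in excluded]
```

Reporting the size of the excluded group alongside the reason for exclusion, as the revised Figure 1 legend does, lets readers judge how much selection bias a complete-case rule of this kind might introduce.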

4. The long timeframe (2008-2019) introduces potential confounding from evolving ICU practices and lymphoma treatments over 11 years. This isn't addressed.

Response:

We appreciate the reviewer’s concern regarding the 2008–2019 time span and its potential for confounding by evolving ICU practices and lymphoma treatments. Because the MIMIC-IV database does not contain detailed, time-stamped information about chemotherapy regimens, immunotherapy protocols, or changes in critical-care bundles over the years, we are unable to adjust for these factors in the current analysis. Consequently, we have added the following sentence to the Limitations section of the revised manuscript:

“Given the absence of granular treatment data (e.g., specific chemotherapy agents, immunotherapies, or evolving ICU bundles) in MIMIC-IV, we could not control for temporal changes in lymphoma management or critical-care practices. Future multicenter studies that incorporate longitudinal drug and intervention data are warranted to address this limitation.”

5. MIMIC-IV, while large, is single-center data (Beth Israel Deaconess MC). Generalizability to other settings may be limited, appropriately noted as a limitation.

Response:

We thank the reviewer for highlighting the issue of generalizability related to the single-center nature of MIMIC-IV. We fully agree that the database is derived solely from Beth Israel Deaconess Medical Center (Boston, USA) and may not reflect patient demographics, ICU practices, or lymphoma treatment patterns in other settings. In the revised manuscript we have explicitly stated:

“MIMIC-IV is a single-center database derived from Beth Israel Deaconess Medical Center (Boston, USA), which may introduce selection bias and limit generalizability to other ICUs with different patient demographics, clinical practices, or resource availability. Future research should conduct external validation on multiple independent datasets to confirm the generalizability of the models.”

6. While LASSO identified predictors, the justification for the initial set of variables extracted from MIMIC-IV is somewhat brief. Were lymphoma-specific variables (e.g., disease stage, type [Hodgkin/Non-Hodgkin], recent chemotherapy, presence of tumor lysis syndrome, neutropenia) considered or available? These are highly relevant to mortality risk in lymphoma patients.

Response:

We thank the reviewer for emphasizing the clinical importance of lymphoma-specific covariates. Unfortunately, MIMIC-IV does not contain ICD-O histology codes, Ann Arbor stage, Hodgkin vs. non-Hodgkin sub-typing, date-stamped chemotherapy records, tumour-lysis-syndrome flags, or serial neutrophil counts that would allow reliable identification of neutropenia. Consequently, disease stage, lymphoma subtype, recent cytotoxic therapy, TLS, and neutropenia could not be incorporated into the initial variable set. We have now added the following sentence to the Data Collection subsection (Methods) and to the Limitations section:

“Because MIMIC-IV lacks lymphoma-specific variables such as histological subtype, Ann Arbor stage, recent chemotherapy exposure, tumour-lysis syndrome, or neutrophil nadir, these high-risk features could not be included in the analysis; future multicenter studies with dedicated oncology-ICU databases are needed to capture these predictors.”

7. The models only use admission/early ICU data. Interventions during the ICU stay (e.g., mechanical ventilation, vasopressors, dialysis, specific lymphoma treatments) are not included as potential predictors or confounders, significantly limiting the model's potential clinical applicability for dynamic risk assessment. This is a major omission.

Response:

We thank the reviewer for pointing out this critical limitation. MIMIC-IV does not contain granular, time-stamped logs of mechanical-ventilator settings, vasopressor doses, renal-replacement-therapy sessions, or lymphoma-specific therapies (e.g., rasburicase, dose-adjusted chemotherapy) administered after ICU admission. Consequently, these dynamic interventions could neither be included as predictors nor adjusted for as potential time-varying confounders, which indeed curtails the model’s u

Attachment

Submitted filename: Response to Reviewers.docx

pone.0330197.s005.docx (28.6KB, docx)

Decision Letter 1

Chiara Lazzeri

29 Jul 2025

Predicting In-Hospital Mortality in ICU Patients with lymphoma Using Machine Learning Models

PONE-D-25-18876R1

Dear Dr. Lan,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager and clicking the ‘Update My Information’ link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Chiara Lazzeri

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Chiara Lazzeri

PONE-D-25-18876R1

PLOS ONE

Dear Dr. Lan,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Chiara Lazzeri

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Data. The complete, de-identified raw dataset for all patients who met the study inclusion and exclusion criteria.

    (XLS)

    pone.0330197.s001.xls (522KB, xls)
    S1 Table. Extent of missing data before imputation (n = 1591).

    (DOCX)

    pone.0330197.s002.docx (14.9KB, docx)
    S1 Fig. Correlation matrix of the final selected variables.

    (TIF)

    pone.0330197.s003.tif (1.4MB, tif)
    Attachment

    Submitted filename: Response to Reviewers.docx

    pone.0330197.s005.docx (28.6KB, docx)

    Data Availability Statement

    The datasets analyzed during this study are derived from the publicly available MIMIC-IV v3.1 database (https://physionet.org/content/mimiciv/3.1/). Credentialed researchers can obtain access by completing the CITI Data or Specimens Only Research course and submitting a data use agreement via PhysioNet. All data are de-identified and HIPAA-compliant, and the analysis code along with variable extraction scripts have been deposited in the public GitHub repository (https://github.com/tianbilan/MIMIC-IV-Lymphoma-Mortality-ML).


    Articles from PLOS One are provided here courtesy of PLOS
