Abstract
Introduction:
Acute liver injury (ALI) is a common complication of sepsis and is associated with adverse clinical outcomes. We aimed to develop a model to predict the risk of ALI in patients with sepsis after hospitalization.
Methods:
Medical records of 3196 septic patients treated at the Lishui Central Hospital in Zhejiang Province from January 2015 to May 2023 were selected. Cohort 1 (January 2015–December 2020) was divided into ALI and non-ALI groups for model training and internal validation. The initial laboratory test results of the study subjects were used as features for machine learning (ML), and models built using nine different ML algorithms were compared to select the best algorithm and model. The predictive performance of model stacking methods was then explored. The best model was externally validated in Cohort 2 (January 2021–May 2023).
Results:
In Cohort 1, LightGBM demonstrated good stability and predictive performance with an area under the curve (AUC) of 0.841. The top five most important variables in the model were diabetes, congestive heart failure, prothrombin time, heart rate, and platelet count. The LightGBM model showed stable and good ALI risk prediction ability in the external validation of Cohort 2 with an AUC of 0.815. Furthermore, an online prediction website was developed to assist healthcare professionals in applying this model more effectively.
Conclusions:
The LightGBM model can predict the risk of ALI in patients with sepsis after hospitalization.
Keywords: Acute liver injury, external validation, machine learning, model stacking, sepsis
INTRODUCTION
Currently, sepsis has emerged as a severe infectious disease and a leading cause of mortality in intensive care unit (ICU) patients.[1] Globally, at least 30,000 individuals are affected by sepsis each year, with a mortality rate of 30%–50%.[2] Sepsis is typically caused by pathogens such as bacteria, viruses, or fungi and leads to a systemic inflammatory response syndrome. Its characteristics include abnormal body temperature, increased heart rate, rapid breathing, elevated or reduced white blood cell counts, and multiple organ dysfunction. Sepsis can rapidly deteriorate, resulting in acute organ dysfunction, including acute liver injury (ALI).[3] ALI is a pathological condition in which liver function is impaired over a short period, usually hours to days. It typically manifests as elevated levels of liver enzymes such as aspartate aminotransferase (AST) and alanine aminotransferase (ALT). ALI has many causes, including infection, drug or alcohol poisoning, metabolic disease, shock, drug reactions, and autoimmune disease, and its severity ranges from mild liver enzyme abnormalities to severe liver failure.[4] Liver disease accounts for two million deaths annually and is responsible for 4% of all deaths (1 out of every 25 deaths worldwide); ALI is one of these causes of death.[5] In sepsis patients, ALI is a common and dangerous complication: the systemic inflammatory response triggered by sepsis can injure the liver, potentially resulting in a rapid decline in liver function and even liver failure.[6] Therefore, early prediction of the severity of ALI in septic patients is crucial for providing timely and appropriate interventions.
Traditional clinical markers and scoring systems, while valuable, often lack the sensitivity and specificity required for early ALI prediction. In recent years, machine learning (ML) models have emerged as powerful tools in health care, offering the potential to enhance prediction accuracy and inform clinical decision-making.[7] ML, a subfield of artificial intelligence, focuses on how computer systems can learn from data and improve their performance without explicit programming.[8] In contrast to traditional models, ML allows computers to automatically discover patterns in large amounts of data and make predictions and decisions based on these patterns, yielding superior performance,[9] see supplementary material 1 [Supplementary Table 1]. ML finds extensive applications in many fields, including medical diagnosis, image and speech recognition, financial forecasting, and recommendation systems, and has emerged as a powerful tool in health-care research for analyzing complex datasets and making accurate predictions.[10] For ALI caused by sepsis, ML models can analyze patients' clinical data and biochemical indicators, assisting physicians in earlier diagnosis of sepsis-induced ALI and prediction of disease severity, allowing for timely intervention. Additionally, personalized treatment recommendations can be provided based on the individual patient's specific circumstances and medical history, including appropriate drug selection and dosages to minimize the impact of ALI. Finally, ML models can monitor the progression of a patient's condition and provide real-time feedback when necessary, helping health-care teams identify which patients are at higher risk of developing severe ALI and take preventive measures.
Supplementary Table 1.
Comparison of traditional model and machine learning model
| Parameter | Traditional model | Machine learning model |
|---|---|---|
| Model cost | Requires expensive labor and data preparation, and involves cumbersome feature engineering and rule definition. Model maintenance and performance improvements require manual intervention, adding costs. More engineering work is needed to scale to large data and many features | Requires more data, but data preparation is relatively inexpensive. Features can be learned automatically, reducing the cost of manual feature engineering. Modeling requires expertise, but subsequent maintenance and performance improvements can be partly automated, reducing overall costs. Machine learning models also adapt better to large, diverse data, improving performance and scalability |
| Data requirement | Relatively little, often requiring carefully curated specific datasets | Large amounts of data for training, which may require cleaning and preparation |
| Model performance | Usually lower predictive performance; suitable for basic problems | Can provide higher predictive performance and more complex analysis |
| Turnaround time | Fast; usually does not require much computation time | Training and prediction may require more computing time |
| Automation and scalability | Limited; requires manual intervention and adjustment | Can be automated and scaled to large problems |
| Complexity and interpretability | Usually relatively simple and easy to explain | Can be more complex and harder to explain; interpretability must be addressed |
| Applicable areas and issues | Suitable for simple problems and small datasets | Suitable for large datasets and complex problems |
| Predictive accuracy | Usually relatively low | May provide higher prediction accuracy |
| Model development cost | Usually low and relatively easy to implement | Higher; requires professional data scientists and engineers |
In this study, our objective is to develop an ML model capable of predicting ALI in septic patients. This research has the potential to improve early interventions and optimize clinical decision-making for septic patients after hospitalization, aiding in the development of personalized treatment strategies to enhance the prognosis of sepsis-associated ALI and improve patients' quality of life and overall health.
METHODS
Study subjects
We conducted a single-center, observational retrospective study, selecting a cohort of 3196 patients with sepsis admitted to the ICU and Emergency Medicine of the Center Hospital of Lishui City, Zhejiang Province, as our study subjects. The diagnosis of sepsis was based on the International Consensus Definition for Sepsis and Septic Shock (Sepsis-3). ALI diagnosis criteria employed the Liver Dysfunction Score (LDS), which primarily considers the ratio of serum AST to serum ALT. An LDS result > 1 indicated ALI, with severity increasing as the ratio rises. This study was approved by the Ethics Committee of the Center Hospital of Lishui City, Zhejiang Province.
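The ALI labeling rule described above can be sketched as a one-line check on the AST/ALT ratio; the function name is hypothetical, and this is a simplified reading of the LDS criterion given in the text:

```python
def ali_by_ast_alt_ratio(ast_u_per_l, alt_u_per_l):
    """Label a patient as ALI-positive when the serum AST/ALT ratio exceeds 1,
    a simplified reading of the Liver Dysfunction Score (LDS) criterion
    described in the text; higher ratios indicate greater severity."""
    if alt_u_per_l <= 0:
        raise ValueError("ALT must be positive to form the AST/ALT ratio")
    return ast_u_per_l / alt_u_per_l > 1.0
```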
Cohort 1 (model construction and model fusion): Patients with a first diagnosis of sepsis who attended the ICU and Emergency Medicine between January 2015 and December 2020.
Cohort 2 (external validation): Patients with a first diagnosis of sepsis who attended the ICU and Emergency Medicine between January 2021 and May 2023.
Inclusion criteria for patients were as follows: (1) age ≥18 years; (2) hospital stay longer than 24 h with sufficient data for analysis; (3) first episode of sepsis, with admission within 72 h of symptom onset; and (4) sepsis diagnosed according to the International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3).
Exclusion criteria were: (1) age <18 years; (2) chronic liver disease or other pre-existing liver conditions; (3) severe multi-organ dysfunction; (4) malignancies; and (5) pregnancy or lactation.
Data collection
All baseline variables were obtained at the time of ICU admission. Demographic data included gender, age, body mass index (BMI), and Acute Physiology and Chronic Health Evaluation II (APACHE II) score. Complications included renal dysfunction, respiratory failure, and heart failure. Liver function indicators encompassed AST, ALT, glutamate dehydrogenase (GLDH), lactate dehydrogenase (LDH), albumin (ALB), alkaline phosphatase (ALP), total bilirubin (TBil), and direct bilirubin (DBil) measurements. Coagulation function indicators included platelet count, activated partial thromboplastin time (APTT), prothrombin time (PT), and prothrombin activity. Inflammatory markers included procalcitonin, hemoglobin, C-reactive protein, and white blood cell (WBC) count. Clinical data included respiratory rate, heart rate, and death. Medical history encompassed diabetes and hypertension. Infection sources included gallbladder, liver, abdomen, lungs, urinary tract, and bloodstream. The study endpoint was ALI.
Data preprocessing
To process the experimental data in a standardized and unified manner, we preprocessed the data. Variables with more than 30% missing values were eliminated, and for variables with no more than 30% missing values, a variety of interpolation methods were used to maintain the integrity of the dataset. The Cohort 1 and Cohort 2 datasets were each randomly divided into training (70%) and test (30%) sets using fixed random seeds.
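A minimal sketch of the preprocessing steps above (drop variables with >30% missing values, impute the rest, then a seeded 70/30 split), assuming pandas; median imputation stands in for one of the interpolation methods, and the function name is illustrative:

```python
import pandas as pd

def preprocess_and_split(df, target, missing_cutoff=0.30, train_frac=0.70, seed=42):
    """Drop features exceeding the missingness cutoff, impute the rest,
    and perform a reproducible 70/30 train/test split."""
    # Fraction of missing values per feature (excluding the target)
    miss = df.drop(columns=[target]).isna().mean()
    kept = [c for c in miss.index if miss[c] <= missing_cutoff]
    data = df[kept + [target]].copy()
    # Median imputation as one illustrative interpolation choice
    for c in kept:
        data[c] = data[c].fillna(data[c].median())
    # Reproducible split driven by a fixed random seed
    train = data.sample(frac=train_frac, random_state=seed)
    test = data.drop(train.index)
    return train, test
```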
Feature selection
Five feature selection methods were used to obtain subsets of predictive variables for further model development: Lasso, Boruta, max-relevance and min-redundancy (mRMR), the ReliefF algorithm, and XGBoost-based recursive feature elimination (XGBoost-RFE). The results of these five methods were evaluated comprehensively, and variables selected by at least four of the five methods were used to build the model.
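The "at least four of five methods" consensus rule can be sketched as a vote count over the per-method selections; the helper name and method labels are illustrative:

```python
from collections import Counter

def consensus_features(selections, min_votes=4):
    """selections: dict mapping feature-selection method name -> iterable of
    selected feature names. Returns features chosen by >= min_votes methods."""
    votes = Counter(f for feats in selections.values() for f in set(feats))
    return sorted(f for f, n in votes.items() if n >= min_votes)
```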
Model development and evaluation
In this study, nine ML algorithms were used to develop models: logistic regression, XGBoost, LightGBM, random forest (RF), AdaBoost, GaussianNB (GNB), multi-layer perceptron (MLP), support vector machine (SVM), and k-nearest neighbors (KNN). We explored which ML model is most suitable for predicting ALI in septic patients. The dataset was divided into training, validation, and test sets. The training set underwent 5-fold cross-validation, and grid search was used to optimize each model's hyperparameters for the best performance. Three measures of model quality were used to evaluate clinical value: discrimination, calibration, and clinical utility. First, the precision-recall (PR) curve was used to quantify each model's discriminative ability, showing precision and recall at different thresholds. Then, calibration plots were used to evaluate performance by comparing the deviation between predicted probabilities and the actual event frequency. Next, decision curve analysis (DCA) was used to assess clinical utility by calculating the net benefit at different threshold probabilities. In addition, the models were evaluated with confusion-matrix-derived metrics such as accuracy, sensitivity, specificity, and F1 score, together with average precision (AP).
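The 5-fold cross-validated grid search described above can be sketched with scikit-learn's `GridSearchCV`; here a random forest (one of the nine algorithms) and synthetic data stand in for the clinical dataset, and the grid values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the clinical dataset
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# 5-fold cross-validated grid search over a small illustrative grid,
# scored by AUC as in the paper
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=5,
    scoring="roc_auc",
)
grid.fit(X_train, y_train)
best_model = grid.best_estimator_
```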
Model explanation
The SHAP (SHapley Additive exPlanations) package in Python, which is based on game theory, was used to explain the output of the ML model and explore the interpretability of the predictive models. SHAP is known for its high reliability and computational efficiency and has been widely used for explaining ML model outputs.
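As an illustration of the game-theoretic attribution that SHAP computes (the `shap` package itself is a third-party dependency), SHAP values for a linear model have a simple closed form: the attribution of feature j is coef[j] · (x[j] − E[x[j]]), and the per-sample attributions plus the model output at the feature means recover the prediction. A minimal NumPy sketch:

```python
import numpy as np

def linear_shap_values(coef, X):
    """Exact SHAP values for a linear model f(x) = coef @ x + b: the
    attribution of feature j for sample x is coef[j] * (x[j] - mean(X[:, j])).
    The attributions of a sample sum to f(x) - f(mean(X))."""
    coef = np.asarray(coef, dtype=float)
    X = np.asarray(X, dtype=float)
    return coef * (X - X.mean(axis=0))
```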
Development of stacking ensemble model
In this study, we explored an innovative approach known as the stacking ensemble model to further enhance the predictive performance for ALI in sepsis patients. The core idea of the stacking ensemble model is to combine multiple different base ML models to achieve more robust and accurate predictions. Model fusion (stacking) was employed to classify the samples, with the first-layer models consisting of the XGBoost, LightGBM, and random forest classifiers, as these models had already demonstrated good performance in this study. In the second layer, out-of-fold predictions from the first-layer models, obtained through 5-fold cross-validation, were used to train a logistic regression meta-model. The meta-model's purpose is to combine the outputs of the base models to generate the final prediction. Logistic regression, as the meta-model, offers simplicity and interpretability while effectively integrating the results of multiple base models to enhance overall performance. The meta-model parameters were set as follows: C (regularization factor) = 1.0, max_iter (number of iterations) = 100, penalty (type of regularization) = l2, and tol (convergence tolerance) = 0.0001.
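The two-layer design described above can be sketched with scikit-learn's `StackingClassifier`. Gradient boosting and random forest stand in for the paper's XGBoost/LightGBM/RF first layer (the first two require extra packages), the logistic meta-model uses the parameters reported in the text, and `cv=5` produces the 5-fold out-of-fold predictions that feed the second layer:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the clinical dataset
X, y = make_classification(n_samples=400, n_features=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

stack = StackingClassifier(
    estimators=[
        ("gb", GradientBoostingClassifier(random_state=1)),  # stand-in for XGB/LGBM
        ("rf", RandomForestClassifier(random_state=1)),
    ],
    # Logistic regression meta-model with the parameters reported in the text
    final_estimator=LogisticRegression(C=1.0, max_iter=100, penalty="l2", tol=1e-4),
    cv=5,  # 5-fold out-of-fold predictions feed the meta-model
)
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)
```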
During the model training and validation phases, we divided the data into training, validation, and testing sets. We optimized the model’s hyperparameters through cross-validation and grid search to achieve the best performance. Performance evaluation encompassed various metrics, including accuracy, sensitivity, specificity, F1 score, and average precision, providing a comprehensive assessment of the stacking ensemble model’s performance.
Statistical analysis
All statistical analyses were performed using R version 3.6.3 and Python version 3.7. Continuous variables with a normal distribution were expressed as mean ± standard deviation (SD), while nonnormally distributed variables were represented as median (Q25, Q75). Between-group comparisons were performed using Student's t-test, the rank-sum test, or analysis of variance. Categorical variables were described as n (%), and differences in categorical variables were compared using Chi-square tests or Fisher's exact tests. P < 0.05 was considered statistically significant.
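The choice between Student's t-test and the rank-sum test described above can be sketched with SciPy; the Shapiro-Wilk normality check used here to drive the choice is one illustrative decision rule, not the paper's stated procedure:

```python
import numpy as np
from scipy import stats

def compare_groups(a, b, alpha=0.05):
    """Compare a continuous variable between two groups: Student's t-test when
    both samples pass a Shapiro-Wilk normality check, otherwise the rank-sum
    (Mann-Whitney U) test. Returns the test name and its p-value."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    normal = stats.shapiro(a).pvalue > alpha and stats.shapiro(b).pvalue > alpha
    if normal:
        return "Student's t-test", stats.ttest_ind(a, b).pvalue
    return "rank-sum test", stats.mannwhitneyu(a, b, alternative="two-sided").pvalue
```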
RESULTS
Patient characteristics
According to the inclusion and exclusion criteria, 3196 subjects were eventually included in this study [Table 1]. There were 2162 patients in Cohort 1, of whom 684 (31.64%) had ALI and 1478 (68.36%) did not. There were 1034 patients in Cohort 2, including 335 (32.40%) with ALI and 699 (67.60%) without. There was no significant difference in baseline characteristics between the two cohorts (P > 0.05).
Table 1.
Demographic characteristics of the subjects
| Variable | Category | All (n=3196) | Cohort 1 (n=2162) | Cohort 2 (n=1034) | P |
|---|---|---|---|---|---|
| Basic patient information | | | | | |
| Age, mean±SD | - | 67.94±13.62 | 68.01±13.76 | 67.79±13.30 | 0.678 |
| BMI, median (IQR) | - | 21.60 (19.580–23.690) | 21.62 (19.610–23.770) | 21.47 (19.460–23.610) | 0.314 |
| Gender, n (%) | Female | 1011 (31.633) | 672 (31.082) | 339 (32.785) | 0.333 |
| | Male | 2185 (68.367) | 1490 (68.918) | 695 (67.215) | |
| APACHE II, median (IQR) | - | 24.00 (18.000–29.000) | 24.00 (18.000–29.000) | 23.00 (18.000–29.000) | 0.807 |
| Past medical history, n (%) | | | | | |
| Hypertension | No | 2612 (81.727) | 1750 (80.944) | 862 (83.366) | 0.097 |
| | Yes | 584 (18.273) | 412 (19.056) | 172 (16.634) | |
| Diabetes | No | 2421 (75.751) | 1640 (75.856) | 781 (75.532) | 0.842 |
| | Yes | 775 (24.249) | 522 (24.144) | 253 (24.468) | |
| Complications, n (%) | | | | | |
| ALI | No | 2177 (68.116) | 1478 (68.363) | 699 (67.602) | 0.666 |
| | Yes | 1019 (31.884) | 684 (31.637) | 335 (32.398) | |
| Heart failure | No | 2573 (80.507) | 1746 (80.759) | 827 (79.981) | 0.604 |
| | Yes | 623 (19.493) | 416 (19.241) | 207 (20.019) | |
| Renal failure | No | 2258 (70.651) | 1516 (70.120) | 742 (71.760) | 0.341 |
| | Yes | 938 (29.349) | 646 (29.880) | 292 (28.240) | |
| Respiratory failure | No | 1858 (58.135) | 1250 (57.817) | 608 (58.801) | 0.598 |
| | Yes | 1338 (41.865) | 912 (42.183) | 426 (41.199) | |
| Liver function index, median (IQR) | | | | | |
| DBil | - | 12.40 (6.500–22.500) | 12.20 (6.500–22.800) | 12.40 (6.500–22.200) | 0.996 |
| TBil | - | 18.60 (11.000–33.500) | 18.60 (11.000–33.500) | 18.60 (11.000–33.500) | 0.911 |
| ALP | - | 91.00 (46.000–164.000) | 90.00 (48.000–164.000) | 91.00 (44.000–165.000) | 0.589 |
| ALB, mean±SD | - | 26.22±5.55 | 26.27±5.50 | 26.11±5.65 | 0.447 |
| LDH | - | 635.00 (302.000–2150.000) | 606.00 (302.000–2150.000) | 635.00 (302.000–2150.000) | 0.801 |
| GLDH | - | 57.00 (31.000–124.000) | 57.00 (31.000–124.000) | 58.00 (31.000–120.000) | 0.654 |
| Coagulation function indicators, median (IQR) | | | | | |
| PCT | - | 15.00 (2.810–29.720) | 14.27 (2.820–29.720) | 15.00 (1.600–29.250) | 0.730 |
| PA, mean±SD | - | 64.18±20.04 | 64.24±19.97 | 64.06±20.17 | 0.809 |
| PT | - | 15.80 (11.200–21.700) | 15.70 (11.200–21.500) | 16.10 (10.900–22.100) | 0.424 |
| APTT | - | 34.70 (27.000–43.200) | 34.60 (27.300–42.800) | 35.60 (26.500–44.000) | 0.640 |
| Platelet count | - | 126.00 (69.000–208.000) | 124.00 (67.000–208.000) | 135.00 (72.000–210.000) | 0.138 |
| Inflammatory markers | | | | | |
| WBC, median (IQR) | - | 12.20 (6.700–20.000) | 12.00 (6.900–20.200) | 12.80 (6.400–19.900) | 0.771 |
| C-reactive protein, median (IQR) | - | 154.00 (95.000–222.000) | 153.00 (95.000–222.000) | 155.00 (93.000–221.000) | 0.924 |
| Hemoglobin, mean±SD | - | 98.08±25.44 | 98.15±25.51 | 97.93±25.27 | 0.817 |
| Clinical data | | | | | |
| Heart rate, mean±SD | - | 103.17±22.71 | 102.95±22.83 | 103.63±22.46 | 0.429 |
| Respiratory rate, median (IQR) | - | 21.00 (17.000–25.000) | 21.00 (17.000–25.000) | 21.00 (17.000–25.000) | 0.996 |
| Death, n (%) | No | 1872 (58.573) | 1257 (58.141) | 615 (59.478) | 0.473 |
| | Yes | 1324 (41.427) | 905 (41.859) | 419 (40.522) | |
| Source of infection, n (%) | Blood | 754 (23.592) | 488 (22.572) | 266 (25.725) | 0.313 |
| | Urinary tract | 173 (5.413) | 125 (5.782) | 48 (4.642) | |
| | Lung | 1564 (48.936) | 1073 (49.630) | 491 (47.485) | |
| | Abdominal cavity | 281 (8.792) | 185 (8.557) | 96 (9.284) | |
| | Liver | 144 (4.506) | 98 (4.533) | 46 (4.449) | |
| | Gall bladder | 280 (8.761) | 193 (8.927) | 87 (8.414) | |
SD: Standard deviation, BMI: Body mass index, IQR: Interquartile range, APACHE II: Acute physiology and Chronic Health Evaluation II, ALI: Acute liver injury, DBil: Direct bilirubin, TBil: Total bilirubin, ALP: Alkaline phosphatase, ALB: Albumin, LDH: Lactate dehydrogenase, GLDH: Glutamate dehydrogenase, PCT: Procalcitonin, PA: Prothrombin activity, APTT: Activated partial thromboplastin time, WBC: White blood cell, PT: Prothrombin time
Feature selection
Five feature selection methods, namely Lasso, Boruta, mRMR, ReliefF, and XGBoost-RFE, were used to select the predictive variables [Figure 1]. Ultimately, variables that appeared in at least four of these five methods were chosen to construct the ML model. The selected predictive variables included BMI, APACHE II, heart failure, GLDH, LDH, ALB, ALP, TBil, DBil, APTT, PT, hemoglobin, WBC, respiratory rate, heart rate, diabetes, infection sources, and platelet count [Supplementary Table 2].
Figure 1.

Five feature selection methods. (a) Lasso; the optimal regularization parameter is 0.024. (b) Boruta; rejected features in red, accepted features in green. (c) Max-relevance and min-redundancy. (d) ReliefF. (e) XGBoost-RFE
Supplementary Table 2.
Prediction model variables selected by five feature selection methods
| Variables | Lasso | Boruta | mRMR | Relief | XGBoost-RFE | ≥4 methods | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | √ | √ | ||||||||||
| Gender | √ | √ | ||||||||||
| BMI | √ | √ | √ | √ | √ | |||||||
| APACHE II | √ | √ | √ | √ | √ | √ | ||||||
| Renal failure | √ | √ | ||||||||||
| Respiratory failure | √ | |||||||||||
| Heart failure | √ | √ | √ | √ | √ | √ | ||||||
| GLDH | √ | √ | √ | √ | √ | √ | ||||||
| LDH | √ | √ | √ | √ | √ | |||||||
| ALB | √ | √ | √ | √ | √ | |||||||
| ALP | √ | √ | √ | √ | √ | √ | ||||||
| TBil | √ | √ | √ | √ | √ | √ | ||||||
| DBil | √ | √ | √ | √ | √ | √ | ||||||
| APTT | √ | √ | √ | √ | √ | |||||||
| PT | √ | √ | √ | √ | √ | |||||||
| PA | √ | √ | √ | |||||||||
| PCT | √ | √ | ||||||||||
| Hemoglobin | √ | √ | √ | √ | √ | √ | ||||||
| C-reactive protein | √ | √ | √ | |||||||||
| WBC | √ | √ | √ | √ | √ | |||||||
| Respiratory rate | √ | √ | √ | √ | √ | |||||||
| Heart rate | √ | √ | √ | √ | √ | √ | ||||||
| Death | √ | √ | ||||||||||
| Diabetes | √ | √ | √ | √ | √ | √ | ||||||
| Hypertension | √ | |||||||||||
| Infection sources | √ | √ | √ | √ | √ | √ | ||||||
| Platelet count | √ | √ | √ | √ | √ | √ |
APACHE II: Acute Physiology and Chronic Health Evaluation II, BMI: Body mass index, GLDH: Glutamate dehydrogenase, LDH: Lactate dehydrogenase, ALB: Albumin, ALP: Alkaline phosphatase, PA: Prothrombin activity, PCT: Procalcitonin, WBC: White blood cell, APTT: Activated partial thromboplastin time, DBil: Direct bilirubin, TBil: Total bilirubin, mRMR: Max-relevance and min-redundancy, RFE: Recursive feature elimination, PT: Prothrombin time, √: The feature variable was selected by the corresponding feature selection method; in the last column, √ indicates selection by four or more of the five methods
Model development and evaluation
In this study, nine ML algorithms were used to build predictive models. The training set was used for model training, while the validation set was used to evaluate and optimize model performance. Table 2 summarizes the training-set and validation-set results for each of the nine models, ranked by area under the curve (AUC). XGBoost performed best on the training set, whereas LightGBM performed best on the validation set. This inconsistency suggests that XGBoost may be overfitting, while LightGBM showed relatively good stability and predictive performance. Therefore, LightGBM was chosen as the best prediction model.
Table 2.
Summary of the training set and validation set results for multi-model classification
| Classification model | AUC | Cutoff | Accuracy | Sensitivity | Specificity | Positive predictive value | Negative predictive value | F1 score | Kappa |
|---|---|---|---|---|---|---|---|---|---|
| Training set | | | | | | | | | |
| XGBoost | 1.000 | 0.907 | 0.999 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 | 0.998 |
| Logistic | 0.748 | 0.270 | 0.657 | 0.776 | 0.604 | 0.469 | 0.856 | 0.584 | 0.320 |
| LightGBM | 1.000 | 0.802 | 0.999 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 | 0.998 |
| RF | 1.000 | 0.480 | 0.997 | 0.999 | 0.999 | 1.000 | 0.996 | 0.999 | 0.994 |
| AdaBoost | 0.915 | 0.495 | 0.832 | 0.879 | 0.812 | 0.680 | 0.935 | 0.767 | 0.638 |
| GNB | 0.831 | 0.378 | 0.764 | 0.808 | 0.745 | 0.589 | 0.894 | 0.681 | 0.500 |
| MLP | 0.628 | 0.115 | 0.552 | 0.727 | 0.473 | 0.388 | 0.802 | 0.501 | 0.160 |
| SVM | 0.783 | 0.312 | 0.725 | 0.741 | 0.719 | 0.547 | 0.858 | 0.629 | 0.418 |
| KNN | 0.795 | 0.400 | 0.759 | 0.795 | 0.650 | 0.684 | 0.779 | 0.736 | 0.384 |
| Validation set | | | | | | | | | |
| XGBoost | 0.838 | 0.907 | 0.760 | 0.813 | 0.747 | 0.774 | 0.759 | 0.792 | 0.322 |
| Logistic | 0.735 | 0.270 | 0.647 | 0.695 | 0.695 | 0.475 | 0.824 | 0.563 | 0.295 |
| LightGBM | 0.847 | 0.802 | 0.754 | 0.816 | 0.764 | 0.792 | 0.749 | 0.802 | 0.320 |
| RF | 0.825 | 0.480 | 0.774 | 0.775 | 0.760 | 0.677 | 0.807 | 0.719 | 0.429 |
| AdaBoost | 0.827 | 0.495 | 0.770 | 0.747 | 0.786 | 0.617 | 0.863 | 0.674 | 0.494 |
| GNB | 0.810 | 0.378 | 0.752 | 0.794 | 0.745 | 0.586 | 0.880 | 0.673 | 0.479 |
| MLP | 0.612 | 0.115 | 0.531 | 0.736 | 0.468 | 0.373 | 0.780 | 0.491 | 0.119 |
| SVM | 0.710 | 0.312 | 0.667 | 0.750 | 0.638 | 0.479 | 0.816 | 0.579 | 0.304 |
| KNN | 0.551 | 0.400 | 0.644 | 0.403 | 0.698 | 0.382 | 0.713 | 0.377 | 0.081 |
KNN: K-nearest neighbors, SVM: Support vector machine, MLP: Multi-layer perceptron, GNB: GaussianNB, RF: Random forest, AUC: Area under the curve
Figure 2 shows the receiver operating characteristic (ROC) results of each model's predictions, with the mean and SD of the ROC values calculated through 5-fold cross-validation, together with a multi-model forest plot [Figure 2]. After parameter optimization, the AP values of the nine prediction models ranged from 0.41 to 1.0 in the training set [Figure 3a] and from 0.345 to 0.704 in the validation set [Figure 3b]. The LightGBM model achieved the highest AP value, indicating higher accuracy. The performance and utility of the nine prediction models were comprehensively evaluated through calibration curves and decision curves. The calibration plot on the validation set demonstrated that the LightGBM model had some degree of calibration [Figure 3c]. The DCA curve for the validation set assessed the models' utility in clinical decision-making, demonstrating that the LightGBM model maintained a high true positive rate across a range of high predicted probabilities, indicating high sensitivity in predicting positive cases [Figure 3d]. In summary, considering model performance indexes such as recall, AP, and DCA, the LightGBM model is considered the best predictive model for ALI in sepsis.
Figure 2.

ROC curve plots. (a) ROC curves for the training set. (b) ROC curves for the validation set. The x-axis represents the false positive rate (FPR), which is the proportion of samples predicted as positive but are actually negative. The y-axis represents the true positive rate (TPR), also known as recall, which is the proportion of samples predicted as positive and are actually positive. (c) Multi-model forest plot for the validation set
Figure 3.

Model Performance Evaluation. (a) PR curve plot for the training set. (b) PR curve plot for the validation set. The x-axis represents the model’s ability to correctly predict positives, and the y-axis represents the model’s ability to correctly predict true positives among the predicted positives. (c) Calibration plot for the validation set. The x-axis represents the predicted probabilities of the model, and the y-axis represents the observed actual probabilities. The ideal calibration curve is the diagonal line (black dashed line). (d) DCA curve plot for the validation set. The x-axis represents the threshold for model predictions, and the y-axis represents the net benefit. Higher net benefit indicates better utility performance of the prediction model
Construction and evaluation of optimal model
The optimal model was selected using an automated parameter-tuning framework with the LightGBM classification algorithm to identify the key features. The parameters of the LightGBM classification model were as follows: boosting_type: gbdt, learning_rate: 0.1, max_depth: −1, n_estimators (maximum number of trees): 100, num_leaves (maximum number of leaves): 31. The results show that 5-fold cross-validation performed best [Table 3]. The mean AUC of the training set was 1.000, the mean AUC of the validation set was 0.835 (0.789–0.881), and the mean AUC of the test set was 0.841 (0.810–0.872) [Figure 4a-c]. The AUCs of the training, validation, and test sets ultimately stabilized around 0.8–1, indicating accurate model prediction. Because the validation-set AUC was lower than the test-set AUC by less than 10%, the model fit can be considered successful, and the learning curve shows a strong fit and high stability for the training and validation sets [Figure 4d]. These results show that the LightGBM model can be used for classification modeling of these datasets [Figure 4e and f].
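The reported LightGBM hyperparameters can be collected in a configuration dictionary; constructing the classifier itself requires the third-party `lightgbm` package, so that step is shown commented out:

```python
# Hyperparameters of the final LightGBM model as reported in the text
lgbm_params = {
    "boosting_type": "gbdt",
    "learning_rate": 0.1,
    "max_depth": -1,      # -1 means no depth limit in LightGBM
    "n_estimators": 100,  # maximum number of trees
    "num_leaves": 31,     # maximum number of leaves per tree
}

# Requires the `lightgbm` package:
# from lightgbm import LGBMClassifier
# model = LGBMClassifier(**lgbm_params)
```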
Table 3.
Summary of test set, validation set, and training set results of LightGBM model
| Dataset | AUC | Cutoff | Accuracy | Sensitivity | Specificity | Positive predictive value | Negative predictive value | F1 score |
|---|---|---|---|---|---|---|---|---|
| 5-fold cross-validation | | | | | | | | |
| Test set | 0.841 | 0.831 | 0.727 | 0.813 | 0.761 | 0.716 | 0.729 | 0.762 |
| Validation set | 0.835 | 0.838 | 0.746 | 0.796 | 0.753 | 0.747 | 0.746 | 0.765 |
| Training set | 1.000 | 0.838 | 0.999 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 |
| 10-fold cross-validation | | | | | | | | |
| Test set | 0.819 | 0.769 | 0.747 | 0.828 | 0.659 | 0.737 | 0.749 | 0.780 |
| Validation set | 0.856 | 0.799 | 0.754 | 0.848 | 0.760 | 0.762 | 0.753 | 0.799 |
| Training set | 1.000 | 0.799 | 0.999 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 |
| 15-fold cross-validation | | | | | | | | |
| Test set | 0.818 | 0.729 | 0.750 | 0.818 | 0.677 | 0.728 | 0.755 | 0.771 |
| Validation set | 0.857 | 0.768 | 0.775 | 0.851 | 0.787 | 0.765 | 0.776 | 0.799 |
| Training set | 1.000 | 0.768 | 0.999 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 |
AUC: Area under the curve
Figure 4.

Evaluation of optimal models. (a) ROC curve plot for the training set. (b) ROC curve plot for the validation set. (c) ROC curve plot for the test set. (d) Learning curve plot. (e) Model calibration curve plot. (f) DCA curve plot for the test set
Explanation of SHAP method and individual analysis
To visually explain the selected variables, we used SHAP to demonstrate how these variables predicted the development of ALI in the model. The SHAP summary plot for the LightGBM model is depicted in Figure 5a, comprising 20 features. Each feature's importance is represented along the horizontal axis, with colored dots denoting the contributions of individual patients to the outcome: red dots signify high-risk values, while blue dots represent low-risk values. Figure 5b displays the ranking of the 20 risk factors based on their mean absolute SHAP values. The x-axis represents the SHAP value, indicating each feature's importance in the predictive model; the top five variables in descending order of importance were diabetes, heart failure, PT, heart rate, and platelet count. Furthermore, we provide two illustrative examples to highlight the model's interpretability. One example involves a septic patient without ALI, whose SHAP prediction score was low (0.005) [Supplementary Figure 1a]. Conversely, the other example features a septic patient with ALI, with a higher SHAP score (0.932) [Supplementary Figure 1b], illustrating the contribution of each variable to the sample prediction.
Figure 5.

SHAP Analysis. (a) Summary of SHAP variable contributions. (b) SHAP importance plot. Red indicates high feature values, purple indicates feature values close to the overall mean, and blue indicates low feature values
Model stacking
To enhance the predictive performance of sepsis-related ALI, this study explored ensemble learning methods and model stacking. Upon evaluation, the Logistic Regression (Stacking) model emerged as the most proficient on the test set, exhibiting an AUC of 0.823 [Figure 6]. Furthermore, it demonstrated a robust cross-validation accuracy of 76.1%, alongside a sensitivity of 73.7% and a specificity of 77.5%.
Figure 6.

Model stacking. (a) Test set ROC curve. (b) Verification set ROC curve. (c) Training set ROC curve. (d) Test set DCA curve
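A stacking ensemble with a logistic-regression meta-learner, as evaluated above, can be sketched with scikit-learn's `StackingClassifier`. The base-learner list below is an illustrative subset of the study's nine algorithms, not the exact configuration, and the data are synthetic.

```python
# Minimal stacking sketch: out-of-fold base-model predictions feed a
# logistic-regression meta-learner. Base learners are an assumed subset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1500, n_features=18, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=1, stratify=y)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=1)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(),  # the meta-learner, as in the text
    cv=5,  # 5-fold out-of-fold predictions train the meta-learner
)
stack.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1])
print(f"stacked test AUC: {auc:.3f}")
```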
External validation
We chose the LightGBM model, rather than the Logistic Regression (Stacking) model, as the evaluation tool because LightGBM demonstrated higher AUC, sensitivity, and specificity in the initial performance evaluation. We therefore further validated the LightGBM model on an additional dataset from the same hospital but from a different time period. This external validation dataset included 1034 sepsis patients, comprising 335 with ALI and 699 without ALI. As shown in Figure 7, the LightGBM model exhibited stable and good ALI risk prediction capability, with an AUC of 0.815 [Table 4]. Early diagnosis and precise intervention are pivotal to improving patient survival, and with our ML model, our objective was to enhance diagnostic accuracy and ensure timely intervention. Furthermore, the model facilitates efficient resource utilization and enables targeted interventions, reducing health-care costs while ultimately enhancing patients' overall quality of life.
Figure 7.

External validation of the LightGBM model. (a) Test set ROC curve. (b) Test set DCA curve. (c) Test set calibration curve. (d) Test set confusion matrix heat map. 0 on the axis indicates non-ALI, and 1 indicates patients with ALI. From left to right, there are true positives, false positives, true negatives, and false negatives
Table 4.
External validation test set results of the LightGBM model
| AUC | Cutoff | Accuracy | Sensitivity | Specificity | Positive predictive value | Negative predictive value | F1 score | Kappa | AUC_L | AUC_U |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.815 | 0.831 | 0.726 | 0.776 | 0.715 | 0.755 | 0.723 | 0.765 | 0.237 | 0.788 | 0.841 |

AUC: Area under the curve; AUC_L and AUC_U: lower and upper bounds of the 95% confidence interval of the AUC
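The AUC_L/AUC_U bounds reported with the external AUC can be obtained with a bootstrap, a common approach for AUC confidence intervals (the article does not state its exact method, so this is an assumed sketch). The labels and scores below are simulated, not the Cohort 2 data.

```python
# Hedged sketch: bootstrap 95% CI for an AUC. Data are simulated.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
y = rng.integers(0, 2, size=1034)             # 0 = non-ALI, 1 = ALI (toy labels)
scores = y * 0.5 + rng.normal(0, 0.4, 1034)   # toy model scores

boot = []
for _ in range(500):
    idx = rng.integers(0, len(y), len(y))     # resample patients with replacement
    if len(np.unique(y[idx])) < 2:            # AUC needs both classes present
        continue
    boot.append(roc_auc_score(y[idx], scores[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUC 95% CI: ({lo:.3f}, {hi:.3f})")
```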
Online prediction website
To better assist healthcare professionals in applying this model, we created an online website: https://www.xsmartanalysis.com/model/list/predict/model/html?mid=9102andsymbol=7Yf169RTby83826tq942. A screenshot of the online tool is shown in Supplementary Figure 2. After the necessary information is entered, the patient is classified as either ALI or non-ALI. This online prediction site provides a convenient tool for doctors and other healthcare practitioners, combining the power of ML with the expertise of healthcare professionals to bring research findings into actual clinical practice. Through this platform, healthcare professionals can efficiently assess patient risk without needing to delve into the complexities of ML models. Real-time application of this model aids in the early identification of patients at higher risk of ALI, allowing for timely interventions and improving the accuracy of patient risk assessment.
DISCUSSION
Sepsis-induced systemic inflammatory response can lead to liver cell damage, necrosis, and infiltration of inflammatory cells, resulting in ALI.[11] This injury can cause liver dysfunction and may be accompanied by complications such as renal failure, pulmonary inflammation, and abnormal blood coagulation.[12] The severity of sepsis-associated ALI lies in their interaction, exacerbating each other’s pathological processes.[13] Impaired liver function can result in the accumulation of toxins and more severe inflammatory responses.[14] At the same time, ALI makes the liver more susceptible to invasive infectious diseases and worsens systemic inflammatory response and organ failure.[15] Therefore, early and accurate assessment of liver injury is crucial for the treatment and prognosis of patients with sepsis.
Although SOFA and SAPS II scoring systems have a history of application in the risk assessment of critically ill patients, they are not without pitfalls. These scoring systems are often limited by the experience of the physician, and they exclude many valuable latent variables, which somewhat limits their precision.[16] In addition, these traditional scoring systems may be cumbersome in the process and require more manual intervention, thus introducing human error.[17] In contrast, ML models offer a more powerful and automated way to predict the risk of adverse outcomes in critically ill patients. For example, van Doorn et al.[18] found that ML models were more sensitive than physicians and clinical risk scores in predicting 31-day mortality in patients with sepsis presenting to the emergency department (92% vs. 72% vs. 78%; P < 0.001) while maintaining similar specificity (78% vs. 74% vs. 72%; P > 0.02) and high diagnostic accuracy.
In this study, nine ML algorithms were employed, including logistic regression, XGBoost, LightGBM, RF, AdaBoost, GNB, MLP, SVM, and KNN, to explore their applications in predicting the risk of ALI in ICU sepsis patients. Notably, XGBoost exhibited the best performance on the training set, but LightGBM outperformed it on the test set, with an AUC of 0.841, accuracy of 0.727, sensitivity of 0.813, and specificity of 0.761. XGBoost's superior performance on the training set might indicate a degree of overfitting to the training data.[19] In contrast, LightGBM may have better regularization mechanisms that help mitigate overfitting, and it can handle large-scale datasets and capture complex relationships.[20] Among the ML and traditional predictive models, the LightGBM model performed best in predicting ALI in sepsis patients, consistent with previous research. Zhang and Gong[21] demonstrated in a comparative analysis of acute liver failure prediagnosis that XGBoost is recommended when longer processing times or multi-class classification with substantial missing data must be accommodated, whereas LightGBM is the preferred choice for higher predictive accuracy. Yao et al.[22] showed that in ML-based postoperative sepsis mortality prediction models, XGBoost had higher predictive accuracy than generalized linear models, although overfitting was inevitable. Furthermore, applying ensemble learning further improved predictive performance: model fusion, by combining the outputs of multiple models, increases prediction stability and accuracy.[23] Although the stacked model's AUC (0.823) and sensitivity (0.737) were slightly lower than LightGBM's, its accuracy (0.761) and specificity (0.775) were higher, and it still excelled in risk prediction, providing strong support for scenarios requiring greater robustness.
Kaur et al.[24] demonstrated that even when combining all the base models, stacked ensemble models do not always deliver the best performance. In conclusion, the limitations of traditional scoring systems have led us to turn to ML models, particularly LightGBM, to enhance the accuracy and automation of risk prediction for critically ill patients. This study emphasizes the immense potential of ML in the medical field, providing more reliable and comprehensive tools for clinical decision-making.
This study represents the first attempt to utilize ML for predicting the risk of patients with sepsis developing ALI; evidence demonstrating the superiority of ML algorithms for this task is currently lacking. In this research, we developed an ML model for predicting ALI risk based on the clinical data of sepsis patients, employing five feature selection methods to select 18 key features for modeling: BMI, APACHE II, heart failure, GLDH, LDH, ALB, ALP, TBil, DBil, APTT, PT, hemoglobin, WBC, respiratory rate, heart rate, diabetes, infection source, and platelet count. Cross-validation demonstrated that using these 18 features yielded higher accuracy and AUC scores in the prediction model than using all features. Furthermore, because these routine indicators can be easily assessed upon admission, our ML prediction model can be readily integrated into clinical practice. Through SHAP analysis,[25] we identified the top five variables that most influenced the LightGBM model: diabetes, heart failure, PT, heart rate, and platelet count. It is important to note that SHAP values reflect the overall impact of features on model predictions, which does not necessarily mean the impact is the same for all samples; per-sample SHAP plots provide the specific contribution of each feature in a particular sample. Consistent with this study's results, many prior studies have reported that major risk factors for ALI include heart failure, diabetes,[26] PT,[27] and platelet count.[28]
In summary, the LightGBM model developed in this study proves highly valuable for health-care providers treating patients with sepsis. The proposed algorithm excels in predicting the occurrence of ALI in patients with sepsis, offering accuracy, sensitivity, and specificity based on the optimal ROC curve. This predictive model facilitates more efficient allocation of hospital resources, especially for patients with critical conditions, while assisting in providing more personalized care and reducing medical errors caused by fatigue from extended ICU shifts. Designing effective predictive models can improve the quality of care and enhance patient survival rates. Therefore, the LightGBM model holds significant potential in identifying high-risk patients with sepsis, aiding in the formulation of effective supplementary and treatment care plans. This can be achieved by reducing ambiguity through quantitative, objective, and evidence-based models for risk stratification, prediction, and the ultimate care plan, ultimately reducing complications and increasing patient survival chances.
CONCLUSIONS
The LightGBM model demonstrated superior predictive performance in identifying the risk of ALI in septic patients. The combination of LightGBM model and patient clinical data can predict ALI in sepsis patients in a timely and accurate manner. The online prediction website serves as a valuable tool for health-care practitioners to apply this research in a clinical setting.
Research quality and ethics statement
This study was approved by the Institutional Review Board/Ethics Committee (Ethics committee of Lishui Central Hospital of Zhejiang Province). The authors followed applicable EQUATOR Network (https://www.equator-network.org/) guidelines during the conduct of this research project.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
Supplementary Figure 1. Individual analysis. (a) A sepsis patient without acute liver injury (ALI). (b) A sepsis patient with ALI. SHAP values represent the prediction-related characteristics of an individual patient and the contribution of each characteristic to the ALI prediction. f(X) is the probabilistic prediction value; red elements (left) increase the predicted risk of ALI, and blue elements (right) reduce it. The length of each arrow helps visualize the extent of its impact on the prediction: the longer the arrow, the greater the effect
Supplementary Figure 2. Sample screenshot of the online tool for predicting the risk of sepsis patients developing ALI. (a) The probability of a sepsis patient developing ALI was predicted by the online tool to be 31.711%. (b) The predicted probability was 88.179%. By entering the corresponding parameters into the online model, the probability of a sepsis patient developing ALI is calculated automatically, and management of the disease can be suggested
REFERENCES
- 1. Sun Y, Cai Y, Zang QS. Cardiac autophagy in sepsis. Cells. 2019;8:141. doi: 10.3390/cells8020141.
- 2. Ren Y, Zhang L, Xu F, Han D, Zheng S, Zhang F, et al. Risk factor analysis and nomogram for predicting in-hospital mortality in ICU patients with sepsis and lung infection. BMC Pulm Med. 2022;22:17. doi: 10.1186/s12890-021-01809-8.
- 3. Yang W, Tao K, Zhang P, Chen X, Sun X, Li R. Maresin 1 protects against lipopolysaccharide/d-galactosamine-induced acute liver injury by inhibiting macrophage pyroptosis and inflammatory response. Biochem Pharmacol. 2022;195:114863. doi: 10.1016/j.bcp.2021.114863.
- 4. Koch DG, Speiser JL, Durkalski V, Fontana RJ, Davern T, McGuire B, et al. The natural history of severe acute liver injury. Am J Gastroenterol. 2017;112:1389–96. doi: 10.1038/ajg.2017.98.
- 5. Devarbhavi H, Asrani SK, Arab JP, Nartey YA, Pose E, Kamath PS. Global burden of liver disease: 2023 update. J Hepatol. 2023;79:516–37. doi: 10.1016/j.jhep.2023.03.017.
- 6. Lu Y, Shi Y, Wu Q, Sun X, Zhang WZ, Xu XL, et al. An overview of drug delivery nanosystems for sepsis-related liver injury treatment. Int J Nanomedicine. 2023;18:765–79. doi: 10.2147/IJN.S394802.
- 7. Zhang C, Lu Y. Study on artificial intelligence: The state of the art and future prospects. J Ind Inf Integr. 2021;23:100224.
- 8. Janiesch C, Zschech P, Heinrich K. Machine learning and deep learning. Electron Mark. 2021;3:685–95.
- 9. Salehi H, Burgueño R. Emerging artificial intelligence methods in structural engineering. Eng Struct. 2018;171:170–89.
- 10. Garg A, Mago V. Role of machine learning in medical research: A survey. Comput Sci Rev. 2021;40:100370.
- 11. Eshghi F, Tahmasebi S, Alimohammadi M, Soudi S, Khaligh SG, Khosrojerdi A, et al. Study of immunomodulatory effects of mesenchymal stem cell-derived exosomes in a mouse model of LPS induced systemic inflammation. Life Sci. 2022;310:120938. doi: 10.1016/j.lfs.2022.120938.
- 12. Squires JE, McKiernan P, Squires RH. Acute liver failure: An update. Clin Liver Dis. 2018;22:773–805. doi: 10.1016/j.cld.2018.06.009.
- 13. Yan J, Li S, Li S. The role of the liver in sepsis. Int Rev Immunol. 2014;33:498–510. doi: 10.3109/08830185.2014.889129.
- 14. Keenan BP, Fong L, Kelley RK. Immunotherapy in hepatocellular carcinoma: The complex interface between inflammation, fibrosis, and the immune response. J Immunother Cancer. 2019;7:267. doi: 10.1186/s40425-019-0749-z.
- 15. Bernal W, Jalan R, Quaglia A, Simpson K, Wendon J, Burroughs A. Acute-on-chronic liver failure. Lancet. 2015;386:1576–87. doi: 10.1016/S0140-6736(15)00309-8.
- 16. Huang H, Liu Y, Wu M, Gao Y, Yu X. Development and validation of a risk stratification model for predicting the mortality of acute kidney injury in critical care patients. Ann Transl Med. 2021;9:323. doi: 10.21037/atm-20-5723.
- 17. Yue S, Li S, Huang X, Liu J, Hou X, Zhao Y, et al. Machine learning for the prediction of acute kidney injury in patients with sepsis. J Transl Med. 2022;20:215. doi: 10.1186/s12967-022-03364-0.
- 18. van Doorn WP, Stassen PM, Borggreve HF, Schalkwijk MJ, Stoffers J, Bekers O, et al. A comparison of machine learning models versus clinical evaluation for mortality prediction in patients with sepsis. PLoS One. 2021;16:e0245157. doi: 10.1371/journal.pone.0245157.
- 19. Dong J, Chen Y, Yao B, Zhang X, Zeng N. A neural network boosting regression model based on XGBoost. Appl Soft Comput. 2022;125:109067.
- 20. Hajihosseinlou M, Maghsoudi A, Ghezelbash R. A novel scheme for mapping of MVT-type Pb–Zn prospectivity: LightGBM, a highly efficient gradient boosting decision tree machine learning algorithm. Nat Resour Res. 2023;32:1–22.
- 21. Zhang D, Gong Y. The comparison of LightGBM and XGBoost coupling factor analysis and prediagnosis of acute liver failure. IEEE Access. 2020;8:220990–1003.
- 22. Yao RQ, Jin X, Wang GW, Yu Y, Wu GS, Zhu YB, et al. A machine learning-based prediction of hospital mortality in patients with postoperative sepsis. Front Med (Lausanne). 2020;7:445. doi: 10.3389/fmed.2020.00445.
- 23. Ganaie MA, Hu M, Malik AK, Tanveer M, Suganthan PN. Ensemble deep learning: A review. Eng Appl Artif Intell. 2022;115:105151.
- 24. Kaur H, Poon PK, Wang SY, Woodbridge DM. Depression level prediction in people with Parkinson's disease during the COVID-19 pandemic. Annu Int Conf IEEE Eng Med Biol Soc. 2021;2021:2248–51. doi: 10.1109/EMBC46164.2021.9630566.
- 25. Bai BL, Wu ZY, Weng SJ, Yang Q. Application of interpretable machine learning algorithms to predict distant metastasis in osteosarcoma. Cancer Med. 2023;12:5025–34. doi: 10.1002/cam4.5225.
- 26. Jovanović J, Milovanović DR, Sazdanović P, Sazdanović M, Radovanović M, Novković L, et al. Risk factors profile for liver damage in cardiac inpatients. Vojnosanit Pregl. 2020;77:934–42.
- 27. Kakisaka K, Suzuki Y, Kataoka K, Okada Y, Miyamoto Y, Kuroda H, et al. Predictive formula of coma onset and prothrombin time to distinguish patients who recover from acute liver injury. J Gastroenterol Hepatol. 2018;33:277–82. doi: 10.1111/jgh.13819.
- 28. Xu X, Hou Z, Xu Y, Gu H, Liang G, Huang Y. The dynamic of platelet count as a novel and valuable predictor for 90-day survival of hepatitis B virus-related acute-on-chronic liver failure patients. Clin Res Hepatol Gastroenterol. 2021;45:101482. doi: 10.1016/j.clinre.2020.06.008.