Heliyon. 2024 Sep 1;10(17):e37179. doi: 10.1016/j.heliyon.2024.e37179

Construction of a machine learning-based prediction model for unfavorable discharge outcomes in patients with ischemic stroke

Yuancheng He a,f, Xiaojuan Zhang b, Yuexin Mei c, Deng Qianyun d, Xiuqing Zhang e, Yuehua Chen f,g, Jie Li f,g, zhou Meng h, Yuehong Wei f,i,
PMCID: PMC11408056  PMID: 39296250

Abstract

Background

Ischemic stroke is a common and serious disease with economic and healthcare burdens. Predicting the unfavorable discharge outcome of patients is essential for formulating appropriate treatment strategies and providing personalized care. Therefore, this study aims to establish and validate a prediction model based on machine learning methods to accurately predict the discharge outcome of ischemic stroke patients, providing valuable information for clinical decision making.

Methods

The derivation data comprised 964 patients from Guangdong Provincial People's Hospital and were used for training and internal validation. A favourable discharge outcome was defined as a National Institutes of Health Stroke Scale score of ≤1 or a decrease of ≥8 points compared to the admission score. Predictive models were built from 88 medical characteristics gathered during the patient's initial admission, using nine machine learning algorithms. The models' predictive performance was compared using various evaluation metrics. The final model's feature importance was ranked and explained using the Shapley additive explanation method.

Findings

The random forest model demonstrated the greatest discriminative ability among the nine machine learning models. We created an interpretable random forest model by ranking and reducing the features based on their importance, which left eight features. In internal validation, the final model accurately predicted the discharge outcomes of ischemic stroke with an AUC of 0.903, and it has been translated into a convenient tool to facilitate its use in clinical settings.

Conclusions

Our explainable ML model not only accurately predicted discharge outcomes in patients with ischemic stroke but also mitigated the “black-box” concern by providing an indirect interpretation of the ML technique.

Keywords: Ischemic stroke, Discharge outcome, Machine learning, Prediction model, SHAP

1. Introduction

Stroke, the leading cause of death globally, has caused over 20,000 annual deaths in China [1,2]. It is estimated that 67.3–80.5 % of strokes are classified as ischemic strokes (IS) [3]. The high disability, mortality, and recurrence rates associated with ischemic stroke pose a serious threat to the health of the population, resulting in significant economic and healthcare burdens for families and society [4]. Prediction and explanation of the clinical outcome of ischemic stroke patients is crucial for the development of targeted interventions. In patients with ischemic stroke, factors such as thrombus composition [5], size [5], location [5], time of stroke onset [5], haemoglobin [6], and BMI [7] are associated with clinical outcome.

At present, the evaluation indicators for the treatment efficacy of ischemic stroke patients mainly include the National Institutes of Health Stroke Scale (NIHSS) [8] and modified Rankin Scale scores (mRS) [9]. The NIHSS is used to assess the severity of patients' condition and neurological deficits, while the mRS is used to evaluate patients’ functional status and quality of life [8,9].

Numerous recent studies of ischemic stroke patients have successfully developed interpretable prediction models for treatment outcomes using outcome indicators such as modified Rankin Scale scores [10–14], readmission rates [15], and mortality rates [16]. Single indicators such as triglyceride-glucose (TyG) [17], neutrophil-to-lymphocyte ratio (NLR) [18], platelet-to-lymphocyte ratio (PLR) [18] and lymphocyte-to-monocyte ratio (LMR) [18] have also been used to predict outcome in ischemic stroke patients. However, the search for a single biomarker with sufficient sensitivity or specificity has been unsuccessful and inconvenient [19] due to the complexity of the pathophysiological mechanisms of stroke [20].

However, there is limited research on predicting changes in NIHSS scores at discharge. The NIHSS score is a comprehensive tool for assessing neurological impairment in patients, encompassing various domains such as limb movement, sensation, speech, and visual function. By evaluating the extent of functional deficits in these domains, it allows for a more accurate determination of the severity of a patient's condition [21]. Therefore, changes in the NIHSS score at discharge may be a more accurate predictor of patient recovery and outcomes. Collecting information on demographic characteristics, examining related indicators and other factors at the time of hospitalisation, and establishing a multi-factor model to predict the degree of neurological deficit at discharge can help clinicians better assess patients' rehabilitation potential and develop individualised treatment plans.

In recent years, machine learning (ML) methods based on electronic medical records (EMR) have gained attention and recognition from clinical physicians [22,23]. The widespread use of electronic medical records in hospitals allows for more accurate and convenient collection of clinical data from patients. Currently, many ML techniques have been employed in the development of ischemic stroke prediction models, with the majority demonstrating good predictive value [24,25]. However, although ML methods are powerful, the complexity of the models makes them difficult to interpret directly, which is known as the “black box” problem. To overcome this problem, the Shapley Additive Explanation (SHAP) method is employed to interpret ML models and visualize individual variable predictions [26,27]. The SHAP method is a unified approach used in earlier research to explain the output of ML models. However, there is limited research on changes in discharge NIHSS scores in ischemic stroke patients that uses the SHAP method to interpret predictive models.

The aim of this study is to establish and validate interpretable machine learning models that can predict early and accurate changes in discharge NIHSS scores in ischaemic stroke patients. The study aims to elucidate the importance of features and interpret the model using the SHAP method. In addition, the prognostic significance of the final model in ischaemic stroke patients will be determined.

2. Methods

2.1. Study design

This study employed a domestic cross-sectional research design to develop and validate a predictive model for the discharge outcome of patients with ischemic stroke.

2.2. Study population

The derivation data consists of patients who were discharged from Guangdong Provincial People's Hospital, China, between January 1, 2022, and December 25, 2023. These patients were diagnosed with ischemic stroke, and cases with missing key information (NIHSS scores at admission and discharge) or cases of deceased individuals were excluded from the data.

2.3. Data collection and definition

The data collection process included the gathering of personal information, medical history, family history, current illness, clinical symptoms, in-hospital and post-discharge laboratory and specialty examination results, length of hospitalisation and surgical interventions. These data points were sourced from the electronic medical record system of the same neurological stroke unit.

Additionally, four derived variables were calculated, namely TyG [17], NLR [18], PLR [18], and LMR [18]. NLR was calculated as the ratio of neutrophil count to lymphocyte count, PLR as the ratio of platelet count to lymphocyte count, LMR as the ratio of lymphocyte count to monocyte count, and TyG as Ln(Triglyceride (mg/dL)*Glucose (mg/dL)/2). The admission NIHSS score was divided into five grades: Grade 1 (0 points), Grade 2 (1–4 points), Grade 3 (5–15 points), Grade 4 (16–20 points), and Grade 5 (21–42 points) [28]. The admission mRS (modified Rankin Scale) assessment results were divided into two grades: Grade 1 (0–1 points) and Grade 2 (2–6 points) [29]. Discharge outcomes were determined based on the change in NIHSS score at discharge for ischaemic stroke patients. If the NIHSS score at discharge was ≤1 point or showed a decrease of ≥8 points compared to admission, it was considered a favourable discharge outcome; otherwise, it was considered an unfavorable discharge outcome [30]. Additional variables are detailed in Supplementary Table S1.
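The definitions in this section can be written down compactly. The following is a minimal Python sketch of the four derived indices, the admission grading and the outcome rule exactly as stated above; the function and argument names are illustrative assumptions and are not taken from the authors' code.

```python
import math


def derived_indices(neutrophils, lymphocytes, platelets, monocytes,
                    triglycerides_mg_dl, glucose_mg_dl):
    """Return NLR, PLR, LMR and TyG as defined in Section 2.3."""
    return {
        "NLR": neutrophils / lymphocytes,
        "PLR": platelets / lymphocytes,
        "LMR": lymphocytes / monocytes,
        # TyG = ln(triglyceride [mg/dL] * glucose [mg/dL] / 2)
        "TyG": math.log(triglycerides_mg_dl * glucose_mg_dl / 2),
    }


def nihss_grade(score):
    """Map an admission NIHSS score (0-42) to the five grades used here."""
    if not 0 <= score <= 42:
        raise ValueError("NIHSS score must lie between 0 and 42")
    if score == 0:
        return 1
    if score <= 4:
        return 2
    if score <= 15:
        return 3
    if score <= 20:
        return 4
    return 5


def mrs_grade(score):
    """Map an admission mRS score (0-6) to Grade 1 (0-1) or Grade 2 (2-6)."""
    return 1 if score <= 1 else 2


def favourable_outcome(nihss_admission, nihss_discharge):
    """Favourable if discharge NIHSS <= 1 or the score fell by >= 8 points."""
    return nihss_discharge <= 1 or (nihss_admission - nihss_discharge) >= 8
```

For instance, a patient admitted with an NIHSS score of 12 and discharged with a score of 4 meets the "decrease of ≥8 points" criterion, whereas a change from 6 to 3 does not meet either criterion.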

2.4. Derived data processing

A total of 99 variables were collected, and the proportion of missing data for each variable is provided in Supplementary Table S2. Features with more than 20 % missing values (n = 193) were excluded from the analysis to minimize bias caused by missing data. Imputation was conducted for variables with less than 20 % missingness: normally distributed variables were filled by multiple imputation, whereas non-normally distributed variables were imputed either after transformation to a normal distribution or by using a random forest to predict the missing values. Univariate analysis was performed on all remaining independent variables, and significant variables (p < 0.05) were considered for inclusion in the model, as shown in Table 1. Because of potential multicollinearity between features, Spearman's correlation analysis [31] was used to identify pairs of highly correlated variables (correlation coefficient >0.6), and the variable in each pair that was less correlated with the outcome was removed from the dataset, as shown in Supplementary Figure S1. Detailed correlations between variables are presented in Supplementary Appendix S1. The correlations between variables and the outcome are shown in Supplementary Appendix 2.
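The missingness filter and the collinearity rule described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: columns are held as simple lists, missing values are represented by None, the collinearity step assumes the data have already been imputed, and all function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho = Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_sparse(columns, max_missing=0.20):
    """Keep only columns whose fraction of missing (None) values is <= 20 %."""
    return {name: vals for name, vals in columns.items()
            if sum(v is None for v in vals) / len(vals) <= max_missing}


def prune_collinear(features, outcome, threshold=0.6):
    """For each pair with |rho| > threshold, drop the feature that is
    less correlated with the outcome."""
    kept = dict(features)
    names = list(features)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in kept and b in kept and \
                    abs(spearman(features[a], features[b])) > threshold:
                weaker = min((a, b),
                             key=lambda n: abs(spearman(features[n], outcome)))
                del kept[weaker]
    return kept
```

The order in which pairs are visited can affect which variables survive when several features are mutually correlated; the text does not specify an order, so the loop above simply follows the column order.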

Table 1.

Comparison of a subset of the demographic and clinical characteristics between unfavorable discharge outcomes and favourable discharge outcomes in the derivation data.

Variable | Unfavorable discharge outcomes (n = 506) | Favourable discharge outcomes (n = 458) | p-value
Length of hospitalisation | 10.36 ± 4.91 | 9.27 ± 4.18 | <0.001*
Age, year (#, a) | 64.54 ± 11.90 | 64.09 ± 11.31 | 0.547
Sex (c) | | |
 Male, n (%) | 367 (72.5 %) | 322 (70.3 %) | 0.489
 Female, n (%) | 139 (27.5 %) | 136 (29.7 %) |
Smoking (c) | | |
 Smoking = 0, n (%) | 293 (57.9 %) | 268 (58.5 %) | 0.899
 Smoking = 1, n (%) | 213 (42.1 %) | 190 (41.5 %) |
Alcohol (c) | | |
 Alcohol = 0, n (%) | 413 (81.6 %) | 378 (82.5 %) | 0.776
 Alcohol = 1, n (%) | 93 (18.4 %) | 80 (17.5 %) |
Lymphocyte count, 10^9/L (#, b) | 1.54 ± 0.59 | 2.00 ± 0.69 | <0.001*
Glycosylated hemoglobin, % (#, a) | 6.66 ± 1.70 | 6.59 ± 1.68 | 0.505
NLR (#, b) | 4.40 ± 3.76 | 2.74 ± 2.54 | <0.001*
PLR (#, b) | 193.83 ± 121.19 | 131.90 ± 66.45 | <0.001*
LMR (#, a) | 3.03 ± 1.64 | 3.11 ± 1.50 | 0.413
Monocyte count, 10^9/L (#, b) | 5.60 ± 2.45 | 4.63 ± 1.86 | <0.001*
Ratio of neutrophils (#, a) | 0.66 ± 0.11 | 0.63 ± 0.10 | <0.001*
Platelet count, 10^9/L (#, b) | 258.28 ± 86.13 | 233.59 ± 75.89 | <0.001*
Plasma fibrinogen content, g/L (#, a) | 3.97 ± 1.28 | 3.61 ± 1.12 | <0.001*
Triglyceride-glucose (#, b) | 2.58 ± 0.48 | 2.43 ± 0.40 | <0.001*
Triglycerides, mmol/L (#, b) | 1.41 ± 0.72 | 1.53 ± 1.06 | 0.037*
Systolic blood pressure on admission, mmHg (#, a) | 152.61 ± 20.03 | 141.33 ± 19.15 | <0.001*
Diastolic blood pressure on admission, mmHg (#, a) | 85.22 ± 13.32 | 85.00 ± 13.26 | 0.796
NIHSS (d) | | |
 NIHSS = 1, n (%) | 9 (1.8 %) | 144 (31.4 %) | <0.001*
 NIHSS = 2, n (%) | 204 (40.3 %) | 253 (55.2 %) |
 NIHSS = 3, n (%) | 258 (51.0 %) | 24 (5.2 %) |
 NIHSS = 4, n (%) | 22 (4.3 %) | 20 (4.4 %) |
 NIHSS = 5, n (%) | 13 (2.6 %) | 17 (3.7 %) |
History of cerebral infarction (c) | | |
 History of cerebral infarction = 0, n (%) | 334 (66.0 %) | 348 (76.0 %) | <0.001*
 History of cerebral infarction = 1, n (%) | 172 (34.0 %) | 110 (24.0 %) |
Family history of hypertension (c) | | |
 Family history of hypertension = 0, n (%) | 290 (57.3 %) | 272 (59.4 %) | 0.557
 Family history of hypertension = 1, n (%) | 216 (42.7 %) | 186 (40.6 %) |
Intervention (c) | | |
 Intervention = 0, n (%) | 476 (94.1 %) | 427 (93.2 %) | 0.688
 Intervention = 1, n (%) | 30 (5.9 %) | 31 (6.8 %) |
mRS (c) | | |
 mRS = 0, n (%) | 248 (49.0 %) | 384 (83.8 %) | <0.001*
 mRS = 1, n (%) | 258 (51.0 %) | 74 (16.2 %) |

Note: All data were collected as the first-time measurements at the time of admission.

NLR:neutrophil-to-lymphocyte ratio; PLR:platelet-to-lymphocyte ratio; mRS:Modified Rankin Scale; LMR:Lymphocyte-to-monocyte ratio; NIHSS:National Institutes of Health Stroke Scale.

a: T-test was used for univariate analysis.

b: Welch's t-test was used for univariate analysis.

c: Chi-square test was used for univariate analysis.

d: Mann-Whitney U test was used for univariate analysis.

#: the variable is expressed as mean ± SD.

2.5. Model development and comparison

Data from Guangdong Provincial People's Hospital were partitioned, with 70 % allocated for training and 30 % for internal validation, in order to mitigate overfitting concerns. Additionally, an external dataset was utilized for testing purposes.

The predictive model was established using the characteristics that remained after exclusion by univariate analysis and collinearity analysis. Nine machine learning (ML) models, namely Classification and Regression Trees (CART), Extra Trees (ET), K-Nearest Neighbors (KNN), LightGBM, Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Categorical Boosting (CAT), and Extreme Gradient Boosting (XGBoost), were employed to predict the discharge outcomes of ischemic stroke patients. To optimize the predictive model, a combination of grid search and manual tuning was used to obtain the final set of hyperparameters.
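A workflow of this kind (70/30 split, several candidate classifiers, grid search) might look roughly like the scikit-learn sketch below. It is only an illustration: the data are synthetic stand-ins generated with make_classification, the hyperparameter grids are arbitrary assumptions, and LightGBM, CatBoost and XGBoost are omitted because they require separate packages.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a 964-patient, 20-feature dataset (assumption).
X, y = make_classification(n_samples=964, n_features=20, n_informative=8,
                           random_state=0)

# 70 % training / 30 % internal validation, stratified by outcome.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

# Candidate models with small, purely illustrative hyperparameter grids.
candidates = {
    "CART": (DecisionTreeClassifier(random_state=0), {"max_depth": [3, 5]}),
    "ET": (ExtraTreesClassifier(n_estimators=50, random_state=0),
           {"max_depth": [5, None]}),
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [5, 15]}),
    "LR": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0]}),
    "RF": (RandomForestClassifier(n_estimators=50, random_state=0),
           {"max_depth": [5, None]}),
    "SVM": (SVC(probability=True, random_state=0), {"C": [0.1, 1.0]}),
}

validation_auc = {}
for name, (estimator, grid) in candidates.items():
    # Grid search with 5-fold cross-validation on the training split only.
    search = GridSearchCV(estimator, grid, scoring="roc_auc", cv=5)
    search.fit(X_train, y_train)
    # Score the refitted best model on the held-out validation split.
    scores = search.predict_proba(X_val)[:, 1]
    validation_auc[name] = roc_auc_score(y_val, scores)

for name, auc in sorted(validation_auc.items(), key=lambda item: -item[1]):
    print(f"{name}: AUC = {auc:.3f}")
```

Tuning only on the training split and reporting the AUC on the held-out 30 % keeps the comparison between models free of information from the validation data.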

The reliability of the models was evaluated using common performance metrics, including the area under the receiver-operating-characteristic (ROC) curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and F1 score. Additionally, to validate the predictive model, five-fold and ten-fold cross-validation were performed on the derived data.
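As a reminder of what the AUC measures, it equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The short sketch below computes it directly from that definition; the function name and the toy data are illustrative assumptions.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes; scores: predicted scores for class 1.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
example_auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

This pairwise form is quadratic in the sample size, so it is meant for illustration rather than for large datasets.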

2.6. Feature selection and model explanation

Obtaining correct explanations for ML models is challenging. The SHAP method is a technique that ranks the importance of input features and explains the predictions made by the model and is designed to overcome the black box problem [27].

SHAP value-assisted feature selection was employed to reduce the prediction model from 20 to 2 features, in accordance with the rank of feature importance. The features of the chosen ML model were removed step by step until the AUC decreased significantly, and the nonparametric method of DeLong [32] was used to compare the difference between AUCs. The model with the best predictive ability during this feature reduction was selected as the final model.

The SHAP method provides both global and local explanations for model interpretation. The global explanation provides consistent and accurate attribute values for each feature in the model, revealing the associations between input features and discharge outcomes. The local explanation demonstrates specific predictions for individual cases by inputting specific data.
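The additive attribution underlying SHAP can be illustrated with an exact Shapley value computation on a toy model. The sketch below enumerates all feature subsets and replaces "absent" features with a single baseline value; this is a simplifying assumption, and the SHAP package uses more efficient, model-specific estimators (for example for tree ensembles). The toy function and all names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside the coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(v):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2 * v[0] + v[1] * v[2]


phi = shapley_values(toy_model, x=[1, 2, 3], baseline=[0, 0, 0])
```

The local explanation for one case is the vector phi; the attributions sum to the difference between the prediction for that case and the baseline prediction, and the interaction term is shared equally between the two features involved. Averaging the absolute values of such attributions over all patients gives the kind of global ranking shown in a SHAP summary plot.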

2.7. Statistical analysis

Data analysis was conducted using R Version 4.2.0. The normality of the data was assessed using the Kolmogorov-Smirnov test. For normally distributed measurement data, an independent sample t-test was used for comparison. For non-normally distributed measurement data, Welch's t-test was used. Ranked data was compared using the Mann-Whitney U test, and count data was analyzed using either the Chi-square test or Fisher's exact test.

Python 3.8.0 was utilized to construct the machine learning model and generate decision curve analysis (DCA) and risk-reward curve. The SHAP package was employed for feature selection to determine the optimal model, and the area under the curve (AUC) and other indicators were used to evaluate its predictive ability.

3. Results

3.1. Patient characteristics

In this study, a total of 964 patients with ischemic stroke were included, among whom 506 patients were classified as having an unfavorable outcome at discharge according to the defined criteria, while 458 patients were classified as having a favourable outcome. Table 1 shows a subset of the demographic and clinical characteristics of these 964 patients, grouped by discharge outcome, and the full table is provided in Supplementary Table S3. The average length of hospital stay for patients with unfavorable discharge outcomes was 10.36 days, while the average length of hospital stay for patients with favourable discharge outcomes was 9.27 days. This difference was statistically significant (p < 0.001). The group with unfavorable discharge outcomes had a similar age distribution to the group with favourable discharge outcomes, with average ages of 64.54 years and 64.09 years, respectively. The most common NIHSS grade on admission was grade 3 (51 %) in the unfavorable group and grade 2 (55 %) in the favourable group. The proportion of men was slightly higher in the group with unfavorable discharge outcomes (72.5 % vs 70.3 %), although this difference was not statistically significant (p = 0.489).

From the initial 88 variables, we identified 24 significant predictors. Four of these, namely PLR, monocyte count, ratio of neutrophils, and TyG, were excluded due to multicollinearity. A total of 20 variables with predictive value were therefore retained: monocyte ratio, platelet count, white cell count, plasma fibrinogen content, glucose, K+, triglycerides, apolipoprotein AI, lactate dehydrogenase, history of cerebral infarction, systolic blood pressure on admission, level of risk, mRS, D-dimer, NLR, NIHSS, cholinesterase, urobilinogen, lymphocyte count and ratio of lymphocytes.

Data from 674 patients were assigned to the training group, while information from 290 patients was allocated to the internal validation group. A comparison of the demographic and clinical variables related to the 20 predictors among the training, internal validation, and derivation groups is presented in Table 2.

Table 2.

Comparison of demographic and clinical characteristics among the training, internal validation, and derivation data (the following 20 variables demonstrated significance in univariate analysis).

Variable | Training (n = 674) | Internal validation (n = 290) | Derivation (n = 964)
Uro (a, b, c) | | |
 Uro = 0, n (%) | 610 (90.5 %) | 256 (88.3 %) | 866 (89.8 %)
 Uro = 1, n (%) | 46 (6.8 %) | 27 (9.3 %) | 73 (7.6 %)
 Uro = 2, n (%) | 14 (2.1 %) | 4 (1.4 %) | 16 (1.7 %)
 Uro = 3, n (%) | 3 (0.4 %) | 2 (0.7 %) | 7 (0.7 %)
 Uro = 4, n (%) | 1 (0.1 %) | 1 (0.3 %) | 2 (0.2 %)
HCI (a, c) | | |
 HCI = 0, n (%) | 487 (72.3 %) | 195 (67.2 %) | 682 (70.7 %)
 HCI = 1, n (%) | 187 (27.7 %) | 95 (32.8 %) | 282 (29.3 %)
Risk Level (a, c) | | |
 Risk Level = 0, n (%) | 629 (93.3 %) | 265 (91.4 %) | 894 (92.7 %)
 Risk Level = 1, n (%) | 45 (6.7 %) | 25 (8.6 %) | 70 (7.3 %)
NIHSS (a, b, c) | | |
 NIHSS = 1, n (%) | 106 (15.7 %) | 47 (16.2 %) | 153 (15.9 %)
 NIHSS = 2, n (%) | 323 (47.9 %) | 134 (46.2 %) | 457 (47.4 %)
 NIHSS = 3, n (%) | 195 (28.9 %) | 87 (30.0 %) | 282 (29.3 %)
 NIHSS = 4, n (%) | 30 (4.5 %) | 12 (4.1 %) | 42 (4.4 %)
 NIHSS = 5, n (%) | 20 (3.0 %) | 10 (3.4 %) | 30 (3.1 %)
mRS (a, b, c) | | |
 mRS = 0, n (%) | 447 (66.3 %) | 185 (63.8 %) | 632 (65.6 %)
 mRS = 1, n (%) | 227 (33.7 %) | 105 (36.2 %) | 332 (34.4 %)
ChE (c), U/L | 7301.5 [7011.35, 7341.30] | 7176.32 [7011.35, 7341.30] | 7263.85 [7011.35, 7341.30]
LymC (a, b, c), 10^9/L | 1.75 [1.70, 1.86] | 1.78 [1.70, 1.86] | 1.76 [1.70, 1.86]
NLR (a, b, c) | 3.72 [3.10, 3.60] | 3.35 [3.10, 3.60] | 3.61 [3.10, 3.60]
LymR (a, b, c) | 0.24 [0.22, 0.24] | 0.23 [0.22, 0.24] | 0.24 [0.22, 0.24]
MonoC (a, b, c), 10^9/L | 0.64 [0.64, 0.71] | 0.67 [0.64, 0.71] | 0.65 [0.64, 0.71]
PltC (a, c), 10^9/L | 245.98 [237.08, 258.68] | 247.88 [237.08, 258.68] | 246.55 [237.08, 258.68]
WBC (a, c), 10^12/L | 7.71 [7.69, 8.31] | 8 [7.69, 8.31] | 7.8 [7.69, 8.31]
Fib (a, c), g/L | 3.76 [3.73, 4.02] | 3.87 [3.73, 4.02] | 3.8 [3.73, 4.02]
Glu (a, b, c), mmol/L | 6.25 [5.96, 6.63] | 6.29 [5.96, 6.63] | 6.26 [5.96, 6.63]
K (a, c), mmol/L | 3.65 [3.63, 3.72] | 3.68 [3.63, 3.72] | 3.66 [3.63, 3.72]
TG (a, c), mmol/L | 1.52 [1.27, 1.41] | 1.34 [1.27, 1.41] | 1.47 [1.27, 1.41]
ApoAI (a, c), g/L | 1.19 [1.14, 1.20] | 1.17 [1.14, 1.20] | 1.18 [1.14, 1.20]
LDH (a, c), U/L | 189.77 [181.40, 195.08] | 188.24 [181.40, 195.08] | 189.31 [181.40, 195.08]
D-dimer (a, c), ng/ml | 950.31 [742.98, 1213.36] | 978.17 [742.98, 1213.36] | 958.69 [742.98, 1213.36]
SBP (a, b, c), mmHg | 147.6 [144.05, 148.79] | 146.42 [144.05, 148.79] | 147.25 [144.05, 148.79]

Continuous variables are presented as mean (confidence interval), while categorical variables are presented as numbers (percentages). The derivation dataset includes both training and internal validation data. In the column of significance, “a” indicates a significant difference between unfavorable discharge outcomes and favourable discharge outcomes in the derivation dataset, “b” indicates a significant difference between unfavorable discharge outcomes and favourable discharge outcomes in the training set, and “c” indicates a significant difference between unfavorable discharge outcomes and favourable discharge outcomes in the internal validation set.

ChE:Cholinesterase; LymC:Lymphocyte count; NeutR:Ratio of neutrophils; LymR:Ratio of lymphocytes; MonoC:Monocyte ratio; PltC:Platelet Count; WBC:White cell count; Fib:Plasma fibrinogen content; Glu:Glucose; TG:Triglycerides; ApoAI:ApolipoproteinAI; LDH:Lactate dehydrogenase; SBP:Systolic blood pressure on admission; HCI:History of cerebral infarction; Risk Level:Level of risk; NIHSS:National Institutes of Health Stroke Scale; mRS:Modified Rankin Scale; Uro:Urobilinogen.

Uro was graded according to its reading: a reading of 0 was defined as grade 0, a reading greater than 0 and up to 34 as grade 1, greater than 34 and up to 68 as grade 2, greater than 68 and up to 135 as grade 3, greater than 135 and up to 202 as grade 4, and a reading of more than 202 as grade 5. The NIHSS score was divided into five grades: Grade 1 (0 points), Grade 2 (1–4 points), Grade 3 (5–15 points), Grade 4 (16–20 points), and Grade 5 (21–42 points). The admission mRS (modified Rankin Scale) assessment results were divided into two grades: Grade 1 (0–1 points) and Grade 2 (2–6 points).

3.2. Model development and performance comparison

Data collected from Guangdong Provincial People's Hospital were used to establish nine ML models for predicting the discharge outcomes of patients with ischemic stroke. Among the nine models, the CatBoost model demonstrated the best predictive performance (AUC = 0.916) when incorporating 20 variables. It was followed by the ET model (AUC = 0.909), RF model (AUC = 0.900), and LightGBM model (AUC = 0.898). The discriminative performance of these nine models is presented in Supplementary Table S4. The ROC curves and SHAP summary plots of the top 20 features for the four best-performing ML models are shown in Fig. 1A and Supplementary Fig. S2A–D.

Fig. 1.

Performance of machine learning models to predict discharge outcomes in patients with ischemic stroke.

(A) ROC curves of the top five best-performing machine learning models. (B) AUCs of the top four best-performing machine learning models with varied numbers of features. (C) AUC, sensitivity, specificity, NPV, PPV, accuracy, and F1 score of the random forest model with varied numbers of features.

AUC: area under the ROC curve; LightGBM: light gradient boosting machine; ML: machine learning; RF: random forest; ROC: receiver-operating-characteristic; ET: extra tree; CART: Classification and Regression Trees.

Based on the DeLong test results, there is a significant difference in the AUC values between the RF models with 8 and 5 independent variables, and between the RF models with 5 and 2 independent variables. The LightGBM and ET models also exhibit a significant difference in the AUC values between models with 5 and 2 independent variables. However, the CART model shows no significant difference in the AUC values across different numbers of independent variables, as shown in Supplementary Appendix 3.

The RF model consistently demonstrated the best predictive ability and optimal stability among the four models during the process of reducing feature dimensionality based on feature importance rankings. This was observed at the optimal cutoff value determined by the maximum Youden index.

Throughout the feature reduction process based on feature importance ranking, the changes in AUC values for the four models indicated that the Random Forest model consistently maintained superior predictive performance compared to the other models, as illustrated in Fig. 1B. The RF model was therefore the most effective and stable model for predicting the discharge outcomes of ischemic stroke.

The performance of the RF model with varying numbers of features is detailed in Fig. 1C and Supplementary Table S5. Sensitivity, specificity, PPV, NPV, accuracy, and F1 score were computed at the optimal cutoff value that maximized the Youden index.
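The cutoff selection mentioned here can be sketched as follows: every observed score is tried as a threshold, and the one with the largest Youden index J = sensitivity + specificity - 1 is kept. The function name and the toy data are illustrative assumptions, and when several thresholds tie, this sketch keeps the lowest one.

```python
def youden_cutoff(labels, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is predicted positive when its score is >= threshold.
    labels must contain at least one 0 and one 1.
    """
    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
        fn = sum(1 for l, s in zip(labels, scores) if l == 1 and s < threshold)
        tn = sum(1 for l, s in zip(labels, scores) if l == 0 and s < threshold)
        fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if best is None or j > best[1]:
            best = (threshold, j)
    return best


# Toy example with eight cases.
cutoff, j_stat = youden_cutoff(
    [0, 0, 0, 1, 0, 1, 1, 1],
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
```

In the toy data, thresholds 0.4 and 0.6 both reach J = 0.75; the first trades specificity for full sensitivity, the second does the opposite, which is why a tie-breaking rule matters in practice.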

3.3. Identification of the final model

The definitive model was established through feature selection within the Random Forest framework. As depicted in Fig. 1C and Supplementary Fig. S3A, the 20-feature model demonstrated significantly superior predictive performance compared to the 2-feature model (ΔAUC = 0.114, P < 0.001) in forecasting outcomes for patients with ischemic stroke at discharge. Nonetheless, it did not exhibit a statistically significant enhancement over the 9-feature model (ΔAUC = 0.001, P = 0.942), the 8-feature model (ΔAUC = 0.003, P = 0.825), and the 7-feature model (ΔAUC = 0.008, P = 0.565). The 8-feature model showcased favourable net benefit and a high threshold probability, akin to the 20-feature model. Furthermore, the area under the Precision-Recall curve of the 8-feature model surpassed that of the 20-feature model, indicating comparable and robust clinical utility for the 8-feature models, as illustrated in Supplementary Fig. S3B-G. From Fig. 1C, it is evident that the 8-feature model excelled in the metrics of sensitivity, specificity, positive predictive value, negative predictive value, accuracy, and F1 score as the number of variables decreased.

Therefore, we focused on the 8-feature Random Forest model, including monocyte ratio, platelet count, glucose, systolic blood pressure on admission, lymphocyte count, neutrophil-to-lymphocyte ratio, NIHSS, and mRS, as the final model for further analysis. The final RF model yielded an AUC of 0.903, with a sensitivity of 0.759, specificity of 0.876, PPV of 0.784, NPV of 0.817, accuracy of 0.817, and F1 score of 0.806.

Table 3 is the confusion matrix for this prediction model, showing the performance of the binary classification model for each category: the model correctly predicted 127 negative samples as negative (true negatives), incorrectly predicted 18 negative samples as positive (false positives), incorrectly predicted 28 positive samples as negative (false negatives), and correctly predicted 117 positive samples as positive (true positives). Table 4 provides an assessment of prediction performance for the two discharge outcomes, including precision, recall, F1 score, and the number of supporting samples for each category. Specifically, for the unfavorable discharge outcome, the precision was 0.82, the recall was 0.88, the F1 score was 0.85, and the number of supporting samples was 145; for the favourable discharge outcome, the precision was 0.87, the recall was 0.81, the F1 score was 0.84, and the number of supporting samples was also 145.

Table 3.

Confusion matrix.

Confusion matrix | True value: Positives | True value: Negatives
Predictive value: Positives | 127 | 18
Predictive value: Negatives | 28 | 117

Table 4.

Explanation of the classified report.

Indicators | Precision | Recall | F1-score | Support
0 (unfavorable discharge outcome) | 0.82 | 0.88 | 0.85 | 145
1 (favourable discharge outcome) | 0.87 | 0.81 | 0.84 | 145
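The per-class figures in the classification report follow directly from the four counts in the confusion matrix. The short check below recomputes them, taking class 0 (unfavorable) as the negative class and class 1 (favourable) as the positive class, as in the description of the counts in the text; the helper function is an illustrative assumption.

```python
# Counts as described in the text: class 0 = negative, class 1 = positive.
tn, fp, fn, tp = 127, 18, 28, 117


def class_report(correct, missed, false_alarms):
    """Precision, recall and F1 for one class, rounded to two decimals.

    correct:      cases of this class predicted as this class
    missed:       cases of this class predicted as the other class
    false_alarms: cases of the other class predicted as this class
    """
    precision = correct / (correct + false_alarms)
    recall = correct / (correct + missed)
    f1 = 2 * precision * recall / (precision + recall)
    return round(precision, 2), round(recall, 2), round(f1, 2)


class0 = class_report(correct=tn, missed=fp, false_alarms=fn)  # unfavorable
class1 = class_report(correct=tp, missed=fn, false_alarms=fp)  # favourable
support0, support1 = tn + fp, tp + fn

print("class 0:", class0, "support", support0)
print("class 1:", class1, "support", support1)
```

Both supports come out as 145, matching the 290 patients in the internal validation set.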

To validate the appropriate sample size for this study and the robustness of the model to site variations, further cross-validation was conducted. As shown in Supplementary Figures S4A and S4B, the average AUC values of the final model in five-fold and ten-fold cross-validation were 0.886 and 0.890, respectively.

The study also analyzed the individual predictive capacities of NIHSS, lymphocyte count, systolic blood pressure on admission, monocyte ratio, NLR, mRS, glucose, and platelet count in relation to the discharge outcomes of patients with ischemic stroke. These single factors were compared with the 8-feature final model. As depicted in Supplementary Fig. S5A, the AUC of the final model differed significantly from that of each single predictor: NIHSS (ΔAUC = 0.088, P < 0.001), NLR (ΔAUC = 0.305, P < 0.001), MonoC (ΔAUC = 0.236, P < 0.001), SBP (ΔAUC = 0.361, P < 0.001), LymC (ΔAUC = 0.299, P < 0.001), mRS (ΔAUC = 0.206, P < 0.001), Glu (ΔAUC = 0.335, P < 0.001), and PltC (ΔAUC = 0.454, P < 0.001). The Decision Curve Analysis (DCA) curves further demonstrated that the final model exhibited superior clinical utility compared to monocyte ratio, platelet count, glucose, systolic blood pressure on admission, lymphocyte count, NLR, NIHSS, and mRS, as depicted in Supplementary Fig. S5B.

3.4. Model explanation

In order to improve the interpretability of the model, the SHAP method was employed to provide insights into the final model's output by quantifying the contribution of each variable to the prediction. The global explanation outlines the overall behavior of the model. The SHAP summary plots (Fig. 2A and B) present the feature contributions to the model based on average SHAP values, displayed in descending order. Furthermore, the SHAP dependence plots aid in understanding how a single feature influences the prediction model's output. Fig. 2C–I compares the actual values and the SHAP values of the 8 features, with SHAP values below zero indicating a positive class prediction in the model, signifying an elevated risk of an unfavorable discharge outcome. For example, individuals diagnosed with ischemic stroke presenting an mRS grade >0 or an NIHSS grade ≥1 exhibited SHAP values below zero, influencing the classification towards the “unfavorable discharge outcome” category. Moreover, an NLR value ≥2.3 and a lymphocyte count value ≥2.2 also steered the decision towards the “unfavorable discharge outcome” classification. Although the scatter plots of the remaining variables exhibit a clustering of data points around SHAP = 0, a clear linear trend can be observed. The monocyte ratio demonstrates an inverse relationship with unfavorable outcomes in patients with ischemic stroke, indicating that higher values are associated with a lower likelihood of an unfavorable discharge outcome. On the other hand, systolic blood pressure on admission, glucose, and platelet count show a positive correlation with unfavorable outcomes in patients with ischemic stroke, suggesting that higher values are linked to a higher probability of an unfavorable discharge outcome.

Fig. 2.

Global model explanation by the SHAP method.

(A) SHAP summary bar plot. (B) SHAP summary dot plot. The predicted probability of the outcome increases with the SHAP value of a feature. One dot is drawn per patient for each feature, positioned at that patient's SHAP value, so each patient has one dot on the line for each feature. The colors of the dots demonstrate the actual values of the features for each patient, where red indicates a higher feature value and blue indicates a lower feature value. The dots are stacked vertically to show density. (C–I) SHAP dependence plots. Each dependence plot shows how a single feature affects the output of the prediction model, and each dot represents a single patient. LymC: Lymphocyte count; NLR: neutrophil-to-lymphocyte ratio; MonoC: Monocyte ratio; PltC: Platelet Count; NIHSS: National Institutes of Health Stroke Scale; mRS: Modified Rankin Scale; Glu: Glucose; SBP: Systolic blood pressure on admission.

Moreover, the local interpretation explains the rationale behind a specific prediction for an individual by integrating personalized input data. Fig. 3A–C depicts a case of ischemic stroke in a patient who did experience unfavorable discharge outcomes. Fig. 3A portrays this individual as belonging to the “favourable discharge outcomes” category with a probability of 16 %, while Fig. 3B depicts the same individual as belonging to the “unfavorable discharge outcomes” category with a probability of 84 %, as per the predictive model. Fig. 3C presents the SHAP values and specific values corresponding to each variable for this particular patient. The values of platelet count, glucose, lymphocyte count, and NIHSS contributed to the inclination towards the “unfavorable discharge outcomes” category, while the converse was observed for the remaining variables. An increase in the portion for each patient is indicative of a greater likelihood of favouring the “unfavorable discharge outcomes” decision.

Fig. 3.

Local model explanation by the SHAP method.

Figures A–C represent a patient with ischemic stroke who has favourable discharge outcomes.

LymC: Lymphocyte count; NLR: neutrophil-to-lymphocyte ratio; MonoC: Monocyte ratio; PltC: Platelet Count; NIHSS: National Institutes of Health Stroke Scale; mRS: Modified Rankin; Glu: Glucose; SBP: Systolic blood pressure on admission.
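A local explanation of this kind can be read as simple arithmetic: a base value plus one signed contribution per feature gives the predicted probability for that patient. The sketch below assumes SHAP values expressed on the probability scale; the base value and every contribution are invented for illustration and are not the values shown in Fig. 3C. The function name `explain_locally` is hypothetical.

```python
def explain_locally(base_value, contributions):
    """Split signed per-feature contributions into the two directions.

    Returns the reconstructed prediction and two lists of feature names,
    each ordered from largest to smallest absolute contribution.
    """
    prediction = base_value + sum(contributions.values())
    ordered = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    towards_unfavorable = [name for name, phi in ordered if phi > 0]
    towards_favourable = [name for name, phi in ordered if phi < 0]
    return prediction, towards_unfavorable, towards_favourable


# Invented numbers for one hypothetical patient (not taken from the study).
example_contributions = {
    "NIHSS": 0.12, "PltC": 0.06, "Glu": 0.05, "LymC": 0.04,
    "NLR": -0.03, "MonoC": -0.02, "mRS": -0.01, "SBP": -0.01,
}
prediction, up, down = explain_locally(0.50, example_contributions)
print(f"predicted probability of unfavorable outcome: {prediction:.2f}")
print("pushing towards unfavorable:", up)
print("pushing towards favourable:", down)
```

Positive contributions move the prediction above the base value and negative ones move it below, which is the reading applied to the individual case above.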

3.5. Convenient application for clinical utility

The final prediction model was implemented in a web application to facilitate its use in clinical scenarios, as shown in Fig. 4. When the actual values of the 8 features required for the model are entered, this application automatically predicts the risk of an unfavorable discharge outcome for a patient. The web application can be accessed online via https://randomforest-model-for-ischemic-stroke.streamlit.app using a mobile phone or computer.

Fig. 4.

Convenient application for clinical utility.

The convenient application of the final Random Forest model with 8 features is available for the prediction of unfavorable discharge outcomes. When the actual values of the 8 features are entered, the application automatically displays the predicted probability of an unfavorable discharge outcome (84 % in this example).
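The input-to-prediction step of such a tool can be sketched as follows. This is not the deployed application's code: the feature order, the function names, and the stand-in predictor are all assumptions for illustration, and the wiring to Streamlit input widgets is omitted.

```python
# Assumed feature order (hypothetical); a real tool must use the order the
# model was trained with.
FEATURES = ["NIHSS", "NLR", "MonoC", "SBP", "LymC", "mRS", "Glu", "PltC"]


def build_feature_row(form_values):
    """Turn a dict of entered values into an ordered numeric row."""
    missing = [name for name in FEATURES if name not in form_values]
    if missing:
        raise ValueError(f"missing inputs: {', '.join(missing)}")
    return [float(form_values[name]) for name in FEATURES]


def predict_risk(predict_probability, form_values):
    """Run a predictor on the entered values and format the result."""
    row = build_feature_row(form_values)
    probability = predict_probability(row)
    if not 0.0 <= probability <= 1.0:
        raise ValueError("predictor must return a probability between 0 and 1")
    return f"Predicted risk of unfavorable discharge outcome: {probability:.0%}"


def stand_in_model(row):
    # Placeholder for a trained classifier; always returns the same value.
    return 0.25


entered = {name: 1.0 for name in FEATURES}  # dummy inputs, not clinical values
message = predict_risk(stand_in_model, entered)
print(message)
```

Keeping validation separate from the model call means incomplete forms are rejected before any prediction is displayed.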

4. Discussion

To the best of our knowledge, this is the first cross-sectional study with internal validation that investigates and compares nine ML models for predicting outcomes at discharge based on the NIHSS in patients with ischemic stroke. We identified a set of predictive risk factors and developed a predictive model for ischemic stroke patients using ML algorithms combined with clinical and laboratory data easily extracted from electronic medical records systems.

Although many studies have focused on prediction in patients with ischemic stroke up to now, implementing a single biomarker analysis in clinical practice seems challenging due to the potential pathophysiological differences in ischemic stroke. ML techniques are powerful computational methods for handling complex and vast datasets, as they can deal with highly variable datasets and understand the complex relationships between variables in a flexible and trainable manner. Combining data from electronic medical records systems with mature ML algorithms aids in the development of clinical prediction models [33]. Among the nine ML models evaluated, the RF model exhibited the best area under the curve, superior net benefit, and a high feature reduction threshold probability. Random forests are a widely used tool in the medical field, with a growing body of evidence supporting their ability to provide reliable predictive value by integrating the results of multiple decision trees [[34], [35], [36]]. In this study, we developed a final model consisting of eight features using the RF algorithm. The clinical accessibility and convenience of these features enable easy acquisition and assessment upon admission of patients with ischemic stroke. Therefore, we believe that this model has the potential to serve as an early identification tool for patients with ischemic stroke, aiding healthcare teams in making more accurate predictions and intervention decisions at the early stages of patient admission, thereby improving clinical outcomes and treatment efficacy.

Due to the lack of guiding principles or consensus on feature selection for predictive models, determining the optimal number of features to include in a model remains a challenge. While incorporating more features may provide additional information for predictive models, including a large number of features can potentially limit the clinical applicability of the model, and the inclusion of non-causal features may compromise prediction accuracy [37]. The utilization of the SHAP method assists in feature selection. The model we ultimately developed is a simple and convenient machine learning predictive model that can be easily utilized to support clinical decision-making for patients with ischemic stroke.

The final model's eight features (NIHSS, NLR, monocyte ratio, systolic blood pressure on admission, lymphocyte count, mRS, glucose, and platelet count) reflect various physiological states and disease characteristics of the patients. In other predictive models for ischemic stroke, regardless of whether the outcome measures are 90-day mRS score [14], mortality rate [38], or 30-day readmission rate [15], NIHSS often plays a significant role. Early neurological improvement (ENI) and early neurological deterioration (END) are closely associated with NLR in acute ischemic stroke patients [18]. Enhancing angiogenesis, neuroregeneration, and neuroprotection through platelets and their microparticles is feasible and improves functional recovery after stroke [39]. Following acute ischemic stroke (AIS), peripheral monocytes migrate to the site of the lesion within 24 h, reaching their peak levels at 3–7 days, and subsequently undergo differentiation into macrophages [40]. The extent of infarct size and severity of neurological impairment following ischemic stroke are closely correlated with systemic blood pressure [41]. Fasting blood glucose is positively associated with the risk of stroke, exhibiting a non-linear dose-response relationship [42]. Depletion of lymphocytes in patients with acute ischemic stroke is associated with unfavorable neurological functional outcomes [43]. The aforementioned evidence supports the inclusion of these variables in the final model. While certain features considered individually, such as systolic blood pressure on admission (AUC = 0.542), glucose (AUC = 0.568), and lymphocyte count (AUC = 0.449), show relatively weak predictive ability for discharge outcomes, the integration of these features into our comprehensive model demonstrates significantly stronger predictive performance (AUC = 0.903).
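The single-feature AUC values quoted above can be read through the rank interpretation of the AUC: the probability that a randomly chosen patient with an unfavorable outcome has a higher feature value than a randomly chosen patient with a favourable outcome. The sketch below uses that formulation on invented toy data; it is not necessarily how the study computed its AUCs, and the numbers are not from the study cohort.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Label 1 marks the positive class
    (here: unfavorable discharge outcome).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented toy data: 1 = unfavorable outcome, 0 = favourable outcome.
labels = [0, 0, 0, 1, 1, 1]
glucose_like = [5.0, 6.1, 7.2, 5.5, 6.8, 9.0]     # higher tends to be unfavorable
lymphocyte_like = [2.4, 2.0, 1.1, 1.8, 1.2, 0.9]  # lower tends to be unfavorable

auc_glucose = auc(glucose_like, labels)
auc_lymphocyte = auc(lymphocyte_like, labels)
auc_lymphocyte_flipped = auc([-v for v in lymphocyte_like], labels)
print(round(auc_glucose, 3), round(auc_lymphocyte, 3), round(auc_lymphocyte_flipped, 3))
```

Under this reading, an AUC below 0.5 indicates that higher raw values rank with the favourable class more often than not, so such a feature can still carry information once its direction is taken into account.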

Among the numerous articles examining predictive models for patients with cerebral infarction, our study is the first to use the change in NIHSS score at discharge as an outcome indicator. In comparison to other studies that frequently utilize mRS scores to evaluate patients' functional outcomes at three months, our model demonstrated strong performance with an AUC of 0.903. For example, previous studies have reported AUCs of 0.87 [44] for random forest models and 0.858 [45] for Cox proportional hazards regression when evaluating mRS at three months. Additionally, in predicting mortality three months after admission, a random forest model achieved an AUC of 0.909, while for morbidity at three months, the AUC was 0.738 [46]. Other studies examining 3-month functional outcomes using various models, such as logistic regression (AUC = 0.764) and random forest (AUC = 0.757) [47], show comparable results. Our model's performance is thus comparable or superior to these reported results. Detailed comparisons are provided in Supplementary Appendix S4. A comparison of these studies [[48], [49], [50], [51]] revealed that, despite differing predictive objectives, the NIHSS score was consistently identified as a crucial predictor of outcome in patients with cerebral infarction. Additionally, our model highlighted the pivotal role of the neutrophil-to-lymphocyte ratio and monocyte ratio, which are readily accessible and widely utilized metrics in clinical practice, aligning well with the immediate outcome prediction focus of our study. This emphasizes the importance of considering multiple factors when predicting the discharge outcomes of ischemic stroke patients, as they collectively provide a more accurate and reliable assessment. Our model surpasses traditional scoring systems, showcasing the value of employing a comprehensive approach in predicting patient outcomes.
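The NIHSS-based outcome indicator referred to above follows the definition given in the Methods: a favourable discharge outcome is an NIHSS score of ≤1 at discharge or a decrease of ≥8 points from the admission score. A minimal sketch of that labelling rule is shown below; the function name and the example scores are hypothetical.

```python
def is_favourable(nihss_admission, nihss_discharge):
    """Apply the study's stated definition of a favourable discharge outcome.

    Favourable: discharge NIHSS <= 1, or a drop of >= 8 points from admission.
    """
    return nihss_discharge <= 1 or (nihss_admission - nihss_discharge) >= 8


# Hypothetical (admission, discharge) score pairs, for illustration only.
examples = [(4, 1), (15, 6), (12, 4), (10, 5), (3, 2)]
outcome_labels = [
    "favourable" if is_favourable(a, d) else "unfavorable" for a, d in examples
]
print(outcome_labels)
```

The model's positive class, an unfavorable discharge outcome, is then simply the complement of this rule.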

ML techniques are often described as a “black box” with little explanation of how predictions are generated. This can lead to reluctance among clinicians to use them, as they hesitate to make medical decisions based on opaque information. This brings another advantage of our study: we utilize the SHAP method to explain the “black box” of the ML model. The SHAP method provides both global explanations, describing the overall functionality of the model, and local explanations, detailing how specific predictions for individual patients are made based on personalized input data. This explanatory capability offers clinicians greater transparency and trust, enabling them to confidently apply ML models in medical decision-making. Moreover, with a convenient tool based on the Streamlit framework, this prediction model can be used on the webpage and shared with more clinicians.

Our final model performed well in internal validation with AUC values of 0.903. These results indicate that our model has high predictive accuracy and reliability. Accurately predicting discharge outcomes is crucial for clinical physicians to make timely interventions and treatment decisions in patients with ischemic stroke. Previous studies have mainly focused on long-term outcomes such as 90-day mRS scores [52] and one-year recurrence rates [53]. The NIHSS score is a standardised tool for assessing the severity and prognosis of stroke patients, covering multiple functional areas of the nervous system. From a research perspective, accurately predicting changes in NIHSS scores can also help to assess the effectiveness of different treatment strategies and compare the efficacy of various clinical interventions. By analysing the accuracy and reliability of prediction models, researchers can provide stronger evidence for future clinical practice and promote personalisation and precision in the treatment of stroke patients. Therefore, constructing a predictive model for discharge outcomes in patients with ischemic stroke holds important clinical implications. This model enables healthcare professionals to initiate targeted interventions in a timely manner, such as implementing proactive rehabilitation plans or specific pharmacological treatments, to enhance patient recovery and prevent complications. Additionally, it provides a more accurate estimation of prognosis, facilitating shared decision-making between clinicians and patients and ensuring appropriate healthcare planning.

We acknowledge several limitations in this study. Firstly, the complex pathophysiological mechanisms and multiple etiologies of ischemic stroke make it unclear whether our model performs well in predicting various types of ischemic stroke. Secondly, our model was developed using data from the population of Guangdong Province, China, and its applicability to national or global populations is uncertain. We have performed internal validation on the ischemic stroke population, which provides some evidence for the generalisability of our results, but further evaluation of the applicability of the model is needed. Thirdly, although machine learning techniques require a large amount of data to build predictive models, there are currently no standards for calculating the sample size required for machine learning-based predictive models. However, the good performance in cross-validation indicates that the sample size and the design of internal validation provide sufficient power for exploring and constructing predictive models for discharge outcomes in the ischemic stroke population. Fourthly, although ischemic stroke can be classified into multiple subtypes in clinical practice, the limited sample size in this study precludes separate prediction and modeling for each subtype of ischemic stroke.

In conclusion, we have successfully developed an interpretable ML model to predict the discharge outcomes of ischemic stroke patients, based on easily extractable clinical data from electronic medical records systems. The final RF model demonstrated good predictive capability for discharge outcomes in internal validations.

Ethics and consent

This study was approved by the Medical Research Ethics Committee of Guangdong Provincial People's Hospital on March 30, 2021 (Ethics code:KY-Z-2020-387-04), and the study protocol was approved by the Ethical Review Boards of all the participating centers. Prior to the commencement of data collection, all relevant parties, such as patients and healthcare providers, were informed about the purpose and nature of the research. All patients provided written informed consent to participate in the study and for their data to be published. Any potential ethical concerns or conflicts have been proactively identified and addressed to ensure the integrity and credibility of this research.

Sources of funding

This research was supported by the National Natural Science Foundation of China (82002235), the Key Project of Medicine Discipline of Guangzhou (Grant Number 2021–2023-11) and the Basic Research Project of Key Laboratory of Guangzhou (No.202102100001).

Data and code Availability

Data associated with this study have not been deposited into a publicly available repository. The authors do not have permission to share data.

CRediT authorship contribution statement

Yuancheng He: Writing – review & editing, Writing – original draft, Validation, Supervision, Resources, Project administration, Funding acquisition, Formal analysis, Data curation, Conceptualization. Xiaojuan Zhang: Writing – original draft, Supervision, Software, Resources, Methodology, Investigation, Formal analysis. Yuexin Mei: Validation, Supervision, Software, Project administration, Methodology, Investigation, Conceptualization. Deng Qianyun: Visualization, Supervision, Resources, Methodology, Investigation. Xiuqing Zhang: Writing – review & editing, Resources, Project administration, Funding acquisition, Formal analysis. Yuehua Chen: Writing – original draft, Software, Project administration. Jie Li: Validation, Funding acquisition, Conceptualization. zhou Meng: Project administration, Investigation, Conceptualization. Yuehong Wei: Writing – review & editing, Supervision, Software, Project administration, Investigation, Formal analysis, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We are grateful to the Guangdong Provincial People's Hospital for the provision of data and to the following sources of funding: the National Natural Science Foundation of China (82002235), the Key Project of Medicine Discipline of Guangzhou (Grant Number 2021–2023-11) and the Basic Research Project of Key Laboratory of Guangzhou (No.202102100001).

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.heliyon.2024.e37179.

Contributor Information

Yuancheng He, Email: heyuancheng2022@163.com.

Xiaojuan Zhang, Email: zhangxj67@mail2.sysu.edu.cn.

Yuexin Mei, Email: meiyx5@mail2.sysu.edu.cn.

Deng Qianyun, Email: dengqianyun@gdph.org.cn.

Xiuqing Zhang, Email: zhang_xiuqing@gibh.ac.cn.

Yuehua Chen, Email: 15507468258@163.com.

Jie Li, Email: ljieecho@163.com.

zhou Meng, Email: 649587540@qq.com.

Yuehong Wei, Email: wei_yh0928@163.com.

Appendix A. Supplementary data

The following is the Supplementary data to this article:

Multimedia component 1
mmc1.pdf (718.6KB, pdf)

References

  • 1. Roth G.A., Forouzanfar M.H., Moran A.E., et al. Demographic and epidemiologic drivers of global cardiovascular mortality. N. Engl. J. Med. 2015;372(14):1333–1341. doi: 10.1056/NEJMoa1406656.
  • 2. Gyedu A., Stewart B.T., Nakua E., et al. Assessment of risk of peripheral vascular disease and vascular care capacity in low- and middle-income countries. Br. J. Surg. 2016;103(1):51–59. doi: 10.1002/bjs.9956.
  • 3. Feigin V.L., Lawes C.M., Bennett D.A., et al. Stroke epidemiology: a review of population-based studies of incidence, prevalence, and case-fatality in the late 20th century. Lancet Neurol. 2003;2(1):43–53. doi: 10.1016/s1474-4422(03)00266-7.
  • 4. Perry L.A., Berge E., Bowditch J., et al. Antithrombotic treatment after stroke due to intracerebral haemorrhage. Cochrane Database Syst. Rev. 2017;5(5) doi: 10.1002/14651858.CD012144.pub2.
  • 5. Joundi R.A., Menon B.K. Thrombus composition, imaging, and outcome prediction in acute ischemic stroke. Neurology. 2021;97(20 Suppl 2):S68–S78. doi: 10.1212/WNL.0000000000012796.
  • 6. Yafasova A., Fosbol E.L., Johnsen S.P., et al. Time to thrombolysis and long-term outcomes in patients with acute ischemic stroke: a nationwide study. Stroke. 2021;52(5):1724–1732. doi: 10.1161/STROKEAHA.120.032837.
  • 7. Liu Z., Sanossian N., Starkman S., et al. Adiposity and outcome after ischemic stroke: obesity paradox for mortality and obesity parabola for favorable functional outcomes. Stroke. 2021;52(1):144–151. doi: 10.1161/STROKEAHA.119.027900.
  • 8. Kwah L.K., Diong J. National Institutes of health stroke scale (NIHSS). J. Physiother. 2014;60(1):61. doi: 10.1016/j.jphys.2013.12.012.
  • 9. Ryu W.S., Hong K.S., Jeong S.W., et al. Association of ischemic stroke onset time with presenting severity, acute progression, and long-term outcome: a cohort study. PLoS Med. 2022;19(2) doi: 10.1371/journal.pmed.1003910.
  • 10. Jiang L., Miao Z., Chen H., et al. Radiomics analysis of diffusion-weighted imaging and long-term unfavorable outcomes risk for acute stroke. Stroke. 2023;54(2):488–498. doi: 10.1161/STROKEAHA.122.040418.
  • 11. Mistry E.A., Hart K.W., Davis L.T., et al. Blood pressure management after endovascular therapy for acute ischemic stroke: the BEST-II randomized clinical trial. JAMA. 2023;330(9):821–831. doi: 10.1001/jama.2023.14330.
  • 12. Yao Z., Mao C., Ke Z., et al. An explainable machine learning model for predicting the outcome of ischemic stroke after mechanical thrombectomy. J Neurointerv Surg. 2023;15(11):1136–1141. doi: 10.1136/jnis-2022-019598.
  • 13. Li L., Han Z., Wang R., et al. Association of admission neutrophil serine proteinases levels with the outcomes of acute ischemic stroke: a prospective cohort study. J. Neuroinflammation. 2023;20(1):70. doi: 10.1186/s12974-023-02758-1.
  • 14. Mastrorilli D., Mezzetto L., D'Oria M., et al. National Institutes of Health stroke scale score at admission can predict functional outcomes in patients with ischemic stroke undergoing carotid endarterectomy. J. Vasc. Surg. 2022;75(5):1661–1669. doi: 10.1016/j.jvs.2021.11.079.
  • 15. Kumar A., Roy I., Bosch P.R., et al. Medicare claim-based national Institutes of health stroke scale to predict 30-day mortality and hospital readmission. J. Gen. Intern. Med. 2022;37(11):2719–2726. doi: 10.1007/s11606-021-07162-0.
  • 16. Fernandez-Lozano C., Hervella P., Mato-Abad V., et al. Random forest-based prediction of stroke outcome. Sci. Rep. 2021;11(1) doi: 10.1038/s41598-021-89434-7.
  • 17. Yang Y., Huang X., Wang Y., et al. The impact of triglyceride-glucose index on ischemic stroke: a systematic review and meta-analysis. Cardiovasc. Diabetol. 2023;22(1):2. doi: 10.1186/s12933-022-01732-0.
  • 18. Gong P., Liu Y., Gong Y., et al. The association of neutrophil to lymphocyte ratio, platelet to lymphocyte ratio, and lymphocyte to monocyte ratio with post-thrombolysis early neurological outcomes in patients with acute ischemic stroke. J. Neuroinflammation. 2021;18(1):51. doi: 10.1186/s12974-021-02090-6.
  • 19. Tao J., Xie X., Luo M., et al. Identification of key biomarkers in ischemic stroke: single-cell sequencing and weighted co-expression network analysis. Aging (Albany NY). 2023;15(13):6346–6360. doi: 10.18632/aging.204855.
  • 20. Qin C., Yang S., Chu Y.H., et al. Correction To: signaling pathways involved in ischemic stroke: molecular mechanisms and therapeutic interventions. Signal Transduct Target Ther. 2022;7(1):278. doi: 10.1038/s41392-022-01129-1.
  • 21. Zhang H., Yang G., Dong A. Prediction model between serum vitamin D and neurological deficit in cerebral infarction patients based on machine learning. Comput. Math. Methods Med. 2022;2022 doi: 10.1155/2022/2914484.
  • 22. Sung S.F., Lin C.Y., Hu Y.H. EMR-based phenotyping of ischemic stroke using supervised machine learning and text mining techniques. IEEE J Biomed Health Inform. 2020;24(10):2922–2931. doi: 10.1109/JBHI.2020.2976931.
  • 23. Heo J., Yoon J.G., Park H., et al. Machine learning-based model for prediction of outcomes in acute stroke. Stroke. 2019;50(5):1263–1265. doi: 10.1161/STROKEAHA.118.024293.
  • 24. Castonguay A.C., Zoghi Z., Zaidat O.O., et al. Predicting functional outcome using 24-hour post-treatment characteristics: application of machine learning algorithms in the STRATIS registry. Ann. Neurol. 2023;93(1):40–49. doi: 10.1002/ana.26528.
  • 25. Jo H., Kim C., Gwon D., et al. Combining clinical and imaging data for predicting functional outcomes after acute ischemic stroke: an automated machine learning approach. Sci. Rep. 2023;13(1) doi: 10.1038/s41598-023-44201-8.
  • 26. Jabal M.S., Joly O., Kallmes D., et al. Interpretable machine learning modeling for ischemic stroke outcome prediction. Front. Neurol. 2022;13 doi: 10.3389/fneur.2022.884693.
  • 27. Lundberg S.M., Lee S.I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017.
  • 28. Warner J.J., Harrington R.A., Sacco R.L., et al. Guidelines for the early management of patients with acute ischemic stroke: 2019 update to the 2018 guidelines for the early management of acute ischemic stroke. Stroke. 2019;50(12):3331–3332. doi: 10.1161/STROKEAHA.119.027708.
  • 29. Waseem H., Salih Y.A., Burney C.P., et al. Efficacy and safety of the telestroke drip-and-stay model: a systematic review and meta-analysis. J. Stroke Cerebrovasc. Dis. 2021;30(4) doi: 10.1016/j.jstrokecerebrovasdis.2021.105638.
  • 30. Feil K., Berndt M.T., Wunderlich S., et al. Endovascular thrombectomy for basilar artery occlusion stroke: analysis of the German Stroke Registry-Endovascular Treatment. Eur. J. Neurol. 2023;30(5):1293–1302. doi: 10.1111/ene.15694.
  • 31. Hu J., Xu J., Li M., et al. Identification and validation of an explainable prediction model of acute kidney injury with prognostic implications in critically ill children: a prospective multicenter cohort study. EClinicalMedicine. 2024;68 doi: 10.1016/j.eclinm.2023.102409.
  • 32. DeLong E.R., DeLong D.M., Clarke-Pearson D.L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–845.
  • 33. Goecks J., Jalili V., Heiser L.M., et al. How machine learning will transform biomedicine. Cell. 2020;181(1):92–101. doi: 10.1016/j.cell.2020.03.022.
  • 34. Kang M.W., Kim J., Kim D.K., et al. Machine learning algorithm to predict mortality in patients undergoing continuous renal replacement therapy. Crit. Care. 2020;24(1):42. doi: 10.1186/s13054-020-2752-7.
  • 35. Potash E., Ghani R., Walsh J., et al. Validation of a machine learning model to predict childhood lead poisoning. JAMA Netw. Open. 2020;3(9) doi: 10.1001/jamanetworkopen.2020.12734.
  • 36. Zhang X., Yue P., Zhang J., et al. A novel machine learning model and a public online prediction platform for prediction of post-ERCP-cholecystitis (PEC). EClinicalMedicine. 2022;48:101431. doi: 10.1016/j.eclinm.2022.101431.
  • 37. Xu X., Nie S., Zhang A., et al. A new criterion for pediatric AKI based on the reference change value of serum creatinine. J. Am. Soc. Nephrol. 2018;29(9):2432–2442. doi: 10.1681/ASN.2018010090.
  • 38. Ramos-Pachon A., Lopez-Cancio E., Bustamante A., et al. D-dimer as predictor of large vessel occlusion in acute ischemic stroke. Stroke. 2021;52(3):852–858. doi: 10.1161/STROKEAHA.120.031657.
  • 39. Hayon Y., Shai E., Varon D., et al. The role of platelets and their microparticles in rehabilitation of ischemic brain tissue. CNS Neurol. Disord.: Drug Targets. 2012;11(7):921–925. doi: 10.2174/1871527311201070921.
  • 40. Han D., Liu H., Gao Y. The role of peripheral monocytes and macrophages in ischemic stroke. Neurol. Sci. 2020;41(12):3589–3607. doi: 10.1007/s10072-020-04777-9.
  • 41. Zhao Y., Zhang X., Chen X., et al. Neuronal injuries in cerebral infarction and ischemic stroke: from mechanisms to treatment. Int. J. Mol. Med. 2022;49(2) doi: 10.3892/ijmm.2021.5070.
  • 42. Shi H., Ge Y., Wang H., et al. Fasting blood glucose and risk of Stroke: a Dose-Response meta-analysis. Clin Nutr. 2021;40(5):3296–3304. doi: 10.1016/j.clnu.2020.10.054.
  • 43. Juli C., Heryaman H., Nazir A., et al. The lymphocyte depletion in patients with acute ischemic stroke associated with poor neurologic outcome. Int. J. Gen. Med. 2021;14:1843–1851. doi: 10.2147/IJGM.S308325.
  • 44. Yao Z., Mao C., Ke Z., et al. An explainable machine learning model for predicting the outcome of ischemic stroke after mechanical thrombectomy. J Neurointerv Surg. 2023;15(11):1136–1141. doi: 10.1136/jnis-2022-019598.
  • 45. Jiang L., Miao Z., Chen H., et al. Radiomics analysis of diffusion-weighted imaging and long-term unfavorable outcomes risk for acute stroke. Stroke. 2023;54(2):488–498. doi: 10.1161/STROKEAHA.122.040418.
  • 46. Kim D.Y., Choi K.H., Kim J.H., et al. Deep learning-based personalised outcome prediction after acute ischaemic stroke. J. Neurol. Neurosurg. Psychiatry. 2023;94(5):369–378. doi: 10.1136/jnnp-2022-330230.
  • 47. Jo H., Kim C., Gwon D., et al. Combining clinical and imaging data for predicting functional outcomes after acute ischemic stroke: an automated machine learning approach. Sci. Rep. 2023;13(1) doi: 10.1038/s41598-023-44201-8.
  • 48. Jo H., Kim C., Gwon D., et al. Combining clinical and imaging data for predicting functional outcomes after acute ischemic stroke: an automated machine learning approach. Sci. Rep. 2023;13(1) doi: 10.1038/s41598-023-44201-8.
  • 49. Yao Z., Mao C., Ke Z., et al. An explainable machine learning model for predicting the outcome of ischemic stroke after mechanical thrombectomy. J Neurointerv Surg. 2023;15(11):1136–1141. doi: 10.1136/jnis-2022-019598.
  • 50. Jiang L., Miao Z., Chen H., et al. Radiomics analysis of diffusion-weighted imaging and long-term unfavorable outcomes risk for acute stroke. Stroke. 2023;54(2):488–498. doi: 10.1161/STROKEAHA.122.040418.
  • 51. Kim D.Y., Choi K.H., Kim J.H., et al. Deep learning-based personalised outcome prediction after acute ischaemic stroke. J. Neurol. Neurosurg. Psychiatry. 2023;94(5):369–378. doi: 10.1136/jnnp-2022-330230.
  • 52. Zhang M.Y., Mlynash M., Sainani K.L., et al. Ordinal prediction model of 90-day modified Rankin scale in ischemic stroke. Front. Neurol. 2021;12 doi: 10.3389/fneur.2021.727171.
  • 53. Wu S., Shi Y., Wang C., et al. Glycated hemoglobin independently predicts stroke recurrence within one year after acute first-ever non-cardioembolic strokes onset in A Chinese cohort study. PLoS One. 2013;8(11) doi: 10.1371/journal.pone.0080690.


Articles from Heliyon are provided here courtesy of Elsevier
