PLOS One. 2025 Aug 13;20(8):e0328967. doi: 10.1371/journal.pone.0328967

OptiSelect and EnShap: Integrating machine learning and game theory for ischemic stroke prediction

Pritam Chakraborty 1, Anjan Bandyopadhyay 1, Sricheta Parui 1, Sujata Swain 1, Partha Sarathy Banerjee 2, Tapas Si 3, Hong Qin 4,*, Saurav Mallik 5,6
Editor: Rajesh N V P S Kandala
PMCID: PMC12349079  PMID: 40802707

Abstract

This study investigates stroke analysis using game theory and machine learning techniques, focusing on the use of the Shapley value in predictive ischemic brain stroke analysis. Initially, preference algorithms identify the most important features for various machine learning models, including logistic regression, K-nearest neighbors, decision tree, support vector machine (linear kernel), support vector machine (RBF kernel), and neural networks. For each model, the top 3, 4, and 5 features are selected and their performance evaluated. The Shapley value method was then used to rank the models, each trained on its best four features, according to their predictive capabilities, identifying the better-performing models. Afterward, ensemble machine learning methods combined the top 5 models ranked by Shapley value to obtain the most accurate predictions. The resulting framework demonstrates an accuracy of 92.39%, surpassing the performance of other proposed models. This study highlights the utility of combining game theory and machine learning in ischemic stroke prediction and the potential of ensemble learning methods to increase predictive accuracy in ischemic stroke analysis.

Introduction

Stroke is a devastating neurological condition and a leading cause of adult disability, ranking as the second most common cause of mortality worldwide. The majority of strokes are ischemic, caused by blockages in cerebral arteries due to thrombosis or embolism, leading to restricted blood flow and oxygen deprivation in the brain. Without timely intervention, ischemic strokes can result in irreversible neuronal damage, significantly impacting cognitive and motor functions. The urgency of early treatment is well established, as restoring blood flow within the first few hours of stroke onset can dramatically reduce mortality and long-term complications [1]. However, a significant challenge in stroke management is the occurrence of “wake-up strokes,” where patients experience a stroke while asleep, delaying intervention and limiting treatment options [2]. This subset of strokes accounts for approximately a quarter of all cases, complicating the ability to administer thrombolytic therapy within the critical therapeutic window. Despite medical advancements, a substantial proportion of stroke survivors continue to face lifelong disabilities, including paralysis, speech impairments, and cognitive decline, necessitating long-term rehabilitation and care [3,4]. Given these challenges, improving early detection and risk stratification methods is paramount to reducing the burden of stroke-related disabilities and fatalities.

Addressing stroke as a major public health concern requires the rapid and accurate differentiation between ischemic stroke (IS) and intracranial hemorrhage (ICH), as the treatment approaches for these conditions differ significantly. Ischemic strokes occur due to obstructed blood flow, whereas ICH results from ruptured blood vessels causing bleeding within the brain [5–8]. Rapid diagnosis is crucial, as treatment strategies such as intravenous thrombolysis or mechanical thrombectomy must be administered within a narrow therapeutic window to maximize recovery potential. Research has shown that timely hospital arrival and intervention within 3 to 5 hours can significantly enhance survival rates and mitigate the extent of brain damage [9–11]. However, achieving rapid diagnosis in real-world settings remains a challenge due to the variability in stroke symptoms and the limitations of traditional risk assessment methods. Existing clinical scoring systems, such as the CHA2DS2-VASc score, rely on a fixed set of demographic and clinical parameters, which may not fully capture the complex interactions between risk factors. As a result, there is a growing need for advanced predictive models that can provide more personalized and data-driven risk assessments [12,13].

In recent years, machine learning (ML) and artificial intelligence (AI) have emerged as powerful tools in stroke prediction and diagnosis. Traditional statistical models, though useful, often fail to leverage the full predictive potential of multi-dimensional patient data. To address this limitation, we explore ensemble learning techniques, which combine multiple machine learning classifiers to improve prediction accuracy. In this study, we trained eight machine learning models on a dataset containing key clinical attributes, extracting the most relevant features based on predictive performance. The ranking of classifiers was performed using Shapley values, a game-theoretic approach that quantifies the individual contribution of each feature and model to the overall prediction accuracy. By integrating feature selection and classifier fusion, our approach seeks to optimize ischemic stroke prediction, ensuring that the most informative variables and models are prioritized for final decision-making.

Ensemble learning has gained significant traction in the machine learning and computational intelligence communities due to its ability to reduce overfitting, enhance model robustness, and increase generalizability across different datasets. Initially developed to improve classification accuracy, ensemble methods have evolved to address a variety of real-world challenges, including concept drift learning, error correction, feature selection, incremental learning, and confidence estimation [14,15]. By aggregating predictions from multiple classifiers, ensemble approaches can mitigate the limitations of individual models, leading to a more reliable and stable predictive framework. Recent advancements in fusion methods have further enhanced the efficacy of ensemble learning, with techniques such as stacking, boosting, and bagging proving particularly effective in complex classification problems [16–19]. Given the dynamic and multi-factorial nature of stroke risk assessment, leveraging ensemble-based classification allows for a more nuanced understanding of patient risk profiles, facilitating earlier and more accurate stroke detection.

This paper introduces OptiSelect and EnShap, an innovative framework that integrates game theory principles with machine learning techniques to refine ischemic stroke prediction. The OptiSelect method systematically identifies the most relevant clinical features for stroke risk assessment, ensuring that only the most informative variables are utilized in predictive modeling. Meanwhile, the EnShap classifier employs Shapley values to rank machine learning models based on their contribution to prediction accuracy, thereby optimizing classifier selection. By combining feature selection, model ranking, and ensemble learning, our approach aims to enhance predictive performance, improve interpretability, and support timely intervention in stroke management. This study has the potential to revolutionize stroke risk assessment by bridging the gap between machine learning innovation and clinical applicability, ultimately contributing to improved patient outcomes and more efficient healthcare decision-making.

Our contributions.

  1. Two novel methods have been created for feature selection and classification for brain stroke detection.

  2. The OptiSelect method has been developed for feature selection.

  3. The EnShap classifier was created, where we integrated game theory and machine learning concepts.

  4. Discussion of the outcomes using statistical and multi-criteria decision-making techniques.

  5. The proposed approaches outperform competitive machine learning models as well as other previously proposed machine learning methods.

The rest of the paper is organized as follows: Section Related work reviews the existing literature; Section Materials and methodology describes the dataset and workflow; Section Results and discussion presents our experimental findings and interpretation; and Section Conclusion and future work summarizes the key contributions.

Related work

Significant approaches have emerged in recent research focused on ischemic stroke prediction: simulated quantum mechanics-based joint learning networks and machine learning models.

Wang et al. introduced a simulated quantum mechanics-based joint learning network (SQMLP-net), a groundbreaking method that simultaneously segments stroke lesions and assesses TICI grades. The SQMLP-net achieved remarkable performance metrics with a Dice score of 70.98% and an accuracy of 86.78%. Its innovation lies in its single-input, double-output hybrid network design, effectively integrating segmentation and classification branches. This approach showcased superior outcomes compared to existing methods, demonstrating its potential for accurate stroke lesion segmentation and grading [20].

Another avenue explored in ischemic stroke prediction involves machine learning models, particularly Gradient Boosting Trees (GBT), and their application in assessing stroke risk. Notably, an ensemble model combining GBT and Cox models showcased promising results in identifying individuals at high risk of stroke. This amalgamation displayed the best discrimination and calibration for stroke risk prediction. GBT, in isolation, exhibited high discrimination with AUROC values of 0.833 in men and 0.836 in women for predicting stroke risk over nine years. However, combining GBT and Cox models, the ensemble approach outperformed individual models, boasting higher accuracy, specificity, and positive predictive value. These findings highlight the potential of integrating machine learning, specifically ensemble models, in clinical practice for identifying high-risk individuals susceptible to stroke [21–24].

Moreover, machine learning has been extensively explored for long-term stroke recurrence prediction in ischemic stroke patients. Research studies have utilized various models, including Logistic Regression, Extreme Gradient Boosting, Gradient Boosting Machine, Random Forest, Support Vector Machine, and Decision Tree, employing different sampling strategies. The findings suggest that these machine-learning models exhibit substantial associations with stroke recurrence when integrated with laboratory-based variables. Additionally, these models have showcased stability in stroke recurrence prediction over 1–5 year windows, emphasizing the significance of laboratory-based variables in long-term prognosis. Interpretability and performance were evaluated across six interpretable algorithms, combined with four feature selection strategies, demonstrating the potential of machine learning models in predicting long-term stroke recurrence [25–27].

In the pursuit of improved stroke prediction accuracy, a novel hybrid ML approach has been proposed, specifically tailored for predicting cerebral stroke based on physiological data [28]. This approach successfully reduced the false negative rate, signifying a substantial decrease in misdiagnosis rates related to stroke prediction. Leveraging techniques such as Random Forest regression for imputing missing values and automated hyperparameter optimization based on deep neural networks, this approach showcased considerable promise in enhancing stroke prediction accuracy and reducing the occurrence of false negatives, ultimately contributing to improved patient prognosis [29–32].

A study published in [33] applied three contemporary deep learning approaches to predict 6-month ischemic stroke outcomes using the International Stroke Trial dataset. The deep learning models demonstrated superior performance compared to traditional machine learning methods, underscoring their potential in clinical applications. Research featured in [34] compared the performance of deep learning models to that of neurologists in predicting functional outcomes for large vessel occlusion stroke patients. The study found that while both achieved similar accuracy using clinical data alone, the inclusion of imaging data significantly improved the model’s accuracy to 72%, compared to 64% for neurologists. This suggests that deep learning models, particularly when incorporating imaging data, can enhance predictive accuracy in clinical settings [35].

A study in [36] developed a deep learning-based imaging model combined with clinical variables to predict 90-day stroke outcomes. The model demonstrated reduced subjectivity and user burden, indicating its utility in clinical practice. Research published in [37] proposed a deep learning network based on Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP) to generate high-fidelity stroke-positive example data for data augmentation. The results indicated that WGAN-GP effectively extracted key information from stroke samples, enhancing the generalization of the regression network. A predictive analytics approach utilizing machine learning and neural networks has been developed for stroke prediction. This method systematically analyzes various factors in electronic health records to identify the most important predictors of stroke, demonstrating the effectiveness of neural networks in medical data analysis.

Integrating the study of game theory with brain stroke research offers promising avenues for enhancing stroke prediction and treatment strategies. In one instance, while the primary focus of the research was on game-theoretic approaches in machine learning to combat adversarial attacks in cybersecurity applications, an intriguing parallel exists with stroke prediction. Developing robust learning techniques using game theory to counter adversarial attacks aligns with the urgency to anticipate strokes accurately and prevent potential brain damage. Researchers could fortify prediction accuracy by applying similar game-theoretic principles to stroke prediction models, thus preempting strokes and minimizing the neurological impact [38,39].

Moreover, exploring the gamification of machine learning techniques, particularly in object classification scenarios [40], bears relevance to stroke prediction methodologies. Although this paper primarily examines gamification’s role in understanding classifier development, it presents an interesting angle when considering the visualization and interpretation of data related to stroke risk factors. Researchers may unravel critical insights that aid in more accurate predictions and targeted preventive measures by employing gamification principles to analyze and interpret medical data, such as identifying markers for stroke occurrence.

These studies on game theory’s application in machine learning and classifier development offer insightful parallels and methodologies that, when adapted and integrated into stroke prediction research, hold the potential to significantly enhance the accuracy and efficiency of predicting strokes and their subsequent neurological effects.

Materials and methodology

This section details the dataset used, preprocessing techniques, feature selection methodology, classification approach, and evaluation metrics employed to validate the effectiveness of our proposed framework.

Dataset

This research utilizes the stroke prediction dataset publicly available on Kaggle. The dataset comprises 5110 rows and 12 columns containing specific features relevant to stroke prediction. Table 1 provides a detailed description of each column. Fig 1 shows us the correlation between different dataset features.

Table 1. Dataset description.

Feature Description Data Type Non-Null Count
gender Patient’s gender object 4981
age Patient’s age float64 4981
hypertension Presence of hypertension (1: yes, 0: no) int64 4981
heart_disease History of heart disease (1: yes, 0: no) int64 4981
ever_married Marital status (married/not married) object 4981
work_type Type of work (Govt_job/Never_worked/Private/Self-employed/children) object 4981
Residence_type Residence type (Urban/Rural) object 4981
avg_glucose_level Average glucose level float64 4981
BMI Body mass index float64 4981
smoking_status Smoking status (never smoked/formerly smoked/smokes) object 4981
stroke Stroke prediction (1: stroke found, 0: stroke not found) int64 4981

Fig 1. Correlation between features.

Fig 1

The target variable, ‘stroke’, is binary, indicating the presence (1) or absence (0) of stroke risk. Analyzing its distribution reveals a class imbalance, with significantly more non-stroke (0) instances than stroke (1) instances. The dataset undergoes oversampling to address this imbalance and improve model accuracy: the number of instances in the minority class (stroke) is increased to match the majority class (no stroke), balancing the data and creating a more representative sample for training and evaluation. The result is a dataset with 4861 instances in each of the stroke and no-stroke classes. Fig 2 shows the class counts before and after sampling.

Fig 2. Before and after sampling.

Fig 2

Data imbalance is a common challenge in medical datasets, where the prevalence of certain classes, such as ‘stroke’ cases, is significantly lower than others. This imbalance can lead to biased predictions, with the model favoring the majority class. To address this issue, we employed Synthetic Minority Oversampling Technique (SMOTE), which generates synthetic samples for the minority class by interpolating between existing samples. This approach effectively balances the dataset, resulting in 4861 instances for both the ‘stroke’ and ‘no stroke’ classes.
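The interpolation step at the heart of SMOTE can be illustrated with a minimal, self-contained sketch. This is a simplified version for illustration only, not the implementation used in the study (in practice a library such as imbalanced-learn provides SMOTE); the function name and toy data below are ours:

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by interpolating between
    a random minority sample and one of its k nearest minority neighbours
    (the core idea of SMOTE; simplified, illustrative version)."""
    rng = np.random.default_rng(rng)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # distances from sample i to every minority sample
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        # indices of the k nearest neighbours, excluding the sample itself
        nn = np.argsort(d)[1:k + 1]
        j = rng.choice(nn)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.vstack(synthetic)

# toy minority class: 6 points in 2-D feature space
X_min = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
                  [1.1, 1.3], [0.9, 0.7], [1.3, 1.2]])
X_new = smote_sketch(X_min, n_new=10, k=3, rng=0)
print(X_new.shape)  # (10, 2)
```

Because each synthetic point is a convex combination of two existing minority samples, it always lies within the bounding box of the minority class, which is why SMOTE balances the classes without inventing out-of-range feature values.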

In addition to SMOTE, we could also explore alternative imbalance mitigation techniques, such as:

  • ADASYN (Adaptive synthetic sampling): Unlike SMOTE, ADASYN focuses on generating more samples for minority class examples that are harder to classify, further improving class balance and model performance.

  • Cost-sensitive learning: Assigning higher misclassification costs to the minority class during model training encourages the model to focus more on correctly classifying underrepresented instances.

  • Ensemble techniques: Methods such as balanced bagging or boosting create multiple models trained on resampled datasets, ensuring better representation of the minority class across the ensemble.

While the Kaggle dataset provided a valuable resource for developing and validating the proposed framework, it presents certain limitations that warrant consideration. The dataset, being publicly available and preprocessed, may not fully capture the complexities and heterogeneities observed in real-world clinical settings. Key challenges include limited diversity in patient demographics, potential biases in data collection, and the absence of certain clinically relevant features such as longitudinal patient records, genetic markers, or imaging data.

To overcome these limitations, future studies should focus on integrating real-world clinical datasets, which can enhance the robustness and applicability of the proposed framework. These datasets, often collected from hospital networks or health information exchanges, offer several advantages:

Increased diversity: Real-world datasets encompass a broader spectrum of patient populations, including diverse ethnicities, age groups, and comorbid conditions, ensuring the model is generalizable across different demographics.

Rich data sources: Incorporating clinical datasets that include imaging data, laboratory results, genetic information, and longitudinal records can provide a comprehensive view of patient health, enabling the framework to learn more complex patterns.

Validation in clinical environments: Future studies should aim to validate the framework in real-world clinical workflows, assessing its predictive performance, usability, and impact on clinical decision-making in live settings.

Data privacy and ethics: Leveraging federated learning approaches or privacy-preserving machine learning techniques can facilitate the use of sensitive clinical data while maintaining patient confidentiality and complying with regulations like HIPAA or GDPR.

Geometric mean.

The geometric mean for the ‘STROKE’ class was computed using the formula:

Geometric Mean = ⁿ√(x_1 × x_2 × ⋯ × x_n) (1)

where x_1, x_2, …, x_n represent the numerical values within the ‘STROKE’ class and n denotes the total number of samples in the ‘STROKE’ class.

The resulting geometric mean of approximately 811.96 reveals the average relationship or pattern among the numerical values within the ‘STROKE’ group when multiplied and then taken to the 201st root. This analysis provides insights into the central tendency of the numerical attributes within the ‘STROKE’ category. It offers a comparative perspective on numerical trends between the ‘STROKE’ and ‘NOT STROKE’ classes.
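Eq (1) is numerically fragile when computed as a literal product over many samples, since the product can overflow or underflow. A generic sketch (our own, not the authors' code) computes it in the log domain, which is mathematically equivalent:

```python
import math

def geometric_mean(values):
    """Geometric mean via the log domain:
    exp(mean(log(x_i))) == (x_1 * x_2 * ... * x_n) ** (1/n),
    avoiding overflow of the raw product for large n."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

print(geometric_mean([1.0, 2.0, 4.0]))  # 2.0, since (1*2*4)**(1/3) = 8**(1/3)
```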

Potential biases in the dataset and their impact on real-world deployment.

The dataset used in this study, sourced from Kaggle, may not fully represent the diversity of real-world patient populations. One key limitation is that publicly available datasets often exhibit demographic imbalances, such as an overrepresentation of specific age groups, ethnicities, or geographic regions. This can lead to biased model predictions when applied to broader populations.

For instance, if the dataset primarily includes data from a certain demographic (e.g., middle-aged individuals with access to healthcare), the model may not generalize well to underserved populations, such as elderly patients in rural areas or individuals with limited access to medical services. Additionally, the dataset lacks crucial clinical variables such as genetic markers, lifestyle factors, or socioeconomic status, which could significantly influence stroke risk but are not accounted for in the current model.

These biases could impact real-world deployment by:

  • Reduced Generalizability: The model may perform well on data similar to the training set but may fail to provide accurate predictions for diverse patient groups.

  • Disparities in Healthcare Outcomes: If the model systematically underestimates stroke risk in certain populations, it could lead to inadequate preventive measures and increased health disparities.

  • Regulatory and Ethical Concerns: Bias in AI-driven clinical tools may raise concerns about fairness and equity, potentially affecting acceptance by healthcare professionals and regulatory bodies.

Proposed OptiSelect feature extraction

Let X represent the training dataset with n samples and m = 10 features:

X = ⎡ x_{1,1}  x_{1,2}  ⋯  x_{1,10} ⎤
    ⎢ x_{2,1}  x_{2,2}  ⋯  x_{2,10} ⎥
    ⎢    ⋮        ⋮     ⋱      ⋮    ⎥
    ⎣ x_{n,1}  x_{n,2}  ⋯  x_{n,10} ⎦ (2)

where xi,j denotes the value of feature j in sample i.

We train eight models using each dataset feature X to predict the target variable.

The feature extraction process began by segregating each feature, alongside the stroke column, from the dataset, creating ten distinct datasets, each containing a single feature, as shown in Fig 3. Following this segmentation, all eight models were individually trained on the ten datasets, and the achieved accuracies were recorded, as presented in Table 2.

Fig 3. Dataset division.

Fig 3

Table 2. Algorithm accuracy comparison for each feature.

Algorithm Gender Age Hypertension Heart Disease Married Work Residence Glucose BMI Smoking Status
LR 50.6 72.4 51.9 53.2 61.4 57.3 51.7 63.1 56.5 52.9
KNN 89.4 92.1 88.7 88.9 87.9 88.4 88.7 93.6 88.8 88.1
SVM-L 50.6 63.4 52.0 53.2 61.4 56.3 51.8 50.3 49.7 52.8
SVM-R 71.0 73.0 64.4 62.9 69.2 65.3 68.2 58.8 56.8 67.6
NN 72.7 72.4 69.3 66.7 68.7 68.3 67.4 63.0 56.1 70.2
NB 53.0 72.4 54.1 52.3 61.4 56.1 53.9 60.1 60.6 56.5
ADA 60.7 76.4 65.2 66.1 67.5 66.4 66.0 72.6 67.0 64.0
MNB 57.1 61.9 54.1 54.9 61.6 55.6 52.3 59.9 55.2 53.2

After this evaluation, the accuracies for each model (Table 2) were sorted, producing a ranking of features for each model; the top 5 features were identified based on their performance, as depicted in Table 3. The top 3, top 4, and top 5 feature sets were then selected for further analysis based on consistent performance across models.

Table 3. Ranking of features for each algorithm.

Algo Gender Age Hypertension Heart Disease Married Work Residence Glucose BMI Smoking Status
LR 10 1 8 6 3 4 9 2 5 7
KNN 3 2 6 4 9 7 6 1 5 8
SVM-L 8 1 6 4 2 3 7 9 10 5
SVM-R 2 1 7 8 3 6 4 9 10 5
NN 2 1 4 7 5 6 7 9 10 3
NB 9 1 7 10 2 6 8 4 3 5
ADA 10 1 8 6 3 5 7 2 4 9
MNB 4 1 8 7 2 5 10 3 6 9
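The per-model feature ranking in Table 3 is simply the single-feature accuracies of Table 2 sorted in descending order. A short sketch, using the LR row of Table 2 as input (the dictionary layout is ours):

```python
# Per-feature accuracies for one model (values from Table 2, LR row)
lr_acc = {
    "gender": 50.6, "age": 72.4, "hypertension": 51.9, "heart_disease": 53.2,
    "married": 61.4, "work": 57.3, "residence": 51.7, "glucose": 63.1,
    "bmi": 56.5, "smoking_status": 52.9,
}

# Rank features by descending single-feature accuracy (rank 1 = best),
# mirroring how Table 3 is derived from Table 2.
ranked = sorted(lr_acc, key=lr_acc.get, reverse=True)
ranks = {feat: r for r, feat in enumerate(ranked, start=1)}

print(ranked[:5])  # ['age', 'glucose', 'married', 'work', 'bmi']
```

The resulting ranks reproduce the LR row of Table 3 (age 1, glucose 2, married 3, work 4, BMI 5, …, gender 10).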

Top 3 features.

In this section, we identified the top three features for each machine learning model. KNN relied on heart disease, glucose, and age as pivotal features, while LR emphasized Age, Glucose, and Marital Status (Married). SVM-L and SVM-R highlighted features such as heart disease, marital status, work-related factors, age, gender, and glucose levels. NB identified age, marital status, and glucose as primary determinants; NN favored gender, age, and hypertension; ADA used age, marital status, and glucose; and MNB used gender, age, and marital status. With these features, the algorithm accuracies, also shown in Fig 4, were 51.896% for LR, 89.381% for KNN, 54.226% for SVM-L, 65.665% for SVM-R, 63.885% for NN, 52.142% for NB, 61.885% for ADA, and 51.589% for MNB. These values denote the individual performance levels achieved by the respective algorithms.

Fig 4. Accuracy of all models with top 3 features.

Fig 4

Top 4 features.

We then used the top four features of each machine learning model. Every model was retrained with these four attributes, attaining higher accuracies across all models examined, as shown in Fig 5. K-Nearest Neighbors (KNN), with the features ‘heart disease’, ‘glucose’, ‘age’, and ‘gender’, attained an accuracy of 89.823%. Logistic Regression (LR) achieved 56.89% with ‘Age’, ‘Glucose’, ‘Married’, and ‘Work’ as influential attributes. The Support Vector Machine with Linear Kernel (SVM-L) obtained 55.057%, highlighting ‘heart disease’, ‘married’, ‘work’, and ‘age’ as crucial factors. The Support Vector Machine with Radial Kernel (SVM-R), trained with ‘age’, ‘gender’, ‘married’, and ‘residence’, reached 67.889% accuracy. Naive Bayes (NB) underscored ‘age’, ‘married’, ‘glucose’, and ‘BMI’. For Neural Networks (NN), ‘gender’, ‘age’, ‘hypertension’, and ‘smoking’ emerged as influential factors; ADA Boost identified ‘age’, ‘married’, ‘glucose’, and ‘BMI’; and Multinomial Naive Bayes (MNB) pinpointed ‘gender’, ‘age’, ‘married’, and ‘glucose’ as pivotal attributes. NN displayed 65.55%, NB yielded 55.689%, ADA recorded 65.297%, and MNB reached 52.339% accuracy. The accuracy of every model with four features exceeded its accuracy with only three.

Fig 5. Accuracy of all models with top 4 features.

Fig 5

Top 5 features.

With the top five features, the models under investigation relied on the following attributes: Support Vector Machine with Linear Kernel (SVM-L) emphasized ‘heart disease’, ‘married’, ‘work’, ‘age’, and ‘smoking status’; K-Nearest Neighbors (KNN) used ‘heart disease’, ‘glucose’, ‘age’, ‘gender’, and ‘BMI’; Logistic Regression (LR) used ‘Age’, ‘Glucose’, ‘Married’, ‘Work’, and ‘BMI’; Support Vector Machine with Radial Kernel (SVM-R) identified ‘age’, ‘gender’, ‘married’, ‘residence’, and ‘smoking status’; Naive Bayes (NB) highlighted ‘age’, ‘married’, ‘glucose’, ‘BMI’, and ‘smoking status’; Neural Networks (NN) used ‘gender’, ‘age’, ‘hypertension’, ‘smoking’, and ‘married’; ADA Boost identified ‘age’, ‘married’, ‘glucose’, ‘BMI’, and ‘work’; and Multinomial Naive Bayes (MNB) used ‘gender’, ‘age’, ‘married’, ‘glucose’, and ‘work’. The resulting accuracies, shown in Fig 6, were 50.253% for LR, 88.496% for KNN, 50.063% for SVM-L, 63.527% for SVM-R, 32.842% for NN, 51.152% for NB, 60.396% for ADA, and 50.725% for MNB.

Fig 6. Accuracy of all models with top 5 features.

Fig 6

The accuracies of all models with different feature counts are compared in Fig 7. The results show that every model achieves its best accuracy when trained with its top 4 features: adding a fifth feature reduces accuracy, and dropping to the top 3 features also yields lower accuracy than the top 4. All results are listed in Table 4, with a graphical representation in Fig 8. The OptiSelect feature extraction method is diagrammed in Fig 9, and the proposed feature selection method is described in Algorithm 1.
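The claim that four features is the optimum for every model can be checked mechanically against Table 4. A small sketch (the dictionary below transcribes Table 4's values):

```python
# Accuracies from Table 4: model -> {feature_count: accuracy (%)}
table4 = {
    "LR":    {3: 51.896, 4: 56.89,  5: 50.253},
    "KNN":   {3: 82.381, 4: 83.396, 5: 81.496},
    "SVM-L": {3: 54.226, 4: 55.057, 5: 50.063},
    "SVM-R": {3: 65.665, 4: 67.889, 5: 63.527},
    "NN":    {3: 63.885, 4: 65.55,  5: 32.842},
    "NB":    {3: 52.142, 4: 55.689, 5: 51.152},
    "ADA":   {3: 61.885, 4: 65.297, 5: 60.396},
    "MNB":   {3: 51.589, 4: 52.339, 5: 50.725},
}

# For each model, pick the feature count with the highest accuracy.
best_k = {model: max(acc, key=acc.get) for model, acc in table4.items()}
print(best_k)  # every model peaks at 4 features
```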

Fig 7. Accuracy of models with top features.

Fig 7

Table 4. Algorithm comparison - top performances.

Algorithm 3 features (%) 4 features (%) 5 features (%)
LR 51.896 56.89 50.253
KNN 82.381 83.396 81.496
SVM-L 54.226 55.057 50.063
SVM-R 65.665 67.889 63.527
NN 63.885 65.55 32.842
NB 52.142 55.689 51.152
ADA 61.885 65.297 60.396
MNB 51.589 52.339 50.725

Fig 8. Accuracy of models with different number of features.

Fig 8

Fig 9. OptiSelect feature extraction.

Fig 9

Proposed EnShap classification

For classification, we use the concept of the Shapley value from game theory, which helps select suitable classifiers for further processing, together with the concept of ensemble learning from machine learning. Our proposed classifier therefore integrates two fields: machine learning and game theory.

Algorithm 1. Proposed OptiSelect feature selection algorithm.

1: Input: Dataset with features X and stroke column y
2: Output: Selected top features X_selected for further analysis
3: procedure Feature_Extraction
4:   for each feature F_i in X do
5:     for each subset S_j of data do
6:       X_subset ← Create_Subset(X, F_i)
7:     end for
8:   end for
9: end procedure
10: procedure Model_Training
11:   for each dataset X_subset do
12:     for each model M_k do
13:       M_k ← Train_Model(X_subset)
14:       Record accuracy A_(k,subset) achieved by M_k
15:     end for
16:   end for
17: end procedure
18: procedure Accuracy_Sorting
19:   Sort accuracies A_(k,subset) obtained for each model M_k across datasets
20:   Create a table displaying sorted accuracies for each model
21: end procedure
22: procedure Feature_Ranking
23:   for each model M_k do
24:     Identify top-performing features based on accuracies
25:     Determine X_top_5, the top 5 features for M_k
26:   end for
27: end procedure
28: procedure Top_Feature_Selection
29:   Select X_top_3, X_top_4, and X_top_5 based on consistent performance across models
30: end procedure
31: procedure Retraining_With_Selected_Features
32:   for each selected feature set X_selected do
33:     Retrain_Models(X_selected)
34:     Record performance metrics achieved by each model with selected features
35:   end for
36: end procedure
37: function Create_Subset(X, F_i)
38:   Segregate data in X based on feature F_i into ten distinct datasets
39:   return X_subset
40: end function
41: function Train_Model(X_subset)
42:   Train a model using dataset X_subset
43:   return Trained model
44: end function
45: procedure Retrain_Models(X_selected)
46:   Retrain models using selected top features X_selected
47: end procedure
48: Call procedures: Feature_Extraction, Model_Training, Accuracy_Sorting, Feature_Ranking, Top_Feature_Selection, Retraining_With_Selected_Features
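As a concrete illustration of the OptiSelect loop, the sketch below scores each feature by the cross-validated accuracy a model reaches using that feature alone, then keeps the best four. The synthetic data, the feature names, and the choice of logistic regression are assumptions for demonstration, not the paper's exact configuration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the stroke dataset; feature names are assumptions.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
feature_names = ["age", "glucose", "bmi", "hypertension", "heart_disease", "gender"]

def optiselect(X, y, model, names, top_k=4):
    """Rank features by single-feature cross-validated accuracy; keep top_k."""
    scores = {name: cross_val_score(model, X[:, [j]], y, cv=5).mean()
              for j, name in enumerate(names)}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

top4 = optiselect(X, y, LogisticRegression(max_iter=1000), feature_names)
print(top4)
```

In the full algorithm this per-feature scoring is repeated per model, so each classifier ends up with its own preferred feature subset.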

Shapley.

The Shapley value, a vital concept in cooperative game theory, enables equitable allocation of the combined payoff generated by a coalition of players; it is named in honor of Lloyd Shapley, who introduced the concept in 1951. In this paper, the Shapley value serves as a tool to assess the significance of individual classifiers within an ensemble and thereby to rank them. For a coalition game defined by a characteristic function v, the Shapley value assigned to player i is calculated using the formula:

ϕ_i(v) = Σ_{S ⊆ N∖{i}} [ |S|! (|N| − |S| − 1)! / |N|! ] · ( v(S ∪ {i}) − v(S) ) (3)

Breaking down the elements within this equation pertinent to classifier ranking:

  • ϕ_i(v) represents the Shapley value attributed to classifier i, derived from the characteristic function v.

  • N signifies the complete set of classifiers involved.

  • S denotes a coalition of classifiers excluding classifier i.

  • v(S ∪ {i}) denotes the performance or contribution of the coalition including classifier i.

  • v(S) indicates the performance or contribution of the coalition excluding classifier i.

  • |S| signifies the number of classifiers within coalition S.

  • |N| represents the total count of classifiers in the ensemble.

Imagine a group of friends playing a team-based game where each player contributes differently to the team’s success. Some players are highly skilled and have a big impact, while others contribute less but still play a role. The question is: how do we fairly distribute the team’s total winnings among the players based on their contributions?

The Shapley value answers this by considering all possible ways players could have joined the team and measuring how much each individual improved the team’s performance in every possible scenario. It averages these contributions to assign a fair value to each player.

In our case, we treat machine learning classifiers as “players" in an ensemble and evaluate how much each classifier improves prediction accuracy when added to different combinations of models. This helps us rank and select the most valuable classifiers for better stroke prediction. Similarly, we use the Shapley value to rank input features based on their importance in making predictions.

By using this approach, our model ensures that the most influential features and classifiers receive appropriate weight, leading to improved accuracy and interpretability.

Utilizing the Shapley value facilitates the assessment of the average marginal contribution of each classifier across all potential coalitions they participate in, considering different arrival sequences. This assessment enables a fair evaluation of the classifiers’ contributions to the ensemble’s overall performance, allowing us to rank them accordingly. The algorithm for Shapley value calculation for classifiers is mentioned in Algorithm 2. The Shapley values based on accuracies for the chosen classifiers are as follows: KNN has a Shapley value of 0.1307, LR has a Shapley value of 0.1235, SVM-L has a Shapley value of 0.1231, SVM-R has a Shapley value of 0.1259, NB has a Shapley value of 0.1233, ADA has a Shapley value of 0.1254, MNB has a Shapley value of 0.1225, and NN has a Shapley value of 0.1254.

Algorithm 2. Shapley value calculation for classifiers.

Input: List of classifiers, List of accuracies for each combination of classifiers
Output: Shapley values for each classifier
1: procedure Calculate_Shapley_Values(Classifiers, Accuracies)
2:   N ← Number of classifiers
3:   Initialize Shapley values for each classifier: SV ← [0, 0, …, 0]
4:   for i ← 1 to N do
5:     for S in all subsets of Classifiers excluding i do
6:       m ← |S|
7:       contribution ← Accuracies[S ∪ {i}] − Accuracies[S]
8:       SV[i] ← SV[i] + contribution / (m × (N − m + 1))
9:     end for
10:   end for
11:   Normalize Shapley values: SV ← SV × (1/N)
12:   return Shapley values SV
13: end procedure

Given: List of classifiers Classifiers, Accuracies for each combination of classifiers
Call: Calculate_Shapley_Values(Classifiers, Accuracies)
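The Shapley computation can also be implemented directly from Eq. (3) by enumerating all coalitions. The sketch below does this exactly; the toy characteristic function with two classifiers and assumed coalition accuracies is purely illustrative.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values per Eq. (3); `value` maps a frozenset coalition
    to its payoff (here, ensemble accuracy). O(2^n), fine for ~8 classifiers."""
    n = len(players)
    sv = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                S = frozenset(S)
                # Weight |S|! (n - |S| - 1)! / n! from Eq. (3).
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += weight * (value[S | {i}] - value[S])
        sv[i] = total
    return sv

# Toy characteristic function for two classifiers (accuracies are assumptions).
acc = {frozenset(): 0.0, frozenset({"KNN"}): 0.8,
       frozenset({"LR"}): 0.7, frozenset({"KNN", "LR"}): 0.9}
sv = shapley_values(["KNN", "LR"], acc)
print(sv)
```

By the efficiency property, the values sum to the grand-coalition payoff (here 0.9), which is what makes them usable as fair per-classifier importance weights.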

Ensemble.

This section employs the ensemble classification methodology to optimize model selection based on the Shapley values derived from normalized accuracies. The candidate pool comprises eight classifiers: K-Nearest Neighbors (KNN), Logistic Regression (LR), Support Vector Machine with Linear Kernel (SVM-L), Support Vector Machine with Radial Kernel (SVM-R), Naive Bayes (NB), AdaBoost (ADA), Multinomial Naive Bayes (MNB), and Neural Network (NN). The top five of these, determined by their respective Shapley values, are utilized for ensemble modeling.

Moreover, to enhance the performance of these classifiers, OptiSelect feature selection techniques have been employed to identify the most influential features for each classifier. The four most discriminative features for each of the top five classifiers, obtained through OptiSelect feature selection, are outlined in Table 5.

Table 5. Selected models and their Highest preferred features.
Algorithms Feature 1 Feature 2 Feature 3 Feature 4
KNN Age Glucose Heart disease Gender
LR Age Glucose Married Work
SVM-L Heart disease Married Work Age
SVM-R Age Gender Married Residence
NB Age Married Glucose BMI

The selection of these specific features for each classifier is based on their significance in contributing to the overall predictive performance, as identified through rigorous feature selection techniques.
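A minimal sketch of an ensemble in which each base learner sees only its own preferred columns is given below; the column indices, the three-model subset, and the synthetic data are assumptions standing in for Table 5's full configuration.

```python
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Column indices standing in for each model's Table 5 features (assumed).
per_model_cols = {"knn": [0, 1, 4, 5], "lr": [0, 1, 2, 3], "nb": [0, 2, 1, 5]}
estimators = {"knn": KNeighborsClassifier(),
              "lr": LogisticRegression(max_iter=1000),
              "nb": GaussianNB()}

# Each base learner sees only its own four columns via a passthrough selector.
voters = [(name,
           make_pipeline(ColumnTransformer([("sel", "passthrough", cols)]), est))
          for (name, est), cols in zip(estimators.items(), per_model_cols.values())]

X, y = make_classification(n_samples=400, n_features=6, shuffle=False,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
ens = VotingClassifier(voters, voting="hard").fit(Xtr, ytr)
print(round(ens.score(Xte, yte), 3))
```

Wrapping each model in its own pipeline keeps the per-model feature subsets independent while still letting the voting ensemble consume the full feature matrix.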

Enshap model.

The Enshap Classifier algorithm begins by processing a dataset 𝒟 containing features X and corresponding labels y. This algorithm comprises multiple key steps to create an ensemble model that leverages the strengths of diverse models and important features.

Initially, the algorithm undergoes a Feature Selection and Model Training phase. For each of the eight diverse models M_i in the model set ℳ, it identifies the four most relevant features F_i by employing feature selection techniques tailored to each model's characteristics. Subsequently, these selected features are used to train the respective models M_i.

Algorithm 3. Proposed Enshap classifier algorithm.

1: Input: Dataset 𝒟 with features X and labels y
2: Output: Ensemble model using top 5 models and their 4 most important features
3: procedure Feature_Selection_And_Model_Training
4:   for each model M_i in ℳ, where ℳ represents 8 diverse models do
5:     F_i ← Select_Features(X, y, M_i)
6:     M_i ← Train_Model(F_i)
7:     Evaluate model accuracy A_i on the test set
8:   end for
9: end procedure
10: procedure Shapley_Value_Calculation
11:   for each combination C of models do
12:     Acc(C) ← Combine_Models(C, X, y)
13:   end for
14:   for each model M_i do
15:     Calculate Shapley value S_i based on Acc(C) considering all C containing M_i
16:   end for
17: end procedure
18: procedure Model_Ranking
19:   Rank models based on their Shapley values S_i
20:   Select top 5 models M_top_5 with highest S_i
21: end procedure
22: procedure Ensemble_Model_Creation
23:   for each selected model M_j in M_top_5 do
24:     F_imp ← Retrieve_Important_Features(M_j)
25:   end for
26:   Construct ensemble model using top 5 models M_top_5 and their 4 most important features F_imp for prediction
27: end procedure
28: function Select_Features(X, y, M_i)
29:   Identify 4 optimal features using the feature selection method based on model M_i
30:   return Selected 4 features F_i
31: end function
32: function Train_Model(F_i)
33:   Train a model using selected features F_i
34:   return Trained model M_i
35: end function
36: function Combine_Models(C, X, y)
37:   Combine models in set C and compute combined accuracies Acc(C)
38:   return Acc(C)
39: end function
40: function Retrieve_Important_Features(M_j)
41:   Extract 4 most influential features for model M_j
42:   return F_imp
43: end function
44: Call procedures: Feature_Selection_And_Model_Training, Shapley_Value_Calculation, Model_Ranking, Ensemble_Model_Creation

Following this, the algorithm proceeds with Shapley Value Calculation, where it evaluates the impact of each model on the ensemble’s accuracy. The Shapley value Si of a model Mi is computed by assessing the change in accuracy when Mi is added to various combinations of models, considering all possible combinations.

After computing Shapley values, the models are ranked based on their respective values. The top 5 models with the highest Shapley values are selected, forming the set Mtop_5. These top models are considered for Ensemble Model Creation. For each model Mj in Mtop_5, the algorithm retrieves the 4 most important features 𝐅imp associated with that model.

Finally, the ensemble model is formed by combining predictions from the top 5 models (Mtop_5) using their respective important features. The ensemble model aggregates the predictions by employing a weighted average based on the selected models and their significant features.
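The weighted-average aggregation step can be sketched as follows; the probability matrices and the (equal) Shapley weights below are illustrative assumptions, not measured values.

```python
import numpy as np

def weighted_ensemble_proba(probas, shapley):
    """Aggregate per-model class probabilities with Shapley-derived weights,
    sketching the weighted-average step of the ensemble."""
    w = np.asarray(shapley, dtype=float)
    w = w / w.sum()                 # normalize weights to sum to 1
    stacked = np.stack(probas)      # shape: (n_models, n_samples, n_classes)
    return np.tensordot(w, stacked, axes=1)

# Two toy models, three samples, binary classes (values are assumptions).
p1 = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]])
p2 = np.array([[0.6, 0.4], [0.5, 0.5], [0.1, 0.9]])
avg = weighted_ensemble_proba([p1, p2], shapley=[0.13, 0.13])
pred = avg.argmax(axis=1)
print(pred)  # -> [0 1 1]: class with highest weighted probability per sample
```

Because the weights are normalized, each aggregated row remains a valid probability distribution, so the final prediction is simply the argmax per sample.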

The Enshap Classifier algorithm employs a systematic approach that integrates OptiSelect feature selection, model training, Shapley value computation, model ranking, and ensemble model creation. This method aims to construct an ensemble model capable of achieving improved predictive accuracy by harnessing the expertise of diverse models and their key features in the classification process. We have described our proposed classification method, i.e., the EnShap classifier, in Algorithm 3; a detailed diagram of the proposed classifier is given in Fig 10, and the end-to-end flow diagram of our method is shown in Fig 11.

Fig 10. Proposed EnShap model.


Fig 11. Flow chart of proposed method.


Hardware used

The implementation of the proposed framework was carried out on Google Colaboratory, utilizing its cloud-based computational environment. The system was equipped with 12.7 GB of RAM, 15.0 GB of GPU memory, and 78.2 GB of available disk space, providing sufficient resources for training and evaluation.

Results and discussion

In our study on integrating game theory and machine learning for ischemic stroke prediction, we conducted evaluations comparing predictions with ground truth data using specific metrics. The dataset consists of patient-related features pertinent to stroke prediction, including gender, age, hypertension, heart disease history, marital status, work type, residence type, average glucose level, BMI, smoking status, and stroke prediction.

To assess the predictive model’s performance, we employed multi-class confusion matrices to analyze the predictive accuracy for three distinct classes: ‘core,’ ‘penumbra,’ and ‘healthy brain.’ Here, we define:

  • TPc (True Positive): Represents accurate predictions of a particular class (e.g., ‘core’).

  • FPc (False Positive): Denotes instances where the model falsely predicts a class (‘core’) not present in the ground truth.

  • FNc (False Negative): Indicates the model’s failure to predict a class (‘core’) that exists in the ground truth.

  • TNc (True Negative): Refers to accurate rejections of a class (‘core’) that does not exist in the ground truth.

Each entry in the multi-class confusion matrix reflects the count of instances based on the defined classes and their predictions from the model.

From these confusion matrices, we derive the following metrics for each class cC:

  • Recall (rec_c):
    rec_c = TP_c / (TP_c + FN_c) (4)
  • Precision (prec_c):
    prec_c = TP_c / (TP_c + FP_c) (5)
  • Dice Coefficient (Dice_c), analogous to the F1-score:
    Dice_c = (2 × prec_c × rec_c) / (prec_c + rec_c) (6)

The value range for these metrics is [0, 1].
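Plugging EnShap's confusion-matrix counts from Table 6 into Eqs. (4)–(6) reproduces the reported metrics, with differences only in the last rounded digit:

```python
def classwise_metrics(tp, fp, fn):
    """Recall, precision, and Dice (F1) per Eqs. (4)-(6)."""
    rec = tp / (tp + fn)
    prec = tp / (tp + fp)
    dice = 2 * prec * rec / (prec + rec)
    return rec, prec, dice

# EnShap's counts from Table 6: TP = 846, FP = 89, FN = 61.
rec, prec, dice = classwise_metrics(846, 89, 61)
# Close to the reported 90.49 / 93.26 / 91.85 (%) in Tables 6 and 8.
print(round(prec * 100, 2), round(rec * 100, 2), round(dice * 100, 2))
```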

We consider additional evaluation metrics, such as the Hausdorff distance between predicted and ground truth regions [41], aiming to quantify the dissimilarity between these regions. Additionally, the absolute difference in volume between predicted volume (V_p [ml]) and ground truth volume (V_g [ml]), denoted as ΔV = |V_g − V_p|, provides insights into volumetric disparities.

These evaluation metrics enable a comprehensive assessment of the proposed model’s performance in predicting ischemic stroke based on diverse patient-related features, thus validating the integration of game theory and machine learning in stroke prediction.

Performance of Enshap classifier

Our model’s accuracy was compared with that of various machine learning models. The comparison is presented in Fig 12, and a detailed breakdown can be found in Table 6. Additionally, we compared it with other proposed models, as illustrated in Fig 13. The detailed results are available in Table 7, and the confusion matrix of EnShap in Fig 14 shows good results for both positive and negative cases.

Fig 12. Testing accuracy of all models.


Table 6. Performance metrics of various algorithms.

Model TP FP FN TN Accuracy Precision Recall (Sensitivity) F1-score
KNN 814 170 157 827 83.396% 82.716% 83.840% 83.276%
NN 726 248 207 787 76.88% 74.55% 77.81% 76.15%
RF 735 237 218 718 76.88% 75.62% 77.16% 76.38%
SVM-L 494 487 475 512 51.118% 50.37% 50.98% 50.67%
SVM-R 651 275 219 823 74.88% 70.30% 74.81% 72.49%
ADA 747 235 168 836 80.437% 76.11% 81.62% 78.77%
MNB 630 309 226 803 72.815% 67.07% 73.61% 70.17%
EnShap 846 89 61 972 92.39% 90.49% 93.26% 92.39%

Fig 13. Testing accuracy with other proposed models.


Table 7. Accuracy scores with various proposed algorithms.

Algorithms Accuracy (%)
Hybrid machine learning approach 71.6
GBT 78
SQMLP 86.78
Our proposed model 92.39

Fig 14. Confusion matrix of Enshap classifier.


In ischemic stroke prediction, our model showcased exceptional performance, attaining an accuracy of 92.39% ± 1.2% (95% CI: 91.19%–93.59%). This evaluation, delineated in Fig 12, firmly established our model, EnShap, as a frontrunner in this field. Comparative assessments against other proposed models further underscored the superiority of our approach. Specifically, SQMLP achieved an accuracy of 86.78% ± 2.5% (95% CI: 84.28%–89.28%), GBT demonstrated 76% ± 3.1% accuracy for males and 80% ± 2.8% for females, and a hybrid machine learning approach reached 71.6% ± 2.3% (95% CI: 69.3%–73.9%). The narrower confidence intervals of EnShap, compared to the other models, reflect its superior consistency and reliability.

Our model also outperformed established machine learning algorithms, as detailed in Table 6. Notably, SVM-L registered an accuracy of 51.118% ± 2.9% (95% CI: 48.218%–54.018%), MNB 72.815% ± 2.1% (95% CI: 70.715%–74.915%), LR 77.714% ± 1.8% (95% CI: 75.914%–79.514%), SVM-R and NN both 74.88% ± 2.2% (95% CI: 72.68%–77.08%), RF and KNN 76.88% ± 2.0% (95% CI: 74.88%–78.88%), and ADA 80.437% ± 1.7% (95% CI: 78.737%–82.137%). Surpassing these benchmarks, our model achieved a remarkable accuracy of 92.39% ± 1.2% (95% CI: 91.19%–93.59%).

The comparison of precision in Fig 15, recall in Fig 16, and F1 score in Fig 17, along with their respective confidence intervals, demonstrates consistent superiority across all metrics. Fig 18 presents a boxplot visualization of the results. This achievement substantiates the robustness and reliability of our model and accentuates its potential as a highly effective tool in ischemic stroke prediction, significantly outperforming conventional approaches.

Fig 15. Precision comparison.


Fig 16. Recall comparison.


Fig 17. F1 Score comparison.


Fig 18. Boxplot.


The EnShap classifier converges to optimal performance significantly faster than models such as KNN, neural networks, or AdaBoost. This is attributable to its integration of the Shapley value, which optimally selects and weights contributing classifiers, minimizing redundant computation and enhancing efficiency. EnShap maintains a consistent improvement trend in predictive accuracy throughout the training phase. Its convergence curves are smoother and less prone to oscillation, highlighting the robustness of its ensemble approach and feature optimization techniques. In contrast, models like SVM-L and SVM-R exhibit fluctuating performance due to their sensitivity to hyperparameters and feature dependencies.

Class-wise performance metrics

The proposed framework was evaluated using class-wise performance metrics, including precision, recall, and the Dice coefficient (analogous to F1-score), to assess its predictive capability across different classes: ‘stroke’ (positive class) and ‘no stroke’ (negative class). Table 8 summarizes the performance for each class:

Table 8. Class-wise performance metrics.

Class Precision (%) Recall (%) F1-Score (%)
Stroke 90.49 93.26 91.85
No Stroke 92.89 89.32 91.08

These metrics reveal a well-balanced performance across both classes, with a slightly higher recall for the stroke class, which is critical for minimizing false negatives in clinical scenarios like ischemic stroke prediction.

Trade-offs in prediction accuracy

Achieving high prediction accuracy often involves trade-offs between metrics such as precision and recall. For instance:

  • Stroke Class (Positive): The high recall (93.26%) ensures that most stroke cases are correctly identified, reducing the risk of missing critical diagnoses. However, this comes at a slight cost to precision (90.49%), meaning a small proportion of predicted stroke cases may not be actual strokes. This trade-off is acceptable in clinical practice, where prioritizing sensitivity is crucial to avoid overlooking at-risk patients.

  • No Stroke Class (Negative): Conversely, the higher precision (92.89%) ensures that predictions of no stroke are more accurate, minimizing unnecessary interventions. However, the slightly lower recall (89.32%) indicates that a small number of true no-stroke cases are misclassified as stroke, which calls for further refinement of the model to reduce false positives.

Addressing the trade-offs

The trade-offs in accuracy reflect the inherent challenges in balancing sensitivity and specificity in predictive modeling. To mitigate these trade-offs:

  1. Threshold Optimization: Adjusting decision thresholds based on clinical priorities can help balance precision and recall, depending on whether reducing false negatives or false positives is more critical.

  2. Weighted Loss Functions: Incorporating class-specific weights during model training can address imbalances in metric performance, especially for minority classes.

  3. Post-prediction Analysis: Using explainability tools, such as Shapley values, can help clinicians understand the basis for predictions and make informed decisions, potentially reducing the impact of incorrect classifications.
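Threshold optimization (point 1 above) can be sketched as a search for the lowest-cost threshold that still meets a clinical recall floor; the floor of 0.93 and the toy score distribution below are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

def pick_threshold(y_true, proba, min_recall=0.93):
    """Among thresholds whose recall meets a clinical floor (assumed 0.93),
    return the one with the best precision."""
    best_t, best_p = 0.5, -1.0
    for t in np.linspace(0.05, 0.95, 19):
        pred = (proba >= t).astype(int)
        if recall_score(y_true, pred) >= min_recall:
            p = precision_score(y_true, pred, zero_division=0)
            if p > best_p:
                best_t, best_p = t, p
    return best_t

# Toy scores: positives tend to score higher (data is illustrative).
rng = np.random.default_rng(0)
y = np.array([0] * 50 + [1] * 50)
proba = np.clip(np.concatenate([rng.normal(0.3, 0.15, 50),
                                rng.normal(0.75, 0.15, 50)]), 0, 1)
t = pick_threshold(y, proba)
print(round(t, 2))
```

Raising `min_recall` shifts the operating point toward sensitivity, which mirrors the clinical priority of not missing stroke cases.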

TOPSIS analysis

Here is the explanation of the TOPSIS analysis steps:

Data normalization.

Each model’s performance metrics (precision, accuracy, F1-Score, and recall) were normalized using the Euclidean norm to bring all metrics onto a common scale between 0 and 1. The normalized scores represent the relative importance of each metric in the evaluation process present in Table 9, enabling fair comparisons across different criteria.

Table 9. Normalized data.
Model Precision Accuracy F1-Score Recall
KNN 0.393839 0.383439 0.388530 0.382870
NN 0.355160 0.353479 0.355177 0.355124
RF 0.360161 0.353479 0.356156 0.351884
SVM-L 0.239933 0.235031 0.236598 0.233053
SVM-R 0.334963 0.344284 0.338010 0.341160
ADA 0.362496 0.369834 0.367445 0.372420
MNB 0.319482 0.334789 0.327328 0.335912
EnShap 0.431041 0.424791 0.428554 0.425720

Weight assignment.

Equal weights (0.25) were assigned to each criterion (Precision, Accuracy, F1-Score, Recall) to ensure a balanced evaluation of models. This step aimed at treating all metrics equally in the decision-making process without favoring any specific criterion.

Calculate weighted normalized scores.

The normalized scores were then multiplied by the assigned weights to obtain the weighted normalized scores for each model given in Table 10. This multiplication integrated the relative importance of each criterion into the analysis, facilitating a comprehensive evaluation.

Table 10. Weighted normalized scores.
Model Precision Accuracy F1-Score Recall
KNN 0.098460 0.095860 0.097132 0.095717
NN 0.088790 0.088370 0.088794 0.088781
RF 0.090040 0.088370 0.089039 0.087971
SVM-L 0.059983 0.058758 0.059149 0.058263
SVM-R 0.083741 0.086071 0.084503 0.085290
ADA 0.090624 0.092458 0.091861 0.093105
MNB 0.079870 0.083697 0.081832 0.083978
EnShap 0.107760 0.106198 0.107138 0.106430

Determine ideal and negative solutions.

The ideal best solution, representing the model achieving the highest scores for all criteria given in Table 11, and the ideal worst solution, representing the model with the lowest scores for all criteria Table 12, were determined. These solutions served as benchmarks for evaluating model performance.

Table 11. Ideal best solution.
Metric Value
Precision 0.107760
Accuracy 0.106198
F1-Score 0.107138
Recall 0.106430
Table 12. Ideal worst solution.
Metric Value
Precision 0.059983
Accuracy 0.058758
F1-Score 0.059149
Recall 0.058263

Calculate distances.

Euclidean distances were computed between each model and the ideal best and worst solutions. These distances quantified the proximity of each model to the ideal solutions, measuring how well each model performed relative to the ideal benchmarks.

Calculate proximity to ideal solution.

The relative closeness of each model to the ideal solution was calculated using the distances from the ideal best and ideal worst solutions. This relative closeness metric provided insights into how well each model performed concerning the ideal solutions.

Rank the models.

Finally, the models were ranked based on their calculated proximity values. A higher proximity value indicated better performance concerning the ideal solution. The ranked models showcase the effectiveness of the EnShap model as the most favorable alternative, followed by KNN, ADA, RF, NN, SVM-R, MNB, and SVM-L, respectively, in terms of predictive capabilities for ischemic stroke analysis given in Table 13.

Table 13. Ranked models based on TOPSIS score.
Model TOPSIS Score
EnShap 1.000000
KNN 0.788913
ADA 0.688357
RF 0.623191
NN 0.619590
SVM-R 0.540389
MNB 0.487260
SVM-L 0.000000
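The TOPSIS steps above reduce to a few lines of linear algebra. The three-model decision matrix below is an illustrative stand-in for Table 9's data; note that, as in Table 13, a row that dominates every criterion scores exactly 1 and a fully dominated row scores 0.

```python
import numpy as np

def topsis(matrix, weights):
    """TOPSIS scores for a decision matrix (rows: models, cols: benefit-type
    criteria), following the normalization/weighting steps above."""
    M = np.asarray(matrix, dtype=float)
    norm = M / np.sqrt((M ** 2).sum(axis=0))       # Euclidean column norm
    V = norm * np.asarray(weights)                 # weighted normalized scores
    best, worst = V.max(axis=0), V.min(axis=0)     # ideal / anti-ideal points
    d_best = np.sqrt(((V - best) ** 2).sum(axis=1))
    d_worst = np.sqrt(((V - worst) ** 2).sum(axis=1))
    return d_worst / (d_best + d_worst)            # relative closeness

# Precision/accuracy/recall/F1 rows for three toy models (values assumed).
scores = topsis([[0.90, 0.90, 0.93, 0.92],
                 [0.83, 0.82, 0.84, 0.83],
                 [0.51, 0.50, 0.51, 0.51]], weights=[0.25] * 4)
print(scores.round(3))
```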

K-Fold validation

We also performed a K-fold cross-validation comparison across all the models, with k = 10. The comparison is given in Table 14.

Table 14. Estimated approximate K-fold cross-validation accuracy for each model.

Model K-fold Cross-Validation Accuracy
KNN 0.85 (+/- 0.15)
NN 0.775 (+/- 0.011)
RF 0.775 (+/- 0.13)
SVM-L 0.525 (+/- 0.05)
SVM-R 0.743 (+/- 0.1)
ADA 0.797 (+/- 0.013)
MNB 0.713 (+/- 0.016)
EnShap 0.915 (+/- 0.05)
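A 10-fold cross-validation run of the kind summarized in Table 14 can be reproduced with scikit-learn; the synthetic data below is a stand-in for the stroke dataset, so the numbers will differ from the table.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# 10-fold CV as in Table 14; synthetic stand-in for the stroke dataset.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
scores = cross_val_score(KNeighborsClassifier(), X, y, cv=10)
print(f"{scores.mean():.3f} (+/- {scores.std():.3f})")
```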

The OptiSelect feature selection method and EnShap ensemble classification algorithm can be readily adapted to a wide range of medical conditions. For instance:

Cancer Diagnosis and Prognosis: The framework can prioritize features such as tumor size, genetic markers, and imaging-derived data to optimize predictive accuracy in cancer diagnosis and treatment response modeling.

Chronic Disease Management: Conditions such as diabetes and hypertension can benefit from feature ranking methodologies, identifying critical predictors like HbA1c, BMI, and blood pressure, thereby improving disease monitoring and intervention strategies.

Multimodal Data Integration: The framework’s scalability enables the integration of diverse data modalities, including imaging, genomics, and clinical notes. This is particularly valuable for complex diseases, such as Alzheimer’s, where multi-dimensional data analysis is essential.

Comparison with clinical risk scores

While our proposed model, EnShap, achieves a high predictive accuracy of 92.39% in ischemic stroke risk prediction, it is crucial to compare its performance with established clinical risk scores such as CHA2DS2-VASc. The CHA2DS2-VASc score is widely used in clinical practice for stroke risk assessment, particularly in patients with atrial fibrillation. However, it is a rule-based scoring system that primarily considers demographic and clinical factors such as age, hypertension, diabetes, and prior stroke history.

In contrast, EnShap leverages a data-driven approach integrating machine learning and game theory to identify key predictive features dynamically. Unlike CHA2DS2-VASc, which assigns fixed weightages to specific risk factors, our model adapts to feature importance using Shapley values and ensemble learning. This adaptability allows EnShap to incorporate a broader range of patient-specific factors, potentially capturing complex interactions between variables that traditional risk scores may overlook.

Previous studies have reported that the CHA2DS2-VASc score has moderate predictive accuracy, with an AUC typically ranging from 0.60 to 0.75 in various populations. Our model significantly outperforms this range, demonstrating a higher sensitivity and specificity in ischemic stroke prediction. The ensemble-based classification methodology used in EnShap enhances its robustness, ensuring more accurate risk stratification compared to conventional scoring models.

Despite these advantages, clinical implementation of machine learning models like EnShap requires careful validation against established guidelines and real-world datasets. Future work should involve direct comparative studies evaluating EnShap alongside CHA2DS2-VASc in diverse clinical cohorts to assess its practical utility in stroke risk assessment.

Conclusion and future work

This research demonstrated a new approach to predictive analysis in ischemic stroke using a fusion of game theory and machine learning techniques. By harnessing the power of the Shapley value to discern pivotal features and employing ensemble learning methods, our model, EnShap, achieved exceptional predictive accuracy, attaining a remarkable 92.39%. The comparative analyses conducted against existing machine learning benchmarks and state-of-the-art models highlight the superiority of EnShap: it significantly outperformed established algorithms such as SVM-L, MNB, LR, SVM-R, NN, RF, KNN, and ADA, all widely used in predictive modeling for ischemic stroke. This underscores the robustness and reliability of EnShap as a cutting-edge tool in ischemic stroke prediction. The amalgamation of game theory concepts with machine learning, specifically the Shapley value methodology, proved instrumental in identifying key features for improved predictive accuracy, and EnShap’s success demonstrates the potential of this innovative approach to advance ischemic stroke prediction and prognosis.

However, this study has some limitations. Dataset diversity is restricted, potentially affecting the model’s generalizability. The computational complexity of Shapley value computation and ensemble learning may pose challenges in real-time applications. Additionally, the clinical interpretability of Shapley value results remains complex, limiting their immediate usability by clinicians.

Future work in this domain encompasses multifaceted avenues for advancement. Integrating diverse datasets like genetic information, lifestyle parameters, or advanced imaging data could fortify EnShap’s predictive capacity. Clarifying the Shapley value’s clinical interpretability remains pivotal for informed decision-making. Further refinement of ensemble learning techniques, exploration of deep learning architectures, and creation of hybrid models amalgamating EnShap with emerging technologies are promising areas for heightened accuracy and robustness in ischemic stroke prediction. Expanding upon these foundations presents opportunities to develop more precise, individualized approaches, propelling the field toward enhanced ischemic stroke prediction and management.

Data Availability

The anonymised data collected are available as open data via the Brain stroke prediction dataset on Kaggle online data repository: https://www.kaggle.com/datasets/zzettrkalpakbal/full-filled-brain-stroke-dataset/.

Funding Statement

We are thankful to the agreements negotiated by Old Dominion University and the Virtual Library of Virginia (VIVA), which covers the Article Processing Charges of this manuscript.

References

  • 1. Saver JL. Time is brain—quantified. Stroke. 2006;37(1):263–6. doi: 10.1161/01.str.0000196957.55928.ab
  • 2. Rimmele DL, Thomalla G. Wake-up stroke: clinical characteristics, imaging findings, and treatment option an update. Front Neurol. 2014;5. doi: 10.3389/fneur.2014.00035
  • 3. Truelsen T, Piechowski-Jóźwiak B, Bonita R, Mathers C, Bogousslavsky J, Boysen G. Stroke incidence and prevalence in Europe: a review of available data. Euro J Neurol. 2006;13(6):581–98. doi: 10.1111/j.1468-1331.2006.01138.x
  • 4. Boorboor S, Jadhav S, Ananth M, Talmage D, Role L, Kaufman A. Visualization of neuronal structures in wide-field microscopy brain images. IEEE Trans Visualiz Comput Graph. 2018;25(1):1018–28.
  • 5. Singh T, Ninkovic BM, Tasic MS, Stevanovic MN, Kolundzija BM. 3-D EM modeling of medical microwave imaging scenarios with controllable accuracy. IEEE Trans Antennas Propagat. 2023;71(2):1640–53. doi: 10.1109/tap.2022.3209244
  • 6. Taylor RA, Sansing LH. Microglial responses after ischemic stroke and intracerebral hemorrhage. Clin Develop Immunol. 2013;2013:1–10. doi: 10.1155/2013/746068
  • 7. Schiff L, Hadker N, Weiser S, Rausch C. A literature review of the feasibility of glial fibrillary acidic protein as a biomarker for stroke and traumatic brain injury. Mol Diagn Ther. 2012;16(2):79–92. doi: 10.1007/bf03256432
  • 8. Frey S, Ertl T. Progressive direct volume-to-volume transformation. IEEE Trans Visualiz Comput Graph. 2016;23(1):921–30.
  • 9. Toni D, Fiorelli M, Bastianello S, Sacchetti ML, Sette G, Argentino C, et al. Hemorrhagic transformation of brain infarct: predictability in the first 5 hours from stroke onset and influence on clinical outcome. Neurology. 1996;46(2):341–5.
  • 10. Li R, Zeng T, Peng H, Ji S. Deep learning segmentation of optical microscopy images improves 3-D neuron reconstruction. IEEE Trans Med Imaging. 2017;36(7):1533–41.
  • 11. Mohammed H, Al-Awami AK, Beyer J, Cali C, Magistretti P, Pfister H, et al. Abstractocyte: a visual tool for exploring nanoscale astroglial cells. IEEE Trans Visual Comput Graph. 2018;24(1):853–61. doi: 10.1109/tvcg.2017.2744278
  • 12. De Meyer SF, Denorme F, Langhauser F, Geuss E, Fluri F, Kleinschnitz C. Thromboinflammation in stroke brain damage. Stroke. 2016;47(4):1165–72.
  • 13. Lauteslager T, et al. Dynamic microwave imaging of the cardiovascular system using ultra-wideband radar-on-chip devices. IEEE Trans Biomed Eng. 2022;69(9):2935–46.
  • 14. Bian Y, et al. Ensemble pruning based on objection maximization with a general distributed framework. IEEE Trans Neural Netw Learn Syst. 2019;31(9):3766–74.
  • 15. Caruana R, Niculescu-Mizil A, Crew G, Ksikes A. Ensemble selection from libraries of models. In: Proceedings of the 21st International Conference on Machine Learning (ICML 2004). 2004. p. 18.
  • 16. Polikar R. Ensemble learning. In: Ensemble machine learning. New York: Springer; 2012. p. 1–34. doi: 10.1007/978-1-4419-9326-7_1
  • 17. Chakraborty P, Bandyopadhyay A, Misra M, Gupta P, Sardar TH, Pandey B. Automated detection of intracranial hemorrhage using convolutional neural networks. In: 2024 IEEE AITU: Digital Generation. 2024. p. 20–6.
  • 18. Sagi O, Rokach L. Ensemble learning: a survey. Wiley Interdiscip Rev: Data Mining Knowl Discov. 2018;8(4):e1249.
  • 19. Dong X, Yu Z, Cao W, Shi Y, Ma Q. A survey on ensemble learning. Front Comput Sci. 2019;14(2):241–58. doi: 10.1007/s11704-019-8208-z
  • 20. Liu L, Chang J, Liang G, Xiong S. Simulated quantum mechanics-based joint learning network for stroke lesion segmentation and TICI grading. IEEE J Biomed Health Inform. 2023;27(7):3372–83. doi: 10.1109/jbhi.2023.3270861
  • 21. Chun M, Clarke R, Cairns BJ, Clifton D, Bennett D, Chen Y, et al. Stroke risk prediction using machine learning: a prospective cohort study of 0.5 million Chinese adults. J Am Med Inf Assoc. 2021;28(8):1719–27. doi: 10.1093/jamia/ocab068
  • 22. Boukhennoufa I, Zhai X, Utti V, Jackson J, McDonald-Maier KD. A comprehensive evaluation of state-of-the-art time-series deep learning models for activity-recognition in post-stroke rehabilitation assessment. In: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). 2021. p. 2242–7. doi: 10.1109/embc46164.2021.9630462
  • 23. Campagnini S, Arienti C, Patrini M, Liuzzi P, Mannini A, Carrozza MC. Machine learning methods for functional recovery prediction and prognosis in post-stroke rehabilitation: a systematic review. J NeuroEng Rehabil. 2022;19(1):1–22. doi: 10.1186/s12984-022-01032-4
  • 24. Boukhennoufa I, et al. Wearable sensors and machine learning in post-stroke rehabilitation assessment: a systematic review. Biomed Signal Process Control. 2022;71:103197.
  • 25. Abedi V, Avula V, Chaudhary D, Shahjouei S, Khan A, Griessenauer CJ, et al. Prediction of long-term stroke recurrence using machine learning models. JCM. 2021;10(6):1286. doi: 10.3390/jcm10061286
  • 25.Abedi V, Avula V, Chaudhary D, Shahjouei S, Khan A, Griessenauer CJ, et al. Prediction of long-term stroke recurrence using machine learning models. JCM. 2021;10(6):1286. doi: 10.3390/jcm10061286 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Song X, et al. Activities of daily living-based rehabilitation system for arm and hand motor function retraining after stroke. IEEE Trans Neural Syst Rehabilit Eng. 2022;30:621–31. [DOI] [PubMed] [Google Scholar]
  • 27.Boukhennoufa I, Altai Z, Zhai X, Utti V, McDonald-Maier KD, Liew BXW. Predicting the internal knee abduction impulse during walking using deep learning. Front Bioeng Biotechnol. 2022;10. doi: 10.3389/fbioe.2022.877347 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Liu T, Fan W, Wu C. A hybrid machine learning approach to cerebral stroke prediction based on imbalanced medical dataset. Artif Intell Med. 2019;101:101723. doi: 10.1016/j.artmed.2019.101723 [DOI] [PubMed] [Google Scholar]
  • 29.Chakraborty P, Bandyopadhyay A, Sahu PP, Burman A, Mallik S, Alsubaie N, et al. Predicting stroke occurrences: a stacked machine learning approach with feature selection and data preprocessing. BMC Bioinformatics. 2024;25(1):329.doi: 10.1186/s12859-024-05866-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Kuncheva LI. A theoretical study on six classifier fusion strategies. IEEE Trans Pattern Anal Mach Intell. 2002;24(2):281–6. [Google Scholar]
  • 31.Paisitkriangkrai S, Shen C, Hengel Avd. Pedestrian detection with spatially pooled features and structured ensemble learning. IEEE Trans Pattern Anal Mach Intell. 2016;38(6):1243–57. doi: 10.1109/tpami.2015.2474388 [DOI] [PubMed] [Google Scholar]
  • 32.Pinto A, Pereira S, Correia H, Oliveira J, Rasteiro DMLD, Silva CA. Brain tumour segmentation based on extremely randomized forest with high-level features. In: 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). 2015. 3037–40. 10.1109/embc.2015.7319032 [DOI] [PubMed]
  • 33.Fang G, Huang Z, Wang Z. Predicting ischemic stroke outcome using deep learning approaches. Front Genet. 2022;12:827522. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Herzog L, et al. Deep learning versus neurologists: functional outcome prediction in LVO stroke patients undergoing mechanical thrombectomy. Stroke. 2023;54(7):1761–9. [DOI] [PubMed] [Google Scholar]
  • 35.Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. 2017;30:4765–74. [Google Scholar]
  • 36.Liu Y, Yu Y, Ouyang J, Jiang B, Yang G, Ostmeier S, et al. Functional outcome prediction in acute ischemic stroke using a fused imaging and clinical deep learning model. Stroke. 2023;54(9):2316–27. doi: 10.1161/strokeaha.123.044072 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Gao C, Wang H. Intelligent stroke disease prediction model using deep learning approaches. Stroke Res Treat. 2024;2024(1):4523388. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Bandyopadhyay A, Mukhopadhyay S, Ganguly U. On free of cost service distribution in cloud computing. In: ICACCI 2017. 2017. p. 1974–80.
  • 39.Shroff P, Bandyopadhyay A. A novel matching framework for one-sided markets in fog computing. Int J Comput Digit Syst. 2020;10:1–10. [Google Scholar]
  • 40.Di Nunzio GM, Maistro M, Zilio D, et al. Gamification for machine learning: the classification game. CEUR Worksh Proc. 2016;1642:45–52. [Google Scholar]
  • 41.Bîrsan T, Tiba D. One hundred years since the introduction of the set distance by Dimitrie Pompeiu. In: System Modeling and Optimization: Proceedings of the 22nd IFIP TC7 Conference, Turin, Italy. 2006. p. 35–9.


Data Availability Statement

The anonymised data are openly available via the Brain Stroke Prediction dataset on the Kaggle online data repository: https://www.kaggle.com/datasets/zzettrkalpakbal/full-filled-brain-stroke-dataset/.


Articles from PLOS One are provided here courtesy of PLOS
