Scientific Reports. 2025 Nov 4;15:38628. doi: 10.1038/s41598-025-22469-2

Prediction of hemorrhagic transformation in acute ischemic stroke patients using clinico-radiomics models

Yun Hwa Roh 1,#, E-Nae Cheong 1,#, Seung Chai Jung 1,✉,#, Jihye Yun 1,2, Ji Su Ko 3, Se Jin Cho 4, Keum Mi Choi 1, Sang-ik Park 1, So Yeong Jeong 4, Da Hyun Lee 4, Eunseon Jeong 1
PMCID: PMC12586621  PMID: 41188459

Abstract

This study aimed to develop and validate a clinico-radiomics model integrating radiomics features from multiparametric MRI and clinical scoring systems to predict hemorrhagic transformation (HT) in acute ischemic stroke. A total of 826 patients were retrospectively included. Patients from Institution A who underwent MRI between 2017 and 2019 were assigned to the training set (n = 700), whereas those from 2020 (n = 78) formed the internal validation set. External validation included 48 patients from Institution B. All patients underwent multiparametric MRI, including diffusion- and perfusion-weighted imaging, fluid-attenuated inversion recovery (FLAIR), and gradient-echo imaging. Radiomics features were selected using least absolute shrinkage and selection operator regression followed by random forest ranking. Clinico-radiomics models were developed using logistic regression by combining radiomics features with variables from clinical scoring systems (HAT, SEDAN, DRAGON, SITS-ICH). Model performance was assessed using the area under the receiver operating characteristic curve (AUC), with DeLong’s test for comparison. Ten radiomics features were selected, derived from time-to-peak, mean transit time, time-to-maximum, FLAIR, K2, relative cerebral blood volume, and relative cerebral blood flow maps. The radiomics model demonstrated comparable performance to clinical models in the internal validation set (AUC: 0.79 vs. 0.75–0.81) but outperformed them in the external validation set (AUC: 0.85 vs. 0.66–0.68). Clinico-radiomics models demonstrated higher AUCs than radiomics or clinical models in both internal (AUC: 0.81–0.84 vs. 0.79 vs. 0.75–0.81) and external validation sets (AUC: 0.86–0.90 vs. 0.85 vs. 0.66–0.68). These findings suggest that clinico-radiomics models offer improved predictive accuracy for HT compared to radiomics or clinical scoring models alone.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-22469-2.

Keywords: Hemorrhagic transformation, Acute ischemic stroke, Radiomics, Perfusion, Magnetic resonance imaging

Subject terms: Biomarkers, Diseases, Medical research, Neurology

Introduction

Hemorrhagic transformation (HT), defined as secondary bleeding that occurs within or adjacent to infarcted areas, is a major complication in patients with acute ischemic stroke, with an incidence between 3.2% and 43.3%1–3. HT is often associated with the worsening of neurological function and can significantly affect functional outcomes with increased risk of death or disability, especially when parenchymal hematoma develops4. Early identification of patients at high risk of HT is essential for guiding clinical decision-making and optimizing management strategies, including adjustments to antithrombotic therapy and enhanced clinical monitoring.

Several scoring systems have been developed to predict the risk of HT1,5–8. Some of these clinical models include certain imaging markers, such as early ischemic changes on computed tomography (CT) and the hyperdense vessel sign of the middle cerebral artery. However, these CT-based markers are limited by variable sensitivity and interobserver variability9,10. While CT remains the standard modality for detecting hemorrhage due to its availability and speed, it provides limited insight into the underlying pathophysiology of HT.

Magnetic resonance imaging (MRI) offers superior tissue contrast and provides complementary pathophysiological information in acute ischemic stroke. Diffusion-weighted imaging (DWI) and apparent diffusion coefficient (ADC) maps accurately delineate infarcted tissue, with low ADC values indicating compromised cellular integrity. T2-weighted fluid attenuated inversion recovery (FLAIR) can depict early parenchymal changes, while T2*-weighted gradient-echo (GRE) imaging sensitively detects microbleeds and hemosiderin deposition. As HT results from blood-brain barrier disruption and impaired cerebral autoregulation following ischemia, perfusion-weighted imaging (PWI) offers hemodynamic information for assessing its risk. MRI features, such as a large infarct core volume, low ADC, and hypoperfusion, have been proposed as predictors of HT11,12.

Despite these advantages, MRI markers, like CT markers, are still limited by interobserver variability and subjective visual assessment. Moreover, as HT results from multifactorial processes, reliance on single imaging parameters provides only a one-dimensional perspective, limiting the comprehensive assessment of ischemic and reperfused tissues. Radiomics offers a promising approach for extracting high-dimensional quantitative features from medical images, enabling the identification of subtle tissue characteristics that may not be detected through conventional imaging. Although radiomics models for predicting HT have been developed, they are predominantly based on CT imaging13–16. Integrating radiomics with perfusion MRI may improve predictive accuracy by more effectively capturing the underlying complex hemodynamic changes and tissue vulnerabilities17,18. However, the radiomic applications of perfusion MRI for HT prediction remain underexplored.

We hypothesized that incorporating a radiomics model based on multiparametric MRI, including dynamic susceptibility contrast perfusion-weighted imaging (DSC-PWI), into existing clinical scoring systems would improve the prediction of HT in patients with acute ischemic stroke. Therefore, the aim of this study was to develop and validate a radiomics model utilizing multiparametric MRI and to assess the added value of this approach when combined with clinical scoring systems.

Materials and methods

Study population

This retrospective study was conducted in accordance with the principles of the Declaration of Helsinki and was approved by the Institutional Review Board of Asan Medical Center (IRB No. 2022-0579). The requirement for informed consent was waived by the IRB due to the retrospective nature of the study.

We searched the electronic database of the Department of Radiology at our tertiary center (Institution A) and retrospectively reviewed patient records between January 2017 and December 2020. We identified 1056 consecutive patients who met the following inclusion criteria: (i) acute ischemic stroke evaluated with multiparametric MRI, including DWI, T2-weighted FLAIR, T2*-weighted GRE imaging, and DSC-PWI; (ii) evidence of acute infarction on DWI; and (iii) MRI performed within 24 h of symptom onset.

The exclusion criteria were as follows: (i) intracranial hemorrhage identified during the initial workup (n = 83) and (ii) inadequate follow-up imaging (n = 195). All MRI images met the required analytical quality standards, and no patients were excluded due to image quality issues or MRI artifacts.

A total of 778 patients were included in the study. A temporal split was applied: patients who underwent MRI between January 2017 and December 2019 were assigned to the training set (n = 700), while those who underwent MRI between January 2020 and December 2020 were allocated to the internal validation set (n = 78).
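The cohort bookkeeping and temporal split described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the helper name `assign_cohort` and the date-based rule are assumptions derived from the stated date ranges.

```python
from datetime import date

# Counts reported in the text for Institution A.
SCREENED = 1056
EXCLUDED_INITIAL_HEMORRHAGE = 83
EXCLUDED_INADEQUATE_FOLLOW_UP = 195


def assign_cohort(mri_date: date) -> str:
    """Hypothetical helper: temporal split by MRI date (2017-2019 vs. 2020)."""
    if date(2017, 1, 1) <= mri_date <= date(2019, 12, 31):
        return "training"
    if date(2020, 1, 1) <= mri_date <= date(2020, 12, 31):
        return "internal_validation"
    raise ValueError("MRI date outside the study period")


included = SCREENED - EXCLUDED_INITIAL_HEMORRHAGE - EXCLUDED_INADEQUATE_FOLLOW_UP
print(included)  # 778
print(assign_cohort(date(2019, 12, 31)))  # training
```

A temporal split of this kind keeps all later examinations out of model fitting, so the internal validation set approximates prospective use at the same institution.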

We searched the Department of Radiology electronic database at Institution B for the external validation set and applied the same inclusion and exclusion criteria. A total of 48 patients were included in the external validation cohort. Due to the retrospective nature of this study and data archiving limitations at Institution B, ADC maps were not available for the external validation cohort. To maintain data consistency and integrity across all datasets, ADC was excluded from the analysis. The inclusion and exclusion processes are illustrated in Fig. 1.

Fig. 1.

Patient inclusion process. MRI = magnetic resonance imaging; DWI = diffusion-weighted imaging; FLAIR = fluid attenuated inversion recovery; GRE = gradient echo; PWI = perfusion-weighted imaging.

Image acquisition

MRI scans at Institution A were performed using a 1.5T scanner (Magnetom Avanto; Siemens Healthcare) or a 3T scanner (Ingenia, Ingenia CX, or Achieva; Philips Healthcare). At Institution B, MRI was performed using 1.5T (Intera; Philips Healthcare) or 3T (Achieva or Ingenia; Philips Healthcare) scanners. The MRI sequences included DWI, FLAIR, GRE, and DSC-PWI. For DSC-PWI, gadobutrol (Gadovist; Bayer Schering Pharma AG) or gadoterate meglumine (Dotarem; Guerbet) was administered as an intravenous bolus injection (0.1 mmol/kg). A detailed description and comparison of imaging parameters are provided in Supplementary Table 1. Olea Sphere software (Olea Medical) was used to automatically generate multiparametric maps, including ADC from DWI and perfusion maps of cerebral blood flow (CBF), cerebral blood volume (CBV), mean transit time (MTT), time to peak (TTP), time to maximum (Tmax), and K2.

Image segmentation and feature extraction

The region of interest (ROI) covering the infarct was initially delineated automatically on DWI images using Olea software, based on areas of restricted diffusion indicating acute infarction. This ROI was manually reviewed and adjusted as needed by a radiologist (Y.H.R., with six years of experience) blinded to patient information and outcomes. Subsequently, the ROI was co-registered to FLAIR, GRE, and PWI, and all image data for radiomics feature extraction were maintained in their original space to ensure spatial correspondence across all sequences. The co-registration process was based on spatial coordinate transformation rather than signal intensity matching to ensure accurate alignment across different imaging sequences, independent of lesion visibility on individual sequences. Perfusion and diffusion maps, which are computational maps, were used without further processing. However, FLAIR and GRE images underwent N4 bias field correction before feature extraction to minimize intensity inhomogeneities19. Furthermore, z-score intensity normalization was performed on DWI, FLAIR, and GRE images (excluding computational maps) to reduce variability across scanners and acquisition settings20.

To analyze specific brain regions quantitatively, a radiomics approach was applied using the open-source Python library PyRadiomics (v.3.1.0)21. All images were resampled to 1 mm isotropic voxel spacing prior to feature extraction to ensure spatial consistency across sequences. This tool enabled the extraction of 107 radiomic features from each ROI across all images to capture a broad range of image characteristics, including intensity distributions, texture patterns, and spatial relationships. Detailed information on the extracted features is provided in Supplementary Table 2, and the calculation formulas are available in the online documentation of PyRadiomics (https://pyradiomics.readthedocs.io/en/latest/features.html#module-radiomics.shape). Since nine imaging sequences (DWI, FLAIR, GRE, CBF, CBV, MTT, TTP, Tmax, and K2) were used, a total of 963 features were extracted per patient (calculated as 9 MRI sequences × 107 features × 1 ROI).

The extracted features are organized into six primary groups. First-order statistics (32 features) describe the intensity distribution within each ROI. Gray-level co-occurrence matrix (GLCM) features (24 features) capture texture by analyzing the spatial relationships between voxels. Gray-level run-length matrix (GLRLM) features (16 features) measure the frequency of consecutive voxels with identical gray levels. Gray-level size zone matrix (GLSZM) features (16 features) quantify zones of connected voxels sharing the same gray level. Gray-level dependence matrix (GLDM) features (14 features) assess the dependency of a voxel on its neighboring voxels. Finally, neighboring gray tone difference matrix (NGTDM) features (5 features) capture the variations between a voxel and its neighbors.
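The feature counts above can be checked with a short bookkeeping sketch. The group sizes and sequence list come from the text; the column naming scheme is hypothetical, and reading the 32 "first-order" features as including PyRadiomics shape descriptors is an assumption, not something the paper states.

```python
# Feature counts per group as stated in the text (the 32 "first-order" features
# presumably include shape descriptors; that reading is an assumption).
FEATURES_PER_GROUP = {
    "first_order": 32,
    "GLCM": 24,
    "GLRLM": 16,
    "GLSZM": 16,
    "GLDM": 14,
    "NGTDM": 5,
}
SEQUENCES = ["DWI", "FLAIR", "GRE", "CBF", "CBV", "MTT", "TTP", "Tmax", "K2"]

features_per_roi = sum(FEATURES_PER_GROUP.values())
features_per_patient = features_per_roi * len(SEQUENCES)  # one ROI per patient

# Hypothetical naming scheme for the resulting columns, e.g. "MTT_GLCM_07".
column_names = [
    f"{seq}_{group}_{i:02d}"
    for seq in SEQUENCES
    for group, n in FEATURES_PER_GROUP.items()
    for i in range(1, n + 1)
]
print(features_per_roi, features_per_patient)  # 107 963
```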

To assess the reproducibility of radiomic features, the required sample size was calculated using hypothesis testing based on the method proposed by Walter, Eliasziw, and Donner22. The intraclass correlation coefficient (ICC) for the test-retest reliability of the measurement tool was anticipated to be 0.9, with a minimum acceptable threshold of 0.7. The analysis was conducted with a two-tailed significance level (α) of 0.05 and a statistical power (1-β) of 80%. This calculation yielded a sample size of 23 participants; therefore, 24 patients from the training set were randomly selected. To assess reproducibility, the ROI was automatically generated again using the Olea software, as previously described. A second radiologist (J.S.K., with six years of experience) independently reviewed and adjusted the ROI as necessary. ICC values were calculated for 107 radiomics features extracted from each imaging sequence (DWI, FLAIR, GRE, and PWI maps). All features from DWI and ADC demonstrated ICC values above 0.75. While some features from FLAIR and GRE exhibited ICC values below 0.75, only features with ICC ≥ 0.75 were eligible for the subsequent feature selection process.
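The sample size calculation can be reproduced approximately with the closed-form expression of Walter, Eliasziw, and Donner. The sketch below assumes two raters (the two radiologists) and uses hypothetical function names; with the stated inputs it returns 23 subjects, matching the figure quoted above. The ICC filter at the end mirrors the 0.75 eligibility threshold.

```python
import math
from statistics import NormalDist


def icc_sample_size(rho0, rho1, raters=2, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate number of subjects for testing H0: ICC = rho0 vs. H1: ICC = rho1.

    Closed-form approximation attributed to Walter, Eliasziw, and Donner.
    The number of raters (2) is an assumption; the paper does not state it.
    """
    z_alpha = NormalDist().inv_cdf(1 - (alpha / 2 if two_tailed else alpha))
    z_beta = NormalDist().inv_cdf(power)
    theta0 = rho0 / (1 - rho0)
    theta1 = rho1 / (1 - rho1)
    c0 = (1 + raters * theta0) / (1 + raters * theta1)
    k = 1 + (2 * (z_alpha + z_beta) ** 2 * raters) / (math.log(c0) ** 2 * (raters - 1))
    return math.ceil(k)


def reproducible(icc_by_feature, threshold=0.75):
    """Keep features whose ICC meets the eligibility threshold used in the text."""
    return [name for name, icc in icc_by_feature.items() if icc >= threshold]


n_subjects = icc_sample_size(rho0=0.7, rho1=0.9)
print(n_subjects)  # 23

# Feature names and ICC values below are illustrative, not study data.
kept = reproducible({"FLAIR_GLCM_01": 0.62, "MTT_first_order_03": 0.91})
print(kept)  # ['MTT_first_order_03']
```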

Assessment of HT

All patients underwent follow-up MRI, including GRE or susceptibility-weighted imaging (SWI) sequences, within 1–3 days of the initial MRI. A neuroradiologist (S.C.J., with 15 years of experience), blinded to patient information, reviewed the follow-up images to assess the presence of HT. HT was defined as a confluent petechial hemorrhage greater than 10 mm within the infarcted area on GRE or SWI23.

Radiomics features and clinical characteristics selection

All radiomic features were standardized using z-score transformation, where the mean and standard deviation were estimated from the training set and subsequently applied to the internal and external validation sets, in order to minimize the influence of scale differences among features and to provide uniform input for the model. Among the 963 features, least absolute shrinkage and selection operator (LASSO) regression was first applied to select 20 features, which were further refined by ranking with a Random Forest model to identify the top 10 features (Fig. 2). Ten features were selected: two from TTP, three from MTT, and one each from the Tmax, FLAIR, K2, rCBV, and rCBF maps.
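The standardization step, in which scaling parameters are estimated on the training set and reused unchanged for the validation sets, can be sketched as below. The function names and numeric values are illustrative assumptions, and the paper does not state whether the sample or population standard deviation was used.

```python
from statistics import mean, stdev


def fit_zscore(train_values):
    """Estimate mean and standard deviation from the training set only."""
    mu = mean(train_values)
    sigma = stdev(train_values)  # sample SD; the estimator used in the study is not stated
    return mu, sigma


def apply_zscore(values, mu, sigma):
    """Standardize any cohort with the training-set parameters."""
    return [(v - mu) / sigma for v in values]


# Illustrative values for one hypothetical feature; not study data.
train = [2.0, 4.0, 6.0, 8.0]
external = [5.0, 11.0]

mu, sigma = fit_zscore(train)
train_z = apply_zscore(train, mu, sigma)
external_z = apply_zscore(external, mu, sigma)
print([round(z, 2) for z in external_z])
```

Reusing the training-set mean and standard deviation avoids leaking information from the validation cohorts into the model inputs.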

Fig. 2.

Coefficients of the optimal radiomics features. The upper graph shows the regression coefficients calculated using LASSO for the selected features with non-zero coefficients. Features with larger absolute values have a greater impact on predicting the target variable (hemorrhagic transformation). Positive coefficients indicate a positive influence, while negative coefficients indicate a negative influence. The lower graph shows feature importance derived from Random Forest based on Gini impurity. Values closer to 1 indicate a more significant role in decision-making. The top 10 most important features are displayed.

Variables from the existing clinical models for predicting HT were selected to construct the respective clinico-radiomics models. For the HAT-radiomics model, glucose level, presence of diabetes mellitus, and National Institutes of Health Stroke Scale (NIHSS) score were chosen. Age, glucose level, and NIHSS score were selected for the SEDAN-radiomics model. The DRAGON-radiomics model included age, glucose level, NIHSS score, and modified Rankin scale score. For the SITS-ICH-radiomics model, age, glucose level, NIHSS score, systolic blood pressure, body weight, onset-to-treatment time, history of hypertension, and use of aspirin or clopidogrel were selected. The hyperdense middle cerebral artery sign and early ischemic signs were considered radiologic characteristics and therefore were not selected as clinical features.

Radiomics and clinico-radiomics model construction

The 10 selected radiomics features were used to develop the “radiomics model,” and the chosen clinical characteristics were integrated to construct the “clinico-radiomics models.” This approach yielded one radiomics model, four clinical models, and four combined clinico-radiomics models.

A logistic regression model was developed using a grid search for hyperparameter tuning and stratified 10-fold cross-validation. A grid of hyperparameters, including various solvers, regularization strengths, class weights (with class_weight='balanced' to assign higher importance to HT), and maximum numbers of iterations, was defined for tuning. The best hyperparameters were identified based on cross-validation performance. Following the hyperparameter search, the logistic regression model was refitted to the training data using the optimal parameters. All model configuration processes were implemented using Python (version 3.11.6).
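The structure of the search can be illustrated without any modeling library. In the sketch below, the grid values are hypothetical (the paper names the hyperparameter types but not their values), the key names merely mirror common logistic regression options, and `stratified_folds` is an illustrative helper showing how stratification keeps the HT rate similar across the 10 folds.

```python
import random
from itertools import product


def stratified_folds(labels, n_splits=10, seed=0):
    """Assign each sample to a fold so that every fold keeps roughly the same HT rate."""
    rng = random.Random(seed)
    fold_of = [None] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            fold_of[i] = position % n_splits  # deal class members round-robin
    return fold_of


# Hypothetical search space; the actual values used in the study are not reported.
param_grid = {
    "solver": ["liblinear", "lbfgs"],
    "C": [0.01, 0.1, 1.0, 10.0],
    "class_weight": [None, "balanced"],
    "max_iter": [100, 1000],
}
candidates = [dict(zip(param_grid, combo)) for combo in product(*param_grid.values())]

# Training-set class counts from the text: 132 HT among 700 patients.
labels = [1] * 132 + [0] * 568
folds = stratified_folds(labels)
ht_per_fold = [
    sum(1 for f, y in zip(folds, labels) if f == k and y == 1) for k in range(10)
]
print(len(candidates), ht_per_fold)
```

Each candidate configuration would be scored on the 10 held-out folds, and the best-scoring one refitted on the full training set.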

Statistical analysis

Baseline characteristics of the included patients were summarized using descriptive statistics. Categorical variables were compared using the chi-square test, whereas continuous variables were compared using the Mann–Whitney U test. All analyses, including interobserver variability evaluation, radiomics feature selection, and model construction, were performed in Python. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), and the sensitivity, specificity, and accuracy were also calculated. DeLong’s test was used to compare AUCs between models, including the change in performance after the addition of clinical characteristics. All statistical tests were two-sided; a p-value less than 0.05 was considered statistically significant. A schematic representation of the study is shown in Fig. 3.
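For readers unfamiliar with DeLong’s test, the following is a minimal pure-Python sketch for two models scored on the same patients. It is an illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical and the scores are toy values.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def _psi(pos_score, neg_score):
    """Pairwise kernel: 1 if the HT case scores higher, 0.5 on ties, else 0."""
    if pos_score > neg_score:
        return 1.0
    return 0.5 if pos_score == neg_score else 0.0


def _components(labels, scores):
    """Structural components (V10 per positive, V01 per negative) for one model."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    v10 = [mean([_psi(p, q) for q in neg]) for p in pos]
    v01 = [mean([_psi(p, q) for p in pos]) for q in neg]
    return v10, v01


def delong_test(labels, scores_a, scores_b):
    """Return (AUC_a, AUC_b, two-sided p) for two correlated ROC curves."""
    v10_a, v01_a = _components(labels, scores_a)
    v10_b, v01_b = _components(labels, scores_b)
    auc_a, auc_b = mean(v10_a), mean(v10_b)
    # The variance of the AUC difference equals the variance of the
    # component differences, scaled by the number of positives and negatives.
    d10 = [a - b for a, b in zip(v10_a, v10_b)]
    d01 = [a - b for a, b in zip(v01_a, v01_b)]
    var_diff = variance(d10) / len(d10) + variance(d01) / len(d01)
    if var_diff == 0:
        raise ValueError("variance estimate is zero; the test statistic is undefined")
    z = (auc_a - auc_b) / sqrt(var_diff)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return auc_a, auc_b, p


# Toy scores for illustration only; not study data.
y = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.35, 0.6, 0.4, 0.3, 0.2]
model_b = [0.6, 0.7, 0.3, 0.5, 0.65, 0.4, 0.55, 0.2]
auc_a, auc_b, p_value = delong_test(y, model_a, model_b)
print(auc_a, auc_b, round(p_value, 3))
```

Because both models are evaluated on the same patients, the test works on paired differences of the structural components rather than treating the two AUCs as independent.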

Fig. 3.

Schematic flow of the study. ROI = region of interest; LASSO = least absolute shrinkage and selection operator; DM = diabetes mellitus; NIHSS = national institutes of health stroke scale; mRS = modified Rankin scale; sBP = systolic blood pressure; HTN = hypertension.

Results

Patient characteristics

The baseline characteristics of the patients are summarized in Table 1. HT occurred in 132 patients (18.9%) in the training set, 17 patients (21.8%) in the internal validation set, and 8 patients (16.7%) in the external validation set, with no significant differences between the groups (all p > 0.05).

Table 1.

Patient characteristics.

| Characteristic | Training (n = 700) | Internal validation (n = 78) | External validation (n = 48) | P1 | P2 | P3 |
|---|---|---|---|---|---|---|
| Hemorrhagic transformation | 132 (18.9) | 17 (21.8) | 8 (16.7) | 0.822 | 0.933 | 0.777 |
| Age, year | 67.71 ± 13.34 | 70.96 ± 12.43 | 71.27 ± 12.83 | 0.121 | 0.197 | 0.992 |
| Sex (male) | 444 (63.4) | 41 (52.6) | 34 (70.8) | 0.170 | 0.590 | 0.120 |
| NIHSS | 8.86 ± 6.57 | 7.82 ± 6.1 | 9.02 ± 5.92 | 0.406 | 0.987 | 0.602 |
| mRS | 3.31 ± 1.37 | 2.87 ± 1.51 | 3.21 ± 1.07 | 0.027* | 0.880 | 0.406 |
| Hypertension | 444 (63.4) | 46 (59) | 28 (58.3) | 0.743 | 0.780 | 0.997 |
| Systolic blood pressure | 146.89 ± 44.97 | 145.06 ± 24.85 | 148.96 ± 25.815 | 0.938 | 0.948 | 0.883 |
| Diabetes | 181 (25.9) | 22 (28.2) | 16 (33.3) | 0.906 | 0.526 | 0.819 |
| Glucose | 148.48 ± 62.76 | 154.21 ± 64.6 | 143.31 ± 51.04 | 0.744 | 0.857 | 0.635 |
| BMI | | | | | | |
| Weight | 63.2 ± 12.02 | 62.55 ± 11.1 | 68.82 ± 11.57 | 0.902 | 0.007* | 0.017* |
| Platelet | 225.43 ± 93.02 | 225.46 ± 73.5 | NA | 0.998 | NA | NA |
| Aspirin treatment | 498 (71.1) | 43 (55.1) | 33 (68.8) | 0.014* | 0.941 | 0.271 |
| Plavix treatment | 354 (50.6) | 26 (33.3) | 21 (43.8) | 0.015* | 0.656 | 0.522 |
| HAT | 1.46 ± 1.3 | 1.01 ± 1.03 | 1.25 ± 0.98 | 0.012* | 0.532 | 0.591 |
| SEDAN | 1.93 ± 1.32 | 2.03 ± 1.45 | 1.83 ± 0.98 | 0.788 | 0.908 | 0.722 |
| DRAGON | 4.43 ± 2.27 | 4.23 ± 2.24 | 4.23 ± 1.52 | 0.940 | 0.962 | 1.000 |
| SITS-ICH | 4.66 ± 2.07 | 2.72 ± 1.61 | 4.71 ± 1.75 | < 0.001* | 0.8936 | < 0.001* |

Data are mean ± standard deviation or n (%).

NIHSS = national institutes of health stroke scale; mRS = modified Rankin scale; BMI = body mass index; HAT = hemorrhage after thrombolysis; SEDAN = blood sugar, early infarct signs, hyperdense cerebral artery sign, age; DRAGON = diabetes, race, age, glucose, onset to treatment time, NIHSS score; SITS-ICH = safe implementation of treatments in stroke–international collaboration on hemorrhagic transformation.

P1 value is for the training cohort and the internal validation cohort.

P2 value is for the training cohort and the external validation cohort.

P3 value is for the internal validation cohort and the external validation cohort.

*Indicates statistical significance.

Compared with the training set, the internal validation set differed in the modified Rankin Scale score (3.31 vs. 2.87, p = 0.027), aspirin treatment (71.1% vs. 55.1%, p = 0.014), and clopidogrel treatment (50.6% vs. 33.3%, p = 0.015). Among the clinical models, the HAT score (1.46 vs. 1.01, p = 0.012) and SITS-ICH score (4.66 vs. 2.72, p < 0.001) were lower in the internal validation set than in the training set. Between the training and external validation sets, no significant differences were observed in the variables or clinical model scores (all p > 0.05), except for weight (63.2 vs. 68.82, p = 0.007).

Predictive performance of radiomics models and clinical models

The predictive performances of the radiomics, clinical, and clinico-radiomics models are summarized in Table 2 and Fig. 4. In the training set, the radiomics model had an AUC of 0.76 (95% CI, 0.72–0.80), with a sensitivity of 76%, specificity of 65%, and accuracy of 67%. In the internal validation set, it had an AUC of 0.79 (95% CI, 0.67–0.89), with a sensitivity of 58.8%, specificity of 77.1%, and accuracy of 73%. In the external validation set, it had an AUC of 0.85 (95% CI, 0.68–0.98), with a sensitivity of 62.5%, specificity of 87.5%, and accuracy of 83.3%.
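As a worked check of how these percentages relate to patient counts, the sketch below recomputes sensitivity, specificity, and accuracy from confusion-matrix counts. The counts are inferred from the reported percentages and the cohort sizes (17 HT among 78 patients internally, 8 HT among 48 externally); they are an assumption, since the confusion matrices themselves are not reported.

```python
def metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy (in %) from confusion-matrix counts."""
    sensitivity = 100 * tp / (tp + fn)
    specificity = 100 * tn / (tn + fp)
    accuracy = 100 * (tp + tn) / (tp + fn + tn + fp)
    return round(sensitivity, 1), round(specificity, 1), round(accuracy, 1)


# Inferred counts (assumption): chosen to be consistent with the reported values.
internal = metrics(tp=10, fn=7, tn=47, fp=14)  # 17 HT, 61 non-HT
external = metrics(tp=5, fn=3, tn=35, fp=5)    # 8 HT, 40 non-HT
print(internal)  # (58.8, 77.0, 73.1)
print(external)  # (62.5, 87.5, 83.3)
```

Under these inferred counts the external values reproduce the reported figures exactly, and the internal values agree with the reported 58.8%, 77.1%, and 73% up to rounding.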

Table 2.

Predictive performance of radiomics, clinical, and clinico-radiomics models.

| Clinical score | Metric | Training: Radiomics | Training: Clinical | Training: Clinico-radiomics | Internal validation: Radiomics | Internal validation: Clinical | Internal validation: Clinico-radiomics | External validation: Radiomics | External validation: Clinical | External validation: Clinico-radiomics |
|---|---|---|---|---|---|---|---|---|---|---|
| HAT | AUC (95% CI) | 0.76 (0.72–0.80) | 0.7 (0.62–0.73) | 0.79 (0.75–0.82) | 0.79 (0.67–0.89) | 0.77 (0.64–0.87) | 0.83 (0.73–0.92) | 0.85 (0.68–0.98) | 0.66 (0.47–0.84) | 0.86 (0.67–0.99) |
| HAT | Accuracy | 67% | 67.3% | 71.3% | 73% | 70.5% | 71.8% | 83.3% | 62.5% | 85.4% |
| HAT | Sensitivity | 76% | 67.7% | 76.4% | 58.8% | 58.8% | 58.8% | 62.5% | 37.5% | 63% |
| HAT | Specificity | 65% | 67.3% | 70.1% | 77.1% | 73.8% | 75.4% | 87.5% | 68% | 90% |
| SEDAN | AUC (95% CI) | 0.76 (0.72–0.80) | 0.7 (0.64–0.74) | 0.8 (0.76–0.83) | 0.79 (0.67–0.89) | 0.78 (0.66–0.88) | 0.82 (0.71–0.92) | 0.85 (0.68–0.98) | 0.68 (0.51–0.84) | 0.9 (0.74–1.00) |
| SEDAN | Accuracy | 67% | 66.7% | 71.8% | 73% | 71.8% | 71.8% | 83.3% | 64.6% | 81.3% |
| SEDAN | Sensitivity | 76% | 63.6% | 78.8% | 58.8% | 64.7% | 58.8% | 62.5% | 75% | 75% |
| SEDAN | Specificity | 65% | 67.4% | 70.2% | 77.1% | 73.8% | 75.4% | 87.5% | 62.5% | 83% |
| DRAGON | AUC (95% CI) | 0.76 (0.72–0.80) | 0.69 (0.63–0.73) | 0.79 (0.75–0.82) | 0.79 (0.67–0.89) | 0.75 (0.61–0.87) | 0.81 (0.70–0.91) | 0.85 (0.68–0.98) | 0.67 (0.50–0.84) | 0.9 (0.75–1.00) |
| DRAGON | Accuracy | 67% | 66.2% | 72% | 73% | 71.8% | 73.1% | 83.3% | 64.6% | 83.3% |
| DRAGON | Sensitivity | 76% | 62.4% | 78.9% | 58.8% | 70.6% | 64.7% | 62.5% | 62.5% | 75% |
| DRAGON | Specificity | 65% | 67.1% | 70.4% | 77.1% | 72.1% | 75.4% | 87.5% | 65% | 85% |
| SITS-ICH | AUC (95% CI) | 0.76 (0.72–0.80) | 0.69 (0.61–0.71) | 0.79 (0.74–0.82) | 0.79 (0.67–0.89) | 0.81 (0.69–0.89) | 0.84 (0.73–0.92) | 0.85 (0.68–0.98) | 0.67 (0.47–0.85) | 0.89 (0.71–1.00) |
| SITS-ICH | Accuracy | 67% | 66% | 69.4% | 73% | 68% | 73.1% | 83.3% | 62.5% | 81% |
| SITS-ICH | Sensitivity | 76% | 61.4% | 73.5% | 58.8% | 70.6% | 64.7% | 62.5% | 75% | 75% |
| SITS-ICH | Specificity | 65% | 67% | 68.4% | 77.1% | 67.2% | 75.4% | 87.5% | 60% | 82.5% |

AUC = area under the receiver operating characteristic curve; CI = confidence interval; HAT = hemorrhage after thrombolysis; SEDAN = blood sugar, early infarct signs, hyperdense cerebral artery sign, age; DRAGON = diabetes, race, age, glucose, onset to treatment time, NIHSS score; SITS-ICH = safe implementation of treatments in stroke–international collaboration on hemorrhagic transformation.

Fig. 4.

Predictive performance of the radiomics, clinical, and clinico-radiomics models. (A) Receiver operating characteristic (ROC) curve and DeLong test p-value matrix for the training set. (B) ROC curve and DeLong test p-value matrix for the internal validation set. (C) ROC curve and DeLong test p-value matrix for the external validation set. CV = cross-validation; HAT = hemorrhage after thrombolysis; SEDAN = blood sugar, early infarct signs, hyperdense cerebral artery sign, age; DRAGON = diabetes, race, age, glucose, onset to treatment time, national institutes of health stroke scale (NIHSS) score; SITS-ICH = safe implementation of treatments in stroke–international collaboration on hemorrhagic transformation.

In the training set, the HAT, SEDAN, DRAGON, and SITS-ICH models demonstrated similar performances with AUCs of approximately 0.7. In the internal validation set, the SITS-ICH model showed the highest performance, with an AUC of 0.81 (95% CI, 0.69–0.89), compared to the others, which ranged from 0.75 to 0.78. All four clinical models exhibited similar performances in the external validation set, with AUCs ranging between 0.66 and 0.68.

When comparing the radiomics model to the clinical models, the radiomics model showed significantly higher performance than all clinical models in the training set (all p < 0.05). In the internal validation set, the radiomics model had a slightly lower AUC than the SITS-ICH model (0.79 vs. 0.81, p = 0.015) and a slightly higher AUC than the HAT, SEDAN, and DRAGON models (0.79 vs. 0.75–0.78, p = 0.005–0.015). In the external validation set, the radiomics model showed significantly higher predictive performance for HT than the clinical models (0.85 vs. 0.66–0.68, all p < 0.05).

Predictive performance of clinico-radiomics models

Clinico-radiomics models incorporating clinical characteristics from HAT, SEDAN, DRAGON, and SITS-ICH showed AUCs of around 0.8 in the training and internal validation sets (Table 2). The AUCs of the four clinico-radiomics models were higher in the external validation set, ranging from 0.86 to 0.9.

The clinico-radiomics models consistently outperformed the clinical models across all cohorts, with statistical significance (all p < 0.05) (Fig. 4). The clinico-radiomics models also demonstrated higher predictive performance than the radiomics model across all cohorts. This difference was statistically significant for all clinico-radiomics models in the training set and, in the internal validation set, only for the SITS-ICH-radiomics model.

There were no significant differences in the performances of the radiomics, clinical, and clinico-radiomics models between the intravenous thrombolysis and intra-arterial thrombectomy groups in the training and internal validation sets (p > 0.05). Therefore, the models were applied to the external validation set regardless of the treatment type. The clinico-radiomics models consistently outperformed the radiomics and clinical models across all subgroups (Supplementary Tables 3 and 4).

Discussion

We developed radiomics and clinico-radiomics models that integrated the radiomics model with clinical characteristics from four pre-existing clinical models for predicting HT. The radiomics model demonstrated comparable performance to existing clinical models in the internal validation set (AUC: 0.79 vs. 0.75–0.81) but outperformed them in the external validation set (AUC: 0.85 vs. 0.66–0.68). The clinico-radiomics models, which combined radiomics features with clinical variables, consistently showed improved diagnostic performance compared to the radiomics model and clinical scoring systems alone. In the internal validation set, the clinico-radiomics models achieved AUCs of 0.81–0.84 compared to 0.79 for the radiomics model and 0.75–0.81 for the clinical models. In the external validation set, the clinico-radiomics models achieved AUCs of 0.86–0.9 compared to 0.85 for the radiomics model and 0.66–0.68 for the clinical models.

Clinical models, including HAT, SEDAN, DRAGON, and SITS-ICH, have been developed and validated for the early prediction of HT in patients with acute ischemic stroke, with varying performances across studies1,24. In this study, these scoring systems achieved AUCs of approximately 0.7 in the training set. In the internal validation set, the SITS-ICH model performed slightly better than the others (AUC: 0.81 vs. 0.75–0.78), which may be attributed to the lower proportion of patients on antiplatelet therapy. All four models exhibited similar performances in the external validation set, with AUCs ranging from 0.66 to 0.68. The moderate performance of these clinical models (AUCs between 0.6 and 0.8) highlights their limitations, including reduced predictive power in diverse patient populations and reliance on imaging features with variable sensitivity and interobserver variability.

Studies have explored the use of radiomics to improve HT prediction, with most focusing on non-contrast CT13–16. However, multiparametric MRI, including PWI, can provide pathophysiological information regarding blood-brain barrier (BBB) disruption1,17,18. In our radiomics model, the selected features were derived from the TTP, MTT, Tmax, FLAIR, K2, rCBV, and rCBF maps. Notably, TTP and MTT accounted for half of the selected features (five of ten). This aligns with the findings of Jiang et al., who reported that among deep-learning models using a single parameter, MTT and TTP outperformed CBF, CBV, and DWI25. TTP measures the time taken to reach peak voxel enhancement, with an increased TTP indicating delayed blood flow26. MTT is the ratio of CBV to CBF and is inversely proportional to cerebral perfusion pressure27. The selection of features related to these two parameters likely indicates that the risk of HT increases owing to BBB disruption in severely ischemic areas with decreased perfusion. Among the conventional MRI sequences, features from FLAIR were selected, whereas none from DWI were. Similarly, Xu et al. reported that a FLAIR-based radiomics model outperformed both DWI and combined DWI-FLAIR models28. FLAIR changes have been shown to correlate with matrix metalloproteinase-9 levels and the risk of HT in the acute phase of stroke, which may explain the relevance of the selected FLAIR features29.

This study aimed to evaluate the added value of combining radiomics models with clinical variables from well-established scoring systems. The clinico-radiomics models consistently demonstrated improved predictive performance compared to clinical scoring systems alone across the training, internal validation, and external validation sets. Similarly, CT-based and conventional MRI models have shown enhanced performance when combined with clinical factors28,30. Among the few studies using PWI, Meng et al. reported that integrating clinical factors with radiomics features improved predictive performance compared to either clinical or radiomics models alone31. However, their study was limited by a small sample size, and the infarction location and SWI vessel signs that they included as clinical factors may be more appropriately classified as radiologic features. Additionally, deep learning models combining PWI with clinical data have outperformed both clinical and imaging-only models, further highlighting the value of integrating multiple data sources to predict HT25.

Our clinico-radiomics model offers several practical advantages. The automated radiomics pipeline provides objective, reproducible metrics that reduce the interobserver variability inherent in conventional imaging interpretation. The model’s consistent performance across different scanner platforms and institutions supports its potential for clinical adoption. In addition, integration with established clinical scoring systems may enhance physician acceptance and interpretability compared to purely algorithmic approaches. However, several considerations remain for real-world adoption. While our model demonstrated robustness across varying imaging protocols, optimal performance may still benefit from standardized imaging protocols with quality control measures and adequate computational infrastructure, which may not be uniformly available across institutions. Integration with existing hospital information systems will be essential for effective clinical deployment and user acceptance.

This study has several limitations. First, ADC data were not included because of the lack of an external validation set. While the incorporation of PWI adds value by better capturing the underlying pathophysiology of HT, future studies including ADC may provide a more comprehensive evaluation of the infarct core. Second, the external validation cohort was small and came from a single center, which limits the generalizability of the findings. Further prospective studies involving multicenter validation are required to confirm the robustness and applicability of the proposed model. Third, pre-existing white matter hyperintensities in the infarct region on FLAIR images may have confounded the extracted radiomics features, as our automated process did not specifically account for or exclude such cases. This may have introduced variability in the FLAIR-derived features. Fourth, differences in MRI acquisition protocols and sequence parameters across institutions may have affected the extracted radiomics features and model performance. Despite this, our radiomics model showed robust performance across the datasets, which supports the feasibility of model implementation. Fifth, the absence of a minimum lesion volume threshold may have introduced variability in radiomics feature extraction, although only 11 cases (1.4%) had lesion volumes smaller than 100 mm³.
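As a rough illustration of the fifth limitation, a minimum-volume check could be applied before feature extraction. The sketch below is an assumption, not part of the study: the function names, the voxel counts, and the voxel spacing are hypothetical, and the 100 mm³ cutoff is borrowed from the descriptive figure above only as an example value.

```python
from math import prod


def lesion_volume_mm3(voxel_count, spacing_mm):
    """Volume of a segmented lesion: number of mask voxels times the
    volume of one voxel (product of the spacing along each axis, in mm)."""
    return voxel_count * prod(spacing_mm)


def flag_small_lesions(volumes_mm3, threshold_mm3=100.0):
    """Return the indices of lesions whose volume is below the threshold."""
    return [i for i, v in enumerate(volumes_mm3) if v < threshold_mm3]


if __name__ == "__main__":
    # Hypothetical lesions: (voxel count in the mask, voxel spacing in mm).
    lesions = [(40, (1.0, 1.0, 2.0)), (75, (1.0, 1.0, 2.0)), (50, (1.0, 1.0, 2.0))]
    volumes = [lesion_volume_mm3(count, spacing) for count, spacing in lesions]
    print(volumes)                      # volumes in mm³
    print(flag_small_lesions(volumes))  # indices of lesions under 100 mm³
```

Flagged cases could then be excluded or analyzed separately in a sensitivity analysis to see whether very small lesions change the extracted features or the model performance.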

In conclusion, clinico-radiomics models that integrate clinical features and radiomics can improve the prediction of HT in patients with acute ischemic stroke. These models have the potential to assist clinicians in making informed decisions regarding patient management.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (43.8KB, docx)

Author contributions

Project administration and supervision: S.C.J., J.Y.; Data curation: S.C.J., J.Y., Y.H.R., E.C., J.S.K., S.J.C., K.M.C., S.P., S.Y.J., D.H.L., E.J.; Formal analysis: E.C., J.Y.; Methodology: S.C.J., J.Y., Y.H.R., E.C.; Writing – original draft: Y.H.R., E.C.; Writing – review & editing: Y.H.R., E.C., S.C.J. All authors read and approved the final manuscript.

Funding

This study was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (NRF-2020M3E5D2A01084578 and NRF-2019R1A2C1089939).

Data availability

The data in the current study are available from the corresponding author upon reasonable request.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Yun Hwa Roh, E-Nae Cheong, Seung Chai Jung and Jihye Yun contributed equally to this work.

Contributor Information

Seung Chai Jung, Email: dynamics79@gmail.com.

Jihye Yun, Email: dool0120@gmail.com.

References

  • 1. Yaghi, S. et al. Treatment and outcome of hemorrhagic transformation after intravenous alteplase in acute ischemic stroke: A scientific statement for healthcare professionals from the American Heart Association/American Stroke Association. Stroke 48, e343–e361. 10.1161/STR.0000000000000152 (2017).
  • 2. Sussman, E. S. & Connolly, E. S. Jr. Hemorrhagic transformation: a review of the rate of hemorrhage in the major clinical trials of acute ischemic stroke. Front. Neurol. 4, 69. 10.3389/fneur.2013.00069 (2013).
  • 3. Sun, J. et al. Risk factors of hemorrhagic transformation in acute ischaemic stroke: A systematic review and meta-analysis. Front. Neurol. 14, 1079205. 10.3389/fneur.2023.1079205 (2023).
  • 4. Paciaroni, M. et al. Early hemorrhagic transformation of brain infarction: rate, predictive factors, and influence on clinical outcome: results of a prospective multicenter study. Stroke 39, 2249–2256. 10.1161/STROKEAHA.107.510321 (2008).
  • 5. Lou, M. et al. The HAT score: a simple grading scale for predicting hemorrhage after thrombolysis. Neurology 71, 1417–1423. 10.1212/01.wnl.0000330297.58334.dd (2008).
  • 6. Strbian, D. et al. Symptomatic intracranial hemorrhage after stroke thrombolysis: the SEDAN score. Ann. Neurol. 71, 634–641. 10.1002/ana.23546 (2012).
  • 7. Wang, A. et al. DRAGON score predicts functional outcomes in acute ischemic stroke patients receiving both intravenous tissue plasminogen activator and endovascular therapy. Surg. Neurol. Int. 8, 149. 10.4103/2152-7806.210993 (2017).
  • 8. Mazya, M. et al. Predicting the risk of symptomatic intracerebral hemorrhage in ischemic stroke treated with intravenous alteplase: safe implementation of treatments in stroke (SITS) symptomatic intracerebral hemorrhage risk score. Stroke 43, 1524–1531. 10.1161/STROKEAHA.111.644815 (2012).
  • 9. Mair, G. et al. Sensitivity and specificity of the hyperdense artery sign for arterial obstruction in acute ischemic stroke. Stroke 46, 102–107. 10.1161/STROKEAHA.114.007036 (2015).
  • 10. Wardlaw, J. M. & Mielke, O. Early signs of brain infarction at CT: observer reliability and outcome after thrombolytic treatment–systematic review. Radiology 235, 444–453. 10.1148/radiol.2352040262 (2005).
  • 11. Suh, C. H. et al. MRI for prediction of hemorrhagic transformation in acute ischemic stroke: a systematic review and meta-analysis. Acta Radiol. 61, 964–972. 10.1177/0284185119887593 (2020).
  • 12. Li, M. et al. Magnetic resonance perfusion-weighted imaging in predicting hemorrhagic transformation of acute ischemic stroke: A retrospective study. Diagnostics. 10.3390/diagnostics13223404 (2023).
  • 13. Xie, G. et al. Radiomics-based infarct features on CT predict hemorrhagic transformation in patients with acute ischemic stroke. Front. Neurosci. 16, 1002717. 10.3389/fnins.2022.1002717 (2022).
  • 14. Wen, X., Xiao, Y., Hu, X., Chen, J. & Song, F. Prediction of hemorrhagic transformation via pre-treatment CT radiomics in acute ischemic stroke patients receiving endovascular therapy. Br. J. Radiol. 96, 20220439. 10.1259/bjr.20220439 (2023).
  • 15. Ren, H. et al. A clinical-radiomics model based on noncontrast computed tomography to predict hemorrhagic transformation after stroke by machine learning: a multicenter study. Insights Imaging 14, 52. 10.1186/s13244-023-01399-5 (2023).
  • 16. Heo, J. et al. Radiomics using non-contrast CT to predict hemorrhagic transformation risk in stroke patients undergoing revascularization. Eur. Radiol. 34, 6005–6015. 10.1007/s00330-024-10618-6 (2024).
  • 17. Kovacs, K. B., Bencs, V., Hudak, L., Olah, L. & Csiba, L. Hemorrhagic transformation of ischemic strokes. Int. J. Mol. Sci. 10.3390/ijms241814067 (2023).
  • 18. Hong, J. M., Kim, D. S. & Kim, M. Hemorrhagic transformation after ischemic stroke: mechanisms and management. Front. Neurol. 12, 703258. 10.3389/fneur.2021.703258 (2021).
  • 19. Tustison, N. J. et al. N4ITK: improved N3 bias correction. IEEE Trans. Med. Imaging 29, 1310–1320. 10.1109/TMI.2010.2046908 (2010).
  • 20. Reinhold, J. C., Dewey, B. E., Carass, A. & Prince, J. L. Evaluating the impact of intensity normalization on MR image synthesis. Proc. SPIE Int. Soc. Opt. Eng. 10.1117/12.2513089 (2019).
  • 21. van Griethuysen, J. J. M. et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 77, e104–e107. 10.1158/0008-5472.CAN-17-0339 (2017).
  • 22. Walter, S. D., Eliasziw, M. & Donner, A. Sample size and optimal designs for reliability studies. Stat. Med. 17, 101–110. 10.1002/(sici)1097-0258(19980115)17:1<101::aid-sim727>3.0.co;2-e (1998).
  • 23. Larrue, V., von Kummer, R. R., Muller, A. & Bluhmki, E. Risk factors for severe hemorrhagic transformation in ischemic stroke patients treated with recombinant tissue plasminogen activator: a secondary analysis of the European-Australasian Acute Stroke Study (ECASS II). Stroke 32, 438–441. 10.1161/01.str.32.2.438 (2001).
  • 24. Nisar, T., Hanumanthu, R. & Khandelwal, P. Symptomatic intracerebral hemorrhage after intravenous thrombolysis: predictive factors and validation of prediction models. J. Stroke Cerebrovasc. Dis. 28, 104360. 10.1016/j.jstrokecerebrovasdis.2019.104360 (2019).
  • 25. Jiang, L. et al. A deep learning-based model for prediction of hemorrhagic transformation after stroke. Brain Pathol. 33, e13023. 10.1111/bpa.13023 (2023).
  • 26. Sobesky, J. et al. Which time-to-peak threshold best identifies penumbral flow? A comparison of perfusion-weighted magnetic resonance imaging and positron emission tomography in acute ischemic stroke. Stroke 35, 2843–2847. 10.1161/01.STR.0000147043.29399.f6 (2004).
  • 27. Schumann, P. et al. Evaluation of the ratio of cerebral blood flow to cerebral blood volume as an index of local cerebral perfusion pressure. Brain 121(Pt 7), 1369–1379. 10.1093/brain/121.7.1369 (1998).
  • 28. Xu, Q. et al. Clinical features and FLAIR radiomics nomogram for predicting functional outcomes after thrombolysis in ischaemic stroke. Front. Neurosci. 17, 1063391. 10.3389/fnins.2023.1063391 (2023).
  • 29. Jha, R. et al. Fluid-attenuated inversion recovery hyperintensity correlates with matrix metalloproteinase-9 level and hemorrhagic transformation in acute ischemic stroke. Stroke 45, 1040–1045. 10.1161/STROKEAHA.113.004627 (2014).
  • 30. Zhang, Y. et al. Constructing machine learning models based on non-contrast CT radiomics to predict hemorrhagic transformation after stoke: a two-center study. Front. Neurol. 15, 1413795. 10.3389/fneur.2024.1413795 (2024).
  • 31. Meng, Y. et al. Prediction model of hemorrhage transformation in patient with acute ischemic stroke based on multiparametric MRI radiomics and machine learning. Brain Sci. 10.3390/brainsci12070858 (2022).
