Journal of Imaging Informatics in Medicine. 2024 Nov 11;38(4):1950–1962. doi: 10.1007/s10278-024-01318-0

Robust Radiomics Models for Predicting HIFU Prognosis in Uterine Fibroids Using SHAP Explanations: A Multicenter Cohort Study

Huan Liu 1, Jincheng Zeng 1, Chen Jinyun 1, Xiaohua Liu 2, Yongbin Deng 3, Chenghai Li 1,4, Faqi Li 1
PMCID: PMC12343387  PMID: 39528886

Abstract

This study sought to develop and validate machine learning (ML) models that use non-contrast MRI radiomics to predict the nonperfusion volume ratio (NPVR) achieved by high-intensity focused ultrasound (HIFU) treatment of uterine fibroids, equipping clinicians with an early prediction tool for decision-making. We retrospectively analyzed 221 patients with uterine fibroids who received HIFU treatment and were divided into a training set (N = 117), an internal validation set (N = 49), and an external test set (N = 55). A total of 851 radiomics features were extracted from T2-weighted imaging (T2WI), and max-relevance and min-redundancy (mRMR) and least absolute shrinkage and selection operator (LASSO) regression were applied for feature selection. ML models were constructed using logistic regression (LR), decision tree (DT), random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost), and light gradient boosting machine (LGBM). These models underwent internal and external validation, and the best model’s feature importance was assessed via the Shapley additive explanations (SHAP) method. Four significant non-contrast MRI radiomics features were identified. The SVM model outperformed the others in both internal and external validation, with AUCs of 0.860, 0.847, and 0.777 in the training, internal validation, and external test sets, respectively. SHAP analysis highlighted five critical predictors of postoperative NPVR, comprising two radiomics features from non-contrast MRI and three clinical indicators. The SVM model combining radiomics features and clinical parameters effectively predicts NPVR after HIFU, which may enable timely and effective intervention.

Supplementary Information

The online version contains supplementary material available at 10.1007/s10278-024-01318-0.

Keywords: Machine learning, Radiomics, Uterine fibroid, NPVR, SHAP

Introduction

Uterine fibroids are the most common benign tumors in women of reproductive age. Their symptoms, which include abnormal uterine bleeding, pelvic pressure, and infertility, significantly impair quality of life [1]. Conventional treatments for fibroids include hysterectomy, myomectomy, and uterine artery embolization (UAE), but these treatments are invasive. High-intensity focused ultrasound (HIFU), a minimally invasive treatment option, has been widely used in the management of uterine fibroids [2]. Its therapeutic effect is closely related to the non-perfused volume ratio (NPVR) measured immediately after ablation [3, 4]. However, not all fibroids benefit from HIFU ablation, because highly cellular and highly vascular fibroids hinder the deposition of ultrasound energy [5]. This underscores the importance of accurate preoperative assessment to ensure the efficacy of HIFU treatment.

Magnetic resonance imaging (MRI) provides high-resolution soft tissue imaging, making it an ideal tool for assessing NPVR [6, 7]. NPVR most accurately reflects the volume of coagulative necrosis after HIFU ablation and is an important indicator of treatment success. Most studies on the prognosis of HIFU ablation for uterine fibroids focus on the overall volume change of the fibroid [8]. T2WI has good tissue contrast and clearly shows the boundary and internal structure of fibroids. In addition, T2WI requires no contrast agent, which lowers the economic burden on patients and favors wide adoption. Signal intensity on T2WI is widely used to predict the ablation effect: fibroids with high signal intensity often respond worse to treatment than those with low signal intensity. However, judgments based on T2 signal intensity are subjective and allow only a qualitative evaluation of treatment suitability, making it almost impossible to quantify fibroid heterogeneity.

Radiomics is a noninvasive approach for high-throughput extraction and evaluation of large numbers of quantitative features from medical images [9]. Its strength is the conversion of visual image information into deep-seated data for quantitative analysis [10], which has proven invaluable across various fields, including oncology. Recent advancements further highlight its effectiveness in non-invasively predicting outcomes for patients with uterine fibroids, showcasing its broad applicability and potential in medical prognostication [11–15]. Qin et al. [16] developed a logistic regression model using a dual-sequence MRI radiomics approach and achieved an AUC of 0.841, but did not compare different ML models. Zheng et al. [14, 15] extracted radiomics features from T2-weighted images and ADC maps derived from DWI and achieved promising results with SVM (AUC = 0.857 for a three-tiered partitioning of NPVR and AUC = 0.89 for a two-class prediction); however, they did not conduct external validation. Moreover, the complex and frequently nonlinear connections between the many subtle features identified by radiomics and clinical outcomes pose a substantial analytical challenge. In this context, machine learning (ML), with its ability to decode complex patterns in large and detailed datasets, is crucial for developing an effective predictive model [17]. Popular ML classifiers, such as light gradient boosting machine (LGBM) and extreme gradient boosting (XGBoost), have shown their adaptability in applications from NPVR prediction to predicting outcomes in patients with uterine fibroids [18]. However, there is limited research on ML models that use non-contrast MRI radiomics to predict NPVR in patients with uterine fibroids.

With this background, our study is dedicated to developing and validating an interpretable ML model that uses radiomics features from non-contrast MRI scans. Our objective is to predict the postoperative NPVR of patients with uterine fibroids following HIFU, providing clinicians with a tool for early prediction and enabling prompt intervention.

Materials and Methods

Study Population

This two-center retrospective study was approved by the institutional review boards of the First Affiliated Hospital of Chongqing Medical University, and the requirement for informed consent was waived. A total of 166 patients with uterine fibroids underwent HIFU treatment at the First Affiliated Hospital of Chongqing Medical University from February 2012 to May 2020, and 55 patients were treated at the Chongqing Haifu Hospital between February 2016 and July 2020. The inclusion criteria were as follows: (1) premenopausal or perimenopausal women above 18 years of age; (2) clinically symptomatic uterine leiomyomas with diameters > 3 cm; (3) MRI examinations before and after HIFU therapy; (4) no previous history of surgery or drug treatment. The exclusion criteria were as follows: (1) scar in the acoustic pathway or pubic obstruction; (2) CDFI (Color Doppler Flow Imaging) blood supply grade other than 1 or 2; (3) poor image quality with significant artifacts. Ultimately, the 166 patients from center 1 were randomly stratified into a training cohort (n = 117) and an internal test cohort (n = 49) at a ratio of 7:3, and the 55 patients from center 2 constituted an independent external test cohort.

HIFU Ablation Procedure

A JC-type focused ultrasound tumor treatment system (Chongqing Haifu Medical Technology Co., China) monitored with a color Doppler ultrasound scanner (Mylab 70, Esaote, Italy) was used to treat uterine myomas. The ultrasound transducer had a 0.8 MHz frequency, 20 cm diameter, 400 W power, and a focal field of 1.5 × 1.5 × 8.0 mm at an 18-cm focal length. Preoperatively, lower abdominal skin degassing, enema, and bladder saline instillation were performed. During the procedure, the patient lay prone on the treatment table in contact with degassed water. Emission power and dose delivery were tailored to patient tolerance and myoma appearance on grayscale imaging. Conscious sedation was maintained with fentanyl citrate (0.8–1.0 µg/kg) and midazolam maleate (0.02–0.03 mg/kg) every 30–40 min, while monitoring vital signs.

The volume of the uterine fibroid ($V$) was calculated using the ellipsoid formula [19]: $V = \frac{\pi}{6} \times D_1 \times D_2 \times D_3$, where $D_1$, $D_2$, and $D_3$ represent the maximal longitudinal, anteroposterior, and transverse diameters, respectively. The nonperfusion volume (NPV) was measured using the same volumetric technique. The nonperfusion volume ratio (NPVR), which indicates the effectiveness of the treatment, was calculated as $\mathrm{NPVR} = \mathrm{NPV} / V \times 100\%$. It expresses the percentage of the fibroid that is nonperfused as a result of the treatment, providing a quantitative measure of therapeutic efficacy. All patients were divided into two groups: patients with an ablation rate (NPVR) ≥ 80% were classified as having good treatment outcomes, and patients with an ablation rate < 80% as having bad treatment outcomes.
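
The volume and ratio definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the example diameters are hypothetical and are not taken from the study data.

```python
import math


def ellipsoid_volume(d1, d2, d3):
    """Ellipsoid volume V = pi/6 * D1 * D2 * D3 from three orthogonal diameters."""
    return math.pi / 6.0 * d1 * d2 * d3


def npvr_percent(npv, fibroid_volume):
    """Nonperfusion volume ratio in percent: NPV / V * 100."""
    if fibroid_volume <= 0:
        raise ValueError("fibroid volume must be positive")
    return npv / fibroid_volume * 100.0


def outcome_group(npvr, threshold=80.0):
    """Grouping used in the text: NPVR >= 80% is a good outcome, otherwise bad."""
    return "good" if npvr >= threshold else "bad"


# Hypothetical diameters in mm (illustrative values, not patient data).
fibroid_v = ellipsoid_volume(60.0, 50.0, 55.0)
npv = ellipsoid_volume(55.0, 46.0, 50.0)
ratio = npvr_percent(npv, fibroid_v)
print(round(fibroid_v, 1), round(ratio, 1), outcome_group(ratio))
```

Because NPV and V use the same formula, the factor π/6 cancels in the ratio, so NPVR depends only on the products of the measured diameters.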

MRI Examination and Image Preprocessing

For data acquisition, each patient underwent an MRI examination before and after HIFU ablation. The postoperative MRI examination was performed within 7 days after treatment. Details regarding the T2WI MR image acquisition parameters of two centers are shown in Table S1.

Tumor Segmentation

Imaging data were collected through the picture archiving and communication system (PACS) of the First Affiliated Hospital of Chongqing Medical University and the Chongqing Haifu Hospital, and patients’ preoperative MRI images were exported in DICOM format. First, the N4ITK bias field correction algorithm was applied to all T2WI images to correct intensity inhomogeneity. Then, all MR images were resampled to a voxel size of 1 × 1 × 1 mm³ by linear interpolation to standardize the voxel spacing. The ROIs were delineated by a radiologist with 3 years of experience using the semiautomatic segmentation tools of ITK-SNAP (v3.6.0; http://www.itksnap.org) and were verified by a gynecologic imaging physician with 14 years of experience in gynecologic imaging. A complete schematic is presented in Fig. 1.

Fig. 1. Workflow of this study

Feature Extraction

All handcrafted features were extracted from the regions of interest using the open-source pyradiomics package (http://github.com/radiomics/pyradiomics). The voxel intensity values were discretized using a fixed bin width of 5, and a total of 851 radiomics features were calculated from each VOI in accordance with the Image Biomarker Standardization Initiative (IBSI) [20], including first-order statistics features, shape features, textural features, and transformation features [21]. First-order features describe the distribution of voxel intensities within the lesions in the MR image; shape features describe the geometric and morphological characteristics of the lesions; texture features based on the gray-level co-occurrence matrix (GLCM) and gray-level run length matrix (GLRLM) describe the spatial relationships between voxel pairs at different distances and angles. To enhance intricate patterns invisible to the human eye, advanced filters were employed, including wavelet decompositions with all possible combinations of high (H) or low (L) pass filters in each of the three dimensions (HHH, HHL, HLH, LHH, LLL, LLH, LHL, and HLL).
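
As a rough illustration of the two mechanical steps described above, the sketch below enumerates the eight wavelet sub-bands and applies a fixed-bin-width discretization. The per-class feature counts (14 shape, 18 first-order, and 75 texture features) are an assumption about the default pyradiomics feature set and are not stated in the text; they are included only because they are consistent with the reported total of 851. The `discretize` helper is a hypothetical stand-in, not the pyradiomics implementation.

```python
from itertools import product
import math

# One high (H) or low (L) pass filter per axis gives 2**3 = 8 sub-bands.
subbands = ["".join(p) for p in product("LH", repeat=3)]

# ASSUMPTION: default pyradiomics class sizes (not given in the paper).
shape_features = 14
per_image_features = 18 + 75  # first-order + texture, computed per image type
total_features = shape_features + per_image_features * (1 + len(subbands))


def discretize(values, bin_width=5.0):
    """Fixed-bin-width discretization; bin indices start at 1."""
    offset = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - offset + 1 for v in values]


print(subbands)
print(total_features)
print(discretize([12.0, 14.9, 15.0, 31.2]))
```

With a fixed bin width, the number of bins grows with the intensity range of each lesion, whereas a fixed bin count would rescale every lesion to the same number of gray levels.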

Feature Selection

Three feature selection steps were executed to mitigate overfitting. First, abnormal values were replaced by the median, MRI signal intensities were z-score normalized, and all features were standardized to place them on a common scale before selection. Second, we removed redundant and less relevant features using the minimum-redundancy maximum-relevance (mRMR) method, which minimizes redundancy (features that are highly correlated with each other) and maximizes relevance (features that have high mutual information with the outcome variable). This was achieved by calculating mutual information for each feature pair and retaining the features that contributed the most to the target variable. Third, the optimized feature subset was selected by the least absolute shrinkage and selection operator (LASSO) method with tenfold cross-validation. LASSO penalizes the absolute size of the coefficients, driving some feature coefficients to exactly zero and thus selecting only the most important features. The optimal regularization parameter (lambda) was selected by maximizing the area under the ROC curve across the cross-validation folds. The trade-off between model fit and complexity was measured by the Akaike Information Criterion (AIC); a lower AIC value indicates a better trade-off between model complexity and accuracy. All feature selection procedures were executed on the training cohort, and the selected features were then applied to the internal and external test cohorts.
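
Two ideas in this paragraph lend themselves to a short sketch: estimating standardization parameters on the training cohort only and reusing them for the test cohorts, and the way an L1 penalty sets small coefficients to exactly zero. The code below is a simplified illustration with made-up numbers. `soft_threshold` is the generic shrinkage operator associated with the L1 penalty, not the authors' fitting procedure, and all function names are hypothetical.

```python
import statistics


def fit_zscore(train_column):
    """Estimate mean and SD on the training cohort only."""
    mean = statistics.mean(train_column)
    sd = statistics.pstdev(train_column)
    if sd == 0:
        sd = 1.0  # constant feature: avoid division by zero
    return mean, sd


def apply_zscore(column, mean, sd):
    """Standardize any cohort with the parameters learned on the training cohort."""
    return [(x - mean) / sd for x in column]


def soft_threshold(beta, lam):
    """L1 shrinkage: coefficients with |beta| <= lam become exactly zero."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0


# Made-up feature values for one radiomics feature.
train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]
mean, sd = fit_zscore(train)
z_train = apply_zscore(train, mean, sd)
z_test = apply_zscore(test, mean, sd)

# Made-up coefficients: the two small ones are removed by the penalty.
shrunk = [soft_threshold(b, 0.5) for b in [1.2, -0.3, 0.5, -2.0]]
print(z_test, shrunk)
```

Reusing the training-cohort mean and SD keeps information from the internal and external test cohorts out of the preprocessing step.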

Construction and Analysis of Machine Learning Models

The final selected features were used to construct radiomics models. To identify the classifier with the best recognition performance on the tumor data, we trained six mainstream machine learning algorithms: logistic regression (LR), decision tree (DT), random forest (RF), support vector machine (SVM) [22], light gradient boosting machine (LGBM) [23], and extreme gradient boosting (XGBoost) [24]. The minimum sample split for the DT was 2. The RF model used 50 trees. The SVM used a radial basis function (RBF) kernel, which can handle non-linear data, with hyperparameters tuned via grid search. XGBoost parameters, including a learning rate of 0.02 and a 50-tree ensemble, were optimized through grid search to balance complexity and predictive accuracy. The LGBM used 30 trees, 7 leaves, and a bagging frequency of 4. After deriving each model, we subjected it to internal validation and external testing to assess discrimination, calibration, and clinical applicability. The diagnostic performance of the models was compared using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, precision, and F1-score. The best radiomics model was then selected.
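
The four comparison metrics named above can be computed from labels and predicted scores as in the following minimal sketch. It uses the rank-based definition of AUC (the probability that a positive case receives a higher score than a negative case) and a 0.5 probability cut-off, which is an assumption here since the text does not state the threshold. The labels and scores are invented for illustration.

```python
def auc_score(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and F1 score from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision, "f1": f1}


# Invented example: 1 = NPVR >= 80%, 0 = NPVR < 80%.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # ASSUMED 0.5 cut-off

auc = auc_score(y_true, scores)
metrics = classification_metrics(y_true, y_pred)
print(round(auc, 3), metrics)
```

AUC is threshold-free, while accuracy, precision, and F1 all depend on the chosen cut-off, which is one reason the two kinds of metric can rank models differently.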

After identifying the optimal predictive models, our focus shifted to understanding the contribution of each variable to the prediction. We incorporated the SHAP (Shapley Additive Explanations) [25] methodology to gain a deeper insight into feature importance, emphasizing the most influential variables. Features were ranked by their SHAP values in descending order of influence, pinpointing the key predictors with our patient cohort. To ensure the model’s robustness, we conducted independent internal validation and external tests. This thorough assessment affirmed their discriminative power, calibration, and clinical relevance, offering a well-rounded perspective on the predictive strength of these models.

Statistical Analysis

Statistical evaluations were conducted using R statistical software (version 4.2.1) and Python programming software (version 3.7.1). Continuous variables that exhibited a skewed distribution were presented as median (interquartile range (IQR)) and evaluated with the Mann–Whitney U-test. Variables with normal distribution were presented as mean ± standard deviation (SD) and compared across groups using the independent sample t-test. Categorical data were denoted as a number (percentage) and analyzed using the χ2 test. The Kruskal–Wallis H test was performed to compare classification, signal intensity, and tumor location between each of the three datasets. ROC curves and the areas under the ROC curves (AUCs) were utilized to evaluate the overall performance of NVPR prediction models and conventional radiomics models. The accuracy, precision, and F1-score of the models were also compared and analyzed. The DeLong test was used to compare the AUC values between the two models, and P-value < 0.05 was considered a statistically significant difference.

Results

Patient Summary

Data of 221 patients with uterine leiomyomas were obtained from the hospital management system after strict screening based on the inclusion and exclusion criteria. The patients were divided into three cohorts: 117 in the training cohort, 49 in the internal validation cohort, and 55 in the external test cohort. The proportion of patients with NPVR ≥ 80% was comparable between the training (47.86%, 56/117), internal test (48.98%, 24/49), and external test (56.36%, 31/55) cohorts, with no statistically significant difference (χ2 = 1.1205, P = 0.571; Table S2). Table 1 further shows that clinical characteristics were evenly distributed across the three cohorts, with no significant differences except for rectus abdominis thickness (P = 0.003; all others P > 0.05).
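
The χ2 comparison of outcome proportions across the three cohorts can be checked directly from the counts quoted above. The sketch below computes Pearson's χ2 statistic for the 2 × 3 contingency table; the helper name is hypothetical, and the closed-form P-value used here is valid only for 2 degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Rows: higher-NPVR group, lower-NPVR group.
# Columns: training, internal test, external test cohorts.
counts = [
    [56, 24, 31],
    [117 - 56, 49 - 24, 55 - 31],
]
stat, dof = chi_square_independence(counts)

# For exactly 2 degrees of freedom, the chi-square survival function is exp(-x / 2).
p_value = math.exp(-stat / 2.0)
print(round(stat, 4), dof, round(p_value, 3))
```

Run on these counts, the statistic and P-value agree with the values reported in the text to the stated precision.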

Table 1.

Comparison of clinical characteristics among the training, internal test, and external test cohorts

Variable Training cohort (n = 117) Internal test cohort (n = 49) External test cohort (n = 55) Statistics P-value
Number
  1 86 (73.50%) 40 (81.63%) 41 (74.55%) 1.727 0.582
  > 1 31 (26.50%) 9 (18.37%) 14 (25.45%)
Location
  Anterior 97 (82.91%) 41 (83.67%) 50 (90.91%) - 0.246
  Posterior 11 (9.40%) 6 (12.24%) 5 (9.09%)
  Side 9 (7.69%) 2 (4.08%) 0 (0.00%)
Type
  Intermuscular 90 (76.92%) 32 (65.31%) 32 (58.18%) 8.985 0.061
  Submucosa 9 (7.69%) 4 (8.16%) 10 (18.18%)
  Subserous 18 (15.38%) 13 (26.53%) 13 (23.64%)
CDFI
  1 31 (26.50%) 14 (28.57%) 25 (45.45%) 8.791 0.067
  2 86 (73.50%) 35 (71.43%) 30 (54.55%)
Signal intensity
  Low 40 (34.19%) 18 (36.73%) 18 (32.73%) 1.495 0.827
  Intermediate 42 (35.90%) 16 (32.65%) 16 (29.09%)
  High 35 (29.91%) 15 (30.61%) 21 (38.18%)
Age 40.00 (35.00, 44.30) 42.00 (37.70, 44.00) 39.00 (31.40, 44.00) 2.711 0.258
BMI 22.26 ± 2.39 21.91 ± 2.07 22.17 ± 2.57 0.372 0.69
Rectus abdominis thickness 9.00 (6.82, 11.00) 9.00 (7.74, 10.00) 10.00 (8.68, 12.00) 11.943 0.003*
Fat thickness 16.00 (12.00, 19.65) 14.60 (11.49, 19.00) 17.00 (13.00, 20.00) 1.232 0.54
Abdominal wall thickness 25.00 (22.00, 30.00) 24.00 (18.70, 30.00) 28.00 (25.00, 30.00) 5.324 0.07
Distance anterior myoma skin 40.40 (29.60, 55.73) 42.20 (28.35, 61.81) 43.00 (36.00, 51.80) 1.177 0.555
Distance posterior end of the fibroid skin 94.20 (82.77, 106.12) 97.30 (86.60, 106.52) 99.00 (88.40, 102.00) 0.506 0.776
Ablation volume 53,361.95 (31,322.57, 93,607.13) 67,728.63 (32,637.39, 113,517.71) 71,838.62 (41,880.54, 121,414.18) 4.173 0.124
Size 5.40 (4.37, 6.53) 5.70 (4.75, 6.57) 5.60 (4.76, 6.97) 1.229 0.541

Note: * indicates statistical significance (P < 0.05)

Comparative Clinical Characteristics of Patients with Low and High NPVR in the Training Cohort

Table 2 compares clinical characteristics between patients with low (< 80%) and high (≥ 80%) NPVR in the training cohort. Fibroid size, signal intensity, and location differed significantly between the two groups (all P < 0.05): patients with high NPVR had smaller fibroids, fewer fibroids with high signal intensity, and fewer posteriorly located fibroids. Key clinical parameters were recorded for the development of clinical and combined ML prediction models.

Table 2.

Comparative clinical characteristics of patients with low and high NPVR in the training cohort

Variable NPVR < 80% (N = 61) NPVR ≥ 80% (N = 56) Statistics P-value
Number
  1 45 (73.77%) 41 (73.21%) 0.005 0.946
  > 1 16 (26.23%) 15 (26.79%)
Location
  Anterior 46 (75.41%) 51 (91.07%) - 0.017*
  Posterior 10 (16.39%) 1 (1.79%)
  Side 5 (8.20%) 4 (7.14%)
Type
  Intermuscular 51 (83.61%) 39 (69.64%) - 0.22
  Submucosa 3 (4.92%) 6 (10.71%)
  Subserous 7 (11.48%) 11 (19.64%)
CDFI
  1 13 (21.31%) 18 (32.14%) 2.55 0.279
  2 48 (78.69%) 38 (67.86%)
Signal intensity
  Low 14 (22.95%) 26 (46.43%) 13.82 0.001*
  Intermediate 20 (32.79%) 22 (39.29%)
  High 27 (44.26%) 8 (14.29%)
Age 38.80 ± 6.03 40.23 ± 6.46 −1.238 0.218
BMI 22.55 ± 2.33 21.94 ± 2.44 1.369 0.174
Rectus abdominis thickness 8.80 (7.00, 11.00) 9.00 (6.13, 11.00) 0.246 0.806
Fat thickness 16.80 ± 5.10 15.81 ± 5.36 1.028 0.306
Abdominal wall thickness 25.50 ± 5.46 26.16 ± 7.20 −0.563 0.575
Distance anterior myoma skin 41.40 (29.60, 56.65) 39.20 (26.30, 54.63) 0.832 0.405
Distance posterior end of the fibroid skin 97.61 ± 15.60 92.29 ± 17.44 1.74 0.084
Ablation volume 51,907.96 (30,296.18, 81,513.66) 65,837.90 (31,898.04, 104,644.29) −0.753 0.451
Size 5.43 (4.73, 6.75) 5.30 (4.08, 6.29) 1.992 0.046*

Note: * indicates statistical significance (P < 0.05)

Radiomics Analysis

In the training cohort, we extracted and normalized 851 radiomics features from each baseline non-contrast MRI image. These were narrowed down to 70 potential predictors by the mRMR algorithm. From these, a LASSO logistic regression model with the Akaike information criterion (AIC) identified 4 optimal features associated with postoperative NPVR, each with a non-zero coefficient (Fig. 2).

Fig. 2. Radiomics feature selection using LASSO logistic regression. a LASSO coefficient distribution of the 70 radiomics features. b Selection of the tuning parameter (λ) using tenfold cross-validation via the minimum criteria (λ.min). The optimal λ results in 4 features with nonzero coefficients. LASSO least absolute shrinkage and selection operator, λ penalty regularization parameter

Model Comparison for Postoperative NPVR Prediction

We evaluated predictive models of NPVR in patients with uterine fibroids after HIFU using six ML classifiers: LR, DT, RF, SVM, LGBM, and XGBoost. These classifiers were tested on three distinct datasets: clinical, radiomics, and a combined dataset. Table 3 presents a systematic comparison of these models in the training, internal validation, and external test sets, and their ROC, calibration, and DCA curves are illustrated in Fig. 3 and Figs. S1 and S2. In the internal validation cohort, models integrating clinical and radiomics data (clinical-radiomics models) achieved AUCs ranging from 0.719 to 0.847, generally higher than those of models based solely on clinical data (AUC, 0.583–0.790) or radiomics data (AUC, 0.715–0.785).

Table 3.

Performance of ML classifiers for predicting postoperative NPVR using clinical data, radiomics features, and combined datasets in patients with uterine fibroids in the training, internal validation, and external validation cohorts

Data type ML classifier Training cohort (n = 117) Internal validation (n = 49) External validation (n = 55)
AUC Accuracy Precision F1 score AUC Accuracy Precision F1 score AUC Accuracy Precision F1 score
Clinical data LR 0.705 0.692 0.679 0.679 0.752 0.714 0.708 0.708 0.720 0.691 0.769 0.702
DT 0.999 0.983 1.0 0.982 0.583 0.571 0.579 0.512 0.610 0.545 0.65 0.510
RF 0.999 0.983 1.0 0.982 0.69 0.592 0.591 0.565 0.605 0.509 0.583 0.509
SVM 0.770 0.735 0.705 0.735 0.79 0.735 0.72 0.735 0.698 0.673 0.741 0.690
LGBM 0.798 0.744 0.710 0.746 0.717 0.633 0.625 0.625 0.681 0.582 0.654 0.596
XGBoost 0.870 0.769 0.754 0.761 0.762 0.694 0.714 0.667 0.667 0.545 0.636 0.528
Radiomics LR 0.787 0.709 0.672 0.717 0.767 0.673 0.682 0.652 0.715 0.655 0.688 0.698
DT 1.000 1.000 1.000 1.000 0.715 0.714 0.692 0.720 0.658 0.673 0.686 0.727
RF 1.000 1.000 1.000 1.000 0.752 0.694 0.737 0.651 0.688 0.636 0.667 0.688
SVM 0.778 0.701 0.662 0.711 0.775 0.673 0.667 0.667 0.703 0.636 0.667 0.688
LGBM 0.929 0.889 0.877 0.885 0.728 0.673 0.682 0.652 0.688 0.636 0.677 0.677
XGBoost 0.939 0.863 0.870 0.855 0.785 0.714 0.750 0.682 0.701 0.618 0.679 0.644
Combined LR 0.829 0.735 0.727 0.721 0.822 0.816 0.826 0.809 0.769 0.636 0.667 0.688
DT 1.000 1.000 1.000 1.000 0.736 0.735 0.704 0.745 0.640 0.636 0.704 0.655
RF 1.000 1.000 1.000 1.000 0.783 0.714 0.750 0.682 0.719 0.618 0.656 0.667
SVM 0.860 0.769 0.774 0.752 0.847 0.816 0.826 0.809 0.777 0.691 0.733 0.721
LGBM 0.950 0.889 0.877 0.885 0.719 0.633 0.650 0.591 0.698 0.636 0.677 0.677
XGBoost 0.958 0.889 0.906 0.881 0.770 0.714 0.750 0.682 0.715 0.618 0.708 0.618

Fig. 3. Comparative analysis of ML classifiers, namely logistic regression, decision tree, random forest, SVM, LGBM, and XGBoost, across different data types in the training, internal validation, and external validation cohorts. a–c The performance of these ML classifiers on clinical data. d–f The performance of these classifiers with radiomics features. g–i Their performance using combined clinical and radiomics data

Among the clinical-radiomics models, the SVM did not achieve the highest AUC in the training cohort (0.860), but it showed the strongest discriminative and generalization ability, with AUCs of 0.847 and 0.777 in the internal validation and external test sets, respectively. It also exhibited superior calibration, especially around the 40% threshold (Fig. S1). This evidence positions the SVM as the optimal model for predicting postoperative NPVR (Fig. 4). Its consistently strong accuracy, precision, and F1 score underscore its effectiveness. The DCA curve further affirmed the model’s effectiveness, highlighting its substantial net benefits (Fig. S2). These findings support the SVM model’s potential as a valuable predictive tool for postoperative NPVR in clinical settings.

Fig. 4. Confusion matrix of the SVM model for predicting postoperative NPVR degree in patients with uterine fibroids in the training (a), internal validation (b), and external validation (c) cohorts

Interpretation of the Model

The SHAP analysis was utilized to decipher the SVM model, quantifying the impact of each feature. By computing the absolute mean SHAP values, it facilitated the prioritization of features based on their importance. Notably, four radiomics features from baseline non-contrast MRI scans and three clinical variables emerged as the most significant influencers in the model (Fig. 5a). A summary plot illustrated the collective impact of these features, represented through their SHAP values (Fig. 5b). This visualization provided comprehensive insights into how each feature contributes to the prediction for individual patients. Importantly, higher values of these top seven features correlated with a greater prediction of NVPR degree in patients with uterine fibroids following HIFU treatment.
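
To make the ranking by mean absolute SHAP value concrete, the sketch below computes exact Shapley values by enumerating feature subsets for a small toy function and then ranks features by their mean absolute contribution. This is an illustration of the underlying definition only: it is not the SHAP library, the toy model is not the fitted SVM, and the feature names, baseline, and samples are hypothetical. Exact enumeration is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi


def mean_abs_ranking(model, samples, baseline, names):
    """Rank features by mean absolute Shapley value across samples."""
    sums = [0.0] * len(names)
    for x in samples:
        for i, v in enumerate(shapley_values(model, x, baseline)):
            sums[i] += abs(v)
    scores = {name: s / len(samples) for name, s in zip(names, sums)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def toy_model(z):
    """Hypothetical scoring function with one interaction term."""
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]


names = ["feature_a", "feature_b", "feature_c"]
baseline = [0.0, 0.0, 0.0]
samples = [[1.0, 2.0, 1.0], [0.5, -1.0, 2.0]]
ranking = mean_abs_ranking(toy_model, samples, baseline, names)
print(ranking)
```

For each sample, the Shapley values sum to the difference between the model output for that sample and the output at the baseline, which is the additivity property that SHAP summary plots rely on.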

Fig. 5. SHAP analysis of the SVM model for predicting postoperative NPVR degree in patients with uterine fibroids. a The ranking of feature significance as determined by absolute mean SHAP values. b A summary plot incorporating SHAP values, providing a comprehensive visualization of the cumulative influence of each feature. SHAP Shapley additive explanation, SVM support vector machine

Subgroup Analysis of Signal Intensity

The predictive value of the radiomics machine learning model in different signal intensity subgroups is illustrated in Fig. 6. The box plots display the distribution of SVM-predicted probabilities for patients with a non-perfused volume ratio (NPVR) greater than 80% versus less than 80% in each of the three signal intensity groups (low, intermediate, and high). All P-values were lower than 0.05, indicating that the predicted probabilities differed significantly between the two outcome groups in every subgroup. The ROC curves further demonstrate the model’s ability to predict NPVR > 80% across all signal intensity levels, with the highest predictive performance observed in the low signal intensity group. The significant P-values and reasonable AUC values collectively suggest that the model remains effective in predicting NPVR outcomes across signal intensity subgroups.

Fig. 6. The predictive value of the radiomics machine learning model in different signal intensity subgroups. a The distribution of SVM model prediction probabilities in the low, intermediate, and high signal intensity groups; the differences between NPVR outcome groups are significant. b The radiomics model has predictive value in all three groups (low, intermediate, and high signal intensity)

Discussion

In the present study, we focused on improving predictive models of NPVR in patients treated with HIFU, employing six ML classifiers and analyzing both clinical and non-contrast MRI radiomics data. Our assessment of discriminative capacity, calibration, and clinical applicability established the SVM model, which combines non-contrast MRI radiomics with clinical data, as the superior choice. Incorporating SHAP analysis improved the SVM model’s interpretability, emphasizing crucial clinical and radiomics predictors of NPVR. This method combines non-contrast MR radiomics and clinical data through ML to predict NPVR, promoting personalized clinical interventions that could improve patient outcomes.

For uterine fibroids, choosing the proper treatment option is important to the patient’s prognosis. High-intensity focused ultrasound (HIFU) is a non-invasive treatment option that uses focused ultrasound waves to ablate fibroid tissue. However, the effectiveness of HIFU can be influenced by factors such as attenuation along the acoustic pathway and heterogeneous acoustic absorption within the target area, so the extent of benefit differs between patients. T2-weighted imaging [26] is an essential tool in the comprehensive evaluation of uterine fibroids. Its ability to provide detailed and high-contrast images of fibroid anatomy, composition, and associated degenerative changes greatly enhances diagnostic accuracy and treatment planning. By incorporating AI and machine learning into the treatment planning process, clinicians can better tailor HIFU therapy to each patient’s unique anatomical and physiological characteristics.

In our study, we preferred ML models for their ability to navigate complex non-linear relationships between variables and outcomes, outperforming traditional linear predictive approaches [27]. We tested six ML models on both clinical and radiomics data and found that nearly all achieved satisfactory calibration and clinical utility, although they varied considerably in discriminative ability. Notably, the combined clinical-radiomics ML models (training AUC, 0.829–1.00) proved most effective in predicting NPVR. This superiority likely originates from the combination of clinical and radiomics features, which offers a wider analytical foundation than clinical or radiomics data alone. Moreover, using only non-enhanced T2WI, the combined SVM model achieved AUCs of 0.860 and 0.847 in the training and internal validation cohorts, which is comparable to previous research. The observed differences in predictive accuracy can be attributed to these variations in data integration, highlighting the potential of combined clinical-radiomics ML models to refine clinical decision-making and patient outcomes in the context of HIFU treatment for uterine fibroids.

The selected features are texture features computed before and after wavelet transform, which reflect the relative spatial positions of gray levels and capture subtle changes within images to quantify intratumor heterogeneity [28]. “original_gldm_SmallDependenceHighGrayLevelEmphasis” measures the joint distribution of small dependence with higher gray-level values. Wavelet transform, which decomposes image signals at different frequency scales, is useful for revealing subtle but important texture information that observers may overlook. The radiomics features “wavelet.HHL_glcm_SumAverage,” “wavelet.LLL_glszm_ZoneEntropy,” and “wavelet.LLH_glszm_SmallAreaHighGrayLevelEmphasis” were significantly associated with NPVR, suggesting that these features reflect the degree of NPVR.

Within our selection of ML models, SVM emerged as the most efficacious clinical-radiomics model, demonstrating high accuracy even during external validation. To tackle the interpretability challenges inherent in complex ML models, we used the SHAP methodology. This technique elucidates the decision-making process at the cohort level and, supported by intuitive visualizations, allows a nuanced understanding of how individual variables influence predictions, thus building trust in AI among clinicians [29]. It identified five principal predictors of NVPR degree: two radiomics features from non-contrast MRI scans and three clinical factors. The significance of the non-contrast MRI radiomics features was anticipated given their correlation with NVPR degree. These features provide a more comprehensive and objective assessment than traditional imaging alone. While the biological correlates of certain texture features may seem abstract at first glance, these characteristics are instrumental in delineating the complex nature of uterine fibroids beyond basic parameters like shape and volume.
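The idea behind SHAP can be sketched with an exact, brute-force Shapley computation on a tiny model. This is a simplified illustration, not the SHAP library or the analysis performed here: features absent from a coalition are replaced by a single baseline value rather than averaged over a background dataset, and the scoring function `toy_score`, its coefficients, and the feature meanings are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attribution of predict(instance) - predict(baseline)."""
    n = len(instance)

    def value(subset):
        # Features in the coalition take the instance value, the rest the baseline.
        mixed = [instance[k] if k in subset else baseline[k] for k in range(n)]
        return predict(mixed)

    phis = []
    for i in range(n):
        others = [k for k in range(n) if k != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi += weight * (value(members | {i}) - value(members))
        phis.append(phi)
    return phis


def toy_score(x):
    # Hypothetical model (assumption): x[0] a radiomics feature, x[1] fibroid
    # size, x[2] a signal-intensity grade; coefficients are made up.
    return 0.5 * x[0] - 0.2 * x[1] + 0.3 * x[0] * x[2]


instance = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phis = shapley_values(toy_score, instance, baseline)
print(phis)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split equally between the two features involved. Enumerating all coalitions is exponential in the number of features, so practical tools rely on approximations.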

Additionally, size, signal intensity, and tumor location were confirmed as crucial clinical predictors, consistent with evidence that signal intensity variability impacts HIFU prognosis [15, 26]. Together with SHAP, SVM offers detailed insight into the impact of variables on outcomes, which is valuable for predicting postoperative NVPR and for strengthening the role of ML in clinical decision-making and improving patient outcomes. In the subgroup analysis by signal intensity, the radiomics signature of the combined model differed significantly between NVPR degrees across the signal-intensity subgroups (P < 0.05), indicating good predictive value.

This study has several limitations. Firstly, our investigation used a retrospective design; prospective studies are warranted to ensure the generalizability and validity of the ML model. Secondly, the clinical relevance of these AI-generated features might be challenging to interpret; however, advancements in radiomics and visualization tools are bridging this gap, enhancing our understanding and integration of these technologies into clinical practice. Thirdly, the limited sample size may have introduced selection bias, underscoring the need for further studies with larger cohorts to corroborate the predictive potential of our findings. Efforts to address these shortcomings are underway.

Conclusion

In conclusion, our evaluation identified the SVM model, integrating non-contrast MRI radiomics and clinical data, as the most effective ML approach for predicting postoperative NVPR degree in patients with uterine fibroids. This combination of non-contrast MRI radiomics with clinical data through ML, particularly our refined SVM model, promises to enhance the accuracy of postoperative NVPR degree prediction. By supporting clinicians in selecting patients suited to HIFU treatment, this development could lead to early, personalized interventions.

Supplementary Information

Below is the link to the electronic supplementary material.

Funding

This research was supported by the Chongqing Talent Program Project (CSTC2021ycjh-bgzxm0068), Innovative Research Group Project of the National Natural Science Foundation of China (82151319), Chongqing Natural Science Foundation Innovation and Development Joint Fund Project (CSTB2022NSCQ-LZX0028), Chongqing Medical University Major Scientific and Technological Innovation Project, and the Open Research Project of the Key Laboratory of Quality Evaluation for Ultrasonic Surgical Equipment (SMDTKL-2023-2-01).

Data Availability

The images and clinical data used in this study are not publicly available due to privacy restrictions. These data contain sensitive information related to research participants and therefore cannot be released. However, we are willing to provide aggregated summaries and analysis results upon request. For specific data needs, please contact the corresponding author, Faqi Li, for further information.

Declarations

Conflict of Interest

The authors declare no competing interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Chenghai Li, Email: uslichenghai@cqmu.edu.cn.

Faqi Li, Email: lifq@cqmu.edu.cn.

References

  • 1. Marsh EE, Wegienka G, Williams DR. Uterine Fibroids. JAMA. 2024;331(17):1492-1493. doi:10.1001/jama.2024.0447
  • 2. Li F, Chen J, Yin L, et al. HIFU as an alternative modality for patients with uterine fibroids who require fertility-sparing treatment. Int J Hyperthermia. 2023;40(1):2155077. doi:10.1080/02656736.2022.2155077
  • 3. Lyon PC, Rai V, Price N, Shah A, Wu F, Cranston D. Ultrasound-Guided High Intensity Focused Ultrasound Ablation for Symptomatic Uterine Fibroids: Preliminary Clinical Experience. Ultraschall Med. 2020;41(5):550-556. doi:10.1055/a-0891-0729
  • 4. Verpalen IM, de Boer JP, Linstra M, et al. The Focused Ultrasound Myoma Outcome Study (FUMOS); a retrospective cohort study on long-term outcomes of MR-HIFU therapy. Eur Radiol. 2020;30(5):2473-2482. doi:10.1007/s00330-019-06641-7
  • 5. Laughlin-Tommaso S, Barnard EP, AbdElmagied AM, et al. FIRSTT study: randomized controlled trial of uterine artery embolization vs focused ultrasound surgery. Am J Obstet Gynecol. 2019;220(2):174.e1-174.e13. doi:10.1016/j.ajog.2018.10.032
  • 6. Hectors SJCG, Jacobs I, Moonen CTW, Strijkers GJ, Nicolay K. MRI methods for the evaluation of high intensity focused ultrasound tumor treatment: Current status and future needs. Magn Reson Med. 2016;75(1):302-317. doi:10.1002/mrm.25758
  • 7. Kim YS, Lee JW, Choi CH, et al. Uterine Fibroids: Correlation of T2 Signal Intensity with Semiquantitative Perfusion MR Parameters in Patients Screened for MR-guided High-Intensity Focused Ultrasound Ablation. Radiology. 2016;278(3):925-935. doi:10.1148/radiol.2015150608
  • 8. Zhang J, Yang C, Gong C, Zhou Y, Li C, Li F. Magnetic resonance imaging parameter-based machine learning for prognosis prediction of high-intensity focused ultrasound ablation of uterine fibroids. Int J Hyperthermia. 2022;39(1):835-846. doi:10.1080/02656736.2022.2090622
  • 9. Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology. 2016;278(2):563-577. doi:10.1148/radiol.2015151169
  • 10. Zhang YP, Zhang XY, Cheng YT, et al. Artificial intelligence-driven radiomics study in cancer: the role of feature engineering and modeling. Mil Med Res. 2023;10(1):22. doi:10.1186/s40779-023-00458-8
  • 11. Qin S, Jiang Y, Wang F, Tang L, Huang X. Development and validation of a combined model based on dual-sequence MRI radiomics for predicting the efficacy of high-intensity focused ultrasound ablation for hysteromyoma. Int J Hyperthermia. 2023;40(1):2149862. doi:10.1080/02656736.2022.2149862
  • 12. Zhou Y, Zhang J, Chen J, et al. Prediction using T2-weighted magnetic resonance imaging-based radiomics of residual uterine myoma regrowth after high-intensity focused ultrasound ablation. Ultrasound Obstet Gynecol. 2022;60(5):681-692. doi:10.1002/uog.26053
  • 13. Wei C, Li N, Shi B, et al. The predictive value of conventional MRI combined with radiomics in the immediate ablation rate of HIFU treatment for uterine fibroids. Int J Hyperthermia. 2022;39(1):475-484. doi:10.1080/02656736.2022.2046182
  • 14. Zheng Y, Chen L, Liu M, Wu J, Yu R, Lv F. Prediction of Clinical Outcome for High-Intensity Focused Ultrasound Ablation of Uterine Leiomyomas Using Multiparametric MRI Radiomics-Based Machine Leaning Model. Front Oncol. 2021;11:618604. doi:10.3389/fonc.2021.618604
  • 15. Zheng Y, Chen L, Liu M, Wu J, Yu R, Lv F. Nonenhanced MRI-based radiomics model for preoperative prediction of nonperfused volume ratio for high-intensity focused ultrasound ablation of uterine leiomyomas. Int J Hyperthermia. 2021;38(1):1349-1358. doi:10.1080/02656736.2021.1972170
  • 16. Qin S, Jiang Y, Wang F, Tang L, Huang X. Development and validation of a combined model based on dual-sequence MRI radiomics for predicting the efficacy of high-intensity focused ultrasound ablation for hysteromyoma. Int J Hyperthermia. 2023;40(1):2149862. doi:10.1080/02656736.2022.2149862
  • 17. Akpinar E, Bayrak OC, Nadarajan C, Müslümanoğlu MH, Nguyen MD, Keserci B. Role of machine learning algorithms in predicting the treatment outcome of uterine fibroids using high-intensity focused ultrasound ablation with an immediate nonperfused volume ratio of at least 90. Eur Rev Med Pharmacol Sci. 2022;26(22):8376-8394. doi:10.26355/eurrev_202211_30373
  • 18. Li C, He Z, Lv F, et al. An interpretable MRI-based radiomics model predicting the prognosis of high-intensity focused ultrasound ablation of uterine fibroids. Insights Imaging. 2023;14(1):129. doi:10.1186/s13244-023-01445-2
  • 19. Keserci B, Duc NM. Magnetic resonance imaging features influencing high-intensity focused ultrasound ablation of adenomyosis with a nonperfused volume ratio of ≥90% as a measure of clinical treatment success: retrospective multivariate analysis. Int J Hyperthermia. 2018;35(1):626-636. doi:10.1080/02656736.2018.1516301
  • 20. Zwanenburg A, Vallières M, Abdalah MA, et al. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology. 2020;295(2):328-338. doi:10.1148/radiol.2020191145
  • 21. van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res. 2017;77(21):e104-e107. doi:10.1158/0008-5472.CAN-17-0339
  • 22. Hearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B. Support vector machines. IEEE Intelligent Systems and their Applications. 1998;13(4):18-28. doi:10.1109/5254.708428
  • 23. Ke G, Meng Q, Finley T, et al. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In: Advances in Neural Information Processing Systems. Vol 30. Curran Associates, Inc.; 2017. Accessed June 26, 2024. https://proceedings.neurips.cc/paper_files/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html
  • 24. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’16. Association for Computing Machinery; 2016:785-794. doi:10.1145/2939672.2939785
  • 25. Lundberg SM, Lee SI. A Unified Approach to Interpreting Model Predictions. In: Advances in Neural Information Processing Systems. Vol 30. Curran Associates, Inc.; 2017. Accessed June 26, 2024. https://proceedings.neurips.cc/paper_files/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html
  • 26. Gong C, Lin Z, Lv F, Zhang L, Wang Z. Magnetic resonance imaging parameters in predicting the ablative efficiency of high-intensity focused ultrasound for uterine fibroids. Int J Hyperthermia. 2021;38(1):523-531. doi:10.1080/02656736.2021.1904152
  • 27. Uddin S, Khan A, Hossain ME, Moni MA. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inform Decis Mak. 2019;19:281. doi:10.1186/s12911-019-1004-8
  • 28. Aerts HJWL, Velazquez ER, Leijenaar RTH, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun. 2014;5:4006. doi:10.1038/ncomms5006
  • 29. Nohara Y, Matsumoto K, Soejima H, Nakashima N. Explanation of machine learning models using shapley additive explanation and application for real data in hospital. Comput Methods Programs Biomed. 2022;214:106584. doi:10.1016/j.cmpb.2021.106584



Articles from Journal of Imaging Informatics in Medicine are provided here courtesy of Springer
