Highlights
• We developed and externally validated a novel deep learning model called ConvXGB for predicting recurrence risk in patients with cervical cancer using multiparametric DCE-MRI images.
• The deep learning model can be used not only to predict the risk of recurrence but also to stratify recurrence-free survival and overall survival after surgery and to guide personalized precision therapy, showing great potential as a clinical adjunct.
• The ConvXGB model outperformed the ConvXGB&clin model, the clinical model, the radiomics nomogram and the existing histology-specific tool, highlighting the validity of the novel network architecture.
Keywords: Cervical cancer; Recurrence-free survival; Overall survival; Deep learning; MRI scan
Abstract
Background
Accurate estimation of recurrence risk for cervical cancer plays a pivotal role in making individualized treatment plans. We aimed to develop and externally validate an end-to-end deep learning model for predicting recurrence risk in cervical cancer patients following surgery by using multiparametric MRI images.
Methods
The clinicopathologic data and multiparametric MRI images of 406 cervical cancer patients from three institutions were collected. We designed a novel deep learning model called “ConvXGB” for predicting recurrence risk by combining a convolutional neural network (CNN) with eXtreme Gradient Boosting (XGBoost). The predictive performance of the ConvXGB model was evaluated using the time-dependent area under the curve (AUC) and compared with that of the deep learning radio-clinical model, the clinical model, a conventional radiomics nomogram and an existing histology-specific tool. The potential of the ConvXGB model for predicting recurrence-free survival (RFS) and overall survival (OS) was also assessed.
Results
The ConvXGB model outperformed the other models in predicting recurrence risk, with AUCs for 1- and 3-year RFS of 0.872 (95% CI, 0.857–0.906) and 0.882 (95% CI, 0.860–0.904), respectively, in the test cohort. This model showed better discrimination, calibration and clinical utility. Grad-CAM analysis was adopted to help clinicians better understand the predictive results. Moreover, Kaplan–Meier survival analysis revealed that patients stratified into the high-risk group by the ConvXGB model had significantly higher cumulative recurrence rates and worse outcomes.
Conclusion
The ConvXGB model allowed for predicting postoperative recurrence risk in cervical cancer patients and for stratifying the risk of RFS and OS.
Introduction
Cervical cancer is the fourth most frequently diagnosed cancer and the fourth leading cause of cancer death in women [1,2]. Despite notable strides in cervical cancer prevention and detection, such as human papillomavirus vaccination and early screening technologies, there were an estimated 604,000 new cases and 342,000 deaths worldwide in 2020 [3]. Radical surgery is the primary treatment for women, but tumor recurrence poses a formidable challenge in clinical practice, profoundly affecting quality of life and long-term survival [4–7]. Accurate prediction of tumor recurrence risk plays a pivotal role in making individualized treatment plans and optimizing patient management [8].
Identifying patients at high risk of recurrence is critical for deciding whether further aggressive treatments are needed [9,10]. Relying solely on clinicians' experience to estimate postoperative recurrence risk may introduce bias. Several predictive tools have been proposed in previous studies, but they have largely been restricted to clinicopathologic variables and their predictive performance was unsatisfactory. Intratumor heterogeneity was not considered in some studies [11]. For example, to provide a more linear assessment of recurrence risk, Levinson et al. developed histology-specific nomograms, and the tool has been recommended by the NCCN Clinical Practice Guidelines in Oncology (https://nccn.medlive.cn/index) for predicting the 3-year recurrence risk of cervical cancer [12,13]. However, these nomograms focused mainly on clinicopathological data and did not investigate the potential value of imaging information from tumor regions. A single-center study proposed a transformer network for predicting recurrence risk in patients with locally advanced cervical cancer [14]. The proposed transformer network had a low AUC value of 0.819, which limits its clinical application. Zhang et al. developed a deep learning (DL) radiomics nomogram to predict recurrence risk in early-stage cervical cancer by using MRI images [15]. However, the sample size of that study was relatively small, and the median follow-up time was only 23.8 months, so longer follow-up is needed to validate it. Consequently, a more reliable predictive model for assessing the risk of cervical cancer recurrence is urgently needed to guide clinical decision-making.
Radiomics, a well-validated method, provides new insights for disease diagnosis and prognosis prediction by converting extracted radiological features into quantitative parameters [16–18]. Notably, previous studies revealed that radiomics features could reflect tumor heterogeneity and the tumor microenvironment [19–21]. DL, particularly convolutional neural networks (CNNs), has emerged as a powerful tool in the medical field [22,23]. DL signatures derived from CNNs have demonstrated remarkable capabilities in image discrimination [24,25] because they can capture complex hierarchical representations of input images. Despite this progress, no studies have yet reported successful estimation of the recurrence risk of cervical cancer using DL signatures extracted from dynamic contrast‑enhanced MRI (DCE-MRI) images across multiple institutions.
Overall, our study aimed to develop and validate a novel DL model for predicting recurrence risk of patients with cervical cancer using multiparametric DCE-MRI images. Furthermore, we intended to compare this model with the DL radio-clinical model, clinical model, conventional radiomics nomogram and an existing histology-specific tool.
Methods and Materials
Study design
This retrospective multicenter cohort study is reported in line with the STROCSS criteria [26]. We strictly followed the ethical guidelines of the 1975 Declaration of Helsinki. The study was approved by the Research Ethics Committees (RA-2023–441).
Patients with early-stage cervical cancer treated at three hospitals between January 2018 and June 2023 were recruited (Fig. 1). The inclusion criteria were as follows: (1) age of 18 years or older; (2) radical hysterectomy with pelvic lymph node dissection; (3) cervical cancer confirmed pathologically after surgery, with negative surgical margins; (4) complete clinicopathologic data, follow-up data and DCE-MRI images, including high-resolution T2-weighted imaging with fat suppression (T2WI/FS), diffusion-weighted imaging (DWI) and apparent diffusion coefficient (ADC) maps; and (5) no extracervical metastasis. The exclusion criteria were as follows: (1) a history of anti-tumor therapy such as adjuvant chemotherapy or radiotherapy; (2) other concurrent malignant tumors; and (3) an interval of more than 30 days between MRI examination and surgery. The whole cohort was randomly divided into derivation and test cohorts for model development and validation at a ratio of 7:3, using a random number table.
Fig. 1.
Flow diagram of study design. Abbreviations: MRI, magnetic resonance imaging; CNN, convolutional neural network; XGBoost, eXtreme Gradient Boosting; ConvXGB, a novel architecture combining CNN and XGBoost; ConvXGB&clin, a combined model in which independent clinical features were incorporated into the prediction layer of the ConvXGB; RFS, recurrence-free survival; OS, overall survival; ROC, receiver operating characteristic curve.
Primary outcomes
All individuals were followed up by telephone every 3 months for the first 2 years and every 6 months thereafter. The main follow-up examinations included gynecological examination, blood tests, imaging examinations and other tests as necessary. Recurrence-free survival (RFS) was defined as the period between the date of surgery and the date of recurrence or last contact. Overall survival (OS) was defined as the period between the date of surgery and death or the last follow-up visit.
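Given follow-up durations and event indicators defined as above, survival curves for RFS or OS can be estimated with the Kaplan–Meier product-limit method named in the Statistical analysis section. The following is a minimal sketch for illustration only; the function name `kaplan_meier` and the toy data in the usage note are assumptions, not the authors' code.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up durations (e.g. months from surgery)
    events : 1 if the event (recurrence or death) occurred, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        n_events = n_removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            # Product-limit update: multiply by the fraction surviving at t.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve
```

For example, `kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])` steps down at months 1, 2 and 3; censored patients leave the risk set without lowering the curve.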
Image acquisition
All patients underwent pelvic MRI examination before surgery, which included T2WI/FS and DWI with two b values (0 and 800 s/mm²). ADC maps were automatically generated using a mono-exponential decay model. All images were retrieved from the picture archiving and communication system. All patients were instructed to fast for 4 hours before the examination. Detailed MRI scan parameters for each institution are listed in Supplementary S1.
Segmentation of region of interest
3D Slicer software (version 4.11.0) was used to segment the regions of interest (ROIs) on axial T2WI and DWI images [27]. Two experts, blinded to clinical information, reviewed and manually labelled all images slice by slice using the “Level Tracing” function of 3D Slicer. Areas comprising air and non-tumor tissues were manually removed. For ADC maps, ROIs were delineated on the region of high signal intensity on DWI images and then copied to the corresponding ADC maps. Disagreements on ROI delineation were settled by discussion.
Data preprocessing
The largest cross-sectional slice of the 3D ROI was selected as the input image for the DL model. The ROI area was extended outward to a square region and cropped. The cropped images were converted to grayscale and resampled to 224 × 224 pixels to form the input channels of the model.
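The cropping and resampling steps can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the helper names (`square_bbox`, `crop`, `resize_nearest`), the centring and border-clamping behaviour, and the nearest-neighbour interpolation are all hypothetical choices, since the text does not specify them. The square covers the whole ROI whenever the ROI fits inside the shorter image dimension.

```python
def square_bbox(mask):
    """Return (top, left, side) of a square around the non-zero mask pixels,
    centred on the ROI and shifted to stay inside the image."""
    n_rows, n_cols = len(mask), len(mask[0])
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no ROI pixels")
    cols = [c for c in range(n_cols) if any(row[c] for row in mask)]
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    side = min(max(height, width), n_rows, n_cols)
    # Extend the shorter dimension outward, then clamp to the image bounds.
    top = rows[0] - (side - height) // 2
    left = cols[0] - (side - width) // 2
    top = max(0, min(top, n_rows - side))
    left = max(0, min(left, n_cols - side))
    return top, left, side


def crop(image, top, left, side):
    """Cut a side x side patch out of a 2D image (list of rows)."""
    return [row[left:left + side] for row in image[top:top + side]]


def resize_nearest(image, size=224):
    """Resample a 2D image to size x size with nearest-neighbour lookup."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]
```

A patch for the network would then be obtained as `resize_nearest(crop(image, *square_bbox(mask)))`.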
Data augmentation on the derivation cohort is essential to improve the generalization ability of models and increase the model robustness. Data augmentation was performed by applying a range of transformations (e.g., flipping, rotation and Gaussian noise) to all images in the derivation cohort. Finally, each image was normalized with z-score in order to obtain a standard normal distribution of the image intensities.
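The augmentation and normalization steps can be illustrated with a short sketch. The flip probability, the restriction of rotations to multiples of 90 degrees, and the noise standard deviation are assumptions chosen for illustration; the text does not report these settings, and the function names are hypothetical.

```python
import random
import statistics


def augment(image, rng):
    """Apply a random horizontal flip, a random 90-degree rotation and
    additive Gaussian noise to a square 2D image (list of rows)."""
    out = [list(row) for row in image]
    if rng.random() < 0.5:                      # assumed flip probability
        out = [row[::-1] for row in out]
    for _ in range(rng.randrange(4)):           # rotate clockwise k * 90 degrees
        out = [list(row) for row in zip(*out[::-1])]
    return [[p + rng.gauss(0.0, 0.01) for p in row] for row in out]  # assumed sigma


def zscore(image):
    """Normalize intensities to zero mean and unit standard deviation."""
    pixels = [p for row in image for p in row]
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)
    if std == 0:
        return [[0.0 for _ in row] for row in image]
    return [[(p - mean) / std for p in row] for row in image]
```

Following the order given in the text, augmentation is applied first and z-score normalization last, for example `zscore(augment(image, random.Random(0)))`.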
Cervical cancer recurrence CNN prediction model
The widely used Residual Network 18 (ResNet18) was adopted to compensate for the limited imaging data in the current study [28,29]. The model was pre-trained on the natural image library ImageNet and then fine-tuned with DCE-MRI images. ResNet18 consists of four stacked blocks, and each convolutional layer is followed by batch normalization and a ReLU activation function.
Generally, a CNN is suitable for handling large amounts of medical image data, and a fully connected (FC) layer is the last layer of the CNN architecture: the output of the preceding pooling layer is flattened into a single column vector that becomes the input of this layer [30,31]. The Softmax function is the transformation function for multi-class prediction and outputs the probability distribution over the categories. Dropout regularization randomly deactivates neurons in the FC layer during training to reduce overfitting. In addition, the Extreme Gradient Boosting (XGBoost) method of Chen and Guestrin is a highly scalable end-to-end tree boosting system [32]. This machine learning technique can be used for classification and regression problems.
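For reference, the Softmax function mentioned above has the following standard form. The symbols are introduced here for illustration only: z is the vector of raw scores produced by the FC layer and K is the number of classes.

```latex
\operatorname{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad i = 1, \dots, K
```

Each output lies between 0 and 1 and the K outputs sum to 1, which is why they can be read as class probabilities.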
In the present study, we devised a novel architecture called “ConvXGB” by combining the CNN (ResNet18) with XGBoost: the FC layer was removed and XGBoost was used as the prediction layer. The network architecture of the proposed model is displayed in Supplementary S2. The architecture has two parts, one for feature learning and one for outcome prediction. The CNN extracts representations of the input data. Because the prediction part requires vector input, a reshape layer converts the tensors output by the preceding layers into the vector required by the next layer. XGBoost then uses these feature vectors to make the final predictions. We used grid search with 5-fold cross-validation to tune the hyperparameters of the proposed architecture: the number of trees, maximum tree depth, learning rate, subsample and colsample_bytree. The ConvXGB model was developed with the optimal hyperparameters. More information about the implementation of the hyperparameter tuning can be found in Supplementary S3.
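The tuning procedure, grid search with 5-fold cross-validation, can be sketched in a library-agnostic way. This is an illustration under stated assumptions, not the authors' code: `fit_and_score` is a hypothetical callback that would train XGBoost on the CNN feature vectors of the training fold and return a validation score, and the values in `PARAM_GRID` are placeholders rather than the search space reported in Supplementary S3.

```python
import itertools
import random


def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for shuffled k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def grid_search(param_grid, fit_and_score, n_samples, k=5):
    """Return (best_params, best_mean_score) over the Cartesian grid.

    fit_and_score(params, train_idx, val_idx) must train a model on the
    training indices and return a validation score (higher is better).
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        scores = [fit_and_score(params, tr, va)
                  for tr, va in kfold_indices(n_samples, k)]
        mean = sum(scores) / len(scores)
        if mean > best_score:
            best_params, best_score = params, mean
    return best_params, best_score


# Placeholder search space over the five hyperparameters named in the text.
PARAM_GRID = {
    "n_estimators": [100, 300],
    "max_depth": [3, 5],
    "learning_rate": [0.05, 0.1],
    "subsample": [0.8, 1.0],
    "colsample_bytree": [0.8, 1.0],
}
```

With this placeholder grid, 32 combinations are each evaluated on 5 folds, so the callback is invoked 160 times.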
Furthermore, we used gradient-weighted class activation mapping (Grad-CAM) to interpret the ConvXGB model's predictions of recurrence or non-recurrence. This method provides insight into the decision-making process of the model. We trained, tuned and tested the ConvXGB model in Python with the TensorFlow library on a workstation with an Intel Core i5–12600KF CPU, an NVIDIA GeForce RTX 4070 Super GPU and 32 GB of RAM.
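For readers unfamiliar with Grad-CAM, its usual formulation is given below. The notation is introduced here for illustration: A^k is the k-th feature map of the last convolutional layer, y^c is the differentiable score for class c, and Z is the number of spatial positions in a feature map. The text does not state which score y^c was differentiated when XGBoost serves as the prediction layer, so this should be read as the general definition rather than the exact procedure used.

```latex
\alpha_k^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A^{k}_{ij}},
\qquad
L^{c}_{\text{Grad-CAM}} = \operatorname{ReLU}\!\left( \sum_{k} \alpha_k^{c} A^{k} \right)
```

The weights α average the gradients over each feature map, and the ReLU keeps only the regions that contribute positively to class c; the resulting map is upsampled and overlaid on the input image.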
Comparison with other models
Clinicopathologic data were collected from the electronic medical records, including age, pathological type, tumor size, lymph node status, depth of tumor invasion, International Federation of Gynecology and Obstetrics (FIGO) stage and lymph-vascular space invasion (LVSI) status [11,33]. Univariate and multivariate Cox regression analyses were used to identify independent risk factors for poor RFS. We constructed a clinical model for predicting tumor recurrence by using the XGBoost algorithm. Subsequently, these independent clinical features were incorporated into the prediction layer of the ConvXGB to establish the combined model (ConvXGB&clin).
Levinson et al. proposed the histology-specific nomograms for predicting 3-year recurrence risk in early-stage cervical cancer [13]. This existing histology-specific tool is presented in Supplementary S4.
Radiomics nomogram construction
To illustrate the advantage of the ConvXGB, we constructed and evaluated a conventional radiomics nomogram. After data preprocessing, 1409 radiomics features were extracted from the 3D ROI using the PyRadiomics package (Python, version 3.1.0). These features were divided into four types: first-order statistics, textural features, wavelet-based features and Laplacian of Gaussian (LoG)-based features. More detailed information on the radiomics score calculation is given in Supplementary S5. Finally, we developed a radiomics nomogram by combining the radiomics score with independent clinicopathologic features.
Statistical analysis
All data analyses were performed with SPSS v26.0, R software v4.0 and Python v3.8.3. All significance tests were two-sided, and P < 0.05 was considered statistically significant. Continuous data are shown as median (interquartile range, IQR). Continuous variables were compared with the Mann–Whitney U test or Student's t test. Categorical variables were compared with the Chi-square test. Univariate and multivariable Cox regression analyses were conducted to identify risk factors for tumor recurrence.
Time-dependent receiver operating characteristic (ROC) curve analysis was performed to evaluate the predictive performance of the models. The DeLong test was used to compare ROC curves. The clinical utility and stability of the models were evaluated by decision curve analysis (DCA) and calibration curve analysis. The Kaplan–Meier method and log-rank test were used to estimate RFS and OS for patients in different risk groups. Moreover, we estimated the sample size by using the pmsampsize package of R software (Cox–Snell R² = 0.28, parameters = 10, prevalence = 0.15), which indicated that at least 294 cases were required for model development [34]. The analysis code is accessible in the GitHub repository (https://github.com/xiaobai1572287/ConvXGB.git).
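The AUC at a fixed time horizon and the net benefit used in decision curve analysis can be sketched as follows. Both functions are simplified illustrations, not the authors' code: `auc_at_time` drops patients censored before the horizon instead of applying censoring weights, and `net_benefit` assumes a binary outcome that is fully observed at the horizon. The function names and toy inputs are hypothetical.

```python
def auc_at_time(times, events, scores, horizon):
    """Cumulative/dynamic AUC at `horizon`, ignoring censoring weights.

    Cases    : patients with the event at or before `horizon`.
    Controls : patients still under follow-up after `horizon`.
    """
    cases = [s for t, e, s in zip(times, events, scores)
             if e == 1 and t <= horizon]
    controls = [s for t, s in zip(times, scores) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Fraction of case-control pairs ranked correctly (ties count one half).
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing it with the treat-all and treat-none strategies.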
Results
Patient characteristics
The workflow chart is illustrated in Fig. 2. A total of 406 eligible patients from three hospitals were recruited. Of these, 324 patients were randomly assigned to the derivation cohort, and the remaining 82 were reserved for model validation.
Fig. 2.
The workflow of model development in the current study. Abbreviations: MRI, magnetic resonance imaging; CNN, convolutional neural network; XGBoost, eXtreme Gradient Boosting; ConvXGB, a novel architecture combining CNN and XGBoost; ConvXGB&clin, a combined model in which independent clinical features were incorporated into the prediction layer of the ConvXGB; T2WI/FS, T2-weighted imaging with fat suppression; ADC, apparent diffusion coefficient; ROI, region of interest; FIGO, International Federation of Gynecology and Obstetrics; ICC, intra- and inter-class correlation coefficients; LASSO, least absolute shrinkage and selection operator.
The details of patient characteristics are shown in Table 1. Median RFS was 40.6 (IQR, 24.2) months for the derivation cohort and 38.3 (IQR, 26.8) months for the test cohort. Median OS was 42.8 (IQR, 22.9) months for the derivation cohort and 41.3 (IQR, 26.5) months for the test cohort.
Table 1.
Patient characteristics.
Characteristics | Derivation cohort (n=324) | Test cohort (n=82) | p value |
---|---|---|---|
Age, year | 54 (9) | 52 (7) | 0.252 |
BMI, kg/m² | 24.6 (4.1) | 25.5 (3.8) | 0.042 |
SCC, ng/ml | 7.52 (9.18) | 6.41 (10.05) | 0.273 |
FIGO stage (2018) | | | 0.341 |
IB1 | 146 (45.1) | 34 (41.5) | |
IB2 | 132 (40.7) | 31 (37.8) | |
IB3 | 7 (2.2) | 3 (3.6) | |
IIA1 | 28 (8.6) | 10 (12.2) | |
IIA2 | 11 (3.4) | 4 (4.9) | |
Histology | | | 0.156 |
Squamous cell carcinoma | 238 (73.5) | 56 (68.3) | |
Adenocarcinoma | 77 (23.8) | 21 (25.6) | |
Adenosquamous carcinoma | 9 (2.7) | 5 (6.1) | |
Tumor size, cm | | | 0.162 |
<2 cm | 166 (51.2) | 40 (48.8) | |
2 to 4 cm | 140 (43.2) | 35 (42.7) | |
>4 cm | 18 (5.6) | 7 (8.5) | |
Depth of invasion | | | |
Superficial <1/3 | 133 (41.1) | 37 (45.1) | 0.409 |
Intermediate >1/3 and <2/3 | 191 (58.9) | 45 (54.9) | |
Deep >2/3 | ? | | |
Lymph node metastases | | | 0.233 |
Positive | 67 (20.7) | 20 (24.4) | |
Negative | 257 (79.3) | 62 (75.6) | |
LVSI | 141 (43.5) | 32 (39.0) | 0.219 |
Surgical approach | | | 0.437 |
Open | 185 (57.1) | 52 (63.4) | |
Minimally invasive surgery | 139 (42.9) | 30 (36.6) | |
Recurrence | 62 (19.1) | 16 (19.5) | 0.538 |
Median RFS time, month | 40.6 (24.2) | 38.3 (26.8) | 0.304 |
Median OS time, month | 42.8 (22.9) | 41.3 (26.5) | 0.427 |
Quantitative values are median (IQR) and categorical variables are n (%).
Abbreviations: BMI, body mass index; FIGO, International Federation of Gynecology and Obstetrics; SCC, squamous cell carcinoma antigen; LVSI, lymphovascular space invasion; RFS, recurrence-free survival; OS, overall survival.
Development and assessment of the predictive models
Univariate Cox regression analysis results are displayed in Table 2. FIGO stage, tumor size, depth of invasion, lymph node metastases, lymphovascular space invasion and surgical approach were significantly associated with tumor recurrence (all p values <0.05). Among these factors, FIGO stage, depth of invasion, lymph node metastases and surgical approach were identified as independent risk factors of tumor recurrence (Table 3). The parameter combination of clinical model is presented in Supplementary S6. We then developed the ConvXGB&clin model by integrating independent clinicopathologic features into the ConvXGB model. Additionally, the radiomics nomogram is presented in Supplementary S7.
Table 2.
Univariate Cox regression analysis for identifying risk factors of cervical cancer recurrence in the derivation cohort.
Variables | Hazard ratio | 95% CI | p value |
---|---|---|---|
Age | | | |
≤60 y | Ref | | |
61–70 y | 1.06 | 0.91–1.26 | 0.415 |
>70 y | 1.17 | 0.95–1.47 | 0.362 |
BMI | 1.02 | 0.85–1.13 | 0.705 |
SCC | 1.16 | 1.04–1.34 | 0.122 |
FIGO stage | | | |
IB1 | Ref | | |
IB2 | 1.67 | 1.39–1.94 | 0.023 |
IB3-IIA2 | 2.23 | 1.87–2.65 | 0.006 |
Histological type | | | |
Squamous cell carcinoma | Ref | | |
Adenocarcinoma | 2.11 | 0.84–3.08 | 0.261 |
Adenosquamous carcinoma | 1.65 | 0.71–2.62 | 0.309 |
Tumor size | | | |
<2 cm | Ref | | |
2 to 4 cm | 1.59 | 1.34–1.85 | 0.037 |
>4 cm | 1.93 | 1.66–2.56 | 0.025 |
Depth of invasion | | | |
Superficial <1/3 | Ref | | |
Intermediate and deep ≥1/3 | 1.48 | 1.32–1.67 | 0.007 |
Lymph node metastases | | | |
Negative | Ref | | |
Positive | 1.49 | 1.25–1.73 | 0.018 |
LVSI | | | |
None | Ref | | |
Present | 1.25 | 1.10–1.42 | 0.042 |
Surgical approach | | | |
Open | Ref | | |
Minimally invasive surgery | 1.62 | 1.35–1.87 | 0.014 |
Abbreviations: BMI, body mass index; SCC, squamous cell carcinoma antigen; FIGO, International Federation of Gynecology and Obstetrics; LVSI, lymphovascular space invasion.
Table 3.
Multivariate Cox regression analysis for identifying independent risk factors of cervical cancer recurrence in the derivation cohort.
Variables | Hazard ratio | 95% CI | p value |
---|---|---|---|
FIGO stage | | | |
IB1 | Ref | | |
IB2 | 1.33 | 1.15–1.56 | 0.041 |
IB3-IIA2 | 2.41 | 1.94–2.70 | <0.001 |
Tumor size, cm | | | |
<2 cm | Ref | | |
2 to 4 cm | 1.25 | 1.17–1.48 | 0.106 |
>4 cm | 1.41 | 1.03–1.67 | 0.058 |
Depth of invasion | | | |
Superficial <1/3 | Ref | | |
Intermediate and deep ≥1/3 | 1.51 | 1.24–1.73 | 0.015 |
Lymph node metastases | | | |
Negative | Ref | | |
Positive | 1.52 | 1.33–1.74 | 0.015 |
LVSI | | | |
None | Ref | | |
Present | 1.12 | 1.04–1.31 | 0.068 |
Surgical approach | | | |
Open | Ref | | |
Minimally invasive surgery | 1.64 | 1.37–1.88 | 0.011 |
Abbreviations: FIGO, International Federation of Gynecology and Obstetrics; LVSI, lymphovascular space invasion.
Time-dependent ROC analysis was used to assess the predictive performance of the models at different time points (Fig. 3A, B). The performance of the ConvXGB model, the ConvXGB&clin model, the clinical model, the radiomics nomogram and the existing histology-specific tool is listed in Table 4. The DeLong test was used to compare the ConvXGB model with the other models.
Fig. 3.
Model evaluation and interpretation. Time-dependent ROC curves for the ConvXGB, ConvXGB&clin, clinical model and radiomics nomogram in the derivation (A) and test cohorts (B) at different time points. (C-D) Representative examples of MRI images and visualization of ConvXGB predictions using Grad-CAM. Calibration plots of the ConvXGB, ConvXGB&clin, clinical model, radiomics nomogram and histology-specific tool for 3-year recurrence-free survival in the derivation (E) and test cohorts (F). (G) Decision curve analysis revealed that the ConvXGB model showed better clinical utility. Abbreviations: AUC, area under the curve; ROI, region of interest; Grad-CAM, gradient-weighted class activation mapping.
Table 4.
Predictive performance of the models for recurrence prediction in patients with cervical cancer.
Model | 1-year RFS, AUC (95% CI) | p value | 3-year RFS, AUC (95% CI) | p value |
---|---|---|---|---|
Derivation cohort | | | | |
ConvXGB | 0.936 (0.899–0.953) | Ref | 0.912 (0.885–0.941) | Ref |
ConvXGB&clin | 0.941 (0.896–0.957) | 0.357 | 0.925 (0.872–0.967) | 0.574 |
Clinical model | 0.632 (0.592–0.701) | <0.001 | 0.619 (0.572–0.699) | <0.001 |
Radiomics nomogram | 0.784 (0.749–0.825) | <0.001 | 0.776 (0.740–0.807) | <0.001 |
Histology-specific nomograms | None | None | 0.762 (0.705–0.816) | <0.001 |
Test cohort | | | | |
ConvXGB | 0.872 (0.857–0.906) | Ref | 0.882 (0.860–0.904) | Ref |
ConvXGB&clin | 0.865 (0.832–0.894) | 0.295 | 0.873 (0.851–0.912) | 0.361 |
Clinical model | 0.605 (0.561–0.688) | <0.001 | 0.613 (0.566–0.684) | <0.001 |
Radiomics nomogram | 0.761 (0.717–0.816) | <0.001 | 0.748 (0.714–0.797) | <0.001 |
Histology-specific nomograms | None | None | 0.756 (0.719–0.795) | <0.001 |
The DeLong test was used to compare AUC values, with the ConvXGB model as the reference. Abbreviations: RFS, recurrence-free survival; AUC, area under the ROC curve; CI, confidence interval.
The model with the highest AUC value in the test cohort was selected as the final model for downstream analysis. The ConvXGB model yielded the best discriminative power. In the test cohort, the ConvXGB model had AUCs for 1- and 3-year RFS of 0.872 (95% CI, 0.857–0.906) and 0.882 (95% CI, 0.860–0.904), respectively (Table 4). The DeLong test revealed that the ConvXGB model significantly outperformed the clinical model, the radiomics nomogram and the existing histology-specific tool (all p values <0.05). The ConvXGB model performed only slightly better than the ConvXGB&clin model in the test cohort, and the difference between the two models was not significant (p value >0.05). Moreover, activation maps were generated, and examples of patients with different recurrence states are presented in Fig. 3C and D to illustrate the actual use of the established ConvXGB model.
Calibration plots of all predictive models for 3-year RFS are displayed in Fig. 3E and F. The ConvXGB model was better calibrated in both cohorts. DCA based on a 3-year time period demonstrated that the ConvXGB and ConvXGB&clin models had a larger net benefit than the other models in the whole cohort (Fig. 3G).
Stratification of recurrence risk
Furthermore, according to the median of the probability output by the ConvXGB model, the whole cohort was divided into low- and high-risk groups. In the derivation and test datasets, 75 and 21 individuals were classified into the high-risk group, and 249 and 61 individuals were classified into the low-risk group, respectively.
Curves of cumulative recurrence probability in the derivation and test datasets are presented in Fig. 4A and B. The curves for the high- and low-risk groups separated clearly, revealing that the ConvXGB model could significantly stratify patients by RFS (log-rank p values < 0.05).
Fig. 4.
Cumulative recurrence probability (A, B) and Kaplan–Meier survival curves for OS (C, D) by the ConvXGB model between high- (red line) and low-risk (blue line) groups from the derivation and test cohort.
Kaplan-Meier survival curves for OS are shown in Fig. 4C and D. There was a significant difference in OS between the two different risk groups stratified by the ConvXGB model (log-rank p values < 0.05).
Discussion
Estimating RFS is important for individualizing therapy in patients with cervical cancer. In this multi-center study, we developed and validated a novel DL network architecture for predicting the recurrence risk of cervical cancer following surgery using multiparametric MRI images. The ConvXGB model showed excellent predictive performance for tumor recurrence, with AUCs for 1- and 3-year RFS of 0.872 (95% CI, 0.857–0.906) and 0.882 (95% CI, 0.860–0.904), respectively, in the test cohort. This model showed good discrimination, calibration and clinical utility. Grad-CAM analysis was adopted to help clinicians better understand the predictive results.
The present study was innovative in the following respects: (1) compared with previous studies, we performed a more comprehensive prediction by analyzing multiparametric MRI images and clinicopathologic data; (2) data were collected from three medical institutions, enhancing the reliability and generalizability of the study; (3) we designed a new DL framework (ConvXGB) for tumor recurrence prediction based on a CNN and the XGBoost of Chen et al., and, to the best of our knowledge, this is the first multicenter study to predict the recurrence risk of cervical cancer with an end-to-end DL model, in which supervised end-to-end training of the CNN greatly simplifies the training process; and (4) we evaluated the predictive performance of the ConvXGB model, the ConvXGB&clin model, the clinical model, the radiomics nomogram and the existing histology-specific tool, and explored the additional value of the optimal model in outcome prediction.
Intratumor heterogeneity (ITH) and tumor microenvironment (TME), key drivers of tumor progression and recurrence, can be reflected on the radiological level [35–37]. DL techniques can autonomously acquire feature representations of ITH and TME from medical images [38]. Consequently, in the present study, we proposed a novel DL model and investigated the validity of this model in predicting tumor recurrence and prognosis stratification. Analysis results revealed that the ConvXGB model may be sufficiently robust and generalizable for real-world applications.
Furthermore, clinicopathological features such as FIGO stage and lymph node status were closely correlated with RFS, as reported in previous studies [11,39,40]. Therefore, to enhance the predictive capability of the ConvXGB model, we built the ConvXGB&clin model by integrating significant clinicopathological features. However, the DeLong test revealed no statistically significant difference between the AUC values of the two models, indicating that clinicopathological features did not add significant value to the DL radiomics signature. In addition, the ConvXGB model showed significantly better discrimination than the clinical model, the conventional radiomics nomogram and the existing histology-specific tool. For patients stratified as high-risk by the ConvXGB model, intensive surveillance and systemic therapy such as postoperative adjuvant chemotherapy are needed, whereas only regular surveillance is recommended for low-risk patients. Finally, the ConvXGB model not only identified recurrence risk but also predicted OS: Kaplan–Meier survival curves revealed that high-risk patients had worse OS. This model may provide surgeons with prognostic information for dynamic monitoring, personalized prediction of recurrence risk and the development of risk-reduction strategies.
The present study has some limitations. Firstly, the sample size was not large, and the model should be validated in a multi-center prospective cohort. Secondly, ROIs were delineated manually slice by slice, which is a time-consuming procedure. We plan to devise an automated segmentation tool based on CNNs such as U-Net and Mask R–CNN in future research. Finally, the underlying radiological–genomic correlations were not evaluated and require further investigation. An external cohort from TCIA, together with sequencing data from the corresponding TCGA cohort, might be used to reveal this association.
Conclusion
In conclusion, we developed an end-to-end DL model (ConvXGB) for stratification of recurrence risk in cervical cancer patients following surgery by using multiparametric MRI. This model significantly outperformed other models and could also be used for prognostic stratification (i.e., RFS and OS). The model has the potential to help doctors identify high-risk patients after surgery and make more reasonable treatment and follow-up schedules, thereby improving the clinical outcome and life quality of cervical cancer patients.
Data Transparency Statement
Data will be shared, and further inquiries can be directed to the corresponding authors.
Funding
This work was supported by grants from the Scientific Research Foundation of Suzhou Ninth Hospital Affiliated to Soochow University (No. YK202330).
Consent for publication
All authors agree to publish this article.
Ethical statement
The ethical guidelines of the 1975 Declaration of Helsinki were strictly followed. The review board approved our study, and informed consent was waived.
CRediT authorship contribution statement
Ji Wu: Writing – review & editing, Writing – original draft, Software, Project administration, Funding acquisition, Formal analysis, Data curation, Conceptualization. Jian Li: Writing – review & editing, Writing – original draft, Validation, Supervision, Software, Project administration, Methodology, Investigation, Conceptualization. Bo Huang: Writing – review & editing, Validation, Supervision, Resources, Project administration, Methodology, Formal analysis. Sunbin Dong: Writing – review & editing, Visualization, Validation, Resources, Investigation, Formal analysis. Luyang Wu: Writing – review & editing, Validation, Supervision, Software, Resources, Project administration, Methodology. Xiping Shen: Writing – review & editing, Visualization, Project administration, Investigation, Funding acquisition, Conceptualization. Zhigang Zheng: Conceptualization, Formal analysis, Methodology, Writing – original draft, Writing – review & editing.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The authors have none to declare.
Footnotes
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.tranon.2025.102281.