Abstract
Purpose
The current study aimed to construct a novel cancer artificial intelligence survival analysis system for predicting the individual mortality risk curves for cervical carcinoma patients receiving different treatments.
Methods
The study dataset (n = 14,946) was downloaded from the Surveillance, Epidemiology, and End Results (SEER) database. The accelerated failure time algorithm, the multi-task logistic regression algorithm, and the Cox proportional hazards regression algorithm were used to develop prognostic models for cancer-specific survival of cervical carcinoma patients.
Results
Multivariate Cox regression identified stage, PM, chemotherapy, age, PT, and radiation_surgery as independent prognostic factors for cervical carcinoma patients. The concordance indexes of the Cox model were 0.860, 0.849, and 0.848 at 12, 36, and 60 months in the model dataset, and 0.881, 0.845, and 0.841 in the validation dataset. The concordance indexes of the accelerated failure time model were 0.861, 0.852, and 0.851 at 12, 36, and 60 months in the model dataset, and 0.882, 0.847, and 0.846 in the validation dataset. The concordance indexes of the multi-task logistic regression model were 0.860, 0.863, and 0.861 at 12, 36, and 60 months in the model dataset, and 0.880, 0.860, and 0.861 in the validation dataset. Brier scores indicated that the three prognostic models had good predictive accuracy for cervical carcinoma patients. The current research lacked an independent external validation study.
Conclusion
The current study developed a novel cancer artificial intelligence survival analysis system that provides individual mortality risk prediction curves for cervical carcinoma patients based on three different artificial intelligence algorithms. The system can provide the predicted mortality percentage at specific time points and compare treatment benefits under different treatments in each of the four stages, which could help patients determine the best individualized treatment. The cancer artificial intelligence survival analysis system is available at: https://zhangzhiqiao15.shinyapps.io/Tumor_Artificial_Intelligence_Survival_Analysis_System/.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12967-022-03491-8.
Keywords: Cervical carcinoma, Artificial intelligence, Prognostic model, Cancer specific survival
Introduction
Cervical carcinoma (CC) is one of the most common malignant tumors in women, with 569,847 new cases and 311,365 deaths in 2018 [1]. Pathological stage has been shown to be one of the most important risk factors for CC patients. Reported 5-year disease-specific survival rates were 80% in stage I, 56% in stage II, 36% in stage III, and < 1% in stage IV [2]. Another retrospective cohort study reported 5-year survival rates of 95% for stage I, 73% for stage II, 68% for stage III, and 19% for stage IV [3]. Among 179 elderly cervical carcinoma patients with stage IA to stage IIB disease, the 5-year survival rates of patients receiving surgery and radiotherapy were 78.3% and 49.1%, respectively [4]. Overall, the prognosis of advanced CC patients is extremely poor, with a significantly shorter life expectancy. Therefore, reliable prognostic models that can predict the prognosis of CC patients are of important clinical significance and application value.
Although radiotherapy and chemotherapy are valuable treatments for CC patients, not all cervical cancer patients benefit from them. A meta-analysis of 2074 CC patients from 21 randomized trials provided evidence on chemotherapy benefit: chemotherapy with cycles longer than 14 days had a pooled HR of 1.25 (P = 0.005), whereas chemotherapy with cycles shorter than 14 days had a pooled HR of 0.83 (P = 0.046), suggesting that an inappropriate chemotherapy cycle might reduce the survival rate of CC patients [5]. Similarly, neoadjuvant cisplatin at dose intensities above 25 mg/m² per week had an HR of 0.91 (P = 0.20), whereas dose intensities below 25 mg/m² per week had an HR of 1.35 (P = 0.002), indicating that an inappropriate chemotherapy dose might reduce the survival rate of CC patients [5]. In 1864 CC patients, the survival of patients receiving radiotherapy was poorer than that of patients not receiving radiotherapy, although the difference was not statistically significant (HR = 1.09, P = 0.169) [5], suggesting that not all patients benefit from radiotherapy. For neuroendocrine cervical carcinoma patients without lymph node metastasis, the survival of patients undergoing radiotherapy was significantly poorer than that of patients not undergoing radiotherapy (HR = 3.36, P < 0.05) [6]. For stage I-IIA neuroendocrine cervical carcinoma patients with tumor size greater than 4 cm, the median survival time of patients undergoing neoadjuvant chemotherapy (61 months) was shorter than that of patients not undergoing neoadjuvant chemotherapy (63 months), although not significantly so (P = 0.785) [6]. These previous studies suggested that not all CC patients benefit from chemotherapy and radiotherapy, especially CC patients with stage I and stage II disease.
Several previous studies developed prognostic models to predict the prognosis of CC patients [7–10]. However, these prognostic models could only provide survival curves for a specific group, and could not predict the survival curve of a specific patient at the individual level. Individualized survival prediction is an essential foundation of precision medicine and individualized treatment. Our research team has constructed several predictive tools that provide individual mortality risk prediction curves for different cancers [11–18]. Several artificial intelligence algorithms have been used to develop prognostic models that predict individual mortality risk curves for different cancers [19, 20]. Recently, a research team from Harvard Medical School developed a novel tool for predicting the individual mortality risk of glioblastoma patients based on the accelerated failure time (AFT) algorithm [21]. These previous studies provided valuable ideas for applying artificial intelligence to the prediction of individual mortality risk curves for different tumors.
Therefore, the current study aimed to construct a novel cancer artificial intelligence survival analysis system for providing the individual mortality risk predicted curves for CC patients receiving different treatments.
Materials and methods
Study dataset
The study dataset was downloaded from the Surveillance, Epidemiology, and End Results (SEER) database (2010–2015). All patients were diagnosed with cervical carcinoma by pathological examination. The diagnostic criteria for cervical carcinoma followed the recommendations of the American Joint Committee on Cancer (AJCC, 7th edition). To reduce the effects of confounding factors, living patients with a survival time of less than 12 months were excluded from the present study. In tumor prognostic studies, 5 or 10 years is the most common follow-up period. In a well-designed prognostic study with good patient compliance, the survival time of living patients should be close to the longest follow-up time. A living patient with a recorded survival time shorter than 12 months can reflect one of two situations. In the first, the patient died within 12 months and could not be followed up further. Such a patient, who has died but is recorded as living in the dataset, would have an adverse impact on the study conclusions and should therefore be excluded. In the second, the patient is still alive but, for other reasons, could not be followed up to provide subsequent survival information. In this case, the survival time of the patient is clearly underestimated, which would have a significant adverse impact on the study results. Therefore, living patients who were followed up for less than 12 months were excluded from the current study. Patients who died of causes other than cancer were also excluded. All patients' private and identifying information was anonymized in the SEER database. All patients in the SEER database signed an informed consent form at enrollment. For these reasons, ethical review and informed consent were waived by our institutional review board.
There were 14,946 cervical carcinoma patients included in the final survival analysis.
Artificial intelligence algorithms and restricted mean survival time
The Cox proportional hazards regression algorithm was implemented as described in the original articles [22, 23]. The accelerated failure time (AFT) algorithm was implemented according to previous studies [21, 24]. The multi-task logistic regression (MTLR) algorithm was implemented in line with previous articles [25, 26]. The restricted mean survival time is the area under the survival curve over a specified time period [27–31]. As a valuable prognostic index, the restricted mean survival time has been widely applied in prognostic studies [27–31].
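As a rough illustration of the restricted mean survival time described above, the sketch below integrates a step-function survival curve up to a time horizon. The function name, its arguments, and the toy curve are hypothetical and are not taken from the study's code; the curve is assumed to be right-continuous with survival equal to 1 before the first event time.

```python
from typing import Sequence


def restricted_mean_survival_time(
    times: Sequence[float], surv: Sequence[float], tau: float
) -> float:
    """Area under a right-continuous step survival curve on [0, tau].

    times: strictly increasing event times (months), all > 0
    surv:  survival probability S(t) that holds from times[i] onward
    S(t) is assumed to be 1.0 before the first event time.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in zip(times, surv):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next drop
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)  # final rectangle up to the horizon
    return area


# Toy curve (hypothetical values): S drops to 0.9 at 12 months and to 0.7 at 36 months.
rmst_60 = restricted_mean_survival_time([12, 36], [0.9, 0.7], tau=60)
```

For the toy curve, the three rectangles contribute 12, 21.6, and 16.8 months, so the restricted mean survival time over 60 months is about 50.4 months.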
Statistical analyses
Statistical analyses were carried out using SPSS Statistics 21.0 (SPSS Inc., USA). Artificial intelligence algorithms were implemented in R 3.6.0 and Python 3.7.2 according to previous studies [11–18]. A P value < 0.05 was considered statistically significant.
Results
Study cohort
The current study enrolled 14,946 eligible cervical cancer patients. The enrolled patients were randomly divided into a model dataset (n = 7536) and a validation dataset (n = 7410). The baseline characteristics of patients in the model and validation datasets are shown in Table 1.
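A random division of this kind can be sketched as a seeded shuffle followed by a cut at the model-dataset size. The paper does not describe the actual splitting procedure, so the function below, the seed, and the integer patient IDs are all assumptions made for illustration.

```python
import random


def random_split(ids, n_model, seed=0):
    """Shuffle the IDs with a fixed seed and cut them into two disjoint groups."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_model], shuffled[n_model:]


# Group sizes as reported in the text; the integer IDs are placeholders.
model_ids, validation_ids = random_split(range(14946), n_model=7536)
```

Fixing the seed keeps the split reproducible, which matters when the two groups are later compared for baseline balance.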
Table 1.
Baseline characteristics of patients in model group and validation group
| Variable | Model group (N = 7536) | Validation group (N = 7410) | Test value | P value |
|---|---|---|---|---|
| Overall survival [month] | 32 (17, 54) | 31 (17, 54) | 1.822 | 0.177 |
| Age [year] | 48 (38, 59) | 47 (38, 58) | 2.791 | 0.095 |
| Death [n (%)] | 1846 (24.5) | 1791 (24.2) | 0.198 | 0.656 | 
| PT 0 [n (%)] | 1 (0) | 6 (0.1) | 1.42 | 0.233 | 
| PT 1 [n (%)] | 4326 (58.8) | 4341 (59.0) | | |
| PT 2 [n (%)] | 1782 (24.2) | 1643 (22.3) | | |
| PT 3 [n (%)] | 1158 (15.7) | 1171 (15.9) | | |
| PT 4 [n (%)] | 269 (3.7) | 249 (3.4) | | |
| PN 1 [n (%)] | 2018 (26.8) | 1913 (25.8) | 1.733 | 0.188 | 
| PM 1 [n (%)] | 875 (11.6) | 836 (11.3) | 0.367 | 0.545 | 
| Stage 1 [n (%)] | 3696 (49.9) | 3708 (50.0) | 1.43 | 0.232 | 
| Stage 2 [n (%)] | 1057 (14.3) | 1014 (13.7) | | |
| Stage 3 [n (%)] | 1764 (23.8) | 1721 (23.2) | | |
| Stage 4 [n (%)] | 1019 (13.8) | 967 (13.0) | | |
| Radiation_Surgery [n (%)] | 1971 (24.8) | 1872 (25.3) | 0.355 | 0.551 | 
| Chemotherapy [n (%)] | 3937 (52.2) | 3826 (51.6) | 0.532 | 0.466 | 
Continuous variables were expressed as mean ± standard deviation or median (first quartile, third quartile) as appropriate
Variable importance assessment
The current study used a random survival forest algorithm to evaluate variable importance and to explore the error rate with different numbers of trees. The error rate chart, assessed by the out-of-bag method, is presented in Fig. 1A. Figure 1B lists the most important variables for survival outcome from high to low: stage, PT, chemotherapy, PM, age, and radiation_surgery. Multivariable Cox regression identified stage, PM, chemotherapy, age, PT, and radiation_surgery as independent prognostic factors for cancer-specific survival (CSS) of cervical carcinoma (Table 3).
Fig. 1.
Variable selection information of predictive model: A error rate chart of random survival forest; B variable importance assessment chart of random survival forest; C prognostic nomogram chart generated by multivariable Cox survival regression
Table 2.
Model accuracy evaluation based on bootstrap resampling method
| Model | Dataset | Number | C-index (12-month) | C-index (36-month) | C-index (60-month) | Brier score (12-month) | Brier score (36-month) | Brier score (60-month) |
|---|---|---|---|---|---|---|---|---|
| Cox | Dataset1 | 14,946 | 0.822 | 0.822 | 0.822 | 0.077 | 0.124 | 0.133 |
| | Dataset2 | 14,946 | 0.818 | 0.818 | 0.818 | 0.078 | 0.124 | 0.137 |
| | Dataset3 | 14,946 | 0.819 | 0.819 | 0.819 | 0.075 | 0.123 | 0.135 |
| | Dataset4 | 14,946 | 0.819 | 0.819 | 0.819 | 0.076 | 0.125 | 0.138 |
| | Dataset5 | 14,946 | 0.823 | 0.823 | 0.823 | 0.075 | 0.122 | 0.130 |
| AFT | Dataset1 | 14,946 | 0.824 | 0.824 | 0.824 | 0.077 | 0.123 | 0.131 |
| | Dataset2 | 14,946 | 0.819 | 0.819 | 0.819 | 0.078 | 0.124 | 0.136 |
| | Dataset3 | 14,946 | 0.821 | 0.821 | 0.821 | 0.076 | 0.123 | 0.133 |
| | Dataset4 | 14,946 | 0.821 | 0.821 | 0.821 | 0.077 | 0.125 | 0.136 |
| | Dataset5 | 14,946 | 0.825 | 0.825 | 0.825 | 0.075 | 0.122 | 0.128 |
| MTLR | Dataset1 | 14,946 | 0.827 | 0.830 | 0.830 | 0.077 | 0.133 | 0.131 |
| | Dataset2 | 14,946 | 0.822 | 0.824 | 0.825 | 0.078 | 0.124 | 0.137 |
| | Dataset3 | 14,946 | 0.824 | 0.828 | 0.829 | 0.076 | 0.122 | 0.134 |
| | Dataset4 | 14,946 | 0.824 | 0.826 | 0.827 | 0.077 | 0.125 | 0.137 |
| | Dataset5 | 14,946 | 0.828 | 0.830 | 0.830 | 0.075 | 0.121 | 0.130 |
MTLR multi-task logistic regression, AFT accelerated failure time
Prognostic nomogram predictive chart
A prognostic nomogram mortality risk predictive chart based on Cox regression model was presented in Fig. 1C. The prognostic score could be calculated by using the following equation: prognostic score = (0.787 * Stage) + (− 0.208 * Chemotherapy) + (0.015 * Age) + (0.299 * PT) + (− 0.263 * Radiation_Surgery) + (0.245 * PM).
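The equation above is a linear predictor, so it can be evaluated with a few lines of code. The sketch below uses the reported coefficients; the dictionary keys, the function name, and the example patient (including how stage, PT, PM, and the treatment indicators are coded) are assumptions for illustration only.

```python
# Coefficients as reported in the prognostic score equation (multivariable Cox model).
COEFFICIENTS = {
    "stage": 0.787,
    "chemotherapy": -0.208,
    "age": 0.015,
    "pt": 0.299,
    "radiation_surgery": -0.263,
    "pm": 0.245,
}


def prognostic_score(patient: dict) -> float:
    """Linear predictor: sum of coefficient * covariate value."""
    return sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)


# Hypothetical patient; the variable coding (stage 1-4, 0/1 treatment flags) is assumed.
example = {
    "stage": 2,
    "chemotherapy": 1,
    "age": 50,
    "pt": 2,
    "radiation_surgery": 1,
    "pm": 0,
}
score = prognostic_score(example)
```

For this hypothetical patient the score is 1.574 − 0.208 + 0.750 + 0.598 − 0.263 + 0 = 2.451; a higher score corresponds to a higher predicted mortality risk.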
Cancer artificial intelligence survival analysis system
The current study further developed a novel cancer artificial intelligence survival analysis system (CAISAS) for predicting the prognosis of cervical carcinoma patients. CAISAS was built on the six prognostic factors identified above, using the Cox proportional hazards regression algorithm, the accelerated failure time (AFT) algorithm, and the multi-task logistic regression (MTLR) algorithm. CAISAS can be used freely at: https://zhangzhiqiao15.shinyapps.io/Tumor_Artificial_Intelligence_Survival_Analysis_System/.
Using six major parameters and three artificial intelligence algorithms, CAISAS could provide individual mortality risk prediction curves for a specific patient under different treatments.
Performance of Cox proportional hazard regression model
The Cox proportional hazards regression model could provide individual predicted survival curves for a specific patient under different treatments (Fig. 2A). The concordance indexes of the Cox model were 0.860, 0.849, and 0.848 at 12, 36, and 60 months in the model dataset (Fig. 2B), and 0.881, 0.845, and 0.841 in the validation dataset (Fig. 2D). The higher the C-index, the better the model discriminates between patients. Survival curves demonstrated that the Cox model could discriminate high mortality risk patients from low mortality risk patients in the model dataset (Fig. 2C) and the validation cohort (Fig. 2E; Additional file 1).
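For readers unfamiliar with concordance, the sketch below implements Harrell's C-index, one common definition. It is not the authors' computation: the time-specific values reported here are read from time-dependent receiver operating characteristic curves (Fig. 2B and D). The function name and toy data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk patient fails first.

    A pair (i, j) is comparable when patient i had an observed event
    before patient j's follow-up time. Tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up in months, event flag (1 = cancer death), risk score.
toy_times = [5, 10, 20, 30]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
c_index = harrell_c_index(toy_times, toy_events, toy_risk)
```

In the toy data, five pairs are comparable and four are ordered correctly, giving a C-index of 0.8. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.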
Fig. 2.
Clinical performance of Cox model: A predictive individual mortality risk curves under different treatments; B time-dependent receiver operating characteristic curves in model cohort; C survival curves for high risk group and low risk group in model cohort; D time-dependent receiver operating characteristic curves in validation cohort; E survival curves for high risk group and low risk group in validation cohort
Performance of accelerated failure time model
The accelerated failure time model could provide individual predicted survival curves for a specific patient under different treatments (Fig. 3A). The concordance indexes of the AFT model were 0.861, 0.852, and 0.851 at 12, 36, and 60 months in the model dataset (Fig. 3B), and 0.882, 0.847, and 0.846 in the validation dataset (Fig. 3D). Survival curves demonstrated that the AFT model could discriminate high mortality risk patients from low mortality risk patients in the model dataset (Fig. 3C) and the validation cohort (Fig. 3E).
Fig. 3.
Clinical performance of accelerated failure time model: A predictive individual mortality risk curves under different treatments; B time-dependent receiver operating characteristic curves in model cohort; C survival curves for high risk group and low risk group in model cohort; D time-dependent receiver operating characteristic curves in validation cohort; E survival curves for high risk group and low risk group in validation cohort
Performance of multi-task logistic regression model
The multi-task logistic regression model could provide individual predicted survival curves for a specific patient under different treatments (Fig. 4A). The concordance indexes of the MTLR model were 0.860, 0.863, and 0.861 at 12, 36, and 60 months in the model dataset (Fig. 4B), and 0.880, 0.860, and 0.861 in the validation dataset (Fig. 4D). Survival curves demonstrated that the MTLR model could discriminate high mortality risk patients from low mortality risk patients in the model dataset (Fig. 4C) and the validation cohort (Fig. 4E).
Fig. 4.
Clinical performance of multi-task logistic regression model: A predictive individual mortality risk curves under different treatments; B time-dependent receiver operating characteristic curves in model cohort; C survival curves for high risk group and low risk group in model cohort; D time-dependent receiver operating characteristic curves in validation cohort; E survival curves for high risk group and low risk group in validation cohort
Brier score assessment
The lower the Brier score, the more consistent the predicted results are with the actual results. At 12, 36, and 60 months, respectively, the Brier scores of the Cox model were 0.126, 0.127, and 0.137 in the model dataset and 0.116, 0.125, and 0.134 in the validation dataset. The Brier scores of the AFT model were 0.133, 0.128, and 0.136 in the model dataset and 0.124, 0.126, and 0.133 in the validation dataset. The Brier scores of the MTLR model were 0.124, 0.126, and 0.137 in the model dataset and 0.116, 0.124, and 0.133 in the validation dataset.
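The idea behind a time-specific Brier score can be sketched as the mean squared difference between the predicted probability of death by a given time and the observed status at that time. The version below is deliberately naive: it simply skips patients censored before the horizon, whereas survival analyses usually reweight them (inverse probability of censoring weighting). It is not the study's implementation, and the function name and toy data are hypothetical.

```python
def brier_score_at(horizon, times, events, predicted_death_prob):
    """Naive time-specific Brier score.

    Patients censored before `horizon` have unknown status and are skipped;
    production analyses usually reweight them (IPCW) instead.
    """
    squared_errors = []
    for t, e, p in zip(times, events, predicted_death_prob):
        if t <= horizon and e:
            observed = 1.0  # died by the horizon
        elif t > horizon:
            observed = 0.0  # known to be alive at the horizon
        else:
            continue  # censored before the horizon: status unknown
        squared_errors.append((p - observed) ** 2)
    if not squared_errors:
        raise ValueError("no patients with known status at the horizon")
    return sum(squared_errors) / len(squared_errors)


# Toy data (hypothetical): predicted probability of death by 36 months.
bs_36 = brier_score_at(
    36,
    times=[10, 40, 20, 60],
    events=[1, 0, 0, 1],
    predicted_death_prob=[0.8, 0.2, 0.5, 0.3],
)
```

In the toy data, the third patient is censored at 20 months and is skipped; the remaining squared errors (0.04, 0.04, 0.09) average to roughly 0.057. A score of 0 indicates perfect predictions, and 0.25 is what a constant prediction of 0.5 would achieve.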
Internal validation by bootstrap resampling method
Because the models require chemotherapy and radiotherapy information, the current study could not obtain suitable external validation datasets from public databases other than the SEER database. Therefore, following the recommendations of the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement [32], we used the bootstrap resampling method to build internal validation datasets for evaluating the accuracy of the three prognostic models. We drew 14,946 patients from the original 14,946 patients by sampling with replacement, and repeated this to build 5 internal validation datasets. We then used these 5 internal validation datasets to evaluate the accuracy of the three predictive models (Table 2). The C-indexes of the MTLR model were the best; its highest C-indexes at 12, 36, and 60 months were 0.828, 0.830, and 0.830, respectively, suggesting that the MTLR model had the best discriminative performance among the three prognostic models. The corresponding Brier scores of the MTLR model at 12, 36, and 60 months were 0.075, 0.121, and 0.130, respectively, showing good consistency between the actual mortality and the mortality predicted by the MTLR model.
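The resampling step described above can be sketched as follows. This is a minimal illustration that assumes patients are represented by integer IDs; the function name and the seed are hypothetical, and the study's own resampling code is not shown in the paper.

```python
import random


def bootstrap_datasets(patient_ids, n_datasets=5, seed=0):
    """Draw `n_datasets` resamples, each the same size as the original, with replacement."""
    rng = random.Random(seed)
    ids = list(patient_ids)
    return [rng.choices(ids, k=len(ids)) for _ in range(n_datasets)]


# 14,946 patients and 5 resamples, as described in the text; the integer IDs are placeholders.
resamples = bootstrap_datasets(range(14946), n_datasets=5)
```

Because sampling is done with replacement, each resample contains repeated patients and omits others, which is why the five datasets in Table 2 all have 14,946 records yet give slightly different C-indexes and Brier scores.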
Survival prediction at specific time points
As shown in Fig. 5, the AFT algorithm provided the predicted mortality percentage and its 95% confidence interval at specific time points. This function could provide the individual predicted mortality percentage and 95% confidence interval for patients receiving different treatments at 12 months (Fig. 5A), 36 months (Fig. 5B), and 60 months (Fig. 5C). By comparing treatment benefits at different time points, this function could provide valuable prognostic information for personalized treatment decisions.
Fig. 5.
Mortality rates and 95% confidence interval by accelerated failure time algorithm for 12-month (A), 36-month (B), and 60-month (C)
Treatment benefits in different stages
To explore treatment benefits in different stages, CAISAS provided individual mortality risk prediction curves under different treatments for each stage. Treatment benefits under different treatments are presented in Fig. 6A for stage I, Fig. 6B for stage II, Fig. 6C for stage III, and Fig. 6D for stage IV. Figure 6B, C, and D demonstrated that radiation/surgery and chemotherapy could improve cancer-specific survival in stage II, stage III, and stage IV, whereas Fig. 6A suggested that radiation/surgery and chemotherapy did not improve cancer-specific survival in stage I. The restricted mean survival time offers a complementary estimate of survival time for tumor patients, helping patients better understand the survival benefits of different treatments. The current predictive system provided the restricted mean survival times for patients receiving various treatments in the four tumor stages (Fig. 6).
Fig. 6.
Predictive individual mortality risk curves for patients under different treatments with stage I (A), stage II (B), stage III (C), and stage IV (D)
Treatment benefit of chemotherapy in different stages
To explore the treatment benefit of chemotherapy in different stages, CAISAS provided individual mortality risk prediction curves for patients with and without chemotherapy in each stage. The treatment benefit of chemotherapy is presented in Additional file 2: Fig. S1A for stage I, Additional file 2: Fig. S1B for stage II, Additional file 2: Fig. S1C for stage III, and Additional file 2: Fig. S1D for stage IV. As shown in Additional file 2: Fig. S1A, the survival of patients with chemotherapy was significantly poorer than that of patients without chemotherapy in stage I (HR = 4.115, P < 0.001), whereas the survival of patients with chemotherapy was significantly better than that of patients without chemotherapy in stage II, stage III, and stage IV, indicating that chemotherapy did not improve cancer-specific survival in stage I. The current predictive system provided the restricted mean survival times for patients with and without chemotherapy in the four tumor stages (Additional file 2: Fig. S1).
Treatment benefit of radiation/surgery in different stages
To explore the treatment benefit of radiation/surgery in different stages, CAISAS provided individual mortality risk prediction curves for patients with and without radiation/surgery in each stage. The treatment benefit of radiation/surgery is presented in Additional file 3: Fig. S2A for stage I, Additional file 3: Fig. S2B for stage II, Additional file 3: Fig. S2C for stage III, and Additional file 3: Fig. S2D for stage IV. As shown in Additional file 3: Fig. S2A, the survival of patients with radiation/surgery was significantly poorer than that of patients without radiation/surgery in stage I (HR = 2.077, P < 0.001), whereas the survival of patients with radiation/surgery was significantly better than that of patients without radiation/surgery in stage II, stage III, and stage IV, indicating that radiation/surgery did not improve cancer-specific survival in stage I. The current predictive system provided the restricted mean survival times for patients with and without radiation/surgery in the four tumor stages (Additional file 3: Fig. S2).
Subgroup analyses of prognostic factors in different stages
To explore differences in prognostic factors between stages, the current study performed univariable and multivariable Cox regression within each stage. In stage I, univariable Cox regression identified radiation/surgery and chemotherapy as risk factors for cervical carcinoma (P < 0.001). Multivariable Cox regression demonstrated that chemotherapy was an independent risk factor for cervical carcinoma in the stage I subgroup (P < 0.001). In the stage II, stage III, and stage IV subgroups, radiation/surgery and chemotherapy were identified as independent protective factors for cervical carcinoma by both univariable and multivariable Cox regression (Table 3).
Table 3.
Results of Cox regression analyses
| Subgroup | Parameter | Univariate HR | Univariate HR.95L | Univariate HR.95H | Univariate P value | Multivariate Coef. | Multivariate HR | Multivariate HR.95L | Multivariate HR.95H | Multivariate P value |
|---|---|---|---|---|---|---|---|---|---|---|
| All patients (n = 14,946) | Age | 1.033 | 1.031 | 1.035 | < 0.001 | 0.015 | 1.016 | 1.013 | 1.018 | < 0.001 |
| | PT | 2.671 | 2.588 | 2.756 | < 0.001 | 0.299 | 1.348 | 1.286 | 1.414 | < 0.001 |
| | PN | 3.826 | 3.584 | 4.084 | < 0.001 | 0.044 | 1.044 | 0.964 | 1.131 | 0.285 |
| | PM | 7.288 | 6.798 | 7.814 | < 0.001 | 0.245 | 1.278 | 1.155 | 1.414 | < 0.001 |
| | Stage | 2.809 | 2.718 | 2.902 | < 0.001 | 0.787 | 2.197 | 2.054 | 2.351 | < 0.001 |
| | Radiation/surgery | 0.784 | 0.724 | 0.848 | < 0.001 | − 0.263 | 0.769 | 0.708 | 0.836 | < 0.001 |
| | Chemotherapy | 2.985 | 2.772 | 3.215 | < 0.001 | − 0.208 | 0.812 | 0.746 | 0.885 | < 0.001 |
| Stage 1 (n = 7404) | Age | 1.040 | 1.033 | 1.047 | < 0.001 | 0.032 | 1.033 | 1.026 | 1.040 | < 0.001 |
| | PT | NR | NR | NR | NR | NR | NR | NR | NR | NR |
| | PN | NR | NR | NR | NR | NR | NR | NR | NR | NR |
| | PM | NR | NR | NR | NR | NR | NR | NR | NR | NR |
| | Radiation/surgery | 2.259 | 1.842 | 2.770 | < 0.001 | − 0.046 | 0.955 | 0.757 | 1.206 | 0.699 |
| | Chemotherapy | 4.538 | 3.738 | 5.510 | < 0.001 | 1.369 | 3.931 | 3.141 | 4.921 | < 0.001 |
| Stage 2 (n = 2071) | Age | 1.018 | 1.011 | 1.024 | < 0.001 | 0.015 | 1.015 | 1.008 | 1.022 | < 0.001 |
| | PT | NR | NR | NR | NR | NR | NR | NR | NR | NR |
| | PN | NR | NR | NR | NR | NR | NR | NR | NR | NR |
| | PM | NR | NR | NR | NR | NR | NR | NR | NR | NR |
| | Radiation/surgery | 0.699 | 0.568 | 0.861 | < 0.001 | − 0.281 | 0.755 | 0.612 | 0.932 | 0.009 |
| | Chemotherapy | 0.574 | 0.465 | 0.707 | < 0.001 | − 0.494 | 0.610 | 0.494 | 0.753 | < 0.001 |
| Stage 3 (n = 3485) | Age | 1.016 | 1.012 | 1.020 | < 0.001 | 0.006 | 1.006 | 1.002 | 1.010 | 0.003 |
| | PT | 1.736 | 1.617 | 1.864 | < 0.001 | 0.514 | 1.672 | 1.529 | 1.827 | < 0.001 |
| | PN | 0.616 | 0.549 | 0.690 | < 0.001 | 0.180 | 1.197 | 1.047 | 1.369 | 0.009 |
| | PM | NR | NR | NR | NR | NR | NR | NR | NR | NR |
| | Radiation/surgery | 0.495 | 0.438 | 0.560 | < 0.001 | − 0.315 | 0.730 | 0.639 | 0.834 | < 0.001 |
| | Chemotherapy | 0.610 | 0.531 | 0.700 | < 0.001 | − 0.446 | 0.640 | 0.556 | 0.737 | < 0.001 |
| Stage 4 (n = 1986) | Age | 1.014 | 1.010 | 1.018 | < 0.001 | 0.005 | 1.005 | 1.001 | 1.009 | < 0.001 |
| | PT | 1.172 | 1.114 | 1.233 | < 0.001 | 0.234 | 1.264 | 1.189 | 1.343 | < 0.001 |
| | PN | 1.028 | 0.919 | 1.149 | 0.633 | 0.009 | 1.009 | 0.899 | 1.132 | 0.882 |
| | PM | 1.219 | 1.045 | 1.422 | 0.012 | 0.610 | 1.841 | 1.539 | 2.202 | < 0.001 |
| | Radiation/surgery | 0.437 | 0.374 | 0.511 | < 0.001 | − 0.621 | 0.537 | 0.458 | 0.631 | < 0.001 |
| | Chemotherapy | 0.388 | 0.345 | 0.436 | < 0.001 | − 0.867 | 0.420 | 0.372 | 0.474 | < 0.001 |
HR hazard ratio
Discussion
Using three artificial intelligence algorithms, we developed a novel cancer artificial intelligence survival analysis system (CAISAS) for individual mortality risk prediction in CC patients. CAISAS could provide individual mortality risk predictions under different treatments. It could also provide the predicted mortality percentage and 95% confidence interval at specific time points, which helps to display the treatment benefits of different treatments. In addition, CAISAS could compare treatment benefits across stages, which is valuable for understanding how the benefits of different treatments vary by stage. By simulating treatment benefits and individual mortality risk prediction curves for a specific patient under different treatments, CAISAS could help CC patients choose the best individualized treatment.
Several previous prognostic models could predict the prognosis of CC patients [7–10], but could not provide individual mortality risk prediction. CAISAS could provide not only survival prediction for a specific group at the group level, but also individual mortality risk prediction for a specific patient at the individual level. To our knowledge, CAISAS is the first artificial intelligence survival predictive system to provide individual mortality risk prediction for CC patients.
Cox regression analysis demonstrated that chemotherapy and radiation/surgery did not improve cancer-specific survival in stage I. Previous studies provided evidence supporting this result. Among cervical cancer patients after primary treatment, the 3-year disease-specific survival of patients receiving radiotherapy and/or chemotherapy was 73.2%, which was significantly lower than the 94.3% of patients receiving surgery and/or adjuvant treatment [33]. Patients receiving radiotherapy only had a poorer survival rate than patients not receiving radiotherapy (HR 1.48, P < 0.001) [34]. Among stage I cervical cancer patients, the overall survival of patients receiving radiotherapy was 53%, which was significantly lower than the 61% of patients receiving conventional surgery [35]. A meta-analysis of 2456 CC patients demonstrated that chemoradiation could improve the overall survival rate with an absolute benefit of 10% (from 60 to 70%) [36]. Chemotherapy might not be a protective factor for overall survival of stage I or II CC patients, with an HR of 1.31 (95% CI 0.46–3.73, P > 0.05) [37]. Among CC patients with stage IB–IIA disease, the overall survival of patients receiving radical hysterectomy was superior to that of patients receiving chemoradiotherapy [38]. These previous studies suggested that radiotherapy and chemotherapy might not be the best treatments for CC patients with stage I disease.
The Cox proportional hazards regression algorithm has been used to construct predictive models for different tumors [22, 23]. The accelerated failure time model might be a credible alternative to the Cox proportional hazards regression model [24, 39], and the AFT algorithm has been used to develop prognostic models for different cancers [40, 41]. The multi-task logistic regression algorithm has been used to build predictive models for prognostic prediction [25, 42, 43], and the multi-task logistic regression model was reported to be superior to the Cox model in survival prediction [44]. The concordance indexes and Brier scores of the three prognostic models in the current study suggested that they have reliable accuracy for prognostic prediction in CC patients.
Limitations
First, the current study could not further explore the treatment benefits of specific radiotherapy, chemotherapy, and surgery regimens because the SEER database did not provide detailed radiotherapy, chemotherapy, and surgery information. Second, because the SEER database did not provide information on the eighth AJCC tumor staging system, the pathological criteria in the current study followed the seventh AJCC tumor staging system. Third, to improve the clinical generality of CAISAS across regions and hospitals with different medical levels, several valuable diagnostic biomarkers (such as CA242 and CA199) were not included in CAISAS. Adding serum tumor biomarkers might help to improve the predictive accuracy of the prognostic models. Fourth, CAISAS provided individualized mortality risk prediction based on the current research dataset of 14,946 cervical cancer patients. For any prognostic model, individual predictions are closely tied to the clinical characteristics of the enrolled patients, so the predicted results have limitations, cannot represent an absolute survival prediction, and are intended only as a reference for clinicians. Fifth, the current research lacked an independent external validation study. Large independent external validation studies are very important for long-term tumor prognostic research.
In conclusion, the current study developed a novel cancer artificial intelligence survival analysis system to provide individual mortality risk predictive curves for cervical carcinoma patients based on three different artificial intelligence algorithms. Cancer artificial intelligence survival analysis system could provide mortality predicted percentage at specific time points and explore the actual treatment benefits under different treatments in different stages, which could help patient determine the best individualized treatment. Cancer artificial intelligence survival analysis system was available at: https://zhangzhiqiao15.shinyapps.io/Tumor_Artificial_Intelligence_Survival_Analysis_System/.
Supplementary Information
Additional file 1: Baseline characteristics of the original study dataset.
Additional file 2: Figure S1. Benefits of chemotherapy for patients with stage I (A), stage II (B), stage III (C), and stage IV (D).
Additional file 3: Figure S2. Benefits of radiotherapy/surgery for patients with stage I (A), stage II (B), stage III (C), and stage IV (D).
Acknowledgements
We would like to thank Dr. Gary S. Collins, Dr. Manali Rupji, and Mrs. Qingmei Liu for their help and support in developing the artificial intelligence survival predictive system.
Author contributions
Conceptualization, methodology and resources: ZZ, TH, JL, XG, and HL; investigation, data curation, formal analysis, validation, software, project administration, and supervision: ZZ, TH, JL, XG, and HL; writing and visualization: ZZ and HL; funding acquisition: ZZ. All authors read and approved the final manuscript.
Funding
Foshan Science and Technology Bureau (2020001004584).
Availability of data and materials
The study data are available from the SEER database (https://seer.cancer.gov/).
Declarations
Ethics approval and consent to participate
Details of all patients in the SEER database have been anonymized, and therefore the current research does not involve patients' private information. The current study was a secondary analysis based on public datasets from the SEER database and was performed according to the public database policy and the Declaration of Helsinki. Therefore, ethical approval and informed consent were not applicable.
Consent for publication
The current study did not report any individual participant's data in any form, so informed consent for publication was not applicable. All authors reviewed the manuscript and consented to publication.
Competing interests
The authors declare no potential conflicts of interest.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Jieyi Liang, Tingshan He, and Hong Li are co-first authors of the current study.
References
- 1. Bray F, Ferlay J, Soerjomataram I, et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68:394–424. doi: 10.3322/caac.21492.
- 2. Lorenz E, Strickert T, Hagen B. Radiation therapy in cervical carcinoma: fifteen years experience in a Norwegian health region. Eur J Gynaecol Oncol. 2009;30:20–24.
- 3. Kokawa K, Takekida S, Kamiura S, et al. The incidence, treatment and prognosis of cervical carcinoma in young women: a retrospective analysis of 4,975 cases in Japan. Eur J Gynaecol Oncol. 2010;31:37–43.
- 4. Huang YW, Li MD, Liu FY, Li YF. Analysis of clinical efficiency of treatment for 179 geriatric women with stage I or II cervical carcinoma. Ai Zheng. 2002;21:1238–1240.
- 5. Iwata T, Miyauchi A, Suga Y, Nishio H, Nakamura M, Ohno A, Hirao N, Morisada T, Tanaka K, Ueyama H, Watari H. Neoadjuvant chemotherapy for locally advanced cervical cancer: a systematic review and meta-analysis of individual patient data from 21 randomised trials. Eur J Cancer. 2003;39:2470–2486. doi: 10.1016/s0959-8049(03)00425-8.
- 6. Zhang X, Lv Z, Lou H. The clinicopathological features and treatment modalities associated with survival of neuroendocrine cervical carcinoma in a Chinese population. BMC Cancer. 2019;19:22. doi: 10.1186/s12885-018-5147-2.
- 7. Zhang S, Wang X, Li Z, et al. Score for the overall survival probability of patients with first-diagnosed distantly metastatic cervical cancer: a novel nomogram-based risk assessment system. Front Oncol. 2019;9:1106. doi: 10.3389/fonc.2019.01106.
- 8. Gulseren V, Kocaer M, Cakir I, et al. Postoperative nomogram for the prediction of disease-free survival in lymph node-negative stage I–IIA cervical cancer patients treated with radical hysterectomy. J Obstet Gynaecol. 2019;40:1–6. doi: 10.1080/01443615.2019.1652888.
- 9. Zhou H, Li X, Zhang Y, et al. Establishing a nomogram for stage IA–IIB cervical cancer patients after complete resection. Asian Pac J Cancer Prev. 2015;16:3773–3777. doi: 10.7314/apjcp.2015.16.9.3773.
- 10. Wang C, Yang C, Wang W, et al. A prognostic nomogram for cervical cancer after surgery from SEER database. J Cancer. 2018;9:3923–3928. doi: 10.7150/jca.26220.
- 11. Zhang Z, Li J, He T, et al. The competitive endogenous RNA regulatory network reveals potential prognostic biomarkers for overall survival in hepatocellular carcinoma. Cancer Sci. 2019;110:2905–2923. doi: 10.1111/cas.14138.
- 12. Cheng C, Wang Q, Zhu M, et al. Integrated analysis reveals potential long non-coding RNA biomarkers and their potential biological functions for disease free survival in gastric cancer patients. Cancer Cell Int. 2019;19:123. doi: 10.1186/s12935-019-0846-6.
- 13. Zhang Z, He T, Huang L, et al. Two precision medicine predictive tools for six malignant solid tumors: from gene-based research to clinical application. J Transl Med. 2019;17:405. doi: 10.1186/s12967-019-02151-8.
- 14. Zhang Z, Li J, He T, Ding J. Bioinformatics identified 17 immune genes as prognostic biomarkers for breast cancer: application study based on artificial intelligence algorithms. Front Oncol. 2020;10:330. doi: 10.3389/fonc.2020.00330.
- 15. Zhang Z, Li J, He T, et al. Two predictive precision medicine tools for hepatocellular carcinoma. Cancer Cell Int. 2019;19:290. doi: 10.1186/s12935-019-1002-z.
- 16. Zhang Z, Liu Q, Wang P, et al. Development and internal validation of a nine-lncRNA prognostic signature for prediction of overall survival in colorectal cancer patients. PeerJ. 2018;6:e6061. doi: 10.7717/peerj.6061.
- 17. Zhang Z, Ouyang Y, Huang Y, et al. Comprehensive bioinformatics analysis reveals potential lncRNA biomarkers for overall survival in patients with hepatocellular carcinoma: an on-line individual risk calculator based on TCGA cohort. Cancer Cell Int. 2019;19:174. doi: 10.1186/s12935-019-0890-2.
- 18. Zhu M, Wang Q, Luo Z, et al. Development and validation of a prognostic signature for preoperative prediction of overall survival in gastric cancer patients. Onco Targets Ther. 2018;11:8711–8722. doi: 10.2147/OTT.S181741.
- 19. Senders JT, Staples P, Mehrtash A, et al. An online calculator for the prediction of survival in glioblastoma patients using classical statistics and machine learning. Neurosurgery. 2019;86(2):E184–E192. doi: 10.1093/neuros/nyz403.
- 20. Chang MC. Development of individual survival estimating program for cancer patients’ management. Healthc Inform Res. 2015;21:134–137. doi: 10.4258/hir.2015.21.2.134.
- 21. Senders JT, Staples P, Mehrtash A, et al. An online calculator for the prediction of survival in glioblastoma patients using classical statistics and machine learning. Neurosurgery. 2020;86:E184–E192. doi: 10.1093/neuros/nyz403.
- 22. Fisher LD, Lin DY. Time-dependent covariates in the Cox proportional-hazard regression model. Annu Rev Public Health. 1999;20:145–157. doi: 10.1146/annurev.publhealth.20.1.145.
- 23. Katzman JL, Shaham U, Cloninger A, et al. DeepSurv: personalized treatment recommender system using a Cox proportional hazard deep neural network. BMC Med Res Methodol. 2018;18:24. doi: 10.1186/s12874-018-0482-1.
- 24. Zare A, Hosseini M, Mahmoodi M, et al. A comparison between accelerated failure-time and cox proportional hazard models in analyzing the survival of gastric cancer patients. Iran J Public Health. 2015;44:1095–1102.
- 25. Alaeddini A, Hong SH. A multi-way multi-task learning approach for multinomial logistic regression*. An application in joint prediction of appointment miss-opportunities across multiple clinics. Methods Inf Med. 2017;56:294–307. doi: 10.3414/ME16-01-0112.
- 26. Bisaso KR, Karungi SA, Kiragga A, et al. A comparative study of logistic regression based machine learning techniques for prediction of early virological suppression in antiretroviral initiating HIV patients. BMC Med Inform Decis Mak. 2018;18:77. doi: 10.1186/s12911-018-0659-x.
- 27. Zhao L, Claggett B, Tian L, et al. On the restricted mean survival time curve in survival analysis. Biometrics. 2016;72:215–221. doi: 10.1111/biom.12384.
- 28. Lee CH, Ning J, Shen Y. Analysis of restricted mean survival time for length-biased data. Biometrics. 2018;74:575–583. doi: 10.1111/biom.12772.
- 29. Liu M, Li H. Estimation of heterogeneous restricted mean survival time using random forest. Front Genet. 2020;11:587378. doi: 10.3389/fgene.2020.587378.
- 30. Di Spazio L, Cancanelli L, Rivano M, et al. Restricted mean survival time in advanced non-small cell lung cancer treated with immune checkpoint inhibitors. Eur Rev Med Pharmacol Sci. 2021;25:1881–1889. doi: 10.26355/eurrev_202102_25083.
- 31. Quartagno M, Morris TP, White IR. Why restricted mean survival time methods are especially useful for non-inferiority trials. Clin Trials. 2021;18:743–745. doi: 10.1177/17407745211045124.
- 32. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350:g7594. doi: 10.1136/bmj.g7594.
- 33. Lau YM, Cheung TH, Yeo W, et al. Prognostic implication of human papillomavirus types and species in cervical cancer patients undergoing primary treatment. PLoS ONE. 2015;10:e0122557. doi: 10.1371/journal.pone.0122557.
- 34. Uppal S, Del Carmen MG, Rice LW, et al. Variation in care in concurrent chemotherapy administration during radiation for locally advanced cervical cancer. Gynecol Oncol. 2016;142:286–292. doi: 10.1016/j.ygyno.2016.05.026.
- 35. Hou WH, Schultheiss TE, Wong JY, et al. Surgery versus radiation treatment for high-grade neuroendocrine cancer of uterine cervix: a surveillance epidemiology and end results database analysis. Int J Gynecol Cancer. 2018;28:188–193. doi: 10.1097/IGC.0000000000001143.
- 36. Green J, Kirwan J, Tierney J, et al. Concomitant chemotherapy and radiation therapy for cancer of the uterine cervix. Cochrane Database Syst Rev. 2005. doi: 10.1002/14651858.CD002225.pub2.
- 37. Fleming ND, Frumovitz M, Schmeler KM, et al. Significance of lymph node ratio in defining risk category in node-positive early stage cervical cancer. Gynecol Oncol. 2015;136:48–53. doi: 10.1016/j.ygyno.2014.11.010.
- 38. Yan RN, Zeng Z, Liu F, et al. Primary radical hysterectomy vs chemoradiation for IB2-IIA cervical cancer: a systematic review and meta-analysis. Medicine. 2020;99:e18738. doi: 10.1097/MD.0000000000018738.
- 39. Liang Y, Chai H, Liu XY, et al. Cancer survival analysis using semi-supervised learning method based on Cox and AFT models with L1/2 regularization. BMC Med Genom. 2016;9:11. doi: 10.1186/s12920-016-0169-6.
- 40. Karimi A, Delpisheh A, Sayehmiri K. Application of accelerated failure time models for breast cancer patients’ survival in Kurdistan Province of Iran. J Cancer Res Ther. 2016;12:1184–1188. doi: 10.4103/0973-1482.168966.
- 41. Stankowski-Drengler TJ, Schumacher JR, Hanlon B, et al. Outcomes for patients with residual stage II/III breast cancer following neoadjuvant chemotherapy (AFT-01). Ann Surg Oncol. 2020;27:637–644. doi: 10.1245/s10434-019-07846-2.
- 42. Talaei-Khoei A, Tavana M, Wilson JM. A predictive analytics framework for identifying patients at risk of developing multiple medical complications caused by chronic diseases. Artif Intell Med. 2019;101:101750. doi: 10.1016/j.artmed.2019.101750.
- 43. Xia E, Mei J, Xie G, et al. Learning doctors’ medicine prescription pattern for chronic disease treatment by mining electronic health records: a multi-task learning approach. AMIA Annu Symp Proc. 2017;2017:1828–1837.
- 44. Gu W, Zhang Z, Xie X, He Y. An improved multi-task learning algorithm for analyzing cancer survival data. IEEE/ACM Trans Comput Biol Bioinform. 2019;18(2):500–511. doi: 10.1109/TCBB.2019.2920770.