Abstract
Background
No models have been developed to predict the survival probability of women with primary vaginal cancer (VC) because of VC’s extreme rarity. We aimed to develop and validate models to predict the overall survival (OS) and cancer-specific survival (CSS) of VC patients.
Methods
A population-based multicenter retrospective cohort study was carried out using the 2004–2018 Surveillance, Epidemiology, and End Results Program database in the United States. The final multivariate Cox model was identified using the Brier score and Harrell’s C concordance statistic (C-statistic). The decision curve, calibration plot, and area under the time-dependent receiver operating characteristic curve (AUC) were used to evaluate model prediction performance. Multiple imputation followed by bootstrap was performed. Bootstrap validation covered the entire statistical procedure, from model selection to baseline survival and coefficient calculation. Nomograms predicting OS and CSS were generated.
Results
Of the 2,417 eligible patients, 1,692 and 725 were randomly allocated to the training and validation cohorts, respectively. The median age (interquartile range) was 66 (56–78) and 65 (55–76) years in the two cohorts. Our models had larger net benefits in predicting the survival of VC patients than the American Joint Committee on Cancer stage, with good discrimination and close agreement between the expected and observed events. The performance metrics of our models were calculated in three cohorts: the training cohort, complete cases of the validation cohort, and the imputed validation cohort. For the OS model in the three cohorts, the C-statistics were 0.761, 0.752, and 0.743; the slopes of the calibration plots were 1.017, 1.005, and 0.959; and the 3- and 5-year AUCs were 0.795 and 0.810, 0.768 and 0.771, and 0.770 and 0.767, respectively. For the CSS model in the three cohorts, the C-statistics were 0.775, 0.758, and 0.755; the slopes were 1.021, 0.939, and 0.977; and the 3- and 5-year AUCs were 0.797 and 0.793, 0.786 and 0.788, and 0.757 and 0.757, respectively.
Conclusion
We developed and validated the first survival prediction models for VC patients and generated corresponding nomograms, which allow individualized survival prediction and could assist clinicians in performing risk-adapted follow-up and treatment.
Keywords: chemotherapy, lymphadenectomy, M stage, N stage, nomogram, radiotherapy, tumor size, vaginal cancer
Introduction
Primary vaginal cancer (VC) is a rare gynecologic cancer, accounting for 2% of all gynecologic cancer cases, with about 18,000 new cases and 8,000 deaths worldwide in 2020 (1). The VC incidence between 1999 and 2015 remained stable among women aged 40–69 (2). Histologically, vaginal carcinoma mainly includes squamous cell carcinomas (SCC, accounting for 80–90%) and adenocarcinomas (ADE, 4–10%) (3, 4). SCC and ADE present a similar etiology and prognosis; hence they are routinely treated with the same therapy methods (5, 6). Currently, the primary treatment for VC is surgical resection combined with radiotherapy with or without chemotherapy (7–11).
Age, tumor size, lymph node invasion, distant metastasis, radiotherapy, chemotherapy, and surgery type are significant prognostic survival factors for VC patients (7, 9, 11–14). However, no studies have integrated those variables into a single model. Moreover, the staging systems that reflect the severity and extension of VC include the American Joint Committee on Cancer (AJCC) stage (15, 16), the TNM stage of AJCC, and the International Federation of Gynaecology and Obstetrics (FIGO) stage (17). The FIGO stage is a system commonly used by gynecological clinicians, which can be derived from the AJCC stage (4, 17–19). Thus, the AJCC and TNM stages should be estimated during model development and validation.
Lymphadenectomy and sentinel lymph node biopsy (SLNB) are two techniques for assessing lymph node status. Lymphadenectomy has a higher complication rate because it is more invasive. SLNB is helpful in patients undergoing surgery because lymphatic drainage from the primary lesion does not always follow the anatomically predicted lymphatic channels (20, 21). However, SLNB is technically demanding for healthcare centers because of its inherent complexity. Both lymphadenectomy and SLNB should therefore be considered during model development.
Nomograms for predicting the overall survival (OS) and cancer-specific survival (CSS) of vulvar cancer patients have been well developed, with a Harrell’s C concordance index (C-statistic) of 0.83–0.85 (22–25). However, no nomograms have been developed to predict the OS or CSS of VC patients because VC’s extreme rarity makes developing prediction models infeasible, especially within a single healthcare center. Given the clinical value of nomograms, it is essential to generate nomograms predicting the survival of VC patients. The growing number of cases in population-based databases has made it possible to obtain enough VC cases to develop nomograms.
Hence, we aimed to develop and validate nomograms that predict the OS and CSS of VC patients using a population-based multicenter database.
Materials and methods
We carried out the study following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guideline for prognostic models (26). An ethical review and informed consent were waived for the study because we used de-identified publicly available data obtained from the Surveillance, Epidemiology, and End Results (SEER) Program database of the National Cancer Institute (27). We have signed the Data-use Agreement for the SEER 1975–2018 Research Data File.
Study population
Patients affected by C52.9 VC (as classified by the International Classification of Diseases for Oncology, 3rd Edition codes) diagnosed between January 01, 2004 and December 31, 2018 were selected from the sub-database of the SEER database (the Incidence-SEER Research Plus Data, 18 registries, Nov 2020 sub [2000–2018]) (27). Patients whose VC was not SCC or ADE, not confirmed by positive histology, or not the first tumor were excluded. Those under 18 or over 100 years or with T0 stage VC were excluded.
Variables and outcomes
The variables assessed in this study included the year of diagnosis, age, marital status, race, tumor size, pathological grade, histology type, radiotherapy, chemotherapy, number of lymph nodes removed, SLNB, surgery type, the presence of other malignancies, the AJCC stage, and the T, N, and M stages. Those variables were derived from the corresponding data fields of the SEER database. Surgery types were categorized into four groups: none, local tumor excision (LTE), vaginectomy, and debulking. LTE included electrocautery, fulguration (including hot forceps for tumor destruction), laser, local tumor excision-not otherwise specified (NOS), photodynamic therapy, cryosurgery, laser ablation, laser excision, polypectomy, and excisional biopsy. Vaginectomy included simple or partial surgical removal of the primary site, total surgical removal of the primary site, enucleation, radical surgery, and surgery-NOS. A 0-month survival time was recorded as 0.5 months to more accurately represent patients who survived less than 1 month after diagnosis (28).
The primary outcomes were OS and CSS. OS was defined as the period from the date of diagnosis to the date of death from any cause; patients who were alive were censored. CSS was defined as the period from the date of diagnosis to the date of death from VC; patients who were alive or who died of other causes were censored.
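The outcome definitions above can be sketched as a small coding function. This is a minimal illustration in Python, not the authors' Stata code; the status labels (`alive`, `dead_vc`, `dead_other`, `dead_unknown`) and the `Record` type are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Record:
    months: float  # survival time in months, as recorded in the registry
    status: str    # hypothetical labels: "alive", "dead_vc", "dead_other", "dead_unknown"


def code_outcomes(rec):
    """Return (time, os_event, css_event); an event value of 0 means censored."""
    # A 0-month survival time is recorded as 0.5 months, as described in the text.
    time = 0.5 if rec.months == 0 else rec.months
    # OS: death from any cause is the event; patients who are alive are censored.
    os_event = 0 if rec.status == "alive" else 1
    # CSS: only death from VC is the event; everyone else is censored.
    css_event = 1 if rec.status == "dead_vc" else 0
    return time, os_event, css_event


example = code_outcomes(Record(months=0, status="dead_other"))
```

In this sketch, a patient who died of another cause within the first month counts as an event for OS but is censored for CSS.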
Model developing procedure and statistical analysis
A two-tailed p-value < 0.05 was considered statistically significant. The statistical analyses were performed using the STATA 17.0 software (StataCorp, College Station, TX, United States). The data were analyzed from September 01, 2021 to March 20, 2022.
The final samples were randomly split into the training and internal validation cohorts using a ratio of 7:3 (1,692 vs. 725 patients), with the constraints of keeping the proportion of outcome events balanced between the two cohorts according to the TRIPOD guideline (26). We used the Chi-square test to investigate the balance of variables between the two cohorts. The Kruskal–Wallis H test was used to compare the between-cohort difference in age and follow-up time.
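A split that keeps the proportion of outcome events balanced can be sketched as a stratified random split. The Python below is only an illustration under assumptions: the function name, the seed, and the outcome labels are hypothetical, and the toy outcome counts are the cohort totals obtained by adding the two columns of Table 1.

```python
import random
from collections import defaultdict


def stratified_split(outcomes, train_fraction=0.7, seed=2022):
    """Split record indices 7:3 within each outcome group (seed is arbitrary)."""
    rng = random.Random(seed)
    by_outcome = defaultdict(list)
    for idx, outcome in enumerate(outcomes):
        by_outcome[outcome].append(idx)
    train, valid = [], []
    for members in by_outcome.values():
        rng.shuffle(members)  # random allocation within the stratum
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        valid.extend(members[cut:])
    return sorted(train), sorted(valid)


# Toy labels: totals of the "Outcome" rows in Table 1 (training + validation).
outcomes = (["alive"] * 1137 + ["dead_vc"] * 923
            + ["dead_other"] * 338 + ["dead_unknown"] * 19)
train_idx, valid_idx = stratified_split(outcomes)
```

Splitting within each outcome group, rather than over the whole sample, is what keeps the event proportions nearly identical in the two cohorts.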
Multiple imputation by chained equations with 10 imputed samples was carried out to impute surgery type, number of lymph nodes removed, marital status, pathological grade, and tumor size. The independent variables used during multiple imputation included the year of diagnosis, age, race, radiotherapy, chemotherapy, pathological grade, histology type, presence of other malignancies, and T, N, and M stages. The SLNB variable was neither imputed nor included as a predictor to impute other variables because 92.8% of final samples did not receive SLNB. The imputation procedures were performed separately in the training and validation cohorts to prevent information leakage between them (29). The sufficiency of the number of imputations was assessed using the fraction of missing information; in this study, ten imputations were sufficient. Then, bootstrap resampling with replacement (200 repetitions) was performed to calculate the Brier score and C-statistic and assess the performance of candidate models. A larger value of the Brier score and C-statistic indicates better prediction performance of a model. A candidate model was considered better than the others only if its indicator was at least 3% greater. If two models had similar indicators, the one including fewer variables was selected as the better model. After all candidate models were compared, the final models were identified.
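The selection rule described above can be sketched as follows. This Python example is an interpretation, not the authors' procedure: it assumes the 3% threshold is a relative difference, follows the text's convention that larger indicator values are better, and uses hypothetical variable counts (`n_vars`) because the candidate models' contents are given only in the Supplementary Tables. The indicator values are the OS values from Table 2.

```python
def select_model(candidates, tolerance=0.03):
    """Pick the smallest model whose indicators are within `tolerance` of the best."""
    best_brier = max(m["brier"] for m in candidates)
    best_c = max(m["c_stat"] for m in candidates)
    # A model is "similar" if the best value is less than 3% greater than its own.
    similar = [
        m for m in candidates
        if m["brier"] * (1 + tolerance) > best_brier
        and m["c_stat"] * (1 + tolerance) > best_c
    ]
    # Among similar models, prefer the one with the fewest variables.
    return min(similar, key=lambda m: m["n_vars"])


# OS indicators from Table 2; n_vars values are hypothetical placeholders.
os_candidates = [
    {"name": "Model 1", "n_vars": 13, "brier": 0.163, "c_stat": 0.839},
    {"name": "Model 2", "n_vars": 12, "brier": 0.287, "c_stat": 0.847},
    {"name": "Model 3", "n_vars": 11, "brier": 0.288, "c_stat": 0.846},
    {"name": "Model 4", "n_vars": 10, "brier": 0.291, "c_stat": 0.848},
    {"name": "Model 5", "n_vars": 9, "brier": 0.290, "c_stat": 0.844},
]
chosen = select_model(os_candidates)
```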
Several nested candidate multivariate Cox models were evaluated in this study. OS candidate models were generated by dropping, one at a time, a nonsignificant variable with small beta coefficients from the previous model (see Supplementary Tables 1, 2). CSS candidate models were generated similarly, with all the variables in the final OS model kept in the CSS models regardless of whether those variables were statistically significant.
The best-fit models were refitted on the imputed training cohort using bootstrap with 200 repetitions to calculate the imputation-averaged 3- and 5-year baseline survival and the imputation-averaged coefficients with standard errors.
Next, based on previously calculated baseline survivals and coefficients, the patient-level probabilities of death were calculated within the imputed training cohort, the complete cases of the validation cohort, and the imputed validation cohort. According to the definition of the Cox proportional hazard model (30), the probabilities can be calculated as follows (see Supplementary Document for details):
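As an illustration of the calculation, under a Cox proportional hazards model the probability of death by time t is 1 − S0(t)^exp(Σβx), where S0(t) is the baseline survival and Σβx is the patient's linear predictor. The Python sketch below applies this standard relation to the OS coefficients and baseline survivals reported in Table 3. It assumes that the baseline survival refers to a patient in every reference category; the exact parameterization used by the authors is described in their Supplementary Document and may differ. The example patient is hypothetical.

```python
import math

# Baseline OS from Table 3 (assumed to refer to all reference categories).
BASELINE_OS = {3: 0.82548, 5: 0.74510}

# OS beta coefficients from Table 3 for one hypothetical patient:
# age 60-79, tumor >= 4 cm, beam radiotherapy, chemotherapy, T2 stage.
# No surgery, no lymph nodes removed, N0, and M0 are reference levels (beta = 0).
patient_betas = [0.67730, 0.44749, -0.43073, -0.27861, 0.23081]


def death_probability(baseline_survival, betas):
    """Cox model: P(death by t) = 1 - S0(t) ** exp(linear predictor)."""
    linear_predictor = sum(betas)
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


p_death_3y = death_probability(BASELINE_OS[3], patient_betas)
p_death_5y = death_probability(BASELINE_OS[5], patient_betas)
```

Under these assumptions, the hypothetical patient's predicted probability of death is roughly 0.31 at 3 years and 0.43 at 5 years.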
Within the imputed training and validation cohorts, the patient-level probability of death was a single value calculated by averaging the failure probabilities for the patient in each of the ten imputed samples. Furthermore, the decision curve, calibration plot, and time-dependent receiver operating characteristic (ROC) curve were plotted based on the previously calculated probability of death to assess the final model’s prediction performance (31). The 95% confidence intervals (95% CIs) of the slope of the calibration plot, C-statistic, and AUC of the time-dependent ROC were calculated using bootstrap with 200 repetitions.
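The two steps described above, averaging each patient's predicted probability over the imputed samples and then measuring discrimination, can be sketched in Python as follows. This is a toy illustration with made-up numbers and two imputed samples instead of ten; the function names are hypothetical, and the pairwise definition of Harrell's C used here is the textbook one, not necessarily the exact Stata routine the authors ran.

```python
from statistics import fmean


def average_over_imputations(probs_by_imputation):
    """probs_by_imputation[k][i]: predicted death probability of patient i in sample k."""
    n_patients = len(probs_by_imputation[0])
    return [fmean(sample[i] for sample in probs_by_imputation)
            for i in range(n_patients)]


def harrell_c(times, events, risks):
    """Harrell's C: share of usable pairs in which the earlier death has the higher risk."""
    concordant = tied = usable = 0
    for i in range(len(times)):
        if events[i] != 1:
            continue  # a pair is usable only if the shorter time ended in an event
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / usable


# Toy data: four patients, two imputed samples.
probs = average_over_imputations([[0.8, 0.2, 0.1, 0.5],
                                  [1.0, 0.4, 0.3, 0.3]])
c_stat = harrell_c(times=[5, 10, 20, 30], events=[1, 1, 0, 1], risks=probs)
```

A bootstrap confidence interval would repeat the `harrell_c` call on resampled patients; that loop is omitted here for brevity.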
Based on the final models, nomograms for predicting 3- and 5-year OS and CSS were generated using a modified “nomocox” command based on pre-calculated baseline survivals (32).
Results
Baseline characteristics
Of the 2,417 patients selected in this study, 1,692 (70%) and 725 (30%) were randomly allocated to the training and internal validation cohorts, respectively (see Figure 1). The median age (interquartile range) was 66 (56–78) and 65 (55–76) years for patients in the training and validation cohorts, respectively. The year of diagnosis, age at diagnosis, marital status, race, tumor size, pathological grade, radiotherapy, histology type, number of lymph nodes removed, SLNB, surgery type, AJCC stage, T, N, and M stages, and the presence of other tumors were balanced between the training and validation cohorts (Chi-square p > 0.05 for all). The only exception was chemotherapy, which slightly more patients in the validation cohort received (55.9% vs. 51.4%, p = 0.04). There was no difference in the proportion of outcome events between the two cohorts (p > 0.05). The two cohorts had comparable follow-up times (25.5 months [interquartile range 9–68.5] vs. 28 months [9–76], p = 0.651; see Table 1).
TABLE 1.
Characteristics | Training cohort, No. (%) | Validation cohort, No. (%) | p-value |
Year of diagnosis | | | 0.563 |
2004–2009 | 644 (38.1) | 285 (39.3) | |
2010–2018 | 1048 (61.9) | 440 (60.7) | |
Age, median (IQR), y | 66 (56–78) | 65 (55–76) | 0.188 |
Age, y | | | 0.181 |
18–39 | 58 (3.4) | 29 (4.0) | |
40–59 | 522 (30.9) | 237 (32.7) | |
60–79 | 726 (42.9) | 322 (44.4) | |
80–100 | 386 (22.8) | 137 (18.9) | |
Marital status | | | 0.266 |
Married | 599 (35.4) | 272 (37.5) | |
Single | 289 (17.1) | 122 (16.8) | |
Divorced/widowed/separated | 661 (39.1) | 286 (39.4) | |
Missing | 143 (8.5) | 45 (6.2) | |
Race | | | 0.459 |
White | 1310 (77.4) | 572 (78.9) | |
Black | 253 (15.0) | 108 (14.9) | |
Other | 129 (7.6) | 45 (6.2) | |
Tumor size, cm | | | 0.069 |
<2 | 138 (8.2) | 74 (10.2) | |
2–4 | 361 (21.3) | 132 (18.2) | |
≥4 | 568 (33.6) | 267 (36.8) | |
Missing | 625 (36.9) | 252 (34.8) | |
Pathological grade | | | 0.929 |
Well | 143 (8.5) | 57 (7.9) | |
Moderately | 491 (29.0) | 207 (28.6) | |
Poorly/undifferentiated | 518 (30.6) | 230 (31.7) | |
Missing | 540 (31.9) | 231 (31.9) | |
Histology type | | | 0.436 |
Squamous cell carcinoma | 1397 (82.6) | 589 (81.2) | |
Adenocarcinoma | 295 (17.4) | 136 (18.8) | |
Radiotherapy | | | 0.902 |
None/unknown | 437 (25.8) | 200 (27.6) | |
Beam | 689 (40.7) | 283 (39.0) | |
Beam plus implants | 436 (25.8) | 188 (25.9) | |
Radiation, NOS | 41 (2.4) | 18 (2.5) | |
Implants | 89 (5.3) | 36 (5.0) | |
Chemotherapy | | | 0.042 |
None/unknown | 823 (48.6) | 320 (44.1) | |
Yes | 869 (51.4) | 405 (55.9) | |
Number of lymph nodes removed | | | 0.175 |
None | 1473 (87.1) | 604 (83.3) | |
1–3 | 33 (2.0) | 21 (2.9) | |
≥4 | 150 (8.9) | 80 (11.0) | |
Unknown number | 14 (0.8) | 7 (1.0) | |
Missing | 22 (1.3) | 13 (1.8) | |
Sentinel lymph node biopsy | | | 0.369 |
No | 1662 (98.2) | 706 (97.4) | |
Yes | 8 (0.5) | 6 (0.8) | |
Missing | 22 (1.3) | 13 (1.8) | |
Surgery | | | 0.113 |
None | 1182 (69.9) | 520 (71.7) | |
Local tumor excision | 222 (13.1) | 74 (10.2) | |
Vaginectomy | 267 (15.8) | 118 (16.3) | |
Debulking | 12 (0.7) | 4 (0.6) | |
Missing | 9 (0.5) | 9 (1.2) | |
Other malignancies | | | 0.296 |
No | 1487 (87.9) | 626 (86.3) | |
Yes | 205 (12.1) | 99 (13.7) | |
AJCC stage | | | 0.686 |
I | 474 (28.0) | 195 (26.9) | |
II | 399 (23.6) | 176 (24.3) | |
III | 303 (17.9) | 151 (20.8) | |
IV | 2 (0.1) | 1 (0.1) | |
IVA | 120 (7.1) | 44 (6.1) | |
IVB | 214 (12.6) | 86 (11.9) | |
Missing | 180 (10.6) | 72 (9.9) | |
T stage | | | 0.594 |
T1 | 594 (35.1) | 250 (34.5) | |
T2 | 526 (31.1) | 229 (31.6) | |
T3 | 224 (13.2) | 110 (15.2) | |
T4 | 184 (10.9) | 67 (9.2) | |
TX | 164 (9.7) | 69 (9.5) | |
N stage | | | 0.703 |
N0 | 1185 (70.0) | 497 (68.6) | |
N1 | 321 (19.0) | 148 (20.4) | |
NX | 186 (11.0) | 80 (11.0) | |
M stage | | | 0.473 |
M0 | 1417 (83.7) | 619 (85.4) | |
M1 | 214 (12.6) | 86 (11.9) | |
MX | 61 (3.6) | 20 (2.8) | |
Follow-up time, median (IQR), mo | 25.5 (9–68.5) | 28 (9–76) | 0.651 |
Outcome | | | 0.999 |
Alive | 796 (47.0) | 341 (47.0) | |
Dead of vaginal cancer | 646 (38.2) | 277 (38.2) | |
Dead of other reasons | 237 (14.0) | 101 (13.9) | |
Dead of unknown reason | 13 (0.8) | 6 (0.8) | |
IQR, interquartile range; NOS, not otherwise specified; AJCC, the American Joint Committee on Cancer.
Candidate model selection
To select the final models, the performance indicators of the candidate models were calculated using bootstrap with 200 repetitions within the imputed training cohort. The prediction performance of the candidate models is summarized in Table 2. The final models (Model 5 for both OS and CSS) were selected because they had the fewest variables but performance similar to that of the other candidate models (within a 3% difference in indicator values). The final OS model included age, tumor size, radiotherapy, chemotherapy, surgery, number of lymph nodes removed, and T, N, and M stages. The final CSS model included the same variables plus an indicator variable for the presence of other malignancies.
TABLE 2.
Indicators | OS value | OS 95% confidence interval | CSS value | CSS 95% confidence interval |
Brier score | | | | |
Model 1 | 0.163 | 0.133–0.192 | 0.138 | 0.110–0.166 |
Model 2 | 0.287 | 0.258–0.317 | 0.241 | 0.213–0.269 |
Model 3 | 0.288 | 0.259–0.317 | 0.244 | 0.216–0.272 |
Model 4 | 0.291 | 0.261–0.321 | 0.244 | 0.216–0.272 |
Model 5 | 0.290 | 0.261–0.319 | 0.243 | 0.215–0.272 |
C-statistic | | | | |
Model 1 | 0.839 | 0.812–0.865 | 0.853 | 0.825–0.882 |
Model 2 | 0.847 | 0.820–0.874 | 0.857 | 0.828–0.886 |
Model 3 | 0.846 | 0.820–0.873 | 0.858 | 0.829–0.887 |
Model 4 | 0.848 | 0.822–0.874 | 0.855 | 0.826–0.884 |
Model 5 | 0.844 | 0.817–0.871 | 0.855 | 0.826–0.883 |
Model 5 is the final selected model; See more details in the Supplementary Tables 1, 2.
Results of the final models
The results of the final multivariate Cox proportional hazard models for predicting OS and CSS are shown in Table 3. As presented in the table, age, tumor size, radiotherapy, chemotherapy, surgery, number of lymph nodes removed, and T, N, and M stages were all significantly associated with OS (p < 0.05 for at least one category of each variable). However, the associations of chemotherapy, number of lymph nodes removed, and N stage with CSS were not significant. Additionally, the presence of other malignancies was significantly associated with CSS (p < 0.001).
TABLE 3.
Variables | OS β coefficient | OS bootstrap SE | OS p-value | CSS β coefficient | CSS bootstrap SE | CSS p-value |
Age, y | | | | | | |
18–39 | Reference | | | Reference | | |
40–59 | 0.19593 | 0.27779 | 0.48060 | 0.06896 | 0.28024 | 0.80564 |
60–79 | 0.67730 | 0.26537 | 0.01070 | 0.35783 | 0.27253 | 0.18918 |
80–100 | 1.46783 | 0.27607 | <0.00001 | 1.09818 | 0.28078 | 0.00009 |
Tumor size, cm | | | | | | |
<2 | Reference | | | Reference | | |
2–4 | 0.21352 | 0.14344 | 0.13660 | 0.24540 | 0.17694 | 0.16548 |
≥4 | 0.44749 | 0.13988 | 0.00138 | 0.57388 | 0.18809 | 0.00228 |
Radiotherapy | | | | | | |
None | Reference | | | Reference | | |
Beam | −0.43073 | 0.09575 | 0.00001 | −0.48371 | 0.12296 | 0.00008 |
Beam + implants | −0.86705 | 0.11851 | <0.00001 | −1.04213 | 0.15922 | <0.00001 |
Radiation, NOS | −0.29244 | 0.20267 | 0.14904 | −0.23287 | 0.23471 | 0.32112 |
Implants | −0.95970 | 0.18768 | <0.00001 | −1.13098 | 0.23740 | <0.00001 |
Chemotherapy | | | | | | |
None/unknown | Reference | | | Reference | | |
Yes | −0.27861 | 0.08434 | 0.00096 | −0.19033 | 0.10500 | 0.06989 |
Surgery | | | | | | |
None | Reference | | | Reference | | |
Local tumor excision | −0.54175 | 0.11260 | <0.00001 | −0.61139 | 0.15755 | 0.0001 |
Vaginectomy | −0.73283 | 0.14495 | <0.00001 | −0.67324 | 0.17219 | 0.00009 |
Debulking | −0.26163 | 0.36299 | 0.47105 | −0.07539 | 0.37202 | 0.83941 |
Number of lymph nodes removed | | | | | | |
None | Reference | | | Reference | | |
1–3 | 0.17467 | 0.30427 | 0.56593 | 0.17230 | 0.38622 | 0.65551 |
≥4 | −0.47991 | 0.19393 | 0.01334 | −0.36552 | 0.21715 | 0.09233 |
Number unknown | 0.38244 | 0.40080 | 0.33999 | 0.67311 | 3.21673 | 0.83425 |
T stage | | | | | | |
T1 | Reference | | | Reference | | |
T2 | 0.23081 | 0.09139 | 0.01155 | 0.36817 | 0.12840 | 0.00414 |
T3 | 0.41905 | 0.12574 | 0.00086 | 0.61803 | 0.14603 | 0.00002 |
T4 | 0.82441 | 0.14587 | <0.00001 | 0.99892 | 0.16405 | <0.00001 |
TX | 0.12220 | 0.17625 | 0.48809 | 0.27493 | 0.19949 | 0.16815 |
N stage | | | | | | |
N0 | Reference | | | Reference | | |
N1 | 0.26231 | 0.10441 | 0.01200 | 0.23417 | 0.12536 | 0.06176 |
NX | 0.25829 | 0.14786 | 0.08066 | 0.06867 | 0.19146 | 0.71986 |
M stage | | | | | | |
M0 | Reference | | | Reference | | |
M1 | 0.66769 | 0.10629 | <0.00001 | 0.71244 | 0.13504 | <0.00001 |
MX | −0.20732 | 0.24499 | 0.39743 | 0.10063 | 0.26910 | 0.70845 |
Presence of other malignancies | | | | | | |
No | – | | | Reference | | |
Yes | – | – | – | −0.49258 | 0.14590 | 0.00074 |
Baseline survival | | | | | | |
3 years | 0.82548 | – | – | 0.84248 | – | – |
5 years | 0.74510 | – | – | 0.78676 | – | – |
SE, standard error; NOS, not otherwise specified; see Supplementary Document for details about the calculation of patient-level survival probability.
Model prediction performance
Our models had larger net benefits than the AJCC stage, indicating their clinical usefulness (Figure 2). Moreover, the calibration plots show good agreement between the expected and observed events (Figure 3). The time-dependent ROC curves are displayed in Figure 4.
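For readers unfamiliar with decision curves, the net benefit at a given risk threshold weighs true positives against false positives. The Python sketch below uses the standard binary-outcome definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. It is a simplification: it treats death by a fixed time as a fully observed binary outcome and ignores censoring, which a survival decision curve such as the one in this study has to handle. The data are made up for illustration.

```python
def net_benefit(probs, died, threshold):
    """Binary-outcome net benefit at one threshold probability (censoring ignored)."""
    n = len(probs)
    # Patients whose predicted risk reaches the threshold are classified as high risk.
    true_pos = sum(1 for p, d in zip(probs, died) if p >= threshold and d == 1)
    false_pos = sum(1 for p, d in zip(probs, died) if p >= threshold and d == 0)
    # False positives are down-weighted by the odds of the threshold.
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


# Toy predicted probabilities of death and observed outcomes (1 = died).
toy_probs = [0.9, 0.8, 0.3, 0.2]
toy_died = [1, 0, 1, 0]
nb_at_25 = net_benefit(toy_probs, toy_died, threshold=0.25)
nb_at_50 = net_benefit(toy_probs, toy_died, threshold=0.50)
```

A decision curve plots this quantity across a range of thresholds for each model or staging system being compared.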
For the OS model, the C-statistics were 0.761, 0.752, and 0.743 in the imputed training cohort, the complete cases of the validation cohort, and the imputed validation cohort, respectively. The slopes of the calibration plots were 1.017, 1.005, and 0.959 in the three cohorts. The 3-year AUCs were 0.795, 0.768, and 0.770. The 5-year AUCs were 0.810, 0.771, and 0.767 (Table 4).
TABLE 4.
Performance | Training cohort (imputed) | Validation cohort (complete cases) | Validation cohort (imputed) |
Overall survival | |||
C-statistic | 0.761 (0.745–0.777) | 0.752 (0.717–0.787) | 0.743 (0.706–0.779) |
Calibration slope | 1.017 (0.942–1.092) | 1.005 (0.848–1.162) | 0.959 (0.777–1.141) |
3-year AUC | 0.795 (0.773–0.817) | 0.768 (0.718–0.818) | 0.770 (0.718–0.821) |
5-year AUC | 0.810 (0.787–0.834) | 0.771 (0.721–0.822) | 0.767 (0.716–0.818) |
Cancer-specific survival | |||
C-statistic | 0.775 (0.759–0.791) | 0.758 (0.723–0.793) | 0.755 (0.710–0.800) |
Calibration slope | 1.021 (0.931–1.111) | 0.939 (0.769–1.109) | 0.977 (0.755–1.199) |
3-year AUC | 0.797 (0.712–0.761) | 0.786 (0.737–0.835) | 0.757 (0.692–0.821) |
5-year AUC | 0.793 (0.771–0.816) | 0.788 (0.737–0.839) | 0.757 (0.691–0.823) |
AUC, the area under the receiver operating characteristic curve. Numbers in parentheses are the bootstrapped 95% confidence interval.
For the CSS model, the C-statistics were 0.775, 0.758, and 0.755 in the three cohorts. The slopes of the calibration plots were 1.021, 0.939, and 0.977. The 3-year AUCs were 0.797, 0.786, and 0.757. The 5-year AUCs were 0.793, 0.788, and 0.757 (Table 4).
Nomograms for predicting the 3- and 5-year survival
The baseline survivals and coefficients of the final models calculated on the imputed training cohort were used to generate the nomograms for predicting the probability of 3- and 5-year OS (Figure 5A) and CSS (Figure 5B) for VC patients. To use a nomogram, a vertical line is drawn from the patient’s value of each factor to the axis labeled with points to obtain the score for that factor. The scores of all factors are then summed, and the patient’s probability of 3- or 5-year survival is the probability that corresponds to the total score.
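The point scale of a nomogram is a linear rescaling of the model coefficients, which the following Python sketch illustrates. It is an assumption-laden illustration, not the output of the modified “nomocox” command: it uses a common convention in which the variable with the widest coefficient range spans 0 to 100 points, and it includes only three of the OS variables from Table 3 to stay short. All function and key names are hypothetical.

```python
# OS beta coefficients for three variables, taken from Table 3.
OS_BETAS = {
    "age": {"18-39": 0.0, "40-59": 0.19593, "60-79": 0.67730, "80-100": 1.46783},
    "tumor_size": {"<2": 0.0, "2-4": 0.21352, ">=4": 0.44749},
    "m_stage": {"M0": 0.0, "M1": 0.66769, "MX": -0.20732},
}


def build_points(betas):
    """Rescale coefficients so the widest-ranging variable spans 0 to 100 points."""
    widest = max(max(c.values()) - min(c.values()) for c in betas.values())
    scale = 100.0 / widest  # points per unit of linear predictor
    points = {
        var: {level: (b - min(c.values())) * scale for level, b in c.items()}
        for var, c in betas.items()
    }
    # Shifting each variable to start at 0 points removes this offset from the sum.
    offset = sum(min(c.values()) for c in betas.values())
    return points, scale, offset


def total_points_to_lp(total_points, scale, offset):
    """Convert a total score back to the Cox linear predictor."""
    return total_points / scale + offset


points, scale, offset = build_points(OS_BETAS)
patient = {"age": "60-79", "tumor_size": ">=4", "m_stage": "MX"}
total = sum(points[var][level] for var, level in patient.items())
lp = total_points_to_lp(total, scale, offset)
```

The linear predictor recovered from the total score can then be combined with the baseline survival to give the predicted survival probability printed on the nomogram's bottom axes.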
Discussion
This retrospective cohort study developed and validated models for predicting the 3- and 5-year OS and CSS of VC patients based on a cohort of 2,417 cases from a population-based multicenter database. Our models, which showed good discrimination and calibration, had a larger net benefit than the AJCC stage, indicating their clinical usefulness. Using the corresponding nomograms, which provide a convenient and well-calibrated survival prediction tool, clinicians could estimate patient-level survival, recommend intensive clinical follow-up for high-risk patients, and deliver risk-adapted treatment.
The variables included in our models were age at diagnosis, tumor size, radiotherapy type, chemotherapy, surgery type, number of lymph nodes removed, T stage, N stage, M stage, and the presence of other malignancies. Those variables are routinely assessed in clinical practice. To our knowledge, no models integrating those factors have been developed to predict the survival of VC patients because of VC’s extreme rarity (33, 34). We are the first to integrate those factors into a single survival prediction model and build nomograms predicting VC patients’ survival using a large representative population-based cohort. The OS and CSS models contained the same variables, except for the presence of other malignancies, which was included only in the CSS model. Accordingly, the probabilities of OS and CSS can be determined simultaneously, which adds to our models’ practical usefulness. Another distinctive feature is that we bootstrapped the entire modeling process, including model selection, performance indicator generation, and the calculation of baseline survival, coefficients, and standard errors, which further enhanced the generalizability of our models. In addition, we assessed internal validity with bootstrap to obtain a more realistic estimate of the models’ prediction performance in similar future patients.
Another strength of this study is that multiple imputation was used to generate 10 sets of imputed samples, which increased the usable sample size, brought the calculated coefficients closer to their actual values, and narrowed their standard errors. Multiple imputation could reduce the bias that arises when the complete cases poorly represent the full cohort. Internal validation was performed in the complete cases of the validation cohort and in the imputed validation cohort, with similar results, which further supported the performance of our models (35, 36). Moreover, multiple imputation was followed by the bootstrap technique in this study, which allowed our models to capture more uncertainty and increased their generalizability.
This study confirmed that older age and larger tumor size were negatively associated with survival, consistent with other studies (4, 11, 12, 37–39). We also found that a higher tumor stage was negatively correlated with the survival of VC patients, similar to previous studies (4, 17, 38–40). Instead of using a single FIGO or AJCC stage, we investigated the T, N, and M stages in our models because they give a more detailed representation of tumor progression than a single stage. The significant association of the N stage with survival agreed with published studies that also found a correlation between lymph node invasion and survival (12, 40). To further investigate the effect of lymph node resection, we also controlled for the number of lymph nodes removed and found that it was significantly associated with survival. Including the number of lymph nodes removed in the models also contributed to a more precise survival prediction and reflects modern surgical practice.
Moreover, we found that radiotherapy, chemotherapy, and more aggressive surgery were positively correlated with survival, in agreement with other studies (9, 11, 12, 14, 38–41). Surgery combined with radiotherapy and chemotherapy is still the primary treatment for VC (42). Radiotherapy in the SEER database is classified into beam radiation, radioactive implants, radioisotopes, beam plus implants (combination of beam radiation with radioactive implants or radioisotopes), and radiation-NOS. The radioactive implants and radioisotopes correspond to brachytherapy. We also found that beam plus implants or implants alone were associated with better survival than beam radiation, similar to published studies (7, 9, 12, 14, 38, 43, 44). Some studies argued that image guidance might improve the effectiveness of brachytherapy (45–47). Furthermore, we found that the presence of other malignancies was a favorable prognostic factor for CSS but not for OS. A possible explanation is that VC patients who survive longer have more time to develop other malignancies, so death from those malignancies competes with death from VC.
Additionally, we found no improvement in model prediction performance with the addition of marital status, race, pathological grade, and histology type. The lack of improvement suggests that the variables already in the models captured the effects of those variables. The nonsignificance of histology type reflects the similar prognosis of SCC and ADE in VC.
Caution is needed when applying the nomograms in clinical practice. We only included VC patients with SCC and ADE in the study. Accordingly, the proposed nomograms should be used only for these two histology types; applying them to other histology types might be problematic. In addition, the nomograms were built on patients aged 18–100 years, so they should be applied with caution to younger or older patients. Finally, given that the SEER database only includes the United States population, care should be taken when the models are used in populations of other countries.
Some limitations of this study should be noted. First, we could not control for the tumor’s detailed location (the upper or lower part of the vagina) because the location is unavailable in the SEER database. An upper-third location is associated with longer survival, possibly because of a different lymphatic drainage pattern (11). Although we controlled for the T, N, and M stages, which could account for some of the effect of tumor location, there may still be confounding by location. Second, because of the retrospective nature of the study, factors highly correlated with the survival of VC patients might be missing, although we assessed the available variables suggested by previous studies. Third, external validation on a distinct population was not carried out because a sufficiently large sample from a different population was unavailable in a single healthcare center owing to VC’s extreme rarity. Finally, human papillomavirus (HPV) has been argued to be positively associated with the survival of VC patients (48). However, the HPV status of VC patients was not available in the SEER database, so it could not be controlled for.
Conclusion
We developed and validated the first models for predicting the probability of 3- and 5-year OS and CSS for VC patients with SCC or ADE and generated the first nomograms based on those models. Our models and the corresponding nomograms, which showed good survival prediction performance, could help clinicians perform risk-adapted follow-up and treatment of VC patients. Further prospective studies investigating more factors, such as the tumor’s location, are warranted to confirm our findings and improve the prediction accuracy.
Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here: http://seer.cancer.gov.
Ethics statement
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.
Author contributions
W-LZ and Y-YY had full access to all the data in the study and took responsibility for the integrity of the data and the accuracy of the data analysis, organized the original individual studies concept and design, analyzed and interpreted the data, and revised the manuscript. Y-YY acquired the raw data. W-LZ drafted the manuscript. Both authors read and approved the final manuscript.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2022.919150/full#supplementary-material
References
- 1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. (2021) 71:209–49. doi: 10.3322/caac.21660
- 2. Van Dyne EA, Henley SJ, Saraiya M, Thomas CC, Markowitz LE, Benard VB. Trends in human papillomavirus-associated cancers - United States, 1999-2015. MMWR Morb Mortal Wkly Rep. (2018) 67:918–24. doi: 10.15585/mmwr.mm6733a2
- 3. Frank SJ, Deavers MT, Jhingran A, Bodurka DC, Eifel PJ. Primary adenocarcinoma of the vagina not associated with diethylstilbestrol (DES) exposure. Gynecol Oncol. (2007) 105:470–4. doi: 10.1016/j.ygyno.2007.01.005
- 4. Hiniker SM, Roux A, Murphy JD, Harris JP, Tran PT, Kapp DS, et al. Primary squamous cell carcinoma of the vagina: Prognostic factors, treatment patterns, and outcomes. Gynecol Oncol. (2013) 131:380–5. doi: 10.1016/j.ygyno.2013.08.012
- 5. Aridgides P, Onderdonk B, Cunningham M, Daugherty E, Du L, Bunn WD, et al. Institutional experience using interstitial brachytherapy for the treatment of primary and recurrent pelvic malignancies. J Contemp Brachyther. (2016) 8:175–82. doi: 10.5114/jcb.2016.61062
- 6. Nomura H, Matoda M, Okamoto S, Omatsu K, Kondo E, Kato K, et al. Clinical characteristics of non-squamous cell carcinoma of the vagina. Int J Gynecol Cancer. (2015) 25:320–4. doi: 10.1097/IGC.0000000000000351
- 7. Huertas A, Dumas I, Escande A, del Campo ER, Felefly T, Canova C-H, et al. Image-guided adaptive brachytherapy in primary vaginal cancers: A monocentric experience. Brachytherapy. (2018) 17:571–9. doi: 10.1016/j.brachy.2018.01.005
- 8. Laliscia C, Gadducci A, Fabrini MG, Barcellini A, Guerrieri ME, Parietti E, et al. Definitive radiotherapy for primary squamous cell carcinoma of the vagina: Are high-dose external beam radiotherapy and high-dose-rate brachytherapy boost the best treatment? Experience of two Italian institutes. Oncol Res Treat. (2017) 40:697–701. doi: 10.1159/000480350
- 9. Manuel MM, Cho LP, Catalano PJ, Damato AL, Miyamoto DT, Tempany CM, et al. Outcomes with image-based interstitial brachytherapy for vaginal cancer. Radiother Oncol. (2016) 120:486–92. doi: 10.1016/j.radonc.2016.05.019
- 10. Laliscia C, Fabrini MG, Delishaj D, Coraggio G, Morganti R, Tana R, et al. Concomitant external-beam irradiation and chemotherapy followed by high-dose rate brachytherapy boost in the treatment of squamous cell carcinoma of the vagina: A single-center retrospective study. Anticancer Res. (2016) 36:1885–9.
- 11. Gadducci A, Fabrini MG, Lanfredini N, Sergiampietri C. Squamous cell carcinoma of the vagina: Natural history, treatment modalities and prognostic factors. Crit Rev Oncol Hematol. (2015) 93:211–24. doi: 10.1016/j.critrevonc.2014.09.002
- 12. Ikushima H, Wakatsuki M, Ariga T, Kaneyasu Y, Tokumaru S, Isohashi F, et al. Radiotherapy for vaginal cancer: A multi-institutional survey study of the Japanese radiation oncology study group. Int J Clin Oncol. (2018) 23:314–20. doi: 10.1007/s10147-017-1205-z
- 13. Aktas M, de Jong D, Nuyttens JJ, van der Zee J, Wielheesen DH, Batman E, et al. Concomitant radiotherapy and hyperthermia for primary carcinoma of the vagina: A cohort study. Eur J Obstet Gynecol Reprod Biol. (2007) 133:100–4. doi: 10.1016/j.ejogrb.2006.05.005
- 14. Rajagopalan MS, Xu KM, Lin JF, Sukumvanich P, Krivak TC, Beriwal S. Adoption and impact of concurrent chemoradiation therapy for vaginal cancer: A National Cancer Data Base (NCDB) study. Gynecol Oncol. (2014) 135:495–502. doi: 10.1016/j.ygyno.2014.09.018
- 15. Edge SB, Compton CC. The American Joint Committee on Cancer: The 7th edition of the AJCC cancer staging manual and the future of TNM. Ann Surg Oncol. (2010) 17:1471–4. doi: 10.1245/s10434-010-0985-4
- 16. National Cancer Institute [NCI]. SEER research plus data description cases diagnosed in 1975-2018*. (2021). Available online at: https://seer.cancer.gov/data-software/documentation/seerstat/nov2020/TextData.FileDescription.pdf (accessed August 9, 2022).
- 17. Adams TS, Cuello MA. Cancer of the vagina. Int J Gynecol Obstet. (2018) 143:14–21. doi: 10.1002/ijgo.12610
- 18. Saito T, Tabata T, Ikushima H, Yanai H, Tashiro H, Niikura H, et al. Japan Society of Gynecologic Oncology guidelines 2015 for the treatment of vulvar cancer and vaginal cancer. Int J Clin Oncol. (2018) 23:201–34. doi: 10.1007/s10147-017-1193-z
- 19. Rajaram S, Maheshwari A, Srivastava A. Staging for vaginal cancer. Best Pract Res Clin Obstetrics Gynaecol. (2015) 29:822–32. doi: 10.1016/j.bpobgyn.2015.01.006
- 20. Frumovitz M, Gayed IW, Jhingran A, Euscher ED, Coleman RL, Ramirez PT, et al. Lymphatic mapping and sentinel lymph node detection in women with vaginal cancer. Gynecol Oncol. (2008) 108:478–81. doi: 10.1016/j.ygyno.2007.12.001
- 21. Hertel H, Soergel P, Muecke J, Schneider M, Papendorf F, Laenger F, et al. Is there a place for sentinel technique in treatment of vaginal cancer?: Feasibility, clinical experience, and results. Int J Gynecol Cancer. (2013) 23:1692–8. doi: 10.1097/IGC.0b013e3182a65455
- 22. Lei L, Tan L, Zhao X, Zeng F, Xu D. A prognostic nomogram based on lymph node ratio for postoperative vulvar squamous cell carcinoma from the Surveillance, Epidemiology, and End Results database: A retrospective cohort study. Ann Transl Med. (2020) 8:1382. doi: 10.21037/atm-20-3240
- 23. Zhou H, Zou X, Li H, Chen L, Cheng X. Construction and validation of a prognostic nomogram for primary vulvar melanoma: A SEER population-based study. Jap J Clin Oncol. (2020) 50:1386–94. doi: 10.1093/jjco/hyaa137
- 24. Rouzier R, Preti M, Haddad B, Martin M, Micheletti L, Paniel BJ. Development and validation of a nomogram for predicting outcome of patients with vulvar cancer. Obstet Gynecol. (2006) 107:672–7. doi: 10.1097/01.AOG.0000198639.36855.e9
- 25. Kim MK, Kim JW, Lee JM, Lee NW, Cha MS, Kim BG, et al. Validation of a nomogram for predicting outcome of vulvar cancer patients, primarily treated by surgery, in Korean population: Multicenter retrospective study through Korean Gynecologic Oncology Group (KGOG-1010). J Gynecol Oncol. (2008) 19:191–4. doi: 10.3802/jgo.2008.19.3.191
- 26. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): The TRIPOD statement. Ann Intern Med. (2015) 162:55–63. doi: 10.7326/M14-0697
- 27. National Cancer Institute [NCI]. SEER*Stat database: Incidence - SEER research plus data, 18 registries, November 2020 Sub (2000-2018) - Linked To County Attributes - Total U.S., 1969-2019 Counties, National Cancer Institute, DCCPS, Surveillance research program, released April 2021, based on the November 2020 submission. (2021). Available online at: https://seer.cancer.gov/data-software/documentation/seerstat/nov2020/ (accessed 25 March, 2022).
- 28. Shaikh WR, Weinstock MA, Halpern AC, Oliveria SA, Geller AC, Dusza SW. The characterization and potential impact of melanoma cases with unknown thickness in the United States’ Surveillance, Epidemiology, and End Results Program, 1989-2008. Cancer Epidemiol. (2013) 37:64–70. doi: 10.1016/j.canep.2012.08.010
- 29. Wahl S, Boulesteix AL, Zierer A, Thorand B, van de Wiel MA. Assessment of predictive performance in incomplete data by combining internal validation and multiple imputation. BMC Med Res Methodol. (2016) 16:144. doi: 10.1186/s12874-016-0239-7
- 30. Hess KR. Graphical methods for assessing violations of the proportional hazards assumption in Cox regression. Stat Med. (1995) 14:1707–23. doi: 10.1002/sim.4780141510
- 31. Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. (2000) 56:337–44. doi: 10.1111/j.0006-341x.2000.00337.x
- 32. Alexander Z, Victor A. A general-purpose nomogram generator for predictive logistic regression models. Stata J. (2015) 15:537–46.
- 33. Huang J, Cai M, Zhu Z. Survival and prognostic factors in primary vaginal cancer: An analysis of 2004-2014 SEER data. Transl Cancer Res. (2020) 9:7091–102. doi: 10.21037/tcr-20-1825
- 34. Jang WI, Wu HG, Ha SW, Kim HJ, Kang SB, Song YS, et al. Definitive radiotherapy for treatment of primary vaginal cancer: Effectiveness and prognostic factors. Int J Gynecol Cancer. (2012) 22:521–7. doi: 10.1097/IGC.0b013e31823fd621
- 35. Ludtke O, Robitzsch A, Grund S. Multiple imputation of missing data in multilevel designs: A comparison of different strategies. Psychol Methods. (2017) 22:141–65. doi: 10.1037/met0000096
- 36. Enders CK, Mistler SA, Keller BT. Multilevel multiple imputation: A review and evaluation of joint modeling and chained equations imputation. Psychol Methods. (2016) 21:222–40. doi: 10.1037/met0000063
- 37. Wolfson AH, Reis IM, Portelance L, Diaz DA, Zhao W, Gibb RK. Prognostic impact of clinical tumor size on overall survival for subclassifying stages I and II vaginal cancer: A SEER analysis. Gynecol Oncol. (2016) 141:255–9. doi: 10.1016/j.ygyno.2016.03.009
- 38. Miyamoto DT, Viswanathan AN. Concurrent chemoradiation for vaginal cancer. PLoS One. (2013) 8:e65048. doi: 10.1371/journal.pone.0065048
- 39. Zhou WL, Yue YY. Radiotherapy plus chemotherapy is associated with improved survival compared to radiotherapy alone in patients with primary vaginal carcinoma: A retrospective SEER study. Front Oncol. (2020) 10:570933. doi: 10.3389/fonc.2020.570933
- 40. Zhou W, Yue Y, Pei D. Survival benefit of vaginectomy compared to local tumor excision in women with FIGO stage I and II primary vaginal carcinoma: A SEER study. Arch Gynecol Obstet. (2020) 302:1429–39. doi: 10.1007/s00404-020-05737-6
- 41. Meixner E, Arians N, Bougatf N, Hoeltgen L, König L, Lang K, et al. Vaginal cancer treated with curative radiotherapy with or without concomitant chemotherapy: Oncologic outcomes and prognostic factors. Tumori. (2021):3008916211056369. [Epub ahead of print]. doi: 10.1177/03008916211056369
- 42. Kulkarni A, Dogra N, Zigras T. Innovations in the management of vaginal cancer. Curr Oncol. (2022) 29:3082–92. doi: 10.3390/curroncol29050250
- 43. Orton A, Boothe D, Williams N, Buchmiller T, Huang YJ, Suneja G, et al. Brachytherapy improves survival in primary vaginal cancer. Gynecol Oncol. (2016) 141:501–6. doi: 10.1016/j.ygyno.2016.03.011
- 44. Reshko LB, Gaskins JT, Metzinger DS, Todd SL, Eldredge-Hindy HB, Silva SR. The impact of brachytherapy boost and radiotherapy treatment duration on survival in patients with vaginal cancer treated with definitive chemoradiation. Brachytherapy. (2021) 20:75–84. doi: 10.1016/j.brachy.2020.08.020
- 45. Goodman CD, Mendez LC, Velker V, Weiss Y, Leung E, Louie AV, et al. 3D image-guided interstitial brachytherapy for primary vaginal cancer: A multi-institutional experience. Gynecol Oncol. (2021) 160:134–9. doi: 10.1016/j.ygyno.2020.10.021
- 46. Westerveld H, Schmid MP, Nout RA, Chargari C, Pieters BR, Creutzberg CL, et al. Image-Guided Adaptive Brachytherapy (IGABT) for primary vaginal cancer: Results of the international multicenter RetroEMBRAVE cohort study. Cancers (Basel). (2021) 13:1459. doi: 10.3390/cancers13061459
- 47. Jhingran A. Updates in the treatment of vaginal cancer. Int J Gynecol Cancer. (2022) 32:344–51. doi: 10.1136/ijgc-2021-002517
- 48. Rasmussen CL, Bertoli HK, Sand FL, Kjaer AK, Thomsen LT, Kjaer SK. The prognostic significance of HPV, p16, and p53 protein expression in vaginal cancer: A systematic review. Acta Obstet Gynecol Scand. (2021) 100:2144–56. doi: 10.1111/aogs.14260