Abstract
Purpose
Cure models are a useful alternative to Cox proportional hazards models in oncology studies when there is a subpopulation of patients who will not experience the event of interest. Although software is available to fit cure models, there are limited tools to evaluate, report and visualize model results. This article introduces the cureit R package, an end-to-end pipeline for building mixture cure models, and demonstrates its use in a data set of patients with primary extremity and truncal liposarcoma.
Methods
To assess associations between liposarcoma histologic subtypes and disease-specific death (DSD) in patients treated at Memorial Sloan Kettering Cancer Center between July 1982 and September 2017, mixture cure models were fit and evaluated using the cureit package. Liposarcoma histologic subtypes were defined as well-differentiated, dedifferentiated, myxoid, round cell, and pleomorphic.
Results
All other analyzed liposarcoma histologic subtypes were significantly associated with higher DSD in cure models compared with well-differentiated. In multivariable models, myxoid (odds ratio [OR], 6.25 [95% CI, 1.32 to 29.6]) and round cell (OR, 16.2 [95% CI, 2.80 to 93.2]) liposarcoma had higher incidences of DSD compared with well-differentiated patients. By contrast, dedifferentiated liposarcoma was associated with the latency of DSD (hazard ratio, 10.6 [95% CI, 1.48 to 75.9]). Pleomorphic liposarcomas had significantly higher risk in both incidence and the latency of DSD (P < .0001). Brier scores indicated comparable predictive accuracy between cure and Cox models.
Conclusion
We developed the cureit pipeline to fit and evaluate mixture cure models and demonstrated its clinical utility in the liposarcoma disease setting, providing insight into subtype-specific associations with incidence, latency, or both.
Introduction
Survival analysis is a critical tool in oncology research used to investigate clinical risk factors related to cancer progression or death and to evaluate new treatment regimens and biomarkers. The widely used Cox proportional hazards model uses the concept of an at-risk set to allow both uncensored and censored event data to contribute to the estimation of multivariable associations between the event of interest and covariates. However, Cox models rely on underlying assumptions, including proportional hazards and the supposition that, given infinite follow-up time, every individual would eventually experience the event of interest. In some studies, these assumptions may not hold, and the utility of Cox models is compromised. For example, the long-term survival rate is typically high for patients with liposarcoma after surgical resection.1,2 With such high rates of censoring in the presence of sufficient follow-up time, it is possible that most patients are risk-free or cured, which implies that the estimated associations between risk factors and time-to-event can be heavily diluted or obscured by including the cured patients. In addition, the proportional hazards assumption imposed by the Cox model may be violated under the high censoring rate.3 The potential presence of a subpopulation of cured patients indicates a population heterogeneity that should be captured explicitly, so that risk factors for cured and uncured populations are distinctly identified and estimated for more useful clinical interpretation.4
For these reasons, the mixture cure model was developed for the analysis of time-to-event data when a considerable portion of patients are cured of the disease.5,6,7 The mixture cure model assumes there are two latent patient subpopulations: those susceptible to the event of interest (ie, uncured) and those not susceptible to the event (ie, cured). The observed time-to-event distribution is assumed to be a mixture of these two subgroups, where the cured group is assumed to be event-free and the uncured group has its event risk modeled by a suitable survival function, often termed the “latency” model. This latency portion can be modeled with a parametric or semi-parametric model, with the Cox proportional hazards and accelerated failure time models being two popular choices. The probability of belonging to the cured or uncured group is formulated by a logistic regression model with latent subgroup identities (also called the “cure fraction” or “incidence” model). Using mixture cure models, we can estimate the proportion of long-term survivors (the “cure fraction”) and describe the conditional survival function for those who are not cured.5
The separate modeling of subgroup classification and subgroup-specific survival facilitates simultaneous investigation of the association of patient characteristics with both cure probability and the survival probability given uncured, which is particularly effective if the prevalence of cured patients is high. For example, certain clinical characteristics like post-resection surgical margins may be highly informative in predicting whether a patient will ever experience a recurrence, but less informative in predicting when that recurrence will occur. Conversely, larger tumor size has been shown to be associated with shorter time to recurrence for those who experience an event, but may be less informative in distinguishing who in the entire population will have a recurrence.6 In practice, cure models have been extensively used to model progression or recurrence-free survival in diseases like breast cancer, multiple myeloma and soft-tissue sarcoma where the rates of long-term survivors are high.8, 9, 10, 11
The process of applying cure models to a clinical research question is similar to that of applying Cox proportional hazards models, with a few practical differences. First, an analyst may explore univariable associations and undergo a variable selection process to determine which covariates to adjust for in a multivariable model. Variables may be selected based on clinical rationale, univariable significance, or via other procedures like penalized regression. When using cure models, analysts must select the variables most appropriate to each model subcomponent, and these variable sets may or may not overlap. Next, after fitting multivariable models, analysts evaluate their prediction performance through appropriate evaluation metrics. These may include Brier scores and K-indices, which have methods developed for calculation in the context of cure rate models.12 Finally, it may be of clinical utility to fit a nomogram. Nomograms are useful prognostic tools for visualizing predicted clinical outcomes based on underlying statistical models, and can concisely communicate estimated relationships between covariates and outcomes of interest in a visually compelling way.13,14 Despite the emerging use of cure models in clinical decision making and patient care, there is currently no software tool available to easily create or validate nomograms for cure models; hence, a nomogram tool is needed for use in this context. Furthermore, while several R packages are available to fit cure models (smcure, cuRe, rstpm2, flexsurv, evacure), and some offer cure model evaluation metrics, there are no tools for end-to-end implementation of model fitting, model evaluation, summary table production, and nomogram plotting.
In this study we introduce the cureit pipeline for building and deploying mixture cure models to facilitate their use in a clinical setting. The proposed pipeline, available in an R package, leverages existing tools for cure model parameter estimation and assessment, and offers new computational tools for model summary, visualization, and nomogram building. We illustrate the use of this pipeline in a data set of patients with primary extremity and truncal liposarcoma. We sought to improve the clinical utility of the previously published model by employing mixture cure models to help identify clinical characteristics of long-term survivors. While cure models are commonly referenced in statistical literature, they are still under-utilized in clinical applications.4 We aim to close this gap between theory and practice by providing and demonstrating the cureit package which offers a user-friendly, end-to-end pipeline for applying cure models in the clinical setting.
Methods
The cureit package provides functions to fit, assess, and visualize mixture cure models in one unified R pipeline (Figure 1). It is publicly available in a GitHub repository for download and installation. To demonstrate the main pipeline functions and the value of cure models in a clinical oncology setting, we used a previously published dataset of patients with primary extremity and truncal liposarcoma treated from July 1982 to September 2017 at Memorial Sloan Kettering.15
Figure 1:
Overview of a mixture cure model fitting framework and corresponding functions available in the cureit package.
Model Fitting with the `cureit()` Function
Mixture cure models can be fit using the `cureit()` function of the cureit R package. Equation (1) gives the marginal survival probability assumed by the mixture cure model, where $\pi(\mathbf{z})$ represents the probability of a patient being cured conditioned on observed covariates $\mathbf{z}$, and $S_u(t \mid \mathbf{x})$ denotes the conditional survival probability of uncured patients given observed covariates $\mathbf{x}$. The covariates involved in $\pi(\mathbf{z})$ and $S_u(t \mid \mathbf{x})$ may partially overlap, fully coincide, or be entirely different.
$$S(t \mid \mathbf{x}, \mathbf{z}) = \pi(\mathbf{z}) + \{1 - \pi(\mathbf{z})\}\, S_u(t \mid \mathbf{x}) \tag{1}$$
In practice, the probability of being uncured is usually linked to covariates via a logit link, while the conditional survival distribution is commonly assumed to satisfy the proportional hazards assumption. In this study, we also focus on using the logit link and proportional hazards assumption for the mixture cure models described later.
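The structure described above (a logit link for the probability of being uncured and a proportional hazards model for the uncured survival) can be sketched numerically as follows. This is a minimal illustration rather than the cureit or smcure implementation; the function names, coefficients, and baseline survival value are hypothetical and chosen only to show how the two components combine into a population survival probability.

```python
import math

def uncured_probability(intercept, coefs, z):
    """Logit-link incidence model: P(uncured | z)."""
    eta = intercept + sum(b * v for b, v in zip(coefs, z))
    return 1.0 / (1.0 + math.exp(-eta))

def uncured_survival(baseline_surv, coefs, x):
    """Proportional hazards latency model: S_u(t | x) = S_0(t) ** exp(beta'x)."""
    lp = sum(b * v for b, v in zip(coefs, x))
    return baseline_surv ** math.exp(lp)

def population_survival(p_uncured, s_uncured):
    """Mixture: cured patients never fail; uncured patients follow S_u."""
    return (1.0 - p_uncured) + p_uncured * s_uncured

# Hypothetical coefficients and baseline survival, for illustration only.
p = uncured_probability(intercept=-2.0, coefs=[0.9], z=[1.5])
s_u = uncured_survival(baseline_surv=0.80, coefs=[0.4], x=[1.0])
print(round(p, 3), round(s_u, 3), round(population_survival(p, s_u), 3))
```

Note that the population survival can never fall below the cured probability, which is what produces the plateau in the tail of the survival curve when a cured subpopulation is present.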
The `cureit()` function used for model fitting wraps the `smcure()` function from the smcure package, while providing users with additional information to aid in downstream model evaluation. The smcure package provides variance estimates for statistical inference via bootstrap resampling methods because of the complexity of the expectation-maximization algorithm.16 The `cureit()` function enhances the `smcure::smcure()` function by providing users access to the underlying data and results of all bootstrap iterations used to derive variance estimates, as these can be used downstream when calculating model evaluation metrics. Additionally, cureit provides tidier functions, structuring the model fit in a way that is compatible with tidyverse and gtsummary model reporting functionality.17,18
Model Prediction and Evaluation
The cureit package offers several functions to calculate common model performance metrics used to assess predictive accuracy of mixture cure models. The `predict()` method for `cureit` objects calculates linear predictors and predicted survival probabilities given a vector of timepoints and covariate data. Specifically, the predicted survival probability is obtained by replacing unknown parameters in equation (1) with the corresponding estimates. Based on the predicted survival probability, further assessments of the discriminative performance can be conducted via metrics like the K-index and Brier score.12,19,20,21 The package offers functions for inference of these evaluation metrics through bootstrapping or cross-validation.
Model Visualization via Nomogram
The cureit package offers a `nomogram()` function to visualize and estimate the predicted uncured probabilities and survival probabilities for a given mixture cure model. This function extends and adapts the nomogram methods developed in the rms package for application to mixture cure models.22 The nomogram consists of two subplots corresponding to the survival and cure fraction components. The predicted probability of being uncured and the predicted survival probability at a given timepoint can be read from the nomogram and then combined into an overall survival prediction for the population, which is a weighted combination of the two components. ggplot2 is used to compile all nomogram components, and the final nomogram plot is a `ggplot2` object, enabling flexible inclusion of select `ggplot2` elements on top of the basic figure.23
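As a worked illustration of the weighted combination, suppose a nomogram reading gives a predicted probability of being uncured of 0.30 and a predicted survival probability among the uncured of 0.50 at 60 months. Both values are hypothetical and are not taken from the fitted liposarcoma model. The overall survival prediction at 60 months is then:

```latex
\begin{align*}
\hat{S}(60) &= (1 - \hat{p}_{\text{uncured}}) + \hat{p}_{\text{uncured}} \times \hat{S}_u(60) \\
            &= (1 - 0.30) + 0.30 \times 0.50 \\
            &= 0.85
\end{align*}
```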
Liposarcoma Data for Cure Rate Application
To demonstrate the cureit pipeline, we used a previously published cohort with clinical data on 1,001 patients with liposarcoma.15 Clinical endpoint data on local recurrence, distant recurrence, and death were updated since the original publication to include additional follow-up time up to January 2022. Clinicopathologic variables including age at presentation, sex, tumor site, tumor size (both as a continuous variable with log2 transformation and as a categorical variable with cutoffs at 5 cm and 10 cm), tumor depth, margin status, and receipt of perioperative chemotherapy and radiation were assessed for correlation with liposarcoma subtype and clinical end points of interest. Liposarcoma subtype was defined histologically as well-differentiated (WDLS), dedifferentiated (DDLS), myxoid, round cell, or pleomorphic. To better characterize long-term survivors among extremity and truncal liposarcoma patients, the primary endpoint of interest was death due to disease following surgical resection. Patients who died from other or unknown causes and patients who were alive at last follow-up were censored at the last follow-up date.
Univariable mixture cure models were fit, with Cox proportional hazards models used to estimate the latency component. Multivariable models were fit using variables selected in the originally published Cox model, and variables were selected based on univariable significance for either component of the mixture cure model. For the purposes of comparison, conventional Cox proportional hazards models were also fit for the updated data set using the survival package.
To assess predictive performance of mixture cure models and Cox models, Brier scores were calculated on a time grid between the start of follow-up and year 25 after entering the study. Brier scores enable unbiased comparisons between different model formulations, such as Cox and cure models, without being subject to specific model assumptions. The variation of the Brier scores was estimated by repeating the model fitting and Brier score calculating procedures on 500 bootstrapped samples. A K-index, an extension of the Concordance Probability Estimate (CPE) statistic for cure models, was also calculated on the mixture cure models.12 A nomogram was created based on the final fitted multivariable model and includes estimated survival probabilities for the select timepoints.
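The Brier score and its bootstrap variation can be sketched as below. This is a deliberately simplified illustration under stated assumptions: every patient's event status at the evaluation time is treated as known (no censoring adjustment is applied), and the bootstrap resamples fixed predictions rather than refitting the model on each resample as described above. The function names and the toy inputs are hypothetical.

```python
import random

def brier_score(pred_event_prob, event_by_t):
    """Mean squared difference between predicted event probability and 0/1 status at time t."""
    n = len(event_by_t)
    return sum((p - y) ** 2 for p, y in zip(pred_event_prob, event_by_t)) / n

def bootstrap_interval(pred, obs, n_boot=500, alpha=0.05, seed=1):
    """Percentile bootstrap interval for the Brier score of fixed predictions."""
    rng = random.Random(seed)
    n = len(obs)
    scores = []
    for _ in range(n_boot):
        # Resample patients with replacement and rescore.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(brier_score([pred[i] for i in idx], [obs[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_boot)]
    upper = scores[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy predictions and outcomes, for illustration only.
pred = [0.1, 0.9, 0.5, 0.2, 0.7, 0.3]
obs = [0, 1, 1, 0, 1, 0]
print(brier_score(pred, obs), bootstrap_interval(pred, obs))
```

In a real analysis with censored follow-up, a censoring-adjusted estimator would be needed in place of the plain mean used here.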
All p-values and confidence intervals (CIs) reported are two-sided. Statistical significance is defined at the 5% level. All analyses were conducted using R Statistical Software (v4.2.2; R Core Team 2022).
Results
Patient Characteristics
In the previously reported cohort of 1,001 patients with extremity or truncal liposarcoma, well-differentiated liposarcoma (WDLS) was the most common subtype represented (n = 452; 45%), followed by myxoid (n = 239; 24%), round cell (n = 126; 13%), pleomorphic (n = 111; 11%), and dedifferentiated (n = 73; 7%; Table 1). Patients with myxoid (median 40 years [IQR: 33–52]) and round cell (median 46 years [IQR: 36–56]) subtypes were younger than patients with other subtypes (median ≥ 57 years). Most patients (75%) across all subtypes presented with lower extremity tumors; however, patients with dedifferentiated (21%) and pleomorphic (19%) subtypes presented more frequently with truncal tumors compared to other subtypes (≤ 13%). Rates of upper extremity tumors were higher in WDLS (17%) and pleomorphic (18%) compared to other subtypes (≤ 11%). Most patients had no residual tumor after resection (R0: 82%). Patients with WDLS were rarely treated with neoadjuvant therapy and had larger tumors than patients with other subtypes (Table 1).
Table 1:
Cohort Characteristics of Primary Extremity and Truncal Liposarcoma Patients
Characteristic | Overall, N = 1,001¹ | Well-differentiated, N = 452¹ | Myxoid, N = 239¹ | Round Cell, N = 126¹ | Pleomorphic, N = 111¹ | Dedifferentiated, N = 73¹ | p-value² |
---|---|---|---|---|---|---|---|
Age at Presentation, years | 55 (43, 66) | 59 (49, 68) | 40 (33, 52) | 46 (36, 56) | 57 (52, 71) | 66 (57, 77) | <0.001 |
Sex | | | | | | | 0.5 |
Female | 424 (42%) | 197 (44%) | 107 (45%) | 45 (36%) | 47 (42%) | 28 (38%) | |
Male | 577 (58%) | 255 (56%) | 132 (55%) | 81 (64%) | 64 (58%) | 45 (62%) | |
Tumor Site | | | | | | | <0.001 |
Lower extremity | 747 (75%) | 316 (70%) | 199 (83%) | 112 (89%) | 70 (63%) | 50 (68%) | |
Trunk | 128 (13%) | 60 (13%) | 23 (9.6%) | 9 (7.1%) | 21 (19%) | 15 (21%) | |
Upper extremity | 126 (13%) | 76 (17%) | 17 (7.1%) | 5 (4.0%) | 20 (18%) | 8 (11%) | |
Tumor Size, cm | 12 (8, 18) | 15 (9, 21) | 10 (6, 15) | 12 (8, 17) | 10 (6, 14) | 12 (8, 18) | <0.001 |
Unknown | 10 | 4 | 4 | 2 | 0 | 0 | |
Tumor Size Categorized | | | | | | | <0.001 |
≤ 5 cm | 132 (13%) | 48 (11%) | 41 (17%) | 14 (11%) | 21 (19%) | 8 (11%) | |
> 5, ≤ 10 cm | 268 (27%) | 89 (20%) | 83 (35%) | 38 (30%) | 43 (39%) | 15 (21%) | |
> 10 cm | 595 (59%) | 313 (69%) | 113 (47%) | 72 (57%) | 47 (42%) | 50 (68%) | |
Unknown | 6 (0.6%) | 2 (0.4%) | 2 (0.8%) | 2 (1.6%) | 0 (0%) | 0 (0%) | |
Tumor Depth | | | | | | | 0.06 |
Deep Depth | 882 (88%) | 408 (90%) | 204 (85%) | 116 (92%) | 93 (84%) | 61 (84%) | |
Superficial Depth | 119 (12%) | 44 (9.7%) | 35 (15%) | 10 (7.9%) | 18 (16%) | 12 (16%) | |
Surgical Resection Margin | | | | | | | 0.004 |
R0 | 817 (82%) | 348 (77%) | 218 (91%) | 107 (85%) | 92 (83%) | 52 (71%) | |
R1 | 162 (16%) | 92 (20%) | 19 (7.9%) | 16 (13%) | 14 (13%) | 21 (29%) | |
R2 | 21 (2.1%) | 11 (2.4%) | 2 (0.8%) | 3 (2.4%) | 5 (4.5%) | 0 (0%) | |
Unknown | 1 (<0.1%) | 1 (0.2%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | |
Chemotherapy Sequence | | | | | | | 0.2 |
Adjuvant | 60 (56%) | 6 (86%) | 7 (44%) | 17 (46%) | 24 (63%) | 6 (67%) | |
Neoadjuvant | 47 (44%) | 1 (14%) | 9 (56%) | 20 (54%) | 14 (37%) | 3 (33%) | |
Unknown | 894 | 445 | 223 | 89 | 73 | 64 | |
Radiation Therapy Sequence | | | | | | | 0.088 |
Adjuvant | 268 (86%) | 37 (95%) | 79 (78%) | 63 (86%) | 61 (88%) | 28 (90%) | |
Intraoperative | 1 (0.3%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (1.4%) | 0 (0%) | |
Neoadjuvant | 44 (14%) | 2 (5.1%) | 22 (22%) | 10 (14%) | 7 (10%) | 3 (9.7%) | |
Unknown | 688 | 413 | 138 | 53 | 42 | 42 | |
¹ Median (IQR); n (%).
² Kruskal-Wallis rank sum test; Pearson’s Chi-squared test; Fisher’s exact test (when expected cell counts < 5).
Ten patients were missing data on tumor size (in cm). Four of those 10 had available data on categorized tumor size.
Disease-specific Death
With a median follow-up time of 65 months among survivors, 112 patients died due to disease and 178 patients died due to other or unknown causes. The cumulative incidence of disease-specific death differed significantly between histologic subtypes (log-rank p-value < 0.001; Figure 2). WDLS patients had the lowest incidence of DSD at 5 years (1% [95% CI: 0%, 1.2%]), myxoid and DDLS patients had incidences of 8.6% (95% CI: 4.4%, 13%) and 9.8% (95% CI: 2.0%, 17%), respectively, and pleomorphic patients had the highest incidence at 5 years (41% [95% CI: 30%, 51%]).
Figure 2:
Cumulative incidence curves (one minus the Kaplan-Meier survival estimate) for disease-specific death in 1,001 liposarcoma patients. Patients who died of unknown or non-cancer causes are censored at time of death.
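The relationship between the Kaplan-Meier estimate and the cumulative incidence curves in Figure 2 can be sketched as follows. This is a minimal pure-Python illustration, not the code used to produce the figure; the function name and the toy follow-up data are hypothetical, and deaths from other causes are coded as censored (event = 0) to mirror the approach described in the caption.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    events: 1 = disease-specific death, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0   # events observed at time t
        leaving = 0  # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve

# Toy follow-up times in months, for illustration only.
times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 0, 1]
cumulative_incidence = [(t, 1 - s) for t, s in kaplan_meier(times, events)]
print(cumulative_incidence)
```

A curve of this kind that flattens well below 1 at long follow-up is the plateau that motivates considering a cure model.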
Model Fitting
Consistent with the previously published study, liposarcoma histologic subtype (p < 0.001) and log2 tumor size (p < 0.001) were significantly associated with disease-specific death in Cox models (Supplemental Table 1). Pleomorphic liposarcoma had the highest hazard of DSD among the subtypes, with an HR of 37.4 (95% CI: 16.19, 86.38) when compared to WDLS. Log2 tumor size also remained significantly associated with DSD in multivariable models (HR 1.93 [95% CI: 1.55, 2.41]).
The presence of a plateau at the end of the curves indicates that mixture cure models may provide further insight into which variables contribute to predicting whether a patient will die from disease, and which contribute to predicting time to death for those who will experience death due to disease (Figure 2). In univariable cure models, myxoid (OR ref WDLS 4.25 [95% CI: 1.23, 14.7]), round cell (OR ref WDLS 13.2 [95% CI: 3.00, 58.2]), and pleomorphic (OR ref WDLS 10.4 [95% CI: 3.15, 34.2]) histologic subtypes, as well as larger tumor size (OR 2.65 [95% CI: 1.26, 5.55]) and deep tumor depth (OR ref superficial 4.85 [95% CI: 1.67, 14.1]), were significantly associated with the incidence of DSD (Table 2). Patients with pleomorphic (HR ref WDLS 14.7 [95% CI: 4.14, 52.3]) and dedifferentiated (HR ref WDLS 9.47 [95% CI: 1.53, 58.6]) subtypes, along with older patients (HR 1.03 [95% CI: 1.00, 1.06]) and patients with upper extremity tumors (HR ref lower extremity 3.01 [95% CI: 1.17, 7.72]), had significantly shorter time to DSD (latency) among those expected to experience the event (Table 2).
Table 2:
Univariable and Multivariable Disease-specific Death Cure Models Fitted for Primary Extremity and Truncal Liposarcoma Patients (n = 991)
Variable | Univariable Incidence OR | Univariable Incidence 95% CI | Univariable Incidence P | Univariable Latency HR | Univariable Latency 95% CI | Univariable Latency P | Multivariable Incidence OR | Multivariable Incidence 95% CI | Multivariable Incidence P | Multivariable Latency HR | Multivariable Latency 95% CI | Multivariable Latency P |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Age at presentation | 1 | 0.98 to 1.02 | 0.9 | 1.03 | 1 to 1.06 | 0.039 | — | — | — | — | — | — |
Myxoid (ref WDLS) | 4.25 | 1.23 to 14.7 | 0.022 | 2.23 | 0.62 to 8.06 | 0.2 | 6.25 | 1.32 to 29.6 | 0.021 | 2.77 | 0.62 to 12.5 | 0.2 |
Round cell (ref WDLS) | 13.2 | 3 to 58.2 | <.001 | 2.43 | 0.59 to 9.93 | 0.2 | 16.2 | 2.80 to 93.2 | 0.002 | 3.31 | 0.68 to 16.2 | 0.14 |
Pleomorphic (ref WDLS) | 10.4 | 3.15 to 34.2 | <.001 | 14.7 | 4.14 to 52.3 | <.001 | 16.7 | 3.46 to 80.9 | <.001 | 19.2 | 4.34 to 84.8 | <.001 |
Dedifferentiated (ref WDLS) | 2.62 | 0.60 to 11.4 | 0.2 | 9.47 | 1.53 to 58.6 | 0.016 | 3.23 | 0.53 to 19.9 | 0.2 | 10.6 | 1.48 to 75.9 | 0.019 |
Sex | 1.42 | 0.74 to 2.74 | 0.3 | 0.9 | 0.46 to 1.74 | 0.7 | — | — | — | — | — | — |
Trunk (Ref lower extremity) | 0.95 | 0.42 to 2.15 | 0.9 | 1.26 | 0.56 to 2.82 | 0.6 | — | — | — | — | — | — |
Upper extremity (Ref lower extremity) | 0.44 | 0.20 to 1.01 | 0.052 | 3.01 | 1.17 to 7.72 | 0.022 | — | — | — | — | — | — |
Log tumor size | 2.65 | 1.26 to 5.55 | 0.01 | 0.55 | 0.27 to 1.16 | 0.12 | 2.42 | 1.70 to 3.45 | <.001 | 1.15 | 0.78 to 1.70 | 0.5 |
Deep tumor depth (Ref superficial depth) | 4.85 | 1.67 to 14.1 | 0.004 | 0.47 | 0.2 to 1.12 | 0.087 | — | — | — | — | — | — |
R1 (ref R0) | 1.81 | 0.82 to 4.01 | 0.14 | 0.61 | 0.28 to 1.32 | 0.2 | — | — | — | — | — | — |
R2 (ref R0) | 1.29 | 0.29 to 5.71 | 0.7 | 1.74 | 0.31 to 9.59 | 0.5 | — | — | — | — | — | — |
NOTE. Ten patients were omitted because of missing data on tumor size. Significant P values (≤ 0.05) are set in bold.
Abbreviations: HR, hazard ratio; OR, odds ratio; WDLS, well-differentiated liposarcoma.
A multivariable cure model with subtype and tumor size was fit as a comparison to the nomogram with these variables published in the original paper. When compared to WDLS patients, myxoid (OR ref WDLS 6.25 [95% CI: 1.32, 29.6]), round cell (OR ref WDLS 16.2 [95% CI: 2.80, 93.2]), and pleomorphic (OR ref WDLS 16.7 [95% CI: 3.46, 80.9]) subtypes were significantly associated with higher incidence of DSD. Larger tumor size (analyzed as a continuous variable on the log2 scale; OR 2.42 [95% CI: 1.70, 3.45]) was also significantly associated with higher incidence of DSD in this cure model. Pleomorphic (HR ref WDLS 19.2 [95% CI: 4.34, 84.8]) and dedifferentiated (HR ref WDLS 10.6 [95% CI: 1.48, 75.9]) histologic subtypes were significantly associated with shorter time to DSD (latency) when compared to WDLS patients (Table 2). Therefore, pleomorphic was the only subtype significantly associated with both higher incidence of DSD and shorter latency.
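As a consistency check on how the reported ratios, intervals, and p-values relate, the sketch below back-calculates a standard error and two-sided p-value from a ratio estimate and its 95% CI. It assumes a Wald-type interval that is symmetric on the log scale, which is an assumption made here for illustration rather than a statement of how cureit computes its inference; the function name is hypothetical.

```python
import math

def wald_p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover SE, z statistic, and two-sided p-value from a ratio and its 95% CI.

    Assumes the interval is exp(log(estimate) +/- z_crit * SE).
    """
    log_est = math.log(estimate)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_est / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Myxoid vs WDLS incidence odds ratio from the multivariable model in Table 2.
se, z, p = wald_p_from_ci(6.25, 1.32, 29.6)
print(round(se, 3), round(z, 2), round(p, 3))
```

Under this assumption, the myxoid odds ratio gives a p-value of approximately 0.021 and the dedifferentiated latency hazard ratio (10.6 [95% CI: 1.48, 75.9]) gives approximately 0.019, in line with the values reported in Table 2.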
In the multivariable cure model, myxoid and round cell subtypes were significantly associated with the risk of being uncured compared to WDLS after adjusting for tumor size; however, they were not associated with disease latency. This suggests that while patients with these subtypes have a significantly increased risk of experiencing disease-specific death compared to WDLS patients, the time from resection to disease-specific death is comparable to that of WDLS patients among those who will experience this event. In contrast, the time to DSD (latency) is significantly shorter for DDLS patients, with most DSD events occurring in the first 5 years, whereas most events in WDLS patients occur after 8 years. No significant difference was found in the incidence of DSD in DDLS compared to WDLS, suggesting that the long-term cure rates for DDLS and WDLS may be comparable. Tumor size was associated with the incidence component but not the latency component of the model. Thus, large tumor size increases the risk of being uncured regardless of liposarcoma histologic subtype. This aspect of interpretation is a distinct advantage of cure models over Cox models. While hazard ratios for subtypes and their relative significance in a Cox model indicate a relationship between the risk of disease-specific death and subtype, they do not help delineate whether the association arises because patients with certain subtypes have an overall higher incidence of disease-related death, or because certain subtypes are associated with a shorter time from resection to disease-related death.
Given that several other variables were significantly associated with DSD in univariable analyses, we fit two extended models additionally adjusted for tumor depth and tumor site as a sensitivity analysis (Supplemental Table 2). In extended models, associations of subtypes and DSD remained consistent in both direction and magnitude (Table 2, Supplemental Table 2).
Model Evaluation and Nomogram
Brier scores, defined as the mean squared difference between the predicted probability of an event and the actual event occurrence (0 or 1) at a given follow-up time, were calculated for select timepoints, and 95% CIs were estimated using 500 bootstrap resamples. Lower Brier scores indicate more accurately predicted probabilities. Brier scores were roughly comparable between the Cox and cure models in the earlier years; however, the cure model achieved slightly lower Brier scores than the Cox model until approximately 13 years post-resection (Figure 3). The confidence intervals of the cure model’s Brier scores were generally narrower than those for the Cox model, indicating lower variation in prediction performance for the cure model. The K-index of the mixture cure model was 0.69. A nomogram for the final selected multivariable cure model is presented in Figure 4.
Figure 3:
Brier scores for mixture cure models and Cox proportional hazards models adjusted for subtype and tumor size. 95% Confidence intervals were estimated using 500 bootstrap iterations.
Figure 4:
Nomogram for survival and cure probabilities at 5 years (T = 60 months) for liposarcoma patients following resection. To estimate probabilities, points for each variable should be summed separately for the cured (red) and uncured survival (blue) portions using the ‘Points’ line. Then, the respective probabilities are determined using the ‘Cured Probability Total Points’ and ‘Cured Probability’ lines and the ‘Uncured survival Total Points’ and ‘Uncured survival Probability’ lines. For an overall prediction, these two components can be combined using formula (1).
Discussion
By explicitly modeling the relationship of features to a cure fraction and a latency portion, cure models help uncover relationships between clinical or demographic characteristics and clinical cancer outcomes that may be diluted or obscured by standard Cox models. Our empirical data application on a sarcoma cohort demonstrated that, in comparison to the traditional Cox model, cure models provided additional useful interpretation of how patient characteristics were associated with the chance of experiencing the event of interest. This is particularly appealing for data sets describing patient populations with a high cured portion, where the obtained insights may help clinicians identify patients who are likely to be long-term survivors and therefore may be eligible for different treatment options or surveillance schedules than those who will likely experience disease recurrence, progression, or disease-specific death.
Additionally, a cure model may allow for modeling effects that become prominent in late follow-up. Standard Cox models assume that hazard ratios are constant over time; however, if a variable is mainly associated with late events, this assumption may be violated, potentially biasing the estimates and inferences drawn about the variable’s effect. While several clinical studies have investigated late recurrence or metastasis in soft-tissue sarcoma, results are inconclusive, and commonly employed methods for modeling late events often require a predetermined cut point for the definition of ‘late’ and fail to capture a variable’s distinct relationship with the incidence and latency components of risk.24,25,26,27
Compared to standard Cox models, cure models typically require estimation of a larger number of unknown parameters to capture associations between covariates and both the incidence and the latency. Therefore, cure models may be better suited to studies with larger sample sizes where both cured and uncured populations have abundant representation in the dataset. When interpreting a fitted cure model, it is also important to note that certain covariates can be significantly associated with the chance of being cured, while also being significantly associated with a better or worse time-to-event prediction, conditional on being uncured. These extra challenges in model fitting and model interpretation need to be carefully addressed when reporting a cure model. The cureit pipeline aims to partially mitigate these challenges by providing users with functions for the creation of intuitive model summary tables and nomograms.
The open-source tool cureit has several potential directions for future development. In our implementation of the cure model, we applied the Cox proportional hazards submodel to describe the latency portion. Other options, such as an accelerated failure time model or other parametric models, are also commonly used and will be considered for inclusion in future versions of this package. In addition, it may be useful to add functionality to account for the risk of death due to unknown or non-cancer related causes. This may include functions to model competing risks in cure models, or the inclusion of model implementations that allow for adjustment of background mortality rates derived from life tables.28 Lastly, while our software is primarily aimed at clinical data analysts and biostatisticians, expanding the pipeline’s ability to integrate with interactive web-based visualization tools like shiny could help make these models more accessible for clinicians and patients.14,29
In conclusion, we presented an open access analysis pipeline to fit and evaluate mixture cure models, and demonstrated its clinical utility in the liposarcoma disease setting. This tool was created to encourage clinical researchers to leverage cure models in appropriate settings where a subpopulation of cured individuals may require additional consideration.
Supplementary Material
Acknowledgments:
We thank Joseph Kanik for assistance in editing and preparing manuscript figures.
Supported in part by:
National Institutes of Health P50 CA217694, R21 HG012124, R21 CA214845, P30 CA008748
References
- 1. Canter RJ, Qin LX, Ferrone CR, Maki RG, Singer S, & Brennan MF (2008). Why do patients with low-grade soft tissue sarcoma die? Annals of Surgical Oncology, 15(12), 3550–3560. 10.1245/s10434-008-0163-0
- 2. Tan MC, Brennan MF, Kuk D, Agaram NP, Antonescu CR, Qin LX, ... & Singer S (2016). Histology-based classification predicts pattern of recurrence and improves risk stratification in primary retroperitoneal sarcoma. Annals of Surgery, 263(3), 593–600. 10.1097/SLA.0000000000001149
- 3. Cox DR (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34, 187–220.
- 4. Othus M, Barlogie B, Leblanc ML, & Crowley JJ (2012). Cure models as a useful statistical tool for analyzing survival. Clinical Cancer Research, 18(14), 3731–3736. 10.1158/1078-0432.CCR-11-2859
- 5. Jia X, Sima CS, Brennan MF, & Panageas KS (2013). Cure models for the analysis of time-to-event data in cancer studies. Journal of Surgical Oncology, 108(6), 342–347. 10.1002/jso.23411
- 6. Lambert PC, Thompson JR, Weston CL, & Dickman PW (2007). Estimating and modeling the cure fraction in population-based cancer survival analysis. Biostatistics, 8(3), 576–594.
- 7. Peng Y, & Dear KBG (2000). A nonparametric mixture model for cure rate estimation. Journal of the American Statistical Association, 95(449), 278–287.
- 8. De Angelis R, Capocaccia R, Hakulinen T, Soderman B, Verdecchia A. Mixture models for cancer survival analysis: application to population-based data with covariates. Stat Med. 1999 Feb 28;18(4):441–54.
- 9. Liu X, Peng Y, Tu D, ... & others. (2012). Variable selection in semiparametric cure models based on penalized likelihood, with application to breast cancer clinical trials. Statistics in Medicine, 31, 2882–2891.
- 10. Barlogie B, Jagannath S, Vesole DH, ... & others. (1997). Superiority of tandem autologous transplantation over standard therapy for previously untreated multiple myeloma. Blood, 89, 789–93.
- 11. Barlogie B, Tricot GJ, Van Rhee F, ... & others. (2006). Long-term outcome results of the first tandem autotransplant trial for multiple myeloma. British Journal of Haematology, 135, 158–64.
- 12. Zhang Yilong, Shao Yongzhao. Concordance measure and discriminatory accuracy in transformation cure models. Biostatistics, Volume 19, Issue 1, January 2018, Pages 14–26. 10.1093/biostatistics/kxx016
- 13. Balachandran VP, Gonen M, Smith JJ, & DeMatteo RP (2015). Nomograms in oncology: more than meets the eye. The Lancet Oncology, 16(4), e173–e180. 10.1016/S1470-2045(14)71116-7
- 14. Memorial Sloan Kettering Cancer Center. (Year). Liposarcoma nomogram. Retrieved September 14, 2023, from https://www.mskcc.org/nomograms/sarcoma/liposarcoma
- 15. Bartlett EK, Curtin CE, Seier K, ... & Singer S (2021). Histologic subtype defines the risk and kinetics of recurrence and death for primary extremity/truncal liposarcoma. Annals of Surgery, 273(6), 1189–1196. 10.1097/SLA.0000000000003453
- 16. Cai C, Zou Y, Peng Y, & Zhang J (2012). smcure: an R-package for estimating semiparametric mixture cure models. Computer Methods and Programs in Biomedicine, 108(3), 1255–1260. 10.1016/j.cmpb.2012.08.013
- 17. Sjoberg D, Whiting K, Curry M, Lavery J, & Larmarange J (2021). Reproducible summary tables with the gtsummary package. The R Journal, 13, 570–580. 10.32614/RJ-2021-053
- 18. Wickham H, Averick M, Bryan J, ... & others. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. 10.21105/joss.01686
- 20. Graf E, Schmoor C, Sauerbrei W, & Schumacher M (1999). Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 18(17–18), 2529–2545.
- 21. Proust-Lima C, Séne M, Taylor JM, & Jacqmin-Gadda H (2014). Joint latent class models for longitudinal and time-to-event data: a review. Statistical Methods in Medical Research, 23(1), 74–90. 10.1177/0962280212445839
- 22. Harrell FE Jr. (2023). rms: Regression modeling strategies. R package version 6.4-1. Retrieved from https://CRAN.R-project.org/package=rms
- 23. Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag; New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
- 24. Toulmonde M, Le Cesne A, Mendiboure J, Blay JY, Piperno-Neumann S, Chevreau C, ... & Italiano A (2014). Long-term recurrence of soft tissue sarcomas: prognostic factors and implications for prolonged follow-up. Cancer, 120(19), 3003–3006.
- 25. Lewis JJ, Leung D, Casper ES, Woodruff J, Hajdu SI, Brennan MF. Multifactorial analysis of long-term follow-up (more than 5 years) of primary extremity sarcoma. Arch Surg. 1999;134(2):190–4. 10.1001/archsurg.134.2.190.
- 26. Engellau J, Anderson H, Rydholm A, Bauer HC, Hall KS, Gustafson P, ... & Nilbert M (2004). Time dependence of prognostic factors for patients with soft tissue sarcoma: a Scandinavian Sarcoma Group Study of 338 malignant fibrous histiocytomas. Cancer: Interdisciplinary International Journal of the American Cancer Society, 100(10), 2233–2239.
- 27. von Konow A, Ghanei I, Styring E, & Vult von Steyern F (2021). Late local recurrence and metastasis in soft tissue sarcoma of the extremities and trunk wall: better outcome after treatment of late events compared with early. Annals of Surgical Oncology, 28, 7891–7902.
- 28. Jensen RK, Clements M, Gjærde LK, Jakobsen LH. Fitting parametric cure models in R using the packages cuRe and rstpm2. Comput Methods Programs Biomed. 2022 Nov;226:107125. doi: 10.1016/j.cmpb.2022.107125. Epub 2022 Sep 13.
- 29. Chang W, Cheng J, Allaire J, Sievert C, Schloerke B, Xie Y, Allen J, McPherson J, Dipert A, Borges B (2023). shiny: Web Application Framework for R. R package version 1.7.4.1, https://CRAN.R-project.org/package=shiny.