Abstract
Mixture cure models have been developed as an effective tool to analyze failure time data with a cure fraction. Used in conjunction with a logistic regression model, the mixture cure model allows covariate-adjusted inference of an exposure effect on the cured probability and on the hazard of failure for the uncured subjects. However, covariate-adjusted inference for the overall exposure effect is not directly provided. In this paper, we describe a Cox proportional hazards cure model to analyze interval-censored survival data in the presence of a cured fraction and then apply a post-estimation approach that uses differences in model-predicted estimates to assess the overall exposure effect on the restricted mean survival time scale. For baseline hazard/survival function estimation, simple parametric models such as fractional polynomials or restricted cubic splines are used to approximate the baseline logarithm cumulative hazard function, or alternatively, the full likelihood is specified through a piecewise linear approximation for the cumulative baseline hazard function. Simulation studies were conducted to demonstrate the unbiasedness of both estimation methods for the overall exposure effect estimates over various baseline hazard distribution shapes. The methods are applied to analyze the interval-censored relapse time data from a smoking cessation study.
Keywords: Mixture cure model, overall exposure effect, interval-censored, fractional polynomials, restricted cubic splines, piecewise linear approximation
1. INTRODUCTION
Survival analysis usually assumes that all individuals in a study population are susceptible and will eventually experience the failure if there is no censoring and the follow-up is sufficiently long. In recent years, there has been increasing interest in modelling survival data in which there exists a cured component in the population, so that some individuals are cured or immune and thus will not experience the event of interest. Well-known examples of this scenario are cancer studies with long-term survivors, e.g. cure occurs when some radiation treatment regimens kill all the cancer cells in patients with squamous cell carcinoma of the tonsil (Withers et al., 1995; Sy & Taylor, 2000). It may be inappropriate to use standard survival analysis for such data, since some individuals are non-susceptible and failing to account for these cured subjects would result in erroneous inference.
The mixture cure model provides an effective means to analyze survival data with a cure fraction by integrating a regression model for the proportion of cured subjects with a survival model (either the Cox proportional hazards model or the accelerated failure time model) for uncured/susceptible subjects. The two-component mixture cure model was first proposed by Berkson and Gage (1952), who used an estimable constant for the susceptible proportion and a parametric survival model. Farewell (1982) considered a parametric logistic function to model the cured proportion and assumed a Weibull distribution to model the baseline hazard function for uncured participants. Peng, Dear and Denham (1998) extended the parametric survival model by introducing a generalized F distribution. On the other hand, Kuk and Chen (1992) applied a semiparametric Cox proportional hazards model that leaves the baseline distribution for the uncured group unspecified and used an estimation method involving Monte Carlo approximation. Taylor (1995), Peng and Dear (2000), and Sy and Taylor (2000) modified Kuk and Chen’s method by implementing an expectation-maximization (EM) algorithm for the model parameter estimation. More recently, Wang, Du and Liang (2012) proposed a two-component mixture cure rate model with nonparametric forms for both the cure probability and the hazard rate function, in which estimation is carried out by maximizing a penalized likelihood through an EM algorithm. As an alternative, another important type of long-term survivor model, referred to as the non-mixture cure model or promotion time cure model, assumes that the cured subjects have survival time equal to infinity, so that the survival distribution for both cured and non-cured subjects can be integrated into one single formulation (Tsodikov, Ibrahim, & Yakovlev, 2003; Zeng & Lin, 2007). However, these works mainly focused on settings in which either the exact survival times or the censoring times are observed.
Interval-censored data are common in epidemiological and medical studies in which the time to an event of interest is known only to lie in an interval, instead of being observed exactly. For instance, individuals may visit a clinic at predetermined times rather than at the exact time of event onset; in this situation, the time to event is interval-censored. Unfortunately, the baseline hazard function does not cancel out when interval censoring is incorporated into the Cox proportional hazards model, and as a result, parameter estimation has proven to be quite challenging. Over the past two decades, researchers have studied this type of censoring through complex algorithms or approaches, such as the EM algorithm (Dempster, Laird, & Rubin, 1977), the self-consistency algorithm (Turnbull, 1976) or the iterative convex minorant algorithm (Pan, 1999). Another option is to specify a parametric baseline hazard function, which makes the analysis of interval-censored data much easier (Gong & Fang, 2013; Sparling, Younes, Lachin, & Bautista, 2006).
This paper is motivated by a smoking cessation study investigating the effect of a smoking intervention (SI) treatment on smoking relapse and on the duration of smoking abstinence (Murray et al., 1998). The study involved 223 subjects who did not smoke when they entered the study and were randomly assigned to two groups: an SI treatment group and a usual care (UC) control group. Each subject was observed about once a year over the five-year follow-up. The exact time of smoking relapse was not observed; if a subject resumed smoking during the study period, only an approximately one-year interval from the previous visit to the current visit was recorded. In addition, a proportion of participants permanently and successfully quit smoking, and they may be regarded as cured subjects. Therefore, these interval-censored data can be modeled by a mixture cure survival model. Researchers have recently developed some methods to analyze this type of complex failure time data (Kim & Jhun, 2008; Yu & Peng, 2008; Xiang, Ma, & Yau, 2011). Current approaches mainly focus on different model settings, e.g. Kim and Jhun (2008) proposed a cure mixture model based on a piecewise exponential distribution, and Yu and Peng (2008) considered a marginal approach to the cure mixture model using a parametric Weibull distribution. The available analysis results all report separate exposure/covariate effect estimates for the susceptible probability and for the survival time of susceptible subjects. In practice, it may be more interesting to compare treatment or exposure groups based on the overall effect in a clinical trial or observational study analysis, while also adjusting for fixed baseline covariates. To the best of our knowledge, there are no available analysis methods that quantify overall exposure effects from a causal inference perspective under the cure mixture modeling framework.
This paper presents a post-estimation approach, which we call the ‘average-predicted-value’ (APV) method (Albert, Wang & Nelson, 2011; Wang & Griswold, 2017), that uses model-predicted values for each person under different exposure statuses to assess the overall exposure effect on the restricted mean survival time (RMST) scale in the context of the mixture cure model for interval-censored data. The idea of this approach was used by Greenland (2001) to estimate a relative risk, by Zhang and Akcin (2012) to compare direct adjusted survival curves based on Aalen’s additive model, and by us to estimate overall exposure effects for zero-inflated regression models (Albert, Wang & Nelson, 2011) and Tobit models (Wang & Griswold, 2017). Two methods are described to estimate the mixture cure model parameters and the baseline hazard function used in the APV approach. The first method uses simple parametric models such as fractional polynomials (FP) or restricted cubic splines (RCS) to approximate the baseline logarithm cumulative hazard function (Sauerbrei & Royston, 1999; Royston & Parmar, 2002; Royston, 2011), and the second method assumes a piecewise linear approximation for the cumulative baseline hazard function (Turnbull, 1976; Xiang, Ma, & Yau, 2011). The mixture cure model parameters from these two methods can be estimated by maximum likelihood using standard optimization tools, such as a quasi-Newton algorithm. The APV approach is then applied to the fitted mixture cure model to represent the overall exposure effect as the ‘predicting counterfactual contrast’ under different exposure statuses, which can be expressed as a function of estimable parameters.
The rest of the paper is organized as follows. Section 2 describes the Cox proportional hazards cure model and presents the estimation procedures using simple parametric models or piecewise constants to approximate the baseline hazard respectively. In Section 3, we describe a post-estimation approach for inference of the overall exposure effects on the RMST scale. Simulation studies are used in Section 4 to assess the statistical properties of the described methods. In Section 5, we apply this method to assess the treatment effect on smoking relapse in a smoking cessation study. Discussion and concluding remarks are presented in Section 6.
2. THE COX PROPORTIONAL HAZARDS CURE MODEL
2.1. Data and model specification
Denote the observed data for individual i by (Ai, zi, δi), i = 1, 2, …, n, where Ai = (Li, Ri] indicates the interval during which individual i fails, zi is the p-dimensional vector of covariates including a primary treatment/exposure xi and (p - 1) other covariates wi, and δi = 1 if survival time is interval-censored (Ri < +∞) and δi = 0 if survival time is right-censored (Ri = +∞). Let S(t ∣ z) denote the survival function for an individual with survival time t and covariates z, then the likelihood function for interval-censored data given the observed data is,
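$$\mathcal{L} = \prod_{i=1}^{n} \left[ S(L_i \mid z_i) - S(R_i \mid z_i) \right]^{\delta_i} \left[ S(L_i \mid z_i) \right]^{1-\delta_i} \tag{1}$$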
assuming that 0 ≤ Li < Ri ≤ +∞ for all i = 1, 2, …, n.
Furthermore, we define a latent binary indicator Yi with Yi = 1 indicating that individual i will eventually experience the failure event (uncured) and Yi = 0 indicating that the individual will never experience such event (cured). It is easy to derive that Yi = 1 if δi = 1, and Yi is unknown if δi = 0. The uncured probability of an individual i with covariate zi can be modeled by a logistic regression (Farewell, 1982),
$$P(Y_i = 1 \mid z_i) = \frac{\exp\{\xi_i(z_i)\}}{1 + \exp\{\xi_i(z_i)\}} \tag{2}$$
where ξi(zi) is a linear form of the covariates, ξi(zi) = (1, zi’)α, and α is a (p + 1)-dimensional vector representing the effects of covariates on the uncured probability. In addition, we assume that the conditional hazard function for the uncured group takes the general proportional hazards form, with hazard function h(t ∣ zi) and survival function S(t ∣ zi) given by,
$$h(t \mid z_i) = h_0(t)\exp\{\eta_i(z_i)\}, \qquad S(t \mid z_i) = S_0(t)^{\exp\{\eta_i(z_i)\}} \tag{3}$$
where ηi(zi) = zi’ β is the linear predictor, β is a p-dimensional vector representing the effects of covariates on the uncured survival baseline hazard, and h0(t) and S0(t) are the baseline hazard function and baseline survival function at survival time t respectively. For convenience, we used the same set of fixed baseline covariates (both used zi) for the uncured logistic regression model and survival proportional hazards model, although different covariate sets can be chosen. Some variable selection methods can be considered in the mixture cure model if needed (Liu et al., 2012; Scolas et al., 2016). Let SM(t ∣ zi) denote the marginal survival function, represented as,
$$S_M(t \mid z_i) = P(Y_i = 0 \mid z_i) + P(Y_i = 1 \mid z_i)\, S(t \mid z_i) \tag{4}$$
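The logistic, proportional hazards and marginal survival components described above can be sketched in a few lines of Python. This is only an illustration: the Weibull baseline `weibull_s0`, the function names and all numeric values are assumptions made for the example, not quantities from the paper. The sketch takes the marginal survival to be the cured probability plus the uncured probability times the uncured survival.

```python
import math

def uncured_prob(z, alpha):
    """Logistic model for the uncured probability; alpha[0] is the intercept."""
    xi = alpha[0] + sum(a * v for a, v in zip(alpha[1:], z))
    return 1.0 / (1.0 + math.exp(-xi))

def uncured_survival(t, z, beta, s0):
    """Proportional hazards survival for uncured subjects: S0(t) ** exp(z'beta)."""
    eta = sum(b * v for b, v in zip(beta, z))
    return s0(t) ** math.exp(eta)

def marginal_survival(t, z, alpha, beta, s0):
    """Mixture of cured subjects (survival 1) and uncured subjects."""
    pi = uncured_prob(z, alpha)
    return (1.0 - pi) + pi * uncured_survival(t, z, beta, s0)

def weibull_s0(t, scale=0.3, shape=1.5):
    """Hypothetical Weibull baseline survival, used only for illustration."""
    return math.exp(-scale * t ** shape)

z = [1.0, 0.0]              # exposure x = 1, covariate w = 0
alpha = [0.5, -0.7, 0.2]    # illustrative values, not estimates
beta = [0.3, -0.1]
for t in (0.0, 2.0, 5.0):
    print(t, round(marginal_survival(t, z, alpha, beta, weibull_s0), 4))
```

As t grows, the marginal survival levels off at the cured probability rather than at zero, which is the defining feature of the cure model.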
2.2. Estimation Procedures
The likelihood for interval-censored data in the mixture cure model can be written as,
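$$\mathcal{L} = \prod_{i=1}^{n} \left\{ P(Y_i = 1 \mid z_i) \left[ S(L_i \mid z_i) - S(R_i \mid z_i) \right] \right\}^{\delta_i} \left\{ 1 - P(Y_i = 1 \mid z_i) + P(Y_i = 1 \mid z_i)\, S(L_i \mid z_i) \right\}^{1-\delta_i} \tag{5}$$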
Of note, the likelihood specification in formula (5) can be expressed as a function of the complete observed data (Ai, zi, δi) and estimable parameters (α and β) as long as the baseline survival function S0 can be specified. For this proportional hazards mixture cure model with interval-censored data, we apply two different methods to approximate the baseline survival function for maximum likelihood estimation. The first method applies a parametric approximation using either FP or RCS, both of which have simple expressions and can cover a wide range of baseline hazard functions in biomedical research (Royston, 2011). The second method is based on Cox’s semi-parametric proportional hazards model and uses a piecewise linear approximation for the nonparametric cumulative baseline hazard function (Xiang, Ma, & Yau, 2011). As shown by Li et al., the parametric mixture cure model as given by equations (2) and (3) is identifiable, and parameter estimates can be obtained by maximum likelihood. The identifiability of the mixture cure model was also discussed by Diao and Yuan (2019) and Scolas et al. (2018).
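As a rough illustration of how the logarithm of likelihood (5) can be evaluated once S0 is specified, the following Python sketch assumes the usual mixture cure contributions. An interval-censored subject (δi = 1) contributes the uncured probability times S(Li ∣ zi) − S(Ri ∣ zi), and a right-censored subject (δi = 0) contributes the cured probability plus the uncured probability times S(Li ∣ zi). The function name `log_likelihood`, the record layout and the exponential baseline in the usage lines are assumptions made for the example; they are not the SAS implementation used in this paper.

```python
import math

def log_likelihood(data, alpha, beta, s0):
    """Log-likelihood of a mixture cure model for interval-censored data.

    data  : iterable of (left, right, z); right = math.inf marks right censoring
    alpha : logistic coefficients, intercept first
    beta  : proportional hazards coefficients
    s0    : callable returning the baseline survival S0(t)
    """
    total = 0.0
    for left, right, z in data:
        xi = alpha[0] + sum(a * v for a, v in zip(alpha[1:], z))
        pi = 1.0 / (1.0 + math.exp(-xi))                  # uncured probability
        hr = math.exp(sum(b * v for b, v in zip(beta, z)))
        s_left = s0(left) ** hr
        if math.isinf(right):
            # Right-censored: either cured, or uncured and event-free at `left`.
            total += math.log(1.0 - pi + pi * s_left)
        else:
            # Interval-censored: uncured with the event inside (left, right].
            total += math.log(pi * (s_left - s0(right) ** hr))
    return total

# Usage with a hypothetical exponential baseline and two toy records.
toy = [(1.0, 2.0, [1.0]), (3.0, math.inf, [0.0])]
print(log_likelihood(toy, [0.2, -0.5], [0.3], lambda t: math.exp(-0.4 * t)))
```

In practice this function would be handed to a numerical optimizer, with S0 replaced by one of the two approximations described next.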
2.2.1. Estimation using parametric approximation for the baseline hazard
The logarithm of the baseline cumulative hazard function, lnH0(t), is approximated with an FP function of degree m > 0 for an argument t > 0, represented as

$$\ln H_0(t) = \theta_0 + \sum_{j=1}^{m} \theta_j t^{p_j}.$$

The powers p1 ≤ p2 ≤ … ≤ pm are positive or negative integers or fractions chosen from a predefined set, P = {−2, −1, −0.5, 0, 0.5, 1, 2, max(3, m)}, where t^0 denotes ln(t). The set includes no transformation (pj = 1) as well as the reciprocal, logarithmic, square root and other transformations. Of note, degree 1 with power 0 represents the Weibull baseline hazard function. In addition, the definition also includes possible ‘repeated powers’, which involve powers of ln(t); for example, for an FP of degree m = 3 with powers P = (−1, −1, 0), we have lnH0(t) = θ0 + θ1 t^(−1) + θ2 t^(−1) ln(t) + θ3 ln(t). An FP model of degree m has 2m degrees of freedom (d.f.) (excluding θ0): 1 degree of freedom for each coefficient θj and 1 degree of freedom for each power specification pj. We only consider models in t of degree 1 or 2, with d.f. 2 or 4 respectively, as suggested by Sauerbrei and Royston (1999), so the total number of FP models considered is 44 (8 models for m = 1 and 36 models for m = 2).
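The count of 44 candidate FP models and the handling of repeated powers can be checked with a short Python sketch. The helper names `fp_terms` and `ln_cum_hazard` are hypothetical; the power set is the eight-element set above with max(3, m) = 3, and a repeated power is taken to multiply the previous term by ln(t), as in the degree-3 example.

```python
import math
from itertools import combinations_with_replacement

POWERS = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]

def fp_terms(t, powers):
    """Basis terms of an FP in t for sorted powers.

    Power 0 stands for ln(t); each repetition of a power multiplies the
    term by a further factor of ln(t).
    """
    terms, prev, rep = [], None, 0
    for p in powers:
        rep = rep + 1 if p == prev else 0
        base = math.log(t) if p == 0 else t ** p
        terms.append(base * math.log(t) ** rep)
        prev = p
    return terms

def ln_cum_hazard(t, theta, powers):
    """ln H0(t) = theta[0] + sum_j theta[j] * (j-th FP term)."""
    return theta[0] + sum(th * term
                          for th, term in zip(theta[1:], fp_terms(t, powers)))

# All FP models of degree 1 or 2, allowing repeated powers.
models = [c for m in (1, 2) for c in combinations_with_replacement(POWERS, m)]
print(len(models))  # 8 + 36 = 44

# Degree 1 with power 0 gives ln H0(t) = theta0 + theta1 * ln(t), the Weibull form.
print(ln_cum_hazard(2.0, [-1.0, 1.5], (0,)))
```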
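In addition, we also modeled lnH0(t) as a restricted cubic spline function s(ln(t); γ) of an argument ln(t) with K ≥ 1 interior knots k1 < … < kK and two boundary knots kmin < k1, kmax > kK. The restricted cubic spline function can be written as a weighted sum of K + 1 basis functions, ln(t), υ1(ln(t)), … , υK(ln(t)), which are derived from cubic polynomial segments defined on the intervals between the knots and constrained to be linear beyond the boundary knots kmin, kmax. The spline may be written as,

$$s(\ln t; \gamma) = \gamma_0 + \gamma_1 \ln t + \gamma_2 \upsilon_1(\ln t) + \cdots + \gamma_{K+1} \upsilon_K(\ln t),$$

where the jth basis function is defined for j = 1, … , K as,

$$\upsilon_j(x) = (x - k_j)_+^3 - \lambda_j (x - k_{\min})_+^3 - (1 - \lambda_j)(x - k_{\max})_+^3, \qquad \lambda_j = \frac{k_{\max} - k_j}{k_{\max} - k_{\min}},$$

with (u)+ = max(0, u).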
The number of knots determines the complexity of the RCS function and its dimension (d.f.), which equals K + 1 excluding γ0. The knot positions do not appear critical for a good fit, and we selected centile-based positions for the internal knots as recommended by Durrleman and Simon (1989); for example, 2 interior knots and the boundary knots were obtained, respectively, from the 33rd and 67th centiles and the minimum and maximum of the interval-censored event times (imputed using the midpoint of the censoring interval). Models with d.f. > 4 were not considered since the resulting curves are expected to be potentially unstable (Royston & Parmar, 2002). In this paper, we only considered restricted cubic splines with 1, 2 or 3 interior knots, with d.f. of 2, 3 or 4 respectively.
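A minimal Python sketch of the restricted cubic spline basis follows, assuming the Royston and Parmar (2002) form in which υj(x) = (x − kj)+³ − λj(x − kmin)+³ − (1 − λj)(x − kmax)+³ with λj = (kmax − kj)/(kmax − kmin). The function names and the knot values in the usage lines are illustrative assumptions.

```python
def rcs_basis(x, interior, kmin, kmax):
    """Basis functions v_1..v_K of a restricted cubic spline at x = ln(t)."""
    def pos3(u):
        # Truncated cubic: u**3 when u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    out = []
    for k in interior:
        lam = (kmax - k) / (kmax - kmin)
        out.append(pos3(x - k)
                   - lam * pos3(x - kmin)
                   - (1.0 - lam) * pos3(x - kmax))
    return out

def rcs(x, gamma, interior, kmin, kmax):
    """s(x; gamma) = gamma0 + gamma1 * x + sum_j gamma_{j+1} * v_j(x)."""
    v = rcs_basis(x, interior, kmin, kmax)
    return gamma[0] + gamma[1] * x + sum(g * vj for g, vj in zip(gamma[2:], v))

# Two interior knots give K + 1 = 3 d.f. (gamma1..gamma3), excluding gamma0.
knots, lo, hi = [0.5, 1.0], 0.0, 2.0
gamma = [0.1, 0.5, -0.2, 0.3]
for x in (-1.0, 0.75, 3.0):
    print(x, rcs(x, gamma, knots, lo, hi))
```

With this basis the cubic and quadratic terms cancel beyond kmax and every basis function is zero below kmin, so the fitted curve is linear in ln(t) outside the boundary knots.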
The log-likelihood of the mixture cure model for the interval-censored survival data is defined as the logarithm of the expression in equation (5) with the full parametric FP or RCS specification for the baseline hazard function, and the resulting log-likelihood is then maximized to estimate the corresponding model parameters (α, β, θ or γ). The final model can be chosen based on a model selection criterion, e.g. the Akaike information criterion (AIC), among the 47 potential FP and RCS models (44 FP models and 3 RCS models, see online Appendix A for detailed model specification). The Bayesian information criterion (BIC) was not used since it has been criticized for its tendency to give models that are too parsimonious (Weakliem, 1999).
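The selection step can be sketched as follows, using the standard definition AIC = 2k − 2 ln L. The labels, log-likelihood values and parameter counts in the usage lines are made up for illustration, and how k is counted for FP models (for example, whether the power specifications are included) is an assumption left to the analyst.

```python
def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_lik

def select_by_aic(fits):
    """Return the (label, log_lik, n_params) tuple with the smallest AIC."""
    return min(fits, key=lambda fit: aic(fit[1], fit[2]))

# Hypothetical fitted candidates: (label, maximized log-likelihood, k).
candidates = [
    ("FP degree 1, power 0", -100.0, 5),
    ("RCS, 2 interior knots", -98.0, 8),
]
print(select_by_aic(candidates)[0])
```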
2.2.2. Piecewise linear approximation for the cumulative baseline hazard function
We note that the likelihood function (5) depends on S0(t) only through its values at the observation time points; thus, we focus on the estimation of S0(t) at these time points. Let {0 = t0 < t1 < … < tQ < tQ+1 = +∞} be the ordered distinct time points of all observed intervals {Li, Ri, i = 1, 2, …, n}, and define n × (Q + 1) indicators ψiq ≡ I((tq−1, tq] ⊂ (Li, Ri]) representing whether each interval (tq−1, tq] lies within the observed time interval (Li, Ri] or not, q = 1, 2, …, Q, Q + 1, i = 1, 2, …, n. To avoid range restrictions on the survival function parameters, we reparameterize the baseline survival function S0(t) at t1, …, tQ through a new set of parameters gq = ln{ln[S0(tq−1)] − ln[S0(tq)]}, q = 1, 2, …, Q, similar to that of Xiang, Ma & Yau (2011), so that exp(gq) is the increment of the baseline cumulative hazard, −ln S0(t), from tq−1 to tq. It is easy to express the baseline survival function S0(tq) at each time tq using gq,

$$S_0(t_q) = \exp\left\{ -\sum_{k=1}^{q} \exp(g_k) \right\}, \qquad q = 1, 2, \ldots, Q,$$

where S0(t0) = 1 and S0(tQ+1) = 0. Therefore, the likelihood function (5) can be written as follows,
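$$\mathcal{L} = \prod_{i=1}^{n} \left\{ P(Y_i = 1 \mid z_i) \sum_{q=1}^{Q+1} \psi_{iq} \left[ S_0(t_{q-1})^{\exp\{\eta_i(z_i)\}} - S_0(t_q)^{\exp\{\eta_i(z_i)\}} \right] \right\}^{\delta_i} \left\{ 1 - P(Y_i = 1 \mid z_i) + P(Y_i = 1 \mid z_i) \sum_{q=1}^{Q+1} \psi_{iq} \left[ S_0(t_{q-1})^{\exp\{\eta_i(z_i)\}} - S_0(t_q)^{\exp\{\eta_i(z_i)\}} \right] \right\}^{1-\delta_i} \tag{6}$$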
Consequently, the estimators of the parameters (α, β, g) are obtained by maximizing the logarithm of the likelihood in equation (6). In the present paper, estimation (maximization) is implemented in SAS (Version 9.4, SAS Institute Inc., Cary, NC, USA) using the NLMIXED procedure.
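The reparameterization through gq can be illustrated with a short Python sketch. Because exp(gq) is always positive, any real-valued vector g maps to a valid, strictly decreasing baseline survival function, which is why the optimization can run without range constraints. The function names are hypothetical, and the round trip at the end only checks that the two mappings are inverses.

```python
import math

def baseline_survival(g):
    """S0 at t_0, ..., t_Q from g_q = ln{ln S0(t_{q-1}) - ln S0(t_q)}."""
    surv, cum_hazard = [1.0], 0.0           # S0(t_0) = 1
    for gq in g:
        cum_hazard += math.exp(gq)          # positive cumulative hazard increment
        surv.append(math.exp(-cum_hazard))
    return surv

def log_increments(surv):
    """Inverse map: recover g_1, ..., g_Q from S0(t_0), ..., S0(t_Q)."""
    return [math.log(math.log(surv[q - 1]) - math.log(surv[q]))
            for q in range(1, len(surv))]

g = [-2.0, -1.0, 0.5]                       # unconstrained, illustrative values
s0 = baseline_survival(g)
print(s0)
print(log_increments(s0))                   # approximately recovers g
```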
3. ESTIMATION OF OVERALL EXPOSURE EFFECTS
The APV method involves the calculation of the model-predicted overall responses under different exposure status for each person in a chosen reference population using the proportional hazards mixture cure model setting (formula (2) and (3)). Predictions are made for each person in the selected reference population, both if the person was exposed (xi = 1) and if the person was not exposed (xi = 0) while fixing other covariates (wi) at the person’s observed values. Since each person is either exposed or not exposed, one of these two predicted responses will represent a counterfactual value. The overall exposure effect is represented as an appropriate contrast of two means by averaging the predicted values over the covariate distribution from the chosen reference population (for example, the exposed group in this paper).
Under the proportional hazards mixture cure model, for a time-to-event outcome T, we focus on the overall exposure effect estimation on the interpretable restricted mean survival time (RMST) scale. The restricted mean survival time μ(t*) is the mean of min(T, t*), that is, of the survival time T truncated at some horizon t* > 0, and can be shown to equal the area under the survival curve S(t) from t = 0 to t = t* (Irwin, 1949; Royston & Parmar, 2011),

$$\mu(t^*) = E[\min(T, t^*)] = \int_0^{t^*} S(t)\,dt.$$
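For a pre-selected t*, μ(t*) is readily interpretable as the ‘life expectancy’ up to the particular time horizon t*; for example, if T is years to death, we may interpret μ(t*) as the ‘t*-year life expectancy’. We can define the marginalized RMST (μM) for the survival outcome T in the presence of a cure fraction under our proportional hazards mixture cure model, and then derive the population-wide overall exposure effect on the RMST scale (δRMST) using G as the reference group,

$$\mu_M(t^* \mid x_i, w_i) = \int_0^{t^*} S_M(t \mid x_i, w_i)\,dt = \int_0^{t^*} \left[ P(Y_i = 0 \mid x_i, w_i) + P(Y_i = 1 \mid x_i, w_i)\, S(t \mid x_i, w_i) \right] dt \tag{7}$$

$$\delta_{RMST}(t^*) = E_{w_i \in G}\left[ \mu_M(t^* \mid x_i = 1, w_i) - \mu_M(t^* \mid x_i = 0, w_i) \right] \tag{8}$$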
The expected value μM(t ∣ xi, wi) for an individual with observed covariate values wi and given exposure status xi can be expressed as a function of α, β, θ/γ/g as in formula (7) by using FP, RCS or piecewise linear approximation for the baseline cumulative hazard function and plugging in estimated coefficients following the fit of the model (2) and (3) to the whole sample. An estimate of the mean difference δRMST is then obtained by averaging the predicted μM difference (xi = 1 vs. 0) over the empirical distribution of the covariates (z) in the reference group G. The variance for the estimate of δRMST can be obtained via the delta method. This approach can be tedious considering that the δRMST estimate includes the integration calculation. An alternative approach is to obtain variance estimates via bootstrap resampling (Efron & Tibshirani, 1993) which allows the computation of confidence intervals without requiring a normality assumption for the estimator.
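The APV calculation can be sketched in Python as follows. The sketch assumes that the marginal RMST is the area under the marginal survival curve (cured probability plus uncured probability times the uncured survival) up to the horizon, and that the overall effect is the average, over the covariates of the reference group, of the exposed-minus-unexposed difference. The trapezoidal rule, the function names, the exponential baseline and the coefficient values are illustrative assumptions; variance estimation (delta method or bootstrap) is not shown.

```python
import math

def rmst(surv, horizon, steps=2000):
    """Area under surv(t) on [0, horizon] by the trapezoidal rule."""
    h = horizon / steps
    area = 0.5 * (surv(0.0) + surv(horizon))
    area += sum(surv(i * h) for i in range(1, steps))
    return area * h

def apv_rmst_difference(reference_w, alpha, beta, s0, horizon):
    """Average over the reference group of RMST(x = 1, w) - RMST(x = 0, w).

    The exposure is assumed to be the first covariate in both model parts.
    """
    def marginal(t, z):
        xi = alpha[0] + sum(a * v for a, v in zip(alpha[1:], z))
        pi = 1.0 / (1.0 + math.exp(-xi))                  # uncured probability
        hr = math.exp(sum(b * v for b, v in zip(beta, z)))
        return 1.0 - pi + pi * s0(t) ** hr

    diffs = []
    for w in reference_w:
        exposed = rmst(lambda t: marginal(t, [1.0] + list(w)), horizon)
        unexposed = rmst(lambda t: marginal(t, [0.0] + list(w)), horizon)
        diffs.append(exposed - unexposed)
    return sum(diffs) / len(diffs)

# Usage: covariate values of a hypothetical exposed reference group.
group_w = [(0.0,), (1.0,), (1.0,)]
effect = apv_rmst_difference(group_w, [0.5, -0.7, 0.2], [-0.2, 0.1],
                             lambda t: math.exp(-0.3 * t), horizon=5.0)
print(round(effect, 3))
```

Both predictions for a person use that person's observed covariates w, so one of the two is always a counterfactual value; only the exposure indicator is switched.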
4. SIMULATION STUDY
4.1. Simulation study design and methods
In this section, we used a simulation study to examine the statistical properties of the overall exposure effect estimators on the restricted mean survival time scale from the two described estimation methods under a variety of hazard distributions. The interval-censored survival times (Li, Ri, δi) with cure fraction for the ith subject are generated following the sampling algorithm given below:
(i) We sample a binary exposure indicator Xi (1 if exposed with 50% frequency, 0 otherwise) and a binary covariate Wi (50% frequency of Wi = 1 for both the exposed and the non-exposed group).
(ii) We generate a binary latent variable Yi using a logistic regression P(Yi = 1) = 1/{1 + exp[−(α0 + α1xi + α2wi)]}.
(iii) For each individual i, we sample an independent administrative censoring time CAi ~ N(6, 0.25) to curtail the follow-up around 6 years and a separate censoring time Ci from the exponential distribution with mean 6.
(iv) If Yi = 0, Li = min(CAi, Ci) < Ri = +∞ and censoring indicator δi = 0.
(v) If Yi = 1, the event time Ti is generated from the conditional hazard h(t ∣ zi) = h0(t)exp{ηi(zi)} with the baseline hazard h0(t) coming from eight parametric distributions in a manner similar to those we employed before (Wang & Albert, 2017) using the approach set out by Bender, Augustin, & Blettner (2005). The parametric distributions are specified as,
- FP with degree 1 and power P = 0.
- FP with degree 2 and powers P = (−1, 1).
- RCS with 1 interior knot.
- RCS with 3 interior knots.
- Gompertz distribution with positive shape parameter leading to a hazard function that increases with time.
- Gompertz distribution with negative shape parameter leading to a hazard function that decreases with time.
- Mixture Weibull distribution generating a bathtub shape hazard function.
- Mixture Weibull distribution generating an inverted bathtub shape hazard function.
Then we set δi = 1 if min(CAi, Ci, Ti) = Ti, and δi = 0 otherwise.
(vi) For δi = 1, we create interval length leni from independent distribution N(1, 0.2) and li from independent U(0, 1), then we choose Li = Ti − li × leni and Ri = Ti + (1 − li) × leni to satisfy Li < Ti ≤ Ri.
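The sampling steps above can be sketched in Python for the first scenario (FP of degree 1 with power 0, i.e. a Weibull baseline with ln H0(t) = θ0 + θ1 ln(t)). Several choices here are assumptions made only for the example: the values of θ0, θ1, α and β are hypothetical, the second argument of N(·, ·) is read as a standard deviation because the text does not say whether it is a variance, and a subject with Yi = 1 whose event time exceeds the censoring time is recorded as right-censored at min(CAi, Ci). The event time is drawn by inverting the cumulative hazard, in the spirit of Bender, Augustin, & Blettner (2005).

```python
import math
import random

def simulate_subject(rng, alpha, beta, theta0=-2.0, theta1=1.5):
    """Return (L, R, delta, x, w) for one subject under a Weibull baseline."""
    x = float(rng.random() < 0.5)                 # exposure indicator
    w = float(rng.random() < 0.5)                 # binary covariate
    pi = 1.0 / (1.0 + math.exp(-(alpha[0] + alpha[1] * x + alpha[2] * w)))
    uncured = rng.random() < pi                   # latent Y
    c_admin = rng.gauss(6.0, 0.25)                # assumption: 0.25 is an SD
    c_other = rng.expovariate(1.0 / 6.0)          # exponential with mean 6
    cens = min(c_admin, c_other)
    if uncured:
        eta = beta[0] * x + beta[1] * w
        u = 1.0 - rng.random()                    # uniform on (0, 1]
        # Invert H0(t) * exp(eta) = -ln(u) with H0(t) = exp(theta0) * t**theta1.
        t = (-math.log(u) / math.exp(theta0 + eta)) ** (1.0 / theta1)
        if t <= cens:
            length = rng.gauss(1.0, 0.2)          # assumption: 0.2 is an SD
            frac = rng.random()
            return (t - frac * length, t + (1.0 - frac) * length, 1, x, w)
    return (cens, math.inf, 0, x, w)              # right-censored record

rng = random.Random(2024)
sample = [simulate_subject(rng, [0.5, -0.5, 0.2], [0.3, -0.2]) for _ in range(200)]
censored = sum(1 for rec in sample if rec[2] == 0) / len(sample)
print(f"right-censored proportion: {censored:.2f}")
```

Tuning the coefficients to reach a target censoring proportion, as done for the heavy and moderate censoring scenarios, is not attempted here.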
For each specified parametric distribution, we consider two scenarios by using different values of β, α0 and α1 to generate two different levels of censoring proportion, heavy censoring with censoring proportion 60% - 70% and moderate censoring with censoring proportion 35% - 45%. Plots of the baseline hazard function and baseline cumulative hazard function from the eight simulated parametric distributions for heavy censoring are presented in Figure S1 of online Appendix C. Regression coefficient values for intercept, exposure, and covariate for cure mixture models (α0, α1, α2, β1 and β2) and cured fractions used in simulation studies are presented in Table S1 of online Appendix D.
For each simulation scenario, 1000 datasets are generated with total sample sizes of 200 and 500 respectively. The APV approaches that apply the estimation procedures utilizing either FP/RCS parametric or piecewise linear approximation for the baseline hazard function were used to estimate the overall exposure effects on the RMST scale at the last observed time for each dataset. Estimators were calculated by summing over the empirical distribution of the baseline covariate from the exposed reference group. 95% confidence intervals were constructed with percentile estimates from 200 bootstrap samples for the FP/RCS parametric approximation approach and delta methods for the piecewise linear approximation approach (see online Appendix B for partial derivatives for the delta method). The true differences in overall means for the exposed versus unexposed groups at the last observed time are defined by the equations (7) and (8) with true coefficients and true parametric baseline hazard function in place of the estimates. From the simulations, we calculated the average estimate of the bias for the mean difference (MD); the average percent error (PE = 100 × (Estimated MD – True MD)/True MD), a measure of relative bias; the standard deviation (SD) of the estimated MD; the average estimated standard error (SE) of MD; and the coverage probability (CP, percent of simulated datasets for which 95% confidence interval for MD covers the true value).
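The performance measures listed above can be computed as in the following Python sketch. The function names are hypothetical, the average PE is computed as 100 × (average bias)/(true MD), and the percentile interval uses the default quantile rule of Python's `statistics.quantiles`, which is only one of several conventions for bootstrap percentile intervals.

```python
import statistics

def percentile_ci(boot_estimates):
    """95% percentile interval: 2.5th and 97.5th percentiles of the estimates."""
    cuts = statistics.quantiles(boot_estimates, n=40)   # cut points every 2.5%
    return cuts[0], cuts[-1]

def summarize(estimates, std_errors, intervals, truth):
    """Bias, percent error, SD, average SE and coverage over simulated datasets."""
    n = len(estimates)
    bias = sum(e - truth for e in estimates) / n
    covered = sum(1 for lo, hi in intervals if lo <= truth <= hi)
    return {
        "bias": bias,
        "pe": 100.0 * bias / truth,
        "sd": statistics.stdev(estimates),
        "se": sum(std_errors) / n,
        "cp": 100.0 * covered / n,
    }

# Toy usage with three made-up replicates.
result = summarize([0.42, 0.47, 0.45], [0.39, 0.40, 0.38],
                   [(-0.3, 1.2), (-0.3, 1.3), (-0.3, 1.2)], truth=0.444)
print(result)
```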
4.2. Simulation study results
Table 1 and Table 2 show simulation results for the estimated overall exposure effect on the RMST scale from the APV approach using two different estimation procedures for the baseline hazard/survival function with moderate (35%−45%) and heavy right censoring proportion (60%−70%) in the presence of a cured fraction. From both Tables, we see that when the sample size is small (n = 100 per group), both estimation procedures may produce a small bias (average PE less than 6.6%) in estimating the mean difference using the APV approach for the sixteen simulation scenarios. When the censoring proportion decreases, we observe that both the SD and the mean SE of the estimated MD tend to be smaller in most simulation scenarios. The coverage probabilities of 95% CI from the bootstrap method for FP/RCS parametric approximation estimators and the delta method for the piecewise linear approximation estimators are close to the nominal level (within 2.0% for the overall exposure effect estimates on the RMST in both Table 1 and Table 2), and the average SE are also reasonably close to the SD of the estimates under different scenarios. These indicate that the proposed estimation and inference procedures work well for our finite samples. For simulation scenarios with larger sample size (n = 250 per exposure group in both Tables), the relative bias from both APV estimators is less than 2% for the overall exposure effect estimates on the RMST scale for all simulation scenarios with moderate or heavy censoring proportions.
Table 1.
Simulation statistics for the estimated overall exposure effects on RMST for interval-censored survival data with heavy right censoring proportion (60%-70%) in the presence of a cured fraction.
| Scenario‡ | True | FP/RCS†: Ave Bias | FP/RCS†: Ave PE (%) | FP/RCS†: SD of Est | FP/RCS†: Ave SE | FP/RCS†: CP (%) | Piecewise linear†: Ave Bias | Piecewise linear†: Ave PE (%) | Piecewise linear†: SD of Est | Piecewise linear†: Ave SE | Piecewise linear†: CP (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| n = 100 per group | |||||||||||
| 1 | 0.444 | 0.015 | 3.3 | 0.387 | 0.394 | 95.3 | 0.012 | 2.8 | 0.390 | 0.387 | 94.6 |
| 2 | 0.438 | −0.006 | −1.1 | 0.429 | 0.436 | 95.1 | −0.008 | −1.6 | 0.428 | 0.430 | 93.8 |
| 3 | 0.443 | 0.006 | 1.4 | 0.452 | 0.441 | 95.6 | 0.003 | 0.8 | 0.455 | 0.435 | 94.5 |
| 4 | 0.525 | 0.027 | 5.3 | 0.371 | 0.379 | 96.2 | 0.023 | 4.5 | 0.374 | 0.375 | 94.7 |
| 5 | 0.450 | −0.007 | −1.6 | 0.412 | 0.415 | 96.0 | −0.010 | −2.2 | 0.411 | 0.411 | 94.4 |
| 6 | 0.447 | −0.010 | −2.1 | 0.444 | 0.439 | 95.1 | −0.013 | −2.7 | 0.445 | 0.435 | 93.9 |
| 7 | 0.455 | −0.001 | −0.4 | 0.441 | 0.444 | 95.6 | −0.009 | −2.0 | 0.441 | 0.437 | 94.7 |
| 8 | 0.448 | −0.025 | −5.8 | 0.419 | 0.431 | 95.8 | −0.029 | −6.5 | 0.419 | 0.425 | 94.7 |
| n = 250 per group | |||||||||||
| 1 | 0.456 | 0.008 | 1.8 | 0.268 | 0.257 | 94.1 | 0.004 | 1.0 | 0.269 | 0.259 | 93.6 |
| 2 | 0.45 | 0.006 | 1.3 | 0.271 | 0.281 | 95.4 | 0.001 | 0.1 | 0.272 | 0.281 | 95.2 |
| 3 | 0.457 | 0.004 | 0.8 | 0.283 | 0.286 | 96.3 | 0.001 | 0.2 | 0.285 | 0.286 | 96.7 |
| 4 | 0.535 | 0.008 | 1.5 | 0.249 | 0.247 | 95.5 | 0.002 | 0.4 | 0.249 | 0.249 | 95.2 |
| 5 | 0.461 | 0.001 | 0.2 | 0.262 | 0.271 | 95.7 | −0.001 | −0.3 | 0.263 | 0.270 | 94.6 |
| 6 | 0.46 | −0.001 | −0.4 | 0.289 | 0.285 | 95.3 | −0.003 | −0.7 | 0.290 | 0.285 | 95.0 |
| 7 | 0.468 | 0 | 0.1 | 0.286 | 0.287 | 95.8 | −0.006 | −1.3 | 0.290 | 0.287 | 95.0 |
| 8 | 0.461 | −0.005 | −1.1 | 0.283 | 0.280 | 94.4 | −0.004 | −0.9 | 0.283 | 0.280 | 94.7 |
†: assessed at last observed time.
‡: see section 4.1 (v) for the baseline hazard function specification of different simulation scenarios.
Table 2.
Simulation statistics for the estimated overall exposure effects on RMST for interval-censored survival data with moderate right censoring proportion (35%-45%) in the presence of a cured fraction.
| Scenario‡ | True | FP/RCS†: Ave Bias | FP/RCS†: Ave PE (%) | FP/RCS†: SD of Est | FP/RCS†: Ave SE | FP/RCS†: CP (%) | Piecewise linear†: Ave Bias | Piecewise linear†: Ave PE (%) | Piecewise linear†: SD of Est | Piecewise linear†: Ave SE | Piecewise linear†: CP (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| n = 100 per group | |||||||||||
| 1 | 0.461 | 0.009 | 2.0 | 0.400 | 0.385 | 94.3 | 0.007 | 1.5 | 0.403 | 0.382 | 93.3 |
| 2 | 0.458 | 0.002 | 0.4 | 0.400 | 0.391 | 95.1 | 0.003 | 0.5 | 0.401 | 0.386 | 95.1 |
| 3 | 0.480 | 0.009 | 1.8 | 0.360 | 0.361 | 95.8 | 0.008 | 1.7 | 0.36 | 0.356 | 94.7 |
| 4 | 0.526 | −0.011 | −2.1 | 0.373 | 0.359 | 94.9 | 0.004 | 0.8 | 0.382 | 0.365 | 93.4 |
| 5 | 0.446 | 0.013 | 2.9 | 0.400 | 0.388 | 94.5 | 0.010 | 2.3 | 0.398 | 0.382 | 94.1 |
| 6 | 0.462 | 0.006 | 1.5 | 0.426 | 0.409 | 94.0 | 0.004 | 1.0 | 0.427 | 0.405 | 93.4 |
| 7 | 0.465 | −0.015 | −3.2 | 0.411 | 0.424 | 96.0 | −0.020 | −4.3 | 0.413 | 0.420 | 95.5 |
| 8 | 0.463 | −0.014 | −2.9 | 0.395 | 0.390 | 94.0 | −0.016 | −3.3 | 0.396 | 0.386 | 93.6 |
| n = 250 per group | |||||||||||
| 1 | 0.473 | −0.001 | −0.1 | 0.256 | 0.251 | 95.0 | 0.001 | 0.3 | 0.256 | 0.251 | 94.7 |
| 2 | 0.47 | 0.001 | 0.2 | 0.267 | 0.254 | 95.2 | −0.002 | −0.4 | 0.266 | 0.255 | 95.5 |
| 3 | 0.491 | −0.003 | −0.6 | 0.241 | 0.236 | 94.7 | 0 | 0 | 0.242 | 0.235 | 93.2 |
| 4 | 0.541 | −0.008 | −1.4 | 0.242 | 0.241 | 95.4 | −0.006 | −1.1 | 0.240 | 0.241 | 96.0 |
| 5 | 0.458 | 0.004 | 1.0 | 0.249 | 0.251 | 94.8 | 0.002 | 0.5 | 0.248 | 0.250 | 95.0 |
| 6 | 0.473 | 0.003 | 0.7 | 0.270 | 0.266 | 94.6 | 0.001 | 0.2 | 0.271 | 0.266 | 94.1 |
| 7 | 0.475 | −0.005 | −1.0 | 0.279 | 0.274 | 94.3 | −0.001 | −0.2 | 0.277 | 0.272 | 93.2 |
| 8 | 0.475 | −0.002 | −0.5 | 0.261 | 0.253 | 94.6 | −0.003 | −0.6 | 0.260 | 0.252 | 94.8 |
†: assessed at last observed time.
‡: see section 4.1 (v) for the baseline hazard function specification of different simulation scenarios.
We also performed simulation studies for sample size n = 100, and the results were less robust for a few simulation scenarios (see online Appendix E Table S2). Since we used either an AIC-selected simple parametric model (FP or RCS) or a piecewise linear function to approximate the cumulative baseline hazard function of the susceptible subjects for the overall exposure effect estimation, this is somewhat expected when the sample size is not large. Upon the reviewers’ request, we also performed additional simulation studies with a cured fraction close to zero (see online Appendix E Table S3) as well as several simulation scenarios including one binary and one continuous covariate (see online Appendix E Table S4). Both APV estimators performed well for the overall exposure effect estimation on the RMST scale in all simulation scenarios in which the cure fraction was close to zero or a continuous covariate was included in the cure mixture model.
5. APPLICATION TO THE SMOKING CESSATION DATA
For illustration purposes, the described methods are applied to the interval-censored smoking relapse data from a smoking cessation study. The data are available at https://github.com/souwwang/Cox-Cure-Overall-Exposure.git. Of the 223 smokers who did not smoke when they entered the study, 158 had not resumed smoking during the study period, giving a cure proportion of 70.9%. Our primary goal is to determine whether the smoking intervention (SI) is more effective in helping participants stop smoking compared with the usual care (UC) control treatment. In addition to the intervention type (SI/UC), other covariates included in both the logistic regression and the proportional hazards components of the mixture cure model are sex, duration as a smoker in years, and number of cigarettes smoked per day. The maximum likelihood estimates of the mixture cure models are obtained by applying the FP/RCS approximation or the piecewise linear approximation for the baseline cumulative hazard function as described in Section 2.2. The survival function estimates for the interval-censored data using Turnbull’s nonparametric estimator (1976) are shown in Figure 1 (solid line, without considering the cure fraction), together with the fitted marginal survival functions from the AIC-selected 3 d.f. RCS model (dashed line, Panel A) and the piecewise linear approximation (dashed line, Panel B). Both the 3 d.f. RCS model and the piecewise linear approximation of the baseline cumulative hazard function give reasonable agreement with Turnbull’s estimates, suggesting satisfactory fits of both estimation procedures. The estimated regression coefficients are presented in Table 3. Under the piecewise linear approximation procedure, the smoking intervention (SI) is estimated to reduce susceptibility to smoking relapse as well as to lengthen the time to relapse.
In contrast, under the 3 d.f. RCS estimation procedure, the smoking intervention (SI) is estimated to reduce susceptibility to smoking relapse but to shorten the time to relapse. The SI treatment effects do not reach statistical significance at the 0.05 level under either estimation procedure. When using the piecewise linear approximation, individuals who have smoked for a longer duration have a significantly lower probability of relapse but, if susceptible, a shorter time to resuming smoking. Subjects with greater cigarette consumption tend to have a higher probability of smoking again but, given that they are susceptible, relapse more slowly than those with lower cigarette consumption.
Figure 1.
Estimated marginal survival function curves using the AIC-selected 3 d.f. RCS model (dashed line, Panel A) and the piecewise linear approximation (dashed line, Panel B), overlaid on Turnbull’s survival estimates (solid line, Panels A and B) and the associated 95% confidence interval (gray areas, Panels A and B).
Table 3.
Parameter estimates, overall exposure effect estimates and the standard errors (SE) or 95% Confidence Intervals (CIs) from the mixture cure models for the smoking cessation data.
| Parameter | RCS, logistic model: Estimate | RCS, logistic model: SE | RCS, survival model: Estimate | RCS, survival model: SE | PL, logistic model: Estimate | PL, logistic model: SE | PL, survival model: Estimate | PL, survival model: SE |
|---|---|---|---|---|---|---|---|---|
| Intercept | 0.2608 | 1.1513 | - | - | 1.7668 | 2.0563 | - | - |
| Sex (male = 0) | 0.4057 | 0.4072 | 0.2734 | 0.4529 | 0.1170 | 0.6243 | 0.5215 | 0.5054 |
| Duration as smoker | −0.0701 | 0.0494 | 0.0626 | 0.0395 | −0.1291* | 0.0630 | 0.0799* | 0.0358 |
| Cigarettes/day | 0.0606 | 0.0498 | −0.0698* | 0.0265 | 0.0874 | 0.0461 | −0.0563* | 0.0281 |
| SI vs. UC (usual care = 0) | −0.7354 | 0.4433 | 0.2524 | 0.5454 | −0.5391 | 0.6021 | −0.1651 | 0.5271 |
| Overall Exposure (SI) Effect on RMST, year† (95% CI) | 0.299 (−0.203, 0.732) | | | | 0.320 (−0.139, 0.778) | | | |

RCS: 3 d.f. RCS model approximation; PL: piecewise linear approximation.
†: assessed at last observed time.
*: p < 0.05.
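As a reading aid for Table 3, the significance markers can be checked against the reported estimates and standard errors. The sketch below assumes that the asterisks come from two-sided Wald tests with a standard normal reference distribution; the text does not state which test was used, so this is an assumption, and `wald_test` is a hypothetical helper name introduced only for illustration.

```python
import math


def wald_test(estimate, se):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = estimate / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# (estimate, SE) pairs copied from the piecewise linear columns of Table 3.
rows = {
    "Duration as smoker (logistic)": (-0.1291, 0.0630),
    "Duration as smoker (survival)": (0.0799, 0.0358),
    "SI vs. UC (logistic)": (-0.5391, 0.6021),
    "SI vs. UC (survival)": (-0.1651, 0.5271),
}

results = {name: wald_test(est, se) for name, (est, se) in rows.items()}
for name, (z, p) in results.items():
    flag = "*" if p < 0.05 else ""
    print(f"{name}: z = {z:.2f}, p = {p:.3f} {flag}")
```

Under this assumption, the two duration coefficients fall below the 0.05 threshold while the two SI coefficients do not, which matches the markers in the table.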
Table 3 also shows the inference results for the overall exposure effect at the last observed time (5.492 years) on the RMST scale using the APV approach. The SI exposed group is used as the reference group, and the overall exposure effect estimates are similar for the two estimation procedures. Compared with usual care, the ‘smoking-free life expectancy (with follow-up restricted to 5.492 years)’ increases by 0.299 years (95% CI: −0.203, 0.732) under the 3 d.f. RCS approximation and by 0.320 years (95% CI: −0.139, 0.778) under the piecewise linear approximation. In addition, we plot the estimated overall exposure effect on the RMST scale at assessment times ranging from 0.85 to 5.492 years (Figure 2). The estimates are comparable for the two estimation procedures, and no significant overall exposure effects are detected over the whole range on both scales (the 95% confidence intervals for the overall exposure effect estimates fail to exclude 0).
Figure 2.

Estimated overall exposure (SI) effect on RMST (Panel A and B) at different assessment time points using AIC-selected 3 d.f. RCS approximation (Panel A) or piecewise linear approximation (Panel B) procedures. Solid line, estimated overall exposure (SI) effect; gray area, the associated 95% confidence intervals.
6. DISCUSSION
It is challenging to analyze interval-censored survival time data with a cure fraction. Currently described analysis methods focus on different model settings, and their inference targets the exposure and covariate effects on the separate components of the mixture cure model. In this article, we suggest using the APV approach to compare the overall response means between groups, while controlling for other baseline covariates, for the proportional hazards cure model with interval-censored survival data. Two estimation procedures are implemented to reduce the risk of misspecifying the survival distribution. The first uses an AIC-selected simple parametric model (FP or RCS) to approximate the logarithm of the baseline cumulative hazard function, and the second applies a piecewise linear approximation for the baseline cumulative hazard function. The estimate of the overall exposure effect is then obtained through a post-estimation procedure, by plugging the estimated coefficients into the model-predicted overall mean difference. The simulation study shows that the applied methods provide satisfactory estimates of the overall exposure effect over various baseline hazard distribution shapes, especially when the sample size is large, and we demonstrate the applicability of the proposed methods using a smoking cessation data example. The methods described have been implemented in a SAS macro, which is available for download at https://github.com/souwwang/Cox-Cure-Overall-Exposure.git.
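The post-estimation step summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SAS macro: the population survival function is taken to be the standard mixture form (cured fraction plus susceptible fraction times the proportional hazards survival of the uncured), a Weibull cumulative hazard stands in for the FP/RCS or piecewise linear baseline estimate, and every coefficient value, covariate, and function name is hypothetical.

```python
import numpy as np


def population_survival(t, x, gamma, beta, shape, scale):
    """Mixture cure survival for each subject (rows) at each time (columns)."""
    # Logistic component: probability of being susceptible (uncured).
    p_uncured = 1.0 / (1.0 + np.exp(-(x @ gamma)))
    # Assumed Weibull baseline cumulative hazard (stand-in for the fitted one).
    H0 = (t / scale) ** shape
    # Proportional hazards survival of the uncured; x[:, 1:] drops the intercept.
    S_u = np.exp(-np.outer(np.exp(x[:, 1:] @ beta), H0))
    return (1.0 - p_uncured)[:, None] + p_uncured[:, None] * S_u


def rmst(t_star, x, gamma, beta, shape, scale, n_grid=2001):
    """Restricted mean survival time up to t_star, by the trapezoidal rule."""
    grid = np.linspace(0.0, t_star, n_grid)
    S = population_survival(grid, x, gamma, beta, shape, scale)
    dt = grid[1] - grid[0]
    return (S[:, :-1] + S[:, 1:]).sum(axis=1) * dt / 2.0


def apv_rmst_difference(t_star, covariates, gamma, beta, shape, scale):
    """Average predicted RMST with exposure set to 1 minus exposure set to 0."""
    n = covariates.shape[0]
    x1 = np.column_stack([np.ones(n), np.ones(n), covariates])   # all exposed
    x0 = np.column_stack([np.ones(n), np.zeros(n), covariates])  # all unexposed
    r1 = rmst(t_star, x1, gamma, beta, shape, scale)
    r0 = rmst(t_star, x0, gamma, beta, shape, scale)
    return r1.mean() - r0.mean()


# Hypothetical inputs: one baseline covariate and made-up coefficients.
rng = np.random.default_rng(0)
covariates = rng.normal(size=(200, 1))
gamma = np.array([0.5, -0.7, 0.2])   # intercept, exposure, covariate (logistic)
beta = np.array([-0.2, 0.1])         # exposure, covariate (log hazard ratio)
shape, scale, t_star = 1.5, 2.0, 5.0

effect = apv_rmst_difference(t_star, covariates, gamma, beta, shape, scale)
print(f"Overall exposure effect on RMST at t* = {t_star}: {effect:.3f}")
```

Because each subject's covariates are held fixed while only the exposure is switched, the difference of the two averages is adjusted for the observed covariate distribution. In this sketch the exposure lowers both the susceptibility and the hazard, so the difference is positive; standard errors (for example by the delta method or bootstrap) are not shown.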
Although the original data set provides zip code information indicating that the data were collected from study subjects living in 51 zip code regions in the southeastern corner of Minnesota, clustering of the subjects is not considered in our analysis because of the weak correlation among subjects who reside in the same zip code region. To handle possible within-cluster correlation, the Cox proportional hazards cure model can be fitted by introducing correlated random effects, termed the frailty of the subject, into both the logistic regression and the survival function components through the linear predictor expressions (Kim & Jhun, 2008; Xiang, Ma, & Yau, 2011). Previous analyses of this smoking cessation study showed that the models with and without frailty give similar estimates (Kim & Jhun, 2008) and that the estimated standard deviations of the random effects in the logistic model and the survival model are not statistically significant (Yu & Peng, 2008). The APV approach in this paper can be extended to the cure frailty model to accommodate the dependence of subjects within the same cluster, with the model-predicted overall mean difference estimated by integrating over the frailty distribution, but the computation would be much more intensive.
In this paper, we used either an AIC-selected simple parametric model (FP or RCS) or a piecewise linear function to model the survival distribution of the susceptible subjects. For the eight complex hazard shapes designed for our simulation studies, comprising two FP models with degree 1 or 2, two RCS models with 1 or 3 internal knots, and four common types of hazard function (increasing, decreasing, bathtub shape and inverted bathtub shape), our estimation models approximate the baseline hazard function accurately enough for overall exposure effect estimation, especially when the sample size is large. The two approximation procedures perform comparably for overall exposure effect estimation in our simulation study. However, we cannot use model selection criteria such as the AIC or the Bayesian information criterion (BIC) to choose between FP/RCS and the piecewise linear function, since the latter is completely data driven and needs to estimate re-parameterized survival functions at each observed time point, which results in inflated AIC/BIC values for the piecewise linear approximation approach. In addition, the delta-method variance of the estimated exposure effect could not be calculated in 10–20% of data sets for some simulation scenarios under the piecewise linear approximation approach, possibly because of the large number of estimable parameters and the resulting unstable covariance matrix.
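The remark about inflated AIC/BIC values can be illustrated with the standard definitions of the two criteria. The numbers below are hypothetical and are not taken from the fitted models: both fits are given the same log-likelihood, and only the parameter count differs, to show how a model with one parameter per observed time point is penalized.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return k * math.log(n) - 2 * log_lik


n = 223              # sample size of the smoking cessation data
log_lik = -150.0     # hypothetical log-likelihood, identical for both fits
k_spline = 12        # hypothetical parameter count for an FP/RCS fit
k_piecewise = 60     # hypothetical parameter count for a piecewise linear fit

print("AIC:", aic(log_lik, k_spline), "vs", aic(log_lik, k_piecewise))
print("BIC:", round(bic(log_lik, k_spline, n), 1), "vs",
      round(bic(log_lik, k_piecewise, n), 1))
```

With equal fit, the criterion values differ only through the penalty term, so the comparison reflects the parameterization rather than how well either model approximates the baseline hazard.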
For the defined overall exposure effect on the RMST scale, we need to choose a specific time point t* at which the RMST function is evaluated. In this paper, we chose t* as the last observed time, and for our data example we also estimated the overall exposure effect over the whole range up to the last observed time. Other clinically motivated values of t* occurring before the last observed time may be used, e.g. five- or ten-year survival for common cancers. Since our overall exposure effect estimation equation (8) needs the survival distribution profile up to t*, we recommend choosing a time point before the last observed time in order to obtain an ‘accurate’ survival distribution approximation for overall exposure effect estimation (Wang & Albert, 2017).
Cure mixture models require not only biological evidence for the possibility of cure, but also a reasonably long follow-up time (Farewell, 1986). The latter is important, since insufficient follow-up time can lead to an overestimated cure rate. If the follow-up time is long enough and the limit of the survival function suggests a nonzero asymptote, a cure mixture model can be considered (Laska & Meisner, 1992). Maller and Zhou (1994) proposed a nonparametric test of whether the follow-up time is sufficient, based on the length of the interval at the right tail over which the Kaplan-Meier estimator of the survival function is constant. Hsu, Todem and Kim (2016) constructed a sup-score test statistic for cure fractions and established its limiting null distribution as a functional of mixtures of chi-square processes; in practice, they suggested a simple resampling procedure to approximate this limiting distribution for hypothesis testing. López-Cheda et al. (2020) proposed a nonparametric covariate hypothesis test for the probability of cure in mixture cure models and applied a bootstrap method to approximate the null distribution of the test statistic. In addition, several residual-based diagnostic methods have been developed to evaluate the fit of cure mixture models. Wileyto et al. (2013) proposed a pseudo-residual, modeled on Schoenfeld’s, to assess the fit of the survival regression in the non-cured fraction; Peng and Taylor (2017) proposed modified martingale and Cox–Snell residuals that are specific to the uncured group of the mixture cure model; and Scolas et al. (2018) defined Cox–Snell residuals intended to check the survival distribution of the uncured sub-population and of the entire population when all subjects are right- or interval-censored and some subjects are cured with unknown cure status. The discussion of hypothesis testing for the cure fraction and of diagnostic checks in mixture cure models is beyond the scope of this paper.
In conclusion, we have described and successfully applied a post-estimation APV approach to estimate the population-wide overall exposure effect for the Cox proportional hazards cure model with interval-censored survival data. A simple parametric model or a piecewise linear function is used to approximate the baseline cumulative hazard function, allowing estimation of the model coefficients and of the overall exposure effect on the RMST scale. For the survival component of the susceptible subjects, we assume the proportional hazards model in this paper, but our approach can be extended to an accelerated failure time (AFT) model (Lambert et al., 2004) or an extended hazard model (Tong et al., 2013), which would provide alternative ways of dealing with interval-censored survival data with a cure fraction.
ACKNOWLEDGEMENT
Dr. Hui Zhang receives support from the Northwestern Brain Tumor SPORE (NCI Grant #P50CA221747), the Mesulam Center for Cognitive Neurology and Alzheimer’s Disease (NIA Grant #P30AG013854), and the Robert H. Lurie Comprehensive Cancer Center (NCI Grant #P30CA060553), all of which are affiliated with Northwestern University Feinberg School of Medicine in Chicago, IL.
Footnotes
CONFLICT OF INTEREST
The authors have declared no conflict of interest.
SUPPORTING INFORMATION
Additional supporting information may be found in the online version of this article at the publisher’s website.
REFERENCES
- Albert JM, Wang W, & Nelson S (2011). Estimating overall exposure effects for zero-inflated regression models with application to dental caries. Statistical Methods in Medical Research, 23, 257–278.
- Bender R, Augustin T, & Blettner M (2005). Generating survival times to simulate Cox proportional hazards models. Statistics in Medicine, 24, 1713–1723.
- Berkson J, & Gage RP (1952). Survival curve for cancer patients following treatment. Journal of the American Statistical Association, 47, 501–515.
- Dempster AP, Laird NM, & Rubin DB (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1–38.
- Diao G, & Yuan A (2019). A class of semiparametric cure models with current status data. Lifetime Data Analysis, 25, 26–51.
- Durrelman S, & Simon R (1989). Flexible regression models with cubic splines. Statistics in Medicine, 8, 551–561.
- Efron B, & Tibshirani R (1993). An introduction to the bootstrap. New York: Chapman and Hall.
- Farewell VT (1982). The use of mixture models for the analysis of survival data with long-term survivors. Biometrics, 38, 1041–1046.
- Farewell VT (1986). Mixture models in survival analysis: are they worth the risk? Canadian Journal of Statistics, 14, 257–262.
- Gong Q, & Fang L (2013). Comparison of different parametric proportional hazards models for interval-censored data: a simulation study. Contemporary Clinical Trials, 36, 276–283.
- Greenland S (2001). Model-based estimation of relative risks and other epidemiological measures in studies of common outcomes and in case-control studies. American Journal of Epidemiology, 160, 301–305.
- Hsu WW, Todem D, & Kim K (2016). A sup-score test for the cure fraction in mixture models for long-term survivors. Biometrics, 72, 1348–1357.
- Irwin JO (1949). The standard error of an estimate of expectation of life, with special reference to expectation of tumourless life in experiments with mice. The Journal of Hygiene, 47, 188–189.
- Kim YJ, & Jhun M (2008). Cure rate model with interval censored data. Statistics in Medicine, 27, 3–14.
- Kuk AYC, & Chen CH (1992). A mixture model combining logistic regression with proportional hazards regression. Biometrika, 79, 531–541.
- Lambert P, Collett D, Kimber A, & Johnson R (2004). Parametric accelerated failure time models with random effects and an application to kidney transplant survival. Statistics in Medicine, 23, 3177–3192.
- Laska EM, & Meisner MJ (1992). Nonparametric estimation and testing in a cure model. Biometrics, 48, 1223–1234.
- Li CS, Taylor JMG, & Sy JP (2001). Identifiability of cure models. Statistics & Probability Letters, 54, 389–395.
- Liu X, Peng Y, Tu D, & Liang H (2012). Variable selection in semiparametric cure models based on penalized likelihood, with application to breast cancer clinical trials. Statistics in Medicine, 31, 2882–2891.
- López-Cheda A, Jácome MA, Van Keilegom I, & Cao R (2020). Nonparametric covariate hypothesis tests for the cure rate in mixture cure models. Statistics in Medicine, 39, 2291–2307.
- Maller RA, & Zhou S (1994). Testing for sufficient followup and outliers in survival data. Journal of the American Statistical Association, 89, 1499–1506.
- Murray R, Anthonian NR, Connett JE, Wise RA, Lindgren PG, Green PG, & Nides MA (for the Lung Health Study Research Group) (1998). Effects of multiple attempts to quit smoking and relapses to smoking on pulmonary function. Journal of Clinical Epidemiology, 51, 1317–1326.
- Pan W (1999). Extending the iterative convex minorant algorithm to the Cox model for interval-censored data. Journal of Computational and Graphical Statistics, 8, 109–120.
- Peng Y, Dear KBG, & Denham JW (1998). A generalized F mixture model for cure rate estimation. Statistics in Medicine, 17, 813–830.
- Peng Y, & Dear KBG (2000). A nonparametric mixture model for cure rate estimation. Biometrics, 56, 237–243.
- Peng Y, & Taylor JMG (2017). Residual-based model diagnosis methods for mixture cure models. Biometrics, 73, 495–505.
- Royston P (2011). Estimating a smooth baseline hazard function for the Cox model. Technical Report, University College London.
- Royston P, & Parmar MKB (2002). Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statistics in Medicine, 21, 2175–2197.
- Royston P, & Parmar MKB (2011). The use of restricted mean survival time to estimate the treatment effect in randomized clinical trials when the proportional hazards assumption is in doubt. Statistics in Medicine, 30, 2409–2421.
- Sauerbrei W, & Royston P (1999). Building multivariable prognostic and diagnostic models: transformation of the predictors using fractional polynomials. Journal of the Royal Statistical Society, Series A, 162, 71–94.
- Scolas S, El Ghouch A, Legrand C, & Oulhaj A (2016). Variable selection in a flexible parametric mixture cure model with interval-censored data. Statistics in Medicine, 35, 1210–1225.
- Scolas S, Legrand C, Oulhaj A, & El Ghouch A (2018). Diagnostic checks in mixture cure models with interval-censoring. Statistical Methods in Medical Research, 27, 2114–2131.
- Sparling YH, Younes N, Lachin JM, & Bautista OM (2006). Parametric survival models for interval-censored data with time-dependent covariates. Biostatistics, 7, 599–614.
- Sy JP, & Taylor JMG (2000). Estimation in a Cox proportional hazards cure model. Biometrics, 56, 227–236.
- Taylor JMG (1995). Semi-parametric estimation in failure time mixture models. Biometrics, 51, 899–907.
- Tong X, Zhu L, Leng C, Leisenring W, & Robison LL (2013). A general semiparametric hazards regression model: efficient estimation and structure selection. Statistics in Medicine, 32, 4980–4994.
- Tsodikov A, Ibrahim JG, & Yakovlev AY (2003). Estimating cure rates from survival data: an alternative to two-component mixture models. Journal of the American Statistical Association, 98, 1063–1078.
- Turnbull BW (1976). The empirical distribution function with arbitrarily grouped, censored and truncated data. Journal of the Royal Statistical Society, Series B, 38, 290–295.
- Wang L, Du P, & Liang H (2012). Two-component mixture cure rate model with spline estimated nonparametric components. Biometrics, 68, 726–735.
- Wang W, & Albert JM (2017). Causal mediation analysis for the Cox proportional hazards model with a smooth baseline hazard estimator. Journal of the Royal Statistical Society, Series C, 66, 741–757.
- Wang W, & Griswold ME (2017). Natural interpretations in Tobit regression models using marginal estimation methods. Statistical Methods in Medical Research, 26, 2622–2632.
- Weakliem DL (1999). A critique of the Bayesian information criterion for model selection. Sociological Methods & Research, 27, 359–397.
- Wileyto EP, Li Y, Chen J, & Heitjan DF (2013). Assessing the fit of parametric cure models. Biostatistics, 14, 340–350.
- Withers HR, Peters LJ, Taylor JM, … Hanks GE (1995). Local control of carcinoma of the tonsil by radiation therapy: an analysis of patterns of fractionation in nine institutions. International Journal of Radiation Oncology, Biology, Physics, 33, 549–562.
- Xiang L, Ma X, & Yau KKW (2011). Mixture cure model with random effects for clustered interval-censored survival data. Statistics in Medicine, 30, 995–1006.
- Yu B, & Peng Y (2008). Mixture cure models for multivariate survival data. Computational Statistics and Data Analysis, 52, 1524–1532.
- Zhang X, & Akcin H (2012). A SAS macro for direct adjusted survival curves based on Aalen's additive model. Computer Methods and Programs in Biomedicine, 108, 310–317.
- Zeng D, & Lin DY (2007). Efficient estimation for the accelerated failure time model. Journal of the American Statistical Association, 102, 1387–1396.