Author manuscript; available in PMC: 2015 Dec 4.
Published in final edited form as: Biom J. 2009 Dec;51(6):932–945. doi: 10.1002/bimj.200800244

An Accelerated Failure Time Mixture Cure Model with Masked Event

Jenny J Zhang 1,*, Molin Wang 1,2
PMCID: PMC4669581  NIHMSID: NIHMS509858  PMID: 20029894

Abstract

We extend the Dahlberg and Wang (Biometrics 2007, 63, 1237–1244) proportional hazards (PH) cure model for the analysis of time-to-event data that is subject to a cure rate with masked event to a setting where the PH assumption does not hold. Assuming an accelerated failure time (AFT) model with unspecified error distribution for the time to the event of interest, we propose rank-based estimating equations for the model parameters and use a generalization of the EM algorithm for parameter estimation. Applying our proposed AFT model to the same motivating breast cancer dataset as Dahlberg and Wang (Biometrics 2007, 63, 1237–1244), our results are more intuitive for the treatment arm in which the PH assumption may be violated. We also conduct a simulation study to evaluate the performance of the proposed method.

Keywords: Accelerated failure time model, Cure rate, EM algorithm, Masked event, Rank-based estimating equations

1 Introduction

In certain clinical studies, any of a number of different potential events may lead to an observed failure, of which only one is the event of interest, and the exact event may not always be identifiable. We refer to the unidentifiable failures as masked. In addition, a portion of the patient population is cured, i.e. they do not experience the event of interest. In this paper, we propose a semiparametric method to model the cure rate (incidence), the failure time distribution of the event of interest (latency), and the covariate effects on both when the proportional hazards (PH) assumption does not hold.

Cure rate models were first presented by Berkson and Gage (1952) as an appropriate method to model data where a portion of the subjects may not experience the event of interest. Since then, many parametric and semiparametric mixture cure models have been proposed (e.g. Peng, Dear, and Denham, 1998; Sy and Taylor, 2000; Peng and Dear, 2000; Li and Taylor, 2002; Zhang and Peng, 2007). Estimation methods for survival data that account for masked event have also been studied in a number of situations (e.g. Goetghebeur and Ryan, 1995; Flehinger, Reiser, and Yashchin, 1998; Craiu and Duchesne, 2004). The accelerated failure time (AFT) model was first advocated as a useful alternative to the PH model for censored time-to-event data by Wei (1992). Although the PH model specifies that the effects of the covariates act multiplicatively on the hazard function, the AFT model regresses the logarithm of the failure times over the covariates, postulating a direct relationship between failure time and the covariates. Despite the theoretical advances made in the last decade (e.g. Tsiatis, 1990; Ritov, 1990; Fygenson and Ritov, 1994; Jin et al., 2003), semiparametric methods for the AFT model have rarely been used in applications due to the scarcity of efficient and reliable computational methods.

Very few estimation methods are available for time-to-event data subject to a cure rate with masked event. Dahlberg and Wang (2007) proposed a semiparametric PH mixture cure model for such data (for ease of reference, Dahlberg and Wang, 2007, will be referred to as DW hereafter). Assuming a PH model for the latency and a logistic model for the incidence, DW used the EM algorithm to conduct likelihood maximization and parameter estimation. Their motivating example came from International Breast Cancer Study Group (IBCSG) Trial VIII, where premenopausal, node-negative breast cancer patients were randomized to four treatment arms, and stratified according to estrogen-receptor (ER) status, whether radiotherapy was planned after surgery, and institution. We use the same example here to motivate our work; please refer to DW and Castiglione-Gertsch et al. (2003) for details on IBCSG Trial VIII.

It has been shown that some breast cancer adjuvant therapies for premenopausal women may interrupt menses, or induce amenorrhea (i.e. treatment-induced amenorrhea or TIA). As the number of young women diagnosed with breast cancer continues to increase, so does the demand for information about the impact of adjuvant therapies on menses and fertility. Such information strongly influences the treatment decisions of these patients (Partridge et al., 2004). The underlying process of TIA, however, is not well understood and the observed data is complicated by the fact that menopause may also occur. Since both TIA (event I) and menopause (event II) result in an observed cessation of menses (the failure), the event leading to the observed failure is masked unless the patient recovers her menses (event III) after treatment. Moreover, a cured proportion is assumed to exist in the population since not all patients on treatment may experience TIA.

When DW applied their PH model to the goserelin arm of the IBCSG data, they found that older patients take a significantly longer time to experience TIA than younger patients, which is counter-intuitive since older patients are expected to have higher degrees of ovarian function suppression. Figure 1 plots the estimated log-integrated hazard versus log(time) for categorical age for the goserelin arm; a possible violation of the PH assumption is suggested by the crossing of the curves. Thus, DW’s PH model may be inappropriate for this treatment arm. As an alternative, the AFT model does not require the PH assumption and has parameter interpretations that bring different insights into the problem.

Figure 1.

Proportional hazards assumption check for goserelin arm of IBCSG Trial VIII.

In this paper, we extend DW’s PH mixture cure model for time-to-event data with masked event to the setting where the PH assumption does not hold through use of an AFT model. Sections 2 and 3 describe, respectively, the model and estimation method in detail. In Section 4, we discuss results from a simulation study performed to evaluate the proposed method, and in Section 5, we apply the method to the IBCSG Trial VIII data. We close with some discussion in Section 6.

2 AFT Mixture Cure Model

We use the same notation and definitions as DW, and refer the reader to Section 2.1 of their paper for details. Both Szwarc and Bonetti (2006) and DW assume that TIA may occur after menopause (and before treatment end) given the subject is uncured. We adopt the same assumption. We argue that such an assumption is valid since, although TIA would be an unobservable event in this case, the underlying, treatment-induced, biological changes would still occur. Moreover, menopause may also occur after TIA, with or without recovery first. Unlike chemotherapy, which may permanently damage ovarian function (deHaes et al., 2003), there is little evidence to suggest that hormonal therapy (e.g. goserelin) has any effect on the natural process of menopause.

For the goserelin arm of the IBCSG trial that motivated this work, Szwarc and Bonetti (2006) showed that the occurrence of TIA due to goserelin does not affect time to menopause. Thus, we assume that time to TIA (event I) and time to menopause (event II) are independent conditional on the covariates. Note that our interest lies only in TIA, and menopause is essentially regarded as a censoring event. The only difference between menopause in our setting and a censoring event in the standard time-to-event data setting is that, given a failure, we may not know whether it is TIA or menopause. We also assume that the other censoring events are independent conditional on the covariates. In a doctoral thesis (Zhang and Wang, 2008), we investigated the relationship between TIA and disease recurrence or death for the goserelin arm; no notable association was found, controlling for patient characteristics.

We propose the following mixture cure model for the time to the event of interest (event I), T1,

$$S_{T_1}(t_{1i}) = \alpha_i\, S_{T_1}(t_{1i} \mid \tau_i = 1) + (1 - \alpha_i),$$

where αi = P(τi = 1). The incidence, αi, is modeled using a logistic regression and the latency, ST1(t1i|τi = 1), is modeled using an AFT model with unspecified error distribution. Specifically, the model for the incidence is

$$\alpha_i(Z_i) = \frac{\exp(Z_i \bar{\alpha}^T)}{1 + \exp(Z_i \bar{\alpha}^T)}, \tag{1}$$

where ᾱ is a vector of regression parameters for the vector of covariates Zi and T denotes the transpose. The model for the latency is

$$\log(T_{1i}) = Z_i \beta^T + \varepsilon_i, \tag{2}$$

where T1i is the time to event I for the i-th subject, β is a vector of regression parameters for the covariates Zi, and the distribution of the independent error terms, εi, is unspecified. Let fε(․) and Fε(․) represent the probability density and cumulative distribution functions of ε, respectively. For simplicity of exposition, we assume that time to event II, T2, follows a parametric distribution with parameters Ω, where fT2(․) and FT2(․) are the corresponding probability density and cumulative distribution functions, respectively. The method can be easily extended such that T2 follows a semiparametric distribution.
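To make the two model components concrete, the following is a minimal sketch of the incidence model (1) and the resulting mixture survival function for a single covariate. It is illustrative only: the proposed method leaves the error distribution unspecified, whereas this sketch assumes normal errors so that the latency survival can be written in closed form. The function names, the intercept/slope parametrization (p0, p1), and the numerical values are assumptions chosen for illustration.

```python
import math

def incidence(z, p0, p1):
    """P(uncured | z) under the logistic model (1), assuming an intercept p0 and slope p1."""
    return 1.0 / (1.0 + math.exp(-(p0 + p1 * z)))

def latency_survival(t, z, beta, mu=0.2, sigma=0.1):
    """S_{T1}(t | tau = 1) under the AFT model (2).

    ASSUMPTION: errors are N(mu, sigma^2); the method in the text does not require this.
    """
    resid = math.log(t) - z * beta
    # Upper-tail probability of the normal error at the residual.
    return 0.5 * math.erfc((resid - mu) / (sigma * math.sqrt(2.0)))

def marginal_survival(t, z, p0, p1, beta):
    """Mixture cure survival: alpha * S(t | uncured) + (1 - alpha)."""
    a = incidence(z, p0, p1)
    return a * latency_survival(t, z, beta) + (1.0 - a)

# Illustrative values only.
alpha_35 = incidence(35.0, -6.0, 0.2)
plateau = marginal_survival(1e6, 35.0, -6.0, 0.2, -0.05)
early = marginal_survival(1e-6, 35.0, -6.0, 0.2, -0.05)
```

For large t the marginal survival levels off at 1 − αi, the cured fraction, which is the defining feature of a mixture cure model.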

3 Estimation

3.1 Complete-data likelihood

Since direct maximization of the observed likelihood is difficult, we use an iterative method analogous to the EM algorithm to estimate the set of parameters Θ = {ᾱ, β, Ω}. Let εi(β) = log(yi) − ziβT, where yi is the minimum of the failure time (xi) and the censoring time (ci). The complete-data likelihood, L(Θ|𝒫i), can be written and factored into three distinct components, L1(ᾱ|𝒫i), L2(β|𝒫i), and L3(Ω|𝒫i), as in Section 3.1 of DW with the following notational exceptions due to model differences: (i) FT1(․) in DW is Fε(․) in our setting and (ii) t(j) in DW is ε(j)(β), where ε(1)(β)<ε(2)(β)< ⋯ <ε(k)(β) are the k distinct, ordered, uncensored failure residuals. Formulations of L1(․), L2(․), and L3(․) are given in Appendix A for ease of reference.

3.2 Complete-data estimating functions

For L1(ᾱ|𝒫i) and L3(Ω|𝒫i), the corresponding complete-data estimating functions are simply the score functions, denoted by U1(ᾱ, 𝒫i) and U3(Ω, 𝒫i), respectively. Specifically,

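$$U_1(\bar{\alpha}, \mathcal{P}_i) = \sum_{i:\delta_i=6} \tau_i z_i - \sum_{i:\delta_i=1} \zeta_i (1-\tau_i) z_i + \sum_{i:\delta_i\in\{1,2,3\}} z_i - \sum_{i:\delta_i\in\{4,5,6\}} z_i \alpha_i \tag{3}$$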

and

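$$U_3(\Omega, \mathcal{P}_i) = \frac{\partial}{\partial\Omega}\Big[\sum_{i:\delta_i=1} \zeta_i \log\{f_{T_2}(x_i)\}\Big] - \frac{\partial}{\partial\Omega}\Big[\sum_{i:\delta_i=1} \zeta_i \log\{1-F_{T_2}(x_i)\}\Big] + \frac{\partial}{\partial\Omega}\Big[\sum_{i:\delta_i\in\{1,2,4,6\}} \log\{1-F_{T_2}(x_i)\}\Big] + \frac{\partial}{\partial\Omega}\Big[\sum_{i:\delta_i\in\{3,5\}} \log\{f_{T_2}(t_{2i})\}\Big]. \tag{4}$$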

A score function for β based on L2(β|𝒫i) requires the correct specification of a parametric distribution for the error, ε, in the AFT model. As an alternative, we propose a complete-data estimating function, U2(β, 𝒫i), which does not require knowledge of the distribution of ε.

Fygenson and Ritov (1994) proposed the following estimating function for censored time-to-event data in the standard framework (i.e. no cure rate or masked event), which is easily shown to be monotone in each component of β:

$$U(\beta) = n^{-1} \sum_{i=1}^{n} \sum_{h=1}^{n} \varphi_i (z_i - z_h)\, 1(\varepsilon_h(\beta) \ge \varepsilon_i(\beta)), \tag{5}$$

where φi is the censoring indicator (0 = censored, 1 = uncensored), and 1(․) denotes the indicator function. In our setting, we have two possible events leading to a failure (event I and event II) and we are only interested in failures due to event I. Moreover, some subjects are cured with respect to event I. These complications will lead to a modified form of estimating function (5).
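As an illustration of estimating function (5) in the standard setting (no cure rate, no masking), the sketch below evaluates it for a single covariate and locates its sign change by bisection, which is possible because the function is a monotone step function of β. The simulated data, the censoring mechanism, and the function names are assumptions made for this example and are not taken from the paper.

```python
import random

def estimating_function(beta, log_y, z, phi):
    """Estimating function (5) for a single covariate.

    log_y: log observed times; z: covariate values;
    phi: 1 if uncensored, 0 if censored.
    """
    n = len(z)
    eps = [log_y[i] - z[i] * beta for i in range(n)]  # residuals eps_i(beta)
    total = 0.0
    for i in range(n):
        if phi[i] == 0:
            continue  # censored subjects enter only through the risk-set indicator
        for h in range(n):
            if eps[h] >= eps[i]:
                total += z[i] - z[h]
    return total / n

def solve_beta(log_y, z, phi, lo=-1.0, hi=1.0, tol=1e-6):
    """Locate the sign change of the non-decreasing step function by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if estimating_function(mid, log_y, z, phi) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: true beta = -0.05, normal errors, independent censoring.
rng = random.Random(1)
n = 200
z = [rng.gauss(35.0, 3.0) for _ in range(n)]
log_t = [-0.05 * zi + rng.gauss(0.2, 0.1) for zi in z]
log_c = [rng.gauss(-1.3, 0.3) for _ in range(n)]
log_y = [min(t, c) for t, c in zip(log_t, log_c)]
phi = [1 if t <= c else 0 for t, c in zip(log_t, log_c)]

beta_hat = solve_beta(log_y, z, phi)
```

Because the function is discontinuous, an exact root need not exist; the bisection returns the point at which the function changes sign, which is the usual convention for rank-based estimators.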

Let Di denote the event that the i-th subject is observed to fail and the underlying event leading to failure (whether masked or unmasked) is event I, and Ei,h denote the event that the h-th subject is uncured and in the risk set for event I at εi(β). Also, let Δ be the censoring indicator of whether (1) or not (0) a failure is observed before treatment end (U), the upper bound of T1. An estimating function analogous to (5) is then $n^{-1}\sum_{i=1}^{n}\sum_{h=1}^{n} 1(D_i \cap E_{i,h})(z_i - z_h)$. It is straightforward to show that $1(D_i \cap E_{i,h})$ can be written as viwh1(εh(β) ≥ εi(β)), where vi = Δi{(1 − γi)(1 − ζi)+γi} and wh = Δh[(1 − γh){(τhζh)+(1 − ζh)}+γh]+(1 − Δh)(τh).

Thus, we propose the following estimating function for β:

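$$U_2(\beta, \mathcal{P}_i) = n^{-1} \sum_{i=1}^{n} \sum_{h=1}^{n} v_i w_h (z_i - z_h)\, 1(\varepsilon_h(\beta) \ge \varepsilon_i(\beta)). \tag{6}$$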

The unbiasedness of U2(β, 𝒫i) follows directly from the fact that it can be rewritten as a U-statistic with a symmetric kernel:

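$$U_2(\beta, \mathcal{P}_i) = n^{-1} \sum_{i=2}^{n} \sum_{h=1}^{i-1} (z_i - z_h)\{v_i w_h\, 1(\varepsilon_h(\beta) \ge \varepsilon_i(\beta)) - v_h w_i\, 1(\varepsilon_i(\beta) \ge \varepsilon_h(\beta))\}.$$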

It follows that the set of complete-data estimating equations for Θ is

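$$U_c(\mathcal{P}_i, \Theta) = \big(U_1(\bar{\alpha}, \mathcal{P}_i),\, U_2(\beta, \mathcal{P}_i),\, U_3(\Omega, \mathcal{P}_i)\big) = 0.$$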

3.3 The ES algorithm

For parameter estimation, we apply the ES algorithm proposed by Elashoff and Ryan (2004), an iterative estimation method analogous to the EM algorithm (Dempster, Laird, and Rubin, 1977). The ES algorithm accommodates missing data in cases where a set of estimating equations can be found for the complete-data setting. In essence, it can be seen as a generalization of the EM algorithm for estimating equations; if the estimating equations arise as score functions from a standard likelihood, then the ES algorithm reduces to the EM algorithm.

To employ the ES algorithm, we need to rewrite Uq(𝒫i, Θ), q = 1, 2, 3, as the sum of a function involving the complete data, Sq(𝒫i), and a function involving only the observed data, bq(Oi). That is, Uq(𝒫i, Θ) = Sq(𝒫i)+bq(Oi). For U1(ᾱ, 𝒫i) and U3(Ω, 𝒫i), S1(𝒫i) and S3(𝒫i) are the first two terms of (3) and (4), respectively, while b1(Oi) and b3(Oi) are the last two terms of (3) and (4), respectively. For U2(β, 𝒫i),

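$$S_2(\mathcal{P}_i) = n^{-1} \sum_{i=1}^{n} \sum_{h=1}^{n} \vartheta_{ih} (z_i - z_h)\, 1(\varepsilon_h(\beta) \ge \varepsilon_i(\beta))$$

and

$$b_2(O_i) = n^{-1} \sum_{i=1}^{n} \sum_{h=1}^{n} \varpi_{ih} (z_i - z_h)\, 1(\varepsilon_h(\beta) \ge \varepsilon_i(\beta)),$$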

which are obtained by noting that viwh in estimating function (6) can be re-expressed as ϑih + ϖih, where ϑih = Δi[(1 − γi){Δh(1 − γh)[ζh(1 − τh)+ζi(1 − ζh+ζhτh)]}+γi{(1 − Δh)τh − Δh(1 − τh)(1 − γh)ζh}] involves the complete data and ϖih = ΔiΔh{1 − γi(1 − γh)} involves only the observed data.

We can then replace S(𝒫i) = (S1(𝒫i), S2(𝒫i), S3(𝒫i)) in Uc(𝒫i, Θ) with E[S(𝒫i)|Oi, Θ], the expectation of S(𝒫i) conditional on the observed data, Oi, and the unknown parameters, Θ, resulting in new estimating equations that involve only the observed data:

$$U_{obs}(O_i, \Theta) = E[S(\mathcal{P}_i) \mid O_i, \Theta] + b(O_i) = 0,$$

where b(Oi) = (b1(Oi), b2(Oi), b3(Oi)). Following the arguments in Elashoff and Ryan (2004), Uobs(․) = 0 is an unbiased estimating equation for Θ. The estimate of Θ can be obtained by solving Uobs(․) = 0 using the ES algorithm, which iterates between an E-step and an S-step. The E-step computes Ŝ = E[S(𝒫i)|Oi, Θ(m)], the conditional expectation of S(𝒫i) with respect to the latent variables τi and ζi given the observed data and the latest updated parameter estimates, Θ(m). The S-step substitutes the conditional expectations calculated in the E-step into the complete-data estimating equations and solves Uobs(․) = Ŝ+b(Oi) = 0.
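The E-step/S-step iteration can be illustrated on a deliberately simple problem that is not the model of this paper: estimating the mean μ of exponential failure times under right censoring, with complete-data estimating equation Σ(Ti − μ) = 0. The sketch below is a toy example under that assumption; the function name and data are hypothetical. Because this estimating equation is a likelihood score, the iteration coincides with EM, as noted above.

```python
def es_exponential_mean(y, event, mu0=1.0, tol=1e-10, max_iter=1000):
    """Toy ES iteration for the mean of censored exponential times.

    y: observed times; event: 1 if the failure was observed, 0 if censored.
    """
    mu = mu0
    for _ in range(max_iter):
        # E-step: replace each unobserved failure time by its conditional
        # expectation; by memorylessness E[T | T > c] = c + mu.
        t_hat = [yi if d == 1 else yi + mu for yi, d in zip(y, event)]
        # S-step: solve sum(t_hat_i - mu) = 0 for mu.
        mu_new = sum(t_hat) / len(t_hat)
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu

y = [0.5, 1.2, 2.0, 0.3, 1.5]
event = [1, 0, 1, 1, 0]
mu_hat = es_exponential_mean(y, event)
```

The fixed point of this iteration is the total observed time divided by the number of observed failures, which is the usual maximum likelihood estimate for this toy problem.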

To solve the component of Uobs(․) that involves the AFT model parameter β, we note that U2(β, 𝒫i) can be taken as the gradient of

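$$\mathcal{H}_2(\beta, \mathcal{P}_i) = n^{-1} \sum_{i=1}^{n} \sum_{h=1}^{n} \vartheta_{ih}\, |\varepsilon_i(\beta) - \varepsilon_h(\beta)|\, 1(\varepsilon_i(\beta) < \varepsilon_h(\beta)) + n^{-1} \sum_{i=1}^{n} \sum_{h=1}^{n} \varpi_{ih}\, |\varepsilon_i(\beta) - \varepsilon_h(\beta)|\, 1(\varepsilon_i(\beta) < \varepsilon_h(\beta)),$$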

which is a convex function, and finding the root of U2(β, 𝒫i) = 0 is equivalent to minimizing ℋ2(β, 𝒫i). Thus, the BFGS algorithm (Press et al., 1992), a quasi-Newton method where the Hessian is updated by analyzing successive gradient vectors, is used to solve the estimating equations in the S-step. We implement this algorithm with the optim() function in the statistical software package R, supplying the gradient. Since the rank-based estimating function for β given in (6) is not continuous, a sandwich variance estimator is difficult to calculate. Therefore, we consider the case resampling bootstrap method (Efron and Tibshirani, 1993) for standard error estimation.
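A case resampling bootstrap can be sketched generically as follows. This is a minimal illustration, not the authors' implementation: whole subject records are resampled with replacement, the estimator is re-run on each resample, and the standard deviation of the resampled estimates is reported. The function name, the toy records, and the use of a sample mean in place of the full ES fit are assumptions for the example.

```python
import random
import statistics

def case_bootstrap_se(records, estimator, n_boot=100, seed=0):
    """Bootstrap standard error by resampling whole cases with replacement.

    records: list of per-subject records (kept intact when resampled);
    estimator: function mapping a list of records to a scalar estimate.
    """
    rng = random.Random(seed)
    n = len(records)
    estimates = []
    for _ in range(n_boot):
        resample = [records[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)

# Hypothetical records (covariate, outcome); the estimator here is a stand-in.
data_rng = random.Random(42)
records = [(data_rng.gauss(35.0, 3.0), data_rng.gauss(0.0, 1.0)) for _ in range(100)]
mean_outcome = lambda rows: sum(r[1] for r in rows) / len(rows)

se_boot = case_bootstrap_se(records, mean_outcome, n_boot=200)
```

In the setting of this section, the estimator argument would be the full ES fit applied to each resampled data set, which makes the bootstrap computationally heavier than this toy example suggests.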

3.4 Expectations

For the E-step of the ES algorithm, it is necessary to calculate and update conditional expectations similar to those in Section 3.1 of DW with respect to the latent variables τi and ζi given the observed data, Oi, and the latest updated parameter estimates, Θ(m). For ease of reference, the formulations of these conditional expectations are given in Appendix B; please refer to DW for derivation details, keeping in mind the notational differences outlined previously in Section 3.1.

Updating the conditional expectations involves the conditional survival function Sε(εi(β)|τi = 1, Zi). We assume a nonparametric form for the conditional survival function,

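$$S_\varepsilon(\varepsilon_i(\beta) \mid \tau_i = 1, Z_i) = \prod_{j:\,\varepsilon_{(j)}(\beta) \le \varepsilon_i(\beta)} \nu_j, \tag{7}$$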

where νj ≥ 0, ν0 = 1, $S_\varepsilon(\varepsilon_{(j)}(\beta) \mid \tau_i = 1, Z_i) = \nu_j\, S_\varepsilon(\varepsilon_{(j-1)}(\beta) \mid \tau_i = 1, Z_i)$, λε(ε(j)(β)|τi = 1, Zi) = 1 − νj (Kalbfleisch and Prentice, 2002, Section 4.3), and λε(․) = fε(․)/Sε(․) is the hazard function for ε. Let ν = (ν1, …, νk).

Let Rj denote the risk set at ε(j)(β) and $\kappa_l = \{(\zeta_l \tau_l) - \zeta_l\}\,1(\delta_l = 1) + \tau_l\, 1(\delta_l = 6) + 1(\delta_l \ne 6)$. Following similar derivations as Kalbfleisch and Prentice (2002) and plugging in the nonparametric form of the conditional survival function (7), L2(β|𝒫i) can be rewritten as

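$$\prod_{j=1}^{k} \Big[ \prod_{l \in A_j} S_\varepsilon(\varepsilon_{(j)}(\beta) \mid \tau_l = 1, Z_l)^{\tau_l \zeta_l} \{\lambda_\varepsilon(\varepsilon_{(j)}(\beta) \mid \tau_l = 1, Z_l)\, S_\varepsilon(\varepsilon_{(j)}(\beta) \mid \tau_l = 1, Z_l)\}^{1 - \zeta_l} \times \prod_{l \in B_j} \lambda_\varepsilon(\varepsilon_{(j)}(\beta) \mid \tau_l = 1, Z_l)\, S_\varepsilon(\varepsilon_{(j)}(\beta) \mid \tau_l = 1, Z_l) \prod_{l \in C_j} S_\varepsilon(\varepsilon_{(j)}(\beta) \mid \tau_l = 1, Z_l)^{\tau_l} \Big] = \prod_{j=1}^{k} \Big[ \prod_{l \in A_j} \nu_j^{\tau_l \zeta_l} (1 - \nu_j)^{1 - \zeta_l} \prod_{l \in B_j} (1 - \nu_j) \prod_{l \in R_j \setminus (A_j \cup B_j)} \nu_j^{\kappa_l} \Big], \tag{8}$$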

where Aj, Bj, and Cj are as defined in Appendix A.

The closed-form maximum likelihood estimate (MLE) of ν, based on (8), leads to the following formula for updating ν in the conditional expectations.

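$$\hat{\nu}^{(m+1)} = \left( 1 - \frac{\sum_{l \in A_j} (1 - \zeta_l^{\ddagger}) + d_{B_j}}{\sum_{l \in A_j} \{1 - \zeta_l^{\ddagger} + (\tau_l \zeta_l)^{\ddagger}\} + d_{B_j} + \sum_{l \in R_j \setminus (A_j \cup B_j)} \kappa_l^{\ddagger}},\ j = 1, \ldots, k \right),$$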

where dBj is the number of unmasked failures at ε(j)(β) and the superscript ‡ denotes the conditional expectation of those indicators given Oi, Θ(m), and ν(m). Note that if there is no cure rate or masked event, ν̂(m+1) would simply be equal to (1 − dj/Rj, j = 1, …, k), where dj and Rj are the number of failures and the number of subjects at risk at ε(j)(β), respectively.
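The update for a single νj can be sketched as below, taking as inputs the conditional expectations for the subjects with a masked event at ε(j)(β), the count of unmasked failures, and the expected κ values for the remaining subjects at risk. The function and argument names are hypothetical; the check at the end reproduces the reduction to 1 − dj/Rj stated above when there is no masking and no cure rate.

```python
def update_nu_j(exp_zeta_masked, exp_tauzeta_masked, d_unmasked, exp_kappa_rest):
    """One-step update of nu_j from expected latent indicators.

    exp_zeta_masked: E[zeta_l | ...] for each subject with a masked event at eps_(j)
    exp_tauzeta_masked: E[tau_l * zeta_l | ...] for the same subjects
    d_unmasked: number of unmasked failures at eps_(j)
    exp_kappa_rest: E[kappa_l | ...] for the other subjects in the risk set
    """
    numerator = sum(1.0 - ez for ez in exp_zeta_masked) + d_unmasked
    denominator = (
        sum(1.0 - ez + etz for ez, etz in zip(exp_zeta_masked, exp_tauzeta_masked))
        + d_unmasked
        + sum(exp_kappa_rest)
    )
    return 1.0 - numerator / denominator

# No masking, no cure: 3 failures among 10 at risk gives 1 - 3/10.
nu_plain = update_nu_j([], [], 3, [1.0] * 7)

# Hypothetical masked case with fractional expectations.
nu_masked = update_nu_j([0.4, 0.7], [0.3, 0.6], 1, [1.0, 0.9, 0.8])
```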

4 Simulation Study

To evaluate the performance of the proposed method, a simulation study is conducted where sample generation is based on the incidence and latency models specified in (1) and (2) with β = −0.05 or 0, and ᾱ = (−6, 0.2) or (−10, 0.3) corresponding to respective average cure rates of around 27 and 39%. We generate the residuals, ε, in (2) from N(0.2, 0.01) and the covariate Z from N(35, 9). The simulated samples are modeled after the IBCSG Trial VIII data, where Z is a random variable for age at entry, T1 is the time from study entry to TIA, and M is age at menopause, which is assumed to follow N(51, 71). Since the patient population is premenopausal, for patients who enter the study, the conditional CDF for M is

$$F_M(m \mid Z = z, Z < M) = \frac{F_M(m) - F_M(z)}{1 - F_M(z)}, \quad z \le m.$$
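One way to draw menopause ages from this conditional distribution is inverse-CDF sampling, sketched below with the Python standard library. The sketch assumes the second argument of N(51, 71) is a variance, so the standard deviation is taken as the square root of 71; that reading, and the function names, are assumptions rather than statements from the paper.

```python
import math
import random
from statistics import NormalDist

# ASSUMPTION: N(51, 71) is read as mean 51 and variance 71.
MENOPAUSE = NormalDist(mu=51.0, sigma=math.sqrt(71.0))

def conditional_cdf(m, z, dist=MENOPAUSE):
    """F_M(m | Z = z, Z < M) for m >= z."""
    f_z = dist.cdf(z)
    return (dist.cdf(m) - f_z) / (1.0 - f_z)

def sample_menopause_age(z, rng, dist=MENOPAUSE):
    """Draw M conditional on M > z by inverting the conditional CDF."""
    f_z = dist.cdf(z)
    u = rng.random()
    # Map u in [0, 1) to a probability in [F(z), 1) and invert.
    return dist.inv_cdf(f_z + u * (1.0 - f_z))

rng = random.Random(7)
entry_age = 35.0
draws = [sample_menopause_age(entry_age, rng) for _ in range(1000)]
```

Every draw is at least the entry age, which reflects the premenopausal entry condition.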

The Kaplan–Meier survival estimate, assuming that all masked events are TIA, is used as the initial estimate for the conditional survival function. For each simulated sample, the logistic regression parameter estimates using only the unmasked data (i.e. the naive estimates) are used as starting values for ᾱ = (p0, p1). Four different simulations are conducted, denoted A, B, C, and D in Table 1. Simulations A and B contain both cure rate and masked event while simulations C and D contain only masked event (i.e. no cure rate). The number of subjects in each sample (n) is 100, and all simulation results are for 100 replicates. To mimic the IBCSG data, there is a large proportion of masked event in all samples with and without a cure rate. We see that the results converged reliably to the true parameter values.

Table 1.

Simulation results for 100 replications with sample size n = 100, where SE is the empirical standard error and the bootstrap SE is based on 100 replicates.

| Simulation | Parameter | True value | Mean estimate (SE) | Bootstrap SE |
|---|---|---|---|---|
| A | p0 | −6 | −7.954 (5.770) | 6.057 |
| A | p1 | 0.2 | 0.224 (0.166) | 0.174 |
| A | β | −0.05 | −0.051 (0.017) | 0.018 |
| B | p0 | −10 | −11.984 (5.879) | 6.142 |
| B | p1 | 0.3 | 0.326 (0.168) | 0.176 |
| B | β | 0 | −0.0004 (0.019) | 0.020 |
| C | β | −0.05 | −0.050 (0.017) | 0.018 |
| D | β | 0 | 0.0010 (0.018) | 0.019 |

Simulations were also conducted to investigate the accuracy of the bootstrap standard errors mentioned at the end of Section 3.3; 100 bootstrap samples of size n = 100 were used for each simulation replicate. As seen in Table 1, the relative difference (i.e. ratio of [bootstrap SE − empirical SE]/empirical SE) between the bootstrap and empirical SEs is around 5%.
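The relative differences quoted above can be recomputed directly from the empirical and bootstrap standard errors reported in Table 1. The short sketch below does this; the variable names are illustrative.

```python
# (simulation, parameter, empirical SE, bootstrap SE) as reported in Table 1.
table1 = [
    ("A", "p0", 5.770, 6.057),
    ("A", "p1", 0.166, 0.174),
    ("A", "beta", 0.017, 0.018),
    ("B", "p0", 5.879, 6.142),
    ("B", "p1", 0.168, 0.176),
    ("B", "beta", 0.019, 0.020),
    ("C", "beta", 0.017, 0.018),
    ("D", "beta", 0.018, 0.019),
]

# Relative difference = (bootstrap SE - empirical SE) / empirical SE.
relative_difference = {
    (sim, par): (boot - emp) / emp for sim, par, emp, boot in table1
}

for key, value in relative_difference.items():
    print(key, f"{100 * value:.1f}%")
```

Since the β standard errors are reported to three decimals only, their relative differences are the least precisely determined.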

5 Breast Cancer Example

We now fit our proposed AFT model and apply our estimation method to the IBCSG Trial VIII data, with particular focus on the goserelin arm for which, as was shown in Fig. 1, the PH assumption may be violated. There are 304 (47%), 316 (80%), and 302 (83%) patients eligible for analysis (with corresponding percentages of masked events) in the goserelin, CMF (chemotherapy), and CMF+goserelin arms, respectively. The objective of this analysis is to estimate the time to TIA, where we are interested in the effect of the single continuous covariate, age at entry (Z). As in DW, we assume that age at menopause follows N(51.02, 71.23) in this study population, and eliminate the piece of the likelihood from the model for T2 (time to menopause) to reduce the number of parameters in the estimation procedure and thus gain some efficiency. Given the pharmacodynamics of goserelin, it is assumed that the treatment induces amenorrhea in all patients eventually (Castiglione-Gertsch et al., 2003); thus, αi(zi) is fixed to equal 1 for all patients on this treatment arm and the corresponding logistic regression parameters are not estimated.

Using our proposed estimation method, the parameter estimates and corresponding bootstrap standard errors (SE) by treatment arm are shown in Table 2. The bootstrap SEs are based on 100 bootstrap replicates for each arm. As in the simulations, we use the naive estimates of ᾱ as starting values. The estimates of p1 for the CMF and combination therapy arms are both positive with corresponding p-values of <0.001 and 0.056, respectively. Thus, both the CMF and combination therapy arms show that older patients have a higher probability of experiencing TIA than younger patients (Fig. 2). However, this age effect is only significant for the CMF arm. This relationship between age at entry and probability of TIA was also shown in DW, and is consistent with that reported in the IBCSG Trial VIII clinical paper (Castiglione-Gertsch et al., 2003).

Table 2.

Parameter estimates by treatment arm for IBCSG Trial VIII, where values in parentheses are bootstrap standard errors.

| Parameter | Goserelin (SE) | CMF (SE) | CMF + goserelin (SE) |
|---|---|---|---|
| p0 | | −7.0910 (2.153) | −2.3722 (2.626) |
| p1 | | 0.1863 (0.052) | 0.1416 (0.074) |
| β | 0.0000* | −0.0314 (0.006) | −0.0609 (0.005) |

* SE <0.00001.

Figure 2.

Probability of being uncured with respect to TIA for CMF and combination arms.

The estimates of β for the latency are statistically significant for the CMF and combination therapy arms, where the corresponding p-values are both <0.0001. In contrast to the PH model, the β estimates from the AFT model have interpretations as acceleration factors. Specifically, if we let T1(z) denote the time from study entry to TIA for subjects age z, then the acceleration factor for age z1 relative to age z2 can be calculated as AF(z1, z2) = T1(z2)/T1(z1) = exp{−β(z1 − z2)}, where values greater than 1 denote that subjects age z1 have a more accelerated (i.e. shorter) time to TIA than subjects age z2 and vice-versa for values less than 1.

For example, in the CMF arm, AF(35,44) = 0.75 (0.041) and AF(55,44) = 1.41 (0.093), where values in parentheses are standard errors and 44 is the mean age. In other words, on the CMF arm, subjects age 35 progress to TIA 0.75 times as fast as subjects age 44 (i.e. about 25% slower), whereas subjects age 55 progress to TIA about 1.4 times as fast as subjects age 44. Thus, older patients on the CMF arm have a more accelerated time to TIA than younger patients. Similar conclusions can be drawn for the combination therapy arm, where AF(35,44) = 0.58 (0.026) and AF(55,44) = 1.95 (0.107), and the mean age is also 44. These conclusions are reflected in the corresponding estimated survival curves with respect to TIA for ages 35, 45, and 55 in Fig. 3, and are similar to those obtained in DW.
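The acceleration factors quoted above can be reproduced from the β estimates in Table 2. Under the latency model log(T1) = zβ + ε, the ratio T1(z2)/T1(z1) equals exp{−β(z1 − z2)}; the sketch below evaluates this expression. The function name is illustrative, and the standard errors are not recomputed here.

```python
import math

def acceleration_factor(beta, z1, z2):
    """AF(z1, z2) = T1(z2) / T1(z1) = exp{-beta * (z1 - z2)} under log(T1) = z * beta + eps."""
    return math.exp(-beta * (z1 - z2))

beta_cmf = -0.0314        # CMF arm, Table 2
beta_combo = -0.0609      # CMF + goserelin arm, Table 2

af_cmf_35 = acceleration_factor(beta_cmf, 35, 44)
af_cmf_55 = acceleration_factor(beta_cmf, 55, 44)
af_combo_35 = acceleration_factor(beta_combo, 35, 44)
af_combo_55 = acceleration_factor(beta_combo, 55, 44)
```

With β negative, the factor exceeds 1 whenever z1 is greater than z2, which is the sense in which older patients reach TIA sooner on these arms.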

Figure 3.

Estimated survival curves with respect to TIA by treatment arm for ages 35, 45, and 55, where the conditional survival probability of TIA is ST1(t1i|τi = 1) and the marginal survival probability of TIA is ST1(t1i) = αiST1(t1i|τi = 1)+(1 − αi).

The age effect on the cure rate for the CMF and combination therapy arms can also be seen in the leveling off of the estimated marginal survival curves at the bottom of Fig. 3. The estimated marginal survival curves for the younger patients level off sooner than those for older patients in both arms, implying that younger patients have a lower probability of experiencing TIA. For patients on the goserelin arm, the risk of TIA does not depend on age at entry, which “corrects” the counter-intuitive effect found by DW and agrees with what is clinically expected. Given that goserelin injections were administered for 24 months in IBCSG Trial VIII, we see from Fig. 3 that goserelin induces amenorrhea very quickly (median time of 3 months), which is indicative of the pharmacodynamics of the treatment. As discussed in Section 2, we believe that the assumption of independent events (TIA and menopause) is justified for our treatment arm of interest (goserelin), in which the PH assumption may be violated. We acknowledge, however, that this assumption may not hold for the chemotherapy-containing arms of IBCSG Trial VIII.

Because of the discrepancy between some aspects of our proposed AFT model and DW’s PH model when applied to the goserelin arm, it is useful to investigate the goodness-of-fit. We thank a referee for suggesting the method used in Li and Taylor (2002), where a quantity with an approximate uniform (0,1) distribution if the model fits is estimated from each observation. We then graphically compare the empirical distribution of this quantity across the observations to a uniform distribution. More specifically, when we fit a time-to-event model (AFT or PH), we obtain the estimated survival probabilities, denoted by {Ĝi(Ti), i = 1, …, n}, based on our parameter estimates. This quantity should have an approximate uniform (0,1) distribution. In Fig. 4, we plot the Kaplan–Meier curve where {1 − Ĝi(Ti), i = 1, …, n} are regarded as the survival times with corresponding indicators of whether or not Ti is censored for the AFT and PH models for the goserelin arm, and compare it with the diagonal line representing the uniform distribution. We see a much better approximation to a uniform for the proposed AFT model than DW’s PH model. The gaps in the PH model plot are due to the large drops in estimated survival probabilities over time for the goserelin arm in DW (e.g. from 0.688 at t1 = 1 to 0.219 at t1 = 2).

Figure 4.

Kaplan–Meier estimates of the estimated distribution evaluated at the observed data for the goserelin arm of IBCSG Trial VIII.

6 Discussion

The proposed method for the analysis of time-to-event data subject to a cure rate with masked event is an extension of the work by DW to the setting where the PH assumption does not hold. Our estimation method is based on the AFT model with unspecified error distribution and does not require any distributional assumptions on T1 or T2. In contrast to the results from DW’s PH model, our results for the goserelin arm in the IBCSG data example follow clinical intuition.

A well-known issue of the mixture cure model is that correct estimation of the cure rate (αi) is contingent on the identifiability of the model, i.e. the ability to distinguish between cured patients and long-term uncured survivors. Yu et al. (2004) investigated the identifiability of mixture cure models in the setting of parametric models and no masked event through extensive simulations. They showed that cure rate estimates could differ greatly if the latency distribution is incorrectly specified and/or if the length of follow-up is less than the median survival time of uncured patients. The semiparametric AFT mixture cure models proposed in this paper and Zhang and Peng’s (2007) do not require specification of the latency distribution. Zhang and Peng (2007) focused on a setting without masked event and found, through a simulation study, that their semiparametric AFT mixture cure model performed favorably compared to the parametric AFT mixture cure model with correct latency distribution specification, implying that their model has good identifiability. Considering event II (menopause) as a form of censoring for our event of interest (event I: TIA), our proposed estimating function (6) reduces to Zhang and Peng’s (2007).

As pointed out by a referee, since TIA can only occur during treatment in our breast cancer example, if ST1(t1 = U|τi = 1, Zi)>0, where U denotes treatment end, there may be identifiability problems. In the case where there is no finite upper limit for T1, if the survival probability as time goes to infinity is non-zero, Taylor (1995) and Sy and Taylor (2000) suggested imposing a “zero-tail constraint,” where the survival probabilities for all times past the last failure time are forced to equal 0. We employ a similar constraint in our case; that is, let Ŝε(ε(j)(β)|τi = 1, Zi) = 0 for all j such that ε(j)(β)>εU(β), where εU(β) = log(U) − (ZβT)* and (ZβT)* = max(ZiβT, i = 1, …, n). It is easy to show that, under the above constraint, the survival probability of T1 conditional on τi = 1 is zero at t1 = U. The use of such a constraint was avoided in our IBCSG data example.

It is worthwhile to note that the ES algorithm proposed by Elashoff and Ryan (2004) described in Section 3.3 does not involve an infinite dimensional parameter that needs to be estimated iteratively. Our application of the algorithm involves such a parameter in the estimation of the conditional survival function, Sε(εi(β)|τi = 1, Zi), discussed in Section 3.4. However, from our simulation study in Section 4, we see that the results converged reliably to the true parameter values.

In both our simulation study and breast cancer data application, we assumed that the incidence and latency are affected by the same covariate (i.e. age at entry); however, our methodology would also apply when the incidence and the latency are affected by different covariates; a covariate that is important for the incidence may not be important for the latency and vice versa. In addition, although the proposed method is dependent on the presence of some unmasked events in order to give accurate estimates of the cure rate and survival distribution, our simulations and IBCSG data analysis show that our method is able to accommodate fairly substantial proportions of masked event.

Acknowledgements

We thank the patients, physicians, nurses, and data managers who participated in the International Breast Cancer Study Group (IBCSG) Trial VIII. We further acknowledge support from the United States National Cancer Institute (CA-75362) and the United States National Institutes of Health Cancer Training Grant (Jing J. Zhang). We express our gratitude to Richard Gelber, Robert Gray, Ann Partridge, and Zhi-Min Yuan for their assistance throughout. We also thank the editor, associate editor, and referee for their insightful comments and suggestions.

Appendix

A Components of the complete-data likelihood referred to in Section 3.1

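$$L_1(\bar{\alpha} \mid \mathcal{P}_i) = \prod_{i:\delta_i=1} \pi_i^{\tau_i \zeta_i + (1 - \zeta_i)} (1 - \pi_i)^{(1 - \tau_i)\zeta_i} \prod_{i:\delta_i\in\{2,3\}} \pi_i \prod_{i:\delta_i\in\{4,5\}} (1 - \pi_i) \times \prod_{i:\delta_i=6} \pi_i^{\tau_i} (1 - \pi_i)^{(1 - \tau_i)}$$

$$L_2(\beta \mid \mathcal{P}_i) = \prod_{j=1}^{k} \Big[ \prod_{l \in A_j} \{1 - F_\varepsilon(\varepsilon_{(j)}(\beta))\}^{\tau_l \zeta_l} f_\varepsilon(\varepsilon_{(j)}(\beta))^{1 - \zeta_l} \prod_{l \in B_j} f_\varepsilon(\varepsilon_{(j)}(\beta)) \times \prod_{l \in C_j} \{1 - F_\varepsilon(\varepsilon_{(j)}(\beta))\}^{\tau_l} \Big]$$

$$L_3(\Omega \mid \mathcal{P}_i) = \prod_{i:\delta_i=1} f_{T_2}(x_i)^{\zeta_i} \{1 - F_{T_2}(x_i)\}^{1 - \zeta_i} \prod_{i:\delta_i\in\{3,5\}} f_{T_2}(t_{2i}) \prod_{i:\delta_i\in\{2,4,6\}} \{1 - F_{T_2}(c_i)\}$$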

Note that Aj denotes the set of all subjects who experienced a masked event at ε(j)(β), Bj denotes the set of all subjects who experienced an unmasked failure at ε(j)(β), and Cj denotes the set of subjects censored in the interval [ε(j)(β), ε(j+1)(β)), j = 1, …, k.

B Conditional expectations referred to in Section 3.4

For notational simplicity, we let β denote β(m).

E(ζI|Oi,Θ(m),τi=1)={1Fε(εi(β))}fT2(xi){1Fε(εi(β))}fT2(xi)+{1FT2(xi)}fε(εi(β))
E(τI|Oi,Θ(m))={1FT1(εi(β))}αifT2(xi)+{1FT2(xi)}αifε(εi(β))fT2(xi){1αiFε(εi(β))}+{1FT2(xi)}αifε(εi(β))
E(τi|Oi,Θ(m))=αi{1Fε(εi(β))}(1αi)+αi{1Fε(εi(β))}
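$$
E(\zeta_i\tau_i \mid O_i, \Theta^{(m)}) = E(\zeta_i \mid O_i, \Theta^{(m)}, \tau_i = 1)\, E(\tau_i \mid O_i, \Theta^{(m)})
$$

$$
E(\zeta_i \mid O_i, \Theta^{(m)}) = E(\zeta_i\tau_i \mid O_i, \Theta^{(m)}) + 1 - E(\tau_i \mid O_i, \Theta^{(m)})
$$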
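The conditional expectations above can be evaluated numerically once the current values of the densities and survival functions are available. The Python sketch below does this for a single subject. It rests on several assumptions: the expression for $E(\tau_i \mid O_i, \Theta^{(m)})$ that involves $f_{T_2}(x_i)$ is taken to apply to a subject with a masked failure and the shorter expression to a censored subject, and the survival term in the numerator of the former is taken to be $1 - F_\varepsilon(\varepsilon_i(\beta))$. The function and argument names are hypothetical.

```python
def e_step_masked(alpha, S_eps, f_eps, S_T2, f_T2):
    """Conditional expectations for one subject with a masked failure.

    alpha : the quantity alpha_i in the expressions above
    S_eps : 1 - F_eps(eps_i(beta)), residual survival function
    f_eps : f_eps(eps_i(beta)), residual density
    S_T2  : 1 - F_T2(x_i)
    f_T2  : f_T2(x_i)
    All inputs are assumed to be evaluated at the current parameter values.
    """
    a = S_eps * f_T2   # term {1 - F_eps} * f_T2
    b = S_T2 * f_eps   # term {1 - F_T2} * f_eps

    e_zeta_given_tau1 = a / (a + b)
    # Denominator: f_T2 * {1 - alpha * F_eps} + alpha * {1 - F_T2} * f_eps
    e_tau = alpha * (a + b) / (f_T2 * (1.0 - alpha * (1.0 - S_eps)) + alpha * b)
    e_zeta_tau = e_zeta_given_tau1 * e_tau
    e_zeta = e_zeta_tau + 1.0 - e_tau
    return {
        "zeta_given_tau1": e_zeta_given_tau1,
        "tau": e_tau,
        "zeta_tau": e_zeta_tau,
        "zeta": e_zeta,
    }


def e_step_censored(alpha, S_eps):
    """E(tau_i | O_i, Theta^(m)) for one censored subject (assumed case)."""
    return alpha * S_eps / ((1.0 - alpha) + alpha * S_eps)
```

With `alpha = 0.5`, `S_eps = 0.5`, `f_eps = 1.0`, `S_T2 = 0.5`, and `f_T2 = 1.0`, the masked-failure expectations are 0.5 for $\zeta_i$ given $\tau_i = 1$, 0.5 for $\tau_i$, 0.25 for $\zeta_i\tau_i$, and 0.75 for $\zeta_i$, while `e_step_censored(0.5, 0.5)` gives 1/3.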

Footnotes

Supporting Information for this article is available from the author or on the WWW under http://dx.doi.org/10.1002/bimj.200800244.

Conflict of Interests Statement

The authors have declared no conflict of interest.

