2015 May 24;34(24):3181–3193. doi: 10.1002/sim.6529

Bias in progression‐free survival analysis due to intermittent assessment of progression

Leilei Zeng 1, Richard J Cook 1, Lan Wen 1, Audrey Boruvka 1
PMCID: PMC4744753  PMID: 26011411

Abstract

Cancer clinical trials are routinely designed to assess the effect of treatment on disease progression and death, often in terms of a composite endpoint called progression‐free survival. When progression status is known only at periodic assessment times, the progression time is interval censored, and complications arise in the analysis of progression‐free survival. Despite the advances in methods for dealing with interval‐censored data, naive methods such as right‐endpoint imputation are widely adopted in this setting. We examine the asymptotic and empirical properties of estimators of the marginal progression‐free survival functions and associated treatment effects under this scheme. Specifically, we explore the determinants of the asymptotic bias and point out that there is typically a loss in power of tests for treatment effects. Copyright © 2015 John Wiley & Sons, Ltd.

Keywords: illness‐death process, intermittent observation, progression‐free survival, model misspecification, multistate model

1. Introduction

A fundamental goal in oncology is the reduction of mortality due to cancer. Therapeutic advances for many cancers and the increasing pressure to evaluate experimental treatments in a timely and cost‐effective manner have made it challenging to design adequately powered trials based on the time from randomization to death. This has led to increased use of the composite progression‐free survival endpoint 1, where the nominal response is the time from randomization to the first of progression or death. From 2005 to 2009 inclusive, for example, just over one‐quarter of the trials published on breast, colorectal and non‐small‐cell lung cancer in the Journal of Clinical Oncology assess treatment effects based on the composite endpoint of progression‐free survival 2.

The utility and interpretation of findings based on progression‐free survival are currently being discussed in the oncology literature 3. Miksad et al. 4 point out that because interest primarily lies in the effect of treatment on overall survival, there is an implicit assumption that progression‐free survival is a surrogate for overall survival. These authors conduct a scientific review to examine the empirical evidence of an association between findings from progression‐free survival and overall survival analyses; they conclude that while there is an association between the findings, the predictive accuracy is not high. Sidhu et al. 5 study this issue in metastatic colorectal cancer trials involving fluoropyrimidine‐based regimens and found a higher correlation, leading them to conclude that progression‐free survival is a valid surrogate for overall survival in this context. In another context, Ballman et al. 6 examine the association between 6‐month progression‐free survival and overall survival at 12 months. With a view to gaining insight into the structural nature of the relation between these two endpoints, Broglio and Berry 7 decompose the effect of treatment on overall survival into components, one of which relates to progression‐free survival. Saad et al. 8 emphasize the importance of a clear definition of progression when analysed on its own or as part of a composite endpoint.

Frydman and Szarek 9 point out that the classical illness‐death model (Figure 1) offers a natural framework for joint consideration of a non‐terminal event, such as progression, and the terminal event death. In the current setting, we consider individuals to be in state 0 at the time of randomization when they are progression free and alive. Individuals enter state 1 upon progression and subsequently enter state 2 upon death. The possibility of death without progression is accommodated by direct transition from state 0 to state 2. Let $T_{jk}$ denote the potential $j \to k$ transition time. If $T_{01} < T_{02}$, we let $T_1 = T_{01}$ denote the time of entry to state 1. The time of entry to state 2 is the overall survival time, denoted by $T_2 = I(T_{02} < T_{01})\,T_{02} + I(T_{01} \le T_{02})\,T_{12}$. The progression‐free survival time is the time of exiting state 0, denoted by $T = \min(T_{01}, T_{02}) = \min(T_1, T_2)$.

Figure 1. A three‐state model for joint consideration of progression and death.

Challenges arise when progression‐free survival is used as an endpoint, as progression status is only observed at periodic assessment times and the time of progression is at best interval censored. The time to the composite progression‐free survival endpoint is then subject to a hybrid censoring scheme involving interval censoring for progression and right censoring for death.

Figure 2(a) shows a timeline diagram where $T_1$ and $T_2$ are the times of progression and death, respectively; $A_k$, $k = 1, 2, \ldots$, denote the assessment times; and $C$ is a right censoring time. In this scenario, progression is first detected at $A_2$, so one knows that $T$ lies in the interval $(A_1, A_2]$. The most common strategy for dealing with this type of data, however, is to use right‐endpoint imputation whereby the surrogate $S = A_2$ is used in lieu of the event time $T$ and standard survival analysis methods are adopted; this approach is recommended by the FDA 10. A further complication arises when death occurs without prior evidence of progression (Figure 2(b)) as it is unknown whether or not progression occurred between the last negative assessment ($A_2$) and the time of death. The convention in this case is to assume individuals have not progressed and hence use $S = T_2$ as the progression‐free survival time (e.g. see Dejardin et al. 11). The strong implicit assumptions associated with the approaches described earlier warrant careful consideration when dealing with the intermittently observed progression component in the composite progression‐free survival endpoint.

Figure 2. Timeline diagrams for progression ($T_1$), death ($T_2$), assessments ($A_1$ and $A_2$) and censoring ($C$).

Given the vast number of cancer trials using progression‐free survival as the primary endpoint, the challenges in interpreting findings based on composite endpoints and the seemingly self‐evident shortcomings of using the documented time of progression rather than the actual progression time, it is surprising that this problem has not received more attention. Frydman and Szarek 9 and Leffondré et al. 12 highlight and examine the biases that arise from some form of imputation in this setting. Panageas et al. 13 discuss the role of the assessment schedule on inferences regarding progression‐free survival analysis and discuss, in particular, the phenomenon that the Kaplan–Meier (KM) estimate tends to drop around the time of scheduled assessments in a way that reflects both the risk of progression and the assessment process. Binder and Schumacher 14 consider modelling the time to progression and examine the impact of censoring progression times at death for individuals not observed to have progressed. Leffondré et al. 12 advocate suitable methods for interval‐censored data when modelling the joint distribution of progression and death times. Much of this work has been based on simulation studies.

In this article, we examine the asymptotic and empirical properties of estimators arising from use of the surrogate ‘observed’ progression‐free survival times as is customarily carried out under the intermittent assessment scheme. We consider the settings where the assessment times are pre‐scheduled, but there is modest variation between individuals in precisely when they occur, to reflect data arising in clinical trials. We also consider the setting in which assessment times are governed by an independent and possibly progression‐dependent stochastic process to correspond to the less regulated assessment schemes of observational cohort studies. We formulate models for each component of the composite endpoint (progression and death) in the framework of an illness‐death model with proportional intensities so that the proportional hazards assumption is satisfied for the endpoint of progression‐free survival. Asymptotic calculations are carried out by considering an enlarged state space that allows joint consideration of the disease process and the assessment process. From this stochastic model, the distribution of the ‘observed’ progression‐free survival time is derived, where it is defined by right‐endpoint imputation when progression is detected and the survival time otherwise. We contrast this distribution with that of the latent progression‐free survival time in a variety of scenarios. We also use the theory of misspecified Cox models to derive the asymptotic biases of the estimated treatment effects from a Cox proportional hazard model for progression‐free survival 15, 16, 17.

The remainder of this paper is organized as follows. In Section 2, we introduce further notation for the three‐state illness‐death model, and we discuss issues in the use of ‘observed’ progression‐free survival when progression is intermittently observed in Section 3. We derive the expected partial score functions, which are central to the derivation of the asymptotic bias in estimates of treatment effect from the Cox proportional hazards model. We also derive the asymptotic power associated with Wald‐type tests of the null hypothesis of no treatment effect and demonstrate the potential reduction in power from adopting right‐endpoint imputation. Empirical investigations are conducted in Section 4 to illustrate the finite sample properties, and concluding remarks are given in Section 5.

2. The Markov illness‐death model for progression and death

We let $Z(t)$ denote the state occupied at time $t$, and $\{Z(u), 0 < u\}$ represent the corresponding stochastic process. At time $t > 0$, the history of the process, denoted as $H(t) = \{Z(u), 0 < u < t\}$, contains information on the timing and nature of any transitions over $(0,t)$. Transitions among the states $\mathcal{S} = \{0, 1, 2\}$ at any time $t$ are governed by the transition intensities

$$\lim_{\Delta t \downarrow 0} \frac{P(Z(t + \Delta t^-) = k \mid H(t))}{\Delta t} = Y_j(t)\,\lambda_{jk}(t \mid H(t)), \quad j < k \in \mathcal{S}, \tag{1}$$

where $Y_j(t) = I(Z(t^-) = j)$ indicates that the individual is ‘at risk’ of a transition out of state $j$ at time $t$, $j = 0, 1$.

If the transition intensities depend on the process history only through the state occupied at $t^-$, we write

$$\lambda_{jk}(t \mid H(t)) = \lambda_{jk}(t),$$

and the process is Markov. We can then let $A(t) = [A_{jk}(t)]$ denote the transition intensity matrix, where $A_{jk}(t) = \lambda_{jk}(t)$ for $j < k \in \mathcal{S}$, $A_{jj}(t) = -\sum_{k > j} \lambda_{jk}(t)$, and other elements are zero. Let $P(s,t) = [p_{jk}(s,t)]$ denote the transition probability matrix with elements $p_{jk}(s,t) = P(Z(t) = k \mid Z(s) = j)$, $s < t$. The matrix $P(s,t)$ satisfies the Kolmogorov forward differential equation

$$\frac{d}{dt} P(s,t) = P(s,t)\,A(t), \quad t > s, \tag{2}$$

which enables one to express the transition probabilities in terms of intensity functions 18. When the process is time homogeneous, the calculation becomes much simpler and $P(t) = \exp(At)$. For the three‐state illness‐death model shown in Figure 1, the explicit expressions for $p_{00}(s,t)$ and $p_{01}(s,t)$ are

$$p_{00}(s,t) = \exp\left(-\int_s^t [\lambda_{01}(u) + \lambda_{02}(u)]\,du\right), \qquad p_{01}(s,t) = \int_s^t \lambda_{01}(u) \exp\left(-\int_s^u [\lambda_{01}(v) + \lambda_{02}(v)]\,dv\right) \exp\left(-\int_u^t \lambda_{12}(v)\,dv\right) du. \tag{3}$$

Note that the survival function for the progression‐free survival endpoint is then

$$\mathcal{F}(t) = P(T > t) = p_{00}(0,t), \quad t > 0, \tag{4}$$

with respective hazard function $h(t) = -d \log \mathcal{F}(t)/dt = \lambda_{01}(t) + \lambda_{02}(t)$, the sum of the cause‐specific hazard functions for 0→1 and 0→2 transitions.
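As a minimal sketch of how (3) can be evaluated, the following assumes time‐homogeneous (constant) intensities, for which the integrals have closed forms. The function names and the midpoint‐rule cross‐check are illustrative choices and are not taken from the paper.

```python
import math

def p00(lam01, lam02, s, t):
    """P(in state 0 at t | in state 0 at s) for constant intensities."""
    return math.exp(-(lam01 + lam02) * (t - s))

def p01(lam01, lam02, lam12, s, t):
    """P(in state 1 at t | in state 0 at s): closed form of the integral in (3)."""
    total = lam01 + lam02
    d = t - s
    if math.isclose(total, lam12):
        # Limiting case when the exit rate from state 0 equals lam12.
        return lam01 * d * math.exp(-total * d)
    return lam01 / (total - lam12) * (math.exp(-lam12 * d) - math.exp(-total * d))

def p01_numeric(lam01, lam02, lam12, s, t, steps=20000):
    """Midpoint-rule evaluation of the integral in (3), used only as a cross-check."""
    h = (t - s) / steps
    acc = 0.0
    for i in range(steps):
        u = s + (i + 0.5) * h
        # Stay in state 0 until u, move to state 1 at u, survive in state 1 until t.
        acc += lam01 * math.exp(-(lam01 + lam02) * (u - s)) * math.exp(-lam12 * (t - u))
    return acc * h
```

With constant intensities, the hazard of the progression‐free survival time is simply `lam01 + lam02`, so `p00(lam01, lam02, 0, t)` is the survival function in (4).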

At times in what follows, it will also be useful to consider the counting process formulation of multistate processes, and with that in mind, we define $N_{jk}(t)$ as the number of $j \to k$ transitions over $(0,t]$ and let $\Delta N_{jk}(t) = N_{jk}(t + \Delta t^-) - N_{jk}(t^-)$ count the number of $j \to k$ transitions over the interval $[t, t + \Delta t)$. Then $dN_{jk}(t) = \lim_{\Delta t \downarrow 0} \Delta N_{jk}(t)$ indicates the occurrence of a $j \to k$ transition at time $t$.

3. Progression‐free survival with intermittent assessment schemes

3.1. Distribution of imputed progression‐free survival time

When interest lies in evaluating treatment effects on the basis of progression‐free survival, standard survival models are often fitted with the time T= min(T 1,T 2) in mind. As discussed in Section 2, progression status is only determined at periodic assessment times, and hence, among individuals for whom progression is detected, the progression time is interval censored. For such individuals, the convention is to use the assessment time at which progression is detected as the progression‐free survival time. For individuals dying without progression having previously been detected, it is unknown whether or not progression occurred, and in this case, it is conventional to assume they were progression‐free at the time of death and to take the time of death as the progression‐free survival time. These conventions lead to invalid inference because the distributions of the adopted event time and the actual progression‐free survival time differ. In what follows, we formally construct a probability model that represents the process generating the available data. We then use this model to evaluate the performance of methods used in this situation.

Let $C_A$ denote an administrative censoring time and $R_i$ a random right censoring time for individual $i$, so that the survival status is known over the interval $(0, C_i]$, where $C_i = \min(C_A, R_i)$. Suppose progression status is assessed at a sequence of times $0 = a_{i0} < a_{i1} < \cdots < a_{iK_i} < C_i$. The surrogate for the progression‐free survival time $T$ is given by

$$S = \min\{a_{ik} : Z(a_{ik}) = 1,\ k = 1, \ldots, K_i\} \cdot I(Z(a_{iK_i}) = 1) + T_2 \cdot I(Z(a_{iK_i}) = 0), \tag{5}$$

which takes the value of the assessment time if progression is detected, and the time of death otherwise. The progression‐free survival time is considered right censored at the last negative assessment if the final assessment is negative and death has not been observed during the course of follow‐up 11. We derive the distribution function for the surrogate event time S under two different intermittent assessment schemes, deferring discussion of covariates until Section 3.2.
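The rule in (5), together with the censoring convention just described, can be sketched as a small function. This is an illustration under stated assumptions rather than the authors' code: the name `surrogate_pfs`, its arguments and the use of infinity to encode "no progression before death" are all hypothetical.

```python
def surrogate_pfs(t1, t2, assessments, c):
    """Return (time, event) under right-endpoint imputation.

    t1: latent progression time (float('inf') if death occurs without progression)
    t2: death time
    assessments: increasing assessment times after randomization
    c: right censoring time
    """
    last_negative = 0.0  # assessment at randomization (time 0) is negative by assumption
    for a in assessments:
        if a >= min(t2, c):
            break  # no assessments after death or the end of follow-up
        if a >= t1:
            return a, 1  # progression first detected at a: impute the right endpoint
        last_negative = a
    if t2 <= c:
        return t2, 1  # death observed with no detected progression
    return last_negative, 0  # censored at the last negative assessment
```

For example, with assessments at 0.25, 0.5, 0.75 and latent progression at 0.3, the function returns 0.5 as the event time when the individual is alive at 0.5, but returns the death time when death occurs at 0.4, before progression could be detected.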

The first scenario is designed to mimic the setting of a randomized trial and the second a less regular assessment scheme such as one might see in a cohort study. The prostate cancer trial by Scher et al. 19 is a suitable example in which progression is evaluated every month during the first 6 months of follow‐up and every 3 months thereafter. For ease of exposition, we assume there is an assessment at the time of randomization denoted $a_0 = 0$, and $K$ regularly spaced follow‐up assessments are scheduled over $(0, C_A]$ at $a_k = k C_A / K$, $k = 1, \ldots, K$. According to (5), the surrogate progression‐free survival time $S$ is one of these fixed assessment times if progression is detected, but it otherwise takes on any value over $(0, C_A]$ if death is observed. This $S$ is a mixed‐type non‐negative random variable with positive mass at a finite number of discrete points (at the assessment times) and a cumulative distribution function

$$F_S(s) = P(S \le s) = \int_0^s f(u)\,du + \sum_{k: a_k \le s} P(S = a_k),$$

where

$$f(u) = p_{00}(a_0, a_{k-1}) \sum_{j=0}^{1} p_{0j}(a_{k-1}, u)\,\lambda_{j2}(u) \quad \text{for } u \in (a_{k-1}, a_k), \qquad P(S = a_k) = p_{00}(a_0, a_{k-1})\,p_{01}(a_{k-1}, a_k),$$

and $p_{00}(u,v)$ and $p_{01}(u,v)$ are given in (3). The cumulative distribution function is a complicated time‐varying function of the baseline intensities in the three‐state multiplicative illness‐death model characterizing the true underlying process of progression and death, as is the hazard function $h_S(s)$.
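To make the mixed discrete and continuous structure of this distribution concrete, the sketch below computes the point masses $P(S = a_k)$ and the probability of death in each inter‐assessment interval (the integral of $f$). It assumes constant intensities with $\lambda_{01} + \lambda_{02} \ne \lambda_{12}$ and $K$ equally spaced assessments with $a_K = C_A$. The function names and the midpoint rule are illustrative choices, not the paper's implementation.

```python
import math

def p00(total, d):
    """P(remain in state 0 over an interval of length d); total = lam01 + lam02."""
    return math.exp(-total * d)

def p01(lam01, total, lam12, d):
    """P(in state 1 after length d | start in state 0); assumes total != lam12."""
    return lam01 / (total - lam12) * (math.exp(-lam12 * d) - math.exp(-total * d))

def surrogate_distribution(lam01, lam02, lam12, K, c_a=1.0, steps=2000):
    """Return (masses, death_probs): P(S = a_k) and the integral of f over (a_{k-1}, a_k)."""
    total = lam01 + lam02
    gap = c_a / K
    masses, death_probs = [], []
    for k in range(1, K + 1):
        reach = p00(total, (k - 1) * gap)  # p00(a_0, a_{k-1})
        masses.append(reach * p01(lam01, total, lam12, gap))
        h = gap / steps
        acc = 0.0
        for i in range(steps):  # midpoint rule for the integral of f(u)
            d = (i + 0.5) * h
            acc += p00(total, d) * lam02 + p01(lam01, total, lam12, d) * lam12
        death_probs.append(reach * acc * h)
    return masses, death_probs
```

A useful sanity check is that the point masses, the interval death probabilities and the probability $p_{00}(0, C_A)$ of remaining event free through the last assessment sum to one.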

For cohort studies, point process models offer a useful framework for characterizing the stochastic assessment process. We let $A_k$ denote the random time of the $k$th assessment, $k = 1, 2, \ldots$, and $N_a(t) = \sum_{k=1}^{\infty} I(A_k \le t)$ count the number of follow‐up assessments occurring over $(0,t]$. If $Y(s) = I(s \le C)$ and the history of the joint process is defined as $H^{\circ}(t) = \{(Z(a_k), a_k), k = 0, \ldots, N_a(t^-);\ I(s \le T_2), Y(s), 0 < s < t\}$ for an arbitrary individual, then their visit intensity is defined as

$$\lim_{\Delta t \downarrow 0} \frac{P(\Delta N_a(t) = 1 \mid H^{\circ}(t))}{\Delta t} = Y(t)\,\lambda^{\circ}(t \mid H^{\circ}(t)). \tag{6}$$

A joint model for the response, assessment and censoring processes can be constructed by first expanding the state space. Specifically, we define a new multistate process $\{Z^{\circ}(s), 0 < s\}$ with a state space $\mathcal{S}^{\circ} = \{V_0, V_1, \ldots, V_1^p, V_2^p, \ldots, P_0, P_1, \ldots, D_0, D_1, \ldots\}$ as depicted in Figure 3. The subscript on the states $V_k$, $k = 1, 2, \ldots$, reflects the number of assessment visits for which progression was not detected; for the states $V_k^p$, $k = 1, 2, \ldots$, the superscript $p$ designates an assessment made post‐progression. Thus, an individual is in state $V_k$ at time $t$ if they are alive and progression‐free and $N_a(t) = k$ (i.e. $A_k \le t < \min(T_1, T_2, A_{k+1})$). They are in state $P_k$ at time $t$ if they progressed after the $k$th assessment but have not yet experienced the $(k+1)$st assessment or died (i.e. $A_k < T_1 \le t < \min(T_2, A_{k+1})$). An individual is in state $D_k$ at time $t$ if $A_k < T_2 < \min(t, A_{k+1})$ and in state $V_{k+1}^p$ at time $t$ if $A_k < T_1 < A_{k+1} \le t < T_2$, $k = 0, 1, 2, \ldots$. Following the occurrence of the $k$th assessment, the next event to occur (visit $k+1$, progression or death) is governed by a competing risk process; transitions to the right (i.e. $V_k \to V_{k+1}$) occur with the cause‐specific intensity $\lambda^{\circ}(t \mid H^{\circ}(t)) = \alpha(t)$. Downward transitions from the $V_k$ state (i.e. $V_k \to P_k$ and $V_k \to D_k$) correspond to transitions out of state 0 in the original three‐state illness‐death process and hence have the same transition intensities $\lambda_{01}(t)$ and $\lambda_{02}(t)$; likewise, the $P_k \to D_k$ transitions have intensity $\lambda_{12}(t)$ under a Markov model. Finally, the $P_k \to V_{k+1}^p$ transition corresponds to the occurrence of the $(k+1)$st assessment following progression and has intensity $\lambda^{\circ}(t \mid T_1 < t, H^{\circ}(t)) = \alpha_p(t)$. If we assume the assessment process is independent of the disease process, then $\alpha_p(t) = \alpha(t)$, but otherwise, we may assign a different intensity, such as $\alpha_p(t) = \alpha(t) \exp(\rho)$, to reflect increased intensity of the assessment process post‐progression if, for example, $\rho > 0$. The latter may be reasonable if clinic visits are more likely if symptoms associated with progression become evident; note that such a process would violate the sequential missing at random assumption 20 and invalidate even methods based on interval‐censored data 21.

Figure 3. A multistate diagram for joint consideration of progression, death and recurrent assessment times.

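We define $\mathcal{V}$ as the set of states $V_0, V_1, \ldots$, and define $\mathcal{V}^p$, $\mathcal{P}$ and $\mathcal{D}$ similarly. Under such a setup, if $K$ is large enough, the survival function for the surrogate progression‐free survival time $S$ is simply $P(Z^{\circ}(s) \notin \mathcal{V}^p \cup \mathcal{D})$. In other words, the survival function $\mathcal{F}_S(s)$ can be expressed as a function of the transition probabilities of the multistate process given in Figure 3 such that

$$\mathcal{F}_S(s) = 1 - P(Z^{\circ}(s) \in \mathcal{V}^p \mid Z^{\circ}(0) = V_0) - P(Z^{\circ}(s) \in \mathcal{D} \mid Z^{\circ}(0) = V_0). \tag{7}$$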

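The transition probabilities necessary to compute this can be obtained by the Kolmogorov differential equation of Section 2 but can also be calculated directly from the probability of particular sample paths. For example, for an individual to be in state $V_k^p$ at time $s$, $k - 1$ assessments have to occur prior to progression, and the individual must survive to the $k$th assessment at $A_k < s$. Over $[0,s]$, the individual thus follows the path $V_0 \to \cdots \to V_{k-1} \to P_{k-1} \to V_k^p$ at times $0 = a_0 < \cdots < a_{k-1} < t_1 < a_k \le s$, so the probability $P(Z^{\circ}(s) = V_k^p \mid Z^{\circ}(0) = V_0)$ takes the form

$$\int_0^s \int_0^{a_k} \int_0^{t_1} \cdots \int_0^{a_2} \prod_{j=1}^{k-1} \alpha(a_j)\, \lambda_{01}(t_1)\, \alpha_p(a_k) \exp\left(-\int_0^{t_1} [\lambda_{01}(u) + \lambda_{02}(u) + \alpha(u)]\,du\right) \times \exp\left(-\int_{t_1}^{a_k} [\lambda_{12}(u) + \alpha_p(u)]\,du\right) da_1 \cdots da_{k-1}\, dt_1\, da_k. \tag{8}$$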

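An expression for $P(Z^{\circ}(s) = D_k \mid Z^{\circ}(0) = V_0)$ is derived similarly by noting that over $[0,s]$, an individual must follow either the path $V_0 \to \cdots \to V_k \to D_k$ with transition times $0 = a_0 < \cdots < a_k < t_2 \le s$, or the path $V_0 \to \cdots \to V_k \to P_k \to D_k$ with transitions at times $0 = a_0 < \cdots < a_k < t_1 < t_2 \le s$. Hence, $P(Z^{\circ}(s) = D_k \mid Z^{\circ}(0) = V_0)$ takes the form

$$\int_0^s \int_0^{t_2} \cdots \int_0^{a_2} \prod_{j=1}^{k} \alpha(a_j)\, \lambda_{02}(t_2) \exp\left(-\int_0^{t_2} [\lambda_{01}(u) + \lambda_{02}(u) + \alpha(u)]\,du\right) da_1 \cdots da_k\, dt_2 + \int_0^s \int_0^{t_2} \int_0^{t_1} \cdots \int_0^{a_2} \prod_{j=1}^{k} \alpha(a_j)\, \lambda_{01}(t_1)\, \lambda_{12}(t_2) \exp\left(-\int_0^{t_1} [\lambda_{01}(u) + \lambda_{02}(u) + \alpha(u)]\,du\right) \times \exp\left(-\int_{t_1}^{t_2} [\lambda_{12}(u) + \alpha_p(u)]\,du\right) da_1 \cdots da_k\, dt_1\, dt_2. \tag{9}$$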

When the multistate process is time homogeneous with $\lambda_{jk}(t) = \lambda_{jk}$ and the assessment process is ignorable and time homogeneous ($\alpha(t) = \alpha_p(t) = \alpha$), expressions (8) and (9) can be further simplified. Substituting (8) and (9) in (7) results in an expression for $\mathcal{F}_S(s)$, and the hazard $h_S(s)$ is no longer a simple sum of $\lambda_{01}(s)$ and $\lambda_{02}(s)$.

We compare the survival functions for progression‐free survival time (4) and the surrogate progression‐free survival time (7) under various parameter configurations for the setting where the assessments arise from a point process. We assume time‐homogeneous transition intensities $\lambda_{jk}(t) = \lambda_{jk}$, $j < k \in \mathcal{S}$, with values set such that (i) $P(T_{01} < T_{02}) = \lambda_{01}/(\lambda_{01} + \lambda_{02})$ equals a desired probability of progression, (ii) $\lambda_{12}/\lambda_{02} = 1.5$ corresponds to an increased risk of death following progression and (iii) $\pi_A = P(T > C_A) = p_{00}(0, C_A) = 0.20$ gives a 20% administrative censoring rate for the progression‐free survival endpoint; we set $C_A = 1$ without loss of generality. The random right censoring time $R_i$ is assumed to follow an exponential distribution with the hazard set to yield the desired net censoring rates ($\pi_N$), where $\pi_N = P(C_i < T)$ with $C_i = \min(C_A, R_i)$. We set $\alpha(t) = \alpha$ and $\alpha_p(t) = \alpha \exp(\rho)$ with $\rho \in \{\log 1, \log 2\}$ for a progression‐independent or progression‐dependent assessment process, respectively.
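Constraints (i) to (iii) determine the three intensities in closed form, because $p_{00}(0, C_A) = \exp\{-(\lambda_{01} + \lambda_{02}) C_A\}$ under time homogeneity. The following sketch solves them; the function name and argument names are hypothetical.

```python
import math

def baseline_intensities(p_prog, pi_a=0.20, ratio=1.5, c_a=1.0):
    """Solve constraints (i)-(iii) for constant intensities (lam01, lam02, lam12)."""
    total = -math.log(pi_a) / c_a   # (iii): exp(-(lam01 + lam02) * C_A) = pi_A
    lam01 = p_prog * total          # (i): lam01 / (lam01 + lam02) = p_prog
    lam02 = (1.0 - p_prog) * total
    lam12 = ratio * lam02           # (ii): lam12 / lam02 = ratio
    return lam01, lam02, lam12
```

For instance, `baseline_intensities(0.8)` corresponds to the configuration with $P(T_{01} < T_{02}) = 0.8$ and $C_A = 1$.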

Figure 4(a) contains a plot of the survival functions for the surrogate (black dashed line) and actual (black solid line) progression‐free survival times for the case when P(T 01<T 02) = 0.8, and the assessment process is event independent with a time homogeneous rate α = 2. The asymptotic bias in the estimator of the median progression‐free survival time is evident in Figure 4(a) from the distance between the vertical lines; the median progression‐free survival time is overestimated by a factor of (0.31 − 0.12)/0.12 = 2.6. Similarly, a progression‐free survival analysis based on the surrogate time leads to an over‐estimation of the probability of being event free. We also find that when all other factors are fixed, the extent of over‐estimation is greater when assessments are less frequent (e.g. comparing the biases when α = 2 versus when α = 4). Finally, the asymptotic bias decreases as P(T 01<T 02) decreases below 0.8 because as the probability of failure due to death increases, the probability that the actual failure time will be recorded increases.

Figure 4. Survival functions for the surrogate progression‐free survival time (dotted) and the actual progression‐free survival time (solid) when the assessment process is (a) event independent and (b) event dependent; here, $P(T_{01} < T_{02}) = 0.8$, $\alpha(t) = \alpha = 2$, $\lambda_{12}/\lambda_{02} = 1.5$ and $\pi_A = 0.2$.

Figure 4(b) displays an analogous plot to Figure 4(a) for the case of an event‐dependent assessment process with ρ= log2; here, the intensity for the next assessment doubles after progression occurs. The findings are similar, but the biases are lower because the lag from progression to its detection is stochastically smaller.

3.2. Cox regression based on the surrogate progression‐free survival time

Treatment effects are naturally characterized by multiplicative intensity models of the form

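$$\lambda_{jk}(t \mid x) = \lambda_{jk0}(t) \exp(x \beta_{jk}), \tag{10}$$

where $\lambda_{jk0}(t)$ is the baseline rate of $j \to k$ transitions and $\beta = (\beta_{jk}, j < k \in \mathcal{S})'$ is the vector of regression parameters. For simplicity, we focus on a binary treatment indicator $x$, so that the hazard ratio for the progression‐free survival time is

$$\frac{h(t \mid x = 1)}{h(t \mid x = 0)} = \frac{\lambda_{010}(t) \exp(\beta_{01}) + \lambda_{020}(t) \exp(\beta_{02})}{\lambda_{010}(t) + \lambda_{020}(t)}. \tag{11}$$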

The proportionality assumption holds for $h(t)$ under either (i) $\beta_{01} = \beta_{02}$ or (ii) $\lambda_{010}(t) = \lambda_{020}(t)$ 22, 23. Under the first scenario, the treatment effect on progression‐free survival is the same as the treatment effect on progression and death, $\beta = \beta_{01} = \beta_{02}$. The second scenario leads to a treatment effect on progression‐free survival given by $\beta = \log[(\exp(\beta_{01}) + \exp(\beta_{02}))/2]$. When neither of these two conditions is satisfied, the hazard ratio (11) will be a complicated time‐varying function of the baseline intensities and the treatment effects for progression and death.
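Both cases follow by direct substitution into (11). As a short worked check, written with $\lambda_{0}(t)$ for the common baseline intensity in case (ii) (a symbol introduced here only for this illustration):

$$
\begin{aligned}
\text{(i) } \beta_{01} = \beta_{02} = \beta: &\quad \frac{h(t \mid x = 1)}{h(t \mid x = 0)} = \frac{\exp(\beta)\,[\lambda_{010}(t) + \lambda_{020}(t)]}{\lambda_{010}(t) + \lambda_{020}(t)} = \exp(\beta), \\
\text{(ii) } \lambda_{010}(t) = \lambda_{020}(t) = \lambda_{0}(t): &\quad \frac{h(t \mid x = 1)}{h(t \mid x = 0)} = \frac{\lambda_{0}(t)\,[\exp(\beta_{01}) + \exp(\beta_{02})]}{2\,\lambda_{0}(t)} = \frac{\exp(\beta_{01}) + \exp(\beta_{02})}{2}.
\end{aligned}
$$

In both cases the ratio is free of $t$, which is what the proportional hazards assumption requires.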

We consider the setting when proportionality holds for the progression‐free survival endpoint under condition (i) and a semi‐parametric regression model for the hazard takes a form h(t;x) = h 0(t) exp(x β), where h 0(t) is the baseline hazard, x is a binary treatment indicator and β is the true treatment effect. While this is no doubt unrealistic, we do so because our primary objective is to evaluate the impact of using the surrogate event time S in a Cox regression model under the intermittent assessment scheme. If neither condition (i) nor (ii) is satisfied, the Cox model for T is misspecified to start with, and it will be difficult to determine the role of the inspection process in the results.

To distinguish the true model from the one for the surrogate, we write

$$h_S(s; x) = h_{S0}(s) \exp(x\gamma), \quad s > 0, \tag{12}$$

for the working Cox model based on the surrogate time $S$ with baseline hazard $h_{S0}(s)$ and treatment effect $\gamma$. Let $Y_i^{\dagger}(s) = I(s \le S_i)$ be the at‐risk indicator and $dN_i(s) = I(S_i = s)$ be the event indicator, both of which are defined based on the imputed event time $S$. If $Y_i(s) = I(s \le C_i)$ indicates individual $i$ is still under observation, then $\bar{Y}_i(s) = Y_i(s) Y_i^{\dagger}(s)$ indicates they are under observation and at risk of the composite event. The partial likelihood 24, 25 leads to a score function for $\gamma$ of the form $U(\gamma) = \sum_{i=1}^{n} U_i(\gamma)$, where

$$U_i(\gamma) = \int_0^{\infty} \bar{Y}_i(s) \left\{ X_i - \frac{R^{(1)}(\gamma, s)}{R^{(0)}(\gamma, s)} \right\} dN_i(s), \tag{13}$$

$R^{(1)}(\gamma, s) = \sum_{i=1}^{n} \bar{Y}_i(s) X_i \exp(X_i \gamma)$ and $R^{(0)}(\gamma, s) = \sum_{i=1}^{n} \bar{Y}_i(s) \exp(X_i \gamma)$. If $\hat{\gamma}$ denotes the estimate of $\gamma$ obtained by solving $U(\gamma) = 0$, then $\hat{\gamma} \to \gamma^*$ in probability, where $\gamma^*$ is the solution to

$$\int_0^{\infty} \left\{ E[\bar{Y}_i(s) X_i\, dN_i(s)] - \frac{r^{(1)}(\gamma, s)}{r^{(0)}(\gamma, s)}\, E[\bar{Y}_i(s)\, dN_i(s)] \right\} = 0, \tag{14}$$

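with $r^{(0)}(\gamma, s) = E\{R^{(0)}(\gamma, s)\}$ and $r^{(1)}(\gamma, s) = E\{R^{(1)}(\gamma, s)\}$, respectively 16. Lin and Wei 17 prove that $\sqrt{n}(\hat{\gamma} - \gamma^*)$ is asymptotically normal with

$$\operatorname{var}\{\sqrt{n}(\hat{\gamma} - \gamma^*)\} = [A^{-1}(\gamma^*)]\,[B(\gamma^*)]\,[A^{-1}(\gamma^*)], \tag{15}$$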

where

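$$A(\gamma) = \int_0^{\infty} E\left[ \bar{Y}_i(s) \left\{ \frac{r^{(1)}(\gamma, s)}{r^{(0)}(\gamma, s)} - \left( \frac{r^{(1)}(\gamma, s)}{r^{(0)}(\gamma, s)} \right)^{2} \right\} dN_i(s) \right], \qquad B(\gamma) = \int_0^{\infty} E\left[ \bar{Y}_i(s) \left\{ X_i - \frac{r^{(1)}(\gamma, s)}{r^{(0)}(\gamma, s)} \right\}^{2} dN_i(s) \right].$$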

We define $\gamma^* - \beta$ as the asymptotic bias.
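For a finite sample, the estimating equation $U(\gamma) = 0$ built from (13) can be solved numerically. The sketch below is a minimal illustration for a single binary covariate, assuming no tied event times and that a root lies in the bracketing interval; the function names and the use of bisection are choices made here, not the authors' implementation.

```python
import math

def partial_score(gamma, times, events, x):
    """U(gamma) from (13) for one binary covariate, assuming no tied event times."""
    n = len(times)
    score = 0.0
    for i in range(n):
        if not events[i]:
            continue  # censored observations contribute only through risk sets
        # Risk set at times[i]: everyone whose observed time is at least times[i].
        r0 = sum(math.exp(x[j] * gamma) for j in range(n) if times[j] >= times[i])
        r1 = sum(x[j] * math.exp(x[j] * gamma) for j in range(n) if times[j] >= times[i])
        score += x[i] - r1 / r0
    return score

def solve_gamma(times, events, x, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection on U(gamma); U is non-increasing because the log partial likelihood is concave."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if partial_score(mid, times, events, x) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Applying `solve_gamma` once to surrogate times $S_i$ and once to actual times $T_i$ from the same simulated data gives a direct empirical view of the difference that $\gamma^* - \beta$ describes asymptotically.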

The expectations in (14) and (15) are taken with respect to the distribution of the surrogate time $S$ with expressions $E\{\bar{Y}_i(s) X_i\, dN_i(s)\} = G(s) f_S(s; x = 1) P(X = 1)$, $E\{\bar{Y}_i(s)\, dN_i(s)\} = \sum_{x=0}^{1} G(s) f_S(s; x) P(X = x)$, $r^{(1)}(\gamma, s) = n G(s) \mathcal{F}_S(s; x = 1) \exp(\gamma) P(X = 1)$ and $r^{(0)}(\gamma, s) = n \sum_{x=0}^{1} G(s) \mathcal{F}_S(s; x) \exp(x\gamma) P(X = x)$, where $G(s) = P(R_i \ge s)$ denotes the survival function for the independent random right censoring time $R_i$. Note that $f_S(s; x)$ and $\mathcal{F}_S(s; x)$ are the density and survival functions for the surrogate progression‐free survival time $S$ derived in Section 3.1, and they are functions of baseline intensities and treatment effects in the true three‐state illness‐death model as well as the parameters governing the assessment process. These enable one to evaluate how the bias in the estimator from the imputed data analysis changes according to these factors. In the Appendix, we report the results of a small investigation regarding the asymptotic behaviour of Cox regression coefficients of the treatment effect when the surrogate progression‐free survival data are used. For the scenarios studied, we find that there is generally a conservative bias, meaning the naive use of the surrogate progression‐free survival time underestimates the impact of treatment.

4. Simulations and analyses based on correct and misspecified models

Here, we report on simulation studies designed to empirically assess the bias and frequency properties of tests and confidence intervals in terms of power and coverage probabilities. Individuals were randomized to one of two treatment arms in a balanced fashion with equal probability. Given the treatment assignment, the times to progression and/or death were simulated based on the three‐state multiplicative model; $T_{01}$ and $T_{02}$ were simulated based on exponential distributions with rates $\lambda_{01} \exp(x\beta)$ and $\lambda_{02} \exp(x\beta)$, respectively, and if $T_{01} < T_{02}$, we simulate $T_{12}$ according to an exponential distribution with rate $\lambda_{12} \exp(x\beta_{12})$. The values for the baseline intensities are set to satisfy the following constraints: (i) $P(T_{01} < T_{02} \mid X = 0) = \lambda_{01}/(\lambda_{01} + \lambda_{02}) \in \{0.6, 0.8\}$; (ii) $\lambda_{12}/\lambda_{02} = 1.5$; and (iii) $\pi_A = P(T > C_A \mid X = 0) = 0.20$. We set $\beta_{01} = \beta_{02} = \beta = \log 0.4$ and $\beta_{12} = \log 1$. The random right censoring time $R_i$ is simulated from an exponential distribution with the rate set to achieve $\pi_N \in \{0.4, 0.6\}$.

For the case with fixed assessment times, we assume $K = 4$ assessments occur over $[0, C_A]$, where $C_A = 1$, at times $a_k = k C_A / K + \epsilon_k$, where $\epsilon_k \sim N(0, \sigma_e)$ and $k = 1, \ldots, K$. The random error $\epsilon_k$ is added to mimic the situation where some variation occurs around the evenly spaced assessment times in a protocol. We let $\sigma_e \in \{C_A/(6K), C_A/(20K)\}$ for moderate and mild variation so that the chance that two consecutive assessment times are realized in the reverse order is small. In Figure 5, we plot (4) along with the KM estimates based on the actual progression‐free survival time (dashed line) and the surrogate progression‐free survival time (solid line) for a data set of $n = 1000$ individuals. The plots reveal a striking, familiar but misleading stepwise drop in the KM curves based on the surrogate event times as seen in Scher et al. 19, for example. This pattern becomes more distinctive as the variation around pre‐scheduled assessment times decreases (as shown in Figure 5(b)) and the risk of progression increases. In addition, the KM curves under right‐endpoint imputation sit above KM curves based on actual progression‐free survival event times and the theoretical curves.

Figure 5. Kaplan–Meier curves estimated using right‐endpoint imputed progression‐free survival event times and actual progression‐free survival event times, contrasted against the theoretical survival function of progression‐free survival, for treatment and control groups. Number of fixed assessments $K = 4$ with a random normal noise $\epsilon_k \sim N(0, \sigma_e)$. ($\beta_{01} = \beta_{02} = \beta = \log 0.5$, $\beta_{12} = 0$, $P(T_{01} < T_{02} \mid X = 0) = 0.80$, $\lambda_{12}/\lambda_{02} = 1.5$, $\pi_A = 0.2$, $\pi_N = 0.4$).

We also simulated random assessment times based on a Poisson process where the gap time between two consecutive assessments follows an exponential distribution with a rate α. For each parameter configuration, we calculated the sample size required to achieve 80% power for a Wald test at the 5% significance level under the Cox model. Three types of analysis were conducted on each simulated dataset for a given sample size including a Cox model for the true progression‐free survival time, a Cox model for the surrogate progression‐free survival time and a correct analysis based on the actual panel observation scheme using a time‐homogenous illness‐death model of Section 2. The first and third analyses yield consistent estimates of the treatment effect, but the first, of course, is not possible in practice. Under the assumption that the inspection process satisfies the sequentially missing at random condition 20 and is non‐informative, the partial likelihood has the form

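$$\prod_{k=1}^{K_i} \Pr(Z(a_{ik}) \mid Z(a_{i,k-1}), X_i) \left\{ \sum_{j=0}^{1} \Pr(Z(t_{i2}^-) = j \mid Z(a_{iK_i}), X_i)\, \lambda_{j2}(t_{i2} \mid X_i) \right\}^{I(T_{i2} < C_i)}.$$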

This analysis can be implemented with the MSM package 26.
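The data‐generating step of this design, with exponential latent times and assessments from a Poisson process, can be sketched as follows. The sketch makes several simplifying assumptions that should be read as such: only administrative censoring at $C_A$ is applied (no random $R_i$), $\beta_{12} = 0$, the assessment rate does not change after progression, and $T_{12}$ is treated as a sojourn time in state 1 added to $T_{01}$. All function names are hypothetical.

```python
import math
import random

def simulate_subject(rng, lam01, lam02, lam12, alpha, beta, x, c_a=1.0):
    """Return (x, true PFS time T, surrogate time S, event indicator for S)."""
    hr = math.exp(beta * x)
    t01 = rng.expovariate(lam01 * hr)
    t02 = rng.expovariate(lam02 * hr)
    if t01 < t02:
        t1, t2 = t01, t01 + rng.expovariate(lam12)  # progression, then death
    else:
        t1, t2 = math.inf, t02                      # death without progression
    t_true = min(t1, t2)
    a = last_negative = 0.0
    while True:
        a += rng.expovariate(alpha)  # exponential gap to the next assessment
        if a >= min(t2, c_a):
            break                    # no assessment after death or end of follow-up
        if a >= t1:
            return x, t_true, a, 1   # progression detected: right-endpoint imputation
        last_negative = a
    if t2 <= c_a:
        return x, t_true, t2, 1      # death observed without detected progression
    return x, t_true, last_negative, 0  # censored at the last negative assessment

def simulate_trial(n, seed, lam01, lam02, lam12, alpha, beta):
    """Simulate n subjects with 1:1 randomization by independent fair coin flips."""
    rng = random.Random(seed)
    return [simulate_subject(rng, lam01, lam02, lam12, alpha, beta,
                             x=int(rng.random() < 0.5)) for _ in range(n)]
```

By construction, whenever an event is recorded the surrogate time is never earlier than the actual progression‐free survival time, which is the mechanism behind the over‐estimation of survival discussed in Section 3.1.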

Table 1 reports the difference between the average estimated treatment effects and the true $\beta$ (EBIAS) based on 1000 replications for each parameter configuration, the empirical standard error of the estimates (ESE), the average robust standard error (RSE) based on (15) and the empirical coverage probability of nominally 95% CIs for $\beta$ (ECP). The proportion of replicates in which a test of $H_0: \beta = 0$ is rejected under a two‐sided Wald test is also reported as the empirical power (EP). The Cox regression analysis of the actual event time and the multistate analysis yielded estimates with negligible empirical biases, but the naive Cox analysis based on the imputed event times gave biased estimates of treatment effect. The empirical coverage probabilities from all three analyses are very close to the nominal level. Although the multistate analysis based on panel data and the Cox model based on the true event time both yield consistent estimates, the former is associated with slightly larger standard errors and lower EP, reflecting the loss of information when event times are interval censored. More appreciable differences in EP are seen when comparing the naive Cox regression analysis with the parametric multistate analysis. The EP values from the naive Cox regression analysis are below 80% in general and can be as low as 62.7%. The impact of the misspecification due to right‐endpoint imputation can therefore be quite substantial. A selection of results in Table 1 are graphically displayed in Figure A.1 in the Appendix, demonstrating good agreement between the empirical finite sample behaviour and results anticipated based on large sample theory.

Table 1.

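Empirical results when assessments arise from a point process; 1000 simulated datasets per configuration, with the sample size N chosen for 80% statistical power.

Columns 4–8: Cox model, true PFS time. Columns 9–13: Cox model, surrogate PFS time. Columns 14–18: time‐homogeneous (TH) multistate analysis.
α P1 N EBIAS ESE RSE ECP% EP% EBIAS ESE RSE ECP% EP% EBIAS ESE RSE ECP% EP%
$\beta = \log(0.50)$, $\pi_N = 40\%$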
2.00 0.60 111 −0.003 0.263 0.253 94.0 78.5 0.059 0.284 0.277 93.9 62.7 −0.006 0.271 0.276 95.4 73.3
2.00 0.80 111 −0.007 0.257 0.252 94.7 80.9 0.078 0.272 0.268 93.0 64.6 −0.006 0.269 0.278 95.9 72.4
4.00 0.60 111 −0.003 0.263 0.253 94.0 78.5 0.028 0.284 0.269 94.4 69.4 −0.008 0.252 0.266 95.9 78.5
4.00 0.80 111 −0.007 0.257 0.252 94.7 80.9 0.037 0.287 0.277 93.6 67.1 −0.008 0.259 0.257 95.6 78.4
8.00 0.60 111 −0.003 0.263 0.253 94.0 78.5 0.008 0.280 0.263 94.2 73.6 −0.005 0.254 0.252 95.3 80.4
8.00 0.80 111 −0.007 0.257 0.252 94.7 80.9 0.006 0.280 0.267 94.2 74.3 −0.003 0.254 0.253 95.3 80.2
$\beta = \log(0.50)$, $\pi_N = 60\%$
2.00 0.60 167 −0.004 0.253 0.253 95.1 80.0 0.039 0.272 0.265 93.2 69.8 −0.005 0.267 0.276 95.5 73.6
2.00 0.80 167 0.002 0.246 0.254 96.3 79.1 0.065 0.268 0.272 94.9 64.3 0.004 0.276 0.277 95.4 72.0
4.00 0.60 167 −0.004 0.253 0.253 95.1 80.0 0.014 0.269 0.267 94.9 73.2 −0.004 0.252 0.026 96.1 78.3
4.00 0.80 167 0.002 0.246 0.254 96.3 79.1 0.034 0.271 0.274 94.9 69.2 0.005 0.259 0.258 94.6 77.1
8.00 0.60 167 −0.004 0.253 0.253 95.1 80.0 0.003 0.269 0.266 94.3 74.3 0.009 0.249 0.253 95.5 79.4
8.00 0.80 167 0.002 0.246 0.254 96.3 79.1 0.018 0.265 0.271 95.0 72.2 −0.005 0.251 0.254 95.8 78.7
$\beta = \log(0.75)$, $\pi_N = 40\%$
2.00 0.60 634 −0.002 0.104 0.103 94.1 82.0 0.023 0.111 0.114 94.9 65.4 −0.003 0.114 0.113 95.3 73.4
2.00 0.80 634 0.003 0.100 0.103 95.9 79.5 0.036 0.110 0.110 93.4 61.9 −0.006 0.112 0.113 94.1 71.7
4.00 0.60 634 −0.002 0.104 0.103 94.1 82.0 0.012 0.109 0.110 95.1 73.3 −0.002 0.102 0.105 95.8 78.4
4.00 0.80 634 0.003 0.100 0.103 95.9 79.5 0.021 0.111 0.113 95.0 66.6 −0.001 0.107 0.105 94.3 78.1
8.00 0.60 634 −0.002 0.104 0.103 94.1 82.0 0.001 0.107 0.107 94.7 76.4 −0.003 0.104 0.103 93.9 80.6
8.00 0.80 634 0.003 0.100 0.103 95.9 79.5 0.008 0.106 0.109 95.2 72.9 −0.003 0.105 0.104 94.8 78.8
$\beta = \log(0.75)$, $\pi_N = 60\%$
2.00 0.60 952 −0.001 0.105 0.103 93.5 80.6 0.017 0.110 0.108 94.6 70.3 −0.006 0.111 0.112 94.7 76.0
2.00 0.80 952 −0.003 0.104 0.103 94.9 81.0 0.023 0.109 0.111 94.7 67.4 −0.004 0.114 0.112 94.9 75.1
4.00 0.60 952 −0.001 0.105 0.103 93.5 80.6 0.011 0.109 0.109 95.1 72.5 −0.003 0.103 0.105 95.5 79.3
4.00 0.80 952 −0.003 0.104 0.103 94.9 81.0 0.011 0.111 0.112 94.9 68.9 −0.004 0.105 0.108 94.3 79.9
8.00 0.60 952 −0.001 0.105 0.103 93.5 80.6 0.004 0.109 0.108 94.3 73.9 −0.000 0.103 0.103 94.8 79.6
8.00 0.80 952 −0.003 0.104 0.103 94.9 81.0 −0.001 0.108 0.110 94.5 74.7 −0.005 0.106 0.103 94.8 78.7

Both the true and surrogate progression‐free survival (PFS) times are subject to right censoring. $\beta_{12} = \log 1$, $P_1 = P(T_{01} < T_{02} \mid X = 0) = \lambda_{01}/(\lambda_{01} + \lambda_{02})$, $\lambda_{12}/\lambda_{02} = 1.5$, $C_A = 1$, $\pi_A = 0.20$.

5. Discussion

There has been much discussion about the use of composite endpoints in clinical trials 22, 23, and while views are somewhat divided, it is apparent that a clear interpretation of findings based on composite endpoints is difficult given the many factors influencing estimators. Further discussion of the use of progression‐free survival as a primary endpoint in oncology can be found in 2, 3, 27, 28.

We have restricted attention to an idealized setting in which the treatment effect is the same for progression and death in order to simplify discussion and focus attention on the effects of intermittent assessments.

Frydman and Szarek 9 discuss the issue of interval‐censored progression times in the context of nonparametric estimation of the progression‐free survival distribution. Boruvka and Cook 29 consider estimation and inference with the Cox model when two competing events are subject to different censoring schemes and interest lies in the transition intensity ratios. They point out that the present problem can be cast in this dual censoring framework, and hence, their methods could be utilized in this context. We have focused here, however, on the study of currently used methods to alert researchers to their limitations. Of particular importance is our finding that the standard practice of using the surrogate progression‐free survival time can lead to a substantial drop in the power of the trial when assessments are infrequent. To ensure trials are adequately powered, sample sizes should be increased to offset the loss of power associated with the use of the surrogate event time S. One approach is to derive the sample size formula from the three‐state illness‐death model, which accommodates the interval‐censored progression times; a piecewise constant baseline hazard function can be adopted to ease the calculation. Alternatively, one can adjust the sample size under the misspecified model by deriving the limiting value of the naive estimator and incorporating its bias and robust large sample variance in the calculations, so that power is maintained at the nominal level despite the misspecification.
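The second approach can be illustrated with a short Python sketch based on the standard normal approximation for a two‐sided Wald test. Everything numerical here is hypothetical: `unit_var` stands for the large sample variance of $\sqrt{n}$ times the centred estimator, the limiting value `gamma_star` is taken to be 90% of β purely for illustration, and the same variance is used under both models although the robust variance under misspecification would generally differ.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

ND = NormalDist()


def wald_sample_size(effect, unit_var, alpha=0.05, power=0.80):
    """Smallest n giving the target power for a two-sided Wald test."""
    z_a = ND.inv_cdf(1 - alpha / 2)
    z_b = ND.inv_cdf(power)
    return ceil((z_a + z_b) ** 2 * unit_var / effect ** 2)


def wald_power(effect, unit_var, n, alpha=0.05):
    """Approximate power at sample size n (the far rejection tail is ignored)."""
    z_a = ND.inv_cdf(1 - alpha / 2)
    return ND.cdf(abs(effect) * sqrt(n / unit_var) - z_a)


# Hypothetical inputs: true effect, attenuated limiting value, unit variance.
beta = log(0.5)
gamma_star = 0.9 * beta
unit_var = 7.0

n_planned = wald_sample_size(beta, unit_var)         # planned with the true PFS time
n_adjusted = wald_sample_size(gamma_star, unit_var)  # allows for attenuation
power_naive = wald_power(gamma_star, unit_var, n_planned)
print(n_planned, n_adjusted, round(power_naive, 3))
```

Because the limiting value enters the formula squared, even a modest attenuation of the estimated treatment effect calls for a noticeably larger sample.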

These issues, while motivated by the setting of cancer clinical trials, arise in many other areas of health research involving joint consideration of non‐fatal events and death. Examples include osteoporosis trials, in which elderly patients are at risk of both asymptomatic fractures (e.g. vertebral compression) and death and the fractures are only detected by periodic bone scans, and studies of cognitive impairment, in which periodic assessments may be scheduled and carried out until death or administrative censoring.

Acknowledgements

This research was supported by grants from the Natural Sciences and Engineering Research Council of Canada to L. Zeng (RGPIN 327107) and R. J. Cook (RGPIN 155849) and the Canadian Institutes for Health Research to R. J. Cook (FRN 13887). L. Zeng is a Graham Trust Chair in Health Statistics, and R. J. Cook is a Canada Research Chair in Statistical Methods for Health Research.

Appendix A.

A.1. Asymptotic properties of treatment effect estimates from Cox regression on surrogate PFS

We set $P(X = 1) = P(X = 0) = 0.5$ and take the transition intensities in the illness‐death model to have the multiplicative form $\lambda_{jk}(t) = \lambda_{jk}\exp(x\beta_{jk})$. As before, values for the baseline intensities $\lambda_{jk}$ are based on specifications for $P(T_{01} < T_{02} \mid X = 0) = \lambda_{01}/(\lambda_{01} + \lambda_{02})$, $\lambda_{12}/\lambda_{02}$ and $\pi_A$; see Section 3.1. We set $\beta_{01} = \beta_{02} = \beta \in \{\log 0.9, \log 0.75, \log 0.5\}$ for mild, moderate and strong (beneficial) treatment effects. We let $\beta_{12} = \log 1$, so that treatment has no effect on the intensity of the progression‐to‐death transition.

Figure A.1 displays the percent relative asymptotic bias, $100(\gamma^* - \beta)/\beta$, the asymptotic power of a Wald test of no treatment effect and the asymptotic coverage probability of a 95% CI when fitting a Cox model based on the surrogate progression‐free survival time under an ignorable random assessment process with intensity $\alpha(t) = \alpha$. The treatment effect is under‐estimated in general, with the percent relative biases ranging from about 3% to 15%; see Figure 5. As expected, the magnitude of the bias increases as the rate of assessments decreases and as the probability that progression precedes death, $P(T_{01} < T_{02} \mid X = 0)$, increases. The size of the treatment effect itself seems to have very little impact on the magnitude of the percent relative bias. For each parameter configuration, we derived the sample size required to achieve 80% power for a two‐sided Wald test of the treatment effect at the 5% significance level under the Cox model for the progression‐free survival analysis. We then calculated the asymptotic power that one can actually obtain when the surrogate time is used for analysis. Figure A.1(c, d) shows that using the surrogate time can lead to an appreciable loss in power, which increases as the assessment rate decreases and the probability of progression increases. For example, when $\alpha = 2$ and $P(T_{01} < T_{02} \mid X = 0) = 0.8$, the power from the naive progression‐free survival analysis can drop to as low as 63% when the nominal level is 80%. The asymptotic coverage probability of the 95% CI, on the other hand, is quite robust to the model misspecification. Analogous results are found when the assessment times are fixed and, hence, are not reported here.

Figure A.1.

Asymptotic percent relative bias, $100(\gamma^* - \beta)/\beta$, asymptotic power and asymptotic coverage probability of the Cox regression coefficient of treatment effect from progression‐free survival analysis using imputed data; here, $\beta_{01} = \beta_{02} = \beta \in \{\log 0.5, \log 0.75, \log 0.9\}$, $\beta_{12} = \log 1$, $\lambda_{12}/\lambda_{02} = 1.5$, $\pi_A = 0.2$ and $\pi_N = 0.4$.

Zeng, L., Cook, R. J., Wen, L., and Boruvka, A. (2015) Bias in progression‐free survival analysis due to intermittent assessment of progression. Statist. Med., 34: 3181–3193. doi: 10.1002/sim.6529.

The copyright line for this article was changed on 18 June 2015 after original online publications.

References

  • 1. Saad ED, Katz A, Buyse M. Overall survival and post‐progression survival in advanced breast cancer: a review of recent randomized clinical trials. Journal of Clinical Oncology 2010. 28(11): 1958–1962.
  • 2. Booth CM, Eisenhauer EA. Progression‐free survival: meaningful or simply measurable? Journal of Clinical Oncology 2012. 30(10): 1030–1033.
  • 3. Fleming TR, Rothmann MD, Lu HL. Issues in using progression‐free survival when evaluating oncology products. Journal of Clinical Oncology 2009. 27(17): 2874–2880.
  • 4. Miksad RA, Zietemann V, Gothe R, Schwarzer R, Conrads‐Frank A, Schnell‐Inderst P, Stollenwerk B, Siebert U. Progression‐free survival as a surrogate endpoint in advanced breast cancer. International Journal of Technology Assessment in Health Care 2009. 24(4): 371–383.
  • 5. Sidhu R, Rong A, Dahlberg S. Evaluation of progression‐free survival as a surrogate endpoint for survival in chemotherapy and targeted agent metastatic colorectal cancer trials. Clinical Cancer Research 2013. 19(5): 969–976.
  • 6. Ballman KV, Buckner JC, Brown PD, Giannini C, Flynn PJ, LaPlant BR, Jaeckle KA. The relationship between six‐month progression‐free survival and 12‐month overall survival end points for phase II trials in patients with glioblastoma multiforme. Neuro‐Oncology 2007. 9(1): 29–38.
  • 7. Broglio KR, Berry DA. Detecting an overall survival benefit that is derived from progression‐free survival. Journal of the National Cancer Institute 2009. 101(23): 1642–1649.
  • 8. Saad ED, Katz A. Progression‐free survival and time to progression as primary end points in advanced breast cancer: often used, sometimes loosely defined. Annals of Oncology 2009. 20(3): 460–464.
  • 9. Frydman H, Szarek M. Nonparametric estimation in a Markov “illness‐death” process from interval censored observations with missing intermediate transition status. Biometrics 2009. 65(1): 143–151.
  • 10. FDA. Guidance for industry: Clinical trial endpoints for the approval of cancer drugs and biologics, Clinical/Medical Guidances. U.S. Department of Health and Human Services, 2007. http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM259421.pdf [Accessed on 5 May 2015].
  • 11. Dejardin D, Lesaffre E, Verbeke G. Joint modeling of progression‐free survival and death in advanced cancer clinical trials. Statistics in Medicine 2010. 29: 1724–1734.
  • 12. Leffondré K, Touraine C, Helmer C, Joly P. Interval‐censored time‐to‐event and competing risk with death: is the illness‐death model more accurate than the Cox model? International Journal of Epidemiology 2013. 42(4): 1177–1186.
  • 13. Panageas KS, Ben‐Porat L, Dickler MN, Chapman PB, Schrag D. When you look matters: the effect of assessment schedule on progression‐free survival. Journal of the National Cancer Institute 2007. 99(6): 428–432.
  • 14. Binder N, Schumacher M. Missing information caused by death leads to bias in relative risk estimates. Journal of Clinical Epidemiology 2014. 67(10): 1111–1120.
  • 15. White H. Maximum likelihood estimation of misspecified models. Econometrica 1982. 50(1): 1–26.
  • 16. Struthers CA, Kalbfleisch JD. Misspecified proportional hazard models. Biometrika 1986. 73(1): 363–369.
  • 17. Lin DY, Wei LJ. The robust inference for the Cox proportional hazards model. Journal of the American Statistical Association 1989. 84(408): 1074–1078.
  • 18. Cox DR, Miller HD. The Theory of Stochastic Processes, First edn. Wiley: New York, 1965.
  • 19. Scher HI, Fizazi K, Saad F, Taplin M, Sternberg CN, Miller K, Wit R, Mulders P, Chi KN, Shore ND, Armstrong AJ, Flaig TW, Flechon A, Mainwaring P, Fleming M, Hainsworth JD, Hirmand M, Selby B, Seely L, Bono JS, for the AFFIRM Investigators. Increased survival with enzalutamide in prostate cancer after chemotherapy. The New England Journal of Medicine 2012. 367(13): 1187–1197.
  • 20. Hogan JW, Roy J, Korkontzelou C. Handling dropouts in longitudinal studies. Statistics in Medicine 2004. 23(9): 1455–1497.
  • 21. Grüger J, Kay R, Schumacher M. The validity of inferences based on incomplete observations in disease state models. Biometrics 1991. 47(2): 595–605.
  • 22. Wu L, Cook RJ. Misspecification of Cox regression models with composite endpoints. Statistics in Medicine 2012. 31(28): 3545–3562.
  • 23. Cook RJ, Lee K‐A. Statistical considerations in the use of composite endpoints in clinical trials. In Developments in statistical evaluation in clinical trials, van Montfort K, Oud J, Ghidey W (eds). Springer Science + Business Media: New York, NY, 2014.
  • 24. Kalbfleisch JD, Prentice RL. The statistical analysis of failure time data, Second edn, Wiley Series in Probability and Statistics. Wiley‐Interscience, John Wiley & Sons, Inc.: Hoboken, New Jersey, 2002.
  • 25. Lawless JF. Statistical models and methods for lifetime data, Second edn, Wiley Series in Probability and Statistics. Wiley‐Interscience, John Wiley & Sons, Inc.: Hoboken, New Jersey, 2003.
  • 26. Jackson CH. Multi‐state models for panel data: the msm package for R. Journal of Statistical Software 2011. 38(8): 1–28.
  • 27. Buyse M, Burzykowski T, Carroll K, Michiels S, Sargent DJ, Miller LL, Elfring GL, Pignon JP, Piedbois P. Progression‐free survival is a surrogate for survival in advanced colorectal cancer. Journal of Clinical Oncology 2007. 25(33): 5218–5224.
  • 28. Soria JC, Massard C, Chevalier TL. Should progression‐free survival be the primary measure of efficacy for advanced NSCLC therapy? Annals of Oncology 2010. 21(12): 2324–2332.
  • 29. Boruvka A, Cook RJ. Sieve estimation in a Markov illness‐death process under dual censoring. Resubmitted, Biostatistics 2015.

Articles from Statistics in Medicine are provided here courtesy of Wiley
