Journal of Applied Statistics. 2022 Nov 14;51(4):646–663. doi: 10.1080/02664763.2022.2145272

The heterogeneity effect of surveillance intervals on progression free survival

Zihang Zhong a, Min Yang a, Senmiao Ni a, Lixin Cai a, Jingwei Wu b, Jianling Bai a, Hao Yu a
PMCID: PMC10896158  PMID: 38414801

Abstract

Progression-free survival (PFS) is an increasingly important surrogate endpoint in cancer clinical trials. However, the true time of progression is typically unknown because progression status is evaluated only at scheduled surveillance intervals. In addition, it is not uncommon for treatment arms to be compared under different surveillance schedules. Our aim is to explore whether heterogeneity of the surveillance intervals may interfere with the validity of efficacy conclusions based on PFS, and the extent to which this variation biases the results. We conduct comprehensive simulation studies of a two-arm randomized controlled trial. We introduce three steps to simulate survival data with predefined surveillance intervals under different censoring rates. We report the estimated hazard ratios and examine the false positive rate, power and bias under different surveillance intervals, given different baseline median PFS, hazard ratio and censoring rate settings. Results show that greater heterogeneity in the lengths of the surveillance intervals leads to a higher false positive rate and overestimated power, and that the effect of heterogeneous surveillance intervals may depend on both the life expectancy associated with the tumor prognosis and the censoring proportion of the survival data. We also demonstrate this heterogeneity effect of surveillance intervals on PFS in a phase III metastatic colorectal cancer trial. In our opinion, consistent surveillance intervals should be favored when designing comparative trials. Otherwise, the heterogeneity needs to be appropriately taken into account when analyzing the data.

Keywords: Progression-free survival, surveillance interval, cancer clinical trial, false positive rate, power

1. Introduction

Cancer is considered one of the main threats to human health worldwide. In 2018, there were an estimated 18.1 million new cancer cases and 9.6 million cancer deaths around the world [6]. In China, cancer is the leading cause of death and imposes a heavy burden on individual patients and society [15].

Over the last few decades, driven by the need for efficient treatments for these serious and life-threatening tumors, increasingly innovative and advanced trial designs have been used to improve clinical outcomes and survival endpoints. In oncology clinical trials, overall survival (OS), defined as the time from the start of treatment to death or loss to follow-up, remains the gold standard endpoint [11]. It not only offers the most objective assessment of efficacy but also provides an uncompromised measure of quality of life in cancer patients [7]. However, it requires both large sample sizes and prolonged follow-up, and may also be confounded by post-protocol therapy. These concerns have encouraged researchers to develop and validate reliable alternative surrogate endpoints for treatment efficacy. In 2018, the U.S. Food and Drug Administration published a list of ‘surrogate endpoints which were the basis of approval or license of a drug or a biological product’ [50]. According to this list, progression free survival (PFS), defined as the time from randomization until the first evidence of objective tumor progression or death, is considered one of the acceptable surrogate endpoints in planned clinical trials for solid tumors and hematological malignancies. In fact, PFS has been commonly used in a large number of phase III clinical trials to support the approval process for new agents [53]. PFS has gained increasing popularity mainly because it occurs earlier, is less affected by subsequent therapy, and is more closely related to targeted agents [26]. Meanwhile, many studies have confirmed that PFS is highly correlated with OS, so it can serve as a reliable alternative endpoint for OS in advanced cancer, such as metastatic colorectal cancer [19,46,49], metastatic renal cell carcinoma [14], and metastatic medullary thyroid cancer [45]. On the other hand, the use of PFS remains controversial because of unavoidable issues of its own, including measurement error, missing assessments, inconsistent subjectivity in tumor assessments, heterogeneous assessment schedules, and multiple biases pertaining to its definition [26,41,47].

Among these issues, tumor assessment based on heterogeneous surveillance intervals across treatment arms is not uncommon and may result in unexpected bias. In a two-arm randomized trial, the reassessment of tumor lesions is typically scheduled after a fixed number of treatment cycles, or at time points defined a priori in the protocol as a multiple of the treatment cycle length [34], e.g. every 6 or 8 weeks (42 or 56 days). However, treatment cycle lengths typically vary between treatment groups. For example, in a randomized phase III study comparing the efficacy of Temozolomide versus Dacarbazine (DTIC) in patients with advanced metastatic malignant melanoma, the treatment cycle was 28 days in the Temozolomide arm and 21 days in the DTIC arm [24]. The scheduled radiological assessments of progression were performed before treatment started and repeated after every second cycle in both arms. Furthermore, the initial tumor surveillance intervals may be modified in a given trial; for example, patients on the arm with inferior efficacy may be scheduled for more frequent tumor scans to monitor tumor progression than those on the arm with superior efficacy [8]. In assessing PFS, patients are evaluated only at the scheduled time points determined by the prespecified surveillance intervals in each treatment arm. The true progression time is therefore known only to lie between two assessment points, unless it happens to coincide with a scheduled time point or the patient died (because the exact death time can be obtained from the death certificate). A practical approach is to replace the actual event time with the right assessment point and to apply widely used standard survival methods for right-censored data to the imputed data [22,48]. Even so, this straightforward approach may still complicate the interpretation of PFS in a clinical trial if patients in different treatment arms are assessed on different surveillance intervals.

These concerns have encouraged us to conduct comprehensive simulation studies to explore whether the heterogeneity of the surveillance intervals may interfere with the validity of the conclusion of efficacy based on PFS and also to assess the extent to which the variation will bias the results. This paper is organized as follows. In Section 2, we introduce a general framework for simulating survival data with predefined surveillance intervals. In Section 3, we conduct extensive simulation studies to assess the heterogeneity effect of surveillance intervals on PFS. In Section 4, we demonstrate the impact of the heterogeneity of the surveillance intervals on PFS using a phase III metastatic colorectal cancer trial application. We provide some suggestions for practical considerations in Section 5.

2. Survival data with predefined surveillance intervals

In this section, we introduce three steps to simulate survival data with predefined surveillance intervals: (i) simulate time to event T; (ii) simulate time to censoring C; and (iii) simulate actual PFS data and surveillance interval determined PFS data.

2.1. Parametric distribution of time to event T incorporating the covariates

Common parametric distributions for the time to event T include the exponential, Weibull, gamma, log-normal and log-logistic distributions [3,43]. Among them, the Weibull distribution is attractive because it is characterized by both a scale parameter and a shape parameter [31]. The shape parameter makes it adequate for describing a monotone failure rate over the study interval. For this reason, in this paper we focus on simulating the time to event T from a Weibull distribution. A more detailed description of the simulation of T incorporating the covariates is provided in Appendix A.1.

2.2. Parametric distribution of time to censoring C under predefined censoring rate

Often, the time to censoring C is assumed to follow a prespecified distribution, such as a uniform, exponential, or Weibull distribution, with a designated censoring parameter [42]. However, using such a censoring parameter to generate censored survival data usually does not guarantee a desired censoring rate p. Although some studies have found no serious bias in the proportional hazards case regardless of the censoring rate, others have found that a higher censoring proportion is associated with higher bias and lower power, and have therefore recommended minimizing the censoring rate whenever possible [23,29,37].

Fei Wan [52] proposed a numerical approach to simulate right-censored survival data for a given censoring proportion by incorporating a variety of combinations of baseline hazard function, censoring distribution and a set of covariates. Using this method, if we assume the baseline hazard follows a Weibull distribution, we may simulate time to censoring C under any predefined censoring rate. More technical details are provided in Appendix A.2.
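The idea of solving numerically for a censoring parameter can be illustrated with a minimal sketch. It is not Wan's published algorithm: it assumes a Uniform(0, θ) censoring distribution, a Weibull event time with survival function exp(-(t/scale)^shape), and no covariates, so that the censoring probability is P(C < T) = (1/θ) ∫₀^θ S(t) dt. The function names are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape) for a Weibull event time."""
    return math.exp(-((t / scale) ** shape))


def censoring_rate(theta, shape, scale, n_grid=2000):
    """P(C < T) for C ~ Uniform(0, theta), via the midpoint rule.

    Equals (1 / theta) * integral_0^theta S(t) dt.
    """
    step = theta / n_grid
    total = sum(
        weibull_survival((i + 0.5) * step, shape, scale) for i in range(n_grid)
    )
    return total * step / theta


def solve_theta(target_rate, shape, scale, tol=1e-6):
    """Bisection for theta; the censoring rate decreases as theta grows."""
    if not 0.0 < target_rate < 1.0:
        raise ValueError("target_rate must lie strictly between 0 and 1")
    lo, hi = 1e-6 * scale, scale
    while censoring_rate(hi, shape, scale) > target_rate:
        hi *= 2.0  # widen the bracket until the rate drops below the target
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if censoring_rate(mid, shape, scale) > target_rate:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Example: shape 1.2 and a baseline median of 91 days (illustrative values).
scale_91 = 91.0 / math.log(2) ** (1.0 / 1.2)
theta_30 = solve_theta(0.30, 1.2, scale_91)
```

With covariates in the model, the survival function would additionally have to be averaged over the covariate distribution, which this sketch omits.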

2.3. Actual PFS data and surveillance interval determined PFS data

Progression free survival is defined as the time from randomization to the first evidence of objective tumor progression or death [21]. The actual progression free survival data can be simulated from the time to event T and the time to censoring C following Sections 2.1–2.2 and Appendices A.1–A.2. For any individual who died without detected progression, the actual PFS is the time from randomization to death; otherwise, the actual PFS is the time from randomization to either the first evidence of objective tumor progression or the time of censoring, whichever comes first. An individual who has experienced neither progression nor death by the end of the study is considered censored without an event.

In a clinical trial with predefined surveillance intervals, progression status can only be determined from periodic radiographic assessments, while both death time and survival status can be obtained exactly. Under predefined surveillance intervals, for any individual whose tumor progression is detected, the actual PFS is usually unknown; only a time interval containing it is observed (interval censoring). A conventional approach is to define the surveillance interval determined PFS as the closest assessment point that is greater than or equal to the actual PFS [48]. For any individual who died without detected progression, the surveillance interval determined PFS is still defined as the exact death time and therefore equals the actual PFS. For any individual who is censored, the surveillance interval determined PFS is defined as the time from randomization to the last assessment that is earlier than or equal to the actual PFS.
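The three rules above can be sketched as a small function. This is an illustration only: the function name and the event labels are hypothetical, and assessments are assumed to occur at exact multiples of the surveillance interval counted from randomization.

```python
import math


def si_determined_pfs(actual_time, event_type, interval):
    """Map an actual PFS time to its surveillance interval determined value.

    event_type is one of "progression", "death" or "censored" (labels are
    assumptions for this sketch); interval is the surveillance interval in
    the same time unit as actual_time.
    """
    if event_type == "death":
        # Death times are known exactly, so nothing changes.
        return actual_time
    if event_type == "progression":
        # First scheduled assessment at or after the true progression time.
        return math.ceil(actual_time / interval) * interval
    if event_type == "censored":
        # Last scheduled assessment at or before the censoring time.
        return math.floor(actual_time / interval) * interval
    raise ValueError(f"unknown event_type: {event_type!r}")


# A progression at day 100 is recorded at day 126 under a 42-day schedule
# and at day 168 under an 84-day schedule.
recorded_42 = si_determined_pfs(100, "progression", 42)
recorded_84 = si_determined_pfs(100, "progression", 84)
```

The same true progression time is thus recorded later in the arm with the longer interval, which is the mechanism examined in the simulations.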

3. Simulation studies

We conduct simulation studies to explore the effect of heterogeneity of the surveillance intervals on the validity of efficacy conclusions based on PFS in a two-arm randomized controlled trial. We compare the analytical results between the actual PFS data and the surveillance interval determined PFS data under various scenarios with different baseline median survival times, censoring rates, and lengths of surveillance intervals.

3.1. Data generation

(1) Actual PFS data: Sections 2.1–2.2 and Appendices A.1–A.2 are used to generate the time to event T and the time to censoring C. To generate T, we define the baseline hazard function $h_0(t) = \alpha \lambda^{-\alpha} t^{\alpha-1}$ from Weibull($\alpha$, $\lambda$), where the shape parameter $\alpha$ is set to 1.2 and the scale parameter $\lambda$ is determined by $\lambda = t_M (\log 2)^{-1/\alpha}$. In the simulation, $t_M$, the predefined baseline median survival time, varies from 91 days (i.e. 13 weeks) to 147 days (i.e. 21 weeks) in increments of either 7 or 14 days. To generate C, we assume it follows a Uniform(0, $\theta$) distribution, where the censoring parameter $\theta$ is computationally determined based on Wan's approach to yield censoring proportions of 10% to 50%. We also set a 5% death rate among the uncensored subjects.

In the present simulation studies, the covariates X in the Cox proportional hazards model are generated as a mixture of discrete and continuous variables. Specifically, four independent variables are generated separately from three different distributions: $X_1$ and $X_2$ from a Bernoulli distribution B(p = 0.5), $X_3$ from a Normal distribution N(0, 1) and $X_4$ from a Uniform distribution Uniform(0, 1). $X_1$ is a binary variable indicating the treatment allocation (the subject received either treatment A or treatment B). The regression coefficients are defined as $\beta_1 = \log$(hazard ratio of treatment B versus treatment A), $\beta_2 = 0.1$, $\beta_3 = 0.2$ and $\beta_4 = 0.3$. The hazard ratio ranges from 0.5 to 1, based on the setting that the predefined baseline median survival time in treatment B is either the same as in treatment A or 1.1 to 2 times that in treatment A.
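As a rough sketch of how such event times could be drawn, the block below uses the standard inverse-transform method for a Cox model with a Weibull baseline. Appendix A.1 is not reproduced here, so the details are assumptions: the baseline survival function is taken to be exp(-(t/λ)^α) with λ the scale, the baseline median refers to a subject with all covariates equal to zero, $X_1 = 1$ is taken to denote treatment B, and the function names are hypothetical.

```python
import math
import random

SHAPE = 1.2  # Weibull shape parameter alpha used in the simulation


def weibull_scale_from_median(median, shape=SHAPE):
    """Scale lambda such that exp(-(median / lambda) ** shape) = 0.5."""
    return median / math.log(2) ** (1.0 / shape)


def simulate_subject(rng, baseline_median, hazard_ratio, shape=SHAPE):
    """Draw covariates and one event time under a Cox-Weibull model.

    Returns (x1, t), where x1 = 1 is assumed to indicate treatment B.
    """
    x1 = 1 if rng.random() < 0.5 else 0   # treatment indicator, B(p = 0.5)
    x2 = 1 if rng.random() < 0.5 else 0   # B(p = 0.5)
    x3 = rng.gauss(0.0, 1.0)              # N(0, 1)
    x4 = rng.random()                     # Uniform(0, 1)
    linear_predictor = (
        math.log(hazard_ratio) * x1 + 0.1 * x2 + 0.2 * x3 + 0.3 * x4
    )
    scale = weibull_scale_from_median(baseline_median, shape)
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    # Invert S(t | x) = exp(-(t / scale) ** shape * exp(linear_predictor)).
    t = scale * (-math.log(u) / math.exp(linear_predictor)) ** (1.0 / shape)
    return x1, t


rng = random.Random(2022)
cohort = [simulate_subject(rng, 91.0, 0.5) for _ in range(300)]
```

Censoring times, the 5% death rate and the surveillance schedule would be layered on top of these event times; they are left out to keep the sketch short.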

(2) Surveillance interval defined PFS data: Section 2.3 is applied to the actual PFS data to generate the surveillance interval determined PFS data based on heterogeneous lengths of surveillance intervals in treatment A and treatment B ($t_{SI.A}$ and $t_{SI.B}$). We consider three different settings: 42 vs. 56 days, 42 vs. 70 days and 42 vs. 84 days. These numbers are chosen to resemble typical treatment cycle lengths in a two-arm randomized controlled trial for tumor lesions, e.g. every 6, 8, 10, or 12 weeks.

3.2. Performance assessment

We evaluate the effect of heterogeneity of the surveillance intervals on the validity of PFS efficacy when: (1) the actual median PFS time is the same in treatment A and treatment B ($t_{M.A} = t_{M.B}$); (2) the actual median PFS time in treatment B is 1.1 to 2 times that in treatment A ($t_{M.B}/t_{M.A}$ is between 1.1 and 2.0). All simulation parameter settings are provided in Table 1.

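Table 1. Parameter settings in the simulation studies.

| Part | Median PFS of A (days) | Median PFS of B (days) | Surveillance interval of A/B (days) | Censoring rate |
|---|---|---|---|---|
| Part I | 91, 98, 105, 112, 119, 126, 133, 140, 147 | Same as A | 42/56 | 0.1–0.5 |
| Part I | 91, 98, 105, 112, 119, 126, 133, 140, 147 | Same as A | 42/70 | 0.1–0.5 |
| Part I | 91, 98, 105, 112, 119, 126, 133, 140, 147 | Same as A | 42/84 | 0.1–0.5 |
| Part II | 91, 105, 119, 133, 147 | 1.1–2 times that of A | 42/56 | 0.1–0.5 |
| Part II | 91, 105, 119, 133, 147 | 1.1–2 times that of A | 42/70 | 0.1–0.5 |
| Part II | 91, 105, 119, 133, 147 | 1.1–2 times that of A | 42/84 | 0.1–0.5 |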

To assess performance, we simulate 2000 runs with sample size n = 300 each and estimate the hazard ratio of B versus A using the Cox proportional hazards model, based on the actual PFS data and the surveillance interval defined PFS data separately. Conventional statistical methods treating PFS as right-censored data are applied in all model fits [22,48]. The mean (EMEAN) and empirical standard error (ESE) of the hazard ratios based on the latter are reported. We also compare the estimated hazard ratios with the actual hazard ratio in each dataset and report the empirical coverage probability (ECP), mean bias (EBIAS), relative bias (RBIAS) and mean square error (MSE). In addition, to demonstrate the effect of heterogeneity of the surveillance intervals on the validity of PFS efficacy, we calculate the false positive rate (FPR) when the actual median survival times are the same in treatment A and treatment B, and the power (POWER) when they are not.
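The summary measures can be computed from the per-run results roughly as follows. The exact formulas are not spelled out in the text, so the definitions below are common conventions adopted as assumptions: bias is taken on the hazard ratio scale against a single true value, coverage uses 95% confidence intervals, and the rejection rate is the share of runs with a two-sided p-value below 0.05. The function name and argument layout are hypothetical.

```python
from statistics import mean, stdev


def summarize_runs(hr_estimates, ci_bounds, p_values, true_hr, alpha=0.05):
    """Summarize simulated hazard ratio estimates across runs.

    hr_estimates: estimated hazard ratios, one per run
    ci_bounds:    (lower, upper) confidence limits, one pair per run
    p_values:     two-sided p-values for H0: HR = 1, one per run
    """
    emean = mean(hr_estimates)
    ebias = emean - true_hr
    return {
        "EMEAN": emean,
        "ESE": stdev(hr_estimates),
        "ECP": mean(1.0 if lo <= true_hr <= hi else 0.0 for lo, hi in ci_bounds),
        "EBIAS": ebias,
        "RBIAS": ebias / true_hr,
        "MSE": mean((hr - true_hr) ** 2 for hr in hr_estimates),
        # FPR when true_hr == 1, POWER otherwise.
        "REJECTION_RATE": mean(1.0 if p < alpha else 0.0 for p in p_values),
    }


# Toy input with three runs, only to show the call signature.
summary = summarize_runs(
    hr_estimates=[0.8, 1.0, 1.2],
    ci_bounds=[(0.6, 1.0), (0.8, 1.2), (1.05, 1.4)],
    p_values=[0.04, 0.50, 0.20],
    true_hr=1.0,
)
```

Under these definitions the MSE is approximately the squared ESE plus the squared EBIAS. The first row of Table 2 is consistent with this: 0.136² + 0.067² ≈ 0.023.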

3.3. Simulation results

First, we present the simulation results when the actual median PFS times are the same in treatment A and treatment B. Figure 1 shows the false positive rate curves under various censoring rates and heterogeneous lengths of surveillance intervals in treatment B, when the surveillance interval in treatment A is 42 days and the actual median PFS time ranges from 91 days to 147 days. A higher false positive rate is clearly associated with greater heterogeneity in the lengths of the surveillance intervals. For example, a surveillance interval of 84 days in treatment B leads to a substantially inflated false positive rate (ranging from 10% to 35%), whereas an interval of 56 days leads to only a moderately inflated false positive rate (up to 10%). This result confirms that heterogeneity of the surveillance intervals may interfere with the validity of efficacy conclusions based on PFS. Meanwhile, the false positive rate curves decline toward the nominal value of 5% as the actual median PFS time gets longer. This implies that the effect of heterogeneous surveillance intervals is attenuated when the median PFS time is sufficiently long. We can also see that the inflation of the false positive rate due to heterogeneous surveillance intervals tends to be smaller when the censoring rate increases. This is because, as censoring becomes heavier, the proportion of observed event times decreases, and so does the effect of heterogeneous surveillance intervals on the PFS data.

Figure 1. The false positive rate curves when the actual median survival times are the same in the two treatment arms, under various censoring rates and heterogeneous lengths of surveillance intervals. The black curve represents the proportion of iterations in which the null hypothesis is rejected using the actual PFS data.

Table 2 summarizes the performance results based on the heterogeneous surveillance interval defined PFS data when the actual median PFS times are the same in treatment A and treatment B, with a fixed censoring rate of 0.3. Although all the EMEAN values are close to 1 (the nominal hazard ratio), they are all below 1. This is not surprising: the shorter surveillance interval in treatment A increases the frequency of assessment, so tumor progression tends to be detected earlier in that arm. Therefore, the estimated hazard ratio based on the heterogeneous surveillance interval defined PFS data is always less than one (i.e. it suggests that treatment A is less efficacious than treatment B even though the two are actually similar). This impact is more substantial when the heterogeneity in the lengths of the surveillance intervals is greater. Similarly, the (absolute) EBIAS and RBIAS, as well as the MSE, increase with greater heterogeneity in the lengths of the surveillance intervals. This becomes more obvious when the actual median survival time is shorter. In addition, the ECP values are commonly below 0.95 and can be as low as 0.742. Figure 2 further plots the EBIAS under different censoring rates and shows similar patterns across censoring rates.
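The mechanism behind this downward bias can be seen in a minimal sketch that applies two assessment schedules to the same set of true progression times. It ignores censoring, deaths and covariates, and it only compares detection delays rather than fitting a Cox model; the function names and the seed are arbitrary choices for illustration.

```python
import math
import random


def recorded_progression(true_time, interval):
    """First scheduled assessment at or after the true progression time."""
    return math.ceil(true_time / interval) * interval


def mean_detection_delay(true_times, interval):
    """Average gap between recorded and true progression times."""
    gaps = [recorded_progression(t, interval) - t for t in true_times]
    return sum(gaps) / len(gaps)


rng = random.Random(2022)
shape = 1.2
scale = 91.0 / math.log(2) ** (1.0 / shape)  # baseline median of 91 days
# random.weibullvariate takes the scale first, then the shape.
true_times = [rng.weibullvariate(scale, shape) for _ in range(10000)]

delay_a = mean_detection_delay(true_times, 42)  # treatment A schedule
delay_b = mean_detection_delay(true_times, 84)  # treatment B schedule
print(f"mean delay, 42-day schedule: {delay_a:.1f} days")
print(f"mean delay, 84-day schedule: {delay_b:.1f} days")
```

Because every multiple of 84 days is also a multiple of 42 days, the 84-day schedule can never record a progression earlier than the 42-day schedule. Identical underlying event times therefore look longer in the arm with the longer interval.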

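Table 2. Performance assessment of the estimated hazard ratios when the actual median survival times are the same in treatment A and treatment B, based on 2000 simulations with censoring rate = 0.3.

| tM.A | tM.B | tSI.A | tSI.B | EMEAN | ESE | ECP | EBIAS | RBIAS | MSE | FPR |
|---|---|---|---|---|---|---|---|---|---|---|
| 91 | 91 | 42 | 56 | 0.946 | 0.136 | 0.909 | −0.067 | −0.067 | 0.023 | 0.083 |
| 91 | 91 | 42 | 70 | 0.908 | 0.130 | 0.859 | −0.105 | −0.104 | 0.028 | 0.128 |
| 91 | 91 | 42 | 84 | 0.848 | 0.121 | 0.742 | −0.165 | −0.163 | 0.042 | 0.237 |
| 98 | 98 | 42 | 56 | 0.954 | 0.137 | 0.923 | −0.060 | −0.059 | 0.022 | 0.062 |
| 98 | 98 | 42 | 70 | 0.918 | 0.132 | 0.874 | −0.095 | −0.094 | 0.027 | 0.107 |
| 98 | 98 | 42 | 84 | 0.863 | 0.123 | 0.770 | −0.150 | −0.148 | 0.038 | 0.203 |
| 105 | 105 | 42 | 56 | 0.963 | 0.140 | 0.919 | −0.055 | −0.054 | 0.023 | 0.068 |
| 105 | 105 | 42 | 70 | 0.929 | 0.137 | 0.878 | −0.089 | −0.087 | 0.027 | 0.104 |
| 105 | 105 | 42 | 84 | 0.878 | 0.128 | 0.789 | −0.140 | −0.138 | 0.036 | 0.183 |
| 112 | 112 | 42 | 56 | 0.951 | 0.133 | 0.936 | −0.050 | −0.050 | 0.020 | 0.064 |
| 112 | 112 | 42 | 70 | 0.918 | 0.130 | 0.906 | −0.082 | −0.082 | 0.024 | 0.093 |
| 112 | 112 | 42 | 84 | 0.873 | 0.122 | 0.824 | −0.128 | −0.128 | 0.031 | 0.175 |
| 119 | 119 | 42 | 56 | 0.968 | 0.138 | 0.937 | −0.045 | −0.044 | 0.021 | 0.058 |
| 119 | 119 | 42 | 70 | 0.936 | 0.134 | 0.904 | −0.077 | −0.076 | 0.024 | 0.085 |
| 119 | 119 | 42 | 84 | 0.893 | 0.127 | 0.843 | −0.120 | −0.119 | 0.030 | 0.141 |
| 126 | 126 | 42 | 56 | 0.967 | 0.140 | 0.926 | −0.041 | −0.041 | 0.021 | 0.070 |
| 126 | 126 | 42 | 70 | 0.937 | 0.136 | 0.904 | −0.071 | −0.071 | 0.023 | 0.087 |
| 126 | 126 | 42 | 84 | 0.896 | 0.130 | 0.849 | −0.112 | −0.111 | 0.029 | 0.138 |
| 133 | 133 | 42 | 56 | 0.969 | 0.141 | 0.931 | −0.040 | −0.039 | 0.021 | 0.067 |
| 133 | 133 | 42 | 70 | 0.939 | 0.136 | 0.909 | −0.069 | −0.069 | 0.023 | 0.084 |
| 133 | 133 | 42 | 84 | 0.902 | 0.131 | 0.856 | −0.106 | −0.105 | 0.028 | 0.134 |
| 140 | 140 | 42 | 56 | 0.976 | 0.138 | 0.943 | −0.036 | −0.036 | 0.020 | 0.053 |
| 140 | 140 | 42 | 70 | 0.947 | 0.133 | 0.925 | −0.065 | −0.064 | 0.022 | 0.066 |
| 140 | 140 | 42 | 84 | 0.912 | 0.128 | 0.882 | −0.100 | −0.098 | 0.026 | 0.104 |
| 147 | 147 | 42 | 56 | 0.978 | 0.135 | 0.942 | −0.033 | −0.033 | 0.019 | 0.053 |
| 147 | 147 | 42 | 70 | 0.951 | 0.131 | 0.924 | −0.060 | −0.059 | 0.021 | 0.066 |
| 147 | 147 | 42 | 84 | 0.918 | 0.126 | 0.884 | −0.093 | −0.092 | 0.025 | 0.104 |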

Figure 2. The estimated hazard ratio bias (EBIAS) curves based on surveillance-interval defined PFS.

Next, we present the simulation results when the actual median PFS time in treatment B is 1.1 to 2 times that in treatment A. Figure 3 shows the empirical power curves using a two-sided Wald test under various censoring rates and heterogeneous lengths of surveillance intervals, i.e. the proportion of the 2000 simulation iterations in which a test of equal efficacy between treatments A and B is rejected. In each plot, we also add the power curve based on the actual PFS data as the reference (black curve). We can see that greater heterogeneity in the lengths of the surveillance intervals yields higher power than the nominal one. For example, when the surveillance interval is 42 days in treatment A and 84 days in treatment B, the power curve (yellow curve) always lies above those for 70 days and 56 days (blue and red curves, respectively). This result mirrors the finding for the false positive rate: power cannot be accurately estimated from data with heterogeneous surveillance intervals, so efficacy cannot be appropriately determined from such PFS data. In addition, a larger difference in the actual median PFS time between treatments B and A results in higher power. For example, when the actual median PFS time in treatment B is 1.5 times that in treatment A, the test of equal efficacy is rejected nearly 80% of the time, irrespective of how much longer the heterogeneous surveillance interval used to define the PFS data is. The power of the test using heterogeneous lengths of surveillance intervals is comparable with that of the test using the actual PFS data when the actual median PFS time in both treatments is large (Figure 3, from top to bottom). Moreover, increasing the censoring rate from 0.1 to 0.5 (Figure 3, from left to right) slightly decreases the power of the test because of the increase in incomplete observations in the survival data.

Figure 3. The empirical power curves when the actual median survival time in treatment B is 1.1 to 2 times that in treatment A, under various censoring rates and heterogeneous lengths of surveillance intervals. The black curve represents the proportion of iterations in which the null hypothesis is rejected using the actual PFS data.

Table 3 summarizes the performance results based on the heterogeneous surveillance interval defined PFS data when the actual median PFS time in treatment A is 91 days and the actual median PFS time in treatment B is 1.1 to 2 times that in treatment A, with a fixed censoring rate of 0.3. As in Table 2, all the EMEAN values are below 1, because not only is the surveillance interval in treatment B longer than that in treatment A, but the actual median PFS time in treatment B is also longer than that in treatment A. All the EBIAS estimates are negative, implying that using heterogeneous surveillance intervals generally produces a slightly smaller hazard ratio estimate than the actual one, and the corresponding test typically has dramatically inflated power (i.e. it concludes that treatment A is significantly less efficacious than treatment B more often than the test based on the actual PFS data does). This impact becomes more pronounced when the heterogeneity in the lengths of the surveillance intervals is greater. However, as the difference in actual median survival time between treatments A and B increases, the (absolute) EBIAS and RBIAS, as well as the MSE, decrease, and the POWER approximately stabilizes at one regardless of the heterogeneous surveillance intervals. Figure 4 further plots the EBIAS under different censoring rates and shows similar patterns across censoring rates.

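Table 3. Performance assessment of the estimated hazard ratios when the actual median survival time in treatment A is 91 days and the actual median survival time in treatment B is 1.1 to 2 times that in treatment A, based on 2000 simulations with censoring rate = 0.3.

| tM.A | tM.B/tM.A | tSI.A | tSI.B | EMEAN | ESE | ECP | EBIAS | RBIAS | MSE | POWER |
|---|---|---|---|---|---|---|---|---|---|---|
| 91 | 1.1 | 42 | 56 | 0.864 | 0.124 | 0.921 | −0.054 | −0.059 | 0.018 | 0.205 |
| 91 | 1.1 | 42 | 70 | 0.831 | 0.121 | 0.878 | −0.087 | −0.095 | 0.022 | 0.286 |
| 91 | 1.1 | 42 | 84 | 0.779 | 0.112 | 0.768 | −0.139 | −0.151 | 0.032 | 0.453 |
| 91 | 1.2 | 42 | 56 | 0.795 | 0.115 | 0.922 | −0.046 | −0.055 | 0.015 | 0.389 |
| 91 | 1.2 | 42 | 70 | 0.766 | 0.113 | 0.874 | −0.075 | −0.089 | 0.018 | 0.485 |
| 91 | 1.2 | 42 | 84 | 0.722 | 0.104 | 0.781 | −0.119 | −0.142 | 0.025 | 0.659 |
| 91 | 1.3 | 42 | 56 | 0.738 | 0.107 | 0.920 | −0.039 | −0.050 | 0.013 | 0.586 |
| 91 | 1.3 | 42 | 70 | 0.711 | 0.104 | 0.885 | −0.066 | −0.084 | 0.015 | 0.692 |
| 91 | 1.3 | 42 | 84 | 0.672 | 0.097 | 0.802 | −0.105 | −0.135 | 0.020 | 0.820 |
| 91 | 1.4 | 42 | 56 | 0.687 | 0.098 | 0.929 | −0.033 | −0.046 | 0.011 | 0.776 |
| 91 | 1.4 | 42 | 70 | 0.662 | 0.096 | 0.892 | −0.058 | −0.080 | 0.013 | 0.842 |
| 91 | 1.4 | 42 | 84 | 0.628 | 0.090 | 0.825 | −0.092 | −0.128 | 0.017 | 0.918 |
| 91 | 1.5 | 42 | 56 | 0.643 | 0.093 | 0.928 | −0.029 | −0.043 | 0.010 | 0.886 |
| 91 | 1.5 | 42 | 70 | 0.621 | 0.091 | 0.903 | −0.052 | −0.077 | 0.011 | 0.928 |
| 91 | 1.5 | 42 | 84 | 0.590 | 0.085 | 0.838 | −0.083 | −0.123 | 0.014 | 0.967 |
| 91 | 1.6 | 42 | 56 | 0.607 | 0.087 | 0.931 | −0.026 | −0.041 | 0.008 | 0.950 |
| 91 | 1.6 | 42 | 70 | 0.585 | 0.085 | 0.901 | −0.047 | −0.075 | 0.009 | 0.965 |
| 91 | 1.6 | 42 | 84 | 0.557 | 0.080 | 0.846 | −0.075 | −0.119 | 0.012 | 0.985 |
| 91 | 1.7 | 42 | 56 | 0.571 | 0.083 | 0.934 | −0.022 | −0.037 | 0.007 | 0.981 |
| 91 | 1.7 | 42 | 70 | 0.551 | 0.081 | 0.907 | −0.042 | −0.070 | 0.008 | 0.990 |
| 91 | 1.7 | 42 | 84 | 0.526 | 0.076 | 0.857 | −0.067 | −0.113 | 0.010 | 0.996 |
| 91 | 1.8 | 42 | 56 | 0.537 | 0.079 | 0.942 | −0.021 | −0.037 | 0.007 | 0.995 |
| 91 | 1.8 | 42 | 70 | 0.519 | 0.077 | 0.910 | −0.039 | −0.069 | 0.007 | 0.996 |
| 91 | 1.8 | 42 | 84 | 0.496 | 0.073 | 0.858 | −0.062 | −0.111 | 0.009 | 1.000 |
| 91 | 1.9 | 42 | 56 | 0.510 | 0.073 | 0.947 | −0.018 | −0.035 | 0.006 | 0.998 |
| 91 | 1.9 | 42 | 70 | 0.494 | 0.072 | 0.920 | −0.034 | −0.065 | 0.006 | 1.000 |
| 91 | 1.9 | 42 | 84 | 0.472 | 0.069 | 0.875 | −0.056 | −0.106 | 0.008 | 1.000 |
| 91 | 2.0 | 42 | 56 | 0.485 | 0.072 | 0.931 | −0.017 | −0.033 | 0.005 | 0.999 |
| 91 | 2.0 | 42 | 70 | 0.470 | 0.070 | 0.910 | −0.032 | −0.065 | 0.006 | 0.999 |
| 91 | 2.0 | 42 | 84 | 0.450 | 0.067 | 0.875 | −0.052 | −0.104 | 0.007 | 1.000 |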

Figure 4. The estimated hazard ratio bias (EBIAS) curves based on surveillance-interval defined PFS.

4. Application to second-line treatment of metastatic colorectal cancer

In this section, we demonstrate the impact of heterogeneity of the surveillance intervals on efficacy assessment based on PFS in a real example. The NCT00339183 study is a phase III randomized clinical trial evaluating the efficacy of adding Panitumumab (6.0 mg/kg) to FOLFIRI (fluorouracil, leucovorin, and irinotecan) chemotherapy as second-line treatment of mCRC (metastatic colorectal cancer), with PFS as the primary endpoint [36]. In this study, a PFS event was defined as either radiological progression per mRECIST (modified RECIST), determined by blinded central review, or death. In addition, tumor response assessment per mRECIST was performed every 8 weeks for each subject in the two groups until disease progression. The investigators found a significant PFS improvement with Panitumumab plus FOLFIRI versus FOLFIRI alone in patients with WT (wild-type) KRAS tumors but not in those with MT (mutant) KRAS tumors [35,36]. These results have also been validated by other pooled analyses [1,38].

We retain the original PFS data, collected every 8 weeks, for patients in the FOLFIRI alone group (FOLFIRI alone) and re-calculate the surveillance interval determined PFS for those in the Panitumumab plus FOLFIRI group (Panitumumab + FOLFIRI) by applying surveillance interval lengths other than 8 weeks. From a review of multiple colorectal cancer trial protocols, we find that a typical assessment interval ranges from 8 weeks to 3 months (up to 12 weeks) [16,32,51]. Therefore, we re-analyze the data under three scenarios: A. the Panitumumab + FOLFIRI group has an 8-week surveillance interval (as in the original data); B. the Panitumumab + FOLFIRI group has a 10-week surveillance interval; C. the Panitumumab + FOLFIRI group has a 12-week surveillance interval. Scenarios B and C both represent heterogeneous surveillance interval lengths between the two groups. All analyses are stratified by tumor KRAS status and are conducted using the Cox proportional hazards model adjusted for ECOG status, prior bevacizumab exposure, and prior oxaliplatin exposure, as described in the original protocol.

In patients with WT KRAS tumors (n = 474), Panitumumab + FOLFIRI significantly improves PFS versus FOLFIRI alone under all three scenarios (Figure 5). The hazard ratio is estimated at 0.755 in scenario A (p = 0.005, Figure 5a), which is comparable to the published result, and at 0.654 and 0.636 in scenarios B and C, respectively, with p < 0.001 (Figure 5b,c). This clearly exemplifies that a more heterogeneous surveillance interval length leads to a much smaller hazard ratio estimate and much stronger statistical significance. In other words, it may exaggerate the actual efficacy of the treatment. In patients with MT KRAS tumors (n = 381), even though Panitumumab + FOLFIRI does not significantly improve PFS versus FOLFIRI alone under scenario A (Figure 6a, the same result as in the published article), the results become statistically significant when a slightly longer surveillance interval is used in the Panitumumab + FOLFIRI group (Figure 6b, HR = 0.802, p = 0.0437 for 10 weeks; Figure 6c, HR = 0.805, p = 0.0469 for 12 weeks). Thus, efficacy may be falsely concluded when heterogeneous surveillance interval lengths are used in the endpoint assessment.

Figure 5. Surveillance-interval defined progression-free survival in patients with wild-type KRAS tumors under different surveillance intervals. (a) Panitumumab + FOLFIRI: 8-week surveillance interval vs. FOLFIRI: 8-week surveillance interval (set as reference); (b) Panitumumab + FOLFIRI: 10-week surveillance interval vs. FOLFIRI: 8-week surveillance interval; (c) Panitumumab + FOLFIRI: 12-week surveillance interval vs. FOLFIRI: 8-week surveillance interval.

Figure 6. Surveillance-interval defined progression-free survival in patients with mutant KRAS tumors under different surveillance intervals. (a) Panitumumab + FOLFIRI: 8-week surveillance interval vs. FOLFIRI: 8-week surveillance interval (set as reference); (b) Panitumumab + FOLFIRI: 10-week surveillance interval vs. FOLFIRI: 8-week surveillance interval; (c) Panitumumab + FOLFIRI: 12-week surveillance interval vs. FOLFIRI: 8-week surveillance interval.

5. Discussion

PFS has been widely used as a surrogate measure of OS in studies of various malignant tumors [2,25,30,39,41]. However, the length of the surveillance interval ultimately determines the proxy tumor progression date and thus inevitably affects the estimate of median PFS [34]. Furthermore, if surveillance intervals are heterogeneous within a trial, comparisons of median PFS between treatment arms may be biased. In this study, we found that heterogeneity of the surveillance intervals may interfere with the validity of efficacy conclusions based on PFS: greater heterogeneity in the lengths of the surveillance intervals leads to a higher false positive rate and overestimated power, while this effect is weakened if either the true median PFS is sufficiently long or the censoring rate increases.

In our simulation study and real example analysis, we showed that subjects who are assessed more frequently are more likely to show a higher estimated hazard than those who are assessed less frequently, mainly because more frequent assessment results in earlier detection of progression. This finding is consistent with other studies that also reported that the observed event time is sensitive to the frequency of assessment, which in turn influences the interpretation of efficacy [20,27,34,41]. In recognition of the effect of assessment frequency, the European Medicines Agency (EMA) has provided methodological considerations for using PFS in trials and suggested that ‘The methods and frequency of tumor assessment should be the same across study arms, even when treatment cycles are of different lengths’ [13].

Furthermore, we observed that the effect of heterogeneous surveillance intervals may depend on both the expected prognosis of the tumor and the censoring proportion of the survival data. For diseases with a shorter expected median PFS, using heterogeneous surveillance intervals may lead to a considerably inflated false positive rate and an overestimated power. We recommend either shortening the assessment intervals or maintaining the same lengths of surveillance intervals across study arms. In practice, however, if more frequent assessment is infeasible, investigators should still consider setting the surveillance intervals based on historical standards for the particular disease group to support compliance [34,44]. Even so, it is important to ensure that the planned assessments are not interrupted by discontinued or delayed protocol assessments or by any other events that might introduce imbalance in the scheduled assessment times [12]. In addition, we found that a relatively large censoring rate is likely to alleviate the effect of heterogeneous surveillance intervals on the validity of efficacy conclusions based on PFS. This may suggest that, under certain circumstances, investigators may deliberately administer heterogeneous surveillance intervals across treatment arms if a high censoring rate is expected. Nevertheless, censoring of patients may itself introduce bias in the analysis. As a result, trial protocols with PFS as the primary end point should still strive to define consistent assessment intervals between treatment arms [9].

The effect of heterogeneous surveillance intervals on PFS has been considered and investigated by many statisticians. In conventional approaches, the right endpoint of the surveillance interval is used in place of the true progression time, so that standard survival analysis methods for right-censored data can be applied to PFS. Because the estimation of PFS largely depends on the assessment schedule, comparisons between treatment arms will be biased if the surveillance intervals are administered inconsistently, unless the true progression time is correctly captured. The most prevalent solution to this pitfall is to assume only that progression occurs between two consecutive surveillance assessments, which yields so-called interval-censored PFS data [28]. Yet most standard survival analysis methods do not handle interval censoring well. Commonly, different ways to impute the progression time are proposed [34,41]. Alternatively, other approaches analyze interval-censored data without imputation [4,5,10,40]. Further methodological research on interval-censored data might help to minimize the bias caused by heterogeneous surveillance intervals on PFS [17,18].
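To make the distinction between right-point imputation, midpoint imputation and the interval-censored representation concrete, the sketch below brackets a true progression time by the two assessments around it. It is an illustration only: the function names are hypothetical, assessments are assumed to fall at exact multiples of the interval, and `last_visit` is assumed to be one of those multiples.

```python
import math


def censoring_interval(true_time, interval, last_visit):
    """Return (left, right] bracketing a true progression time (> 0).

    If progression happens after the last scheduled assessment, it is never
    observed and the right endpoint is infinite (a right-censored record).
    """
    if true_time > last_visit:
        return (last_visit, math.inf)
    right = math.ceil(true_time / interval) * interval
    return (right - interval, right)


def right_point_imputation(bracket):
    """Conventional approach: use the assessment at which progression is found."""
    return bracket[1]


def midpoint_imputation(bracket):
    """One common imputation: midpoint of the bracketing assessments."""
    return 0.5 * (bracket[0] + bracket[1])


observed = censoring_interval(9.0, 8, 48)   # progression at week 9, 8-week schedule
unobserved = censoring_interval(50.0, 8, 48)  # progression after the last visit
print(observed, right_point_imputation(observed), midpoint_imputation(observed))
print(unobserved)
```

Here a progression at week 9 is known only to lie in (8, 16]; right-point imputation records week 16 and midpoint imputation records week 12, whereas an interval-censored analysis would carry the pair (8, 16] forward without choosing a single value.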

In summary, although PFS has considerable merits as a primary surrogate endpoint for OS and is increasingly used in oncology trials, its measurement of progression time relies largely on the planned surveillance intervals and is thus subject to significant bias, particularly in trials in which heterogeneous surveillance intervals are administered. Therefore, adherence to consistent surveillance intervals is of paramount importance and should be favored in designing comparative trials.

Appendix.

A.1. Parametric distribution of time to event T incorporating the covariates

The Weibull distribution [33], Weibull(α,λ), has the probability density function:

$$f(t;\alpha,\lambda)=\alpha\lambda^{\alpha}t^{\alpha-1}\exp(-\lambda^{\alpha}t^{\alpha}),\quad \lambda>0,\ \alpha>0,\ t\ge 0 \tag{A1}$$

where α and λ are the shape and scale parameters, respectively. Its corresponding hazard and survival function are:

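$$h(t;\alpha,\lambda)=\alpha\lambda^{\alpha}t^{\alpha-1} \tag{A2}$$
$$S(t;\alpha,\lambda)=\int_t^{\infty}\alpha\lambda^{\alpha}u^{\alpha-1}\exp(-\lambda^{\alpha}u^{\alpha})\,du=\exp(-\lambda^{\alpha}t^{\alpha}) \tag{A3}$$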
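As a numerical sanity check on (A1)–(A3), the following sketch evaluates the three functions and confirms that the density equals hazard times survival and that the tail integral of the density reproduces the closed-form survival function. The parameter values and the helper `simpson` (a plain composite Simpson rule) are illustrative assumptions, not settings from the paper.

```python
import math


def weibull_density(t, alpha, lam):
    """f(t) = alpha * lam^alpha * t^(alpha-1) * exp(-lam^alpha * t^alpha)."""
    return alpha * lam**alpha * t**(alpha - 1) * math.exp(-(lam * t)**alpha)


def weibull_hazard(t, alpha, lam):
    """h(t) = alpha * lam^alpha * t^(alpha-1)."""
    return alpha * lam**alpha * t**(alpha - 1)


def weibull_survival(t, alpha, lam):
    """S(t) = exp(-lam^alpha * t^alpha)."""
    return math.exp(-(lam * t)**alpha)


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


alpha, lam, t = 1.5, 0.1, 5.0  # illustrative values only

# Integrate the density from t to a point far in the tail (S(200) is negligible).
tail_area = simpson(lambda u: weibull_density(u, alpha, lam), t, 200.0)
print(tail_area, weibull_survival(t, alpha, lam))
```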

Since the median survival time $t_M$ is the value at which $S(t_M;\alpha,\lambda)=0.5$, solving $\exp(-\lambda^{\alpha}t_M^{\alpha})=0.5$ gives $t_M=(\log 2)^{1/\alpha}/\lambda$. Furthermore, to model the time to event $T$ through a Cox proportional hazards model that incorporates the covariates $X_i$ for the $i$th subject, the hazard function can be expressed as

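$$h(t;X_i,\beta,\alpha,\lambda)=h_0(t)\exp(X_i^{T}\beta)=\alpha\lambda^{\alpha}t^{\alpha-1}\exp(X_i^{T}\beta)=\alpha\left[\lambda\exp(X_i^{T}\beta/\alpha)\right]^{\alpha}t^{\alpha-1} \tag{A4}$$

Here, $X_i$ is a $q$-dimensional covariate vector, which may include covariates of different forms, $\beta$ is the corresponding $q$-dimensional coefficient vector, and $h_0(t)$ is the baseline hazard function. If we regard the bracketed factor, $\lambda\exp(X_i^{T}\beta/\alpha)$, as a new scale parameter, then the time to event $T$ can be considered to follow the distribution Weibull$(\alpha,\lambda\exp(X_i^{T}\beta/\alpha))$. Accordingly, the density function is expressed by

$$f(t;X_i,\beta,\alpha,\lambda)=\alpha\left[\lambda\exp(X_i^{T}\beta/\alpha)\right]^{\alpha}t^{\alpha-1}\exp\left[-\exp(X_i^{T}\beta)\lambda^{\alpha}t^{\alpha}\right] \tag{A5}$$

The survival function is

$$S(t;X_i,\beta,\alpha,\lambda)=\exp\left[-\exp(X_i^{T}\beta)\lambda^{\alpha}t^{\alpha}\right] \tag{A6}$$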
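Event times with the survival function (A6) can be drawn by inverse-transform sampling: setting $S(T)=U$ with $U$ uniform and solving for $T$ gives $T=[-\log U/\exp(X_i^{T}\beta)]^{1/\alpha}/\lambda$. The sketch below is one way to do this under stated assumptions; the parameter values, the single binary treatment covariate, the hazard ratio of 0.7 and the function names are all hypothetical. It also compares the empirical medians with the analytic median implied by (A6), $[\log 2/\exp(X_i^{T}\beta)]^{1/\alpha}/\lambda$.

```python
import math
import random
import statistics


def simulate_event_time(alpha, lam, linear_predictor, rng):
    """Draw T with S(t) = exp(-exp(eta) * lam^alpha * t^alpha) by inversion."""
    e = rng.expovariate(1.0)  # standard exponential draw, i.e. -log(U)
    return (e / math.exp(linear_predictor)) ** (1.0 / alpha) / lam


def analytic_median(alpha, lam, linear_predictor):
    """Solve S(t) = 0.5 for t under the same model."""
    return (math.log(2.0) / math.exp(linear_predictor)) ** (1.0 / alpha) / lam


rng = random.Random(2022)       # fixed seed so the sketch is reproducible
alpha, lam = 1.5, 0.1           # illustrative shape and scale parameters
log_hr = math.log(0.7)          # hypothetical treatment effect (X = 1 vs X = 0)

control = [simulate_event_time(alpha, lam, 0.0, rng) for _ in range(20000)]
treated = [simulate_event_time(alpha, lam, log_hr, rng) for _ in range(20000)]

median_control = statistics.median(control)
median_treated = statistics.median(treated)
print(median_control, analytic_median(alpha, lam, 0.0))
print(median_treated, analytic_median(alpha, lam, log_hr))
```

A hazard ratio below one lowers the hazard and therefore lengthens the median, which the two simulated arms should reflect.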

A.2. Parametric distribution of time to censoring C under predefined censoring rate

Using Fei Wan's method [52], we simulate the time to censoring $C$ from a noninformative uniform distribution Uniform$(0,\theta)$, where $\theta$ is the censoring parameter. Thus the density function of $C$ is given by $g(c\mid\theta)=1/\theta$, $0<c<\theta$.

We further define $\delta=I(T\ge C)$ as the censoring indicator, so that $\delta=1$ if the subject is censored and $\delta=0$ otherwise. There are three steps to simulate censored survival data with a predefined censoring rate [52]. The first step is to derive the individual censoring probability:

$$p(\delta=1\mid X_i,\theta)=p(C\le T<\infty,\ 0\le C<\theta)=\int_0^{\theta}g(c\mid\theta)\int_c^{\infty}f(t\mid X_i)\,dt\,dc=\int_0^{\theta}\frac{1}{\theta}\,F(t\mid X_i)\Big|_c^{\infty}\,dc=\int_0^{\theta}\frac{1}{\theta}\exp\left[-H_0(c)\exp(X_i^{T}\beta)\right]dc \tag{A7}$$

where $H_0(\cdot)$ is the cumulative baseline hazard function.
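The integral in (A7) is simply the average of the subject's survival function over $(0,\theta)$, so it can be evaluated numerically. The following sketch does so with a composite Simpson rule, assuming the Weibull baseline of A.1 so that $H_0(c)=\lambda^{\alpha}c^{\alpha}$. The function name, the parameter values and the choice of quadrature are illustrative assumptions rather than the procedure of [52].

```python
import math


def censoring_probability(theta, alpha, lam, linear_predictor, n=2000):
    """Approximate (1/theta) * integral_0^theta exp(-H0(c) * exp(eta)) dc.

    Assumes a Weibull baseline, H0(c) = (lam * c)^alpha, and uniform censoring
    on (0, theta). n is the (even) number of Simpson subintervals.
    """
    multiplier = math.exp(linear_predictor)

    def survival(c):
        return math.exp(-multiplier * (lam * c) ** alpha)

    h = theta / n
    total = survival(0.0) + survival(theta)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * survival(k * h)
    return total * h / (3.0 * theta)


# Illustrative values: a shorter censoring window censors more subjects.
p_short = censoring_probability(10.0, 1.5, 0.1, 0.0)
p_long = censoring_probability(40.0, 1.5, 0.1, 0.0)
print(p_short, p_long)
```

Because the survival function is decreasing, its average over $(0,\theta)$ falls as $\theta$ grows, which is what allows $\theta$ to be tuned to a target censoring rate.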

The second step is to derive the population-level censoring rate from the individual-level censoring probability:

$$p(\delta=1\mid\theta)=E_{X_i}\left[p(\delta=1\mid X_i,\theta)\right]=\int_Q p(\delta=1\mid x,\theta)\,f_x(x)\,dx \tag{A8}$$

where $f_x(\cdot)$ is the joint density of the $q$-dimensional covariates $X$, and $Q$ is its probability space. $p(\delta=1\mid\theta)$ may not have a closed-form expression if $X$ contains covariates with arbitrary distributions, in which case a nonparametric numerical integration method is typically used.

The last step is to solve for the value of $\theta$ that yields a predefined censoring rate. For any given $p$, the target censoring proportion for the simulated data, $\theta$ is obtained as the root of $\gamma(\theta\mid p)=0$, where

$$\gamma(\theta\mid p)=p(\delta=1\mid\theta)-p=\int_D p(\delta=1\mid u,\theta)\,f_{\lambda_i}(u)\,du-p \tag{A9}$$

In this expression, $D$ is the probability space of $\lambda_i$, where the single variable $\lambda_i=\exp(X_i^{T}\beta)$ is used to simplify the calculation, so that the density $f_{\lambda_i}(u)$ can be derived exactly or approximated numerically by nonparametric smoothing methods such as Gaussian kernel density estimates.
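The three steps can be sketched end to end for the simplest case. The example below assumes a single binary treatment covariate with 1:1 allocation, so that the expectation in (A8) reduces to an equal-weight average over the two arms, and a Weibull baseline as in A.1. It then finds $\theta$ by bisection on $\gamma(\theta\mid p)$. The allocation ratio, parameter values, function names and the use of bisection are assumptions made for illustration, not the implementation used in the paper or in [52].

```python
import math


def censoring_probability(theta, alpha, lam, linear_predictor, n=2000):
    """Step 1: individual censoring probability, Simpson rule on (0, theta)."""
    multiplier = math.exp(linear_predictor)
    h = theta / n
    total = 1.0 + math.exp(-multiplier * (lam * theta) ** alpha)  # S(0) + S(theta)
    for k in range(1, n):
        c = k * h
        total += (4 if k % 2 else 2) * math.exp(-multiplier * (lam * c) ** alpha)
    return total * h / (3.0 * theta)


def population_censoring_rate(theta, alpha, lam, beta):
    """Step 2: average over a binary covariate with P(X=0) = P(X=1) = 0.5."""
    p_control = censoring_probability(theta, alpha, lam, 0.0)
    p_treated = censoring_probability(theta, alpha, lam, beta)
    return 0.5 * p_control + 0.5 * p_treated


def solve_theta(target, alpha, lam, beta):
    """Step 3: root of gamma(theta | p) = rate(theta) - p, for 0 < p < 1."""
    def gamma(theta):
        return population_censoring_rate(theta, alpha, lam, beta) - target

    lo, hi = 1e-8, 1.0
    while gamma(hi) > 0.0:      # the rate decreases in theta; widen until it is below p
        hi *= 2.0
    for _ in range(100):        # bisection
        mid = 0.5 * (lo + hi)
        if gamma(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha, lam, beta = 1.5, 0.1, math.log(0.7)   # illustrative values only
theta_30 = solve_theta(0.30, alpha, lam, beta)
print(theta_30, population_censoring_rate(theta_30, alpha, lam, beta))
```

Censoring times would then be drawn from Uniform$(0,\theta)$ with the solved $\theta$ and combined with the simulated event times. Bisection is used here only because the censoring rate is monotone in $\theta$; any one-dimensional root finder would serve.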

Specifically, if we assume the baseline hazard follows a Weibull distribution Weibull(α,λ) and the covariates X have mixture distributions, the censoring probability for the ith subject is expressed as

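$$p(\delta=1\mid\lambda_i,\alpha,\theta)=\frac{\lambda_i}{\alpha\theta}\,\Gamma\left(\frac{1}{\alpha},\left(\frac{\theta}{\lambda_i}\right)^{\alpha}\right) \tag{A10}$$

where $\Gamma(\cdot,\cdot)$ denotes the lower incomplete gamma function. For (A10) to agree with (A6), $\lambda_i$ in this expression should be read as the subject-specific Weibull scale, $\lambda_i=[\lambda\exp(X_i^{T}\beta/\alpha)]^{-1}$, so that $S(t;X_i)=\exp[-(t/\lambda_i)^{\alpha}]$.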
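For completeness, the closed form can be recovered from (A7) by a change of variables. The derivation below is a sketch under the assumption that $\lambda_i$ denotes a subject-specific Weibull scale for which $S(c\mid X_i)=\exp[-(c/\lambda_i)^{\alpha}]$, and that $\Gamma(s,x)=\int_0^{x}u^{s-1}e^{-u}\,du$ is the lower incomplete gamma function.

```latex
\begin{aligned}
p(\delta=1\mid\lambda_i,\alpha,\theta)
  &= \frac{1}{\theta}\int_0^{\theta}
     \exp\!\left[-\left(\frac{c}{\lambda_i}\right)^{\alpha}\right]dc \\
  &= \frac{1}{\theta}\int_0^{(\theta/\lambda_i)^{\alpha}}
     e^{-u}\,\frac{\lambda_i}{\alpha}\,u^{1/\alpha-1}\,du
     \qquad \bigl(u=(c/\lambda_i)^{\alpha},\ c=\lambda_i u^{1/\alpha}\bigr) \\
  &= \frac{\lambda_i}{\alpha\theta}\,
     \Gamma\!\left(\frac{1}{\alpha},\left(\frac{\theta}{\lambda_i}\right)^{\alpha}\right).
\end{aligned}
```

As a check, setting $\alpha=1$ gives $(\lambda_i/\theta)\,[1-\exp(-\theta/\lambda_i)]$, the familiar result for exponential event times under uniform censoring.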

Funding Statement

This work was supported by the National Natural Science Foundation of China [grant numbers 81773554 and 82273738].

Disclosure statement

This publication is based on research using information obtained from https://data.ProjectDataSphere.org, which is maintained by Project Data Sphere. Neither Project Data Sphere nor the owner(s) of any information from the web site have contributed to, approved or are in any way responsible for the contents of this publication.

No potential conflict of interest was reported by the author(s).

References

  • 1. Amado R.G., Wolf M., Peeters M., Van Cutsem E., Siena S., Freeman D.J., Juan T., Sikorski R., Suggs S., Radinsky R., Patterson S.D., and Chang D.D., Wild-type kras is required for panitumumab efficacy in patients with metastatic colorectal cancer, J. Clin. Oncol. 26 (2008), pp. 1626–1634.
  • 2. Beaver J.A., Howie L.J., Pelosof L., Kim T., Liu J., Goldberg K.B., Sridhara R., Blumenthal G.M., Farrell A.T., Keegan P., Pazdur R., and Kluetz P.G., A 25-year experience of us food and drug administration accelerated approval of malignant hematology and oncology drugs and biologics a review, JAMA. Oncol. 4 (2018), pp. 849–856.
  • 3. Bender R., Augustin T., and Blettner M., Generating survival times to simulate Cox proportional hazards models, Stat. Med. 24 (2005), pp. 1713–1723.
  • 4. Bogaerts K., Komárek A., and Lesaffre E., Survival Analysis with Interval-Censored Data: A Practical Approach with Examples in R, SAS, and BUGS, Chapman and Hall/CRC, Boca Raton, 2017.
  • 5. Boruvka A. and Cook R.J., Sieve estimation in a Markov illness-death process under dual censoring, Biostatistics 17 (2016), pp. 350–363.
  • 6. Bray F., Ferlay J., Soerjomataram I., Siegel R.L., Torre L.A., and Jemal A., Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries, CA: Cancer J. Clin. 68 (2018), pp. 394–424.
  • 7. Cheema P.K. and Burkes R.L., Overall survival should be the primary endpoint in clinical trials for advanced non-small-cell lung cancer, Curr. Oncol. 20 (2013), pp. 150–160.
  • 8. Dall'Era M.A., Patient and disease factors affecting the choice and adherence to active surveillance, Curr. Opin. Urol. 25 (2015), pp. 272–276.
  • 9. Dancey J.E., Dodd L.E., Ford R., Kaplan R., Mooney M., Rubinstein L., Schwartz L.H., Shankar L., and Therasse P., Recommendations for the assessment of progression in randomised cancer treatment trials, Eur. J. Cancer. 45 (2009), pp. 281–289.
  • 10. Diao G., Zeng D., Ke C., Ma H., Jiang Q., and Ibrahim J.G., Semiparametric regression analysis for composite endpoints subject to componentwise censoring, Biometrika 105 (2018), pp. 403–418.
  • 11. Driscoll J.J. and Rixe O., Overall survival: Still the gold standard: Why overall survival remains the definitive end point in cancer clinical trials, Cancer J. 15 (2009), pp. 401–405.
  • 12. Eisenhauer E.A., Therasse P., Bogaerts J., Schwartz L.H., Sargent D.J., Ford R.C., Dancey J., Arbuck S., Gwyther S., Mooney M., Rubinstein L., Shankar L., Dodd L., Kaplan R., Lacombe D., and Verweij J., New response evaluation criteria in solid tumours: Revised recist guideline (version 1.1), Eur. J. Cancer. 45 (2009), pp. 228–247.
  • 13. EMA, Appendix 1 to the guideline on the evaluation of anticancer medicinal products in man, methodological consideration for using progression-free survival (pfs) or disease-free survival (dfs) in confirmatory trials, [EB/OL]. Available at https://www.ema.europa.eu/en/appendix-1-guideline-evaluation-anticancer-medicinal-products-man-methodological-consideration-using.
  • 14. Escudier B., Bellmunt J., Négrier S., Bajetta E., Melichar B., Bracarda S., Ravaud A., Golding S., Jethwa S., and Sneller V., Phase III trial of Bevacizumab plus interferon alfa-2a in patients with metastatic renal cell carcinoma (avoren): Final analysis of overall survival, J. Clin. Oncol. 28 (2010), pp. 2144–2150.
  • 15. Feng R.M., Zong Y.N., Cao S.M., and Xu R.H., Current cancer situation in China: Good or bad news from the 2018 global cancer statistics?, Cancer. Commun. 39 (2019), pp. 1–12.
  • 16. Fischer von Weikersthal L., Schalhorn A., Stauch M., Quietzsch D., Maubach P.A., Lambertz H., Oruzio D., Schlag R., Weigang-Köhler K., Vehling-Kaiser U., Schulze M., Truckenbrodt J., Goebeler M., Mittermüller J., Bosse D., Szukics B., Grundeis M., Zwingers T., Giessen C., and Heinemann V., Phase III trial of irinotecan plus infusional 5-fluorouracil/folinic acid versus irinotecan plus oxaliplatin as first-line treatment of advanced colorectal cancer, Eur. J. Cancer. 47 (2011), pp. 206–214.
  • 17. Frydman H., A note on nonparametric estimation of the distribution function from interval-censored and truncated observations, J. R. Stat. Soc. Ser. B (Methodol.) 56 (1994), pp. 71–74.
  • 18. Frydman H. and Szarek M., Nonparametric estimation in a Markov ‘illness–death’ process from interval censored observations with missing intermediate transition status, Biometrics 65 (2009), pp. 143–151.
  • 19. Giessen C., Laubender R.P., Ankerst D.P., Stintzing S., Modest D.P., Mansmann U., and Heinemann V., Progression-free survival as a surrogate endpoint for median overall survival in metastatic colorectal cancer: Literature-based analysis from 50 randomized first-line trials, Clin. Cancer. Res. 19 (2013), pp. 225–235.
  • 20. Gignac G.A., Morris M.J., Heller G., Schwartz L.H., and Scher H.I., Assessing outcomes in prostate cancer clinical trials: A twenty-first century tower of babel, Cancer 113 (2008), pp. 966–974.
  • 21. Gutman S.I., Piper M., Grant M.D., Basch E., Oliansky D.M., and Aronson N., Progression-Free Survival: What Does it Mean for Psychological Well-being Or Quality of Life?, Agency for Healthcare Research and Quality (US), Rockville, MD, 2013.
  • 22. Herbst R.S., Baas P., Kim D.W., Felip E., Pérez-Gracia J.L., Han J.Y., Molina J., Kim J.H., Arvis C.D., Ahn M.J., Majem M., Fidler M.J., de Castro Jr G., Garrido M., Lubiniecki G.M., Shentu Y., Im E., Dolled-Filhart M., and Garon E.B., Pembrolizumab versus docetaxel for previously treated, pd-l1-positive, advanced non-small-cell lung cancer (keynote-010): A randomised controlled trial, The Lancet 387 (2016), pp. 1540–1550.
  • 23. Jiang H., Symanosski J., Paul S., Qu Y., Zagar A., and Hong S., The type I error and power of non-parametric logrank and Wilcoxon tests with adjustment for covariates–a simulation study, Stat. Med. 27 (2008), pp. 5850–5860.
  • 24. Kaufmann R., Spieth K., Leiter U., Mauch C., von den Driesch P., Vogt T., Linse R., Tilgen W., Schadendorf D., Becker J.C., Sebastian G., Krengel S., Kretschmer L., Garbe C., and Dummer R., Dermatologic Cooperative Oncology Group, Temozolomide in combination with interferon-alfa versus temozolomide alone in patients with advanced metastatic melanoma: A randomized, phase III, multicenter study from the Dermatologic Cooperative Oncology group, J. Clin. Oncol. 23 (2005), pp. 9001–9007.
  • 25. Kemp R. and Prasad V., Surrogate endpoints in oncology: When are they acceptable for regulatory and clinical decisions, and are they currently overused?, BMC. Med. 15 (2017), pp. 1–7.
  • 26. Korn R.L. and Crowley J.J., Overview: Progression-free survival as an endpoint in clinical trials with solid tumors, Clin. Cancer. Res. 19 (2013), pp. 2607–2612.
  • 27. Lange J.M., Gulati R., Leonardson A.S., Lin D.W., Newcomb L.F., Trock B.J., Carter H.B., Carroll P.R., Cooperberg M.R., Cowan J.E., Klotz L.H., and Etzioni R., Estimating and comparing cancer progression risks under varying surveillance protocols, Ann. Appl. Statist. 12 (2018), pp. 1773–1795.
  • 28. Leffondré K., Touraine C., Helmer C., and Joly P., Interval-censored time-to-event and competing risk with death: Is the illness-death model more accurate than the Cox model?, Int. J. Epidemiol. 42 (2013), pp. 1177–1186.
  • 29. Lin N., Logan S., and Henley W.E., Bias and sensitivity analysis when estimating treatment effects from the Cox model with omitted covariates, Biometrics 69 (2013), pp. 850–860.
  • 30. Llovet J.M., Montal R., and Villanueva A., Randomized trials and endpoints in advanced HCC: Role of PFS as a surrogate of survival, J. Hepatol. 70 (2019), pp. 1262–1277.
  • 31. Meller M., Beyersmann J., and Rufibach K., Joint modeling of progression-free and overall survival and computation of correlation measures, Stat. Med. 38 (2019), pp. 4270–4289.
  • 32. Mettu N.B., Ou F.S., Zemla T.J., Halfdanarson T.R., Lenz H.J., Breakstone R.A., Boland P.M., Crysler O.V., Wu C., Nixon A.B., Bolch E., Niedzwiecki D., Elsing A., Hurwitz H.I., Fakih M.G., and Bekaii-Saab T., Assessment of capecitabine and bevacizumab with or without atezolizumab for the treatment of refractory metastatic colorectal cancer: A randomized clinical trial, JAMA Netw. Open 5 (2022), p. e2149040.
  • 33. Mudholkar G.S., Srivastava D.K., and Kollia G.D., A generalization of the Weibull distribution with application to the analysis of survival data, J. Am. Stat. Assoc. 91 (1996), pp. 1575–1583.
  • 34. Panageas K.S., Ben-Porat L., Dickler M.N., Chapman P.B., and Schrag D., When you look matters: The effect of assessment schedule on progression-free survival, J. Natl. Cancer. Inst. 99 (2007), pp. 428–432.
  • 35. Peeters M., Oliner K.S., Price T.J., Cervantes A., Sobrero A.F., Ducreux M., Hotko Y., André T., Chan E., Lordick F., Punt C.J.A., Strickland A.H., Wilson G., Ciuleanu T.E., Roman L., Van Cutsem E., He P., Yu H., Koukakis R., Terwey J.-H., Jung A.S., Sidhu R., and Patterson S.D., Analysis of KRAS/NRAS mutations in a phase III study of panitumumab with FOLFIRI compared with FOLFIRI alone as second-line treatment for metastatic colorectal cancer, Clin. Cancer. Res. 21 (2015), pp. 5469–5479.
  • 36. Peeters M., Price T.J., Cervantes A., Sobrero A.F., Ducreux M., Hotko Y., André T., Chan E., Lordick F., Punt C.J., Strickland A.H., Wilson G., Ciuleanu T.E., Roman L., Van Cutsem E., Tzekova V., Collins S., Oliner K.S., Rong A., and Gansert J., Randomized phase III study of panitumumab with fluorouracil, leucovorin, and irinotecan (FOLFIRI) compared with FOLFIRI alone as second-line treatment in patients with metastatic colorectal cancer, J. Clin. Oncol. 28 (2010), pp. 4706–4713.
  • 37. Persson I. and Khamis H., Bias of the Cox model hazard ratio, J. Mod. Appl. Stat. Methods. 4 (2005), pp. 90–99.
  • 38. Petrelli F., Borgonovo K., Cabiddu M., Ghilardi M., and Barni S., Cetuximab and panitumumab in kras wild-type colorectal cancer: A meta-analysis, Int. J. Colorectal. Dis. 26 (2011), pp. 823–833.
  • 39. Pilz L.R., Manegold C., and Schmid-Bindert G., Statistical considerations and endpoints for clinical lung cancer studies: Can progression free survival (PFS) substitute overall survival (OS) as a valid endpoint in clinical trials for advanced nonsmall-cell lung cancer?, Transl. Lung. Cancer. Res. 1 (2012), pp. 26–35.
  • 40. Prasad V. and Bilal U., The role of censoring on progression free survival: Oncologist discretion advised, Eur. J. Cancer. 51 (2015), pp. 2269–2271.
  • 41. Qi Y., Allen Ziegler K.L., Hillman S.L., Redman M.W., Schild S.E., Gandara D.R., Adjei A.A., and Mandrekar S.J., Impact of disease progression date determination on progression-free survival estimates in advanced lung cancer, Cancer 118 (2012), pp. 5358–5365.
  • 42. Royston P., Tools to simulate realistic censored survival-time distributions, Stata. J. 12 (2012), pp. 639–654.
  • 43. Royston P. and Parmar M.K., Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects, Stat. Med. 21 (2002), pp. 2175–2197.
  • 44. Schwartz L.H., Litière S., de Vries E., Ford R., Gwyther S., Mandrekar S., Shankar L., Bogaerts J., Chen A., Dancey J., Hayes W., Hodi F.S., Hoekstra O.S., Huang E.P., Lin N., Liu Y., Therasse P., Wolchok J.D., and Seymour L., RECIST 1.1 – update and clarification: From the RECIST committee, Eur. J. Cancer. 62 (2016), pp. 132–137.
  • 45. Sherman S.I., Clary D.O., Elisei R., Schlumberger M.J., Cohen E.E., Schöffski P., Wirth L.J., Mangeshkar M., Aftab D.T., and Brose M.S., Correlative analyses of ret and ras mutations in a phase 3 trial of cabozantinib in patients with progressive, metastatic medullary thyroid cancer, Cancer 122 (2016), pp. 3856–3864.
  • 46. Sidhu R., Rong A., and Dahlberg S., Evaluation of progression-free survival as a surrogate endpoint for survival in chemotherapy and targeted agent metastatic colorectal cancer trials, Clin. Cancer. Res. 19 (2013), pp. 969–976.
  • 47. Sridhara R., Mandrekar S.J., and Dodd L.E., Missing data and measurement variability in assessing progression-free survival endpoint in randomized clinical trials, Clin. Cancer. Res. 19 (2013), pp. 2613–2620.
  • 48. Sun X., Li X., Chen C., and Song Y., A review of statistical issues with progression-free survival as an interval-censored time-to-event endpoint, J. Biopharm. Stat. 23 (2013), pp. 986–1003.
  • 49. Tang P.A., Bentzen S.M., Chen E.X., and Siu L.L., Surrogate end points for median overall survival in metastatic colorectal cancer: Literature-based analysis from 39 randomized controlled trials of first-line chemotherapy, J. Clin. Oncol. 25 (2007), pp. 4562–4568.
  • 50. U.S. FDA, Table of surrogate endpoints that were the basis of drug approval or licensure [EB/OL]. https://www.fda.gov/drugs/development-resources/table-surrogate-endpoints-were-basis-drug-approval-or-licensure, Content current as of: 03/31/2021.
  • 51. Van Cutsem E., Köhne C.H., Hitre E., Zaluski J., Chang Chien C.R., Makhson A., D'Haens G., Pintér T., Lim R., Bodoky G., Roh J.K., Folprecht G., Ruff P., Stroh C., Tejpar S., Schlichting M., Nippgen J., and Rougier P., Cetuximab and chemotherapy as initial treatment for metastatic colorectal cancer, N. Engl. J. Med. 360 (2009), pp. 1408–1417.
  • 52. Wan F., Simulating survival data with predefined censoring rates for proportional hazards models, Stat. Med. 36 (2017), pp. 838–854.
  • 53. Zhou J., Vallejo J., Kluetz P., Pazdur R., Kim T., Keegan P., Farrell A., Beaver J.A., and Sridhara R., Overview of oncology and hematology drug approvals at US Food and Drug Administration between 2008 and 2016, J. Natl. Cancer. Inst. 111 (2019), pp. 449–458.

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
