Abstract
In phase II oncology trials, the use of new cytostatic drugs raises questions about the choice of endpoint. Time-to-event endpoints such as progression-free survival have been recommended and have led to new designs. In 2003, Case and Morgan proposed a design based on the comparison of cumulative hazards at a clinically relevant timepoint. In 2013, Kwak and Jung proposed a design based on the one-sample log-rank test. If all patients are followed from their entry time to the analysis date, the Kwak and Jung design leads to a smaller sample size than the Case and Morgan design. However, the Case and Morgan design requires less information, since every patient only needs to be followed during a fixed interval of time. We propose a trade-off between these two approaches: an adaptation of the Kwak and Jung design to situations where follow-up is expected to be restricted. Like the Kwak and Jung design, our proposal is based on the one-sample log-rank test, but it uses the same follow-up information as the Case and Morgan design. A simulation study shows that our proposal reduces the sample size compared with the Case and Morgan design (median difference of 23% [15%-33%]). Type I and type II error rates are close to the nominal rates planned in the protocol. A real phase II clinical trial in cervical cancer illustrates the value of this new design. Thus, our proposal can be recommended as an alternative to the Kwak and Jung design when patients' follow-up is restricted.
1. Introduction
Until now, phase II oncology trial designs have mainly relied on response-based endpoints. However, since the advent of cytostatic therapies, progression-free survival has been strongly recommended for evaluating these new drugs [1]. This issue has led to the development of new phase II designs that offer alternatives to the classical response-based phase II design. Several designs have recently been proposed by Sun et al. [2], Case and Morgan [3], Huang et al. [4], Wu [5], Kwak and Jung [6] and Whitehead [7].
In this paper, we focus on the two designs developed by Case and Morgan [3] (referred to herein as CM) and Kwak and Jung [6] (referred to herein as KJ). Both proposals are single-arm two-stage phase II designs and rely on comparing the cumulative hazard function of an experimental treatment with a historical one. Both procedures assume that no patient is lost to follow-up. The CM design compares the cumulative hazards [8] at a clinically meaningful timepoint and requires each patient to be followed until this timepoint unless an event occurs earlier. The KJ design, which relies on the one-sample log-rank test [9], [10], compares the cumulative hazards over a defined period. The calculation of the stopping rules for the KJ design is based on an expected number of events, which places strong constraints on obtaining adequate follow-up over a large window of time. Failure to meet this requirement leads to deviations from the original design.
However, in practice, follow-up information is easier to obtain during the early period of the trial, owing to tight clinical monitoring, than during the late period. When the accrual duration is long, updating the status of every patient at the date of analysis may be challenging, whereas obtaining the status of each included patient within a restricted window of time is more realistic.
To take this practical problem into account while keeping the core features of the KJ design, we propose a modified KJ design that considers a predefined window of time over which complete follow-up is expected. This new proposal is called the restricted Kwak and Jung design (referred to herein as r-KJ).
In Section 2, we recall the main features of the CM and KJ designs, present our proposed r-KJ design and describe the simulation protocol. Section 3 reports the simulation study comparing the properties of the r-KJ design with those of the CM design and illustrates the use of the r-KJ design with a phase II trial planned at the Institut Curie. Finally, in Section 4, we discuss our results and give practical advice for using the r-KJ design.
2. Methods
2.1. Notations and background
For each patient $i$, $i = 1, \dots, n$, let $A_i$ denote the accrual time and $T_i$ the failure time.
We denote by $\tau$ the calendar time of analysis. Let $C_i = \tau - A_i$ denote the administrative censoring time (i.e. the time elapsed between the inclusion date and the date of analysis) and $R_i$ the censoring time that reflects the amount of follow-up information required by the design. We denote by $S(t)$, $G(t)$ and $H(t)$ the survival functions of $T$, $C$ and $R$, respectively. The functions $1 - S(t)$, $1 - G(t)$ and $1 - H(t)$ are the corresponding cumulative distribution functions. Due to the censoring mechanisms, we observe the pair of random variables $(X_i, \delta_i)$, where $X_i = \min(T_i, C_i, R_i)$ and $\delta_i = I\{T_i \le \min(C_i, R_i)\}$ denotes the event indicator, taking the value 1 if an event is observed and 0 otherwise. We assume that the censoring times are independent and non-informative.
The instantaneous hazard function of $T$ is denoted $\lambda(t)$ and defined as $\lambda(t) = \lim_{h \to 0} P(t \le T < t + h \mid T \ge t)/h$, and the cumulative hazard function is $\Lambda(t) = \int_0^t \lambda(u)\,du$. Let $N(t) = \sum_{i=1}^{n} \delta_i I(X_i \le t)$ and $Y(t) = \sum_{i=1}^{n} I(X_i \ge t)$ be the event and at-risk processes, respectively. Here $\Lambda(t)$ is the cumulative hazard function of the experimental therapy, $\Lambda_0(t)$ is a pre-specified (null) reference cumulative hazard function and $\Lambda_1(t)$ the desired target cumulative hazard function. $\Lambda_0(t)$ is the cumulative hazard function of the historical data, which can be either the Nelson-Aalen estimate or a model-based estimate from previous data (previous study or meta-analyses). In the following, $S_0(t)$ and $S_1(t)$ are the survival functions of the failure times under the null and alternative hypotheses.
Let $\tau_1$ and $\tau_2$ be the dates of analysis at the first and second stage, respectively. We denote by $a$ the accrual duration needed to include the total sample size for the two stages.
A two-stage design is defined by the quadruplet $(n_1, c_1, n, c)$, where $n_1$ and $n$ are the total numbers of patients included at stages 1 and 2, respectively, and $c_1$ and $c$ are the futility boundaries for the first and final analyses. If we assume that the accrual rate is known, the sample sizes can equivalently be expressed as calendar times, which gives the quadruplet $(\tau_1, c_1, a, c)$.
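As a minimal sketch of how the observed pair is built from the three times described above (failure time, administrative censoring time and the follow-up limit imposed by the design), the following Python function may help. The function name, the argument names and the three example patients are illustrative assumptions and do not come from the original article.

```python
def observe(accrual, failure, analysis_time, max_follow_up):
    """Return (observed_time, event) for one patient.

    accrual       : calendar time of inclusion
    failure       : time from inclusion to the event
    analysis_time : calendar time of the analysis
    max_follow_up : follow-up required by the design (assumed identical for all patients)
    """
    if accrual > analysis_time:
        raise ValueError("patient not yet included at the analysis date")
    admin_censoring = analysis_time - accrual       # time elapsed since inclusion
    censoring = min(admin_censoring, max_follow_up)  # whichever censoring comes first
    observed_time = min(failure, censoring)
    event = 1 if failure <= censoring else 0         # 1 = event observed, 0 = censored
    return observed_time, event


# Three hypothetical patients (accrual, failure), times in months,
# analysed at calendar month 30 with a 24-month follow-up window.
patients = [(0.0, 10.0), (2.0, 40.0), (20.0, 15.0)]
observed = [observe(a, t, analysis_time=30.0, max_follow_up=24.0) for a, t in patients]
```

In this toy example the first patient has an observed event, the second is censored by the follow-up window and the third is censored administratively.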
In the following, we briefly recall the main features of the CM and KJ designs.
2.2. Case and Morgan's design
In 2003, Case and Morgan [3] have proposed a two-stage design based on the comparison of the cumulative hazards at a clinically meaningful time point (denoted ). This design relies upon the test proposed by Lin et al. [8] where the null and alternative hypotheses are:
For the CM design, each patient is followed for (months). According to our notations, here for all . The authors have provided an optimal allocation of accrual and follow-up to maintain type I and II error rates with or without suspending inclusion during the interim analysis.
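The quantity compared in the CM design is a cumulative hazard evaluated at a single timepoint. A minimal sketch of the Nelson-Aalen estimate at such a timepoint is given below. The function name and the toy data are assumptions made for illustration; the actual CM test statistic and its variance follow Lin et al. [8] and are not reproduced here.

```python
def nelson_aalen_at(times, events, timepoint):
    """Nelson-Aalen estimate of the cumulative hazard at `timepoint`.

    times  : observed times (event or censoring)
    events : 1 if the observed time is an event, 0 if it is censored
    """
    cumulative_hazard = 0.0
    # Distinct event times up to the timepoint, in increasing order
    event_times = sorted({t for t, e in zip(times, events) if e == 1 and t <= timepoint})
    for t in event_times:
        d = sum(1 for u, e in zip(times, events) if e == 1 and u == t)  # events at t
        at_risk = sum(1 for u in times if u >= t)                        # patients still at risk
        cumulative_hazard += d / at_risk
    return cumulative_hazard


# Hypothetical data: 5 patients, times in months
times = [3.0, 5.0, 5.0, 8.0, 12.0]
events = [1, 1, 0, 1, 0]
estimate = nelson_aalen_at(times, events, timepoint=10.0)
```

Here the three events contribute 1/5, 1/4 and 1/2, so the estimate at 10 months is 0.95.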
2.3. Kwak and Jung's design
In 2013, Kwak and Jung [6] have proposed a one-sample log-rank test for phase II clinical trials allowing to compare the whole survival curves where the null and alternative hypotheses are:
It is worth noting that, for the KJ design, the null and alternative hypotheses were expressed in terms of the instantaneous hazard, but they can be re-expressed in terms of the cumulative hazard function. Indeed, hazard rate ordering implies ordering of the cumulative hazard functions, but not vice versa. The Kwak and Jung design compares the whole observed survival curve with a historical survival curve. Thus, it focuses on differences between survival curves rather than on survival probabilities computed at a single timepoint, as in the CM design. Kwak and Jung provided an optimal two-stage design that minimizes the expected sample size if the new drug has low efficacy while meeting the planned type I and type II error rates.
In the KJ procedure, from the null and alternatives hypotheses, the expected number of event is calculated which leads to a defined period for which we assume a complete follow-up of the patients. According to our notations and for the sake of simplicity, this assumption corresponds to After design specification data will be censored at the study duration, . If this follow-up is not guaranteed, the performance of the design can be altered.
As explained in the introduction, this practical problem prompted us to develop the restricted KJ design, presented just below, that considers a predefined window of time for which a complete follow-up is expected.
2.4. The proposed method: the restricted Kwak and Jung's design
Following the work of Kwak and Jung, we propose to re-design their strategy in considering a restricted window of monitoring time whose upper boundary corresponds to the clinically meaningful timepoint denoted . Thus, the value of is fixed to . Consequently, the survival function of is .
Single stage restricted Kwak and Jung's design:
Let be the counting process of the one-sample logrank test at the calendar time of analysis ,
Under , for n large, is approximately normal with mean 0 and its variance can be consistently estimated by . We reject with one-sided type I error rate if . Here denotes the -100th percentile of the standard normal distribution. We calculate the required sample size for a specified power under a specific alternative hypothesis . Under uniformly converge to [6], [11].
Thus, converge to
From expression of introduced by Jung [12], under , we state that is approximately Gaussian with mean ,
Kwak and Jung assume that and are close and considered that under , the variance of can be approached by:
where and . is the survival function corresponding to the “averaged” hazard function . Indeed, to calculate a sample size, Kwak and Jung [6] have considered an “averaged” alternative hypothesis for computing variances.
Thus, the power function is defined by .
By solving this equation, we obtain
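The single-stage test described in this subsection can be sketched in Python using the standardized form of the one-sample log-rank statistic that is common in the literature [9], [10]: $Z = (O - E)/\sqrt{E}$, where $O$ is the observed number of events and $E$ the number expected under the reference cumulative hazard. This form, the function name, the exponential reference hazard and the toy data are assumptions made for illustration; negative values favour the experimental treatment, which matches the sign of the boundaries reported later in the article.

```python
import math


def one_sample_logrank(times, events, cum_hazard_0):
    """Standardized one-sample log-rank statistic Z = (O - E) / sqrt(E).

    times        : observed times, already truncated at the follow-up window
    events       : 1 for an observed event, 0 for a censored observation
    cum_hazard_0 : reference (null) cumulative hazard function, t -> Lambda_0(t)
    """
    observed = sum(events)                           # O: observed number of events
    expected = sum(cum_hazard_0(t) for t in times)   # E: expected number under H0
    if expected <= 0:
        raise ValueError("expected number of events must be positive")
    return (observed - expected) / math.sqrt(expected)


# Assumed exponential reference with 50% survival at 24 months
rate_0 = math.log(2) / 24.0
z = one_sample_logrank(
    times=[24.0, 24.0, 10.0, 24.0, 18.0, 24.0],
    events=[0, 0, 1, 0, 1, 0],
    cum_hazard_0=lambda t: rate_0 * t,
)
reject_h0 = z < -1.645  # one-sided test at the 5% level
```

With only six hypothetical patients the statistic is far from the rejection region; the example is meant to show the mechanics, not a realistic sample size.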
2.4.1. Two-stage restricted Kwak and Jung's design
Let and be the two counting processes of the one-sample logrank test for the two stages at the respective calendar time of analysis and such as:
For large and , the joint distribution of under is approximately a bivariate normal with mean vector zero and approximated variance covariance matrix such as [13]:
The correlation coefficient between and is such as .
Algorithm to build the r-KJ design, type I error rate function and power function are similar to the KJ design [6]. Limits of and and the distribution of and under the null and the alternative hypotheses are adapted as presented below.
Under uniformly converges to .
Then, the limits of and and the distributions of and under the null and the alternative hypotheses can be derived.
Under , and and respectively converge to
and
So, is given by .
Under , we have and with
and .
Under , and respectively converge to
and .
Moreover, the variances of and are respectively approximated by
and
Finally, is given by .
2.4.2. Practical rules and sample size calculation
Here, we give the practical rules for designing a r-KJ design. From the specified values where denoted the inclusion rate, we can establish a restricted Kwak and Jung's design with quadruplet . Using the inclusion rate, calendar times and accrual duration could then be derived by and and the restricted Kwak and Jung's design takes place as follows:
2.4.2.1. At stage 1
At , each patient is followed until an event occurred before . A patient who is free of event before is followed until or censored at depending whose time is the smallest. At is the minimum between , and . The test statistic is calculated and compared to the stopping boundary . If , then trial is stopped for futility. Otherwise, inclusions continue until patients are included.
2.4.2.2. At stage 2
At , each patient is followed until an event occurred before or being censored at . Then the test statistic is calculated and compared to the stopping boundary . If , we conclude to the inefficacy of treatment. If we conclude that the treatment is promising.
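The two decision rules above can be sketched as follows. The function names and the returned labels are illustrative assumptions, and so is the handling of exact equality with a boundary, which cannot be recovered from the text. The example values are the statistics and boundaries reported for the clinical example later in the article.

```python
def stage1_decision(z1, c1):
    """First stage: stop for futility when the statistic reaches the boundary."""
    return "stop for futility" if z1 >= c1 else "continue to stage 2"


def stage2_decision(z, c):
    """Second stage: the treatment is declared promising when the statistic is below the boundary."""
    return "promising" if z < c else "not promising"


# Values taken from the clinical example (boundaries 0.77 and -1.46)
experimental_stage1 = stage1_decision(1.51, 0.77)
standard_stage1 = stage1_decision(-1.13, 0.77)
standard_stage2 = stage2_decision(-2.75, -1.46)
```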
Stage 2 conclusion can also be reached by calculating the p-value in order to take into account the observed correlation between and . Computation of the p-value is the same as the Kwak and Jung's design:
where is bivariate Gaussian vector with means 0, variance of 1 and correlation coefficient equals to and is the observed value of
2.4.3. Sample size calculation and futility boundaries determination
An iterative algorithm has been implemented to determine the quadruplet defining the design.
Step 1: From , determine the sample size for single stage r-KJ design.
Step 2: Initiate the algorithm with , and . Various value of will be tested until reaching convergence of the value.
Step 3: Find satisfying:
Step 4: Calculate the power of the design using the following expression:
with and .
Step 5: If power is less than , then is left and a new triplet is tested by repeating step 3 to 5. Otherwise, is selected.
For each selected candidate, the probability of early termination (PET) and the expected number of included patients under the null hypothesis are calculated. The r-KJ design is the candidate that minimizes this expected number.
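The boundary search in Step 3 can be sketched numerically. Given the decision rules of the design, a treatment is declared promising only if the trial is not stopped at stage 1 and the final statistic is below the final boundary, so the type I error rate is a bivariate normal probability. The sketch below assumes standardized statistics with unit variances under the null hypothesis and a known correlation; the function names, the integration scheme and the input values are assumptions made for illustration, not the authors' R implementation.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def joint_prob(c1, c, rho, steps=2000):
    """P(Z1 < c1, Z < c) for a standard bivariate normal with correlation rho.

    Uses P(Z < c | Z1 = z1) = Phi((c - rho * z1) / sqrt(1 - rho^2)) and
    integrates over z1 with Simpson's rule (steps must be even).
    """
    lower, upper = -8.0, c1
    if upper <= lower:
        return 0.0
    h = (upper - lower) / steps
    s = math.sqrt(1.0 - rho * rho)

    def f(z1):
        return norm_pdf(z1) * norm_cdf((c - rho * z1) / s)

    total = f(lower) + f(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lower + k * h)
    return total * h / 3.0


def final_boundary(c1, rho, alpha, tol=1e-8):
    """Find c such that P(Z1 < c1, Z < c) = alpha by bisection."""
    if alpha >= norm_cdf(c1):
        raise ValueError("alpha cannot be reached with this first-stage boundary")
    lo, hi = -8.0, 8.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if joint_prob(c1, mid, rho) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs: first-stage boundary -0.06, correlation 0.8, alpha 5%
c = final_boundary(c1=-0.06, rho=0.8, alpha=0.05)
```

Because the joint probability can never exceed the marginal probability of the final statistic, the resulting final boundary is never below the single-stage critical value of about −1.645 at the 5% level.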
2.4.4. Sample size under uniform accrual and exponential survival assumptions
For sample size calculation, we assume that the failure time distribution is exponential with hazard rate under and under . Hazard ratio is given by: . Accrual is supposed to be uniform between 0 and and censoring is fixed at .
Under these assumptions, we have derived the expressions of and in Appendix A.
The above procedure has been implemented in R and the function to implement the r-KJ design is available in Supplementary Material.
2.5. Simulation protocol
Here, we compared our restricted Kwak and Jung design with the CM design, since both designs use the same follow-up information. A comparison with the original KJ design would be irrelevant since, for a fixed sample size, it would require a longer follow-up period for every patient except the last one included.
To assess the performance of our procedure, we conducted a simulation study where the scenarios explored the impact of the following parameters: .
The historical distribution of survival times was an exponential distribution with rate , that was computed for obtaining survival rates at of , the hazard ratio of , the type I error rate of , the type II error rate of , an accrual rate of and a timepoint of fixed at 1.
For each of the 144 configurations, sample size and stopping boundaries at each stage were determined based on the Case and Morgan design and the restricted Kwak and Jung's design using the parameters
For each patient of each simulated trial, accrual time and failure time were generated independently. As required by these designs, the censoring time of each patient was fixed to the clinically meaningful timepoint. Failure times were generated from an exponential distribution and accrual times from a uniform distribution. For each configuration, 10,000 trials were simulated.
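A simplified version of this simulation can be sketched for a single-stage final analysis under the null hypothesis. Accrual times are left out of the sketch because, at the final analysis, every patient is assumed to have completed the follow-up window. The function name, the standardized statistic $Z = (O - E)/\sqrt{E}$, the sample size and the number of replicates are assumptions made for illustration.

```python
import math
import random


def simulate_type1_error(n, rate_0, window, z_crit=-1.645, n_trials=2000, seed=1):
    """Empirical rejection rate of the one-sample log-rank test under H0.

    n       : number of patients per simulated trial
    rate_0  : exponential hazard rate under the null hypothesis
    window  : fixed follow-up window applied to every patient
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        observed = 0
        expected = 0.0
        for _ in range(n):
            t = rng.expovariate(rate_0)   # failure time generated under H0
            x = min(t, window)            # follow-up restricted to the window
            observed += 1 if t <= window else 0
            expected += rate_0 * x        # exponential cumulative hazard at x
        z = (observed - expected) / math.sqrt(expected)
        if z < z_crit:
            rejections += 1
    return rejections / n_trials


# Hypothetical configuration: 50% survival at the timepoint 1, 51 patients
empirical_alpha = simulate_type1_error(n=51, rate_0=math.log(2), window=1.0)
```

The empirical rate should fall near the nominal one-sided 5% level, up to Monte Carlo error.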
The r-KJ designs were compared with the CM designs on several criteria: the sample size required by each design, the probability of early termination (PET) and the probability of concluding in favor of efficacy. Under the null and alternative hypotheses, this last criterion corresponds to the type I error rate and the power of the design, respectively. We also considered the final number of subjects included under the null or the alternative hypothesis (it equals the first-stage sample size if the trial stops early and the total sample size if the trial goes to the second stage).
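Since the final number of subjects takes one of two values, its expectation is a weighted average of the two sample sizes, with the probability of early termination as the weight. The short sketch below uses a hypothetical function name; the inputs are the values reported for the first r-KJ configuration of Table 1.

```python
def expected_sample_size(n1, n, pet):
    """Expected number of patients: n1 if the trial stops early, n otherwise."""
    return pet * n1 + (1.0 - pet) * n


# First r-KJ configuration of Table 1: n1 = 33, n = 53, PET = 0.52
expected_n = expected_sample_size(33, 53, 0.52)
```

This gives about 42.6, close to the 42.33 reported in Table 1; the small gap is consistent with the rounding of the reported PET.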
3. Results
The CM design required more patients than the r-KJ design (see Table 1): the median difference between the sample sizes of the two designs was 23% (15%–33%) across the 144 configurations studied.
Table 1.
 | $\alpha$ | $1-\beta$ | Accrual rate | r-KJ single-stage $n$ | r-KJ $n_1$ | r-KJ $c_1$ | r-KJ $n$ | r-KJ $c$ | r-KJ EN | r-KJ PET | CM $n_1$ | CM $c_1$ | CM $n$ | CM $c$ | CM EN | CM PET |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.5 | 0.05 | 0.9 | 15 | 51 | 33 | −0.06 | 53 | −1.63 | 42.33 | 0.52 | 35 | 0.29 | 71 | 1.64 | 48.72 | 0.61 |
0.5 | 0.05 | 0.95 | 15 | 62 | 42 | −0.07 | 65 | −1.63 | 52.56 | 0.53 | 44 | 0.28 | 88 | 1.64 | 60.92 | 0.61 |
0.5 | 0.05 | 0.9 | 30 | 51 | 35 | 0.08 | 53 | −1.62 | 44.59 | 0.47 | 43 | 0.04 | 67 | 1.64 | 54.41 | 0.52 |
0.5 | 0.05 | 0.95 | 30 | 62 | 44 | 0.06 | 65 | −1.63 | 54.76 | 0.48 | 51 | 0.11 | 85 | 1.64 | 66.44 | 0.54 |
0.5 | 0.05 | 0.9 | 50 | 51 | 38 | 0.42 | 52 | −1.63 | 46.97 | 0.34 | 55 | −0.90 | 62 | 1.64 | 60.63 | 0.18 |
0.5 | 0.05 | 0.95 | 50 | 62 | 48 | 0.28 | 64 | −1.63 | 57.49 | 0.39 | 62 | −0.28 | 81 | 1.64 | 73.37 | 0.39 |
0.35 | 0.05 | 0.9 | 15 | 36 | 24 | 0.00 | 37 | −1.63 | 30.44 | 0.50 | 28 | 0.18 | 50 | 1.64 | 37.07 | 0.57 |
0.35 | 0.05 | 0.95 | 15 | 44 | 30 | −0.03 | 46 | −1.63 | 37.56 | 0.51 | 34 | 0.22 | 63 | 1.64 | 45.89 | 0.59 |
0.35 | 0.05 | 0.9 | 30 | 36 | 27 | 0.21 | 37 | −1.63 | 32.54 | 0.42 | 36 | −0.25 | 47 | 1.64 | 42.51 | 0.40 |
0.35 | 0.05 | 0.95 | 30 | 44 | 33 | 0.21 | 45 | −1.63 | 39.70 | 0.41 | 42 | −0.09 | 60 | 1.64 | 51.39 | 0.46 |
0.35 | 0.05 | 0.9 | 50 | 36 | 26 | 0.61 | 37 | −1.63 | 33.98 | 0.27 | a | a | a | a | a | a |
0.35 | 0.05 | 0.95 | 50 | 44 | 34 | 0.54 | 45 | −1.63 | 41.64 | 0.30 | 53 | −1.07 | 57 | 1.64 | 56.36 | 0.14 |
$\alpha$: type I error rate; $1-\beta$: power; PET: probability of early termination; EN: expected number of included patients.
a: No Case and Morgan design could be found because, with this accrual rate, every candidate design would include all patients before the interim analysis.
The mean number of patients included under the null hypothesis was higher with the CM design than with the r-KJ design. The median difference was 17%, but it ranged from 2% to 34% depending on the configuration.
The probability of early stopping was similar for the two designs. Across our 144 configurations, the median difference in the probability of early termination was zero; that is, discrepancies favored each of the two designs equally often.
Type I and type II error rates were respected in most configurations. Nevertheless, the simulation study showed that the Case and Morgan design was conservative and over-powered. Although the deviations of the type I and type II error rates were small, they were statistically significant (see Table 2). The r-KJ design was also conservative and over-powered, although its deviations from the nominal type I and type II error rates were smaller than those of the CM design. Results were similar in the other configurations (data not shown), although the excess power of the r-KJ design decreased as the sample size increased.
Table 2.
 | $\alpha$ | $1-\beta$ | Accrual rate | r-KJ single-stage: type I error | r-KJ single-stage: power | r-KJ two-stage: type I error | r-KJ two-stage: power | CM: type I error | CM: power |
---|---|---|---|---|---|---|---|---|---|
0.5 | 0.05 | 0.9 | 15 | 0.047 [0.043; 0.051] | 0.917 [0.911; 0.922] | 0.039 [0.035; 0.043] | 0.917 [0.911; 0.922] | 0.043 [0.039; 0.047] | 0.958 [0.954; 0.962] |
0.5 | 0.05 | 0.95 | 15 | 0.042 [0.038; 0.046] | 0.953 [0.948; 0.957] | 0.040 [0.036; 0.044] | 0.959 [0.955; 0.963] | 0.051 [0.047; 0.056] | 0.985 [0.983; 0.987] |
0.5 | 0.05 | 0.9 | 30 | 0.041 [0.037; 0.045] | 0.912 [0.906; 0.918] | 0.043 [0.039; 0.047] | 0.919 [0.914; 0.924] | 0.043 [0.039; 0.047] | 0.954 [0.950; 0.958] |
0.5 | 0.05 | 0.95 | 30 | 0.043 [0.039; 0.047] | 0.954 [0.950; 0.958] | 0.041 [0.037; 0.045] | 0.964 [0.960; 0.967] | 0.042 [0.038; 0.046] | 0.983 [0.980; 0.985] |
0.5 | 0.05 | 0.9 | 50 | 0.040 [0.037; 0.044] | 0.915 [0.909; 0.920] | 0.040 [0.036; 0.044] | 0.921 [0.915; 0.926] | 0.049 [0.045; 0.053] | 0.959 [0.955; 0.963] |
0.5 | 0.05 | 0.95 | 50 | 0.042 [0.038; 0.046] | 0.953 [0.948; 0.957] | 0.044 [0.040; 0.048] | 0.963 [0.959; 0.967] | 0.035 [0.031; 0.038] | 0.980 [0.977; 0.983] |
0.35 | 0.05 | 0.9 | 15 | 0.043 [0.039; 0.047] | 0.904 [0.898; 0.909] | 0.041 [0.037; 0.045] | 0.900 [0.894; 0.906] | 0.038 [0.034; 0.041] | 0.946 [0.941; 0.950] |
0.35 | 0.05 | 0.95 | 15 | 0.041 [0.037; 0.045] | 0.947 [0.942; 0.951] | 0.040 [0.036; 0.044] | 0.949 [0.944; 0.95] | 0.043 [0.039; 0.047] | 0.977 [0.974; 0.980] |
0.35 | 0.05 | 0.9 | 30 | 0.045 [0.041; 0.050] | 0.898 [0.891; 0.903] | 0.043 [0.039; 0.047] | 0.899 [0.892; 0.90] | 0.061 [0.056; 0.066] | 0.960 [0.956; 0.964] |
0.35 | 0.05 | 0.95 | 30 | 0.043 [0.039; 0.047] | 0.948 [0.943; 0.952] | 0.040 [0.036; 0.044] | 0.950 [0.945; 0.954] | 0.038 [0.034; 0.042] | 0.976 [0.973; 0.979] |
0.35 | 0.05 | 0.9 | 50 | 0.038 [0.034; 0.042] | 0.903 [0.897; 0.908] | 0.038 [0.034; 0.042] | 0.901 [0.894; 0.906] | a | a |
0.35 | 0.05 | 0.95 | 50 | 0.044 [0.040; 0.048] | 0.946 [0.941; 0.950] | 0.036 [0.032; 0.040] | 0.947 [0.942; 0.951] | 0.061 [0.056; 0.066] | 0.985 [0.982; 0.987] |
a: No CM design could be found because, with this accrual rate, every candidate design would include all patients before the interim analysis.
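The bracketed intervals in Table 2 can be approximately reproduced from the number of simulated trials. The article does not state which interval method was used; the sketch below assumes a normal-approximation 95% interval for a proportion estimated from 10,000 trials, and the function name is hypothetical.

```python
import math


def wald_interval(p_hat, n_sim, z=1.959964):
    """Normal-approximation confidence interval for a simulated proportion."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_sim)
    return p_hat - half_width, p_hat + half_width


# First single-stage r-KJ type I error rate of Table 2: 0.047 over 10,000 trials
low, high = wald_interval(0.047, 10_000)
```

Rounded to three decimals, this gives 0.043 and 0.051, which agrees with the interval reported for that cell.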
The r-KJ design required fewer patients than the CM design. Across our 144 configurations, the number of patients included under the null hypothesis was 18% (4%–34%) lower than with the CM design (see Table 3). Under the alternative hypothesis, the median relative difference was 22% (15%–33%). In every configuration investigated, the r-KJ design included fewer patients than the CM design; the smallest reduction was 4% (see Table 3).
Table 3.
 | $\alpha$ | $1-\beta$ | Accrual rate | Relative reduction in included patients under $H_0$ | Relative reduction in included patients under $H_1$ |
---|---|---|---|---|---|
0.5 | 0.05 | 0.9 | 15 | 0.155 | 0.249 |
0.5 | 0.05 | 0.95 | 15 | 0.158 | 0.260 |
0.5 | 0.05 | 0.9 | 30 | 0.202 | 0.209 |
0.5 | 0.05 | 0.95 | 30 | 0.185 | 0.234 |
0.5 | 0.05 | 0.9 | 50 | 0.222 | 0.164 |
0.5 | 0.05 | 0.95 | 50 | 0.222 | 0.210 |
0.35 | 0.05 | 0.9 | 15 | 0.214 | 0.257 |
0.35 | 0.05 | 0.95 | 15 | 0.205 | 0.268 |
0.35 | 0.05 | 0.9 | 30 | 0.242 | 0.214 |
0.35 | 0.05 | 0.95 | 30 | 0.241 | 0.250 |
0.35 | 0.05 | 0.9 | 50 | a | a |
0.35 | 0.05 | 0.95 | 50 | 0.263 | 0.212 |
a: No CM design could be found because, with this accrual rate, every candidate design would include all patients before the interim analysis.
3.1. Real clinical example
The trial presented in this work was an open-label, randomized, non-comparative phase II trial conducted at the Institut Curie, which included patients with stage IB2 to III cervical cancer. This trial aimed to evaluate the efficacy of combining a targeted therapy with standard radio-chemotherapy (cisplatin plus pelvic irradiation). The primary endpoint was disease-free survival (DFS) at 24 months. The trial was planned in 2007. The first patient was included in 2008 and the results were published in 2015 [14]. The two treatment arms were planned with the same hypotheses. The original design was a one-stage Fleming design and DFS rates were analyzed as a binary endpoint. The investigators planned the trial with a historical DFS rate at 24 months of 50% and expected a DFS rate at 24 months of 75% with the new treatment, with an expected inclusion rate of 2 patients per month.
Here, we suppose that we would like to design an r-KJ trial with the same assumptions, a type I error rate of 7% and a power of 94.5%. Under an exponential distribution, DFS rates at 24 months of 50% and 75% correspond to a hazard ratio of $\ln(0.75)/\ln(0.5) \approx 0.415$. For a two-stage r-KJ design, we need to include 28 patients at the first stage and 38 patients at the second stage. Analyses have to be performed at 14 and 43 (19 + 24) months after the first inclusion for the first and second stages, respectively. The one-sample log-rank test statistics have to be compared with the first-stage and second-stage stopping boundaries, which were fixed to 0.77 and −1.46, respectively. Each patient has to be followed for 24 months. Here, we assume no loss to follow-up, and the follow-up information is restricted to the first 24 months for every patient.
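The arithmetic behind these planning values can be checked with a few lines of Python. Under the exponential assumption, the hazard rate is minus the logarithm of the survival rate divided by the timepoint, and the analysis times follow from the accrual rate of 2 patients per month. The variable names are illustrative assumptions; all numerical inputs are those stated in the text.

```python
import math

s0, s1 = 0.50, 0.75     # historical and expected DFS rates at 24 months
window = 24.0           # follow-up window in months
accrual_rate = 2.0      # patients per month
n1, n = 28, 38          # patients included at the first and second stage

rate_0 = -math.log(s0) / window   # exponential hazard under the null hypothesis
rate_1 = -math.log(s1) / window   # exponential hazard under the alternative
hazard_ratio = rate_1 / rate_0

interim_time = n1 / accrual_rate         # months after the first inclusion
final_time = n / accrual_rate + window   # last inclusion plus the follow-up window
```

This gives a hazard ratio of about 0.415 and analysis times of 14 and 43 months, in line with the values quoted above.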
It is worth noting that the CM design could not be planned here because the clinical timepoint is too late relative to the accrual duration. The CM design requires the interim analysis to be performed when at least one patient has reached the minimum follow-up (here 24 months). At 24 months, every patient would already be included, so no CM design satisfying all the constraints could be found. An optimal Kwak and Jung design could have been planned, but it would have required every patient to be followed until 24 months after the last inclusion, which seemed unrealistic. In this practical case, a restricted Kwak and Jung design is a good option since it uses realistic follow-up information.
3.1.1. Trial results
In total, 78 patients were included in the trial: 40 were randomized to the experimental arm and 38 to the standard arm. Among the 40 patients randomized to the experimental arm, two withdrew their consent a few days after inclusion. Thus, 38 patients per arm were analyzed (see Table 4).
Table 4.
Restricted Kwak and Jung design
Stage | Arm | N | Test statistic | Stopping boundary | Decision |
---|---|---|---|---|---|
Stage 1 | Experimental arm | 21 | 1.51 | 0.77 | → Stop inclusion |
Stage 1 | Standard arm | 20 | −1.13 | 0.77 | → Proceed to stage 2 |
Stage 2 | Experimental arm | | | | |
Stage 2 | Standard arm | 38 | −2.75 | −1.46 | → Reject $H_0$ |
According to the boundaries computed above, the first analysis would take place 14 months after the first inclusion, when no patient has reached 24 months of follow-up. At the first stage, the test statistics of the experimental and standard arms would be equal to 1.51 and −1.13, respectively. In the experimental arm, the test statistic would be higher than the first-stage stopping boundary: the null hypothesis would not be rejected and inclusion would be stopped. In contrast, in the standard arm the test statistic would be below this boundary, so the trial would move forward to the second stage. At the second stage, in the standard arm, the test statistic would be −2.75, which is less than the second-stage boundary. The null hypothesis would have been rejected.
It is worth noting that the stopping rules and sample sizes were computed under historical rates that could be seriously questioned (see Fig. 1). Fig. 1 shows that the disease-free survival of patients randomized to the experimental arm (green curve) is compatible with the null hypothesis (represented by the red curve), at least until 18 months of follow-up. The disease-free survival of patients randomized to the standard arm (blue curve) is higher than under the null hypothesis (as the statistical rules suggest). The restricted Kwak and Jung design thus ensures coherence between the statistical rules and the graphical representation, which is very important when communicating the results.
4. Discussion
In recent years, new sequential designs for phase II trials with time-to-event endpoints have been proposed, such as the Case-Morgan and Kwak-Jung designs. The KJ design focuses on the survival distributions, whereas the CM design considers only survival probabilities at a particular timepoint. However, the KJ design assumes that the investigator is able to provide complete follow-up for all enrolled patients, which may be challenging in many cases.
In this article, we present the restricted KJ design, a new two-stage design for phase II trials with a time-to-event endpoint restricted to a predefined window of time over which adequate follow-up is expected. We provide general formulas for computing the stopping boundaries and sample sizes for pre-specified null and alternative hypotheses and error rates.
From the simulations, we show that, for a wide variety of configurations, the r-KJ design performs better than the CM design, as it requires fewer patients for the same follow-up information. Both designs use the same window of time, but the r-KJ design compares the entire survival curves over this window, whereas the CM design is limited to a comparison at a single timepoint. With equivalent follow-up information, the r-KJ design shows better operating characteristics than the Case and Morgan design.
Simulation results show that the r-KJ design is conservative, but the power gains are nevertheless preserved. The conservativeness and under-power of the one-sample log-rank test have already been documented by Wu [5]. Using the variance estimate proposed by Kwak and Jung [6] leads to an under-estimation that affects the performance of the test. To improve type I error rate control, the exact variance under the null hypothesis could be used, as proposed by Wu [5]. This modification can be implemented but requires further work to be fully evaluated. The r-KJ design, like the KJ design, is non-parametric, but sample size calculations are provided under exponential survival, which can limit its use. Work is in progress to develop sample size formulas under Weibull survival in order to facilitate the dissemination of these designs.
The use of the r-KJ design in a real phase II cancer clinical trial highlights its practical value in situations where the clinical timepoint is late and the accrual rate is relatively high. Requiring complete follow-up in such trials can be unrealistic because of financial or management constraints, so a KJ design is expected to deviate from the planned constraints owing to incomplete follow-up. In our phase II cancer clinical trial with a long-term endpoint, complete follow-up is extremely difficult to obtain and a loss of power could be feared. Thus, a restricted Kwak and Jung design represents a well-suited alternative to the classical Kwak and Jung design. It is worth noting that our phase II clinical trial also emphasizes the need for randomized phase II trials when the efficacy of the historical treatment is unknown. Randomized phase II trials could be used to check the null hypothesis of historical survival [15], [16]. All the designs discussed here (CM, KJ, r-KJ) are well suited to parallel randomized non-comparative trials. It is also worth noting that, since these designs require the specification of a survival curve and an accrual rate, any misspecification will alter their performance. Moreover, the KJ and r-KJ designs require the whole historical curve to be provided, which can be particularly challenging. The influence of such misspecifications is a topic for further work.
In summary, we recommend the use of our restricted Kwak and Jung design when the investigator expects to have adequate follow-up over a specified window of time.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Acknowledgements
We would like to thank Dr Suzy Scholl for giving us access to the data of the trial. We also want to acknowledge the reviewers for their excellent expertise and advice.
Footnotes
Supplementary data related to this article can be found at https://doi.org/10.1016/j.conctc.2017.09.010.
Appendix A.
For sample size calculation, we assume that the failure time distribution is exponential with hazard rate under and under . Hazard ratio is given by: . Accrual is supposed to be uniform between 0 and and censoring is fixed at . So, survival functions , and are the following:
Under these assumptions, we have:
References
- 1. Ivanova A., Paul B., Marchenko O., Song G., Patel N., Moschos S.J. Nine-year change in statistical design, profile, and success rates of Phase II oncology trials. J. Biopharm. Stat. 2016;26(1):141–149. doi: 10.1080/10543406.2015.1092030.
- 2. Sun X., Peng P., Tu D. Phase II cancer clinical trials with a one-sample log-rank test and its corrections based on the Edgeworth expansion. Contemp. Clin. Trials. 2011;32(1):108–113. doi: 10.1016/j.cct.2010.09.009.
- 3. Case L.D., Morgan T.M. Design of Phase II cancer trials evaluating survival probabilities. BMC Med. Res. Methodol. 2003;3:6. doi: 10.1186/1471-2288-3-6.
- 4. Huang B., Talukder E., Thomas N. Optimal two-stage phase II designs with long-term endpoints. Stat. Biopharm. Res. 2010;2(1):51–61.
- 5. Wu J. Sample size calculation for the one-sample log-rank test. Pharm. Stat. 2015;14(1):26–33. doi: 10.1002/pst.1654.
- 6. Kwak M., Jung S.H. Phase II clinical trials with time-to-event endpoints: optimal two-stage designs with one-sample log-rank test. Stat. Med. 2014;33(12):2004–2016. doi: 10.1002/sim.6073.
- 7. Whitehead J. One-stage and two-stage designs for phase II clinical trials with survival endpoints. Stat. Med. 2014;33(22):3830–3843. doi: 10.1002/sim.6196.
- 8. Lin D.Y., Shen L., Ying Z., Breslow N.E. Group sequential designs for monitoring survival probabilities. Biometrics. 1996;52(3):1033–1041.
- 9. Finkelstein D.M., Muzikansky A., Schoenfeld D.A. Comparing survival of a sample to that of a standard population. J. Natl. Cancer Inst. 2003;95(19):1434–1439. doi: 10.1093/jnci/djg052.
- 10. Woolson R.F. Rank tests and a one-sample logrank test for comparing observed survival data to a standard population. Biometrics. 1981;37(4):687–696.
- 11. Harrington D.P., Fleming T.R. Counting Processes and Survival Analysis. John Wiley and Sons; 2011.
- 12. Jung S.H. Randomized Phase II Cancer Clinical Trials. Chapman and Hall/CRC Biostatistics Series; 2013.
- 13. Tsiatis A. Repeated significance testing for a general class of statistics used in censored survival analysis. J. Am. Stat. Assoc. 1982;77(380):855–861.
- 14. de la Rochefordiere A., Kamal M., Floquet A., Thomas L., Petrow P., Petit T., Pop M., Fabbro M., Kerr C., Joly F., Sevin E., Maillard S., Curé H., Weber B., Brunaud C., Minsat M., Gonzague L., Berton-Rigaud D., Aumont M., Gladieff L., Peignaux K., Bernard V., Leroy Q., Bieche I., Margogne A., Nadan A., Fourchotte V., Diallo A., Asselain B., Plancher C., Armanet S., Beuzeboc P., Scholl S.M. PIK3CA pathway mutations predictive of poor response following standard radiochemotherapy ± cetuximab in cervical cancer patients. Clin. Cancer Res. 2015;21(11):2530–2537. doi: 10.1158/1078-0432.CCR-14-2368.
- 15. Buyse M. Randomized designs for early trials of new cancer treatments - an overview. Drug Inf. J. 2000;34:387–396.
- 16. Rubinstein L., Crowley J., Ivy P., Leblanc M., Sargent D. Randomized phase II designs. Clin. Cancer Res. 2009;15(6):1883–1890. doi: 10.1158/1078-0432.CCR-08-2031.