Abstract
Sample size calculation based on normal approximations is often associated with a loss of statistical power for a single-arm trial with a time-to-event endpoint. Recently, Wu (2015) derived the exact variance for the one-sample log-rank test under the alternative, and showed that a single-arm one-stage study based on exact variance often has power above the nominal level while the type I error rate is controlled. We extend this approach to a single-arm two-stage design by using exact variances of the one-sample log-rank test for the first stage and the two stages combined. The empirical power of the proposed two-stage optimal designs is often not guaranteed under a two-stage design setting, which could be due to the asymptotic bivariate normal distribution used to estimate the joint distribution of the test statistics. We adjust the nominal power level in the design search so that the simulated power of the identified optimal design is above the nominal level. The sample size and study time savings of the proposed two-stage designs are substantial as compared to the one-stage design.
Keywords: Cancer trials, Exact variance, One-sample log-rank test, Optimal designs, Two-stage designs
1. Introduction
In an early phase clinical trial (e.g., a phase II trial) to assess the activity of a new treatment, a two-stage or multi-stage design is often used to allow a trial to be stopped earlier due to futility to protect patients when the new treatment is indeed ineffective. For a trial with binary endpoint, Simon’s two-stage optimal designs are widely used [1, 2, 3]. In other trials with survival endpoints (such as cytotoxic therapies to prevent the growth of tumors), we compare the progression-free survival (PFS) of a new treatment with historical data in a single-arm trial or the standard care in a randomized clinical trial [4, 5].
For a single-arm study with time-to-event endpoint, the one-sample log-rank test is commonly used to assess the activity of a new treatment by comparing its survival distribution with historical data [6, 7]. Finkelstein et al. [8] developed a sample size formula by using the one-sample log-rank test based on the number of events. Kwak and Jung [9] proposed a new sample size formula by using the average of the cumulative hazard function (CHF) under the null and the CHF under the alternative as the estimate of CHF in power calculation. These one-stage designs based on asymptotic variance generally do not preserve the nominal power level. For this reason, Wu [10] derived the exact variance estimate for the one-sample log-rank test in sample size calculation, and showed that the power of the one-stage design based on exact variance is always guaranteed.
A two-stage design is often preferable in early phase clinical trials to investigate the effectiveness of a new treatment and protect patients [11]. When the new treatment is not as effective as expected, a two-stage design can stop the trial at the first stage to avoid treating more patients with an inferior treatment. In addition, a two-stage design can save sample size and study time substantially as compared to the traditionally used one-stage design. Kwak and Jung [9] developed two-stage optimal designs for a study with time-to-event endpoints that follow the exponential survival distribution. They computed the power of the study in a conservative manner by using an underestimated CHF. As a result, the empirical power of their two-stage designs should be higher; however, it is still below the nominal level. As pointed out by Whitehead [12], the power of a study based on normal approximation is often below the nominal level. We propose extending the one-stage design based on exact variance to a two-stage design setting. Two types of two-stage designs are developed: a two-stage optimal design with the smallest expected sample size under H0, and a two-stage minimax design with the smallest maximum possible sample size (n).
The rest of this article is organized as follows. In Section 2, we first introduce the one-sample log-rank test, then propose single-arm two-stage optimal designs with a detailed algorithm to search for optimal designs. We conduct extensive numerical studies to compare the performance of the proposed two-stage designs with the existing one-stage design by Wu [10], and the two-stage designs by Kwak and Jung [9] in Section 3. At the end of this section, an example from a real clinical trial is used to illustrate the application of the developed two-stage optimal designs. In the last section, we provide some comments and remarks for the proposed two-stage designs.
2. Optimal two-stage designs
In a trial to evaluate the activity of a new treatment when the outcome is a time-to-event endpoint, the one-sample log-rank test is commonly used for data analysis and sample size determination. Suppose Ti and Ci are the survival time and the censoring time for the i-th patient in a trial, i = 1, 2, ⋯, n, where n is the total sample size, and Ti and Ci are assumed to be independent of each other. Then, the observed time is Xi = min(Ti, Ci), and the failure indicator is defined as ζi = I(Ti ≤ Ci) for the i-th patient, where I(.) is the indicator function with I(Ti ≤ Ci) = 1 when Ti ≤ Ci and 0 otherwise.
Let S(t) and F(t) be the survival functions of the survival time T and the censoring time C, respectively. In this article, we assume that the survival time T follows the Weibull distribution with the shape parameter k and the scale parameter λ, with the survival function and cumulative hazard function presented as
$$S(t) = \exp(-\lambda t^{k}), \qquad \Lambda(t) = \lambda t^{k}, \tag{1}$$
where S(t) = exp{−Λ(t)}. It is well known that the exponential distribution is a special case of the Weibull distribution with k = 1.
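Let tc be the clinically meaningful follow-up time at which the PFS rate of a new treatment is compared with the estimated historical survival rate S0(tc). Then the hypotheses of a study can be expressed as
$$H_0: S(t_c) \le S_0(t_c) \quad \text{versus} \quad H_a: S(t_c) > S_0(t_c), \tag{2}$$
which is equivalent to testing the cumulative hazard function as
$$H_0: \Lambda(t_c) \ge \Lambda_0(t_c) \quad \text{versus} \quad H_a: \Lambda(t_c) < \Lambda_0(t_c). \tag{3}$$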
The survival function in Equation (1) can be alternatively written as
$$S(t) = \exp\{-\log(2)\,(t/m)^{k}\}, \tag{4}$$
where m is the median survival time. In a randomized phase II trial of Onartuzumab in combination with Erlotinib to treat patients with advanced non-small-cell lung cancer [13], the median survival times for the control arm (placebo plus Erlotinib) and the new treatment arm (Onartuzumab plus Erlotinib) were estimated as 3.3 months and 5.5 months, respectively.
Suppose n patients are enrolled in a study with the accrual rate of θ patients per unit time (e.g., per year). The parameter of interest is the PFS rate at the clinically meaningful follow-up time tc. A patient who survives beyond time tc remains in the study and continues to be followed. Then, the total accrual time is ta = n / θ and the total study time is ts = ta + tc. Suppose Ni(t) = ζiI(Xi ≤ t) and Yi(t) = I(Xi ≥ t) are the event process and the at-risk process, respectively. The one-sample log-rank test is widely used to test the aforementioned hypotheses in Equation (3), with the following test statistic:
$$L = \frac{O - E}{\sqrt{E}},$$
where $O = \sum_{i=1}^{n} \int_0^\infty dN_i(t) = \sum_{i=1}^{n} \zeta_i$ and $E = \sum_{i=1}^{n} \int_0^\infty Y_i(t)\, d\Lambda_0(t) = \sum_{i=1}^{n} \Lambda_0(X_i)$ are the observed number of events and the expected number of events, respectively. The one-sample log-rank test can be alternatively written as
$$L = \frac{W}{\hat{\sigma}},$$
where $W = n^{-1/2}(O - E)$ and $\hat{\sigma}^2 = E/n$, and $\hat{\sigma}^2$ is the variance estimate of W.
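The statistic can be illustrated with a minimal Python sketch. It assumes the standard form L = (O − E)/√E with O the number of observed events and E the sum of the null cumulative hazard evaluated at each observed time; the function name `one_sample_logrank` and the toy data are hypothetical and are not taken from the authors' R program.

```python
import math


def one_sample_logrank(times, events, cum_hazard_null):
    """Return (O, E, L) for the one-sample log-rank test.

    times:            observed times X_i = min(T_i, C_i)
    events:           failure indicators (1 = event, 0 = censored)
    cum_hazard_null:  function t -> Lambda_0(t), the null cumulative hazard
    """
    observed = sum(events)                               # O: observed events
    expected = sum(cum_hazard_null(x) for x in times)    # E: sum of Lambda_0(X_i)
    stat = (observed - expected) / math.sqrt(expected)   # L = (O - E) / sqrt(E)
    return observed, expected, stat


# Hypothetical toy data: three patients, exponential null with hazard rate 0.5.
O, E, L = one_sample_logrank([1.0, 2.0, 3.0], [1, 0, 1], lambda t: 0.5 * t)
print(O, E, round(L, 4))
```

A negative value of L indicates fewer events than expected under the historical control, which is the direction of the alternative in Equation (3).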
In a two-stage design, a trial is allowed to be stopped for futility at the first stage (at time t1). At the end of the first stage, the number of patients available for data analysis is n1 = t1θ. By using the available data at time t1, we calculate the one-sample log-rank test statistic for this stage, denoted by L1. The observed time for the i-th patient at the end of the first stage is Xi1 = min(Ti, Ci, t1 − Ei), where Ei is the calendar time at which the i-th patient is enrolled in the study, 0 ≤ Ei ≤ t1. When Ti ≤ Ci and Ti ≤ t1 − Ei, an event occurs before the data analysis time t1. When Ti > Ci or Ti > t1 − Ei, the i-th patient is censored at time Ci + Ei or time t1, whichever comes first, for data analysis.
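As a small illustration of the first-stage data, the sketch below builds the observed time Xi1 = min(Ti, Ci, t1 − Ei) and the event indicator for each patient enrolled by t1. The helper name `interim_data` and the example values are hypothetical.

```python
def interim_data(T, C, E, t1):
    """Observed times and event indicators at the first-stage analysis time t1.

    T, C, E: lists of survival times, censoring times, and calendar entry times.
    Only patients enrolled by t1 (E_i <= t1) contribute to the interim analysis.
    """
    rows = []
    for t, c, e in zip(T, C, E):
        if e > t1:
            continue                      # not yet enrolled at the interim look
        follow_up = t1 - e                # administrative censoring at t1
        x = min(t, c, follow_up)          # X_i1 = min(T_i, C_i, t1 - E_i)
        event = int(t <= c and t <= follow_up)
        rows.append((x, event))
    return rows


# Hypothetical patients: the third one enters after t1 and is excluded.
stage1 = interim_data(T=[2.0, 0.3, 1.0], C=[5.0, 5.0, 5.0],
                      E=[0.5, 0.5, 2.0], t1=1.5)
print(stage1)
```

The resulting pairs can be passed to any routine that computes the one-sample log-rank statistic for the first stage.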
When L1 is smaller than the critical value c1, the trial proceeds to the second stage with an additional n2 patients enrolled, for a total of n = n1 + n2 patients; otherwise, the trial is stopped for futility. The time for the final data analysis is ts = ta + tc. The observed time for the i-th patient is calculated as Xi = min(Ti, Ci, ts − Ei). It follows that the one-sample log-rank test is then calculated as $L = W/\hat{\sigma}$ by using data from all patients. When the sample sizes (n1 and n) are large enough, the joint distribution of (L1, L) asymptotically follows a bivariate normal distribution [9]. The detailed calculation of the actual type I error (TIE) rate and the power for a two-stage design with sample sizes and critical values (n1, c1, n, c) can be found in the Appendix. The following search algorithm is used to identify the optimal design given the design parameters (α, β, tc, Λ0(tc), Λ1(tc), θ).
Step 1: The sample size for the one-stage design, nsingle, is calculated using the approach by Wu [10].
Step 2: We search for the optimal design with the maximum possible sample size, n, ranging from 0.5nsingle to 1.5nsingle.
Step 3: For each given n from Step 2, the total accrual time is ta = n/θ. We assume that all the patients are uniformly accrued between 0 and ta. The possible stopping times for the first stage are t1 = 1/ θ, 2/ θ, ⋯, (n − 1) / θ, which is equivalent to having n1 = 1, 2, ⋯, n − 1 in the first stage.
Step 4: For given (n1, n) or (t1, n), the first stage critical value c1 has the possible values from −1.65 to 1.65 by an increment of 0.005.
Step 5: For each combination of (n1, c1, n), we compute the largest c such that the TIE from Equation (5) in Appendix is less than or equal to α.
Step 6: Power is then computed by using Equation (6) in Appendix after the critical value of c is determined from Step 5. If the computed power is above the nominal level (1 − β), this set of sample sizes and critical values (n1, c1, n, c) is saved as a candidate for the optimal design.
Step 5 and Step 6 are applied iteratively to compute the power of all possible designs (n1, c1, n, c) and to identify all designs whose power is above the nominal level. When n is large, we suggest restricting the possible first-stage stopping times in Step 3 to every half month, instead of all n − 1 possible first-stage stopping times, to reduce the computational burden.
Among the designs that meet the power requirement, the one with the smallest expected sample size under the null hypothesis (ESS0) is the optimal two-stage design. Following Simon’s design, we also introduce the minimax design that has the smallest ESS0 among the ones having the smallest n. In the following section, we compare the performance of the proposed two-stage designs for a study with time-to-event endpoint with other existing one-stage and two-stage designs.
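The selection step can be sketched in Python as follows. The sketch assumes that the trial stops for futility when L1 ≥ c1, so that PET = 1 − Φ(c1) under H0, and that ESS0 = PET × n1 + (1 − PET) × n; this reading reproduces the PET and ESS0 columns of Table 2 but is not stated explicitly in the text. The function names are hypothetical, and the feasible designs are assumed to have already passed the type I error and power checks of Steps 5 and 6.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pet_null(c1):
    # Assumption: futility stop when L1 >= c1, with L1 ~ N(0, 1) under H0.
    return 1.0 - norm_cdf(c1)


def ess_null(n1, n, c1):
    """Expected sample size under H0 for a design with first stage n1 and maximum n."""
    p = pet_null(c1)
    return p * n1 + (1.0 - p) * n


def c1_grid(lo=-1.65, hi=1.65, step=0.005):
    """Candidate first-stage critical values as in Step 4."""
    count = int(round((hi - lo) / step)) + 1
    return [round(lo + i * step, 3) for i in range(count)]


def pick_designs(feasible):
    """feasible: (n1, c1, n, c) tuples that already meet the TIE and power constraints."""
    feasible = list(feasible)
    # Optimal design: smallest ESS0 overall.
    optimal = min(feasible, key=lambda d: ess_null(d[0], d[2], d[1]))
    # Minimax design: smallest ESS0 among designs with the smallest maximum n.
    n_min = min(d[2] for d in feasible)
    minimax = min((d for d in feasible if d[2] == n_min),
                  key=lambda d: ess_null(d[0], d[2], d[1]))
    return optimal, minimax


# Two designs taken from the k = 1, delta = 1.2 row of Table 2.
designs = [(282, 0.865, 365, -1.645), (228, -0.025, 403, -1.638)]
optimal, minimax = pick_designs(designs)
print(optimal, round(ess_null(228, 403, -0.025), 1))
print(minimax, round(ess_null(282, 365, 0.865), 1))
```

In a full search, the list of feasible designs would be produced by looping over n, n1, and the c1 grid and keeping the combinations that satisfy both constraints.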
3. Simulation study
We compare the proposed two-stage optimal designs based on exact variance, with the existing one-stage and two-stage optimal designs with regards to expected sample size and maximum possible sample size.
Kwak and Jung [9] (referred to as the KJ design) compute the variances of W1 and W under the alternative by using the average of Λ0(t) and Λ1(t) to estimate Λ(t) under the alternative, which could lead to a reduced estimate of Λ(t) under Ha, and an over-estimated sample size. This approach may be appropriate for use in practice when Λ0(t) and Λ1(t) are close to each other. Under a one-stage design setting, Wu [10] showed that the power of a study designed by the KJ approach is generally below the nominal level, and that the design using the exact variance derived by Wu [10] has power that is always above the nominal level.
We compare the proposed two-stage minimax design and the two-stage minimax KJ design in Table 1, for studies with the following design parameters: the accrual rate θ = 30 patients per year and the clinically meaningful follow-up time tc = 1. Only the exponential distribution is considered in the KJ design, which corresponds to k = 1 in the Weibull distribution. The CHF Λ(tc) under the null hypothesis and that under the alternative are log(2) and log(2)/δ, respectively, where δ is the HR with the values of 1.4, 1.5, 1.6, and 1.7. The new two-stage design is similar to the KJ design with regard to the maximum possible sample size n and the simulated power. The simulated power is about 0.02 below the nominal level. For this reason, we increase the nominal power level by 0.02, and search for the optimal two-stage design again with the design parameters (α, β − 0.02, tc, Λ0(tc), Λ1(tc), θ). For these new designs searched at the nominal power level of 92%, the simulated power values are all above the original nominal level of 90%. Although the ESS0 and n for the designs searched with 0.02 higher power (β = 0.08) are larger than those from the original designs (β = 0.10), we suggest using a study design with guaranteed power to detect the clinically meaningful difference between the new treatment and historical data in a single-arm trial. For this reason, we increase the nominal power level by 0.02 for the proposed two-stage optimal designs in the following comparisons.
Table 1:
Comparison between the proposed two-stage minimax designs with the existing two-stage minimax design by Kwak and Jung (2014), when α = 0.05, β = 0.1, the accrual rate θ = 30 patients per year, the follow-up time tc = 1. The cumulative hazard function is assumed to be log(2) under the null hypothesis, and log(2)/δ under the alternative hypothesis, where δ is the hazard ratio (HR).
| δ (HR) | KJ (β = 0.10): n | KJ (β = 0.10): ESS0 | KJ (β = 0.10): power | New (β = 0.10): n | New (β = 0.10): ESS0 | New (β = 0.10): power | New (β = 0.08): n | New (β = 0.08): ESS0 | New (β = 0.08): power | TIE |
|---|---|---|---|---|---|---|---|---|---|---|
| α = 5% | ||||||||||
| 1.4 | 98 | 82.8 | 0.876 | 99 | 91.1 | 0.885 | 107 | 95.6 | 0.908 | 0.043 |
| 1.5 | 74 | 65.6 | 0.879 | 74 | 68.9 | 0.881 | 80 | 72.0 | 0.907 | 0.043 |
| 1.6 | 59 | 54.9 | 0.889 | 59 | 55.7 | 0.881 | 64 | 57.7 | 0.905 | 0.042 |
| 1.7 | 51 | 45.1 | 0.895 | 49 | 47.8 | 0.878 | 53 | 48.7 | 0.903 | 0.041 |
| α = 10% | ||||||||||
| 1.4 | 78 | 72.3 | 0.882 | 81 | 75.6 | 0.890 | 88 | 81.6 | 0.913 | 0.091 |
| 1.5 | 59 | 55.7 | 0.881 | 61 | 56.4 | 0.888 | 66 | 61.0 | 0.910 | 0.090 |
| 1.6 | 48 | 44.0 | 0.887 | 48 | 46.2 | 0.886 | 52 | 49.8 | 0.909 | 0.089 |
| 1.7 | 40 | 38.5 | 0.886 | 40 | 38.1 | 0.884 | 43 | 42.3 | 0.908 | 0.089 |
We further compare the performance of the proposed two-stage designs with the one-stage design by Wu [10]. The proposed two-stage designs can be considered as an extension of Wu’s one-stage design based on the exact variance estimate of W under the alternative. We use the survival function S(t) = exp{−log(2)(t/m)^k} in the comparison, where m is the median survival time, and k is the shape parameter of the Weibull distribution. When the shape parameter k is assumed to be the same under the null and the alternative, the HR between the null and the alternative is calculated as δ = (m1/m0)^k, which is a proportional hazards model.
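For completeness, the hazard ratio formula follows in one line from the median parameterization, assuming δ is defined as the ratio of the null to the alternative cumulative hazard (as in the setup of Table 1, where the alternative CHF is log(2)/δ):

```latex
\Lambda_j(t) = \log(2)\,(t/m_j)^{k}, \quad j = 0, 1
\quad\Longrightarrow\quad
\delta = \frac{\Lambda_0(t)}{\Lambda_1(t)}
       = \frac{(t/m_0)^{k}}{(t/m_1)^{k}}
       = \left(\frac{m_1}{m_0}\right)^{k}.
```

The ratio does not depend on t, which is why the model has proportional hazards. As an illustration only, taking k = 1 with medians of 3.3 and 5.5 months would give δ = 5.5/3.3 ≈ 1.67.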
Table 2 shows the sample size comparison between the proposed two-stage designs and the one-stage design with design parameters (α, β, tc) = (0.05, 0.1, 1) and δ = 1.2 and 1.8, under the Weibull distribution with the shape parameter k = 0.1, 0.25, 0.5, 1, and 2. It should be noted that the nominal power level is 92% for the two-stage designs, while it is still 90% for the one-stage design. As compared to the sample size of the one-stage design, the expected sample size under the null hypothesis (ESS0) of the proposed two-stage minimax designs is smaller by an average of 14.4% (ranging from 2% to 31%) for the cases studied in Table 2, and the average saving increases to 21.7% for the proposed two-stage optimal designs. The empirical TIE and power of these two-stage designs are presented in Table 3. The type I error rate of all designs is well controlled at the nominal level, and the simulated power is often above 90% except for cases with a small shape parameter, e.g., k = 0.1. For a Weibull distribution with a very small shape parameter (e.g., 0.1), the survival time for the majority of patients is very short. For the cases with power below the nominal level, we would suggest searching for two-stage optimal designs with a larger power level adjustment, such as 4%. In general, the proposed two-stage designs have a satisfactory performance with regard to TIE rate and power.
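The reported average savings can be checked directly from the ESS0 and nsingle columns of Table 2. The short sketch below assumes that the saving for each row is defined as 1 − ESS0 / nsingle, which is our reading of the text; it reproduces the quoted 14.4% and 21.7%.

```python
# (ESS0 of minimax design, ESS0 of optimal design, n_single), copied from Table 2.
table2 = [
    (417.0, 371.7, 534), (423.0, 372.0, 491), (403.0, 354.7, 432),
    (348.9, 313.8, 356), (299.0, 276.4, 306),            # delta = 1.2
    (43.7, 40.6, 63), (46.7, 40.5, 57), (40.3, 38.6, 50),
    (35.4, 33.7, 41), (30.3, 29.7, 36),                  # delta = 1.8
]


def saving(ess0, n_single):
    # Assumed definition: relative reduction of ESS0 versus the one-stage n.
    return 1.0 - ess0 / n_single


minimax_savings = [saving(mm, ns) for mm, _, ns in table2]
optimal_savings = [saving(op, ns) for _, op, ns in table2]

avg_minimax = 100 * sum(minimax_savings) / len(minimax_savings)
avg_optimal = 100 * sum(optimal_savings) / len(optimal_savings)

print(round(avg_minimax, 1), round(avg_optimal, 1))
print(round(100 * min(minimax_savings)), round(100 * max(minimax_savings)))
```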
Table 2:
Proposed two-stage optimal and minimax designs with the following design parameters: the shape parameter of the Weibull distribution k from 0.1 to 2, the accrual rate θ = nsingle / 3 with tc = 3 for the one-stage design by Wu (2015), HR δ = 1.2 and 1.8, α = 0.05 and β = 0.10. The optimal two-stage designs presented in the table are the ones searched at the nominal power of 92%. ESS0: expected sample size under the null hypothesis; PET: probability of early termination.
| k | Minimax: n1 | Minimax: n | Minimax: c1 | Minimax: c | Minimax: ESS0 | Minimax: PET | Optimal: n1 | Optimal: n | Optimal: c1 | Optimal: c | Optimal: ESS0 | Optimal: PET | nsingle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HR: δ =1.2 | |||||||||||||
| 0.1 | 282 | 562 | −0.045 | −1.645 | 417.0 | 0.52 | 230 | 599 | −0.295 | −1.645 | 371.7 | 0.62 | 534 |
| 0.25 | 307 | 515 | 0.145 | −1.645 | 423.0 | 0.44 | 239 | 572 | −0.255 | −1.644 | 372.0 | 0.60 | 491 |
| 0.5 | 306 | 450 | 0.450 | −1.645 | 403.0 | 0.33 | 240 | 504 | −0.165 | −1.641 | 354.7 | 0.57 | 432 |
| 1 | 282 | 365 | 0.865 | −1.645 | 348.9 | 0.19 | 228 | 403 | −0.025 | −1.638 | 313.8 | 0.51 | 356 |
| 2 | 251 | 308 | 1.000 | −1.645 | 299.0 | 0.16 | 204 | 332 | 0.165 | −1.639 | 276.4 | 0.43 | 306 |
| HR: δ =1.8 | |||||||||||||
| 0.1 | 29 | 62 | −0.135 | −1.645 | 43.7 | 0.55 | 24 | 67 | −0.290 | −1.645 | 40.6 | 0.61 | 63 |
| 0.25 | 33 | 56 | 0.240 | −1.645 | 46.7 | 0.41 | 26 | 62 | −0.245 | −1.644 | 40.5 | 0.60 | 57 |
| 0.5 | 28 | 49 | 0.220 | −1.644 | 40.3 | 0.41 | 25 | 54 | −0.080 | −1.641 | 38.6 | 0.53 | 50 |
| 1 | 26 | 38 | 0.785 | −1.645 | 35.4 | 0.22 | 23 | 41 | 0.235 | −1.640 | 33.7 | 0.41 | 41 |
| 2 | 21 | 31 | 1.460 | −1.645 | 30.3 | 0.07 | 21 | 33 | 0.600 | −1.642 | 29.7 | 0.27 | 36 |
Table 3:
The empirical TIE and power of the two-stage optimal and minimax designs in Table 2 at the significance level of 0.05 and the nominal power level of 92%, based on 10,000 simulations. TIE: actual type I error rate.
| k | Minimax: TIE | Minimax: Power | Minimax: n | Optimal: TIE | Optimal: Power | Optimal: n |
|---|---|---|---|---|---|---|
| HR: δ =1.2 | ||||||
| 0.1 | 0.046 | 0.908 | 562 | 0.041 | 0.892 | 599 |
| 0.25 | 0.047 | 0.913 | 515 | 0.041 | 0.901 | 572 |
| 0.5 | 0.047 | 0.915 | 450 | 0.042 | 0.906 | 504 |
| 1 | 0.047 | 0.913 | 365 | 0.044 | 0.912 | 403 |
| 2 | 0.045 | 0.913 | 308 | 0.044 | 0.912 | 332 |
| HR: δ =1.8 | ||||||
| 0.1 | 0.040 | 0.894 | 62 | 0.037 | 0.884 | 67 |
| 0.25 | 0.042 | 0.903 | 56 | 0.037 | 0.893 | 62 |
| 0.5 | 0.039 | 0.899 | 49 | 0.038 | 0.900 | 54 |
| 1 | 0.040 | 0.901 | 38 | 0.038 | 0.902 | 41 |
| 2 | 0.038 | 0.901 | 31 | 0.038 | 0.908 | 33 |
We also provide a sample size comparison between the proposed two-stage designs and the one-stage design based on exact variance [10] in Figure 1 with HR δ = 1.3 and 1.6. The two-stage optimal designs have the smallest ESS0, followed by the two-stage minimax designs, and the one-stage design. The maximum possible sample size of the two-stage minimax designs is close to that of the one-stage design, and the two-stage optimal designs often need more patients than the other two designs.
Figure 1:

Comparison of the expected sample size ESS0 and the maximum possible sample size n between the proposed two-stage designs and the one-stage design by Wu (2015) when HR δ = 1.3 and 1.6.
3.1. An example
We consider an example from a clinical trial to compare a new treatment to the standard care of the drug D-penicillamine (DPCA) for patients with primary biliary cirrhosis of the liver [14]. The historical data with the DPCA treatment can be fit well by the Weibull distribution with the shape parameter k = 1.22, and the median survival time m0 = 9 years. Suppose the survival distribution of the new treatment has the same shape parameter k = 1.22 as the standard care, and the HR is δ = 1.75. The total accrual time is ta = 5 years, and the clinically meaningful follow-up time is tc = 3 years. The one-stage design developed by Wu [10] requires the sample size of nsingle = 88 to attain 80% power at the significance level of 0.05.
To compute the proposed two-stage designs, we use the same accrual rate θ = nsingle / ta = 88 / 5 as in the one-stage design. The total study time of the one-stage design is 5 + 3 = 8 years. Table 4 presents the two-stage designs searched with no power adjustment, 0.02 more power, and 0.04 more power, that is, at the nominal power levels of 80%, 82%, and 84%. When the nominal power level is either 80% or 82%, the simulated power is slightly below the target of 80%. The designs found at the nominal power level of 84% have simulated power above 80%, and the type I error rate is well controlled in all cases. For this example, we would recommend the two-stage designs computed at the nominal power level of 84% for use in practice to make sure that a study has enough patients to detect the clinically meaningful activity of the new treatment. Let the expected total study length under the null hypothesis be
$$\mathrm{ETSL}_0 = \mathrm{PET} \times t_1 + (1 - \mathrm{PET}) \times (t_a + t_c),$$
where t1 = n1 / θ is the time of the first-stage analysis (DA1 in Table 4), ta = n / θ is the total accrual time of the two-stage design, and PET is the probability of early termination, that is, the probability that a trial is stopped at the end of the first stage. At the nominal power level of 84%, the ETSL0 of the minimax design and the optimal design is 1 year and 1.5 years shorter, respectively, than that of the one-stage design (ETSL0 = 5 + 3 = 8). The sample size savings are within the range of the findings from the aforementioned numerical studies.
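The ETSL0 column of Table 4 can be reproduced with a few lines of Python. The sketch assumes ETSL0 = PET × t1 + (1 − PET) × (n/θ + tc), with t1 = n1/θ and PET = 1 − Φ(c1) under H0; this form is inferred from the table values rather than quoted from the text, and the function names are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def etsl_null(n1, c1, n, theta, tc):
    """Expected total study length under H0 for a two-stage design."""
    pet = 1.0 - norm_cdf(c1)      # assumed: futility stop when L1 >= c1
    t1 = n1 / theta               # calendar time of the first-stage analysis
    ts = n / theta + tc           # total study time if the trial continues
    return pet * t1 + (1.0 - pet) * ts


theta = 88 / 5                    # accrual rate used in the example (patients per year)
minimax_82 = etsl_null(n1=54, c1=0.710, n=88, theta=theta, tc=3.0)
optimal_80 = etsl_null(n1=44, c1=0.310, n=92, theta=theta, tc=3.0)
print(round(minimax_82, 1), round(optimal_80, 1))
```

The two calls correspond to the 82% minimax row and the 80% optimal row of Table 4.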
Table 4:
The two-stage optimal and minimax designs under the nominal power levels of 80%, 82%, and 84% for a trial to investigate the activity of a new treatment for patients with primary biliary cirrhosis of the liver, as compared to the one-stage design with nsingle = 88, and the expected total study length ETSL0 = 5 + 3 = 8 years, where ta = 5 is the accrual time and tc = 3 is the follow-up time. ESS0 : expected sample size under the null hypothesis; ETSL0 : expected total study length under the null hypothesis; PET : probability of early termination.
| Nominal Power | n1 | c1 | n | c | DA1 | ESS0 | ETSL0 | TIE | Power |
|---|---|---|---|---|---|---|---|---|---|
| Two-stage minimax design | |||||||||
| 80% | 47 | 0.800 | 85 | −1.632 | 2.667 | 76.9 | 6.7 | 0.042 | 0.787 |
| 82% | 54 | 0.710 | 88 | −1.633 | 3.068 | 79.9 | 6.8 | 0.042 | 0.797 |
| 84% | 61 | 0.710 | 91 | −1.636 | 3.466 | 83.8 | 7.0 | 0.042 | 0.817 |
| Two-stage optimal design | |||||||||
| 80% | 44 | 0.310 | 92 | −1.597 | 2.500 | 73.8 | 6.1 | 0.044 | 0.795 |
| 82% | 45 | 0.345 | 95 | −1.601 | 2.557 | 76.7 | 6.3 | 0.043 | 0.796 |
| 84% | 50 | 0.350 | 97 | −1.608 | 2.841 | 79.9 | 6.5 | 0.043 | 0.820 |
Given the design parameters, the null and alternative survival rates at tc = 3 years are estimated as 83.41% and 90.15%, respectively. As suggested by the reviewers, we include Simon’s two-stage minimax and optimal designs when the outcomes are considered as binary. In this case, the binary outcome is defined as the 3-year survival. The expected sample sizes under the null hypothesis are 100.5 for the optimal design and 156.2 for the minimax design, which are larger than those from the proposed two-stage designs with survival endpoints as seen in Table 4. The proposed two-stage designs are preferable when the outcome is survival.
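The two survival rates quoted above can be verified from the stated design parameters with a short Python sketch. It assumes the median parameterization Λ(t) = log(2)(t/m)^k and an alternative cumulative hazard of Λ0(tc)/δ; the variable names are hypothetical.

```python
import math


def weibull_cum_hazard(t, median, k):
    """Cumulative hazard of a Weibull distribution written in terms of its median."""
    return math.log(2.0) * (t / median) ** k


m0, k, delta, tc = 9.0, 1.22, 1.75, 3.0

lam0 = weibull_cum_hazard(tc, m0, k)   # null cumulative hazard at tc
lam1 = lam0 / delta                    # alternative cumulative hazard at tc
s0, s1 = math.exp(-lam0), math.exp(-lam1)

# Median under the alternative implied by delta = (m1 / m0) ** k.
m1 = m0 * delta ** (1.0 / k)

print(round(100 * s0, 2), round(100 * s1, 2))
```

These rates are the response probabilities that a binary-endpoint design such as Simon's would use for the 3-year survival outcome.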
4. Conclusion
Wu [10] derived the exact variance estimate of W under the alternative to calculate sample size for a single-arm one-stage design. The empirical power of these designs is always above the nominal level while that of other designs based on asymptotic variance estimates is often underestimated. For this reason, we extend the one-stage design to two-stage designs based on exact variance [10]. From our numerical studies, it can be seen that the empirical power of the proposed two-stage designs is not guaranteed. Although the variances of W1 and W under the alternative can be estimated by using the exact method, the correlation between W1 and W is still based on the limiting distribution of the joint function of W1 and W. The exact correlation coefficient may be estimated by using non-parametric approaches, which could be then used to improve the performance of the proposed two-stage designs [15, 16].
The computed type I error rate could be conservative for the proposed two-stage minimax and optimal designs with survival outcome. In this article, we search for both designs at the same time until the power of both designs is above the nominal level [17, 18]. When only one type of design is chosen in the study design, the type I error rate could be closer to the nominal level. Alternatively, one could add an additional constraint on the computed type I error rate (e.g., at least 4.5% when α = 5%). We provide a statistical software program in R for the design search. Researchers can modify the code to add a constraint on the type I error rate.
For the existing two-stage KJ designs, the power of these optimal designs is calculated at the average of Λ0 and Λ1. This approach increases the required sample size; however, the empirical power of these two-stage designs is still below the nominal level. We would suggest computing the power of the study at the cumulative hazard function of Λ1. In practice, we recommend adjusting the nominal power level to guarantee the power of the two-stage designs.
Acknowledgment
We would like to thank Editor, Associate Editor, and two referees for their valuable comments and suggestions that helped to improve this manuscript. We would also like to thank Dr. Jianrong Wu for sharing his R codes with us.
Appendix
Suppose L1 and L are the one-sample log-rank test statistics at the first stage and for the two stages combined, respectively. The two-dimensional variable, (L1, L), follows a bivariate normal distribution asymptotically. Then, the conditional distribution of L1 given L = z follows a normal distribution. Let ϕ and Φ be the probability density function and the cumulative distribution function of the standard normal distribution. The TIE of a two-stage design is computed as:
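$$\mathrm{TIE} = P(L_1 < c_1,\ L \le c \mid H_0) = \int_{-\infty}^{c} \phi(z)\, \Phi\!\left(\frac{c_1 - \rho_0 z}{\sqrt{1 - \rho_0^{2}}}\right) dz, \tag{5}$$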
where ρ0 is the correlation coefficient between W1 and W.
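A numerical version of this calculation, together with Step 5 of the search algorithm, can be sketched in Python. The sketch assumes that H0 is rejected when L1 < c1 and L ≤ c, and that (L1, L) is standard bivariate normal with correlation ρ0 under H0, so that the rejection probability is the integral of ϕ(z)Φ((c1 − ρ0 z)/√(1 − ρ0²)) over z ≤ c. The function names, the Simpson rule, and the value ρ0 = 0.7 are illustrative assumptions, not the authors' R implementation.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def reject_prob(c1, c, rho, lower=-10.0, steps=4000):
    """P(L1 < c1, L <= c) for a standard bivariate normal with correlation rho.

    Uses the conditional law L1 | L = z ~ N(rho * z, 1 - rho**2) and
    composite Simpson integration over z in [lower, c] (steps must be even).
    """
    if c <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)

    def integrand(z):
        return phi(z) * Phi((c1 - rho * z) / s)

    h = (c - lower) / steps
    total = integrand(lower) + integrand(c)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    return total * h / 3.0


def largest_c(c1, rho, alpha, lo=-6.0, hi=6.0, tol=1e-8):
    """Largest final critical value c with type I error at most alpha (bisection)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reject_prob(c1, mid, rho) <= alpha:
            lo = mid          # still within the error budget, move c up
        else:
            hi = mid
    return lo


# Illustrative inputs: c1 = 0.8 with a hypothetical correlation of 0.7.
c = largest_c(c1=0.8, rho=0.7, alpha=0.05)
tie = reject_prob(0.8, c, 0.7)
print(round(c, 3), round(tie, 4))
```

Because early futility stopping removes some rejections, the solved c is slightly larger than the one-stage critical value of about −1.645, which is the pattern seen in the c columns of Tables 2 and 4.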
We assume that the censoring distribution at the end of study, G(t), is a uniform distribution U(tc, ta + tc), and the censoring distribution at the end of the first stage, G1(t), is a uniform distribution, U(0, t1) [9]. Under the null hypothesis, the variances of W1 and W are estimated as
It follows that the correlation between W1 and W under H0 is ρ0 = σ01 / σ02.
Under the alternative hypothesis, the mean values of W1 and W are
where ω = p1 − p0, , , and ω1 = p1f − p0f, , . Recently, Wu [10] derived the exact variance of W under Ha as
where and . The exact variance of W1, , can be derived by a similar approach. Then, the correlation between W1 and W under Ha is ρ1 = σ11 / σ12, and the power of a two-stage design is computed as
| (6) |
where and
References
- [1] Shan G, Wilding GE, Hutson AD, Gerstenberger S. Optimal adaptive two-stage designs for early phase II clinical trials. Statistics in Medicine April 2016; 35(8):1257–1266, doi: 10.1002/sim.6794.
- [2] Shan G, Hutson AD, Wilding GE. Two-stage k-sample designs for the ordered alternative problem. Pharmaceut. Statist. 2012; 11(4):287–294, doi: 10.1002/pst.1499.
- [3] Shan G, Zhang H, Jiang T. Adaptive two-stage optimal designs for Phase II clinical studies that allow early futility stopping. Sequential Analysis 2019; 38(In press).
- [4] Shan G, Zhang H. Two-stage optimal designs with survival endpoint when the follow-up time is restricted. BMC Medical Research Methodology April 2019; 19(1), doi: 10.1186/s12874-019-0696-x.
- [5] Shan G. Exact Statistical Inference for Categorical Data. 1 edn., Academic Press: San Diego, CA, 2015. URL http://www.worldcat.org/isbn/0081006810.
- [6] Fleming TR. One-Sample Multiple Testing Procedure for Phase II Clinical Trials. Biometrics March 1982; 38(1):143–151, doi: 10.2307/2530297. URL http://view.ncbi.nlm.nih.gov/pubmed/7082756.
- [7] Sun X, Peng P, Tu D. Phase II cancer clinical trials with a one-sample log-rank test and its corrections based on the Edgeworth expansion. Contemporary Clinical Trials January 2011; 32(1):108–113. URL http://view.ncbi.nlm.nih.gov/pubmed/20888929.
- [8] Finkelstein DM, Muzikansky A, Schoenfeld DA. Comparing survival of a sample to that of a standard population. Journal of the National Cancer Institute October 2003; 95(19):1434–1439, doi: 10.1093/jnci/djg052.
- [9] Kwak M, Jung SHH. Phase II clinical trials with time-to-event endpoints: optimal two-stage designs with one-sample log-rank test. Statistics in Medicine May 2014; 33(12):2004–2016. URL http://view.ncbi.nlm.nih.gov/pubmed/24338995.
- [10] Wu J. Sample size calculation for the one-sample log-rank test. Pharmaceutical Statistics 2015; 14(1):26–33. URL http://view.ncbi.nlm.nih.gov/pubmed/25339496.
- [11] Case DD, Morgan TM. Design of Phase II cancer trials evaluating survival probabilities. BMC Medical Research Methodology April 2003; 3(6). URL http://view.ncbi.nlm.nih.gov/pubmed/12697051.
- [12] Whitehead J. One-stage and two-stage designs for phase II clinical trials with survival endpoints. Statistics in Medicine September 2014; 33(22):3830–3843. URL http://view.ncbi.nlm.nih.gov/pubmed/24817473.
- [13] Spigel DR, Ervin TJ, Ramlau RA, Daniel DB, Goldschmidt JH, Blumenschein GR, Krzakowski MJ, Robinet G, Godbert B, Barlesi F, et al. Randomized phase II trial of Onartuzumab in combination with erlotinib in patients with advanced non-small-cell lung cancer. Journal of Clinical Oncology November 2013; 31(32):4105–4114. URL http://view.ncbi.nlm.nih.gov/pubmed/24101053.
- [14] Fleming TR, Harrington DP. Counting Processes and Survival Analysis. John Wiley & Sons: Hoboken, New Jersey, 2005.
- [15] Zhang H, Shan G. Letter to Editor: A novel confidence interval for a single proportion in the presence of clustered binary outcome data. Statistical Methods in Medical Research January 2019; doi: 10.1177/0962280219840056. URL http://www.ncbi.nlm.nih.gov/pubmed/30945615.
- [16] Shan G, Dodge-Francis C, Wilding GE. Exact Unconditional Tests for Dichotomous Data When Comparing Multiple Treatments With a Single Control. Therapeutic Innovation & Regulatory Science April 2019; doi: 10.1177/2168479018814697. URL http://journals.sagepub.com/doi/10.1177/2168479018814697.
- [17] Shan G, Ma C, Hutson AD, Wilding GE. Randomized Two-Stage Phase II Clinical Trial Designs Based on Barnard’s Exact Test. Journal of Biopharmaceutical Statistics August 2013; 23(5):1081–1090, doi: 10.1080/10543406.2013.813525.
- [18] Shan G, Zhang H, Jiang T. Minimax and admissible adaptive two-stage designs in phase II clinical trials. BMC Medical Research Methodology August 2016; 16(1):90, doi: 10.1186/s12874-016-0194-3.
