Abstract
In addition to a point estimate for the probability of response in a two-stage design (e.g., Simon’s two-stage design for a Phase II clinical trial with binary endpoints), confidence limits should be computed. The traditional approach of inverting the rejection probability function to compute the confidence interval does not guarantee coverage probability in a two-stage setting. The existing exact approach to calculate one-sided limits is based on the overall number of responses to order the sample space. This approach could be conservative because many sample points have the same limits. We propose a new exact one-sided interval based on the p-value for the sample space ordering. Exact intervals are computed by using binomial distributions directly, instead of a normal approximation. Both exact intervals preserve the nominal confidence level. The proposed exact interval based on the p-value generally performs better than the other exact interval with regard to expected length and simple average length of confidence intervals. Therefore, the new interval calculation based on the p-value is recommended for use in practice.
Keywords: Clinical Trials, Confidence interval, Coverage Probability, Simon’s two-stage design
1. Introduction
For a phase II clinical trial with a binary endpoint as the primary outcome (e.g., response vs non-response to a drug), Simon’s optimal or minimax two-stage designs [1] have been widely used to assess the activity of a new treatment. With this design, a study can be stopped early in the first stage for futility when very few responses are observed in that stage. It should be noted that Simon’s two-stage designs only allow early stopping due to futility in the first stage. Later, Mander and Thompson [2] proposed a modified Simon’s design that allows a study to be stopped in the first stage for either futility or efficacy. An additional critical response value was introduced for the first stage termination due to efficacy, and they developed a statistical software package to compute these designs [2]. Recently, other modified Simon’s designs, such as adaptive two-stage designs [3–7], have been developed to improve the flexibility and efficiency of a trial by allowing the second stage sample size to depend on the observed first stage responses.
When a new experimental treatment under investigation in a study has very few responses, the treatment is not considered promising. To test the activity of a new treatment, the hypotheses are presented as
$$H_0: \pi \le \pi_0$$
against the alternative
$$H_a: \pi > \pi_0,$$
and the study is powered at π1, where π0 and π1 are the unacceptable and acceptable probabilities of response, respectively. Upon completion of a study, the probability of response is often reported. In addition to point estimation, a confidence interval should also be calculated and presented. For a one-sided hypothesis, a one-sided confidence interval is statistically desirable to make sure that the hypothesis testing and the confidence interval construction are consistent with each other [8]. For a study to be tested at the significance level of α, a 1 − α one-sided interval, (L,1], should be computed for statistical inference, where L is the 1 − α lower limit.
Traditionally, a one-sided interval can be computed by inverting the rejection probability function at every possible π value. Recently, Zeng et al. [9] conducted a simulation study and demonstrated that this confidence interval does not guarantee the coverage probability. Thus, this confidence interval is not exact [10,11]. In addition, the simulated confidence intervals were shown to be conservative at the sample points’ values.
To respect the coverage property, exact one-sided confidence intervals by using the Clopper-Pearson approach [12] were developed in a two-stage or multi-stage design setting by Jennison and Turnbull [13], Duffy and Santner [14], and Jung and Kim [15]. In a multi-stage design, a study can be stopped before the final stage when the futility threshold or the efficacy threshold is met, and a reasonable sample ordering is needed to define the tail area in order to compute the p-value and the associated confidence interval [13,14,16]. The sample space ordering should consider the nature of the multi-stage design. One commonly used sample space ordering in a two-stage design is based on the total number of responses, and this ordering is the same as that based on the unbiased estimate proposed by Girshick, Mosteller and Savage [17]. Later, Jung and Kim [15] derived the uniformly minimum variance unbiased estimator (UMVUE) for the probability of response in a two-stage design setting. The sample space ordering based on the UMVUE is the same as the ordering by the total number of responses [18]. This sample space ordering may be associated with a significant number of ties. To have a finer ordering, we propose using the p-value to order the sample space and develop a new exact one-sided interval. Both exact intervals guarantee the nominal coverage.
In Section 2, we first introduce the construction of exact one-sided lower limits, and then conduct extensive numerical studies to compare the performance of the two exact intervals with regard to expected length and simple average length. Exact intervals are computed by using the exact discrete probability distribution and respect the nominal coverage probability. The results indicate that the exact interval based on the p-value has a shorter expected length for the majority of the samples and is always associated with a shorter average length, as compared to the exact interval based on the total number of responses. The construction of the exact upper limit can be readily obtained by using the approach proposed in this section. In Section 3, we first use an example to compare the performance of the two exact intervals and the traditional interval by the inverting approach, then we compare the two exact intervals, looking at expected length and simple average length of intervals. Finally, we provide some comments in Section 4.
2. Methods
In Simon’s two-stage design, for given type I and II error rates, and π0 and π1 as the response rates in the hypothesis testing, four design parameters (r1,rt,n1,n2) need to be determined, where r1 and rt are the critical values for the number of responses in the first stage and both stages combined, and n1 and n2 are the sample sizes for the first stage and the second stage, respectively. A study will be terminated early if the first stage number of responses, X1, is less than or equal to r1. In cases where X1 > r1, the trial goes to the second stage and an additional n2 participants will be enrolled in the study. The second stage number of responses, X2, is added to X1 to compute the total number of responses, Xt = X1 + X2. A final decision is made by comparing Xt to rt. The null hypothesis is rejected at the end of a study if X1 > r1 and Xt > rt. Let (x1, x2) be the observed numbers of responses. The rejection region is presented as
$$\Omega(x_1,x_2) = \{(X_1,X_2) : X_1 \ge x_1,\ X_1 + X_2 \ge x_1 + x_2\},$$
and the p-value is then calculated as
$$P(x_1,x_2 \mid \pi_0) = \sum_{(X_1,X_2)\in\Omega(x_1,x_2)} b(X_1,n_1,\pi_0)\, b(X_2,n_2,\pi_0), \tag{1}$$
where b(x,n,π) is the binomial probability of x responses out of n participants with probability of response π.
With a one-sided hypothesis, a one-sided confidence interval for the probability of response in the form (L(x1,x2),1] is suitable to make proper statistical inference of a two-stage design with x1 and x2 responders observed out of n1 and n2 participants from the first stage and the second stage. The lower limit is traditionally computed by inverting the rejection probability function (referred to as the I approach). The one-sided interval (LI(x1,x2),1] for π based on the I approach is given as
$$L_I(x_1,x_2) = \inf\{\pi : P((X_1,X_2)\in\Omega_I(x_1,x_2) \mid \pi) > \alpha\}, \tag{2}$$
where ΩI(x1,x2) = Ω(x1,x2).
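As a rough illustration of Equations (1) and (2), the following Python sketch computes the tail probability over Ω(x1,x2) and inverts it in π by bisection. The region {X1 ≥ x1, X1 + X2 ≥ x1 + x2} is the one stated in Appendix A. Reading the lower limit as the smallest π at which the tail probability exceeds α is an assumption of this sketch, as are the function names `tail_prob` and `lower_limit_I`; it is not the author's implementation.

```python
from math import comb

def pmf(k, n, p):
    """Binomial probability of k responses out of n at response rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def tail_prob(x1, x2, n1, n2, p):
    """P(X1 >= x1 and X1 + X2 >= x1 + x2 | p), the probability of Omega(x1, x2).

    For a first-stage stop, pass x2 = 0: the inner sum then runs over all X2
    and the result reduces to the one-sample binomial tail P(X1 >= x1).
    """
    total = 0.0
    for X1 in range(x1, n1 + 1):
        need = max(0, x1 + x2 - X1)  # smallest X2 that keeps the total large enough
        total += pmf(X1, n1, p) * sum(pmf(X2, n2, p) for X2 in range(need, n2 + 1))
    return total

def lower_limit_I(x1, x2, n1, n2, alpha=0.05, iters=60):
    """Assumed form of the I-approach limit: inf{p : tail_prob(p) > alpha}.

    Bisection relies on the tail probability being non-decreasing in p.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if tail_prob(x1, x2, n1, n2, mid) > alpha:
            hi = mid
        else:
            lo = mid
    return lo

# p-value at pi0 = 0.1 and lower limit for (x1, x2) = (8, 5) with n1 = 45, n2 = 33
print(tail_prob(8, 5, 45, 33, 0.1))
print(lower_limit_I(8, 5, 45, 33))
```

When every participant responds, the region contains a single point, so the limit solves π^(n1+n2) = α, which gives a quick closed-form check of the bisection.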
In Simon’s two-stage design with design parameters (r1,rt,n1,n2), the associated sample space is presented as
$$\Omega = \{0,1,\ldots,r_1\} \cup \{(X_1,X_2) : r_1 < X_1 \le n_1,\ 0 \le X_2 \le n_2\}.$$
The study is terminated early in the first stage when X1 ≤ r1, thus, the first r1 + 1 points in the sample space are the possible number of responses in the first stage from 0 to r1. We use set A = {0,1,··· ,r1} to represent this set. When X1 > r1, the study proceeds to the second stage with an additional n2 participants. For each X1 from r1 + 1 to n1, the value of X2 ranges from 0 to n2. This set is referred to as set G = {(X1,X2) : X1 > r1,0 ≤ X2 ≤ n2}. The size of the sample space Ω is (r1 +1)+(n1 −r1)(n2 +1). For each sample point (X1,X2) in set G, the total number of responses is (X1 + X2), and all these sample points in set G can be ordered by the total number of responses. For sample points in set A where the study is stopped in the first stage, they can be ordered by their responses. The ordering of the lower limits in the proposed approach based on the number of responses is defined as
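$$L_R(x_1,x_2) \le L_R(x_1',x_2') \quad \text{if } x_1 + x_2 \le x_1' + x_2',$$
with x2 = 0 for a sample point in set A. This approach is referred to as the R approach. The lower limit based on the R approach is computed as
$$L_R(x_1,x_2) = \inf\{\pi : P((X_1,X_2)\in\Omega_R(x_1,x_2) \mid \pi) > \alpha\}, \tag{3}$$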
where ΩR(x1,x2) = {(X1,X2) : X1 + X2 ≥ x1 + x2}.
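The R approach can be sketched in a few lines of Python. The sketch below enumerates the sample space (set A followed by set G), builds ΩR(x1,x2) from the total number of responses, and inverts the region probability by bisection. Storing first-stage stops as (x1, 0), taking the limit as the smallest π at which the region probability exceeds α, and the toy design (r1, n1, n2) = (1, 5, 5) are all assumptions made for illustration.

```python
from math import comb

def pmf(k, n, p):
    """Binomial probability of k responses out of n at response rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def sample_space(r1, n1, n2):
    """Set A (first-stage stops, stored as (x1, 0)) followed by set G."""
    A = [(x1, 0) for x1 in range(r1 + 1)]
    G = [(x1, x2) for x1 in range(r1 + 1, n1 + 1) for x2 in range(n2 + 1)]
    return A + G

def point_prob(pt, r1, n1, n2, p):
    """Probability of one sample point; no second-stage factor after an early stop."""
    x1, x2 = pt
    first = pmf(x1, n1, p)
    return first if x1 <= r1 else first * pmf(x2, n2, p)

def lower_limit_R(x1, x2, r1, n1, n2, alpha=0.05, iters=60):
    """Assumed form of Equation (3): inf{p : P(Omega_R(x1, x2) | p) > alpha}."""
    region = [pt for pt in sample_space(r1, n1, n2) if sum(pt) >= x1 + x2]
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(point_prob(pt, r1, n1, n2, mid) for pt in region) > alpha:
            hi = mid
        else:
            lo = mid
    return lo

# Hypothetical toy design (r1, n1, n2) = (1, 5, 5), chosen only to keep the output small
r1, n1, n2 = 1, 5, 5
print(len(sample_space(r1, n1, n2)))  # (r1 + 1) + (n1 - r1)(n2 + 1) = 26 points
# Two points with the same total share one region and therefore one limit
print(lower_limit_R(2, 3, r1, n1, n2), lower_limit_R(3, 2, r1, n1, n2))
```

For a point in set A, the region probability reduces to a one-sample binomial tail, which is why the limit coincides with the Clopper-Pearson lower limit.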
For any sample point (x1,x2) in set G and any sample point x1′ in set A, since x1 + x2 > r1 ≥ x1′, it follows that ΩR(x1,x2) ⊂ ΩR(x1′), and LR(x1,x2) ≥ LR(x1′). Therefore, the lower limits for sample points in set G are never smaller than those for sample points in set A. The lower limit for sample points in set A can also be calculated from Equation (3), resulting in the exact lower Clopper-Pearson limit [12] for one binomial proportion.
This stochastic ordering based on the total number of responses was proposed by Jennison and Turnbull [13], Duffy and Santner [14], Jung and Kim [15], and Jovic and Whitehead [19]. This ordering is the same as that based on the naive point estimate for the probability of response (x1 + x2)/(n1 + n2), which was shown to be a biased estimate [15]. Recently, Jung [18] reviewed the statistical issues in analyzing data from multi-stage designs with an emphasis on two-stage designs. He pointed out that the stochastic ordering based on the UMVUE [15] is identical to that based on the overall number of responses in the R approach.
When two sample points have the same total number of responses, their lower limits using the R approach are the same. To create a finer sample space ordering, we propose using the p-value to order the sample space, and thus refer to it as the PV approach. For each sample point (x1,x2), its p-value is calculated as P(x1,x2|π0) as in Equation (1). A sample point with a small p-value is more likely to result in rejection of the null hypothesis. Thus, it should have a large lower confidence limit. Therefore, the sample space can be sorted by p-value from the smallest to the largest as a new sample space ordering. The exact one-sided lower interval based on the PV approach is computed as
$$L_{PV}(x_1,x_2) = \inf\{\pi : P((X_1,X_2)\in\Omega_{PV}(x_1,x_2) \mid \pi) > \alpha\}, \tag{4}$$
where ΩPV(x1,x2) = {(X1,X2) : P(X1,X2|π0) ≤ P(x1,x2|π0)}.
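A minimal Python sketch of the PV approach follows. It computes the p-value of every sample point at π0 from the region in Equation (1), collects the points whose p-value does not exceed that of the observed point, and inverts the region probability by bisection. The tolerance `tol` for comparing floating-point p-values, the infimum reading of Equation (4), and the toy design with π0 = 0.2 are assumptions of this sketch.

```python
from math import comb

def pmf(k, n, p):
    """Binomial probability of k responses out of n at response rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def sample_space(r1, n1, n2):
    """Set A (first-stage stops, stored as (x1, 0)) followed by set G."""
    A = [(x1, 0) for x1 in range(r1 + 1)]
    G = [(x1, x2) for x1 in range(r1 + 1, n1 + 1) for x2 in range(n2 + 1)]
    return A + G

def point_prob(pt, r1, n1, n2, p):
    """Probability of one sample point; no second-stage factor after an early stop."""
    x1, x2 = pt
    first = pmf(x1, n1, p)
    return first if x1 <= r1 else first * pmf(x2, n2, p)

def p_value(pt, pts, r1, n1, n2, pi0):
    """Equation (1): probability at pi0 of {X1 >= x1, X1 + X2 >= x1 + x2}."""
    x1, x2 = pt
    return sum(point_prob(q, r1, n1, n2, pi0) for q in pts
               if q[0] >= x1 and sum(q) >= x1 + x2)

def lower_limit_PV(x1, x2, r1, n1, n2, pi0, alpha=0.05, iters=60, tol=1e-12):
    """Assumed form of Equation (4): inf{p : P(Omega_PV(x1, x2) | p) > alpha}."""
    pts = sample_space(r1, n1, n2)
    pv = {pt: p_value(pt, pts, r1, n1, n2, pi0) for pt in pts}
    # Omega_PV: points at least as extreme as the observed one in p-value order
    region = [pt for pt in pts if pv[pt] <= pv[(x1, x2)] + tol]
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(point_prob(pt, r1, n1, n2, mid) for pt in region) > alpha:
            hi = mid
        else:
            lo = mid
    return lo

# Hypothetical toy design (r1, n1, n2) = (1, 5, 5) with pi0 = 0.2
r1, n1, n2, pi0 = 1, 5, 5, 0.2
# Both points have five responses in total, yet their p-values (and limits) differ
print(lower_limit_PV(2, 3, r1, n1, n2, pi0), lower_limit_PV(3, 2, r1, n1, n2, pi0))
```

The two printed limits differ because (3, 2) has a smaller p-value than (2, 3), which is the tie-breaking that the total-response ordering cannot provide.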
For any sample point (x1,x2) in set G and any sample point x1′ in set A, it is easy to show that
$$\Omega(x_1,x_2) = \{(X_1,X_2) : X_1 \ge x_1,\ X_1 + X_2 \ge x_1 + x_2\} \subset \{X_1 \ge x_1'\} = \Omega(x_1'),$$
where the inclusion is satisfied because x1 from the sample point in set G is always larger than x1′ in set A. We then have P(x1,x2|π0) ≤ P(x1′|π0). According to the definition of ΩPV (x1,x2), we have ΩPV (x1,x2) ⊆ ΩPV (x1′), and P((X1,X2) ∈ ΩPV (x1,x2)|π) ≤ P((X1,X2) ∈ ΩPV (x1′)|π) for any π. Therefore, the lower limit for sample points in set A is never larger than that in set G. For sample points in set A, their exact lower limit is again the Clopper-Pearson exact lower limit [12] for one binomial proportion.
Coverage probability is defined as
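$$CP(\pi) = \sum_{(x_1,x_2)\in\Omega} I\{L(x_1,x_2) < \pi\}\, \Pr(X_1 = x_1, X_2 = x_2 \mid \pi), \tag{5}$$
where I{·} is the indicator function and L(x1,x2) is the lower limit at the sample point (x1,x2). A confidence interval should respect the nominal confidence level: CP(π) ≥ 1 − α is true for any π ∈ [0,1].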
Theorem 2.1. The two proposed intervals based on the R approach and the PV approach preserve the nominal confidence level.
Proof. From the definition of the lower limits based on the two proposed approaches in Equation (3) and Equation (4), coverage probabilities for the two proposed intervals are guaranteed. The coverage probability is exactly 1−α at realized π values, and these two intervals are conservative at other π values.
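The coverage probability in Equation (5) can be evaluated numerically once the lower limits of all sample points are known. The Python sketch below does this on a grid of π values for the R approach and the I approach in a hypothetical toy design (r1, n1, n2) = (1, 8, 10). The infimum form of the limits and the 199-point grid are assumptions of the sketch, and the toy design is not one of the designs studied in this article.

```python
from math import comb

def pmf(k, n, p):
    """Binomial probability of k responses out of n at response rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def sample_space(r1, n1, n2):
    """Set A (first-stage stops, stored as (x1, 0)) followed by set G."""
    A = [(x1, 0) for x1 in range(r1 + 1)]
    G = [(x1, x2) for x1 in range(r1 + 1, n1 + 1) for x2 in range(n2 + 1)]
    return A + G

def point_prob(pt, r1, n1, n2, p):
    """Probability of one sample point; no second-stage factor after an early stop."""
    x1, x2 = pt
    first = pmf(x1, n1, p)
    return first if x1 <= r1 else first * pmf(x2, n2, p)

def lower_limit(region, r1, n1, n2, alpha=0.05, iters=60):
    """Assumed limit: inf{p : P(region | p) > alpha}, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(point_prob(q, r1, n1, n2, mid) for q in region) > alpha:
            hi = mid
        else:
            lo = mid
    return lo

def limits_R(r1, n1, n2):
    """Lower limits under the total-response ordering."""
    pts = sample_space(r1, n1, n2)
    return {x: lower_limit([q for q in pts if sum(q) >= sum(x)], r1, n1, n2)
            for x in pts}

def limits_I(r1, n1, n2):
    """Lower limits from inverting the tail probability of Omega(x1, x2)."""
    pts = sample_space(r1, n1, n2)
    return {x: lower_limit([q for q in pts if q[0] >= x[0] and sum(q) >= sum(x)],
                           r1, n1, n2) for x in pts}

def coverage(limits, r1, n1, n2, p):
    """Equation (5): probability of the points whose interval (L, 1] contains p."""
    return sum(point_prob(x, r1, n1, n2, p) for x, L in limits.items() if L < p)

r1, n1, n2 = 1, 8, 10  # hypothetical toy design
grid = [i / 200 for i in range(1, 200)]
lim_R, lim_I = limits_R(r1, n1, n2), limits_I(r1, n1, n2)
min_cov_R = min(coverage(lim_R, r1, n1, n2, p) for p in grid)
min_cov_I = min(coverage(lim_I, r1, n1, n2, p) for p in grid)
print(min_cov_R, min_cov_I)
```

Because the regions of the R approach are nested, the minimum coverage over the grid stays at or above 1 − α; the regions of the I approach are not nested, so no such guarantee applies to its minimum.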
We present the following theorem to show the relationship between the two exact limits and the limit based on the I approach.
Theorem 2.2. The two proposed exact lower limits are always less than or equal to that based on the I approach.
Proof of this theorem is given in Appendix A. This theorem shows that the exact lower limits, LR(x1,x2) and LPV (x1,x2), are always less than or equal to LI(x1,x2). We further compare the performance of the exact intervals based on the R approach and the PV approach in the next section.
3. Results
We first use a two-stage design to illustrate the application of the three approaches to construct lower limits, then compare the proposed limits with regard to expected length (EL) and simple average length (AL) of the intervals from a range of two-stage designs.
3.1. An example
Suppose a study is designed by using Simon’s two-stage minimax design with π0 = 0.1, π1 = 0.2, to attain 80% power at the significance level of 0.05. The minimax two-stage design parameters are (r1,rt,n1,n2) = (4,12,45,33), and the trial is stopped for futility at the first stage if 4 or fewer responses are observed out of n1 = 45 participants in that stage. As an example, if the observed data are (x1,x2) = (8,5), the 95% one-sided lower intervals based on the I approach, the R approach, and the PV approach are (0.110,1], (0.102,1], and (0.103,1], respectively, and they all reject the null hypothesis as π0 = 0.1 is not included in these intervals. If the observed data are (x1,x2) = (8,4) with one fewer response observed in the second stage, the 95% one-sided lower intervals are (0.103,1], (0.092,1], and (0.096,1] for the I approach, the R approach, and the PV approach, respectively. The two exact approaches fail to reject the null hypothesis, while the I approach results in the rejection of the null hypothesis since its lower limit is greater than π0. Moreover, the total number of responses in this case is 12, which falls in the non-rejection region; therefore, the traditional interval based on the I approach is anti-conservative for this case.
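To give a sense of scale for this design, the short Python sketch below enumerates its sample space and counts how many sample points share each total number of responses. The design parameters come from the example above; the variable names and the convention of storing first-stage stops as (x1, 0) are choices made for this illustration.

```python
from collections import Counter

r1, rt, n1, n2 = 4, 12, 45, 33  # minimax design from the example

# Set A: first-stage stops; set G: trials that continue to the second stage
A = [(x1, 0) for x1 in range(r1 + 1)]
G = [(x1, x2) for x1 in range(r1 + 1, n1 + 1) for x2 in range(n2 + 1)]
omega = A + G

size = len(omega)
totals = Counter(x1 + x2 for x1, x2 in omega)

print(size)         # (r1 + 1) + (n1 - r1)(n2 + 1) = 5 + 41 * 34 = 1399
print(len(totals))  # 79 distinct totals, from 0 to n1 + n2 = 78
print(totals[12], totals[13])  # points tied at 12 and at 13 total responses
```

Under the R approach, all points that share a total receive the same lower limit, so the 1399 sample points can take at most 79 distinct limit values. This is the source of the ties that the PV ordering is meant to break.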
In addition to the comparison between three confidence intervals for two particular sample points, we also compare the coverage probability for all possible π values from 0 to 1. The coverage probability for each interval can be computed by using Equation (5). Figure 1 displays the coverage probability for the three intervals for the minimax two-stage design (r1,rt,n1,n2) = (4,12,45,33). It can be seen that the two proposed approaches, the R approach and the PV approach, guarantee the coverage probability, while the traditionally used I approach does not. In fact, with the I approach, the coverage probability for the majority of sample points is lower than the nominal coverage. The minimum coverage for the I approach can be as low as 91.4% which is much less than the nominal level of 95%. It should be noted that all three approaches have the same lower limits for sample points in set A, and their lower limits are smaller than those in set G. For that reason, the coverage probability plots near the left corner are the same for the three approaches.
Figure 1.
Coverage probability comparison among the three intervals for the minimax Simon two-stage design (r1,rt,n1,n2) = (4,12,45,33) given π0 = 0.1, π1 = 0.2, α = 0.05, and β = 0.2.
3.2. Numerical study
The I approach is not included in the numerical comparison due to its unsatisfactory coverage property, and the two exact approaches (the R approach and the PV approach) are compared with regard to EL and AL of the intervals. As a general rule, an interval with the shortest length among all exact intervals is preferable. The EL is often used to compare confidence interval performance [20,21], and it is calculated as
$$EL(\pi) = \sum_{(x_1,x_2)\in G} [1 - L(x_1,x_2)]\, b(x_1,n_1,\pi)\, b(x_2,n_2,\pi),$$
where b(.) represents the probability mass function of a binomial distribution. The length for each sample point is weighted by its probability at π, and adding these weighted lengths together is the EL at π. The EL is a function of π, where π ∈ [0,1]. The exact intervals are evaluated over 2000 uniformly distributed π values between 0 and 1. For each π, we first compute ELR(π) and ELPV (π) based on the R approach and the PV approach, respectively. Subsequently, their ratio is calculated as Ratio(π) = ELPV (π)/ELR(π). For the 2000 ratio values, we compute the proportion of {π : Ratio(π) < 1} as Σ I(Ratio(π) < 1)/2000, where the sum is over the 2000 π values and I(K) is an indicator function with I(K) = 1 if K is true and 0 otherwise. This is the proportion where the PV approach has a shorter expected length than the R approach. The higher the proportion is, the better the performance of the PV approach.
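The expected length comparison can be sketched in Python as follows. Both orderings are expressed through a score per sample point (the total number of responses for the R approach, the negative p-value for the PV approach), so that one routine builds either set of limits. The toy design (r1, n1, n2) = (1, 8, 10) with π0 = 0.2, the 199-point grid (instead of 2000 values), the restriction of the sum to set G, and the infimum form of the limits are assumptions of this sketch.

```python
from math import comb

def pmf(k, n, p):
    """Binomial probability of k responses out of n at response rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def sample_space(r1, n1, n2):
    """Set A (first-stage stops, stored as (x1, 0)) followed by set G."""
    A = [(x1, 0) for x1 in range(r1 + 1)]
    G = [(x1, x2) for x1 in range(r1 + 1, n1 + 1) for x2 in range(n2 + 1)]
    return A + G

def point_prob(pt, r1, n1, n2, p):
    """Probability of one sample point; no second-stage factor after an early stop."""
    x1, x2 = pt
    first = pmf(x1, n1, p)
    return first if x1 <= r1 else first * pmf(x2, n2, p)

def lower_limit(region, r1, n1, n2, alpha=0.05, iters=60):
    """Assumed limit: inf{p : P(region | p) > alpha}, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(point_prob(q, r1, n1, n2, mid) for q in region) > alpha:
            hi = mid
        else:
            lo = mid
    return lo

def limits_by_score(score, pts, r1, n1, n2, tol=1e-12):
    """Limits when the region of x is {points with score at least that of x}."""
    return {x: lower_limit([q for q in pts if score[q] >= score[x] - tol],
                           r1, n1, n2) for x in pts}

def expected_length(limits, r1, n1, n2, p):
    """Sum over set G of the interval length 1 - L weighted by its probability at p."""
    return sum((1 - L) * point_prob(x, r1, n1, n2, p)
               for x, L in limits.items() if x[0] > r1)

r1, n1, n2, pi0 = 1, 8, 10, 0.2  # hypothetical toy design
pts = sample_space(r1, n1, n2)
total_score = {x: float(sum(x)) for x in pts}
# Negative p-value at pi0, so that a smaller p-value means a larger score
pv_score = {x: -sum(point_prob(q, r1, n1, n2, pi0) for q in pts
                    if q[0] >= x[0] and sum(q) >= sum(x)) for x in pts}
lim_R = limits_by_score(total_score, pts, r1, n1, n2)
lim_PV = limits_by_score(pv_score, pts, r1, n1, n2)

grid = [i / 200 for i in range(1, 200)]
ratios = [expected_length(lim_PV, r1, n1, n2, p) /
          expected_length(lim_R, r1, n1, n2, p) for p in grid]
print(sum(r < 1 for r in ratios) / len(ratios))  # share of grid with Ratio < 1
```

Including or excluding set A does not change whether the ratio is below 1, because both approaches give identical limits there and the same term would be added to numerator and denominator.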
We compare the R approach and the PV approach under various design parameters for the optimal design in Table 1 and the minimax design in Table 2. The designs are calculated to ensure 80% or 90% power to detect π1 − π0 = 0.2 when π0 = 0.05, 0.1, 0.2, 0.3, ··· , and 0.7 at α = 0.05. For the 16 total cases in Table 1, the proportion of {π : Ratio(π) < 1} is at least 77%, and in 10 out of the 16 total cases, the proportion is over 99%. We present some typical plots for the ratio when π0 = 0.4, 0.5, 0.6, and 0.7 given α = 0.05 and β = 0.2 in Figure 2. Their associated proportions are 99%, 92%, 83%, and 77%. In the case of π0 = 0.4, the PV approach always has a shorter expected length than the R approach at all π values from 0 to 1. The performance of the R approach improves as π0 goes up. The R approach has better performance than the PV approach in 23% of the π values in the case of π0 = 0.7. These plots have a similar pattern: the R approach has better performance than the PV approach at the values that are close to π0. For the 16 minimax designs in Table 2, the minimum proportion is 92%, which occurs in only one case. For all the remaining 15 cases, the proportion is over 99%, demonstrating better performance of the PV approach.
Table 1.
Average length comparison between the R approach and the PV approach for Simon’s optimal two-stage designs at α = 0.05 when π1 − π0 = 0.2.
π0 | π1 | Power | ALR | ALPV |
---|---|---|---|---|
0.05 | 0.25 | 0.8 | 0.670 | 0.641 |
0.05 | 0.25 | 0.9 | 0.628 | 0.604 |
0.1 | 0.3 | 0.8 | 0.615 | 0.582 |
0.1 | 0.3 | 0.9 | 0.596 | 0.564 |
0.2 | 0.4 | 0.8 | 0.571 | 0.527 |
0.2 | 0.4 | 0.9 | 0.560 | 0.516 |
0.3 | 0.5 | 0.8 | 0.542 | 0.486 |
0.3 | 0.5 | 0.9 | 0.522 | 0.465 |
0.4 | 0.6 | 0.8 | 0.510 | 0.443 |
0.4 | 0.6 | 0.9 | 0.489 | 0.420 |
0.5 | 0.7 | 0.8 | 0.476 | 0.406 |
0.5 | 0.7 | 0.9 | 0.450 | 0.375 |
0.6 | 0.8 | 0.8 | 0.438 | 0.375 |
0.6 | 0.8 | 0.9 | 0.416 | 0.342 |
0.7 | 0.9 | 0.8 | 0.453 | 0.408 |
0.7 | 0.9 | 0.9 | 0.366 | 0.302 |
Table 2.
Average length comparison between the R approach and the PV approach for Simon’s minimax two-stage designs at α = 0.05 when π1 − π0 = 0.2.
π0 | π1 | Power | ALR | ALPV |
---|---|---|---|---|
0.05 | 0.25 | 0.8 | 0.672 | 0.646 |
0.05 | 0.25 | 0.9 | 0.642 | 0.617 |
0.1 | 0.3 | 0.8 | 0.625 | 0.595 |
0.1 | 0.3 | 0.9 | 0.598 | 0.570 |
0.2 | 0.4 | 0.8 | 0.569 | 0.527 |
0.2 | 0.4 | 0.9 | 0.556 | 0.515 |
0.3 | 0.5 | 0.8 | 0.539 | 0.485 |
0.3 | 0.5 | 0.9 | 0.534 | 0.479 |
0.4 | 0.6 | 0.8 | 0.401 | 0.369 |
0.4 | 0.6 | 0.9 | 0.485 | 0.421 |
0.5 | 0.7 | 0.8 | 0.451 | 0.384 |
0.5 | 0.7 | 0.9 | 0.452 | 0.377 |
0.6 | 0.8 | 0.8 | 0.446 | 0.375 |
0.6 | 0.8 | 0.9 | 0.424 | 0.346 |
0.7 | 0.9 | 0.8 | 0.250 | 0.210 |
0.7 | 0.9 | 0.9 | 0.363 | 0.295 |
Figure 2.
Expected length ratio between the PV approach and the R approach, ELPV /ELR for the optimal designs to attain 80% power at α = 0.05 when π0 =0.4, 0.5, 0.6, and 0.7, and π1=π0+0.2. The PV approach has a shorter expected length than the R approach when the ratio is less than 1.
The other measurement to compare confidence interval methods is the simple average length which assigns an equal weight to all sample points in its calculation. The AL is defined as
$$AL = \frac{1}{N} \sum_{(x_1,x_2)\in G} [1 - L(x_1,x_2)],$$
where N is the number of sample points (X1,X2) in set G. As mentioned above, the confidence interval is the same for both approaches for any sample point in set A in Simon’s design. Therefore, these sample points are excluded from our calculations of interval lengths. The optimal design in Table 1 and the minimax design in Table 2 are also used in the AL comparison between the two exact approaches. As can be seen from Table 1 for the optimal designs, the exact interval based on the PV approach always has a shorter average length than that based on the R approach, and the length saving for the PV approach, (ALR − ALPV )/ALR, ranges from 4% to 22%. Similar results are observed for the comparison between the two exact approaches for designs obtained under the minimax criterion in Table 2. The PV approach performs better than the R approach with regard to simple average length.
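The simple average length and the relative length saving can be sketched in Python in the same spirit. The lower limits are rebuilt for a hypothetical toy design (r1, n1, n2) = (1, 8, 10) with π0 = 0.2, so the printed values are not comparable to Tables 1 and 2. The infimum form of the limits and the helper names are assumptions of this sketch.

```python
from math import comb

def pmf(k, n, p):
    """Binomial probability of k responses out of n at response rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def sample_space(r1, n1, n2):
    """Set A (first-stage stops, stored as (x1, 0)) followed by set G."""
    A = [(x1, 0) for x1 in range(r1 + 1)]
    G = [(x1, x2) for x1 in range(r1 + 1, n1 + 1) for x2 in range(n2 + 1)]
    return A + G

def point_prob(pt, r1, n1, n2, p):
    """Probability of one sample point; no second-stage factor after an early stop."""
    x1, x2 = pt
    first = pmf(x1, n1, p)
    return first if x1 <= r1 else first * pmf(x2, n2, p)

def lower_limit(region, r1, n1, n2, alpha=0.05, iters=60):
    """Assumed limit: inf{p : P(region | p) > alpha}, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(point_prob(q, r1, n1, n2, mid) for q in region) > alpha:
            hi = mid
        else:
            lo = mid
    return lo

def limits_by_score(score, pts, r1, n1, n2, tol=1e-12):
    """Limits when the region of x is {points with score at least that of x}."""
    return {x: lower_limit([q for q in pts if score[q] >= score[x] - tol],
                           r1, n1, n2) for x in pts}

def average_length(limits, r1):
    """Unweighted mean of the interval length 1 - L over set G (x1 > r1)."""
    lengths = [1 - L for (x1, _), L in limits.items() if x1 > r1]
    return sum(lengths) / len(lengths)

r1, n1, n2, pi0 = 1, 8, 10, 0.2  # hypothetical toy design
pts = sample_space(r1, n1, n2)
total_score = {x: float(sum(x)) for x in pts}
# Negative p-value at pi0, so that a smaller p-value means a larger score
pv_score = {x: -sum(point_prob(q, r1, n1, n2, pi0) for q in pts
                    if q[0] >= x[0] and sum(q) >= sum(x)) for x in pts}

al_R = average_length(limits_by_score(total_score, pts, r1, n1, n2), r1)
al_PV = average_length(limits_by_score(pv_score, pts, r1, n1, n2), r1)
print(al_R, al_PV, (al_R - al_PV) / al_R)  # last value is the relative saving
```

Every point in set G carries the same weight 1/N here, in contrast to the expected length, where each point is weighted by its probability at a given π.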
4. Discussion
We propose a new exact one-sided confidence limit for the probability of response based on p-value ordering of the sample space in a two-stage design. We compare the proposed PV approach with the existing I approach that computes the confidence interval by inverting the p-value function, and the R approach that uses the total number of responses for the stochastic ordering. The I approach does not guarantee the coverage probability as seen in Figure 1. Therefore, it is not recommended for use. Under the simple AL criterion, the PV approach performs better than the R approach since the PV approach always has a shorter average length. Under the expected length criterion, while the R approach performs slightly better than the PV approach when the probability of response is close to π0, the PV approach generally performs better in other cases.
Chen [22] proposed an optimal three-stage design that allows a study to be stopped early for futility or efficacy during the first two stages. However, the ordering of the sample space becomes more complicated in a three-stage design setting. For example, it is not easy to order two sample points where one is stopped in the first stage and the other is stopped in the second, but they have a similar probability of response. One possible solution to sort sample points is to adopt an inductive order as proposed by Wang [23]. It should be noted that this approach could be very computationally intensive in a multi-stage design setting.
When an adaptive two-stage method is used for a clinical trial design [3,5,6,24,25], the second stage sample size depends on the number of responses observed from the first stage. At the end of the study, the number of responses (X1,X2) and sample sizes (n1,n2) are recorded, and the associated exact confidence limits for the probability of response for this study can be computed by using the proposed approach. We consider this as future work to develop exact one-sided intervals for adaptive two-stage designs.
The two-stage design discussed in this article only allows early stopping due to futility in the first stage, and the hypotheses are one-sided. For these reasons, the lower limit can be used for statistical inference. In the case that a two-stage design allows early stopping due to either futility or efficacy (e.g., sequential designs [14,16,26]), one may need to compute both the lower limit and the upper limit. The upper limit can be computed by using an approach similar to the one presented in this article for the lower limit. It should be noted that the two-sided interval could be conservative when both the lower and the upper limits guarantee their nominal confidence levels.
Acknowledgment
The author is very grateful to the Editor, the Associate Editor and two reviewers for their insightful comments that improved the manuscript.
Funding
Shan’s research is partially supported by grants from the National Institute of General Medical Sciences from the National Institutes of Health: P20GM109025.
Appendix A: Proof of Theorem 2.2
Proof. For any sample point (x1,x2) in set G, it is easy to show that ΩR(x1,x2) = {(X1,X2) : X1+X2 ≥ x1+x2} ⊃ {(X1,X2) : X1 ≥ x1, X1+X2 ≥ x1+x2} = ΩI(x1,x2).
Thus, we have LR(x1,x2) ≤ LI(x1,x2). For any sample point in set A, the R approach and the I approach yield the same lower limits. Thus, LR(x1,x2) ≤ LI(x1,x2) is true for any sample point in the sample space.
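Now we prove the relationship between LPV (x1,x2) and LI(x1,x2). For any sample point (X1,X2) in the rejection region based on the I approach, ΩI(x1,x2), we have
$$X_1 \ge x_1 \quad \text{and} \quad X_1 + X_2 \ge x_1 + x_2.$$
It follows that
$$\Omega(X_1,X_2) \subseteq \Omega(x_1,x_2).$$
Then, by the definition of the p-value from Equation (1), we have
$$P(X_1,X_2 \mid \pi_0) \le P(x_1,x_2 \mid \pi_0)$$
and
$$(X_1,X_2) \in \Omega_{PV}(x_1,x_2).$$
Therefore, the rejection region based on the I approach is a subset of that based on the PV approach,
$$\Omega_I(x_1,x_2) \subseteq \Omega_{PV}(x_1,x_2).$$
Similar to the first part of this proof, it is easy to show that LPV (x1,x2) ≤ LI(x1,x2) is true for any sample point in the sample space. □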
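The two set inclusions used in this proof can be checked numerically on a small design. The Python sketch below builds ΩI, ΩR, and ΩPV for every sample point of a hypothetical toy design (r1, n1, n2) = (1, 5, 5) with π0 = 0.2 and tests whether ΩI is contained in each of the other two. The tolerance used when comparing floating-point p-values is an assumption of the sketch, and a numerical check on one design illustrates the argument without replacing it.

```python
from math import comb

def pmf(k, n, p):
    """Binomial probability of k responses out of n at response rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

r1, n1, n2, pi0 = 1, 5, 5, 0.2  # hypothetical toy design

# Set A stored as (x1, 0), followed by set G
pts = [(x1, 0) for x1 in range(r1 + 1)]
pts += [(x1, x2) for x1 in range(r1 + 1, n1 + 1) for x2 in range(n2 + 1)]

def point_prob(pt, p):
    """Probability of one sample point; no second-stage factor after an early stop."""
    x1, x2 = pt
    first = pmf(x1, n1, p)
    return first if x1 <= r1 else first * pmf(x2, n2, p)

def omega_I(x):
    """Region of the I approach: X1 >= x1 and X1 + X2 >= x1 + x2."""
    return {q for q in pts if q[0] >= x[0] and sum(q) >= sum(x)}

def omega_R(x):
    """Region of the R approach: total responses at least x1 + x2."""
    return {q for q in pts if sum(q) >= sum(x)}

# p-value of each point at pi0, as in Equation (1)
pv = {x: sum(point_prob(q, pi0) for q in omega_I(x)) for x in pts}

def omega_PV(x, tol=1e-12):
    """Region of the PV approach: p-value no larger than that of x."""
    return {q for q in pts if pv[q] <= pv[x] + tol}

# Python's <= on sets tests the subset relation
ok_R = all(omega_I(x) <= omega_R(x) for x in pts)
ok_PV = all(omega_I(x) <= omega_PV(x) for x in pts)
print(ok_R, ok_PV)
```

A larger region has a larger probability at every π, so its tail probability exceeds α at a smaller π; this is how each inclusion translates into the inequality between the lower limits.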
References
- [1]. Simon R. Optimal two-stage designs for phase II clinical trials. Controlled clinical trials. 1989 Mar;10(1):1–10. Available from: http://view.ncbi.nlm.nih.gov/pubmed/2702835.
- [2]. Mander AP, Thompson SG. Two-stage designs optimal under the alternative hypothesis for phase II cancer clinical trials. Contemporary clinical trials. 2010 Nov;31(6):572–578. Available from: http://view.ncbi.nlm.nih.gov/pubmed/20678585.
- [3]. Banerjee A, Tsiatis AA. Adaptive two-stage designs in phase II clinical trials. Statistics in medicine. 2006 Oct;25(19):3382–3395. Available from: http://view.ncbi.nlm.nih.gov/pubmed/16479547.
- [4]. Shan G, Ma C, Hutson AD, et al. Randomized Two-Stage Phase II Clinical Trial Designs Based on Barnard’s Exact Test. Journal of Biopharmaceutical Statistics. 2013 Aug;23(5):1081–1090. Available from: 10.1080/10543406.2013.813525.
- [5]. Shan G, Wilding GE, Hutson AD, et al. Optimal adaptive two-stage designs for early phase II clinical trials. Statistics in Medicine. 2016 Apr;35(8):1257–1266. Available from: 10.1002/sim.6794.
- [6]. Shan G, Wilding GE, Hutson AD. Computationally Intensive Two-Stage Designs for Clinical Trials. 2017:1–7. Available from: 10.1002/9781118445112.stat07986.
- [7]. Shan G, Chen JJ. Optimal inference for Simon’s two-stage design with over or under enrollment at the second stage. Communications in Statistics - Simulation and Computation. 2017 Mar:1–11. Available from: 10.1080/03610918.2017.1307398.
- [8]. Koyama T, Chen H. Proper inference from Simon’s two-stage designs. Statistics in medicine. 2008 Jul;27(16):3145–3154. Available from: http://view.ncbi.nlm.nih.gov/pubmed/17960777.
- [9]. Zeng D, Gao F, Hu K, et al. Hypothesis testing for two-stage designs with over or under enrollment. Statist Med. 2015 Jul;34(16):2417–2426. Available from: 10.1002/sim.6490.
- [10]. Wang W, Shan G. Exact confidence intervals for the relative risk and the odds ratio. Biometrics. 2015 Dec;71(4):985–995. Available from: 10.1111/biom.12360.
- [11]. Shan G, Wang W. ExactCIdiff: An R Package for Computing Exact Confidence Intervals for the Difference of Two Proportions. The R Journal. 2013;5(2):62–71.
- [12]. Clopper CJ, Pearson ES. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika. 1934 Dec;26(4):404–413. Available from: 10.1093/biomet/26.4.404.
- [13]. Jennison C, Turnbull BW. Confidence Intervals for a Binomial Parameter Following a Multistage Test With Application to MIL STD 105D and Medical Trials. Technometrics. 1983 Feb;25(1):49–58. Available from: 10.1080/00401706.1983.10487819.
- [14]. Duffy DE, Santner TJ. Confidence Intervals for a Binomial Parameter Based on Multistage Tests. Biometrics. 1987;43(1):81–93.
- [15]. Jung SHH, Kim KMM. On the estimation of the binomial probability in multistage clinical trials. Statistics in medicine. 2004 Mar;23(6):881–896. Available from: 10.1002/sim.1653.
- [16]. Jennison C, Turnbull BW. Group Sequential Methods (Chapman & Hall/CRC Interdisciplinary Statistics). 1st ed. Chapman and Hall/CRC; 1999. Available from: http://www.worldcat.org/isbn/0849303168.
- [17]. Girshick MA, Mosteller F, Savage LJ. Unbiased Estimates for Certain Binomial Sampling Problems with Applications. The Annals of Mathematical Statistics. 1946 Mar;17(1):13–23. Available from: 10.1214/aoms/1177731018.
- [18]. Jung SHH. Statistical issues for design and analysis of single-arm multi-stage phase II cancer clinical trials. Contemporary clinical trials. 2015 May;42:9–17. Available from: http://view.ncbi.nlm.nih.gov/pubmed/25749311.
- [19]. Jovic G, Whitehead J. An exact method for analysis following a two-stage phase II cancer clinical trial. Statistics in medicine. 2010 Dec;29(30):3118–3125. Available from: http://view.ncbi.nlm.nih.gov/pubmed/21170906.
- [20]. Newcombe RG. Confidence Intervals for Proportions and Related Measures of Effect Size (Chapman & Hall/CRC Biostatistics Series). CRC Press; 2012. Available from: http://www.worldcat.org/isbn/1439812780.
- [21]. Fagerland M, Lydersen S, Laake P. Statistical Analysis of Contingency Tables. Boca Raton, FL: Chapman and Hall/CRC; 2017.
- [22]. Chen TT. Optimal three-stage designs for phase II cancer clinical trials. Statistics in medicine. 1997 Dec;16(23):2701–2711. Available from: http://view.ncbi.nlm.nih.gov/pubmed/9421870.
- [23]. Wang W. On construction of the smallest one-sided confidence interval for the difference of two proportions. The Annals of Statistics. 2010 Apr;38(2):1227–1243. Available from: 10.1214/09-aos744.
- [24]. Shan G, Zhang H, Jiang T. Minimax and admissible adaptive two-stage designs in phase II clinical trials. BMC Medical Research Methodology. 2016 Aug;16(1):90+. Available from: 10.1186/s12874-016-0194-3.
- [25]. Shan G, Ma C. Unconditional tests for comparing two ordered multinomials. Statistical methods in medical research. 2016 Feb;25(1):241–254. Available from: 10.1177/0962280212450957.
- [26]. Fleming TR. One-sample multiple testing procedure for phase II clinical trials. Biometrics. 1982;38(1):143–151. Available from: 10.2307/2530297.