Summary
Conventionally, evaluation of a new drug, A, is done in three phases. Phase I is based on toxicity to determine a “maximum tolerable dose” (MTD) of A, phase II is conducted to decide whether A at the MTD is promising in terms of response probability, and if so a large randomized phase III trial is conducted to compare A to a control treatment, C, usually based on survival time or progression free survival time. It is widely recognized that this paradigm has many flaws. A recent approach combines the first two phases by conducting a phase I-II trial, which chooses an optimal dose based on both efficacy and toxicity, and evaluation of A at the selected optimal phase I-II dose then is done in a phase III trial. This paper proposes a new design paradigm, motivated by the possibility that the optimal phase I-II dose may not maximize mean survival time with A. We propose a hybridized design, which we call phase I-II/III, that combines phase I-II and phase III by allowing the chosen optimal phase I-II dose of A to be re-optimized based on survival time data from phase I-II patients and the first portion of phase III. The phase I-II/III design uses adaptive randomization in phase I-II, and relies on a mixture model for the survival time distribution as a function of efficacy, toxicity, and dose. A simulation study is presented to evaluate the phase I-II/III design and compare it to the usual approach that does not re-optimize the dose of A in phase III.
Keywords: Bayesian Design, Clinical Trial, Dose Finding, Phase I-II Clinical Trial, Phase III Clinical Trial
1. Introduction
After a new treatment agent, A, is identified in pre-clinical studies, conventional clinical drug development and evaluation is carried out in three phases (Cancer.org, 2018). In phase I, the aim is to identify a dose, called the “maximum tolerable dose” (MTD), having acceptable toxicity probability. Phase I trials typically are small, with a wide variety of designs, including the 3+3 algorithm (Storer, 1989), continual reassessment method (O’Quigley et al., 1990), and escalation with overdose control (Babb et al., 1998). Efficacy of A at the MTD then is evaluated in phase II using the estimated probability πE of a short-term event (“response”), such as 50% solid tumor shrinkage or complete remission of leukemia. Most phase II designs compare πE(A) with A at the MTD to an assumed πE(C) of a conventional therapy, C. Phase II trials often are small, and may include an early stopping rule if πE(A) is poor compared to πE(C). If A is found to be promising in phase II, this may motivate a randomized phase III trial of A versus C based on a long term outcome, such as survival time.
Many phase II designs have been published. Simon et al. (1985) proposed a randomized selection design for two or more experimental treatments. For single-arm phase II trials, two-stage designs were proposed by Simon (1989) based on response, and by Bryant and Day (1995) based on response and toxicity. Bayesian sequential designs were proposed by Thall and Simon (1994) for a binary response, and by Thall et al. (1995) for monitoring multiple outcomes. Lee and Liu (2005) used predictive probabilities for futility rules, and Yin et al. (2012) used adaptive randomization to favor empirically better treatment arms.
It now is recognized widely that the conventional phase I → phase II → phase III paradigm has many flaws, and has led to many negative phase III trials. Two studies (Bio, 2016; Arrowsmith, 2011) showed that only about 50% of phase III trials yield an improvement over standard therapy. Seruga, et al. (2015) discussed causes of failure in phase III, including insufficient evidence of anti-disease activity in early phase trials, disagreements about how phase II trials should be designed, and reliance on phase II efficacy events or other surrogates not associated with longer survival. Yuan, Nguyen, and Thall (2016, Chapter 1) discuss problems with the conventional phase I → phase II paradigm, mainly due to limited sample sizes and ignoring efficacy when determining an MTD in phase I.
Many alternatives have been proposed that create hybrid designs by combining conventional phases, most commonly phase I-II or phase II-III. Thall (2008) reviewed phase II-III designs and discussed problems with the conventional phase II → phase III paradigm. “Select-and-test” phase II-III designs, where two or more experimental agents are chosen in phase II and randomized against C in phase III while maintaining desired overall type I and type II error rates, are given by Thall et al. (1988), Schaid et al. (1990), Stallard and Todd (2003), and many others. A phase II-III design proposed by Inoue, Thall and Berry (2004) uses both an early efficacy (response) indicator, YE, and survival time, YS. Denote πE = Pr(YE = 1), the probability density function (pdf) of YS by fS(t), and the conditional pdf of [YS | YE] by fS(t | YE = y) for y = 0, 1. Their approach relies on a mixture model of the general form
$$f_S(t) = \pi_E\, f_S(t \mid Y_E = 1) + (1 - \pi_E)\, f_S(t \mid Y_E = 0). \tag{1}$$
Denote the indicator of early toxicity by YT and πT = Pr(YT = 1). Because phase I designs use YT but ignore YE when choosing an MTD, they are likely to choose a dose having reasonable πT but ineffectively low πE. For example, consider a dose-finding scenario with five doses, true toxicity probabilities (.05, .10, .20, .30, .35), and true efficacy probabilities (.05, .10, .20, .30, .60). If the CRM is used with target toxicity probability .30, this most likely will select dose 4 as optimal. Because YE is ignored, however, dose 5 is chosen less frequently, even though its πT is only .05 higher than that of dose 4 while its πE doubles from .30 to .60. Phase I-II designs are motivated, in part, by the desire to overcome this sort of problem. Examples include the two-stage design of Hoering et al. (2011), studying combination therapies (Huang et al., 2006), using the odds ratio between πE and πT (Yin et al., 2006), and basing decisions on elicited numerical utilities of the possible elementary events determined by efficacy and toxicity (Thall and Nguyen, 2012). Thall and Cook (2004) proposed, and Thall et al. (2014) refined, the so-called “Eff-Tox” phase I-II design based on maximizing an estimate of an efficacy-toxicity trade-off function, ϕ(πE, πT). The function ϕ(πE, πT) increases in πE, decreases in πT, and quantifies the desirability of each probability pair (πE, πT).
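The five-dose example can be checked with a short sketch. It idealizes a toxicity-only rule by assuming the true toxicity probabilities are known and picking the dose closest to the target, which is a simplification made only for illustration and not the CRM itself; the helper name `closest_to_target` is hypothetical.

```python
# True probabilities from the five-dose example in the text.
TRUE_TOX = [0.05, 0.10, 0.20, 0.30, 0.35]
TRUE_EFF = [0.05, 0.10, 0.20, 0.30, 0.60]


def closest_to_target(tox_probs, target):
    """Return the zero-based index of the dose whose toxicity
    probability is closest to the target (idealized toxicity-only rule)."""
    return min(range(len(tox_probs)), key=lambda j: abs(tox_probs[j] - target))


chosen = closest_to_target(TRUE_TOX, target=0.30)   # index 3, i.e. dose 4
extra_tox = TRUE_TOX[4] - TRUE_TOX[chosen]          # toxicity cost of moving to dose 5
eff_ratio = TRUE_EFF[4] / TRUE_EFF[chosen]          # efficacy gain of moving to dose 5

print(f"toxicity-only choice: dose {chosen + 1}")
print(f"dose 5 adds {extra_tox:.2f} toxicity and multiplies efficacy by {eff_ratio:.1f}")
```

A rule that looks only at toxicity has no way to see that the small increase in πT at dose 5 buys a doubling of πE, which is the gap that phase I-II designs aim to close.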
This article presents a new Bayesian hybrid design that combines a phase I-II design followed by a modified phase III design, based on both early and late outcomes. We will call this a phase I-II/III trial design. For simplicity, we will use survival time, YS, as the long term outcome, although progression-free survival (PFS) time will work in precisely the same way. Our approach relies on a mixture model for the distribution of YS that generalizes the model (1) by including both efficacy and toxicity indicators, (YE, YT), to characterize early outcome. After phase I-II and an initial stage of phase III have been completed, the phase I-II/III design may re-optimize the dose of the experimental agent A based on mean survival time, μS. This approach hybridizes the phase I-II → phase III paradigm, in which dose-finding for A is done using (YE, YT) in phase I-II, rather than using only YT for dose-finding as in the more conventional phase I → phase II → phase III paradigm.
Our proposed phase I-II/III design has K ⩾ 3 stages. In stage 1, a phase I-II trial is conducted based on the short term binary indicators (YE, YT), including adaptive randomization (AR) among doses of A based on the dose desirability criterion ϕ. The use of AR reduces the risk of getting stuck at a suboptimal dose in phase I-II. It addresses the “exploration versus exploitation” or “stickiness” problem, which is well known in sequential analysis (Sutton and Barto, 1998; Azriel, et al., 2010). AR improves the reliability of our proposed phase I-II/III design because it obtains more data on doses that may be sub-optimal in terms of the phase I-II criterion ϕ based on (YE, YT) but optimal in terms of μS.
Denote A given at dose x by A(x). At the end of phase I-II (stage 1), an optimal dose $\hat{x}^{opt}_{ET}$ of A based on ϕ is determined. In stage 2, phase III begins with patients randomized fairly between C and $A(\hat{x}^{opt}_{ET})$, and phase I-II patients are followed to observe their times of death or last follow up. After a pre-specified number of deaths have been observed among patients receiving C or $A(\hat{x}^{opt}_{ET})$ in stage 2, all (YE,YT) and survival data of patients treated with A in stages 1 and 2 are used to determine an optimal dose $\hat{x}^{opt}_{S}$ such that $A(\hat{x}^{opt}_{S})$ maximizes μS, for use in the rest of phase III. The re-optimized dose $\hat{x}^{opt}_{S}$ may or may not be the same as $\hat{x}^{opt}_{ET}$. Stages 3, …, K are a randomized group sequential trial with up to K − 2 tests comparing the mean survival times of $A(\hat{x}^{opt}_{S})$ versus C. To provide a concrete illustration, for stage 1 we use the Eff-Tox phase I-II design of Thall et al. (2014), extended to include AR.
The rest of the paper is organized as follows. In Section 2, the data structure, models, and decision criteria are presented. Section 3 presents details of trial conduct. Section 4 describes possible decisions, outcomes, and potential consequences of re-optimizing dose versus the conventional approach of using the phase I-II selected dose $\hat{x}^{opt}_{ET}$ in phase III. Section 5 presents results of a simulation study to compare the phase I-II/III design to the phase I-II → phase III paradigm. Section 6 concludes with a discussion. A computer program to implement the phase I-II/III design is available on CRAN in the package Phase123.
2. Data Structure, Models, and Decision Criteria
Given raw doses d = (d1, ⋯, dJ) of the experimental agent A, denote the standardized doses by xj for j = 1, …, J. Let $Y_S^o$ denote the observed time to death or administrative censoring, recorded together with an indicator of whether death was observed. Denote the parameters for the distribution of [YE, YT|x] by θET, and the parameters for the distribution of [YS|YE, YT, x] by θS.
The Eff-Tox design is reviewed in Web Appendix Section A. Briefly, for each m = E,T, and xj, it is assumed that πm(xj, θET) = P(Ym = 1| xj, θET) = logit−1 {ηm(xj, θET)}, with ηT(xj, θET) = τT,1 + τT,2xj and ηE(xj, θET) = τE,1 + τE,2xj + τE,3xj², with τT,2 > 0, so that πT(xj, θET) increases with xj, but πE(xj, θET) may be non-monotone. An association parameter ψ determines the joint distribution of (YE, YT) from their marginals using a copula, so θET = (τT,1, τT,2, τE,1, τE,2, τE,3, ψ). These parameters are assumed to be independent with priors ψ ~ N(0, 1), τE,3 ~ N(0, .20), and normal priors on τm,r for m = E,T and r = 1, 2. Numerical values of the hyperparameters of these normal priors for m = E, T, r = 1, 2 are determined from elicited means of πm(xj, θET), for j = 1, ⋯, J, m = E, T, and a desired prior effective sample size. Adaptive dose-finding decisions are based on a trade-off function ϕ(πE, πT) for (πE, πT) ∈ [0, 1]².
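The marginal dose-response model can be sketched in a few lines. The sketch assumes a linear ηT and a quadratic ηE in the standardized dose, consistent with the parameters (τT,1, τT,2) and (τE,1, τE,2, τE,3) listed above; the function names and the numerical parameter values are hypothetical and chosen only to show a monotone toxicity curve next to a non-monotone efficacy curve.

```python
import math


def inv_logit(eta):
    """Inverse logit: maps a linear term to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def efftox_marginals(x, tau_t, tau_e):
    """Return (pi_E, pi_T) at standardized dose x.

    tau_t = (tau_T1, tau_T2) with tau_T2 > 0 (toxicity increases with dose);
    tau_e = (tau_E1, tau_E2, tau_E3) (quadratic term allows non-monotone efficacy).
    """
    if tau_t[1] <= 0:
        raise ValueError("tau_T2 must be positive")
    eta_t = tau_t[0] + tau_t[1] * x
    eta_e = tau_e[0] + tau_e[1] * x + tau_e[2] * x ** 2
    return inv_logit(eta_e), inv_logit(eta_t)


# Hypothetical parameter values and standardized doses, for illustration only.
TAU_T = (-1.5, 1.0)
TAU_E = (0.2, 1.0, -0.5)
DOSES = [-1.0, 0.0, 1.0, 2.0]

curve = [efftox_marginals(x, TAU_T, TAU_E) for x in DOSES]
for x, (pi_e, pi_t) in zip(DOSES, curve):
    print(f"x = {x:+.1f}: pi_E = {pi_e:.3f}, pi_T = {pi_t:.3f}")
```

With these illustrative values, efficacy rises and then falls across the four doses while toxicity increases throughout.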
Denote ϕj = ϕ[E{πE(xj, θET) | data}, E{πT(xj, θET) | data}] for each dose xj at any point during phase I-II based on the current data. The estimated optimal dose in an Eff-Tox trial is the acceptable dose with the largest ϕj, denoted $\hat{x}^{opt}_{ET}$. To extend this design to include AR, rather than choosing $\hat{x}^{opt}_{ET}$ for each cohort during phase I-II, we adaptively randomize the next cohort among doses, with the probability of assigning dose xj computed from Q, the current set of posterior mean desirabilities, ϕj, of doses that are acceptably safe and efficacious. This shrinks the selection probability of less desirable doses toward 0 while allowing selection of doses that are suboptimal in terms of ϕ. After the (YE,YT) outcomes of all NET patients in phase I-II have been evaluated, $\hat{x}^{opt}_{ET}$ based on the final phase I-II data is moved forward to stage 2, which is the first portion of phase III.
In the phase I-II/III design, we define two different types of truly optimal doses of A. Let $\theta_m^{TR}$ denote an assumed true value of θm for m = ET or S. The truly optimal dose that maximizes the desirability ϕ under $\theta_{ET}^{TR}$ is denoted $x^{opt}_{ET}$. The truly optimal dose that maximizes the mean survival time μS under the true parameter values is denoted $x^{opt}_{S}$. Let k = 1, …, K index the stages of the phase I-II/III trial. Thus, k = 1 indexes the phase I-II trial, k = 2 indexes the first portion of phase III at the end of which the dose of A may be re-optimized based on μS, and k = 3, …, K index the subsequent group sequential stages in phase III for comparing A at the re-optimized dose to C. Hence there are up to K − 2 group sequential comparisons in phase III. At the end of each stage k, the data are divided into those from the phase I-II patients and those from the phase III patients. At the end of stage 1, the phase I-II data consist only of the (x, YE, YT) data from phase I-II patients, while at the end of stage 2 they also include these patients’ survival time data, up to the time at which the decision of whether to switch the dose based on mean survival time is made. No phase III data exist at the end of stage 1 because phase III has not yet begun. The re-optimized dose $\hat{x}^{opt}_{S}$ is chosen based on the combined data at the end of stage 2, which include all (x, YE, YT) and survival data.
Since the phase I-II → phase III paradigm uses the phase I-II selected dose $\hat{x}^{opt}_{ET}$ throughout phase III, the primary motivation for our design is the possibility that the truly optimal doses under ϕ and under μS differ, $x^{opt}_{ET} \neq x^{opt}_{S}$, and that re-optimizing the dose of A may produce larger μS by comparing C to A at the re-optimized dose, $A(\hat{x}^{opt}_{S})$, rather than to $A(\hat{x}^{opt}_{ET})$ in the group sequential trial. To evaluate the effects of re-optimizing the dose of A based on μS during the first part of phase III, we require models for [YE, YT|x] and [YS|YE, YT, x], in order to formulate a mixture model for [YS|x]. This will include the effects of x on the indicators (YE, YT), and the effects of (YE, YT) and x on the hazard function of YS. Let π(yE, yT|x, θET) denote the probability distribution of (YE, YT) at dose x, where (yE, yT) ∈ {0, 1}². Let fS|E,T(yS | yE, yT, x, θS) denote the conditional pdf of YS given the early binary outcomes and dose x of A. The mixture pdf of YS for patients treated with A(x) is
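$$f_S(y_S \mid x, \theta_{ET}, \theta_S) = \sum_{y_E=0}^{1} \sum_{y_T=0}^{1} f_{S|E,T}(y_S \mid y_E, y_T, x, \theta_S)\, \pi(y_E, y_T \mid x, \theta_{ET}). \tag{2}$$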
The conditional mean survival time given (yE, yT) of a patient treated with A(x) is
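$$\mu_S(y_E, y_T, x, \theta_S) = \int_0^\infty y_S\, f_{S|E,T}(y_S \mid y_E, y_T, x, \theta_S)\, dy_S. \tag{3}$$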
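At the end of stage 2 of phase I-II/III, we choose the re-optimized dose $\hat{x}^{opt}_{S}$ based on all observed data, where $\hat{x}^{opt}_{S}$ maximizes the posterior mean of the parametric mean survival time
$$\mu_S(x, \theta_{ET}, \theta_S) = \sum_{y_E=0}^{1} \sum_{y_T=0}^{1} \mu_S(y_E, y_T, x, \theta_S)\, \pi(y_E, y_T \mid x, \theta_{ET}) \tag{4}$$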
at A(x). Conventionally (YE, YT) are used as surrogates for YS in choosing a dose in phase I-II, but (YE, YT) are ignored when modeling survival in phase III.
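As a small numerical illustration of the mixture mean, the sketch below weights the conditional mean survival time for each early-outcome pair (yE, yT) by the probability of that pair at a given dose. The function name and all numerical inputs are hypothetical; in the design itself these quantities are posterior estimates rather than fixed numbers.

```python
def mixture_mean_survival(joint_probs, conditional_means):
    """Mean survival time at one dose under the four-component mixture.

    joint_probs[(yE, yT)]       : probability of the early-outcome pair at this dose
    conditional_means[(yE, yT)] : mean survival time given that pair at this dose
    """
    if abs(sum(joint_probs.values()) - 1.0) > 1e-8:
        raise ValueError("joint probabilities must sum to 1")
    return sum(joint_probs[pair] * conditional_means[pair] for pair in joint_probs)


# Hypothetical inputs for a single dose (months).
probs = {(0, 0): 0.30, (0, 1): 0.10, (1, 0): 0.45, (1, 1): 0.15}
means = {(0, 0): 18.0, (0, 1): 6.0, (1, 0): 36.0, (1, 1): 20.0}

mu_s = mixture_mean_survival(probs, means)
print(f"mixture mean survival: {mu_s:.1f} months")
```

Repeating this calculation for each dose and taking the largest value mirrors the re-optimization criterion, with posterior means in place of the fixed inputs used here.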
We assume that the distribution of [YS | YE, YT, x] has the Cox type hazard function
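$$h_S(t \mid Y_E, Y_T, x, \theta_S) = h_0(t) \exp\{\beta_1 x + \beta_2 x^2 + \beta_E Y_E + \beta_T Y_T\}. \tag{5}$$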
We assume that β1, β2, βE, and βT are independent with identical non-informative N(0, 100) priors. For robustness, we assume that the baseline hazard is piecewise exponential with h0(t) = exp(λl) for t ∈ (tl, tl+1] under the partition t0 = 0 < t1 < … < tL+1 = max($Y_S^o$). We allow the dimension L of the baseline hazard to vary, with prior L ~ Poi(ζS), and assume that the locations of the split points t are distributed as the even order statistics of a uniform sample of size 2L, as in Lee et al. (2015) and Chapple et al. (2017). This prevents obtaining intervals in h0(t) having few events for estimating λl. We suggest values of ζS ∈ {3, 4, 5, 6, 7}, since most hazard shapes can be approximated very accurately with 1 to 5 pieces. The resulting posterior distribution is not sensitive to the choice of ζS in this range for sample sizes greater than 50. We assume a normal prior with mean 0 and variance 25 for λ1, denoted λ1 ~ N(0, 25), and when L > 1 borrow strength across adjacent intervals via a smoothing prior on the remaining λl with scale parameter σλ, where the prior of σλ is proportional to 1/σλ. The variance of λ1 is large enough to accommodate hazard values seen in practice, while maintaining prior non-informativeness.
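To make the piecewise exponential hazard concrete, the sketch below evaluates h0(t) = exp(λl) on a fixed partition, the implied survival function under a proportional hazards linear term, and the mean survival time restricted to a maximum follow up time. All function names, split points, and log-hazard values are hypothetical; in the design the split points and heights are sampled by MCMC rather than fixed.

```python
import math


def baseline_hazard(t, cuts, log_haz):
    """h0(t) = exp(log_haz[l]) for t in (cuts[l], cuts[l + 1]]."""
    for l in range(len(log_haz)):
        if cuts[l] < t <= cuts[l + 1]:
            return math.exp(log_haz[l])
    raise ValueError("t lies outside the partition")


def cumulative_hazard(t, cuts, log_haz):
    """Integral of h0 from 0 to t, accumulated interval by interval."""
    total = 0.0
    for l in range(len(log_haz)):
        lo, hi = cuts[l], cuts[l + 1]
        if t <= lo:
            break
        total += math.exp(log_haz[l]) * (min(t, hi) - lo)
    return total


def survival(t, cuts, log_haz, linear_term=0.0):
    """S(t) under hazard h0(t) * exp(linear_term)."""
    return math.exp(-cumulative_hazard(t, cuts, log_haz) * math.exp(linear_term))


def restricted_mean(t_max, cuts, log_haz, linear_term=0.0):
    """Integral of S(t) from 0 to min(t_max, last cut), in closed form per interval."""
    area, s_start = 0.0, 1.0
    for l in range(len(log_haz)):
        lo, hi = cuts[l], min(cuts[l + 1], t_max)
        if hi <= lo:
            break
        rate = math.exp(log_haz[l] + linear_term)
        decay = math.exp(-rate * (hi - lo))
        area += s_start * (1.0 - decay) / rate
        s_start *= decay
    return area


# Hypothetical partition (months) and log-hazard heights.
CUTS = [0.0, 6.0, 12.0, 24.0]
LOG_HAZ = [math.log(0.05), math.log(0.10), math.log(0.02)]

print(f"S(12) = {survival(12.0, CUTS, LOG_HAZ):.4f}")
print(f"restricted mean to 24 months = {restricted_mean(24.0, CUTS, LOG_HAZ):.2f}")
```

Restricting the mean to the observed follow up window reflects the point made in the text that the means are evaluated only up to the maximum observed patient follow up time.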
Denote by $\hat{\mu}_S(x)$ the posterior mean of the mean survival time $\mu_S(x, \theta_{ET}, \theta_S)$ for A(x), given the data from phases I-II and III at the end of stage 2. We compute this quantity under the mixture model (2) by estimating the posterior mean of the conditional mean survival time $\mu_S(y_E, y_T, x, \theta_S)$ for each pair (yE, yT) ∈ {0, 1}², under the formula (3), and computing the posterior mean of each bivariate probability π(yE, yT | x, θET) under the Eff-Tox model given in Web Appendix A.
Since there will be limited survival time follow up information when the pre-specified number of stage 2 events has been reached, the design only evaluates the means up to the maximum observed patient follow up time. The trial is continued after these events using the dose of A having the highest posterior mean survival time, denoted $\hat{x}^{opt}_{S}$. After making this decision, the design does not use data from patients who were treated at doses other than $\hat{x}^{opt}_{S}$. After obtaining the numbers of events required at the group sequential looks from East 6 statistical software (2016), the number of events required for re-optimization is chosen such that the design can switch doses with high accuracy, but can still yield high power for phase III trials, given that the truly optimal dose has been selected. Suitable values of this number can be determined using the function SimPhase123 in the package Phase123. This approach will result in a larger sample size of patients in the C arm being compared to $A(\hat{x}^{opt}_{S})$ if the dose is switched. We use Markov chain Monte Carlo to obtain posterior distributions for θET and θS, using 2000 iterations and 1000 discarded as burn-in. This gives good convergence of the parameters, shown by the posterior of L settling on one or two values as well as traceplots for the parameters λ|L, s|L, and the coefficients in the linear terms of the Eff-Tox and survival hazard models. A detailed account of computational algorithms used to simulate posterior samples is given in Web Appendix B.
3. Trial Conduct
In this section, we give specific rules for conducting a phase I-II/III clinical trial. Each of the computer functions described below is contained in the R package Phase123, available on CRAN, including documentation of inputs and examples. Additional information on the trial parameters is given in Web Appendix C and a tutorial on several of the functions is given in Web Appendix D. When designing a phase I-II/III trial, the statistician should consult with the physician to establish design parameters, such as ϕ, maximum sample sizes NET and NS, and the number of comparative tests K − 2 following dose re-optimization. The group sequential boundaries for stopping the trial due to futility or superiority may be obtained using East 6 statistical software (2016), specifying a null value of μC, desired improvement Δ, type I error, power under the alternative, maximum sample size NS, and information proportions for determining the numbers of deaths at which the comparative tests are performed in stages k = 3, …, K. If no futility decision is desired at look k, then the futility bound at that look is set so that it cannot be crossed. The information proportions used to determine these numbers of deaths should be large enough (> 30%) to avoid making unreliable decisions based on a small amount of patient data if a dose is re-optimized for A.
The phase I-II/III design parameters must be calibrated to obtain good operating characteristics (OCs) under a reasonable array of possible scenarios. A smaller value of the number of deaths required for re-optimization in stage 2 (≤ 20% of the total information proportion) may be obtained by simulating the phase I-II/III trial under sets of different (a) Eff-Tox scenarios quantifying effects of x on (YE,YT), (b) effects of (x,YE,YT) on survival, and (c) survival distributions. The stage 2 sample size should be set by examining the design’s OCs for several different values, to find a value that is (1) large enough to give a high probability of selecting the optimal dose, but (2) small enough that, given that the design switches to a truly optimal dose in stage 2, it retains high generalized power. This can be done using the function SimPhase123. Specific rules for conducting a phase I-II/III trial are as follows:
1. Enroll the first cohort of patients in the phase I-II portion at the lowest dose. For each subsequent cohort until NF patients have been treated, use the function AssignEffTox to obtain the next dose to give.
2. Once NF patients have been enrolled in phase I-II, use the function RandomEffTox to adaptively randomize the next cohort of patients among acceptable doses, which allows doses that are empirically suboptimal in terms of ϕ(πE, πT) to be chosen.
3. After NET patients have been enrolled in phase I-II and their efficacy and toxicity outcomes have been evaluated, use the function AssignEffTox to obtain the dose $\hat{x}^{opt}_{ET}$ to continue to phase III.
4. Start phase III, randomizing patients equally between C and $A(\hat{x}^{opt}_{ET})$.
5. After the pre-specified number of stage 2 deaths have been observed, use the function Reoptimize to determine the dose $\hat{x}^{opt}_{S}$ to continue with for the remainder of the trial.
6. If the dose was switched, remove any patients treated with $A(\hat{x}^{opt}_{ET})$ from consideration and begin randomizing patients between C and $A(\hat{x}^{opt}_{S})$.
7. For each stage k = 3, …, K, after the pre-specified number of deaths for that stage have occurred, do two-sided tests for superiority or futility using the logrank test in R. Denoting the Z-score corresponding to the logrank statistic by Z, stop the trial if Z crosses the futility bound or the superiority bound for stage k.
8. Stop accrual after NS patients have been enrolled in the phase III portion, including patients treated with a dose that is no longer considered optimal.
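The group sequential comparison in the rules above can be illustrated with a from-scratch logrank Z-score. This is a minimal Python sketch, not the R routine or the Phase123 code; the function names are hypothetical, and the form of the stopping rule in `interim_decision` (superiority when |Z| reaches the upper bound, futility when |Z| falls at or below the lower bound) is an assumption, since the exact inequality is not reproduced in the text.

```python
import math


def logrank_z(times_a, events_a, times_c, events_c):
    """Z-score of the two-sample logrank statistic for arm A versus arm C.

    Z = sum(O_A - E_A) / sqrt(sum V) over distinct death times.
    Negative Z means fewer deaths in arm A than expected under no difference.
    """
    data = [(t, bool(e), "A") for t, e in zip(times_a, events_a)]
    data += [(t, bool(e), "C") for t, e in zip(times_c, events_c)]
    death_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp, variance = 0.0, 0.0
    for t in death_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == "A")  # at risk in A
        n_c = sum(1 for u, _, g in data if u >= t and g == "C")  # at risk in C
        d_a = sum(1 for u, e, g in data if u == t and e and g == "A")
        d_c = sum(1 for u, e, g in data if u == t and e and g == "C")
        n, d = n_a + n_c, d_a + d_c
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance is zero when one patient is at risk
            variance += d * (n_a / n) * (n_c / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("no informative death times")
    return obs_minus_exp / math.sqrt(variance)


def interim_decision(z, futility_bound, superiority_bound):
    """Assumed two-sided rule: compare |Z| with the bounds for the current stage."""
    if abs(z) >= superiority_bound:
        return "stop: superiority boundary crossed"
    if abs(z) <= futility_bound:
        return "stop: futility"
    return "continue"


# Tiny hypothetical data set: every patient dies, arm A lives longer.
z = logrank_z([5, 8], [1, 1], [2, 3], [1, 1])
print(f"Z = {z:.4f}, decision: {interim_decision(z, 0.5, 2.8)}")
```

The bounds 0.5 and 2.8 in the usage line are placeholders; in practice they would come from the group sequential software for each look.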
4. Possible Trial Outcomes
Before presenting our simulation results, we discuss possible design decisions and comment on each under different true states of nature. Because the phase I-II/III design may change the phase I-II selected dose of A in phase III before comparing A to C, the sequence of decisions that it makes may be correct and optimal, correct but suboptimal, wrong, or disastrously wrong, depending on the truly optimal doses $x^{opt}_{ET}$ and $x^{opt}_{S}$, their estimates, and whether or not A at the selected dose truly improves mean survival over C. Since more than one dose of A may provide the desired improvement in μS of at least Δ over C, we denote the set of all such doses by Xopt. We define the generalized power (GP) to be the probability of (1) selecting a dose xj ∈ Xopt in stage 2 and (2) declaring A(xj) superior to C in one of stages 3, ⋯, K. The GP is the sum over xj ∈ Xopt of the probability of selecting xj and declaring A(xj) superior to C. If Xopt contains more than one dose, then the GP is larger than the probability of the best possible decision, which is to select the optimal dose $x^{opt}_{S}$ that maximizes μS with A and declare $A(x^{opt}_{S})$ superior to C. We denote the probability of making this best decision by γ1 and the GP by γ2. Thus, γ1 ≤ γ2, with γ1 = γ2 if Xopt contains exactly one dose, which in this case must be $x^{opt}_{S}$.
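The relationship between γ1 and γ2 can be illustrated numerically. In the sketch below, the true mean survival times are those listed for the alternative case of scenario 5 in Table 4, with control mean 24 months and Δ = 12 months; the per-dose probabilities of selecting a dose and declaring it superior are hypothetical placeholders, and the function name is not part of any package.

```python
def generalized_power(select_and_win, true_means, control_mean, delta):
    """Return (gamma1, gamma2, opt_set) from per-dose joint probabilities.

    select_and_win[j]: probability of selecting dose j and declaring A superior to C.
    opt_set: zero-based indices of doses with mean survival >= control_mean + delta.
    """
    opt_set = [j for j, m in enumerate(true_means) if m >= control_mean + delta]
    best = max(range(len(true_means)), key=lambda j: true_means[j])
    gamma2 = sum(select_and_win[j] for j in opt_set)
    gamma1 = select_and_win[best] if best in opt_set else 0.0
    return gamma1, gamma2, opt_set


TRUE_MEANS = [7.8, 28.8, 44.0, 38.6, 7.4]     # months, scenario 5 alternative (Table 4)
SELECT_AND_WIN = [0.0, 0.02, 0.55, 0.20, 0.0]  # hypothetical probabilities

gamma1, gamma2, opt_set = generalized_power(SELECT_AND_WIN, TRUE_MEANS, 24.0, 12.0)
print(f"Xopt = doses {[j + 1 for j in opt_set]}")
print(f"gamma1 = {gamma1:.2f}, gamma2 = {gamma2:.2f}")
```

Here Xopt contains doses 3 and 4, so the generalized power also counts trials that succeed with dose 4, while γ1 counts only those that succeed with dose 3.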
To help sort this out, Table 1 provides explanatory comments on scenarios in stages k = 1 (phase I-II) and k = 2 (the first portion of phase III) regarding the true relationship between the optimal doses $x^{opt}_{ET}$ and $x^{opt}_{S}$ and their posterior estimates $\hat{x}^{opt}_{ET}$ and $\hat{x}^{opt}_{S}$. If $\hat{x}^{opt}_{S} = \hat{x}^{opt}_{ET}$, then the phase I-II/III and phase I-II → phase III designs make equivalent decisions. However, if $\hat{x}^{opt}_{S} \neq \hat{x}^{opt}_{ET}$, then switching provides a potential advantage. In this case survival data from patients who were treated with $A(\hat{x}^{opt}_{ET})$ during phase III are no longer relevant. Depending on the accrual rate, maximum sample size NS, and number of patient events needed to re-optimize dose, this may result in 30 to 100 patients being treated at doses no longer considered a part of the trial as phase III proceeds.
Table 1.
Selected dose $\hat{x}^{opt}_{ET}$ versus ϕ-optimal dose $x^{opt}_{ET}$ | ϕ-optimal dose $x^{opt}_{ET}$ versus μS-optimal dose $x^{opt}_{S}$ | Comments |
---|---|---|
$\hat{x}^{opt}_{ET} = x^{opt}_{ET}$ | $x^{opt}_{ET} = x^{opt}_{S}$ | The optimal dose in terms of μS was selected in phase I-II, so it is not desirable to switch doses at stage 2. In this scenario, the phase I-II/III design cannot provide an improvement over phase I-II → III. |
$\hat{x}^{opt}_{ET} = x^{opt}_{ET}$ | $x^{opt}_{ET} \neq x^{opt}_{S}$ | The dose selected in phase I-II is optimal in terms of ϕ but is not optimal in terms of μS. This illustrates the advantage of the phase I-II/III design over the phase I-II → III design. |
$\hat{x}^{opt}_{ET} \neq x^{opt}_{ET}$ | $x^{opt}_{ET} = x^{opt}_{S}$ | The dose selected in phase I-II is suboptimal based on ϕ, but the optimal doses in terms of ϕ and μS are identical. This scenario illustrates the advantage of the phase I-II/III design over the phase I-II → III design. |
$\hat{x}^{opt}_{ET} \neq x^{opt}_{ET}$ | $x^{opt}_{ET} \neq x^{opt}_{S}$ | The dose selected in phase I-II is suboptimal based on ϕ, but the optimal doses in terms of ϕ and μS are not identical. This scenario illustrates the advantage of the phase I-II/III design over the phase I-II → III design. |
After choosing the re-optimized dose $\hat{x}^{opt}_{S}$, the phase I-II/III design makes group sequential decisions comparing $A(\hat{x}^{opt}_{S})$ to C, so the decisions in phase III depend on the selected $\hat{x}^{opt}_{S}$. But it may not be the case that $\hat{x}^{opt}_{S} = x^{opt}_{S}$. That is, the design may not choose the truly optimal dose in terms of mean survival time in stage 2. Table 2 lists possible decisions of a phase I-II/III design in stages k = 2, …, K and how each decision may be viewed in terms of the truly optimal dose $x^{opt}_{S}$. Table 2 is ordered with the best outcomes listed first and the worst listed last, with outcomes 1, 2, and 3 being good and outcomes 4 and 5 being bad. In outcome 1, the design declares the dose that increases μS the most to be superior to C. In outcome 2, a dose of A is selected that provides a clinically meaningful improvement ⩾ Δ in μS compared to C, but the best dose of A is not chosen, so the decision is correct but the dose has not been truly optimized. Outcome 3 represents a correct decision, but it does not improve μS since it declares C superior to or equivalent to $A(\hat{x}^{opt}_{S})$. Outcome 4 gives a false positive result, including cases where the design wrongly chooses an inferior dose of A, which is worse than a conventional type I error. Outcome 5 represents the worst possible case, since not only does the design wrongly conclude that A at the chosen dose is not superior to C, but it might have obtained a successful trial result if it had correctly selected $x^{opt}_{S}$ in stage 2.
Table 2.
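O | Decision | Truth | Comments |
---|---|---|---|
1 | Select $\hat{x}^{opt}_{S} = x^{opt}_{S}$ and declare $A(\hat{x}^{opt}_{S})$ superior to C | $A(x^{opt}_{S})$ is superior to C | This is the generalized power event at the optimal dose $x^{opt}_{S}$. The design correctly selects $x^{opt}_{S}$ as optimal and declares $A(x^{opt}_{S})$ superior to C. |
2 | Select $\hat{x}^{opt}_{S} \neq x^{opt}_{S}$ with $\hat{x}^{opt}_{S}$ ∈ Xopt and declare $A(\hat{x}^{opt}_{S})$ superior to C | $A(\hat{x}^{opt}_{S})$ is superior to C | This is a generalized power event in a case where the design correctly concludes $A(\hat{x}^{opt}_{S})$ is superior to C but $\hat{x}^{opt}_{S}$ is suboptimal, so it could have improved survival more had it chosen the truly optimal dose $x^{opt}_{S}$. |
3 | Declare $A(\hat{x}^{opt}_{S})$ not superior to C | $A(\hat{x}^{opt}_{S})$ is not superior to C | This is a correct conclusion, but the phase I-II/III design will require an increased sample size compared to the phase I-II → III design due to correctly switching to $x^{opt}_{S}$. |
4 | Declare $A(\hat{x}^{opt}_{S})$ superior to C | $A(\hat{x}^{opt}_{S})$ is not superior to C | This is a false positive conclusion. While the design may pick the best dose of A, it incorrectly concludes that A at that dose is superior to C. |
5 | Select $\hat{x}^{opt}_{S} \neq x^{opt}_{S}$ and declare $A(\hat{x}^{opt}_{S})$ not superior to C | $A(x^{opt}_{S})$ is superior to C | This is a disastrous false negative conclusion. The design chooses a suboptimal dose based on μS and incorrectly concludes $A(\hat{x}^{opt}_{S})$ is inferior to C, instead of correctly selecting $x^{opt}_{S}$ and declaring $A(x^{opt}_{S})$ superior to C. |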
These same decisions and interpretations are made in the conventional phase I-II → phase III paradigm, with the difference that the re-optimized dose $\hat{x}^{opt}_{S}$ is replaced with the phase I-II selected dose $\hat{x}^{opt}_{ET}$. Compared to this conventional design, allowing the optimal dose to be switched in the phase I-II/III design makes selecting $x^{opt}_{S}$ more likely, which increases the probabilities of outcomes 1 and 2 and decreases the probability of the disastrous outcome 5. Under outcome 3, the phase I-II/III design is likely to treat more patients because it is more likely to correctly pick the dose having the largest μS, thus making stopping the trial early for superiority of C or futility less likely. It will be more likely to switch to the dose having the longest mean survival time for outcome 4, however, which makes a false positive event more likely.
5. Simulation Study
To perform a simulation study comparing the phase I-II/III design to the phase I-II → phase III paradigm, we first specify three different Eff-Tox scenarios, consisting of true efficacy and toxicity dose-probability vectors. We will use these to specify different relationships between (YE, YT) and YS. We evaluate the design with J = 5 doses using raw dose values (d1, ⋯, d5) = (1, 2, 3, 3.5, 5). For this study, each patient’s (YE,YT) are evaluated in one month, and we assume for simplicity that no patients die before this month long window. For a dose xj chosen in phase I-II, we test the null hypothesis that the mean survival time with A(xj) equals the control mean of 24 months against the two-sided alternative, with a target mean of 36 months, a Δ = 12 month improvement.
To implement phase I-II using the Eff-Tox design, the three equally desirable (πE, πT) pairs used to establish the desirability function ϕ were (.35, 0), (.70, .40) and (1, .75). The contour created by these three pairs is seen in Web Figure 1. The upper limit on πT was .40 and the lower limit on πE was .30. The threshold on the posterior probability that πE > .30 and πT < .40 was set to be pE = pT = .10 for both acceptability rules. Patients were treated in cohorts of size 3, with up to NET = 60 patients enrolled in phase I-II (stage 1). We calibrated the phase I-II hyperparameters to have prior effective sample size .90 as suggested by Yuan, Nguyen and Thall (2016). We used prior mean toxicity probabilities of (.05, .10, .15, .20, .30) and mean efficacy probabilities (.20, .40, .60, .65, .70) for the five doses, to produce the hyperparameter means (−4.23, 3.1, .02, 3.45, 0, 0) and standard deviations (3.13, 3.12, 2.68, 2.69, .2, 1) for the prior of θET. The EffTox program is freely available on the MD Anderson biostatistics software page.
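The desirability values implied by the three contour pairs can be reproduced with a short sketch. It assumes the L^p-norm form of the trade-off function used in the published Eff-Tox design, ϕ(πE, πT) = 1 − [{(1 − πE)/(1 − πE*)}^p + (πT/πT*)^p]^{1/p}, with (πE*, 0) = (.35, 0), (1, πT*) = (1, .75), and p chosen so that (.70, .40) lies on the same contour; this form is taken from the Eff-Tox literature and is not restated in the text above, and the function names are hypothetical.

```python
def solve_contour_p(pi_e_star, pi_t_star, middle_pair, lo=0.01, hi=10.0, tol=1e-10):
    """Find p so that the middle pair has desirability 0 (bisection).

    g(p) = a**p + b**p - 1 is decreasing in p because 0 < a, b < 1.
    """
    pi_e, pi_t = middle_pair
    a = (1.0 - pi_e) / (1.0 - pi_e_star)
    b = pi_t / pi_t_star
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if a ** mid + b ** mid - 1.0 > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def desirability(pi_e, pi_t, pi_e_star, pi_t_star, p):
    """Eff-Tox trade-off value: larger is better, 0 on the target contour."""
    u = (1.0 - pi_e) / (1.0 - pi_e_star)
    v = pi_t / pi_t_star
    return 1.0 - (u ** p + v ** p) ** (1.0 / p)


PI_E_STAR, PI_T_STAR = 0.35, 0.75
P = solve_contour_p(PI_E_STAR, PI_T_STAR, (0.70, 0.40))

# True (pi_E, pi_T) pairs for Eff-Tox scenario 1 in Table 3.
SCENARIO_1 = [(0.20, 0.10), (0.40, 0.15), (0.60, 0.25), (0.65, 0.35), (0.70, 0.50)]
phi = [desirability(e, t, PI_E_STAR, PI_T_STAR, P) for e, t in SCENARIO_1]
print("p =", round(P, 4))
print("phi =", [round(v, 2) for v in phi])
```

Under this assumed form, the values for scenario 1 agree with the ϕ row of Table 3 to within rounding, and the largest value occurs at dose 3.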
For the phase I-II portion of the simulated trials, patients were treated in cohorts of size three and assigned doses after the previous cohort was fully evaluated, assuming an accrual rate of five patients per month with adaptive randomization begun after NF = 15 patients. The three simulation scenarios’ assumed true πE(xj) and πT(xj) are given in Table 3, with their selection percentages, true ϕ values, and numbers of patients treated, based on 5,000 simulated trials using the EffTox program.
Table 3.
Scenario | Value | Dose 1 | Dose 2 | Dose 3 | Dose 4 | Dose 5 |
---|---|---|---|---|---|---|
1 | (πE, πT)TR | (.20, .10) | (.40, .15) | (.60, .25) | (.65, .35) | (.70, .50) |
 | ϕ{(πE, πT)TR} | −.37 | −.13 | .05 | −.01 | −.13 |
 | % Selected | 3 | 26 | 29 | 27 | 13 |
 | # Treated | 6.4 | 16.0 | 16.1 | 12.1 | 9.0 |
2 | (πE, πT)TR | (.20, .05) | (.25, .08) | (.35, .10) | (.40, .15) | (.55, .20) |
 | ϕ{(πE, πT)TR} | −.30 | −.26 | −.14 | −.13 | .04 |
 | % Selected | 5 | 11 | 18 | 16 | 49 |
 | # Treated | 8.4 | 9.1 | 10.1 | 9.8 | 22.4 |
3 | (πE, πT)TR | (.40, .10) | (.50, .15) | (.60, .35) | (.65, .60) | (.70, .70) |
 | ϕ{(πE, πT)TR} | −.06 | .03 | −.09 | −.35 | −.40 |
 | % Selected | 26 | 51 | 20 | 2 | 0 |
 | # Treated | 16.7 | 27.3 | 11.6 | 3.3 | 0.8 |
In the three Eff-Tox scenarios in Table 3, the respective optimal doses in terms of the tradeoff contour are doses 3, 5, and 2. We only consider simulated phase I-II trials that advance to phase III, ignoring simulation replications where the trial stopped early. In scenario 1, doses 3 and 4 have nearly equivalent desirability, so we expect most patients in the phase I-II portion of the phase I-II/III trial to be treated at these two doses. In scenario 2, the highest dose 5 is considered optimal, most patients are treated at this dose, and it is selected in 49% of the simulations. In scenario 3, dose 2 is optimal and doses 4 and 5 have unacceptably high toxicity probabilities, so we expect to treat fewer patients at these doses. The design treats the most patients at dose 2, which is selected with probability .51, but substantial numbers of patients are treated at doses 1 and 3. The use of AR assigns more patients to doses 1 and 3, which allows the phase I-II/III design to better assess the functional relationship between dose and mean survival time. For the control group, we assume that the effects of toxicity and efficacy on overall survival are the same as those for the experimental group, and set the probabilities of toxicity and efficacy to be (.15, .40), (.10, .30), and (.35, .20) for the three Eff-Tox scenarios, respectively. For each of these simulation scenarios, we assume two different forms for the linear terms of the log hazard of YS. For A(x), we assume ηS(x, YE,YT) = β0 + β1x + β2x² − exp(βE)YE + exp(βT)YT. For the simulated data from the control group, we assume ηS(C,YE,YT) = βC − exp(βE)YE + exp(βT)YT and calibrate the additional parameter βC so that we obtain the desired null value of 24 months for mean survival time.
We first consider an exponential distribution with pdf f(t|ρ) = (1/ρ) exp(−t/ρ) where ρ = exp{ηS(x, YE,YT)}, since the O’Brien-Fleming group sequential bounds (O’Brien and Fleming, 1979) for the logrank test are based on this assumption. Later, we will consider several other distributions to evaluate the robustness of the methodology. Table 4 displays the six scenarios considered, which correspond to the different Eff-Tox scenarios listed in Table 3, as well as differing effects of dose, efficacy, and toxicity on survival time.
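Generating survival times from the exponential pdf f(t|ρ) = (1/ρ) exp(−t/ρ) takes only a few lines. The sketch below draws times with mean ρ = exp(η) and applies administrative censoring at a fixed follow up time; the function names, the value of η, the follow up time, and the seed are hypothetical and serve only to show the mechanics.

```python
import math
import random


def simulate_survival_times(eta, n, rng):
    """Draw n exponential survival times with mean rho = exp(eta)."""
    rho = math.exp(eta)
    # random.expovariate takes the rate, i.e. 1 / mean.
    return [rng.expovariate(1.0 / rho) for _ in range(n)]


def administratively_censor(times, follow_up):
    """Return (observed time, death observed) pairs under a fixed follow up limit."""
    return [(min(t, follow_up), t <= follow_up) for t in times]


rng = random.Random(1)                      # fixed seed for reproducibility
times = simulate_survival_times(math.log(24.0), 200_000, rng)
sample_mean = sum(times) / len(times)
observed = administratively_censor(times, follow_up=36.0)
death_fraction = sum(1 for _, died in observed if died) / len(observed)

print(f"sample mean = {sample_mean:.2f} months")
print(f"fraction with death observed by 36 months = {death_fraction:.3f}")
```

With a mean of 24 months, roughly 1 − exp(−36/24) of patients are expected to have an observed death within 36 months of follow up, which the simulated fraction should approximate.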
Table 4.
Scenario | Eff-Tox Scen | Hyp | (β1, β2, βE, βT, β0)TR | True mean survival time (months) at doses 1 to 5 |
---|---|---|---|---|
1 | 1 | Null | (.1, −.5, .5, .5, 2.9) | (8.3, 17.9, 24, 22.5, 9.8) |
 | | Alt | (.25, −2, .5, .5, 3.4) | (1, 14.5, 36.2, 28.3, 1) |
2 | 2 | Null | (.1, −.1, 1, .5, 2.6) | (14.0, 17.8, 21.9, 23, 24) |
 | | Alt | (.5, 0, 1, .5, 2.3) | (7.1, 10.3, 16.0, 19.5, 36) |
3 | 2 | Null | (.1, −.5, .3, 1, 3.1) | (9.5, 18.5, 24, 22.5, 10.4) |
 | | Alt | (.1, −1, .3, 1, 3.6) | (6.9, 24.7, 38, 33.1, 6.3) |
4 | 3 | Null | (−.3, .3, .3, 1, 2.3) | (24, 13.6, 8.9, 6.8, 7.8) |
 | | Alt | (−.1, .3, .3, 1, 3.0) | (38, 24.6, 18.4, 15.0, 21.1) |
5 | 3 | Null | (.1, −.5, .3, .1, 3.0) | (9.3, 18.7, 24, 22.7, 10.4) |
 | | Alt | (.1, −1, .3, .1, 3.6) | (7.8, 28.8, 44, 38.6, 7.4) |
6 | 1 | Null | (.75, −.5, .3, .25, 2.8) | (3.2, 10.1, 20.4, 24, 20.4) |
 | | Alt | (1, −.6, .3, .25, 3.3) | (3.0, 12.9, 31.8, 40, 36.6) |
These scenarios encompass several qualitatively and quantitatively different possible cases in connecting phase I-II to phase III. In scenario 1, the optimal dose in terms of μS is dose 3, which is selected with probability .29. In scenario 2, there is a large efficacy effect, leading to dose 5 being optimal in terms of μS, but this dose is only selected with probability .49 in phase I-II. Thus, we expect to see a large improvement in this scenario by using a phase I-II/III design. Similarly, in scenario 3, dose 3 is optimal in terms of μS, but is only selected with probability .18 in phase I-II. Scenario 4 represents a case with a large toxicity effect and small efficacy effect, making dose 1 optimal in terms of μS, but dose 1 is only chosen in 26% of the usual phase I-II trials. In scenario 5, dose 3 is the third best dose in terms of ϕ but is best in terms of μS, and it is only selected with probability .20. In this scenario, dose 4 also gives a significant improvement in μS compared to C, with a true mean survival time of 38.6 months under the alternative hypothesis (Table 4). In scenario 6, there is a large efficacy effect on overall survival, making dose 4 optimal in terms of both overall survival and ϕ, but dose 5 also has significantly improved survival compared to C. Scenarios 5 and 6 thus provide a basis for evaluating improvements in both γ1 and the GP, γ2. To control the possibility of incorrectly switching due to chance outcomes, we do not allow the design to continue with a dose at which fewer than 6 patients were treated. The scenarios also have varying effects of toxicity and efficacy on hS, quantified by the coefficients βE and βT, which allows us to evaluate the sensitivity of the method to these effects. Since the parameters (β1, β2) must be changed substantially to obtain similar μS values for different values of (βE, βT), we do not perform a sensitivity analysis to these parameters within each scenario.
We assume that 10 patients, on average, are accrued each month during phase III, and that the phase III trial will begin 1 month after the phase I-II trial concludes. This waiting time could be increased to obtain longer survival follow up and thus improve the design’s ability to re-optimize doses during stage k = 2. We enroll a maximum of NS = 500 patients in phase III, which has up to three interim looks after , and deaths, with superiority decisions possible at each. We calibrated the stopping boundaries with East 6 statistical software (2016) using O’Brien-Fleming bounds (O’Brien and Fleming, 1979) with power .80 and type I error probability .05. We included a rule to determine if the trial should be stopped for futility, i.e., neither C nor A is superior, after deaths. The boundaries for declaring superiority of A or C based on the standardized logrank statistics are , and the futility bound at the second look is . At the start of the phase III portion of the trial, we begin randomizing patients equally to A, given at the dose selected in phase I-II, and C. After deaths in the trial have occurred, we determine the dose that patients receiving A should receive for the remainder of the trial. This is the re-optimization step. Survival times for patients in phase I-II and phase III are generated after their toxicity and efficacy are scored, which does not allow the possibility that a patient may die before their short term indicators are seen.
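As a rough illustration of the accrual assumption, the sketch below generates phase III enrollment times with an average of 10 patients per month, starting 1 month after phase I-II ends. The text only states the average rate, so the use of a homogeneous Poisson process (exponential inter-arrival times) is our assumption, as are the function name and the random seed.

```python
import random


def simulate_accrual(n_max, rate_per_month, start_month, rng):
    """Return enrollment times in months, measured from the end of phase I-II.

    Assumes Poisson accrual: gaps between consecutive patients are
    exponential with mean 1 / rate_per_month.
    """
    times = []
    t = start_month
    for _ in range(n_max):
        t += rng.expovariate(rate_per_month)
        times.append(t)
    return times


rng = random.Random(2018)
arrivals = simulate_accrual(n_max=500, rate_per_month=10.0, start_month=1.0, rng=rng)
print(f"first patient at month {arrivals[0]:.2f}, last at month {arrivals[-1]:.1f}")
```

With 500 patients at 10 per month, full enrollment takes about 50 months on average, which gives a sense of the time scale over which the interim looks occur.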
For each scenario and design, 5,000 simulation replications were performed. The simulation results are summarized in Table 5. In each of scenarios 1–4, γ1 = γ2, since there is only one dose for which A is superior to C. Mean improvement in patient survival time with each design is denoted by W̄, computed by averaging, over replications, the difference between the true mean survival time with the selected dose of A and μC if A is declared superior to C. In the simulations, W̄ is computed as the mean over {Wb, b = 1, ⋯, 5000}, where Wb is this difference in replication b.
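The summary statistics can be sketched as follows. This is a hedged reading rather than the authors' code: we assume that a replication contributes 0 to the mean improvement when A is not declared superior, that γ1 counts replications where the dose with the largest true mean is selected and A is declared superior, and that γ2 counts replications where A is declared superior at any dose with true mean at least μC + Δ. The `Replication` record, the function name, and the four toy replications are hypothetical; the true means are the scenario 5 alternative row of Table 4.

```python
from dataclasses import dataclass


@dataclass
class Replication:
    selected_dose: int         # dose of A used in the final comparison
    a_declared_superior: bool  # outcome of the group sequential test


def summarize(reps, true_means, mu_c, delta):
    """Return (w_bar, gamma1, gamma2) over simulated replications."""
    best_dose = max(true_means, key=true_means.get)
    n = len(reps)
    # W_b: true improvement over control, counted only when A is declared superior.
    w = [true_means[r.selected_dose] - mu_c if r.a_declared_superior else 0.0
         for r in reps]
    gamma1 = sum(r.a_declared_superior and r.selected_dose == best_dose
                 for r in reps) / n
    gamma2 = sum(r.a_declared_superior
                 and true_means[r.selected_dose] >= mu_c + delta
                 for r in reps) / n
    return sum(w) / n, gamma1, gamma2


# True mean survival times (months) by dose, scenario 5 alternative in Table 4.
true_means = {1: 7.8, 2: 28.8, 3: 44.0, 4: 38.6, 5: 7.4}
# Four toy replications, purely illustrative.
reps = [Replication(3, True), Replication(4, True),
        Replication(3, False), Replication(2, True)]
w_bar, gamma1, gamma2 = summarize(reps, true_means, mu_c=24.0, delta=12.0)
print(w_bar, gamma1, gamma2)
```

Under this reading, γ1 ≤ γ2 always, with equality whenever only one dose clears the μC + Δ threshold, which agrees with the statement that γ1 = γ2 in scenarios 1–4.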
Table 5.
Scenario | Design | H1: W̄ | H1: γ1 | H1: γ2 | H1: mean trial duration | H1: mean number of patients | | H0: α | H0: mean trial duration | H0: mean number of patients |
---|---|---|---|---|---|---|---|---|---|---|
1 | Phase I-II → phase III | 4.04 | .29 | .29 | 4.05 | 431.1 | – | .02 | 4.05 | 461.5 |
1 | Phase I-II/III | 10.15 | .83 | .83 | 4.73 | 479.2 | – | .03 | 4.32 | 492.0 |
2 | Phase I-II → phase III | 7.87 | .66 | .66 | 4.29 | 459.2 | – | .06 | 4.18 | 489.4 |
2 | Phase I-II/III | 8.97 | .75 | .75 | 4.45 | 470.7 | – | .02 | 4.18 | 489.9 |
3 | Phase I-II → phase III | 1.83 | .06 | .06 | 3.10 | 355.0 | – | < .01 | 3.28 | 385.7 |
3 | Phase I-II/III | 11.51 | .79 | .79 | 4.56 | 476.9 | – | .04 | 4.22 | 485.6 |
4 | Phase I-II → phase III | 3.52 | .25 | .25 | 4.28 | 475.8 | – | .05 | 3.48 | 407.8 |
4 | Phase I-II/III | 5.86 | .42 | .42 | 4.30 | 472.0 | – | .05 | 3.81 | 442.0 |
5 | Phase I-II → phase III | 5.61 | .21 | .25 | 3.98 | 428.5 | – | .01 | 3.83 | 440.4 |
5 | Phase I-II/III | 16.71 | .68 | .88 | 4.24 | 464.4 | – | .03 | 4.37 | 493.9 |
6 | Phase I-II → phase III | 9.46 | .34 | .52 | 4.16 | 447.9 | – | .02 | 4.16 | 466.5 |
6 | Phase I-II/III | 12.67 | .59 | .75 | 4.53 | 472.7 | – | .04 | 4.39 | 494.0 |
Table 5 shows that, in general, the phase I-II/III design maintained type I error probability ≤ .05 under H0. Under H1, the values of γ1, the GP γ2, and the mean survival improvement W̄ are uniformly larger for the phase I-II/III design than for the conventional phase I-II → phase III paradigm without dose re-optimization. The differences are extremely large in scenario 3, with an improvement of .73 in γ1 and a 9.68 month improvement in W̄. The smallest advantage of the phase I-II/III design is seen in scenario 2, with an improvement of .09 for γ1 and 1.10 months for W̄. In scenarios 5 and 6, where two doses of A give mean survival time larger than μC + Δ = 36 months, the phase I-II/III design provides respective improvements in γ2 of .63 and .23, and improvements in γ1 of .47 and .25. These scenarios illustrate the potential advantage of the phase I-II/III design compared to the conventional phase I-II → phase III approach.
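The between-design differences can be recomputed directly from the alternative-hypothesis columns of Table 5. The short script below is only bookkeeping; the dictionary keys `conventional` and `hybrid` are labels introduced here for the phase I-II → phase III and phase I-II/III rows, and the first entry of each tuple is the mean survival improvement in months.

```python
# (mean improvement in months, gamma1, gamma2) under H1, copied from Table 5.
table5 = {
    1: {"conventional": (4.04, .29, .29), "hybrid": (10.15, .83, .83)},
    2: {"conventional": (7.87, .66, .66), "hybrid": (8.97, .75, .75)},
    3: {"conventional": (1.83, .06, .06), "hybrid": (11.51, .79, .79)},
    4: {"conventional": (3.52, .25, .25), "hybrid": (5.86, .42, .42)},
    5: {"conventional": (5.61, .21, .25), "hybrid": (16.71, .68, .88)},
    6: {"conventional": (9.46, .34, .52), "hybrid": (12.67, .59, .75)},
}


def improvements(table):
    """Hybrid minus conventional, rounded to the two decimals used in Table 5."""
    return {
        scenario: tuple(round(h - c, 2)
                        for c, h in zip(row["conventional"], row["hybrid"]))
        for scenario, row in table.items()
    }


gains = improvements(table5)
for scenario, (d_w, d_g1, d_g2) in gains.items():
    print(f"scenario {scenario}: improvement {d_w:.2f} months, "
          f"gamma1 +{d_g1:.2f}, gamma2 +{d_g2:.2f}")

# Scenario with the smallest gain in gamma1.
smallest = min(gains, key=lambda s: gains[s][1])
print("smallest gamma1 gain in scenario", smallest)
```

Keeping the table values in one place makes it easy to check any quoted difference against the tabulated results.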
The phase I-II/III design does have the drawback that it requires treating more patients and longer trial durations, on average, than the conventional paradigm. Part of this increase is due to the design correctly switching to the best dose of A in terms of overall survival, which decreases the likelihood that a trial will stop early by declaring C to be superior or due to futility. These increases in required sample size and trial duration are the price paid for the much larger probability of a successful phase III trial in cases where dose switching increases mean survival time with A.
Since phase I-II trials may have sample sizes ranging from 24 to 90 in practice, we chose the Eff-Tox sample size NET = 60 in the simulations as a practical compromise that obtains a reasonable amount of information in stage 1. Web Tables 1 and 2, given in Web Appendix E, summarize additional simulations with NET = 90. Values of γ1, γ2, and the mean survival improvement W̄ for the phase I-II/III design all increased substantially with NET for all six scenarios. This is because more information at different doses in phase I-II makes switching to the best dose in stage 2 more likely.
To assess robustness of the phase I-II/III design to different event time distributions, we evaluated its performance for two lognormal distributions, with log-scale variances .25 and 1 (σ = .5 and 1), Weibull distributions with increasing or decreasing hazard, with shape parameters 4 or .5, respectively, and a gamma distribution with scale parameter 2. The true coefficients of YE and YT in the hazard function’s linear term were kept constant for each distribution, and the remaining parameters β0, βC, β1, β2 were adjusted to obtain true means similar to those in the exponential distribution simulation study. We exponentiated the linear term for the gamma and Weibull distribution rate parameters, but did not do this for the lognormal distribution. The means under the null and alternative hypotheses for each distribution are given in Web Table 3. Table 6 summarizes the robustness study, showing that under the alternative hypothesis, for each distribution, the phase I-II/III design has uniformly higher values of γ1, γ2, and the mean survival improvement W̄, with substantially higher values for scenarios 1, 3, and 5. For the Weibull distribution with decreasing hazard, γ1 and γ2 decrease for both designs in each scenario because the assumptions of the logrank test are grossly violated by a high early failure rate. Because so many patients have early failures, patients are not followed as long before the final group sequential test. This shows that the logrank test is not robust to the Weibull distribution with decreasing hazard. For this distribution, however, the phase I-II/III design still improved the probability of selecting the optimal dose of A compared to the conventional paradigm by .49, .21, .15, .59, .46 and .14 in the six scenarios, respectively. Similar improvements are seen under the other distributions. An extension of the phase I-II/III design might incorporate a robust group sequential test in place of the logrank test, to reduce the loss in power under a Weibull with decreasing hazard.
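To make the idea of matching means across event time distributions concrete, the sketch below solves for the free parameter of each alternative distribution so that its mean equals a target value, here the 24 month null mean. This is a simplified stand-in for the calibration described above, not the authors' procedure: it fixes the shape or scale values quoted in the text, ignores the regression structure on (x, YE, YT), and all function names are hypothetical.

```python
import math
import random


def weibull_scale_for_mean(mean, shape):
    """Weibull mean is scale * Gamma(1 + 1/shape)."""
    return mean / math.gamma(1.0 + 1.0 / shape)


def lognormal_mu_for_mean(mean, sigma):
    """Lognormal mean is exp(mu + sigma^2 / 2)."""
    return math.log(mean) - 0.5 * sigma ** 2


def gamma_shape_for_mean(mean, scale):
    """Gamma mean is shape * scale."""
    return mean / scale


def make_samplers(mean, rng):
    """One sampler per robustness distribution, each calibrated to `mean`."""
    return {
        "Lognormal, sigma = .5": lambda: rng.lognormvariate(
            lognormal_mu_for_mean(mean, 0.5), 0.5),
        "Lognormal, sigma = 1": lambda: rng.lognormvariate(
            lognormal_mu_for_mean(mean, 1.0), 1.0),
        # random.weibullvariate takes (scale, shape) in that order.
        "Weibull increasing (shape 4)": lambda: rng.weibullvariate(
            weibull_scale_for_mean(mean, 4.0), 4.0),
        "Weibull decreasing (shape .5)": lambda: rng.weibullvariate(
            weibull_scale_for_mean(mean, 0.5), 0.5),
        # random.gammavariate takes (shape, scale) in that order.
        "Gamma (scale 2)": lambda: rng.gammavariate(
            gamma_shape_for_mean(mean, 2.0), 2.0),
    }


rng = random.Random(0)
samplers = make_samplers(24.0, rng)
n_draws = 20000
mc_means = {name: sum(draw() for _ in range(n_draws)) / n_draws
            for name, draw in samplers.items()}
for name, m in mc_means.items():
    print(f"{name}: Monte Carlo mean {m:.2f} months")
```

Although all five samplers share the same mean, the Weibull with shape .5 places far more mass on very early failures, which is the feature that degrades the logrank-based group sequential test in the robustness study.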
The type I error constraints are nearly met for each distribution. Some slight inflation in α above .05 may be attributed to the proportional hazards assumption being violated. For each distribution, the phase I-II/III design treats more patients, on average, under both H0 and H1, and has slightly longer trial duration, but makes the correct decision much more often.
Table 6.
Scenario | Distribution | Phase I-II → phase III: W̄ | Phase I-II → phase III: γ1 | Phase I-II → phase III: γ2 | Phase I-II → phase III: α | Phase I-II/III: W̄ | Phase I-II/III: γ1 | Phase I-II/III: γ2 | Phase I-II/III: α |
---|---|---|---|---|---|---|---|---|---|
1 | Lognormal, σ = .5 | 7.72 | .30 | .30 | .02 | 14.56 | .80 | .80 | .03 |
1 | Lognormal, σ = 1 | 6.18 | .30 | .30 | .02 | 14.88 | .85 | .85 | .03 |
1 | Weibull Increasing | 6.37 | .30 | .30 | .02 | 12.00 | .77 | .77 | .03 |
1 | Weibull Decreasing | 1.96 | .13 | .13 | .02 | 4.14 | .32 | .32 | .02 |
1 | Gamma | 5.02 | .30 | .30 | .02 | 11.72 | .89 | .89 | .04 |
2 | Lognormal, σ = .5 | 12.18 | .73 | .73 | .05 | 15.14 | .91 | .91 | .05 |
2 | Lognormal, σ = 1 | 11.70 | .70 | .70 | .06 | 13.78 | .82 | .82 | .03 |
2 | Weibull Increasing | 9.08 | .73 | .73 | .06 | 11.20 | .90 | .90 | .06 |
2 | Weibull Decreasing | 3.98 | .32 | .32 | .05 | 4.75 | .38 | .38 | .03 |
2 | Gamma | 9.11 | .73 | .73 | .06 | 11.30 | .90 | .90 | .03 |
3 | Lognormal, σ = .5 | 2.17 | .06 | .06 | < .01 | 13.91 | .85 | .85 | .04 |
3 | Lognormal, σ = 1 | 2.02 | .06 | .06 | < .01 | 12.91 | .81 | .81 | .03 |
3 | Weibull Increasing | 2.14 | .06 | .06 | < .01 | 13.63 | .84 | .84 | .03 |
3 | Weibull Decreasing | .90 | .03 | .03 | < .01 | 4.94 | .32 | .32 | .03 |
3 | Gamma | 2.07 | .06 | .06 | < .01 | 13.23 | .86 | .86 | .03 |
4 | Lognormal, σ = .5 | 4.50 | .32 | .32 | .05 | 7.23 | .52 | .52 | .05 |
4 | Lognormal, σ = 1 | 3.86 | .28 | .28 | .05 | 5.61 | .40 | .40 | .06 |
4 | Weibull Increasing | 4.67 | .33 | .33 | .04 | 6.61 | .47 | .47 | .04 |
4 | Weibull Decreasing | 4.14 | .28 | .28 | .06 | 6.21 | .42 | .42 | .07 |
4 | Gamma | 4.62 | .31 | .31 | .04 | 6.93 | .47 | .47 | .04 |
5 | Lognormal, σ = .5 | 6.63 | .21 | .25 | .01 | 18.19 | .76 | .96 | .03 |
5 | Lognormal, σ = 1 | 5.74 | .21 | .25 | .01 | 16.78 | .69 | .87 | .03 |
5 | Weibull Increasing | 7.41 | .21 | .25 | .01 | 15.88 | .64 | .77 | .03 |
5 | Weibull Decreasing | 4.00 | .16 | .18 | .01 | 12.27 | .48 | .63 | .03 |
5 | Gamma | 6.73 | .21 | .25 | .01 | 18.73 | .80 | .92 | .03 |
6 | Lognormal, σ = .5 | 11.92 | .34 | .52 | .04 | 15.56 | .58 | .83 | .06 |
6 | Lognormal, σ = 1 | 11.50 | .34 | .52 | .02 | 15.66 | .60 | .80 | .04 |
6 | Weibull Increasing | 10.29 | .34 | .52 | .04 | 12.76 | .54 | .77 | .05 |
6 | Weibull Decreasing | 5.46 | .21 | .31 | .02 | 6.63 | .32 | .38 | .03 |
6 | Gamma | 10.90 | .34 | .52 | .02 | 14.80 | .59 | .83 | .04 |
6. Discussion
We have proposed a new drug development strategy, which we call a phase I-II/III design, that re-optimizes the dose of an experimental agent A chosen in phase I-II during phase III based on mean survival time. We use information from all patients treated with A, including their short term efficacy and toxicity indicators, dose assigned, and survival time information, in order to more accurately select the dose of A that provides the highest posterior mean survival time. The design is based on an assumed mixture model for the survival time distribution that averages over the possible short term phase I-II outcomes. While we have used the Eff-Tox trade-off based phase I-II design for stage 1 of the phase I-II/III design, one could replace the Eff-Tox design with any phase I-II design based on (YE, YT) that uses some dose optimality criterion ϕ and includes AR. However, the necessary modifications of the design parameters and computer software to accommodate such a change would be non-trivial. Similarly, an extension of the methodology, complicated in its details but conceptually straightforward, may address the problem of possible deaths before evaluation of (YE, YT).
The simulations show that, under a range of alternative cases, the generalized power γ2 and the probability γ1 of the best possible decision both are greatly increased by the phase I-II/III design compared to the phase I-II → phase III paradigm. The phase I-II/III design also has a much lower probability of making the least desirable decision, where a suboptimal dose is chosen and a true treatment advance is missed. A drawback of the phase I-II/III design is that it requires more patients and a slightly longer trial duration, on average, compared to the phase I-II → phase III paradigm. This seems like a very reasonable price to pay for the much larger values of γ1, γ2, and the mean survival improvement W̄ in cases where re-optimizing the dose of the experimental agent increases its associated mean survival time.
Acknowledgments
Peter Thall’s research was supported by NCI grants R01 CA 83932 and P30 CA 016672. Andrew Chapple’s research was partially supported by the NIH grant 5T32-CA096520-07.
Footnotes
Supplementary Materials
Web Appendices and tables, referenced in Sections 2, 3, and 5, are available at the Biometrics website on Wiley Online Library.
Contributor Information
Andrew G. Chapple, Email: agc6@rice.edu, Department of Statistics, Rice University, Houston, Texas, U.S.A.
Peter F. Thall, Email: rex@mdanderson.org, Department of Biostatistics, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, U.S.A.
References
- Arrowsmith J. Trial watch: Phase III and submission failures: 2007–2010. Nature Reviews Drug Discovery. 2011;10:87. doi: 10.1038/nrd3375.
- Azriel D, Mandel M, Rinott Y. The Treatment Versus Experimentation Dilemma in Dose-Finding Studies. Journal of Statistical Planning and Inference. 2011;141:2759–2768.
- Babb J, Rogatko A, Zacks S. Cancer phase I clinical trials: Efficient dose escalation with overdose control. Statistics in Medicine. 1998;17:1103–1120. doi: 10.1002/(sici)1097-0258(19980530)17:10<1103::aid-sim793>3.0.co;2-9.
- BIO, Biomedtracker, Amplion. Clinical Development Success Rates 2006-2015. https://www.bio.org/bio-industry-analysis-published-reports (Accessed December 20, 2017).
- Bryant J, Day R. Incorporating toxicity considerations into the design of two-stage phase II clinical trials. Biometrics. 1995;51:1372–1382.
- Cancer.org. What are the phases of clinical trials? 2018 https://www.cancer.org/treatment/treatments-and-side-effects/clinical-trials/what-you-need-to-know/phases-of-clinical-trials.html (Accessed 17 Jan. 2018)
- Chapple AG, Vannucci M, Thall PF, Lin SH. Bayesian variable selection for a semi-competing risks model with multiple components. Journal of Computational Statistics and Data Analysis. 2017;112:170–185. doi: 10.1016/j.csda.2017.03.002.
- Chu Y, Pan H, Yuan Y. Adaptive Dose Modification for Phase I Clinical Trials. Statistics in Medicine. 2016;35:3497–3508. doi: 10.1002/sim.6933.
- East 6. Statistical software for the design, simulation and monitoring of clinical trials. Cytel Inc.; Cambridge, MA: 2016.
- Green P. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika. 1995;82(4):711–732.
- Hoering A, LeBlanc M, Crowley J. Seamless phase I-II trial design for assessing toxicity and efficacy for targeted agents. Clinical Cancer Research. 2011;17(4):640–646. doi: 10.1158/1078-0432.CCR-10-1262.
- Huang X, Biswas S, Oki Y, Issa JP, Berry D. A parallel phase I/II clinical trial design for combination therapies. Biometrics. 2006;63(2):429–436. doi: 10.1111/j.1541-0420.2006.00685.x.
- Inoue LYT, Thall PF, Berry DA. Seamlessly expanding a randomized phase II trial to phase III. Biometrics. 2002;58:823–831. doi: 10.1111/j.0006-341x.2002.00823.x.
- Jin I-H, Liu S, Thall PF, Yuan Y. Using data augmentation to facilitate conduct of phase I/II clinical trials with delayed outcomes. Journal of the American Statistical Association. 2014;109:525–536. doi: 10.1080/01621459.2014.881740.
- Lee J, Liu D. A predictive probability design for phase II cancer clinical trials. Clinical Trials. 2008;5:93–106. doi: 10.1177/1740774508089279.
- Lee K, Haneuse S, Schrag D, Dominici F. Bayesian semiparametric analysis of semicompeting risks data: investigating hospital readmission after a pancreatic cancer diagnosis. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2015;64:253–273. doi: 10.1111/rssc.12078.
- O’Brien PC, Fleming TR. A Multiple Testing Procedure for Clinical Trials. Biometrics. 1979;35:549–556.
- O’Quigley J, Pepe M, Fisher L. Continual reassessment method: A practical design for phase I clinical trials in cancer. Biometrics. 1990;46:33–48.
- Schaid DJ, Wieand S, Therneau T. Optimal two-stage screening designs for survival comparisons. Biometrika. 1990;77(3):507–513.
- Seruga B, Ocana A, Amir E, Tannock IF. Failures in Phase III: Causes and Consequences. Clinical Cancer Research. 2015;21:4551–4560. doi: 10.1158/1078-0432.CCR-15-0124.
- Simon R. Optimal two-stage designs for phase II clinical trials. Controlled Clinical Trials. 1989;10:1–10. doi: 10.1016/0197-2456(89)90015-9.
- Simon R, Wittes RE, Ellenberg SS. Randomized phase II clinical trials. Cancer Treatment Reports. 1985;69:1375–81.
- Stallard N, Todd S. Sequential designs for phase III clinical trials incorporating treatment selection. Statistics in Medicine. 2003;22:689–703. doi: 10.1002/sim.1362.
- Storer BE. Design and analysis of phase I clinical trials. Biometrics. 1989;45:925–937.
- Sutton RS, Barto AG. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press; 1998.
- Thall PF. A review of phase 2-3 clinical trial designs. Lifetime Data Analysis. 2008;14:37–53. doi: 10.1007/s10985-007-9049-x.
- Thall PF, Herrick RC, Nguyen HQ, Venier JJ, Norris JC. Effective sample size for computing prior hyperparameters in Bayesian phase I-II dose-finding. Clinical Trials. 2014;11:657–666. doi: 10.1177/1740774514547397.
- Thall PF, Cook JD. Dose-finding based on efficacy-toxicity trade-offs. Biometrics. 2004;60:684–693. doi: 10.1111/j.0006-341X.2004.00218.x.
- Thall PF, Nguyen HQ. Adaptive randomization to improve utility-based dose-finding with bivariate ordinal outcomes. Journal of Biopharmaceutical Statistics. 2012;22:785–801. doi: 10.1080/10543406.2012.676586.
- Thall PF, Simon R, Ellenberg SS. Two-stage selection and testing designs for comparative clinical trials. Biometrika. 1988;75:303–310.
- Thall PF, Simon R. Practical Bayesian guidelines for phase IIB clinical trials. Biometrics. 1994;50:337–349.
- Thall PF, Simon RM, Estey EH. Bayesian sequential monitoring designs for single arm clinical trials with multiple outcomes. Statistics in Medicine. 1995;14:357–379. doi: 10.1002/sim.4780140404.
- Yin G, Chen N, Lee J. Phase II trial design with Bayesian adaptive randomization and predictive probability. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2012;61:219–235. doi: 10.1111/j.1467-9876.2011.01006.x.
- Yin G, Li Y, Ji Y. Bayesian dose-finding in phase I/II clinical trials using toxicity and efficacy odds ratios. Biometrics. 2006;62:777–787. doi: 10.1111/j.1541-0420.2006.00534.x.
- Yuan Y, Nguyen H, Thall PF. Bayesian Designs for Phase I-II Clinical Trials. Chapman and Hall/CRC Biostatistics Series; 2016.