Author manuscript; available in PMC: 2018 Dec 1.
Published in final edited form as: Stat Methods Med Res. 2017 May 23;27(12):3628–3642. doi: 10.1177/0962280217709817

Extended Two-stage Adaptive Designs with Three Target Responses for Phase II Clinical Trials

Seongho Kim 1, Weng Kee Wong 2
PMCID: PMC5515697  NIHMSID: NIHMS875672  PMID: 28535716

Abstract

We develop a nature-inspired stochastic population-based algorithm, which we call discrete particle swarm optimization (DPSO), to find extended two-stage adaptive optimal designs that allow 3 target response rates for the drug in a phase II trial. Our proposed designs include the celebrated Simon’s two-stage design and its extension that allows 2 target response rates to be specified for the drug. We show that DPSO not only frequently outperforms greedy algorithms, which are currently used to find such designs when there are only a few parameters; it is also capable of solving the design problems posed here, which have more parameters than greedy algorithms can handle. In stage 1 of our proposed designs, futility is quickly assessed and, if there are sufficient responders to move to stage 2, one tests one of the three target response rates of the drug, subject to various user-specified testing error rates. Our designs are therefore more flexible and, interestingly, do not necessarily require larger expected sample sizes than 2-stage adaptive designs. Using a real adaptive trial for melanoma patients, we show our proposed design requires one half fewer subjects than the implemented design in the study.

Keywords: Adaptive design, Greedy algorithm, Particle swarm optimization, Power, Sequential design, Simon’s two-stage design

Introduction

Phase II clinical trials concern early exploration of efficacy effects and use the most recent results from a small group of patients to make decisions for the next group of patients until some prefixed termination rule is met. Group sizes can range from 20–120 depending on the nature and seriousness of the disease. Kramar et al. (1996) provide a review of multistage designs for phase II clinical trials and statistical issues in cancer research, and Brown et al. (2011) give an overview of the role of phase II trials in oncology. Increasingly, these designs employ an adaptive approach, where design decisions for the next stage are made based on cumulative responses. Chow and Chang (2011) review adaptive randomization designs in clinical trials.

Simon’s two-stage design was developed under a framework that if the proportion of responders in stage 1 is small, the trial is terminated; otherwise, the trial goes on to stage 2 where the cumulative response rate is now used to test whether the efficacy rate is at a higher pre-specified level. In essence, two user-selected efficacy rates are posited p0 and p1 with p0 < p1, and p0 is the uninspiring response rate. In stage 1, we test the null hypothesis H0 : p = p0 and if we fail to reject the null hypothesis, we terminate the trial; otherwise, we conclude that the drug is sufficiently promising to advance to stage 2, where more responses from more patients will be used to test p = p1. Given pre-specified Type I and II error rates for the tests at the two stages, the design questions are the number of patients required in stage 1, number of responders required in stage 1, how many additional patients are required for stage 2 and the cumulative number of responders required at the end of stage 2. The design problem is to determine the optimal combination of these 4 numbers so that the expected number of patients treated with a drug of low activity under the null hypothesis is minimized. Such an optimal design may not be unique and other design criteria are possible.

Lin and Shih (2004) provided practical examples and showed that while it was relatively easy to specify the uninteresting rate p0, the same was not true for p1. To tackle the issue of uncertainty in targeting the hypothesis in stage 2, they proposed designs that allowed two specifications for p1 and called one a skeptical choice and the other the optimistic choice. Their design problem now has 7 parameters to optimize in contrast to 4 in Simon’s two-stage design, subject to user-specified Type I and II error constraints. The additional 3 parameters beyond the 4 parameters required for Simon’s two-stage design are for testing the additional targeted efficacy rate. Both Simon and Lin and Shih employed a greedy search to find the optimal designs and the latter stated that they “did not extend the selection to more than two prefixed possible response rates mainly due to the complexity in the numerical solutions, and also because it is usually adequate for practitioners to contemplate between two (high/low) choices of the response rate p1.”

Our work was motivated by a real problem from clients interested in conducting a single-arm two-stage phase II trial to study the effect of head and neck cancer (HNC) on the incidence of obstructive sleep apnea (OSA). Our clients’ main goal was to ascertain accurately the incidence rate of OSA in HNC patients in a timely manner. Because of the potentially huge beneficial impact on treating HNC patients, the clients wanted a relatively simple design that could terminate the study at stage 1 if the initial response (OSA incidence) rate was poor; otherwise, stage 2 of the design would use the number of responders who experienced OSA in stage 1 to recruit more patients and assess the incidence rate of OSA more accurately. Clearly, Simon’s two-stage design proposed in 1989 and the adaptive two-stage design proposed by Lin and Shih (2004) seemed useful.

Our clients were particularly interested in extending Lin and Shih (2004)’s approach to include an additional targeted alternative hypothesis for testing in the second stage. The main reason was the great uncertainty in the OSA incidence rate in HNC patients and the need to move a potentially promising finding forward expeditiously in the treatment of HNC patients, with a very good estimate of the true OSA incidence rate in these patients. Thus, depending on the number of responders who experience OSA in stage 1, more flexibility in specifying the incidence rate of OSA in stage 2 is required. In particular, the clients did not want to run another costly follow-up trial to evaluate the incidence rate of OSA in HNC patients more accurately. Another motivation for such a design was recently brought to the second author’s attention through personal communication with a senior researcher at the National Institutes of Health. She mentioned that in oncology, there is a great tendency for researchers to be too optimistic about the efficacy rate of a new drug. Our proposed designs should help address such an issue by allowing the clinician the flexibility to perform a test at stage 2 for one of the two lower efficacy rates should the drug efficacy rate be over-specified in the first place. In what follows, we discuss and construct extended two-stage adaptive designs for our OSA study and two other applications. The first one is Lin and Shih’s (2004) vinorelbine, bleomycin, and gemcitabine (VBG) study for patients with recurrent or refractory Hodgkin’s disease and the second one is a Phase II BREAK-2 study for melanoma patients.

We were able to formulate the optimization problem quickly as an extension of that in Lin and Shih (2004)’s paper, but anticipated a heavy computational burden for the new constrained optimization problem, which now has 10 parameters to optimize for the 3 user-selected targeted alternative hypotheses in stage 2, subject to various user-specified Type I and II error rate constraints. Lin and Shih (2004) acknowledged computational difficulties in their work at that time and remarked that further extension of the problem would seem to be too challenging, or even impossible, computationally. In what follows, we present an algorithm to solve this complex constrained optimization problem and show that our extended two-stage adaptive design has advantages over both Simon’s two-stage design and Lin and Shih’s designs in a number of ways.

There are many variations of strategies proposed for phase II designs since Simon’s two-stage design was proposed. We mention some here and refer to earlier cited papers for a comprehensive overview. Modifications and extensions of the likes of Simon’s two-stage design include having two binary outcomes (Kunz and Kieser (2011)) or finding designs that minimize the expected sample size or minimize the expected maximum sample size, not under the null hypothesis but under the alternative hypothesis. For example, Mander and Thompson (2010) and Mander et al. (2012) investigated novel designs which are optimal under the alternative hypothesis, that the tumor response rate is higher than the null hypothesis value, and also designs which allow early stopping for efficacy. Wason et al. (2011) considered reducing sample size for phase II trials with a continuous outcome, and Kwak and Jung (2014) proposed a two-stage adaptive optimal design for single arm trials with right-censored survival time that minimizes the expected sample size subject to Type I and II error rate specifications. Phase II designs with three stages were motivated and proposed in Ensign et al. (1994), Chen (1997), and Chen and Shan (2007). Schlesselman and Reis (2006) noted limits and benefits of phase II trials and Zhou (2004) gave some guidance on the choice of a design for an early phase trial.

Software for computing various types of phase II adaptive designs for such trials is available in commercial statistical software packages, like SAS, STATA and JMP. There are also codes on webpages that generate various adaptive optimal designs for early phase trials after the error rates are specified. One such site is at https://stattools.crab.org/Calculators/twoStage.htm. Because none of the current software can find our proposed designs, we have developed codes to run on a web browser using the R package shiny (http://shiny.rstudio.com) to generate our proposed designs. We call our R package ss2stagePSO and it is freely available at http://cansur.wayne.edu/. We also provide an option to compare competitive designs and user-specified designs relative to the optimal designs by reporting their relative efficiencies.

In the next section, we first describe extended two-stage adaptive designs for phase II clinical trials and technical background before discussing nature-inspired stochastic population-based algorithms for finding our proposed designs. There are many of them and, as an example, we focus on particle swarm optimization (PSO), which seems to be widely used. Motivated by our recent successes with this flexible algorithm, we modified it for our application at hand, because PSO was originally designed to solve optimization problems over a continuous domain, not over a domain consisting of discrete positive integers, and call the result Discrete PSO (DPSO). In particular, we show how DPSO effectively solves our extended two-stage adaptive design problems and related problems that a greedy algorithm cannot. We then apply our algorithm to real applications and show our proposed designs require smaller sample sizes than those that were implemented in the trial. The paper is concluded with a summary.

Extended two-stage adaptive designs

Suppose p0 is the maximum uninteresting response rate and there are three choices for the target response rates: p1, p2, p3, where 0 < p0 < p1 < p2 < p3 < 1. Data from stages 1 and 2 will be used to test one of the null hypotheses in stage 2 depending on the number of responders in stage 1. Our adaptive design assumes that a total of n1 patients are assigned at the first stage and the total number of patients and tests required in the entire trial will depend on the number of responses in the first and subsequent stages. The null hypothesis at the first stage is H0: p ≤ p0. According to the number of responses in the first stage, the corresponding alternative hypothesis will be one of three hypotheses: H11: p > p1, H12: p > p2 and H13: p > p3. Using notation similar to that of Lin and Shih (2004), our extended two-stage adaptive design has a total of 10 parameters given by θ = (s1, r1, q1, n1, s, l, r, m, q, n) and it operates as follows:

  • Step I: Begin by recruiting n1 patients in the first stage of our design and observe the number of responses, x, out of the n1 patients.

  • Step II:
    1. If x ≤ s1, stop the trial with failure to reject H0 (i.e., p ≤ p0).
    2. If s1 < x ≤ r1, power the study at (1 − β1) for p = p1 and enter l2 = l − n1 additional patients into the study. Reject the hypothesis that H11: p > p1 if the total number of responses ≤ s out of l patients.
    3. If r1 < x ≤ q1, power the study at (1 − β2) for p = p2 and enter m2 = m − n1 additional patients into the study. Reject the hypothesis that H12: p > p2 if the total number of responses ≤ r out of m patients.
    4. If x > q1, power the study at (1 − β3) for p = p3 and enter n2 = n − n1 additional patients into the study. Reject the hypothesis that H13: p > p3 if the total number of responses ≤ q out of n patients.
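The branching in Step II can be sketched in a few lines of Python. This is only an illustration of the rule as stated above; the function name `stage2_plan`, the dictionary layout of θ, and the returned tuple are assumptions made for the example and are not part of the design's definition.

```python
def stage2_plan(x, theta):
    """Map the stage-1 response count x to the stage-2 action.

    theta holds the ten design parameters under the keys
    s1, r1, q1, n1, s, l, r, m, q, n (hypothetical layout).
    Returns (target, additional_patients, bound); target is None when the
    trial stops at stage 1, and bound is the cumulative response count at
    or below which the targeted alternative is rejected.
    """
    n1 = theta["n1"]
    if not 0 <= x <= n1:
        raise ValueError("x must lie between 0 and n1")
    if x <= theta["s1"]:
        return (None, 0, None)                       # stop: fail to reject H0
    if x <= theta["r1"]:
        return ("p1", theta["l"] - n1, theta["s"])   # l2 = l - n1
    if x <= theta["q1"]:
        return ("p2", theta["m"] - n1, theta["r"])   # m2 = m - n1
    return ("p3", theta["n"] - n1, theta["q"])       # n2 = n - n1


# Example with s1/r1/q1/n1 = 0/1/4/10, s/l = 3/28, r/m = 3/31, q/n = 5/28
design = dict(s1=0, r1=1, q1=4, n1=10, s=3, l=28, r=3, m=31, q=5, n=28)
print(stage2_plan(3, design))
```

With this design, three responders among the first ten patients fall in the range r1 < x ≤ q1, so the sketch selects the p2 branch and enrols m − n1 = 21 further patients.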

By construction, the values of l, m, n are the total number of patients required for the entire trial corresponding to the alternative hypotheses, H11: p > p1, H12: p > p2, and H13: p > p3, respectively. The extended two-stage adaptive design has a total of 10 parameters denoted by θ = (s1, r1, q1, n1, s, l, r, m, q, n) that we wish to optimize given error rate constraints and the stipulated three response rates. If the true response probability is p, the probability of failing to reject H0 is given by

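$$G(\theta \mid p) = B(s_1, n_1, p) + \sum_{x=s_1+1}^{\min(r_1,\,s)} b(x, n_1, p)\, B(s - x, l_2, p) + \sum_{x=r_1+1}^{\min(q_1,\,r)} b(x, n_1, p)\, B(r - x, m_2, p) + \sum_{x=q_1+1}^{\min(q,\,n_1)} b(x, n_1, p)\, B(q - x, n_2, p), \tag{1}$$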

where b and B are the probability density function and cumulative density function of a binomial distribution, respectively, with l = n1 + l2, m = n1 + m2 and n = n1 + n2. It follows that when the true probability response rate is p, the expected sample size is

$$E(N \mid p, \theta) = n_1 + \bigl(B(r_1, n_1, p) - B(s_1, n_1, p)\bigr)\, l_2 + \bigl(B(q_1, n_1, p) - B(r_1, n_1, p)\bigr)\, m_2 + \bigl(1 - B(q_1, n_1, p)\bigr)\, n_2. \tag{2}$$
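Equations (1) and (2) can be evaluated directly with exact binomial probabilities. The following is a minimal sketch using only the Python standard library; the function names `b`, `B`, `G` and `expected_n` mirror the symbols in the text but are otherwise illustrative, and θ is assumed to be passed as a 10-tuple in the order (s1, r1, q1, n1, s, l, r, m, q, n).

```python
from math import comb


def b(x, n, p):
    """Binomial probability P(X = x) for X ~ Bin(n, p)."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def B(x, n, p):
    """Binomial cumulative probability P(X <= x)."""
    return sum(b(k, n, p) for k in range(min(x, n) + 1))


def G(theta, p):
    """Probability of failing to reject H0, equation (1)."""
    s1, r1, q1, n1, s, l, r, m, q, n = theta
    l2, m2, n2 = l - n1, m - n1, n - n1
    total = B(s1, n1, p)
    for x in range(s1 + 1, min(r1, s) + 1):
        total += b(x, n1, p) * B(s - x, l2, p)
    for x in range(r1 + 1, min(q1, r) + 1):
        total += b(x, n1, p) * B(r - x, m2, p)
    for x in range(q1 + 1, min(q, n1) + 1):
        total += b(x, n1, p) * B(q - x, n2, p)
    return total


def expected_n(theta, p):
    """Expected sample size, equation (2)."""
    s1, r1, q1, n1, s, l, r, m, q, n = theta
    return (n1
            + (B(r1, n1, p) - B(s1, n1, p)) * (l - n1)
            + (B(q1, n1, p) - B(r1, n1, p)) * (m - n1)
            + (1 - B(q1, n1, p)) * (n - n1))


# One-target special case: setting r1 = q1 = n1 leaves only the p1 branch.
theta = (0, 10, 10, 10, 3, 29, 3, 29, 3, 29)
print(round(G(theta, 0.05), 3), round(expected_n(theta, 0.05), 3))
```

As a check, setting r1 = q1 = n1 collapses the formulas to a single target response; for the design s1/n1 = 0/10 and s/n = 3/29 with p0 = 0.05 this gives G ≈ 0.953 and E(N|p0) ≈ 17.624, in agreement with the first row of Table 1.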

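Our design problem is to find a good choice $\hat{\theta} \in \Theta$, where Θ is the set containing all values of θ that satisfy the four natural error constraints

$$G(\theta \mid p_0) \ge 1 - \alpha, \quad G(\theta \mid p_1) \le \beta_1, \quad G(\theta \mid p_2) \le \beta_2 \quad \text{and} \quad G(\theta \mid p_3) \le \beta_3. \tag{3}$$

The goodness of $\hat{\theta}$ may be determined by one of the following four optimality criteria:

  • C1: $\hat{\theta} = \arg\min_{\theta \in \Theta} E(N \mid p_0, \theta)$;

  • C2: $\hat{\theta} = \arg\min_{\theta \in \Theta} E(N \mid p_0, \theta)$ and $\hat{\theta} = \arg\min_{\theta \in \Theta} \{\max(l, m, n)\}$;

  • C3: $\hat{\theta} = \arg\min_{\theta \in \Theta} \{\max_{i=0,1,2,3} E(N \mid p_i, \theta)\}$;

  • C4: $\hat{\theta} = \arg\min_{\theta \in \Theta} \{\max_{i=0,1,2,3} E(N \mid p_i, \theta)\}$ and $\hat{\theta} = \arg\min_{\theta \in \Theta} \{\max(l, m, n)\}$.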

The optimality criteria C1–C4 are extensions of Lin and Shih (2004)’s criteria and, if there is only one target response, criteria C1 and C2 are exactly the same as Simon’s two optimality criteria. Our proposed extended 2-stage adaptive design is schematically described in Figure 1.
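The constraints (3) and the criteria C1–C4 translate into a feasibility test and a comparison key. The sketch below works on already computed values of G and E(N), so it stays independent of how those are obtained. The names `is_feasible` and `criterion_key` are hypothetical, and combining the two parts of C2 and C4 lexicographically (expected sample size first, then max(l, m, n)) is an assumption made here for illustration; the text does not state how the two objectives are combined.

```python
def is_feasible(g, alpha, betas):
    """Check the error constraints (3).

    g     = [G(theta|p0), G(theta|p1), ...] for the null and each target rate
    betas = (beta1, beta2, ...) in the same order as the target rates
    """
    return g[0] >= 1 - alpha and all(gi <= bi for gi, bi in zip(g[1:], betas))


def criterion_key(name, en, totals):
    """Return a tuple to minimize; smaller tuples are better designs.

    en     = [E(N|p0), E(N|p1), ...]
    totals = (l, m, n), the total sample sizes of the three branches
    """
    primary = en[0] if name in ("C1", "C2") else max(en)
    if name in ("C1", "C3"):
        return (primary,)
    # Assumption: ties in the primary objective are broken by max(l, m, n).
    return (primary, max(totals))


# Operating characteristics as reported for one three-target design
g = [0.953, 0.200, 0.088, 0.037]
en = [17.481, 27.841, 29.020, 29.593]
print(is_feasible(g, 0.05, (0.20, 0.10, 0.05)), criterion_key("C3", en, (28, 31, 28)))
```

Candidate designs can then be ranked with `min(candidates, key=...)` over the feasible ones.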

Figure 1.

Flowchart of the extended two-stage adaptive design for Phase II clinical trials (third column) with s1 < r1 < q1. Simon’s two-stage design is obtained by setting r1 = n1 (first column) and Lin and Shih’s design is obtained by setting q1 = n1 (second column).

The above extension to three prefixed possible target responses may appear straightforward to implement, but in reality it is computationally impractical to find a solution with greedy algorithms because of the size of the search space. Indeed, we formulate the optimization problem and show that greedy algorithms fail to find a solution for our 10-dimensional optimization problem after running for 30 days. To this end, we briefly review a nature-inspired stochastic population-based algorithm, the particle swarm optimization technique, and show in the following sections that it not only provides solutions in a relatively short time but also facilitates performance comparison among different algorithms.

Discrete Particle Swarm Optimization (DPSO)

Particle swarm optimization (PSO), a population-based global optimization method, was introduced by Kennedy and Eberhart (1995). It is an evolutionary algorithm and stochastically evolves a group of particles that mimic observational behavior from nature. PSO is motivated from observing how a flock of birds move in the sky and so is a member of the class of nature-inspired stochastic population-based algorithms. This class of algorithms has been gaining lots of recognition especially in the last decade or so for its ability to solve or nearly solve hard-to-optimize high dimensional problems in the real world. Whitacre (2011a,b) documented the meteoric rise in use of such algorithms in applied fields and increasingly in academia as well. Their main appeals are that they are simple to implement, assumption-free and tend to quickly converge to the optimum or get quickly to the vicinity of the optimum solution. Additionally, they are general-purpose optimization algorithms and so are adaptable to solve different types of optimization problems after the user inputs some tuning parameters to initiate the algorithm. Genetic algorithm, simulated annealing and PSO are examples of such algorithms.

Some recent applications of PSO to solve optimal design problems in statistics are Qiu et al. (2014), Chen et al. (2014), Phoa et al. (2016), and Wong et al. (2015). They tackled design problems in biomedical settings that ranged from finding D-optimal designs for several nonlinear models and mixture models with multiple constraints to finding minimax and standardized maximin types of optimal designs. The latter designs have non-differentiable optimality criteria that require a couple of nested levels of optimization and are notoriously difficult to find. Phoa et al. (2016) applied swarm intelligence to find an optimal supersaturated design in a high-dimensional problem that involves judicious and repeated exchanges of columns in the design matrix to minimize correlations among the columns via the E(s²) criterion. PSO has also been successfully used for estimation in statistical problems. For example, Kim and Li (2011, 2014) applied PSO to estimate parameters in nonlinear mixed-effects pharmacokinetics models, and Kim et al. (2012) employed PSO to estimate the efficacy of different lung cancer screening methods. In each of the above problems, we observed, as many others had, the flexibility of PSO and how readily it can be modified to solve the optimization problem at hand. In what follows, we first demonstrate yet again that PSO can be adapted to efficiently solve our adaptive design problem, which has a very different setup from those just mentioned. Second, we show PSO can solve high-dimensional problems with many parameters in our adaptive design problems that current greedy algorithms cannot, and third, PSO can substantially outperform greedy algorithms in adaptive design problems with a small number of parameters.

Operationally, the user initiates PSO by first specifying the maximum number of iterations allowed, say K, and the flock size, say N, of randomly generated particles for the search. Each particle represents a candidate solution to the problem and, at any one time, each has a fitness value (i.e., the design criterion value). As it searches, each particle keeps track of its best value (personal or local best value) and communicates with the rest of the particles to determine the global best value, defined as the best personal best value among the flock up to that iteration. Each particle updates these two values continuously as it moves across the search space in the direction guided by its most recent local best and the global best, with a velocity determined stochastically by a combination of the tuning parameters, its personal best, the global best and the velocity with which it arrived at the current position. Specifically, suppose the nth particle’s position vector is $x^n = (x_k^n)_{k=1,\dots,K}$ and its updating velocity vector is $v^n = (v_k^n)_{k=1,\dots,K}$, n = 1, ⋯, N. If its local best and global best are $x_{lbest}^n$ and $x_{gbest}$ respectively, its velocity $v_{k+1}^n$ and position $x_{k+1}^n$ at the (k + 1)th iteration are determined by

$$v_{k+1}^n = w_k v_k^n + c_1 r_1 (x_{lbest}^n - x_k^n) + c_2 r_2 (x_{gbest} - x_k^n); \qquad x_{k+1}^n = x_k^n + v_{k+1}^n. \tag{4}$$

Here wk is the inertia weight (0 ≤ wk ≤ 1) at the kth iteration, c1 and c2 are two positive constants called the cognitive and social coefficients, and r1 and r2 are two random variates in the range [0, 1]. The lower limit values of the constants c1 and c2 determine how far particles are allowed to wander beyond the target region before being pulled back, and their upper limit values determine how suddenly the particles are moved back to the target region. Following convention, we set c1 and c2 to their default values of 2 in our simulation studies. At the kth iteration, the inertia weight is $w_k = w_{\max} - \frac{k}{K}(w_{\max} - w_{\min})$, where $w_{\min}$ and $w_{\max}$ are user-defined constants satisfying $0 \le w_K = w_{\min} \le w_k \le w_{\max} = w_0 \le 1$.
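One iteration of update (4) with the linearly decreasing inertia weight can be sketched as follows. This is a minimal illustration of the standard continuous PSO step, not the authors' implementation; the function names and the default values `w_min=0.4` and `w_max=0.9` are assumptions chosen for the example, while c1 = c2 = 2 follows the text.

```python
import random


def inertia_weight(k, K, w_min=0.4, w_max=0.9):
    """Linearly decreasing inertia weight: w_max at k = 0, w_min at k = K."""
    return w_max - (k / K) * (w_max - w_min)


def pso_step(x, v, x_lbest, x_gbest, k, K, c1=2.0, c2=2.0, rng=random):
    """Apply update (4) to one particle; returns (new_position, new_velocity).

    x, v, x_lbest and x_gbest are equal-length lists of floats.
    """
    w = inertia_weight(k, K)
    r1, r2 = rng.random(), rng.random()   # random variates in [0, 1)
    v_new = [w * vi + c1 * r1 * (lb - xi) + c2 * r2 * (gb - xi)
             for xi, vi, lb, gb in zip(x, v, x_lbest, x_gbest)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new


# One step for a two-dimensional particle at iteration 5 of 100
rng = random.Random(1)
print(pso_step([1.0, 4.0], [0.5, -0.5], [2.0, 3.0], [3.0, 3.0], k=5, K=100, rng=rng))
```

A particle that sits at both its personal best and the global best with zero velocity does not move, since every term on the right-hand side of (4) vanishes.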

During the search, the inertia weight adaptively controls the impact of the previous history of velocities on the current velocity and also influences the trade-off between global (wide-ranging) and local (nearby) exploration abilities of the particles as they move across the search space. A larger inertia weight facilitates global exploration (searching new areas) and a smaller inertia weight facilitates local exploration. When the inertia weight is suitably chosen, it can provide a balance between global and local exploration abilities and, on average, require fewer iterations to find the global optimum (Shi and Eberhart (1998)). To exploit these properties of the inertia weight, we use a dynamic inertia weight to enable PSO to escape from premature convergence when it stagnates (Tao and Cai (2009)).

PSO was originally designed to solve optimization problems over a continuous domain, not over a domain consisting of discrete positive integers. Since PSO is a flexible algorithm, we modified it for our problem and call the result Discrete PSO (DPSO). The task is to choose the right combination of several positive integers that minimizes the total sample size required under various hypotheses and at the same time meets the user-specified Type I and II error rates. The proposed DPSO differs from PSO in two features: (i) the inertia weight is the nearest integer to PSO’s inertia weight wk at the kth iteration, i.e., $\lfloor w_k + \tfrac{1}{2} \rfloor = \lfloor w_{\max} - \frac{k}{K}(w_{\max} - w_{\min}) + \tfrac{1}{2} \rfloor$, where $\lfloor x \rfloor = \arg\max_m \{m \in \mathbb{Z} \mid m \le x\}$ and $\mathbb{Z}$ is the set of integers, and (ii) the two random sequences are generated from a discrete uniform distribution.
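The two DPSO modifications can be sketched by rounding the inertia weight and drawing integer random multipliers, so that integer positions stay integer after every update. The code below is an illustration only: the text does not give the support of the discrete uniform distribution, so the range {0, ..., r_max} with `r_max=1` is an assumption, as are the function names and the defaults `w_min=0.4` and `w_max=0.9`.

```python
import math
import random


def discrete_inertia(k, K, w_min=0.4, w_max=0.9):
    """Nearest-integer inertia weight: floor(w_k + 1/2)."""
    w = w_max - (k / K) * (w_max - w_min)
    return math.floor(w + 0.5)


def dpso_step(x, v, x_lbest, x_gbest, k, K, c1=2, c2=2, r_max=1, rng=random):
    """One DPSO update on integer vectors; returns (new_position, new_velocity)."""
    w = discrete_inertia(k, K)
    # Assumption: r1 and r2 are discrete uniform on {0, 1, ..., r_max}.
    r1 = rng.randint(0, r_max)
    r2 = rng.randint(0, r_max)
    v_new = [w * vi + c1 * r1 * (lb - xi) + c2 * r2 * (gb - xi)
             for xi, vi, lb, gb in zip(x, v, x_lbest, x_gbest)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new


rng = random.Random(0)
print(dpso_step([3, 10, 5], [1, -1, 0], [2, 9, 5], [4, 12, 6], k=10, K=100, rng=rng))
```

Because every quantity in the update is an integer, no rounding of positions is needed. A practical implementation would presumably also clip each component to its search boundaries, a step this sketch leaves out.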

Extended two-stage adaptive design with 3 target responses

We consider two optimization approaches to find $\hat{\theta}$, the optimal value for θ = (s1, r1, q1, n1, s, l, r, m, q, n) for our adaptive design problem. One is the greedy search and the other is DPSO proposed in this study. The greedy search adopts the strategies used in Simon’s or Lin and Shih (2004)’s algorithms for the extended 2-stage adaptive design. Specifically, we first calculate the required sample sizes ni, i = l, m, and n, for a single-stage design, where nl, nm, and nn are the required sample sizes corresponding to the target response rates p1, p2, and p3, respectively, and constrain l, m, and n to be in the range of ⌊0.85 · ni⌋ ≤ i ≤ ⌊1.5 · ni + 1⌋, where i = l, m, and n. For each value of l, m, and n, and each of n1 in [1, min(l, m, n) − 1], s1 in [0, n1], r1 in [s1 + 1, n1], and q1 in [r1 + 1, n1], we then find the set of feasible solutions Θ that satisfies the error constraints (3) for q in [q1 + 1, n], r in [r1 + 1, m], and s in [s1 + 1, l] and determine the optimum according to one of the optimality criteria C1–C4.
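To make the enumeration concrete, here is a sketch of the search restricted to a single target response under criterion C1, where only (s1, n1, s, n) must be chosen. It is a simplified illustration, not the authors' code: the range [n_lo, n_hi] for the total sample size is passed in by the caller rather than derived from a single-stage design, the function names are hypothetical, and the early exits rely on the fact that the probability of failing to reject H0 can only grow with s1 and with s.

```python
from math import comb


def pmf(x, n, p):
    """Binomial probability P(X = x) for X ~ Bin(n, p)."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def cdf(x, n, p):
    """Binomial probability P(X <= x)."""
    return sum(pmf(k, n, p) for k in range(min(x, n) + 1))


def accept_h0(s1, n1, s, n, p):
    """Probability of failing to reject H0 in the one-target design."""
    total = cdf(s1, n1, p)
    for x in range(s1 + 1, min(n1, s) + 1):
        total += pmf(x, n1, p) * cdf(s - x, n - n1, p)
    return total


def greedy_one_target(p0, p1, alpha, beta, n_lo, n_hi):
    """Enumerate designs and keep the feasible one minimizing E(N|p0)."""
    best = None
    for n in range(n_lo, n_hi + 1):
        for n1 in range(1, n):
            for s1 in range(0, n1 + 1):
                if cdf(s1, n1, p1) > beta:
                    break      # stage-1 stopping alone already exceeds beta
                en0 = n1 + (1 - cdf(s1, n1, p0)) * (n - n1)
                if best is not None and en0 >= best[0]:
                    continue   # cannot improve on the incumbent
                for s in range(s1 + 1, n + 1):
                    if accept_h0(s1, n1, s, n, p1) > beta:
                        break  # the Type II constraint fails for all larger s
                    if accept_h0(s1, n1, s, n, p0) >= 1 - alpha:
                        best = (en0, (s1, n1, s, n))
                        break
    return best


print(greedy_one_target(0.05, 0.20, 0.05, 0.20, n_lo=27, n_hi=30))
```

For p0 = 0.05, p1 = 0.20, α = 0.05 and β = 0.20, the design 0/10, 3/29 reported in Table 1 lies inside this range and is feasible, so the search returns a design whose E(N|p0) is no larger than 17.624. With three target responses the same idea requires nested loops over all ten parameters, which is what makes the greedy search impractical.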

To implement DPSO, we specify the lower and upper boundaries for each component of the parameter vector θ = (s1, r1, q1, n1, s, l, r, m, q, n). Similar to the greedy search, we use the sample sizes nl, nm, and nn, required for a single-stage design corresponding to the target response rates p1, p2, and p3, respectively and set

$$L_i = \lfloor 0.85\, n_i \rfloor,\ i = l, m, n; \qquad U_i = \lfloor 1.5\, n_i + 1 \rfloor,\ i = l, m, n; \qquad U = \min\{U_i;\ i = l, m, n\}. \tag{5}$$

The lower and upper boundaries for each component of θ are then obtained as follows:

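$$s_1 \in [0,\ Um2 + 1];\quad r_1 \in [0,\ \lfloor U p_1 \rfloor + 1];\quad q_1 \in [0,\ \lfloor U p_2 \rfloor + 1];\quad n_1 \in [1,\ U];$$
$$s \in [0,\ \lfloor U_l\, p_1 \rfloor + 1];\quad l \in [L_l,\ U_l];\quad r \in [0,\ \lfloor U_m\, p_2 \rfloor + 1];\quad m \in [L_m,\ U_m];\quad q \in [0,\ \lfloor U_n\, p_3 \rfloor + 1];\quad n \in [L_n,\ U_n],$$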

and used in DPSO to find the optimal solution under each of the C1–C4 optimality criteria among the feasible solutions in Θ that satisfy the error constraints (3).
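The limits in (5) depend only on the three single-stage sample sizes. A minimal sketch follows, assuming those sample sizes are already available; the function name `search_limits` and the example values are hypothetical. Integer arithmetic is used so the floors are exact.

```python
def search_limits(n_single):
    """Compute L_i, U_i and U from equation (5).

    n_single maps the labels 'l', 'm', 'n' to the single-stage sample sizes
    for the target rates p1, p2, p3 (illustrative input format).
    """
    lower = {i: (85 * ni) // 100 for i, ni in n_single.items()}   # floor(0.85 * n_i)
    upper = {i: (3 * ni) // 2 + 1 for i, ni in n_single.items()}  # floor(1.5 * n_i + 1)
    return lower, upper, min(upper.values())


# Hypothetical single-stage sample sizes, for illustration only
lower, upper, U = search_limits({"l": 29, "m": 40, "n": 35})
print(lower, upper, U)
```

The total sample sizes l, m and n are then searched within [L_i, U_i], and the first-stage size n1 within [1, U].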

The standard PSO is usually not sensitive to the initial values of the tuning parameters and can find an optimal solution using the default values. However, in our DPSO, many iterations are required for convergence, especially when there are two or three target responses and the initial values are not among the feasible solutions in Θ. For this reason, we consider two approaches (G-DPSO and D-DPSO) to find an appropriate initial set of values among the feasible solutions in Θ to boost the speed of convergence of DPSO.

In the first approach, the initial value is found by a greedy search over a smaller range of values for l, m, and n rather than over the entire range. Specifically, the sample sizes are restricted to the range ni − 1 ≤ i ≤ ni + 1, where i = l, m, and n. Within this smaller domain, we search for an appropriate set of initial values using the same strategy as the greedy search for the rest of the parameters. By shrinking the range of l, m, and n, we potentially reduce the computational burden. We call DPSO with this first approach G-DPSO. In particular, we observed that the computation time was consistently short when G-DPSO was applied to the case with one target response (see Table 1 and Supplementary Information Table S1). However, similar to the greedy search, the computation time of G-DPSO increases as the number of target responses increases, especially for two or three target responses (for example, see Tables 2 and 3 and Supplementary Information Tables S2 and S3). Therefore, for the cases with two or three target responses, we devise a second approach whose computation time does not depend on the number of target responses.

Table 1.

Various adaptive 2-stage optimal designs with one target response when α = 0.05 and β = 0.20.

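| p0 | p1 | Optimal criteria | Method | s1/n1 | s/n | 1 − α | β | E(N\|p0) | E(N\|p1) |
|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.20 | C1 | GS | 0/10 | 3/29 | 0.953 | 0.199 | 17.624 | 26.960 |
| 0.05 | 0.20 | C1 | G-DPSO | 0/10 | 3/29 | 0.953 | 0.199 | 17.624 | 26.960 |
| 0.05 | 0.20 | C2 | GS | 0/11 | 3/28 | 0.956 | 0.199 | 18.330 | 26.540 |
| 0.05 | 0.20 | C2 | G-DPSO | 0/11 | 3/28 | 0.956 | 0.199 | 18.330 | 26.540 |
| 0.05 | 0.20 | C3 | GS | 0/11 | 3/28 | 0.956 | 0.199 | 18.330 | 26.540 |
| 0.05 | 0.20 | C3 | G-DPSO | 0/11 | 3/28 | 0.956 | 0.199 | 18.330 | 26.540 |
| 0.05 | 0.20 | C4 | GS | 0/11 | 3/28 | 0.956 | 0.199 | 18.330 | 26.540 |
| 0.05 | 0.20 | C4 | G-DPSO | 0/11 | 3/28 | 0.956 | 0.199 | 18.330 | 26.540 |
| 0.20 | 0.35 | C1 | GS | 5/22 | 19/72 | 0.951 | 0.200 | 35.368 | 63.855 |
| 0.20 | 0.35 | C1 | G-DPSO | 5/22 | 19/72 | 0.951 | 0.200 | 35.368 | 63.855 |
| 0.20 | 0.35 | C2 | GS | 3/21 | 15/53 | 0.950 | 0.200 | 41.148 | 51.941 |
| 0.20 | 0.35 | C2 | G-DPSO | 3/21 | 15/53 | 0.950 | 0.200 | 41.148 | 51.941 |
| 0.20 | 0.35 | C3 | GS | 6/31 | 15/53 | 0.950 | 0.198 | 40.436 | 51.983 |
| 0.20 | 0.35 | C3 | G-DPSO | 6/31 | 15/53 | 0.950 | 0.198 | 40.436 | 51.983 |
| 0.20 | 0.35 | C4 | GS | 3/21 | 15/53 | 0.950 | 0.200 | 41.148 | 51.941 |
| 0.20 | 0.35 | C4 | G-DPSO | 3/21 | 15/53 | 0.950 | 0.200 | 41.148 | 51.941 |
| 0.55 | 0.70 | C1 | GS | 15/26 | 48/76 | 0.952 | 0.195 | 42.021 | 69.735 |
| 0.55 | 0.70 | C1 | G-DPSO | 15/26 | 48/76 | 0.952 | 0.195 | 42.021 | 69.735 |
| 0.55 | 0.70 | C2 | GS | 20/35 | 43/67 | 0.953 | 0.200 | 45.802 | 64.662 |
| 0.55 | 0.70 | C2 | G-DPSO^a | 20/35 | 43/67 | 0.953 | 0.200 | 45.802 | 64.662 |
| 0.55 | 0.70 | C3 | GS | 20/35 | 43/67 | 0.953 | 0.200 | 45.802 | 64.662 |
| 0.55 | 0.70 | C3 | G-DPSO | 20/35 | 43/67 | 0.953 | 0.200 | 45.802 | 64.662 |
| 0.55 | 0.70 | C4 | GS | 20/35 | 43/67 | 0.953 | 0.200 | 45.802 | 64.662 |
| 0.55 | 0.70 | C4 | G-DPSO^b | 20/35 | 43/67 | 0.953 | 0.200 | 45.802 | 64.662 |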

Table 2.

Various adaptive 2-stage optimal designs with two target responses when α = 0.05, β1 = 0.20 and β2 = 0.10.

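| p0 | p1 | p2 | Optimal criteria | Method | s1/r1/n1 | s/m | r/n | 1 − α | β1 | β2 | E(N\|p0) | E(N\|p1) | E(N\|p2) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.20 | 0.25 | C1 | GS | 0/1/9 | 3/30 | 4/41 | 0.957 | 0.199 | 0.094 | 17.548 | 33.383 | 36.119 |
| 0.05 | 0.20 | 0.25 | C1 | G-DPSO | 0/1/10 | 3/28 | 3/31 | 0.953 | 0.199 | 0.088 | 17.481 | 27.940 | 29.254 |
| 0.05 | 0.20 | 0.25 | C1 | D-DPSO | 0/1/8 | 4/41 | 3/36 | 0.952 | 0.200 | 0.107 | 18.821 | 32.980 | 34.532 |
| 0.05 | 0.20 | 0.25 | C2 | GS | 0/2/16 | 3/29 | 3/22 | 0.955 | 0.199 | 0.082 | 22.978 | 24.097 | 23.249 |
| 0.05 | 0.20 | 0.25 | C2 | G-DPSO | 0/2/14 | 3/29 | 3/21 | 0.954 | 0.200 | 0.084 | 21.444 | 23.925 | 22.982 |
| 0.05 | 0.20 | 0.25 | C2 | D-DPSO | 0/2/14 | 3/29 | 3/21 | 0.954 | 0.200 | 0.084 | 21.444 | 23.925 | 22.982 |
| 0.05 | 0.20 | 0.25 | C3 | GS | 0/4/9 | 4/41 | 5/22 | 0.960 | 0.174 | 0.084 | 20.831 | 36.333 | 37.668 |
| 0.05 | 0.20 | 0.25 | C3 | G-DPSO | 0/5/11 | 3/28 | 6/25 | 0.956 | 0.199 | 0.083 | 18.330 | 26.505 | 27.179 |
| 0.05 | 0.20 | 0.25 | C3 | D-DPSO | 0/5/11 | 3/28 | 6/25 | 0.956 | 0.199 | 0.083 | 18.330 | 26.505 | 27.179 |
| 0.05 | 0.20 | 0.25 | C4 | GS | 0/2/16 | 3/29 | 3/22 | 0.955 | 0.199 | 0.082 | 22.978 | 24.097 | 23.249 |
| 0.05 | 0.20 | 0.25 | C4 | G-DPSO | 0/2/15 | 3/28 | 3/23 | 0.957 | 0.197 | 0.078 | 21.796 | 24.533 | 24.007 |
| 0.05 | 0.20 | 0.25 | C4 | D-DPSO | 0/2/15 | 3/28 | 3/23 | 0.957 | 0.197 | 0.078 | 21.796 | 24.533 | 24.007 |
| 0.20 | 0.35 | 0.40 | C1 | G-DPSO | 4/9/20 | 17/62 | 10/36 | 0.952 | 0.199 | 0.069 | 35.487 | 53.869 | 53.499 |
| 0.20 | 0.35 | 0.40 | C1 | D-DPSO | 5/10/22 | 19/72 | 11/36 | 0.951 | 0.199 | 0.078 | 35.311 | 60.004 | 60.179 |
| 0.20 | 0.35 | 0.40 | C2 | G-DPSO | 8/12/37 | 16/57 | 13/41 | 0.950 | 0.196 | 0.059 | 42.913 | 46.949 | 44.252 |
| 0.20 | 0.35 | 0.40 | C2 | D-DPSO | 5/10/29 | 16/56 | 11/36 | 0.951 | 0.196 | 0.055 | 43.094 | 46.415 | 42.638 |
| 0.20 | 0.35 | 0.40 | C3 | G-DPSO | 6/11/27 | 16/58 | 13/43 | 0.950 | 0.200 | 0.063 | 35.833 | 51.406 | 50.886 |
| 0.20 | 0.35 | 0.40 | C3 | D-DPSO | 3/10/21 | 15/53 | 11/36 | 0.950 | 0.200 | 0.058 | 41.131 | 50.629 | 49.683 |
| 0.20 | 0.35 | 0.40 | C4 | G-DPSO | 8/12/38 | 16/56 | 13/43 | 0.952 | 0.200 | 0.057 | 43.830 | 47.337 | 45.224 |
| 0.20 | 0.35 | 0.40 | C4 | D-DPSO | 3/10/21 | 15/53 | 11/36 | 0.950 | 0.200 | 0.058 | 41.131 | 50.629 | 49.683 |
| 0.55 | 0.70 | 0.75 | C1 | G-DPSO | 15/20/26 | 48/76 | 27/39 | 0.951 | 0.200 | 0.053 | 41.807 | 63.720 | 61.521 |
| 0.55 | 0.70 | 0.75 | C1 | D-DPSO | 15/20/26 | 48/76 | 27/39 | 0.951 | 0.200 | 0.053 | 41.807 | 63.720 | 61.521 |
| 0.55 | 0.70 | 0.75 | C2 | G-DPSO | 24/28/41 | 47/73 | 30/45 | 0.951 | 0.194 | 0.039 | 48.872 | 55.463 | 50.268 |
| 0.55 | 0.70 | 0.75 | C2 | D-DPSO | 11/17/24 | 45/70 | 30/45 | 0.951 | 0.199 | 0.043 | 57.956 | 59.756 | 54.718 |
| 0.55 | 0.70 | 0.75 | C3 | G-DPSO | 13/20/25 | 43/67 | 25/42 | 0.951 | 0.194 | 0.038 | 47.730 | 62.880 | 61.206 |
| 0.55 | 0.70 | 0.75 | C3 | D-DPSO | 14/20/26 | 43/67 | 23/39 | 0.950 | 0.199 | 0.041 | 45.163 | 59.975 | 56.926 |
| 0.55 | 0.70 | 0.75 | C4 | G-DPSO | 25/30/44 | 44/68 | 32/49 | 0.950 | 0.197 | 0.037 | 51.862 | 56.540 | 52.457 |
| 0.55 | 0.70 | 0.75 | C4 | D-DPSO | 10/16/21 | 43/67 | 28/41 | 0.950 | 0.200 | 0.044 | 51.907 | 60.628 | 57.152 |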

Table 3.

Various adaptive 2-stage optimal designs with three target responses when α = 0.05, β1 = 0.20, β2 = 0.10 and β3 = 0.05.

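| p0 | p1 | p2 | p3 | Optimal criteria | Method | s1/r1/q1/n1 | s/l | r/m | q/n | 1 − α | β1 | β2 | β3 | E(N\|p0) | E(N\|p1) | E(N\|p2) | E(N\|p3) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.20 | 0.25 | 0.30 | C1 | G-DPSO | 0/1/4/10 | 3/28 | 3/31 | 5/28 | 0.953 | 0.200 | 0.088 | 0.037 | 17.481 | 27.841 | 29.020 | 29.593 |
| 0.05 | 0.20 | 0.25 | 0.30 | C1 | D-DPSO | 0/1/2/9 | 4/36 | 3/34 | 3/31 | 0.959 | 0.199 | 0.093 | 0.044 | 18.816 | 30.463 | 31.375 | 31.691 |
| 0.05 | 0.20 | 0.25 | 0.30 | C2 | G-DPSO | 0/1/2/13 | 3/31 | 3/28 | 3/20 | 0.951 | 0.200 | 0.088 | 0.036 | 21.158 | 23.725 | 22.613 | 21.636 |
| 0.05 | 0.20 | 0.25 | 0.30 | C2 | D-DPSO | 0/1/2/14 | 4/36 | 3/33 | 3/19 | 0.952 | 0.198 | 0.094 | 0.043 | 24.391 | 24.899 | 22.847 | 21.245 |
| 0.05 | 0.20 | 0.25 | 0.30 | C3 | G-DPSO | 0/2/5/11 | 3/28 | 3/28 | 6/21 | 0.956 | 0.200 | 0.084 | 0.033 | 18.330 | 26.458 | 27.042 | 27.116 |
| 0.05 | 0.20 | 0.25 | 0.30 | C3 | D-DPSO | 0/3/4/11 | 3/29 | 5/29 | 5/22 | 0.953 | 0.198 | 0.085 | 0.034 | 18.761 | 27.101 | 27.437 | 27.172 |
| 0.05 | 0.20 | 0.25 | 0.30 | C4 | G-DPSO | 0/1/2/15 | 3/28 | 3/27 | 3/24 | 0.958 | 0.198 | 0.077 | 0.026 | 21.698 | 24.904 | 24.615 | 24.354 |
| 0.05 | 0.20 | 0.25 | 0.30 | C4 | D-DPSO | 0/1/2/14 | 4/32 | 3/30 | 3/22 | 0.960 | 0.198 | 0.079 | 0.028 | 22.675 | 25.189 | 24.130 | 23.260 |
| 0.20 | 0.35 | 0.40 | 0.45 | C1 | G-DPSO | 5/9/10/24 | 17/61 | 10/36 | 11/32 | 0.952 | 0.200 | 0.064 | 0.017 | 36.401 | 48.570 | 45.349 | 40.823 |
| 0.20 | 0.35 | 0.40 | 0.45 | C1 | D-DPSO | 5/11/13/22 | 19/72 | 12/36 | 14/32 | 0.951 | 0.200 | 0.078 | 0.028 | 35.355 | 62.126 | 63.957 | 61.554 |
| 0.20 | 0.35 | 0.40 | 0.45 | C2 | G-DPSO | 5/9/10/29 | 17/60 | 13/44 | 11/34 | 0.951 | 0.197 | 0.061 | 0.015 | 44.649 | 45.199 | 40.615 | 37.117 |
| 0.20 | 0.35 | 0.40 | 0.45 | C2 | D-DPSO | 3/6/8/19 | 19/67 | 11/38 | 9/31 | 0.950 | 0.194 | 0.060 | 0.016 | 43.149 | 47.817 | 43.504 | 39.094 |
| 0.20 | 0.35 | 0.40 | 0.45 | C3 | G-DPSO | 4/9/13/22 | 16/57 | 10/39 | 16/36 | 0.951 | 0.196 | 0.059 | 0.014 | 37.889 | 50.724 | 49.244 | 46.353 |
| 0.20 | 0.35 | 0.40 | 0.45 | C3 | D-DPSO | 3/8/9/17 | 17/62 | 12/36 | 11/28 | 0.951 | 0.197 | 0.070 | 0.024 | 37.230 | 54.484 | 54.003 | 50.929 |
| 0.20 | 0.35 | 0.40 | 0.45 | C4 | G-DPSO | 5/9/10/26 | 16/57 | 13/44 | 11/34 | 0.950 | 0.197 | 0.059 | 0.013 | 38.717 | 46.656 | 43.284 | 39.542 |
| 0.20 | 0.35 | 0.40 | 0.45 | C4 | D-DPSO | 4/8/11/24 | 18/64 | 11/36 | 12/28 | 0.952 | 0.193 | 0.068 | 0.026 | 44.584 | 48.279 | 42.940 | 37.622 |
| 0.55 | 0.70 | 0.75 | 0.80 | C1 | G-DPSO | 15/20/21/26 | 48/76 | 29/43 | 23/31 | 0.951 | 0.199 | 0.052 | 0.010 | 41.812 | 63.491 | 60.658 | 51.947 |
| 0.55 | 0.70 | 0.75 | 0.80 | C1 | D-DPSO | 10/15/16/19 | 46/72 | 23/39 | 18/29 | 0.951 | 0.198 | 0.047 | 0.008 | 44.911 | 62.696 | 60.681 | 54.260 |
| 0.55 | 0.70 | 0.75 | 0.80 | C2 | G-DPSO | 17/22/23/32 | 47/73 | 32/48 | 24/35 | 0.950 | 0.195 | 0.041 | 0.005 | 51.994 | 54.810 | 46.623 | 39.493 |
| 0.55 | 0.70 | 0.75 | 0.80 | C2 | D-DPSO | 8/13/14/17 | 46/72 | 27/39 | 21/31 | 0.951 | 0.195 | 0.048 | 0.009 | 52.800 | 62.503 | 58.360 | 51.268 |
| 0.55 | 0.70 | 0.75 | 0.80 | C3 | G-DPSO | 15/20/21/27 | 46/72 | 27/41 | 24/35 | 0.951 | 0.192 | 0.041 | 0.005 | 44.743 | 59.648 | 54.638 | 46.497 |
| 0.55 | 0.70 | 0.75 | 0.80 | C3 | D-DPSO | 12/17/18/22 | 47/74 | 26/39 | 24/32 | 0.950 | 0.197 | 0.054 | 0.014 | 44.314 | 63.001 | 60.007 | 52.356 |
| 0.55 | 0.70 | 0.75 | 0.80 | C4 | G-DPSO | 17/22/23/32 | 47/73 | 32/48 | 24/35 | 0.950 | 0.195 | 0.041 | 0.005 | 51.994 | 54.810 | 46.623 | 39.493 |
| 0.55 | 0.70 | 0.75 | 0.80 | C4 | D-DPSO | 7/12/14/16 | 45/70 | 26/39 | 19/31 | 0.951 | 0.200 | 0.044 | 0.006 | 55.302 | 60.783 | 56.534 | 50.252 |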

The second approach, abbreviated as D-DPSO, is used when the number of target responses is two or more. D-DPSO finds the initial set of values using the optimal set of values obtained for the case with one fewer target response. In other words, when the number of target responses is k, the initial value is decided using the sample sizes determined for the case with k − 1 target responses, where k = 2, 3. By doing so, the computation time of the second approach does not depend on the number of target responses. The specific details for implementing D-DPSO to search for any one of the four types of optimal design when there are two and three target responses are as follows:

The number of target responses is two

We first obtain two sets of sample sizes, $E_1=\{s_{11},n_{11},s_1,n_1\}$ and $E_2=\{s_{12},n_{12},s_2,n_2\}$, from the target responses p1 and p2, respectively, based on the G-DPSO method when there is one target response. Then, using the two sets of sample sizes $E_1$ and $E_2$, the initial value is set to $\theta_2=(s_1,r_1,m_1,s,m,r,n)=(\mathrm{rank}_3\{s_{11},n_{11},s_{12},n_{12}\},s_1,n_1,s_2,n_2)$, where $\mathrm{rank}_k S$ is the top k elements of the set S arranged in ascending order. The first three values $(s_1,r_1,m_1)$ correspond to the null response p0 (i.e., the first stage), so the sample sizes $(s_{11},n_{11},s_{12},n_{12})$, which correspond to the first stage of $E_1$ and $E_2$, are used. Since $(s,m)$ and $(r,n)$ correspond to the target values p1 and p2 (i.e., the second stage), respectively, the values $(s_1,n_1)$ and $(s_2,n_2)$, which correspond to the second stage of $E_1$ and $E_2$, are used.

The number of target responses is three

We first obtain two sets of sample sizes, $E_1=\{s_{11},r_{11},m_{11},s_1,m_1,r_1,n_1\}$ and $E_2=\{s_{12},n_{12},s_2,n_2\}$, using the D-DPSO with the target responses (p1, p2) and the G-DPSO with the target response p3, respectively. Then, the initial value is set to $\theta_3=(s_1,r_1,q_1,n_1,s,l,r,m,q,n)=(\mathrm{rank}_4\{s_{11},r_{11},m_{11},s_{12},n_{12}\},s_1,m_1,r_1,n_1,s_2,n_2)$. The first four values $(s_1,r_1,q_1,n_1)$ correspond to the null response p0 (i.e., the first stage), so the sample sizes $(s_{11},r_{11},m_{11},s_{12},n_{12})$, which correspond to the first stage of $E_1$ and $E_2$, are used. Since $(s,l,r,m)$ and $(q,n)$ correspond to the target values (p1, p2) and p3 (i.e., the second stage), respectively, the values $(s_1,m_1,r_1,n_1)$ and $(s_2,n_2)$, which correspond to the second stage of $E_1$ and $E_2$, are used.
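The construction of the D-DPSO starting values for two and three target responses can be sketched as follows. This is a minimal illustration, not the authors' R code: the function names are hypothetical, the numeric inputs are made up, and the rank operator is read here as "sort in ascending order and keep the first k elements", which is one possible reading of the definition above.

```python
def rank_k(values, k):
    """Assumed reading of rank_k: sort ascending and keep the first k elements."""
    return sorted(values)[:k]


def init_two_targets(e1, e2):
    """Starting value theta_2 built from two one-target designs E1 and E2.

    e1 = (s11, n11, s1, n1) and e2 = (s12, n12, s2, n2):
    first-stage values followed by second-stage values.
    """
    s11, n11, s1, n1 = e1
    s12, n12, s2, n2 = e2
    first_stage = rank_k([s11, n11, s12, n12], 3)
    return tuple(first_stage) + (s1, n1, s2, n2)


def init_three_targets(e1, e2):
    """Starting value theta_3 built from a two-target design E1 and a one-target design E2.

    e1 = (s11, r11, m11, s1, m1, r1, n1) and e2 = (s12, n12, s2, n2).
    """
    s11, r11, m11, s1, m1, r1, n1 = e1
    s12, n12, s2, n2 = e2
    first_stage = rank_k([s11, r11, m11, s12, n12], 4)
    return tuple(first_stage) + (s1, m1, r1, n1, s2, n2)


# Hypothetical one-target designs, for illustration only.
theta2 = init_two_targets((2, 10, 5, 30), (3, 12, 8, 25))
theta3 = init_three_targets(theta2, (4, 11, 9, 20))
print(theta2)
print(theta3)
```

Because the starting value for k target responses reuses designs already found for k − 1 responses, the cost of building it does not grow with the size of the search space.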

Simulation studies

We use simulation studies to evaluate the performance of extended two-stage adaptive designs. We also generate Simon’s two-stage design and Lin and Shih (2004)’s adaptive two-stage design using DPSO and compare results when the number of target responses is one and two, respectively. We refer to Simon’s two-stage design as an adaptive two-stage design with one target response, Lin and Shih (2004)’s adaptive two-stage design as an adaptive two-stage design with two target responses, and our extended two-stage adaptive design as an adaptive two-stage design with three target responses.

Our simulation covers one set of Type I (α) and Type II (β) error rates with (α, β1, β2, β3) = (0.05, 0.20, 0.10, 0.05) and three sets of target response rates, which are (p0, p1, p2, p3) ∈ {(0.05, 0.20, 0.25, 0.30), (0.20, 0.35, 0.40, 0.45), (0.55, 0.70, 0.75, 0.80)}. Note that we empirically selected these combinations of null and target response rates to reflect the general performance of each method. The optimal design under each of the four optimality criteria C1–C4 was found using two optimization methods: a greedy search (GS) and discrete PSO (DPSO). GS enumerates all feasible cases to find the optimal solution, while DPSO stochastically searches for the optimal solution using Equations (1) and (2). The number of particles used to optimize each design parameter was 10 and the population size was set to 10000 in our DPSO. The total number of iterations allowed was 50. The two methods described in the previous section were applied to obtain initial values for G-DPSO and D-DPSO.
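To make the cost of an exhaustive search concrete, the following sketch enumerates every candidate design for the simplest case, Simon’s two-stage design with one target response, and keeps the one with the smallest expected sample size under the null. It is an illustration only and not the authors' implementation: the function names are hypothetical, it uses Simon’s convention (stop for futility if at most r1 of n1 respond, reject the null if more than r of n respond), which may differ by one from the thresholds tabulated in this paper, and the rates p0 = 0.05 and p1 = 0.25 and the cap n_max = 17 are chosen only to keep the enumeration small.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); 0 when k < 0."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(min(k, n) + 1))


def reject_prob(r1, n1, r, n, p):
    """P(reject H0) = P(X1 > r1 and X1 + X2 > r) at response rate p."""
    n2 = n - n1
    return sum(
        binom_pmf(x1, n1, p) * (1.0 - binom_cdf(r - x1, n2, p))
        for x1 in range(r1 + 1, n1 + 1)
    )


def expected_n(r1, n1, n, p):
    """Expected sample size: n1 plus n - n1 if the trial continues."""
    prob_early_stop = binom_cdf(r1, n1, p)
    return n1 + (1.0 - prob_early_stop) * (n - n1)


def exhaustive_search(p0, p1, alpha, beta, n_max):
    """Enumerate all (r1, n1, r, n) with n <= n_max; minimize E(N | p0)."""
    best = None
    for n in range(2, n_max + 1):
        for n1 in range(1, n):
            for r1 in range(n1):
                for r in range(r1, n):
                    if reject_prob(r1, n1, r, n, p0) > alpha:
                        continue  # Type I error too large; a larger r may fix it
                    if reject_prob(r1, n1, r, n, p1) < 1 - beta:
                        break  # power only decreases as r grows
                    en0 = expected_n(r1, n1, n, p0)
                    if best is None or en0 < best[0]:
                        best = (en0, r1, n1, r, n)
    return best


best = exhaustive_search(p0=0.05, p1=0.25, alpha=0.05, beta=0.20, n_max=17)
print(best)
```

Even in this one-target case the search has four nested loops. Each additional target response adds further thresholds and sample sizes to enumerate, which is why the search space grows so quickly with the number of targets.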

Tables 1–3 display optimal designs found by the GS, G-DPSO and D-DPSO methods when there are one, two, and three target responses, respectively. The computation times in minutes, which indicate the computational burden of the different algorithms, are available in the Supplementary Information Tables S1–S3. Note that the computation times given in Supplementary Information Tables S1 to S3 might not accurately reflect the true computational complexity of each method; we display them to give some insight into how efficient each method is. All simulations were run for up to 30 days using a high performance Grid enabled computing system at Wayne State University. The Grid currently has the combined processing power of 4,420 cores: 1,200 Intel cores, 3,220 AMD cores, with over 12TB of RAM and over half a petabyte of disk space.

We make two remarks before we present our results. First, to ensure a fair comparison, we implemented the GS, G-DPSO and D-DPSO algorithms in the statistical software R. Some of our G-DPSO results differ slightly from published results found by the GS method. This is not surprising: the G-DPSO algorithm is stochastic, and the results from G- or D-DPSO depend on the convergence criterion and the tuning parameters, which for simplicity we set a priori to be the same for all cases. Consequently, optimal designs found by the G- and D-DPSO algorithms may vary from run to run. For instance, when p0 = 0.55 and p1 = 0.70, our simulation settings for PSO produced different results for criteria C2 and C4 than those reported in Simon’s paper, which were found by a greedy search. Some such discrepancies are indicated in Table 1 by superscripts 1 and 2, but they can be resolved by using a different set of tuning parameters for the algorithm. For example, in these instances, when the number of particles was increased from 10 to 20 and the population size was increased from 10000 to 70000, our DPSO algorithm converged to the same optimal design with s1/n1 = 20/35 obtained in Simon’s paper, in a small fraction of the time required by GS. Second, our codes call IMS functions to compute Binomial probabilities each time they are needed, unlike the codes in Simon’s paper, where these probabilities were predefined upfront. Consequently, our CPU times may appear several factors longer than those reported in earlier work, but they are in fact substantially shorter than the computing time required by GS.

Adaptive two-stage design with one target response

When there is one target response rate, the adaptive two-stage design is the same as Simon’s two-stage design. Table 1 compares the results between GS and G-DPSO when p1 − p0 = 0.15. As stated before, we applied D-DPSO only when the number of target responses is either two or three. In all cases, both optimization methods give exactly the same optimal design with the same expected sample sizes. Table S1 in the Supplementary Information shows G-DPSO can be up to 100 times faster than GS except when p0 = 0.05. One reason that G-DPSO is slower than GS in that case is that the solution domain is small when p0 = 0.05, and the computational time of GS can be highly dependent on the size of the solution domain. As such, the computational time required by GS can be highly variable. In contrast, the time required by G-DPSO to find the optimum does not depend on the size of the solution set once the initial values are obtained. Consequently, we observe a consistent range of computational times for DPSO across all cases. The table also shows that the sample size requirements for optimality criteria C2–C4 are smaller than those required by criterion C1.

Adaptive two-stage design with two target responses

We now apply GS and DPSO (G-DPSO and D-DPSO) to generate Lin and Shih (2004)’s adaptive two-stage designs, which are displayed in Table 2. Each optimization was allowed to run for up to 30 days. GS reports the optimal designs only when p0 = 0.05 because the size of the solution area increases exponentially with the number of target responses when p0 is close to 0.5. As a result, GS could not reach a solution in the other cases. Interestingly, as shown in Supplementary Information Table S2, the computation time of GS when p0 = 0.05 is at least 68 times longer than those of both G-DPSO and D-DPSO, even though the sample sizes it determines are comparable with or slightly smaller than those of G-DPSO and D-DPSO. As expected, the required sample sizes of G-DPSO and D-DPSO are similar, but their computation times differ because the former depends on the size of the solution set and the latter does not. For example, when (p0, p1, p2) = (0.55, 0.70, 0.75), G-DPSO takes 23 times longer than D-DPSO to determine the optimal design, but when the solution set is small, as when (p0, p1, p2) = (0.05, 0.20, 0.25), the computation time required by G-DPSO is shorter than that required by D-DPSO, which consistently ranges from 5.07 to 6.10 minutes and is independent of the size of the solution set.

Adaptive two-stage design with three target responses

Table 3 displays the extended 2-stage adaptive designs with three target response rates. Within the 30-day run limit, GS always failed to determine the optimal design, but both G-DPSO and D-DPSO found optimal designs with similar results. Interestingly, when (p0, p1, p2, p3) = (0.55, 0.70, 0.75, 0.80), the computation time of G-DPSO is almost 30 days, but that of D-DPSO is about 6.5 minutes, demonstrating that D-DPSO is much more time efficient than G-DPSO even in complicated, computationally burdensome situations; see Table S3 in the Supplementary Information.

At the request of an anonymous reviewer, we also performed another simulation with α = 0.01. Tables S4 to S6 in the Supplementary Information display the results. Because of the long computation time, we included only the results from D-DPSO when there are three target responses. The overall performance of each method with α = 0.01 is close to that with α = 0.05.

Figure 2 compares expected sample size requirements from the 3 types of 2-stage designs discussed here for criteria C1–C4. We considered 3 testing scenarios with various hypothesized efficacy rates for the drug in the designs and investigated which of the 3 types of 2-stage designs yields the smallest sample size. Interestingly, we observe that among the 3 types of adaptive designs, there is no clear winner that consistently requires the smallest expected sample sizes. We further compared expected sample size requirements under various alternative hypotheses, shown in Figures S1 to S3 in the Supplementary Information. We observe that, as under the null hypothesis, the adaptive designs with two or three targets are comparable, but the 2-stage design with one target requires the largest sample sizes. Our proposed extended 2-stage adaptive designs are very flexible and yet, surprisingly, do not necessarily require larger expected sample sizes than the other two types of adaptive designs; sometimes they require smaller expected sample sizes, especially for criterion C3. Even when our proposed designs have larger expected sample size requirements, the additional numbers may not be large and seem manageable. The comparison is further complicated by the fact that the expected sample sizes depend crucially on the various hypothesized rates in a complicated manner. In summary, our proposed designs offer practitioners a good alternative, especially when it is difficult to ascertain the drug efficacy rate at the outset.

Figure 2.

Expected sample sizes under the null hypothesis for the 4 criteria C1–C4 with 1, 2 or 3 target alternatives estimated by D-DPSO for 3 scenarios (from left to right): (i) p0 = 0.05, p1 = 0.20, p2 = 0.25, p3 = 0.30, (ii) p0 = 0.20, p1 = 0.35, p2 = 0.40, p3 = 0.45, and (iii) p0 = 0.55, p1 = 0.70, p2 = 0.75, p3 = 0.80. Error rates were set at α = 0.05, β1 = 0.20, β2 = 0.10 and β3 = 0.05.

Applications

Obstructive sleep apnea (OSA) study

Our clients did not allow us to discuss the entire nature of their study and so we minimally describe the study design. This study aims to assess the effect of HNC on the incidence of OSA compared to healthy patients. The literature suggests the maximum incidence rate of snoring and sleep apnea in healthy patients is 16.5%, resulting in the null hypothesis of 16.5% (i.e., p0 = 0.165). There was neither historical nor preliminary data available, except that the incidence rate of OSA is expected to be higher in HNC patients. Our clients provided an empirical range of the target response rates, which is from 24.38% to 39.00%. If Simon’s two-stage design were used for sample size calculation with 80% power and a 5% significance level, the required sample sizes would range from 30 to 197, using the optimal rather than the minimax criterion. This shows how strongly the great uncertainty about the OSA incidence rate in HNC patients affects the sample size requirements, so there is a high risk of under- or over-powering the study if one mis-specifies the response rate. Because the range of target response rates is so wide, Lin and Shih’s approach, with only two target rates, will not be able to cover this uncertainty. For our application, one may use the extreme limits of the range for HNC patients as two target rates in the alternative hypotheses, with the third targeting the midpoint between the extreme limits of the presumed range. Table 4 shows the D-DPSO generated sample size requirements for this scenario with three target responses (p1 = 24.38%, p2 = 31.69%, and p3 = 39.00%) and the error rate specifications β1 = 0.20, β2 = 0.15, and β3 = 0.10.
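As a quick check of the three target rates used for the OSA study, the middle target is the midpoint of the range supplied by the clients, written here with the rates as proportions:

```latex
p_2 = \frac{p_1 + p_3}{2} = \frac{0.2438 + 0.3900}{2} = 0.3169
```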

Table 4.

Sample sizes from D-DPSO for the 4 optimality criteria for OSA, Lin and Shih’s VBG, and the Phase II BREAK-2 studies.

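| Study | Optimality criteria | s1/r1/q1/n1 | s/l | r/m | q/n | E(N\|p0) | E(N\|p1) | E(N\|p2) | E(N\|p3) |
|---|---|---|---|---|---|---|---|---|---|
| OSA | C1 | 2/8/9/21 | 39/188 | 13/55 | 10/39 | 136.92 | 167.28 | 157.91 | 124.73 |
| OSA | C2 and C4 | 1/11/12/25 | 34/161 | 20/72 | 13/36 | 152.07 | 158.99 | 154.07 | 135.42 |
| OSA | C3 | 1/8/9/17 | 37/177 | 12/58 | 10/35 | 144.40 | 166.73 | 167.70 | 153.87 |
| Lin and Shih’s VBG | C1 | 20/29/35/49 | 83/183 | 33/75 | 39/77 | 101.45 | 157.83 | 153.53 | 129.21 |
| Lin and Shih’s VBG | C2 | 20/28/29/54 | 81/175 | 42/87 | 31/65 | 125.76 | 134.87 | 107.31 | 81.83 |
| Lin and Shih’s VBG | C3 | 20/31/33/51 | 77/168 | 40/76 | 37/68 | 107.64 | 154.27 | 150.26 | 129.11 |
| Lin and Shih’s VBG | C4 | 19/28/31/54 | 79/170 | 39/83 | 34/66 | 134.72 | 136.06 | 109.57 | 84.89 |
| BREAK-2 | C1 | 4/10/11/19 | 26/80 | 14/44 | 12/34 | 51.522 | 72.216 | 65.961 | 58.879 |
| BREAK-2 | C2 and C4 | 4/9/12/22 | 27/83 | 16/43 | 13/31 | 62.079 | 65.691 | 50.196 | 43.063 |
| BREAK-2 | C3 | 2/7/9/13 | 25/77 | 15/44 | 14/39 | 55.526 | 70.032 | 66.463 | 62.186 |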

Lin and Shih’s VBG study

This study is adapted from Lin and Shih’s work and investigates the efficacy of the combination therapy of vinorelbine, bleomycin, and gemcitabine (VBG) in patients with recurrent or refractory Hodgkin’s disease (HD). The null hypothesis was set at 40% based on the single agents’ response rate, but the target response rates range from 0.5 to 0.6. Lin and Shih therefore applied their design to cover this uncertainty with two target response rates, p1 = 55% and p2 = 60%, under β1 = 0.20 and β2 = 0.15 at a 5% significance level. Using our proposed design, we consider three target responses, p1 = 50%, p2 = 55%, and p3 = 60%, with β1 = 20%, β2 = 15%, and β3 = 5%; the resulting optimal designs are given in Table 4. The stage 1 sample size required by D-DPSO (49 to 54) is a little larger than that of Lin and Shih’s design (39), because the smallest target rate in D-DPSO (p1 = 0.50) is smaller than that in Lin and Shih’s design (p1 = 0.55). However, the required sample sizes of both approaches in stage 2 are comparable.

Phase II BREAK-2 study

We now use a real trial from the literature and discuss advantages of our proposed extended two-stage adaptive designs. In particular, we show our proposed optimal design can reduce the sample size used in the real trial by one-half if it were implemented.

A multicenter, international, single-arm, phase II study (BREAK-2) was carried out to assess the overall response rate of dabrafenib in patients with BRAF V600E mutation-positive metastatic melanoma (Ascierto et al. (2013)). The null hypothesis was set at p0 = 0.25 and the alternative hypothesis was set at p1 = 0.40. The trial aimed to recruit at least 85 patients, and the plan was to declare the treatment a success if at least 29 patients responded. The Green-Dahlberg 2-stage design (Green and Dahlberg (1992)) was employed, with an interim analysis planned after the first 30 patients with BRAF V600E mutation-positive metastatic melanoma were enrolled. At least 7 patients needed to respond at the interim analysis for the study to continue, with Type I and II error rates set at 0.05 and 0.10, respectively. The BREAK-2 study is ongoing, but not recruiting participants, as of December 14, 2015 (http://www.clinicaltrials.gov, NCT01153763). Green and Dahlberg’s two-stage design (also known as the Southwest Oncology Group [SWOG] two-stage design) is similar to Simon’s two-stage design but uses simple, fixed significance levels to estimate the sample sizes for each stage, 0.02 in stage 1 and 0.055 in stage 2, which makes it less sensitive (or more robust) to discrepancies between the actual and planned sample sizes. For example, as stated in Green and Dahlberg (1992), the number of patients accrued is not easy to control in multicenter studies. In addition, it often happens in multicenter studies that some patients who entered the study are found to be ineligible after accrual is suspended. To address these potential issues, BREAK-2 used Green and Dahlberg’s approach and, in fact, ended up with a total of 76 patients, which differs from the planned sample size of 85.
Under the design parameters, using Green and Dahlberg’s approach and notation, the sample sizes for each stage are n1 = 45 and n2 = 40 (i.e., n = 85). If Simon’s two-stage design is used, the sample sizes are n1 = 57 and n2 = 13 (i.e., n = 83) with the minimax constraint and n1 = 37 and n2 = 10 (i.e., n = 99) with the optimal constraint.

The efficacy results show that 76 patients with BRAF V600E mutation-positive metastatic melanoma were enrolled and 45 patients (59%) had a confirmed response. Although its parent phase I study (Falchook et al. (2012); http://www.clinicaltrials.gov, NCT00880321) showed that the same type of patients had a response rate of 50%, this phase II study chose a lower response rate of 40% as the alternative hypothesis. However, given the phase I result and the final phase II response rate of 59%, it would have been beneficial to explore higher response rates in addition to 40%. To examine the benefit of applying the adaptive 2-stage design, we determined the optimal design of this phase II study with p0 = 0.25, p1 = 0.40, p2 = 0.50, and p3 = 0.55 and with α = 0.05, β1 = 0.15, β2 = 0.10, and β3 = 0.05 using D-DPSO.

Table 4 shows the two-stage adaptive optimal designs found by D-DPSO, using the same DPSO parameter settings as in the simulation studies. Compared with the original BREAK-2 study (i.e., when p1 = 0.4), the sample sizes of the adaptive design (s/l in Table 4) are comparable to but smaller than that of BREAK-2 (29/85), and the stage 1 sample size of BREAK-2 (7/30) is larger than those of the adaptive design (s1/n1 in Table 4), although the Type II error of the adaptive design is larger than that of BREAK-2 (0.15 vs. 0.10). Given a response rate of 59% (the final result of BREAK-2), the probability of not terminating the trial at the end of the first stage exceeds 99% for all three designs in the table. The probabilities of choosing the sample sizes corresponding to either p2 = 0.5 or p3 = 0.55 in the second stage are 79%, 97% and 75% for C1, C2 and C4, and C3, respectively, implying that only half of the planned number of patients would be required for the BREAK-2 study if the adaptive design were used. Furthermore, the probabilities of choosing the sample sizes corresponding to p3 = 0.55 in the second stage are 63%, 74% and 33% for C1, C2 and C4, and C3, respectively.
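The stage 1 probabilities quoted for BREAK-2 can be reproduced with exact binomial tail sums. The sketch below is an illustration with hypothetical function names. It assumes that the trial continues when at least s1 of the n1 first-stage patients respond, that the design for p2 or p3 is chosen when at least r1 respond, and that the design for p3 is chosen when at least q1 respond. This reading of the thresholds is inferred from the reported percentages rather than restated from the design definition.

```python
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(max(k, 0), n + 1)
    )


def stage1_probabilities(s1, r1, q1, n1, p):
    """Stage 1 outcome probabilities under the assumed 'at least' thresholds."""
    return {
        "continue": upper_tail(s1, n1, p),  # trial is not stopped for futility
        "p2_or_p3": upper_tail(r1, n1, p),  # stage 2 uses the design for p2 or p3
        "p3": upper_tail(q1, n1, p),        # stage 2 uses the design for p3
    }


# First-stage parameters s1/r1/q1/n1 of the BREAK-2 rows of Table 4.
designs = {
    "C1": (4, 10, 11, 19),
    "C2 and C4": (4, 9, 12, 22),
    "C3": (2, 7, 9, 13),
}

# 59% is the confirmed response rate observed in BREAK-2.
results = {name: stage1_probabilities(*d, p=0.59) for name, d in designs.items()}
for name, probs in results.items():
    print(name, {key: round(value, 3) for key, value in probs.items()})
```

Under these assumptions the rounded tail probabilities agree with the percentages reported in the text for all three designs.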

Conclusions

We proposed a novel and effective nature-inspired stochastic population-based algorithm called discrete particle swarm optimization (DPSO) to find extended two-stage adaptive designs. This design terminates the trial if there are too few responders in the first stage and tests 1 of 3 preselected hypothesized response rates of a drug at the second stage, where the choice depends on the number of responders in the first stage. The design controls the user-selected Type I error rate in the first stage and both Type I and II error rates at the second stage. Our results show that algorithms based on a greedy search invariably failed to find extended two-stage adaptive designs, whereas an improved version of DPSO, called D-DPSO, finds the optimum. When the problem is simplified to one or two target response rates, we also showed that D-DPSO outperformed its peers by a wide margin. Additionally, we employed simulation studies and showed that all D-DPSO generated designs were able to meet pre-specified Type I and II error requirements, and we demonstrated how such designs could reap benefits if they were implemented in a real melanoma adaptive trial.

Supplementary Material

Supplementary Information

Acknowledgments

The research reported in this paper was partially supported by two grants: NSF DMS-1312603 for SK and NIH R01GM107639 for WKW. The Biostatistics Core in the Karmanos Cancer Institute at Wayne State University is supported in part by the NIH Cancer Center Support Grant P30 CA022453. We are grateful to Drs. Ammar Sukari and Misako Nagasaka for allowing us to use their OSA study as a real example. The contents in this paper are solely the responsibility of the authors and do not necessarily represent the official views of NIH and NSF.

References

  1. Ascierto PA, Minor D, Ribas A, et al. Phase II trial (BREAK-2) of the BRAF inhibitor Dabrafenib (GSK2118436) in patients with metastatic melanoma. Journal of Clinical Oncology. 2013;31:3205–3211. doi: 10.1200/JCO.2013.49.8691.
  2. Brown SR, Gregory WM, Twelves CJ, Buyse M, Collinson F, Parmar M, Seymour MT, Brown JM. Designing phase II trials in cancer: a systematic review and guidance. British Journal of Cancer. 2011;105:194–199. doi: 10.1038/bjc.2011.235.
  3. Chen K, Shan M. Optimal and minimax three-stage designs for phase II oncology clinical trials. Contemp Clin Trials. 2007;28:32–41. doi: 10.1016/j.cct.2007.04.008.
  4. Chen RB, Chang SP, Wang W, Tung HC, Wong WK. Minimax optimal designs via particle swarm optimization methods. Statistics and Computing. 2014;24:1063–1080.
  5. Chen RB, Wang W, Chang SP, Wong WK. Optimal designs for mixture models using particle swarm optimization methods. PlosOne. 2015;10(6):e0124720. doi: 10.1371/journal.pone.0124720.
  6. Chen TT. Optimal three-stage designs for phase II cancer clinical trials. Stat Med. 1997;16:1–11. doi: 10.1002/(sici)1097-0258(19971215)16:23<2701::aid-sim704>3.0.co;2-1.
  7. Chow SC, Chang M. Adaptive design methods in clinical trials. 2nd. New York, NY: Chapman & Hall/CRC Press; 2011.
  8. Ensign LG, Gehan EA, Kamen DS, Thall P. An optimal three-stage design for phase II clinical trials. Stat Med. 1994;13:1727–1736. doi: 10.1002/sim.4780131704.
  9. Falchook GS, Long GV, Kurzrock R, Kim KB, Arkenau TH, Brown MP, Hamid O, Infante JR, Millward M, Pavilck AC, O’Day SJ, Blackman SC, Curtis CM, Lebowitz P, Ma B, Ouellet D, Defford RF. Dabrafenib in patients with melanoma, untreated brain metastases, and other solid tumors: A phase 1 dose-escalation trial. Lancet. 2012;379:1893–1901. doi: 10.1016/S0140-6736(12)60398-5.
  10. Green SJ, Dahlberg S. Planned versus attained design in phase II clinical trials. Stat Med. 1992;11:853–862. doi: 10.1002/sim.4780110703.
  11. Kennedy J, Eberhart R. Particle Swarm Optimization. Proceedings of IEEE International Conference on Neural Networks. 1995;IV:1942–1948.
  12. Kim S, Li L. A novel global search algorithm for nonlinear mixed-effects models using particle swarm Optimization. J Pharmacokinetics and Pharmacodynamics. 2011;38:471–495. doi: 10.1007/s10928-011-9204-6.
  13. Kim S, Li L. Statistical identifiability and convergence evaluation for nonlinear pharmacokinetic models with particle swarm optimization. Comput Methods Programs Biomed. 2014;113:413–432. doi: 10.1016/j.cmpb.2013.10.003.
  14. Kim S, Erwin D, Wu D. Efficacy of dual lung cancer screening by chest x-ray and sputum cytology using Johns Hopkins Lung Project Data. Biometrics and Biostatistics. 2012;3:1000139.
  15. Kramar A, Potvin D, Hill C. Multistage designs for phase II clinical trials: statistical issues in cancer research. British J of Cancer. 1996;74:1317–1320. doi: 10.1038/bjc.1996.537.
  16. Kunz C, Kieser M. Optimal 2-stage designs for single-arm phase II oncology trials with two binary endpoints. Methods of Information in Medicine. 2011;50:372–377. doi: 10.3414/ME10-01-0037.
  17. Kwak M, Jung SH. Phase II clinical trials with time-to-event endpoints: optimal two-stage designs with one-sample log-rank test. Statist Med. 2014;33:2004–2016. doi: 10.1002/sim.6073.
  18. Lin Y, Shih WJ. Adaptive 2-stage designs for single-arm phase IIA cancer clinical trials. Biometrics. 2004;60:482–490. doi: 10.1111/j.0006-341X.2004.00193.x.
  19. Mander A, Thompson S. 2-Stage designs optimal under the alternative hypothesis for phase II cancer clinical trials. Contemporary Clinical Trials. 2010;31:572–578. doi: 10.1016/j.cct.2010.07.008.
  20. Mander AP, Watson JMS, Sweetig MJ, Thompson SG. Admissible 2-stage designs for phase II cancer clinical trials that incorporate the expected sample size under the alternative hypothesis. Pharmaceutical Statistics. 2012;11:91–96. doi: 10.1002/pst.501.
  21. Phoa KHF, Chen RB, Wang WC, Wong WK. Optimizing two-level supersaturated designs using swarm intelligence Techniques. Technometrics. 2016;58:43–49. doi: 10.1080/00401706.2014.981346.
  22. Qiu JH, Chen RB, Wang WC, Wong WK. Using animal instincts to design efficient biomedical studies. Swarm and Evolutionary Computation. 2014;18:1–10. doi: 10.1016/j.swevo.2014.06.003.
  23. Schlesselman JJ, Reis IM. Phase II clinical trials in oncology: strengths and limitations of 2-stage designs. Cancer Invest. 2006;24:404–412. doi: 10.1080/07357900600705516.
  24. Shi YH, Eberhart RC. A modified particle swarm optimizer. IEEE International Conference on Evolutionary Computation; May 4–9, 1998; Anchorage, Alaska.
  25. Simon R. Optimal 2-stage designs for phase 2 clinical trials. Controlled Clinical Trials. 1989;10:1–10. doi: 10.1016/0197-2456(89)90015-9.
  26. Tao Z, Cai JD. A new chaotic PSO with dynamic inertia weight for economic dispatch problem. 2009 International Conference on Sustainable Power Generation and Supply, Nanjing. 2009:1–6. doi: 10.1109/SUPERGEN.2009.5347916.
  27. Wason JMS, Mander AP, Eisen TG. Reducing sample sizes in 2-stage phase II cancer trials by using continuous tumor shrinkage end-points. European Journal of Cancer. 2011;47:983–989. doi: 10.1016/j.ejca.2010.12.007.
  28. Whitacre JM. Recent trends indicate rapid growth of nature-inspired optimization in academia and industry. Computing. 2011a;93:121–133.
  29. Whitacre JM. Survival of the flexible: explaining the recent popularity of nature-inspired optimization within a rapidly evolving world. Computing. 2011b;93:135–146.
  30. Wong WK, Chen RB, Huang CC, Wang W. A modified particle swarm optimization technique for finding optimal designs for mixture models. PLoS ONE. 2015;10:e0124720. doi: 10.1371/journal.pone.0124720.
  31. Yang M, Biedermann S, Tang E. On optimal designs for nonlinear models: a general and efficient algorithm. Journal of the American Statistical Association. 2013;504:1411–1420.
  32. Zhou Y. Choice of designs and doses for early phase trials. Fundam Clin Pharmacol. 2004;18:373–8. doi: 10.1111/j.1472-8206.2004.00226.x.
