Abstract
We design two‐stage confirmatory clinical trials that use adaptation to find the subgroup of patients who will benefit from a new treatment, testing for a treatment effect in each of two disjoint subgroups. Our proposal allows aspects of the trial, such as recruitment probabilities of each group, to be altered at an interim analysis. We use the conditional error rate approach to implement these adaptations with protection of overall error rates. Applying a Bayesian decision‐theoretic framework, we optimize design parameters by maximizing a utility function that takes the population prevalence of the subgroups into account. We show results for traditional trials with familywise error rate control (using a closed testing procedure) as well as for umbrella trials in which only the per‐comparison type 1 error rate is controlled. We present numerical examples to illustrate the optimization process and the effectiveness of the proposed designs.
Keywords: Bayesian optimization, conditional error function, subgroup analysis, utility function
1. INTRODUCTION
It is increasingly common to integrate subgroup identification and confirmation into a clinical development program. Biomarker‐guided clinical trial designs have been proposed to close the gap between the exploration and confirmation of subgroup treatment effects. Numerous statistical considerations (eg, multiplicity issues, consistency of treatment effects, trial design) need to be taken into account to ensure a proper interpretation of study findings, as outlined in recent reviews. 1 , 2 , 3 , 4
Several study designs are available for the investigation of subgroups in clinical trials. These include all‐comers designs, where biomarker status or subgroup are not considered for enrolment but only in the trial analysis, and stratified designs, where the trial prevalences for each subgroup, that is, the proportion of patients recruited from each subgroup, are chosen initially and maintained throughout the trial. 5 , 6 Adaptive enrichment designs have been proposed to increase the efficiency of these trials. 7 , 8 , 9 , 10 , 11 These designs allow subgroups to be dropped for futility at interim analyses, with the rest of the trial being conducted with subjects from the remaining groups only. The U.S. Food and Drug Administration guidance on adaptive designs highlights the use of adaptive enrichment designs as a means to increase the chance of detecting a true drug effect relative to a fixed sample design. 12
Master protocols provide an infrastructure for efficient study of newly developed compounds or biomarker‐defined subgroups. 13 , 14 Such studies simultaneously evaluate more than one investigational drug or more than one disease type within the same overall trial structure. 15 , 16 , 17 An umbrella trial is a particular type of master protocol in which enrolment is restricted to a single disease but the patients are screened and assigned to molecularly defined subtrials. Each subtrial may have different objectives, endpoints or design characteristics. An example of an umbrella trial is the ALCHEMIST trial, in which patients with non‐small cell lung cancer are screened for EGFR mutation or ALK rearrangement and assigned accordingly to subtrials with different treatments. 18
In this paper, we study confirmatory trials that allow the investigation of the treatment effect in prespecified nonoverlapping subgroups. In particular, we focus on adaptive clinical trials that allow the modification of design elements without compromising the integrity of the trial. 19 We propose a class of adaptive enrichment designs that use a Bayesian decision framework to optimize the design parameters, such as the trial prevalences of the subgroups, the weights for multiple hypotheses testing, and adaptation rules. A similar framework has been used in References 20, 21, 22, 23, 24, 25, 26, 27 for adaptive enrichment trials.
We consider two types of problem. In the first case, we study designs that preserve the familywise error rate (FWER) of the trial using a closed testing procedure to test the null hypotheses of no treatment effect in the two subgroups. This is what is typically required in adaptive enrichment trials where a single treatment is evaluated against a control. In the second case, we show results for umbrella trial designs without multiplicity adjustment. Here, we consider studies made up of separate simultaneous trials, for which it has been argued that no control of multiplicity is needed. 28 Our work, therefore, provides an overarching framework for both adaptive enrichment designs and umbrella trials.
The manuscript is organized as follows: In Section 2, we introduce the designs and distinguish between single‐stage designs (Section 2.2) and two‐stage designs (Section 2.3), and in Section 2.4 we discuss how to adapt our proposed designs to umbrella trials. In Sections 3 and 4 we present numerical examples. We describe how our methods may be extended to designs with more than two stages in Section 5 and we end with conclusions and a discussion in Section 6.
2. BAYES OPTIMAL DESIGNS
2.1. The class of trial designs
Consider a confirmatory parallel‐group clinical trial comparing a new treatment and a control with respect to a pre‐defined primary endpoint. We assume the patient population may be divided into disjoint, biomarker‐defined subgroups. Given a maximum achievable sample size, n, we aim to optimize the trial design by maximising a specific utility function.
Suppose two biomarker‐defined subgroups have been identified before commencing the trial. Let $\lambda_1$ be the prevalence of the first subgroup in the underlying patient population and $\lambda_2 = 1 - \lambda_1$ the prevalence of the second subgroup. Let $\theta_1$ and $\theta_2$ be the treatment effects, denoting the difference in the mean outcome between treatment and control, in the first and second subgroups, respectively. We consider trials to investigate the null hypotheses $H_{01}\colon \theta_1 \le 0$ and $H_{02}\colon \theta_2 \le 0$ with corresponding alternative hypotheses $H_{11}\colon \theta_1 > 0$ and $H_{12}\colon \theta_2 > 0$. In Sections 2.2 and 2.3 we consider confirmatory trials in which strong control of the FWER is imposed. 29 In our discussion of umbrella trials in Section 2.4, we assume multiplicity control is not required.
We consider optimization within a class $\mathcal{C}$ of designs that have a single interim analysis at which adaptation can take place. The total sample size is fixed at $n$, with $s^{(1)} n$ patients in the first stage and $s^{(2)} n$ patients in the second stage, where $s^{(1)} > 0$, $s^{(2)} \ge 0$ and $s^{(1)} + s^{(2)} = 1$. In the first stage, $r_1^{(1)} s^{(1)} n$ patients are recruited from subgroup 1 and $r_2^{(1)} s^{(1)} n$ patients from subgroup 2, where $r_1^{(1)} \ge 0$, $r_2^{(1)} \ge 0$ and $r_1^{(1)} + r_2^{(1)} = 1$. In the second stage, $r_1^{(2)} s^{(2)} n$ patients are recruited from subgroup 1 and $r_2^{(2)} s^{(2)} n$ patients from subgroup 2, where $r_1^{(2)} \ge 0$, $r_2^{(2)} \ge 0$ and $r_1^{(2)} + r_2^{(2)} = 1$, and the values of $r_1^{(2)}$ and $r_2^{(2)}$ may depend on the first stage data. Within each stage and subgroup, we assume equal allocation to the two treatment arms (this assumption is not strictly necessary and could be relaxed). Figure 1 gives a schematic representation of the trial design.
The definition of a particular design in $\mathcal{C}$ is completed by specifying the multiple testing procedure to be used and the method for combining data across stages when adaptation occurs. We use a closed testing procedure to control FWER, applying a weighted Bonferroni procedure to test the intersection hypothesis. In this procedure, weights are initially set as $w_1^{(1)}$ and $w_2^{(1)} = 1 - w_1^{(1)}$, but these may be modified in the second stage if adaptation occurs. The error rate for each hypothesis test is controlled by preserving the conditional type I error rate when an adaptation is made. Thus, while we use a Bayesian approach to optimize the design, the trial is analyzed using frequentist procedures that control error rates at the desired level, adhering to conventional regulatory standards.
We follow a Bayesian decision theoretic approach to optimize over trial designs in the class $\mathcal{C}$. In assessing each design, we assume a prior distribution for the treatment effects in each subgroup and a utility function 30 that quantifies the value of the trial's outcome. We shall optimize designs with respect to the timing of the interim analysis, the proportion of patients recruited from the two subgroups at each stage of the trial, the weights in the weighted Bonferroni test, and the rule for updating these weights given the interim data.
We summarize the data observed during the trial by the symbol $\mathcal{D}$, noting that this summary should contain information about the numbers of observations from each subgroup and the weights to be used in the weighted Bonferroni test at each stage, as well as estimates of $\theta_1$ and $\theta_2$ obtained from observations before and after the interim analysis. We define our utility function to be

$$U(\mathcal{D}) = \lambda_1\, \mathbb{1}\{H_{01} \text{ is rejected}\} + \lambda_2\, \mathbb{1}\{H_{02} \text{ is rejected}\}, \tag{1}$$

where $\mathbb{1}\{\cdot\}$ is the indicator function. By definition, the data summary $\mathcal{D}$ contains the information needed to determine if each of the hypotheses $H_{01}$ and $H_{02}$ is rejected.
The utility (1) involves the size of the underlying subgroups as well as the rejection of the corresponding hypotheses. Thus, rejection of the null hypothesis for a larger subgroup is given greater weight. If the population prevalence of the two subgroups is not known, a prior on $\lambda_1$ may be added. We note that terms in the function (1) are positive when a null hypothesis is rejected even if the associated treatment effect is very small or negative: this issue could be addressed by multiplying each term by an indicator variable which takes the value 1 if the relevant parameter, $\theta_1$ or $\theta_2$, is larger than zero or above a clinically relevant threshold (eg, Stallard et al 31 where a similar approach is used for treatment selection).
Since the trial design is optimized with respect to the stated utility, it is important to choose a utility function that reflects accurately the relative importance of possible trial outcomes. Furthermore, the definition of utility can be adapted to reflect the interest of different stakeholders, for example, Ondra et al 21 and Graf, Posch and König 24 propose utility functions that represent the view of a sponsor or take a public health perspective.
Let $\pi(\theta)$ denote the prior distribution for $\theta = (\theta_1, \theta_2)$. Then, the Bayes expected utility for a trial design is

$$E\{U(\mathcal{D})\} = \int_{\Theta} \left\{ \int U(\mathcal{D})\, f(\mathcal{D} \mid \theta)\, d\mathcal{D} \right\} \pi(\theta)\, d\theta,$$

where we have taken the expectation over the sampling distribution $f(\mathcal{D} \mid \theta)$ of the trial data given the true treatment effects $\theta$, with an outer integral over the prior distribution $\pi(\theta)$.
When choosing the prior $\pi$, it is important to remember that $E\{U(\mathcal{D})\}$ represents the expected utility, averaged over $\pi$. If an "uninformative" prior is chosen, this will place weight on extreme scenarios, such as large negative treatment effects, which have little credibility. Thus, when considering the Bayes optimal design, it is important to use subjective, informative priors. In some cases, pilot studies or historic observational data may be available to construct the prior distribution.
In this paper, we assume the prior distribution to be bivariate normal,

$$\begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},\; \nu^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right). \tag{2}$$

Here, the correlation coefficient $\rho$ reflects the belief about the existence of common factors that contribute to the treatment effects in the two subgroups.
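For intuition, draws from the prior (2) can be generated in base R with the usual Cholesky-type construction. This is a minimal sketch; the default parameter values are taken from the numerical examples in Section 3 and are illustrative only.

```r
# Sample treatment effects (theta1, theta2) from the bivariate normal prior (2).
# mu = prior means, nu = common prior SD, rho = prior correlation.
draw_prior <- function(nsim, mu = c(0.1, 0), nu = 0.2, rho = 0.5) {
  z1 <- rnorm(nsim)
  z2 <- rnorm(nsim)
  cbind(theta1 = mu[1] + nu * z1,
        theta2 = mu[2] + nu * (rho * z1 + sqrt(1 - rho^2) * z2))
}

colMeans(draw_prior(1e5))  # should be close to mu
```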
2.2. Bayes optimal single‐stage design
2.2.1. Patient recruitment and estimation
Suppose we wish to conduct a single‐stage trial, which is the special case where $s^{(2)} = 0$, usually referred to as a stratified design. For simplicity of notation in this section, we write $r_j$ and $w_j$ rather than $r_j^{(1)}$ and $w_j^{(1)}$ for $j = 1$ and 2. We assume patients can be recruited at these rates regardless of the true proportions $\lambda_1$ and $\lambda_2$ in the underlying patient population. In addition, we assume that patients are randomised between the new treatment and the control with a 1 : 1 allocation ratio in each subgroup.
During the trial we observe a normally distributed endpoint for each patient and we assume a constant variance $\sigma^2$ for all observations. For patient $i$ from subgroup $j$ on the new treatment we observe $X_{ji}$, $i = 1, \ldots, r_j n/2$, and for patient $i$ from subgroup $j$ on the control treatment we observe $Y_{ji}$, $i = 1, \ldots, r_j n/2$, where $E(X_{ji}) - E(Y_{ji}) = \theta_j$. The estimate of the treatment effect in subgroup $j$ is

$$\hat{\theta}_j = \frac{2}{r_j n} \sum_{i=1}^{r_j n/2} X_{ji} \; - \; \frac{2}{r_j n} \sum_{i=1}^{r_j n/2} Y_{ji}. \tag{3}$$
2.2.2. Hypothesis testing in the single‐stage design
Consider the case $s^{(2)} = 0$ and $0 < r_1 < 1$. Then

$$\hat{\theta}_j \sim N\!\left(\theta_j,\; \frac{4\sigma^2}{r_j n}\right), \quad j = 1, 2,$$

and the corresponding Z‐values

$$Z_j = \frac{\hat{\theta}_j \sqrt{r_j n}}{2\sigma}, \quad j = 1, 2,$$

follow standard normal distributions under the null hypotheses $H_{01}$ and $H_{02}$.
We use a closed testing procedure to ensure strong control of the FWER at level $\alpha$. 32 To construct this, we require level $\alpha$ tests of $H_{01}\colon \theta_1 \le 0$, $H_{02}\colon \theta_2 \le 0$ and $H_{01} \cap H_{02}\colon \theta_1 \le 0$ and $\theta_2 \le 0$. We reject $H_{01}$ globally if the level $\alpha$ tests reject $H_{01}$ and $H_{01} \cap H_{02}$. Similarly, we reject $H_{02}$ globally if the level $\alpha$ tests reject $H_{02}$ and $H_{01} \cap H_{02}$.
For the individual tests we reject $H_{01}$ if $Z_1 \ge z_{1-\alpha}$ and $H_{02}$ if $Z_2 \ge z_{1-\alpha}$, where $z_{1-\alpha}$ denotes the $1-\alpha$ quantile of the standard normal distribution. To test the intersection hypothesis, we use a weighted Bonferroni test: given predefined weights $w_1$ and $w_2$, where $w_1 + w_2 = 1$, we reject $H_{01} \cap H_{02}$ if $Z_1 \ge z_{1-w_1\alpha}$ or $Z_2 \ge z_{1-w_2\alpha}$. The resulting closed testing procedure is equivalent to the weighted Bonferroni‐Holm test and will be generalised to adaptive tests in Section 2.3.
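A minimal R sketch of this closed weighted Bonferroni procedure follows; the one-sided level of 0.025 and the input Z-values are illustrative assumptions, not values fixed by the paper.

```r
# Closed testing for the single-stage design: reject H0j globally only if the
# level-alpha test of H0j and the weighted Bonferroni test of the intersection
# hypothesis both reject.
closed_test <- function(z1, z2, alpha = 0.025, w1 = 0.5) {
  w2 <- 1 - w1
  rej_int <- (z1 >= qnorm(1 - w1 * alpha)) || (z2 >= qnorm(1 - w2 * alpha))
  c(H01 = (z1 >= qnorm(1 - alpha)) && rej_int,
    H02 = (z2 >= qnorm(1 - alpha)) && rej_int)
}

closed_test(z1 = 2.6, z2 = 1.1, alpha = 0.025, w1 = 0.5)
#>   H01   H02
#>  TRUE FALSE
```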
We note that the choice of a closed testing procedure is not restrictive in this setting since any procedure that gives strong control of the FWER may be written as a closed testing procedure. 22 , 23 Furthermore, in the special cases $r_1 = 1$ and $r_2 = 1$, where the trial recruits from only one of the subgroups, just one subgroup is tested and only the test of the individual hypothesis is required. These cases are accommodated in our general class of designs by setting $w_1 = 1$ when $r_1 = 1$ and $w_2 = 1$ when $r_2 = 1$.
2.2.3. Bayesian optimization
In the single‐stage trial we wish to optimize the trial prevalences of each subgroup, $r_1$ and $r_2$, and the weights in the Bonferroni‐Holm procedure, $w_1$ and $w_2$. Given the constraints $r_1 + r_2 = 1$ and $w_1 + w_2 = 1$, we denote the set of parameters to optimize by $a = (r_1, w_1)$.
Let $f(\hat{\theta} \mid \theta;\, a)$ denote the conditional distribution of $\hat{\theta} = (\hat{\theta}_1, \hat{\theta}_2)$ given $\theta$ for design parameters $a$. The Bayes expected utility is given by

$$E\{U\} = \int_{\Theta} \left\{ \int U(\mathcal{D})\, f(\hat{\theta} \mid \theta;\, a)\, d\hat{\theta} \right\} \pi(\theta)\, d\theta.$$

The Bayes optimal design is given by the pair $a^* = (r_1^*, w_1^*)$ that maximises the Bayes expected utility of the trial, that is,

$$a^* = \underset{a}{\arg\max}\; E\{U\}.$$

Given our simple choices for the prior distribution and the utility function, this integral may be computed directly (see Section S1.2 of Appendix S1). We find the Bayes optimal single‐stage trial by a numerical search over possible values of $a$.
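As an illustration of such a search, the sketch below approximates the Bayes expected utility by Monte Carlo rather than by the direct computation of Appendix S1. All numerical settings (n = 700, λ1 = 0.3, σ = 1, α = 0.025, grid resolution) are assumptions for the example, and draw_prior() is the helper defined after Equation (2).

```r
# Monte Carlo approximation of the Bayes expected utility of a single-stage
# design a = (r1, w1).
expected_utility <- function(r1, w1, lambda1 = 0.3, n = 700, sigma = 1,
                             alpha = 0.025, nsim = 1e4, ...) {
  theta <- draw_prior(nsim, ...)
  # Z-values given theta: Z_j ~ N(theta_j * sqrt(r_j n) / (2 sigma), 1)
  z1 <- rnorm(nsim, theta[, 1] * sqrt(r1 * n) / (2 * sigma))
  z2 <- rnorm(nsim, theta[, 2] * sqrt((1 - r1) * n) / (2 * sigma))
  # Vectorized closed weighted Bonferroni-Holm procedure
  rej_int <- (z1 >= qnorm(1 - w1 * alpha)) | (z2 >= qnorm(1 - (1 - w1) * alpha))
  rej_H01 <- (z1 >= qnorm(1 - alpha)) & rej_int
  rej_H02 <- (z2 >= qnorm(1 - alpha)) & rej_int
  mean(lambda1 * rej_H01 + (1 - lambda1) * rej_H02)   # utility (1)
}

# Crude grid search over the design parameters a = (r1, w1)
grid <- expand.grid(r1 = seq(0.05, 0.95, 0.05), w1 = seq(0.05, 0.95, 0.05))
grid$EU <- mapply(expected_utility, grid$r1, grid$w1)
grid[which.max(grid$EU), ]
```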
2.3. Bayes optimal two‐stage adaptive design
2.3.1. Adding a second stage
Consider now a two‐stage design in which data from the first stage inform adaptations in the second stage. The estimate of $\theta_j$ for subgroup $j$ based on data collected in stage $k$ is

$$\hat{\theta}_j^{(k)} = \bar{X}_j^{(k)} - \bar{Y}_j^{(k)}, \tag{4}$$

where $\bar{X}_j^{(k)}$ and $\bar{Y}_j^{(k)}$ are the mean responses in subgroup $j$ in stage $k$ for the treatment arm and control arm, respectively. Given the value of $\theta$, the first stage estimates are independent with distributions

$$\hat{\theta}_j^{(1)} \sim N\!\left(\theta_j,\; \frac{4\sigma^2}{r_j^{(1)} s^{(1)} n}\right), \quad j = 1, 2.$$

The trial prevalences, $r_1^{(2)}$ and $r_2^{(2)}$, of the two subgroups in the second stage are dependent on $\hat{\theta}_1^{(1)}$ and $\hat{\theta}_2^{(1)}$ but, conditional on $r_1^{(2)}$ and $r_2^{(2)}$, the second‐stage estimates are independent and conditionally independent of $\hat{\theta}_1^{(1)}$ and $\hat{\theta}_2^{(1)}$ with

$$\hat{\theta}_j^{(2)} \sim N\!\left(\theta_j,\; \frac{4\sigma^2}{r_j^{(2)} s^{(2)} n}\right), \quad j = 1, 2.$$
2.3.2. Hypothesis testing in the two‐stage adaptive design
There is a variety of approaches to test multiple hypotheses in a two‐stage adaptive design. 33 , 34 , 35 , 36 We shall use a closed testing procedure to ensure strong control of the FWER at level $\alpha$, as we did for the single‐stage design in Section 2.2.2. In constructing level $\alpha$ tests of the null hypotheses $H_{01}$, $H_{02}$ and $H_{01} \cap H_{02}$ we employ the conditional error rate approach. 37 , 38 Based on a reference design and its predefined tests, we calculate the conditional error rate for each hypothesis and define adaptive tests which preserve this conditional error rate, thereby controlling the overall type I error rate.
Consider a reference design in which the trial prevalences of subgroups 1 and 2 and the weights in the weighted Bonferroni test of $H_{01} \cap H_{02}$ remain the same across stages, so $r_j^{(2)} = r_j^{(1)}$ and $w_j^{(2)} = w_j^{(1)}$ for $j = 1$ and 2. In the reference design, tests are performed by pooling the stage‐wise data within each subgroup and treatment arm, and using the conventional test statistics, as for the single‐stage test. For $j = 1$ and 2, the pooled estimate of $\theta_j$ across the two stages of the trial is

$$\hat{\theta}_j = s^{(1)}\, \hat{\theta}_j^{(1)} + s^{(2)}\, \hat{\theta}_j^{(2)},$$

with corresponding Z‐value

$$Z_j = \frac{\hat{\theta}_j \sqrt{r_j^{(1)} n}}{2\sigma} = \sqrt{s^{(1)}}\, Z_j^{(1)} + \sqrt{s^{(2)}}\, Z_j^{(2)},$$

and the null hypothesis $H_{0j}$ is rejected at level $\alpha$ if $Z_j \ge z_{1-\alpha}$. Let

$$Z_j^{(k)} = \frac{\hat{\theta}_j^{(k)} \sqrt{r_j^{(1)} s^{(k)} n}}{2\sigma}, \quad k = 1, 2,$$

then the conditional distribution of $Z_j$ given the interim data is

$$Z_j \mid Z_j^{(1)} \sim N\!\left( \sqrt{s^{(1)}}\, Z_j^{(1)} + \theta_j\, \frac{s^{(2)} \sqrt{r_j^{(1)} n}}{2\sigma},\; s^{(2)} \right)$$

and the conditional error rates for the tests of $H_{0j}$, computed at $\theta_j = 0$, are

$$A_j = P\!\left(Z_j \ge z_{1-\alpha} \mid Z_j^{(1)},\, \theta_j = 0\right) = 1 - \Phi\!\left( \frac{z_{1-\alpha} - \sqrt{s^{(1)}}\, Z_j^{(1)}}{\sqrt{s^{(2)}}} \right), \quad j = 1, 2. \tag{5}$$

Similarly, the conditional error rate for the weighted Bonferroni test of $H_{01} \cap H_{02}$, computed at $\theta_1 = \theta_2 = 0$, is

$$A_{12} = 1 - \Phi\!\left( \frac{z_{1-w_1^{(1)}\alpha} - \sqrt{s^{(1)}}\, Z_1^{(1)}}{\sqrt{s^{(2)}}} \right) \Phi\!\left( \frac{z_{1-w_2^{(1)}\alpha} - \sqrt{s^{(1)}}\, Z_2^{(1)}}{\sqrt{s^{(2)}}} \right). \tag{6}$$
See Section S1.1 of Appendix S1 for further details on the derivations of the conditional distributions.
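Under the reconstruction of Equations (5) and (6) given above, the conditional error rates are straightforward to compute. A hedged helper, assuming a one-sided level of 0.025 as a default; z1 and z2 are the first-stage Z-values and s1 is the first-stage fraction $s^{(1)}$:

```r
# Conditional error rates (5) and (6) of the reference design, given the
# first-stage Z-values; computed at theta1 = theta2 = 0.
cond_error <- function(z1, z2, s1, alpha = 0.025, w1 = 0.5) {
  s2 <- 1 - s1
  c(A1  = 1 - pnorm((qnorm(1 - alpha) - sqrt(s1) * z1) / sqrt(s2)),
    A2  = 1 - pnorm((qnorm(1 - alpha) - sqrt(s1) * z2) / sqrt(s2)),
    A12 = 1 - pnorm((qnorm(1 - w1 * alpha) - sqrt(s1) * z1) / sqrt(s2)) *
              pnorm((qnorm(1 - (1 - w1) * alpha) - sqrt(s1) * z2) / sqrt(s2)))
}

cond_error(z1 = 1.8, z2 = 0.4, s1 = 0.5)
```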
In the adaptive design, if no adaptations are made at the interim analysis we apply the tests as defined for the reference design. Suppose now that adaptations are made and the trial prevalences in stage 2 are set to be $r_1^{(2)}$ and $r_2^{(2)}$ with weights $w_1^{(2)}$ and $w_2^{(2)}$ for the weighted Bonferroni test. In this case, we calculate the conditional error rates $A_1$, $A_2$ and $A_{12}$ prior to adaptation from Equations (5) and (6). We then define tests of $H_{01}$, $H_{02}$ and $H_{01} \cap H_{02}$ based on stage 2 data alone that have these conditional error rates as their type 1 error probabilities. Given the updated $r_1^{(2)}$ and $r_2^{(2)}$, the second‐stage statistics

$$\tilde{Z}_j = \frac{\hat{\theta}_j^{(2)} \sqrt{r_j^{(2)} s^{(2)} n}}{2\sigma}, \quad j = 1, 2,$$

follow standard normal distributions under the null hypotheses. Thus, in our level $\alpha$ tests, we reject $H_{01}$ if $\tilde{Z}_1 \ge z_{1-A_1}$, we reject $H_{02}$ if $\tilde{Z}_2 \ge z_{1-A_2}$ and, applying a weighted Bonferroni test with weights $w_1^{(2)}$ and $w_2^{(2)}$, we reject $H_{01} \cap H_{02}$ if $\tilde{Z}_1 \ge z_{1-w_1^{(2)} A_{12}}$ or $\tilde{Z}_2 \ge z_{1-w_2^{(2)} A_{12}}$. Finally, following the closed testing procedure, we reject $H_{01}$ globally if the level $\alpha$ tests reject $H_{01}$ and $H_{01} \cap H_{02}$ and we reject $H_{02}$ globally if the level $\alpha$ tests reject $H_{02}$ and $H_{01} \cap H_{02}$.
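The resulting second-stage closed test can be sketched as follows, reusing cond_error() from above; z1_2 and z2_2 denote the statistics $\tilde{Z}_1$ and $\tilde{Z}_2$ based on second-stage data alone, and w1_2 is the updated weight $w_1^{(2)}$ (all inputs illustrative).

```r
# Second-stage tests with the conditional error rates as local levels.
adaptive_test <- function(z1_2, z2_2, A, w1_2 = 0.5) {
  rej_int <- (z1_2 >= qnorm(1 - w1_2 * A[["A12"]])) ||
             (z2_2 >= qnorm(1 - (1 - w1_2) * A[["A12"]]))
  c(H01 = (z1_2 >= qnorm(1 - A[["A1"]])) && rej_int,
    H02 = (z2_2 >= qnorm(1 - A[["A2"]])) && rej_int)
}

A <- cond_error(z1 = 1.8, z2 = 0.4, s1 = 0.5)
adaptive_test(z1_2 = 2.1, z2_2 = 0.9, A = A, w1_2 = 0.6)
```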
2.3.3. Two‐stage optimization
We denote the set of initial design parameters by $a_1 = (s^{(1)}, r_1^{(1)}, w_1^{(1)})$ and the second‐stage parameters by $a_2 = (r_1^{(2)}, w_1^{(2)})$. Let $\hat{\theta}^{(1)}$ and $\hat{\theta}^{(2)}$ be the vectors of estimated treatment effects in each subgroup, based on the first and second‐stage data, respectively, as defined in Equation (4). Denote the conditional distributions of the estimated effects in each stage of the trial by $f_1(\hat{\theta}^{(1)} \mid \theta;\, a_1)$ and $f_2(\hat{\theta}^{(2)} \mid \theta;\, a_2)$ and the posterior distribution of $\theta$ given the stage 1 observations by $\pi(\theta \mid \hat{\theta}^{(1)})$. Then, the Bayes expected utility can be written as

$$E\{U\} = \int_{\Theta} \int \int U(\mathcal{D})\, f_2(\hat{\theta}^{(2)} \mid \theta;\, a_2)\, f_1(\hat{\theta}^{(1)} \mid \theta;\, a_1)\, d\hat{\theta}^{(2)}\, d\hat{\theta}^{(1)}\, \pi(\theta)\, d\theta. \tag{7}$$
We find the optimal combination of design parameters $a_1$ before stage 1 and $a_2$ before stage 2 using the backward induction principle. First we construct the Bayes optimal $a_2$ for all possible $\hat{\theta}^{(1)}$ and $a_1$. Then we construct the Bayes optimal $a_1$ given that the optimal $a_2$ will be used in the second stage of the trial.
2.3.3.1. Optimizing the decision at the interim analysis
Denoting the marginal distribution of $\hat{\theta}^{(1)}$ by $m(\hat{\theta}^{(1)};\, a_1)$, we have

$$f_1(\hat{\theta}^{(1)} \mid \theta;\, a_1)\, \pi(\theta) = m(\hat{\theta}^{(1)};\, a_1)\, \pi(\theta \mid \hat{\theta}^{(1)}),$$

and the right‐hand side of Equation (7) can be written as

$$\int m(\hat{\theta}^{(1)};\, a_1) \left\{ \int_{\Theta} \int U(\mathcal{D})\, f_2(\hat{\theta}^{(2)} \mid \theta;\, a_2)\, d\hat{\theta}^{(2)}\, \pi(\theta \mid \hat{\theta}^{(1)})\, d\theta \right\} d\hat{\theta}^{(1)}.$$

Thus, given $a_1$ and $\hat{\theta}^{(1)}$, the Bayes optimal decision for the second stage is the choice of $a_2$ that maximises the conditional expected utility

$$E\{U \mid \hat{\theta}^{(1)};\, a_1, a_2\} = \int_{\Theta} \int U(\mathcal{D})\, f_2(\hat{\theta}^{(2)} \mid \theta;\, a_2)\, d\hat{\theta}^{(2)}\, \pi(\theta \mid \hat{\theta}^{(1)})\, d\theta.$$

For known values of $\hat{\theta}^{(1)}$ and $a_1$, we can find the conditional error rates $A_1$, $A_2$, and $A_{12}$ used in hypothesis testing in stage 2, hence we may evaluate $U(\mathcal{D})$ for given $a_1$, $\hat{\theta}^{(1)}$, $a_2$, and $\hat{\theta}^{(2)}$. Our choices for the prior distribution and utility function mean that it is quite straightforward to compute $E\{U \mid \hat{\theta}^{(1)};\, a_1, a_2\}$ for given $a_1$, $a_2$ and $\hat{\theta}^{(1)}$. Thus, we are able to perform a numerical search seeking

$$a_2^*(\hat{\theta}^{(1)}, a_1) = \underset{a_2}{\arg\max}\; E\{U \mid \hat{\theta}^{(1)};\, a_1, a_2\}$$

to find the Bayes optimal $a_2$.
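The sketch below makes this concrete under our assumptions: with prior (2) and the normal sampling distributions of Section 2.3.1, the posterior $\pi(\theta \mid \hat{\theta}^{(1)})$ is normal and available in closed form, so $E\{U \mid \hat{\theta}^{(1)};\, a_1, a_2\}$ can be approximated by simulating second-stage data, and dfoptim::hjkb (the optimizer named in Section 3.2) can search over $a_2$. All numerical settings, including hat1, are illustrative, and cond_error() is defined above.

```r
# Conditional expected utility at the interim for a2 = (r1_2, w1_2), given
# first-stage estimates hat1 and design a1 = (s1, r1_1, w1_1).
cond_exp_utility <- function(r1_2, w1_2, hat1, s1, r1_1, w1_1,
                             lambda1 = 0.3, n = 700, sigma = 1,
                             mu = c(0.1, 0), nu = 0.2, rho = 0.5,
                             alpha = 0.025, nsim = 5000) {
  s2 <- 1 - s1
  V     <- diag(4 * sigma^2 / (c(r1_1, 1 - r1_1) * s1 * n)) # Var(hat1 | theta)
  Sigma <- nu^2 * matrix(c(1, rho, rho, 1), 2)              # prior (2)
  Spost <- solve(solve(Sigma) + solve(V))                   # posterior variance
  mpost <- drop(Spost %*% (solve(Sigma, mu) + solve(V, hat1)))
  theta <- matrix(rnorm(2 * nsim), ncol = 2) %*% chol(Spost)
  theta <- sweep(theta, 2, mpost, "+")                      # posterior draws
  # conditional error rates from the first-stage Z-values
  z1 <- hat1 * sqrt(c(r1_1, 1 - r1_1) * s1 * n) / (2 * sigma)
  A  <- cond_error(z1[1], z1[2], s1, alpha, w1_1)
  # simulate second-stage Z-values under the adapted prevalences
  zt1 <- rnorm(nsim, theta[, 1] * sqrt(r1_2 * s2 * n) / (2 * sigma))
  zt2 <- rnorm(nsim, theta[, 2] * sqrt((1 - r1_2) * s2 * n) / (2 * sigma))
  rej_int <- (zt1 >= qnorm(1 - w1_2 * A[["A12"]])) |
             (zt2 >= qnorm(1 - (1 - w1_2) * A[["A12"]]))
  mean(lambda1 * ((zt1 >= qnorm(1 - A[["A1"]])) & rej_int) +
       (1 - lambda1) * ((zt2 >= qnorm(1 - A[["A2"]])) & rej_int))
}

# Hooke-Jeeves search for the Bayes optimal a2 given the interim data
library(dfoptim)
hat1 <- c(0.25, 0.05)  # hypothetical first-stage estimates
opt <- hjkb(par = c(0.3, 0.3),
            fn  = function(a2) -cond_exp_utility(a2[1], a2[2], hat1,
                                                 s1 = 0.5, r1_1 = 0.3,
                                                 w1_1 = 0.3),
            lower = c(0.01, 0.01), upper = c(0.99, 0.99))
opt$par  # optimized (r1_2, w1_2)
```

Because the objective is estimated by simulation, fixing the random seed or using common random numbers across evaluations helps the derivative-free search cope with Monte Carlo noise.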
2.3.3.2. Overall trial optimization
Having found the Bayes optimal parameters $a_2^*$ for the second stage of the trial as a function of $(\hat{\theta}^{(1)}, a_1)$, we determine $a_1^*$, the Bayes optimal choice for the initial parameters, as

$$a_1^* = \underset{a_1}{\arg\max} \int m(\hat{\theta}^{(1)};\, a_1)\, E\{U \mid \hat{\theta}^{(1)};\, a_1, a_2^*\}\, d\hat{\theta}^{(1)}.$$
We conduct a search over possible values of a1 to maximize the above integral and find the optimal choice of a1. Computing the integral for a given value of a1 by numerical integration is not straightforward. Instead, we have used Monte Carlo simulation to carry out this calculation for each value of a1.
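A sketch of this outer Monte Carlo loop, reusing draw_prior(), cond_exp_utility(), and hjkb() from the sketches above: for each candidate $a_1$ we simulate first-stage data from the prior, optimize $a_2$ for each dataset, and average the optimized conditional expected utilities. The settings are again illustrative, and the computation is heavy, as the text notes.

```r
# Simulation-based estimate of the expected utility of first-stage parameters
# a1 = (s1, r1_1, w1_1), assuming the optimal a2 is used after the interim.
eval_a1 <- function(s1, r1_1, w1_1, n = 700, sigma = 1, nsim = 1000) {
  mean(replicate(nsim, {
    theta <- draw_prior(1)
    se1   <- 2 * sigma / sqrt(c(r1_1, 1 - r1_1) * s1 * n)
    hat1  <- rnorm(2, mean = theta, sd = se1)       # first-stage estimates
    opt <- hjkb(par = c(r1_1, w1_1),
                fn  = function(a2) -cond_exp_utility(a2[1], a2[2], hat1,
                                                     s1, r1_1, w1_1),
                lower = c(0.01, 0.01), upper = c(0.99, 0.99))
    -opt$value                                      # optimized E{U | hat1}
  }))
}

# Compare a few candidate timings of the interim analysis, as in Figure 4
sapply(c(0.25, 0.5, 0.75), function(s1) eval_a1(s1, r1_1 = 0.3, w1_1 = 0.3))
```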
2.4. Bayes optimal umbrella trials
We now consider the case of umbrella trials, where it has been argued that no multiplicity adjustment is required as the hypotheses to be tested concern different experimental treatments targeted to different molecular markers or subgroups. 28 Since each treatment is assessed separately, an umbrella trial can be viewed as a set of independent trials even though they are run under a single protocol.
We consider umbrella trials with two subgroups, as in the previous sections. However, without multiplicity adjustment, the hypothesis testing procedure reduces to testing the elementary hypotheses $H_{01}$ and $H_{02}$ each at level $\alpha$. In applying the conditional error rate approach, only the computation of conditional error rates $A_1$ and $A_2$ from Equation (5) is required. Then, with $\tilde{Z}_1$ and $\tilde{Z}_2$ denoting the test statistics based on second‐stage data only, $H_{01}$ is rejected if $\tilde{Z}_1 \ge z_{1-A_1}$ and $H_{02}$ is rejected if $\tilde{Z}_2 \ge z_{1-A_2}$. No test of the intersection hypothesis is performed.
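The umbrella-trial analogue of the second-stage test simply drops the intersection test; a minimal sketch reusing cond_error() from Section 2.3.2:

```r
# Umbrella trial: test H01 and H02 separately at their conditional error rates.
umbrella_test <- function(z1_2, z2_2, A) {
  c(H01 = z1_2 >= qnorm(1 - A[["A1"]]),
    H02 = z2_2 >= qnorm(1 - A[["A2"]]))
}
```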
Design parameters are optimized with respect to the utility function in Equation (1). To frame the optimization problem in the same way as in the previous sections, the interim decision in a two‐stage umbrella trial will optimize only the second‐stage subgroup trial prevalences, so $a_2 = (r_1^{(2)})$, while in the first stage we optimize the subgroup trial prevalences and the timing of the interim analysis, so $a_1 = (s^{(1)}, r_1^{(1)})$. In the case of a single‐stage umbrella trial, only the subgroup prevalences are optimized, so $a = (r_1)$. We have used a normal prior distribution, as defined in Equation (2), in optimizing the design parameters of single‐stage and two‐stage trials. In the case of two‐stage designs, the interim analysis uses the test statistics from the first stage and the prior distribution to perform adaptations and the final tests are performed using the conditional error rate approach.
3. NUMERICAL EXAMPLES AND COMPARISONS
In this section, we give numerical examples of optimized single‐stage and two‐stage designs in a range of scenarios. We show results for cases with and without multiplicity correction, referring to these as enrichment and umbrella trials, respectively. Additionally, we illustrate the optimization of the decision rule at the interim analysis. In Table 1, we provide an overview of the scenarios considered and the parameters that are optimized.
TABLE 1. Overview of the scenarios considered in the numerical examples, with the parameters that are optimized marked "opt"; treatment effects $(\theta_1, \theta_2)$ are either drawn from the prior or fixed

| Figure | Design | $\lambda_1$ | $\mu_1$ | $\mu_2$ | $\nu$ | $\rho$ | $s^{(1)}$ | $r_1^{(1)}$ | $w_1^{(1)}$ | $\theta_1$ | $\theta_2$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Figure 2 | Single‐stage | 0.3 | 0 to 0.3 | 0, 0.2 | 0.02 to 0.44 | 0.5 | N/A | opt | opt | prior | prior |
| Figure S2 | Single‐stage | 0.3 | 0 to 0.3 | 0, 0.2 | 0.2 | −1 to 1 | N/A | opt | opt | prior | prior |
| Figure 3 | Interim decision | 0.3 | 0.1 | 0 | 0.2 | −0.8, 0.5 | 0.25, 0.5 | 0.3 | 0.3 | prior | prior |
| Figure 4 | Two‐stage | 0.3 | 0, 0.3 | 0, 0.2 | 0.2 | 0.5 | 0.1 to 0.9 | opt | opt | prior | prior |
| Figure 5 | Two‐stage | 0.3 | 0 to 0.3 | 0, 0.2 | 0.02 to 0.4 | 0.5 | opt | opt | opt | prior | prior |
| Figure S10 | Two‐stage | 0.3 | 0 to 0.3 | 0, 0.2 | 0.2 | −0.8 to 0.8 | opt | opt | opt | prior | prior |
| Figures 6 and S11 | Power | 0.3 | 0.1, 0.2 | 0 | 0.2 | 0.5 | opt | opt | opt | 0 to 0.3 | 0, 0.2 |
3.1. Optimal single‐stage designs
In studying the impact of the prior distribution on optimized trial design parameters for single‐stage designs, we consider studies where the response variance is $\sigma^2 = 1$ and the total sample size is fixed at $n = 700$. We assume a multivariate normal prior distribution for $\theta$ as defined in Equation (2) with parameters $\mu_1$, $\mu_2$, $\nu$ and $\rho$, and we compute optimal designs for a variety of such priors. The FWER in enrichment designs and the per‐comparison error rate in umbrella designs is fixed at one‐sided level $\alpha$.
In Figure 2 we display the effect of the prior SD $\nu$ on the optimal design parameters when the population prevalence of subgroup 1 is $\lambda_1 = 0.3$. We considered prior SDs of 0.02, 0.0632, 0.1, 0.1414, 0.2, 0.3162, and 0.44, corresponding to information from studies with 10 000, 1000, 400, 200, 100, 40 and 20 subjects in each subgroup.
The mean and variance of the prior distribution have a large impact on the optimal design parameters $r_1$ and $w_1$. The optimal values of $r_1$ and $w_1$ and the expected utility of the resulting designs are very similar for enrichment and umbrella designs. If $\mu_1 > 0$ and $\mu_2 = 0$, optimal values of $r_1$ and $w_1$ are larger than 0.3, the population prevalence of subgroup 1, so the design over‐samples this subgroup. If $\mu_1 = 0$ and $\mu_2 > 0$, the optimal design under‐samples subgroup 1. When both $\mu_1$ and $\mu_2$ are greater than zero, the optimal design has $r_1 < 0.5$ and $w_1 < 0.5$, reflecting the fact that it is advantageous to sample more subjects from subgroup 2 and allocate more type 1 error probability to the test of $H_{02}$, since $\lambda_2 > \lambda_1$ implies that P(Reject $H_{02}$) has a greater weight than P(Reject $H_{01}$) in the utility function.
In extreme cases where $\mu_1 = 0$, $\mu_2 > 0$ and the prior variance is small, the optimal design has $r_1 = 0$, so only subgroup 2 is sampled. When $\mu_1 > 0$, $\mu_2 = 0$ and the prior variance is small, the optimal design has $r_1 = 1$ and only subgroup 1 is sampled.
In Figure S2, we show the effect of the prior correlation $\rho$ on the design parameters when the prior SD is $\nu = 0.2$. We observe that the correlation has an impact on the optimal weight for testing the intersection hypothesis: in particular, when the treatment effects $\theta_1$ and $\theta_2$ have a high positive correlation, it is better to place most weight on one hypothesis rather than split the weight between the two hypotheses.
In Figures S3 and S4 we present further results for different values of $\lambda_1$, varying $\rho$ in Figure S3 and $\nu$ in Figure S4. Since the utility to be maximized depends on the population prevalences, the optimal design parameters vary considerably with $\lambda_1$. We see from Figure S3 that $\rho$ has only a small impact on the optimal value of $r_1$ when adjusting for multiplicity and no impact at all in umbrella designs where no multiplicity adjustment is made. Figure S4 shows that the dependence of optimal design parameters on $\nu$ is similar to that seen in Figure 2: when the prior variance is large the optimal choices for $r_1$ and $w_1$ are close to $\lambda_1$, while for smaller variances the optimal designs depend on the prior means $\mu_1$ and $\mu_2$ as well as $\lambda_1$.
3.2. Optimal two‐stage designs
Figure 3 illustrates optimal adaptation rules for two‐stage designs. In these examples $n = 700$, $\sigma^2 = 1$, the population prevalence of subgroup 1 is $\lambda_1 = 0.3$, and the prior distribution for $\theta$ has parameters $\mu_1 = 0.1$, $\mu_2 = 0$, $\nu = 0.2$, and $\rho = 0.5$ or $-0.8$. The first‐stage design parameters have not been optimized and are set as $r_1^{(1)} = 0.3$ and $w_1^{(1)} = 0.3$ with $s^{(1)}$ equal to 0.25 or 0.5. The FWER in enrichment designs and the per‐comparison error rate in umbrella designs is fixed at one‐sided level $\alpha$.
The adaptation rules specify the second‐stage design parameters $a_2$ that optimize the expected utility, as defined in Equation (1), given the first stage statistics $Z_1^{(1)}$ and $Z_2^{(1)}$. The optimal $r_1^{(2)}$ and $w_1^{(2)}$ are calculated using the Hooke‐Jeeves derivative‐free minimization algorithm through the hjkb function in the dfoptim package 39 in R. 40 We also calculated the conditional expected utility if the trial continued with no adaptation, so $r_1^{(2)} = r_1^{(1)}$ and $w_1^{(2)} = w_1^{(1)}$, and the plots in the bottom row of Figure 3 show the gain in the conditional expected utility due to the optimized adaptation. In Section S3 of Appendix S1, we present optimal interim rules for further parameter settings.
In Figure 4, we illustrate the procedure for optimizing first‐stage design parameters, $a_1 = (s^{(1)}, r_1^{(1)}, w_1^{(1)})$ for an enrichment design or $a_1 = (s^{(1)}, r_1^{(1)})$ for an umbrella design. For each combination of prior parameters and first‐stage design parameters $a_1$, we generated 1000 samples of first‐stage data under treatment effects drawn from the prior distribution. For each first‐stage dataset, we found the optimal second‐stage design parameters and noted the conditional expected utility using these optimal parameters. We took the average of the 1000 values of the optimized conditional expected utility as our simulation‐based estimate of the expected utility for this choice of $a_1$. The optimal first‐stage design parameters for a given prior distribution are those values of $s^{(1)}$, $r_1^{(1)}$ and, in the case of an enrichment design, $w_1^{(1)}$ that yield the highest expected utility.
Our results show the impact of the prior distribution on the optimized trial design parameters. The flat lines when $s^{(1)} = 0.1$ indicate that the expected utility is hardly affected by the choice of $r_1^{(1)}$ and $w_1^{(1)}$ when the interim analysis is performed early in the trial. When the interim analysis is performed later, the choice of first‐stage design parameters is more important. It should be noted that for each pair of prior means $(\mu_1, \mu_2)$, expected utility close to the overall optimum can be achieved using a wide range of first‐stage design parameters as long as the second‐stage design is optimized, given the first‐stage data.
In Figures 5 and S10 we present optimized values of the first‐stage design parameters, $s^{(1)}$, $r_1^{(1)}$, and $w_1^{(1)}$, given that optimal values of the second‐stage design parameters will be used following the interim analysis. The results are similar to those observed for optimal single‐stage designs. The prior variance has a large impact on the first‐stage optimal design: for smaller variances, interim analyses closer to the beginning of the trial yield a larger expected utility, while with larger variances, interim analyses after around 40% to 60% of the patients have been recruited are preferable. When the prior means are both 0, the optimal design parameters $r_1^{(1)}$ and $w_1^{(1)}$ are close to the subgroup 1 prevalence $\lambda_1 = 0.3$. However, if the prior suggests a benefit is more likely in subgroup 1, the optimal design over‐samples this subgroup, increasing its trial prevalence and testing weight. Figure S10 shows that, for enrichment designs, the prior correlation has a large impact on the choice of $w_1^{(1)}$ but little effect on the optimal trial prevalences.
As for single‐stage designs, the optimal values of the design parameters are similar for enrichment and umbrella designs. A notable difference is that, while the prior correlation has no effect at all on the optimal value of $r_1$ in a single‐stage umbrella design, the optimal value of $r_1^{(1)}$ in a two‐stage umbrella design does show a small dependence on $\rho$. In the case of a single‐stage umbrella design, the marginal distributions of $Z_1$ and $Z_2$ do not depend on $\rho$ and thus, with no multiplicity adjustment in testing $H_{01}$ and $H_{02}$, the expected value of the utility defined in Equation (1) does not depend on $\rho$. However, in a two‐stage umbrella trial, the optimal choice of $a_2$ and the resulting conditional expected utility depend on both $Z_1^{(1)}$ and $Z_2^{(1)}$, and it is the joint distribution of $(Z_1^{(1)}, Z_2^{(1)})$, which depends on $\rho$, that determines the optimal value of $r_1^{(1)}$.
It should be noted that the procedures we have described impose a high computational burden. While it is relatively straightforward to optimize the decision at the interim analysis, the overall optimization of the trial is performed using simulations over a grid of values for the first‐stage design parameters. More rapid computation of the optimal values may be achieved by using approximations to the utility when extreme first‐stage values are observed: for example, if both $\hat{\theta}_1^{(1)}$ and $\hat{\theta}_2^{(1)}$ are large and negative, the expected utility is practically zero for all choices of $r_1^{(2)}$ and $w_1^{(2)}$. In practice, one may wish to add the option of stopping the trial for futility if extreme negative results are observed at the interim analysis. The methods we have presented can be extended to find efficient designs that incorporate this option by working with a utility of the form

$$U(\mathcal{D}) = \lambda_1\, \mathbb{1}\{H_{01} \text{ is rejected}\} + \lambda_2\, \mathbb{1}\{H_{02} \text{ is rejected}\} + k\, s^{(2)} n\, \mathbb{1}\{\text{trial stops at the interim analysis}\},$$

assigning a positive value $k$ to each observation saved by early stopping.
3.3. Performance of the Bayes optimal design under specific alternative hypotheses
In this section we consider adaptive designs optimized for a particular prior distribution for $\theta$ but we evaluate their performance under specific values of $\theta = (\theta_1, \theta_2)$. We consider trials with a total sample size $n = 700$, response variance $\sigma^2 = 1$, and population prevalence of subgroup 1 equal to $\lambda_1 = 0.3$. As a benchmark for comparison, we consider a nonoptimized, single‐stage design with fixed trial prevalences and weights. We derive and assess the performance of single‐stage designs for which design parameters $r_1$ and $w_1$ are optimized as described in Section 2.2, and we derive and assess two‐stage designs for which first‐stage design parameters and the adaptation rule are optimized as described in Section 2.3. In optimizing designs, we assume the normal prior distribution for $\theta$ presented in Equation (2) with $\mu_1 = 0.1$ or 0.2, $\mu_2 = 0$, $\nu = 0.2$, and $\rho = 0.5$. These priors reflect the belief that a treatment benefit is more likely in subgroup 1. The prior SD of 0.2 corresponds to information from a trial with 100 subjects in each subgroup.
We evaluate the operating characteristics of the designs for values of $\theta_1$ ranging from 0 to 0.3 and $\theta_2 = 0$ or 0.2. This creates scenarios with a treatment effect in only one subgroup, when $\theta_2 = 0$, or with a treatment effect in both subgroups, when $\theta_1 > 0$ and $\theta_2 = 0.2$. Figure 6 presents simulation results for enrichment trials and Figure S11 presents results for umbrella trials. The plots show the probabilities of rejecting $H_{01}$ and $H_{02}$ and the average utility at the end of the trial for a variety of combinations of $\theta_1$, $\theta_2$ and the prior means. For the scenarios considered, we see that optimizing the trial for the assumed priors leads to a substantial increase in the power to reject $H_{01}$ as compared to the nonoptimized, single‐stage design. However, the optimized designs have lower power to reject $H_{02}$ when $\theta_2 = 0.2$. The optimized designs have a higher average utility than the nonoptimized design when $\theta_2 = 0$. If $\theta_2 = 0.2$, the two‐stage design optimized for the prior with $\mu_1 = 0.1$ has similar average utility to the nonoptimized design but the average utility of the optimized one‐stage design is a little lower; both one‐stage and two‐stage designs optimized for the prior with $\mu_1 = 0.2$ have lower average utility than the nonoptimized design. These results are in line with previous studies 41 , 42 which showed adaptive enrichment designs provide the greatest advantage when a treatment effect is present in only one subgroup.
4. WORKED EXAMPLE: IMPLEMENTING AN OPTIMIZED ADAPTIVE ENRICHMENT TRIAL
Suppose we wish to compare an experimental treatment to a control in a phase III clinical trial. We intend to use adaptive sample allocation as there is reason to believe the new treatment may only benefit a subgroup of patients. This trial will have a normally distributed endpoint with variance $\sigma^2 = 1$ and, using information from a pilot study with 40 subjects from each subgroup, we construct a prior distribution for the treatment effects of the form (2) with SD $\nu = 0.316$, the value corresponding to this pilot study sample size. The total sample size for the trial is planned to be $n = 700$ subjects. The population prevalence of subgroup 1 is $\lambda_1 = 0.3$ and the FWER is to be controlled at one‐sided level $\alpha$.
Under the above assumptions, the results in Figure 5 for this prior show the optimal first‐stage parameters to be $s^{(1)} = 0.5$ and $r_1^{(1)} = 0.4$, together with an optimized value of $w_1^{(1)}$. Thus, we recruit 350 patients in the first stage of the trial with 40% of these from subgroup 1.
Now suppose the interim estimates $\hat{\theta}_1^{(1)}$ and $\hat{\theta}_2^{(1)}$ give Z‐values $Z_1^{(1)}$ and $Z_2^{(1)}$ for which the conditional error rates, as defined in Equations (5) and (6), are $A_1 = 0.6140$, $A_2 = 0.0184$, and $A_{12} = 0.3912$. At this point, we optimize the second‐stage design parameters $r_1^{(2)}$ and $w_1^{(2)}$. Figure 7 plots the conditional expected utility as a function of $r_1^{(2)}$ and $w_1^{(2)}$ on a color‐coded scale. The maximum conditional expected utility, obtained using the Hooke‐Jeeves algorithm, determines the optimal values of $r_1^{(2)}$ and $w_1^{(2)}$, and we conduct the second stage of the trial using these parameter values.
Suppose, after recruiting the remaining subjects, the second‐stage estimates $\hat{\theta}_1^{(2)}$ and $\hat{\theta}_2^{(2)}$ give Z‐values $\tilde{Z}_1$ and $\tilde{Z}_2$ with corresponding P‐values $\tilde{p}_1$ and $\tilde{p}_2$. Since $\tilde{p}_1 < A_1$ and $\tilde{p}_1 < w_1^{(2)} A_{12}$, the level $\alpha$ tests reject both $H_{01}$ and $H_{01} \cap H_{02}$, so we can globally reject $H_{01}$. However, since $\tilde{p}_2 > A_2$, we cannot reject $H_{02}$.
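As a check on this decision logic (not a reproduction of the actual interim data, which are not shown here), the final analysis can be traced with the adaptive_test() sketch from Section 2.3.2, plugging in the conditional error rates above together with hypothetical second-stage Z-values and weight:

```r
A <- c(A1 = 0.6140, A2 = 0.0184, A12 = 0.3912)  # from the interim analysis
# Hypothetical second-stage Z-values and weight, for illustration only
adaptive_test(z1_2 = 1.10, z2_2 = 1.50, A = A, w1_2 = 0.6)
#>   H01   H02
#>  TRUE FALSE
```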
5. EXTENDING THE DESIGNS
The methods we have described can be extended to trial designs with more than two stages or more than two subgroups. Suppose $K$ disjoint subgroups $S_1, \ldots, S_K$ are specified and we wish to test the null hypotheses $H_{0k}\colon \theta_k \le 0$ against the alternatives $H_{1k}\colon \theta_k > 0$, where $\theta_k$ denotes the treatment effect in subgroup $k$. In a trial with $J$ stages and a total sample size $n$, we recruit $s^{(j)} n$ patients in each stage $j$, where $s^{(1)} + \cdots + s^{(J)} = 1$, and at stage $j$ we recruit $r_k^{(j)} s^{(j)} n$ patients from subgroup $k$, for $k = 1, \ldots, K$, where $r_1^{(j)} + \cdots + r_K^{(j)} = 1$. The data provide estimates $\hat{\theta}_k^{(j)}$ at each stage $j$, from which we obtain Z‐values $Z_k^{(j)}$. In an enrichment design where control of the FWER is required, a suitable closed testing procedure is defined in terms of the $Z_k^{(j)}$. Then, $H_{0k}$ is rejected globally at level $\alpha$ if all intersection hypotheses involving $H_{0k}$ are rejected in local, level $\alpha$ tests.
An adaptive design can be created by repeated application of the conditional error approach. An initial reference design is stated and when adaptation occurs, the modified testing procedure is defined so as to preserve the conditional error rate of each individual and intersection hypothesis test under the updated design for the remainder of the trial. This updated design becomes the new reference design under which conditional error rates will be calculated at any subsequent adaptation point.
We can consider optimizing the choice of the design parameters $s^{(j)}$ and $r_k^{(j)}$, or weights in the tests of intersection hypotheses. The generalization of our earlier approach requires a prior distribution for the treatment effects $(\theta_1, \ldots, \theta_K)$ and a utility function whose expectation is to be maximised. If $\lambda_k$ is the population prevalence of subgroup $k$, $k = 1, \ldots, K$, a natural extension of Equation (1) is

$$U(\mathcal{D}) = \sum_{k=1}^{K} \lambda_k\, \mathbb{1}\{H_{0k} \text{ is rejected}\}.$$
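In this generalization the utility remains a prevalence-weighted count of rejections, for example:

```r
# K-subgroup extension of utility (1): prevalence-weighted rejections.
utility_K <- function(lambda, rejected) sum(lambda * rejected)

utility_K(lambda = c(0.2, 0.3, 0.5), rejected = c(TRUE, FALSE, TRUE))
#> [1] 0.7
```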
In Section 2.3.3 we applied backward induction to find the optimal design for a trial with two subgroups and two stages. Since the dimension of the state space grows with the number of subgroups and stages, such a direct application of backward induction may not be feasible more generally. Other methods of optimization can be employed to find efficient, if not globally optimal, designs. For example, in a multistage design one may construct the adaptation rule at each interim analysis assuming the trial will continue without any further adaptation. We note that the optimization process is liable to be computationally intensive, and sufficient computing resources should be committed so that candidate designs can be assessed in a timely manner.
6. DISCUSSION
We have presented a Bayesian decision theoretic framework in which a clinical trial design can be optimized when two disjoint subgroups are under investigation. Our approach has both Bayesian and frequentist elements: the rules for hypothesis testing control the type I error rate and Bayesian decision tools are used to choose the design parameters within this scheme. This allows optimization of the sampling prevalence of each subgroup and weights in a weighted Bonferroni test of the intersection hypothesis, as well as optimal adaptation of these design parameters at the interim analysis. The optimal design maximizes the expected value of the specified utility function, averaged over the prior distribution assumed for the treatment effects in the two subgroups. After focusing on two‐stage trials with two subgroups in Sections 2 and 4, we outlined how our optimization framework may be extended to allow more subgroups or stages in the trial in Section 5.
Our results provide insights into how the mean and variance of the prior distribution affect the optimal timing of the interim analysis and the trial prevalences for each subgroup of patients. In practice, it is advisable to consider the sensitivity of the design's efficiency to modeling assumptions in order to create a trial design with robust efficiency.
In contrast to adaptive enrichment designs where recruitment is either from the full patient population or restricted to a single subgroup, we propose sampling from each subgroup at a specific rate which may differ from its population prevalence. We acknowledge that achieving the optimized prevalences in a trial may be challenging: additional screening will be required and over‐sampling a particular subgroup may delay a trial compared to an all‐comers design. 43 , 44 If logistical considerations imply that each subgroup is either dropped or sampled according to its population prevalence, our framework can still be used to optimize the other design parameters.
In Section 3.2 we discussed designs with the option of early stopping for futility and how the utility function might be modified to facilitate optimizing such designs. A similar approach could be followed to relax the requirement of a fixed total sample size and allow re‐assessment of future sample size at an interim analysis.
We have defined methods for normally distributed observations and a normal prior for treatment effects. While this has allowed us to demonstrate how to construct such designs, it is not a necessary restriction. With normally distributed responses, one could allow a separate response variance for each patient subgroup, placing prior distributions on these variances. In trials with other types of response distribution, including survival or categorical endpoints, standardized test statistics will still be approximately normally distributed if sample sizes are large enough, although nonnormal prior distributions may be appropriate. 45
We assumed the null hypotheses of interest are that there is no treatment effect in each subgroup. Our decision theoretic framework can accommodate other formulations, such as testing for treatment effects in the full population and in one particular subgroup, 8 , 20 , 22 , 23 , 24 , 46 , 47 , 48 in which case the stage‐wise test statistics for different subgroups are correlated. Care is required to ensure that enrichment designs control FWER when test statistics are correlated but this is not an issue in umbrella trials with separate level tests for each null hypothesis. 31
Although we have focused on hypothesis testing, estimation of treatment effects after an adaptive trial is also important. 49 Simultaneous or marginal confidence regions for parameters, with or without multiplicity adjustment, can be constructed following a two‐stage design. 50 , 51 Point estimates may be obtained by a weighted average of the treatment effects observed in the first and second stages 11 , 52 but, due to the sample size adaptations and subgroup selection, these estimators may be biased, with the bias depending on the specific adaptation rules and the true parameter values. A thorough investigation of estimation for adaptive enrichment designs will be a topic of future research.
Software in the form of an R package is available at https://github.com/nicoballarini/OptimalTrial.
AUTHOR CONTRIBUTIONS
Dr Ballarini and Dr Burnett are the co‐primary authors and they contributed equally to this work.
ACKNOWLEDGEMENTS
Nicolás Ballarini is supported by the EU Horizon 2020 Research and Innovation Programme, Marie Sklodowska‐Curie grant No 633567. Thomas Jaki is supported by the National Institute for Health Research (NIHR‐SRF‐2015‐08‐001) and the Medical Research Council (MR/M005755/1). Franz König and Martin Posch are members of the EU Patient‐Centric Clinical Trial Platform (EU‐PEARL) which has received funding from the Innovative Medicines Initiative 2 Joint Undertaking, grant No 853966. This Joint Undertaking receives support from the EU Horizon 2020 Research and Innovation Programme, EFPIA, Children's Tumor Foundation, Global Alliance for TB Drug Development, and SpringWorks Therapeutics. The views expressed in this publication are those of the authors. The funders and associated partners are not responsible for any use that may be made of the information contained herein.
Ballarini NM, Burnett T, Jaki T, Jennison C, König F, Posch M. Optimizing subgroup selection in two‐stage adaptive enrichment and umbrella designs. Statistics in Medicine. 2021;40:2939–2956. 10.1002/sim.8949
Funding information H2020 Marie Skłodowska‐Curie Actions, 633567; Innovative Medicines Initiative, 853966; Medical Research Council, MR/M005755/1; National Institute for Health Research, NIHR‐SRF‐2015‐08‐001
Data Availability Statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
References
- 1. Dmitrienko A, Muysers C, Fritsch A, Lipkovich I. General guidance on exploratory and confirmatory subgroup analysis in late‐stage clinical trials. J Biopharm Stat. 2016;26(1):71‐98.
- 2. Alosh M, Huque MF, Bretz F, D'Agostino RB Sr. Tutorial on statistical considerations on subgroup analysis in confirmatory clinical trials. Stat Med. 2017;36(8):1334‐1360.
- 3. Ondra T, Dmitrienko A, Friede T, et al. Methods for identification and confirmation of targeted subgroups in clinical trials: a systematic review. J Biopharm Stat. 2016;26(1):99‐119.
- 4. Antoniou M, Jorgensen AL, Kolamunnage‐Dona R. Biomarker‐guided adaptive trial designs in phase II and phase III: a methodological review. PLoS One. 2016;11(2):e0149803.
- 5. Mandrekar SJ, Sargent DJ. Clinical trial designs for predictive biomarker validation: theoretical considerations and practical challenges. J Clin Oncol. 2009;27(24):4027.
- 6. Freidlin B, McShane LM, Korn EL. Randomized clinical trials with biomarkers: design issues. J Natl Cancer Inst. 2010;102(3):152‐160.
- 7. Simon N, Simon R. Adaptive enrichment designs for clinical trials. Biostatistics. 2013;14(4):613‐625.
- 8. Brannath W, Zuber E, Branson M, et al. Confirmatory adaptive designs with Bayesian decision tools for a targeted therapy in oncology. Stat Med. 2009;28(10):1445‐1463.
- 9. Friede T, Parsons N, Stallard N. A conditional error function approach for subgroup selection in adaptive clinical trials. Stat Med. 2012;31(30):4309‐4320.
- 10. Sugitani T, Posch M, Bretz F, Koenig F. Flexible alpha allocation strategies for confirmatory adaptive enrichment clinical trials with a prespecified subgroup. Stat Med. 2018;37(24):3387‐3402.
- 11. Chiu Y‐D, Koenig F, Posch M, Jaki T. Design and estimation in clinical trials with subpopulation selection. Stat Med. 2018;37(29):4335‐4352.
- 12. Food and Drug Administration. Adaptive designs for clinical trials of drugs and biologics: guidance for industry; 2018.
- 13. Berry DA. The Brave new world of clinical cancer research: adaptive biomarker‐driven trials integrating clinical practice with clinical research. Mol Oncol. 2015;9(5):951‐959.
- 14. Meyer EL, Mesenbrink P, Dunger‐Baldauf C, et al. The evolution of master protocol clinical trial designs: a systematic literature review. Clin Ther. 2020;42(7):1330‐1360.
- 15. Food and Drug Administration. Master protocols: efficient clinical trial design strategies to expedite development of oncology drugs and biologics: guidance for industry; 2018.
- 16. Renfro LA, Sargent DJ. Statistical controversies in clinical research: basket trials, umbrella trials, and other master protocols: a review and examples. Ann Oncol. 2016;28(1):34‐43.
- 17. Woodcock J, LaVange LM. Master protocols to study multiple therapies, multiple diseases or both. New Engl J Med. 2017;377(1):62‐70.
- 18. Govindan R, Mandrekar SJ, Gerber DE, et al. ALCHEMIST trials: a golden opportunity to transform outcomes in early‐stage non‐small cell lung cancer. Clin Cancer Res. 2015;21(24):5439‐5444.
- 19. European Medicines Agency. Reflection paper on methodological issues in confirmatory clinical trials planned with an adaptive design; 2007.
- 20. Ondra T, Jobjörnsson S, Beckman RA, et al. Optimized adaptive enrichment designs. Stat Methods Med Res. 2019;28(7). doi:10.1177/0962280217747312
- 21. Ondra T, Jobjörnsson S, Beckman RA, et al. Optimizing trial designs for targeted therapies. PLoS One. 2016;11(9):e0163726.
- 22. Burnett T. Bayesian Decision Making in Adaptive Clinical Trials [PhD thesis]. Bath, UK: University of Bath; 2017.
- 23. Burnett T, Jennison C. Adaptive enrichment trials: what are the benefits? Stat Med. 2021;40(3):690‐711.
- 24. Graf AC, Posch M, Koenig F. Adaptive designs for subpopulation analysis optimizing utility functions. Biom J. 2015;57(1):76‐89.
- 25. Beckman RA, Clark J, Chen C. Integrating predictive biomarkers and classifiers into oncology clinical development programmes. Nat Rev Drug Discov. 2011;10(10):735.
- 26. Rosenblum M, Fang X, Liu H. Optimal, two stage, adaptive enrichment designs for randomized trials using sparse linear programming. Department of Biostatistics Working Papers, Working Paper 273. Johns Hopkins University; 2017.
- 27. Krisam J, Kieser M. Optimal decision rules for biomarker‐based subgroup selection for a targeted therapy in oncology. Int J Mol Sci. 2015;16(5):10354‐10375.
- 28. Stallard N, Todd S, Parashar D, Kimani PK, Renfro LA. On the need to adjust for multiplicity in confirmatory clinical trials with master protocols. Ann Oncol. 2019;30(4):506.
- 29. Dmitrienko A, D'Agostino RB Sr, Huque MF. Key multiplicity issues in clinical drug development. Stat Med. 2013;32(7):1079‐1111.
- 30. Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health‐Care Evaluation. Hoboken, NJ: John Wiley & Sons; 2004.
- 31. Stallard N, Posch M, Friede T, Koenig F, Brannath W. Optimal choice of the number of treatments to be included in a clinical trial. Stat Med. 2009;28(9):1321‐1338.
- 32. Marcus R, Peritz E, Gabriel KR. On closed testing procedures with special reference to ordered analysis of variance. Biometrika. 1976;63(3):655‐660.
- 33. Posch M, Koenig F, Branson M, Brannath W, Dunger‐Baldauf C, Bauer P. Testing and estimation in flexible group sequential designs with adaptive treatment selection. Stat Med. 2005;24(24):3697‐3714.
- 34. Bauer P, Kieser M. Combining different phases in the development of medical treatments within a single trial. Stat Med. 1999;18(14):1833‐1848.
- 35. Bretz F, Koenig F, Brannath W, Glimm E, Posch M. Adaptive designs for confirmatory clinical trials. Stat Med. 2009;28(8):1181‐1217.
- 36. Bauer P, Bretz F, Dragalin V, König F, Wassmer G. Twenty‐five years of confirmatory adaptive designs: opportunities and pitfalls. Stat Med. 2016;35(3):325‐347.
- 37. Müller H‐H, Schäfer H. Adaptive group sequential designs for clinical trials: combining the advantages of adaptive and of classical group sequential approaches. Biometrics. 2001;57(3):886‐891.
- 38. Müller H‐H, Schäfer H. A general statistical principle for changing a design any time during the course of a trial. Stat Med. 2004;23(16):2497‐2508.
- 39. Varadhan R, Borchers HW. dfoptim: derivative‐free optimization. R package version 2018.2‐1; 2018.
- 40. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2018.
- 41. Simon R, Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clin Cancer Res. 2004;10(20):6759‐6763.
- 42. Hoering A, LeBlanc M, Crowley JJ. Randomized phase III clinical trial designs for targeted agents. Clin Cancer Res. 2008;14(14):4358‐4367.
- 43. Klauschen F, Andreeff M, Keilholz U, Dietel M, Stenzinger A. The combinatorial complexity of cancer precision medicine. Oncoscience. 2014;1(7):504.
- 44. Eichler H‐G, Bloechl‐Daum B, Bauer P, et al. "Threshold‐crossing": a useful way to establish the counterfactual in clinical trials? Clin Pharmacol Ther. 2016;100(6):699‐712.
- 45. Brückner M, Burger HU, Brannath W. Nonparametric adaptive enrichment designs using categorical surrogate data. Stat Med. 2018;37(29):4507‐4524.
- 46. Wang SJ, O'Neill RT, Hung HJ. Approaches to evaluation of treatment effect in randomized clinical trials with genomic subset. Pharm Stat. 2007;6(3):227‐244.
- 47. Alosh M, Huque MF. A flexible strategy for testing subgroups and overall population. Stat Med. 2009;28(1):3‐23.
- 48. Spiessens B, Debois M. Adjusted significance levels for subgroup analyses in clinical trials. Contemp Clin Trials. 2010;31(6):647‐656.
- 49. Stallard N, Todd S, Whitehead J. Estimation following selection of the largest of two normal means. J Stat Plann Infer. 2008;138(6):1629‐1638.
- 50. Mehta CR, Bauer P, Posch M, Brannath W. Repeated confidence intervals for adaptive group sequential trials. Stat Med. 2007;26(30):5422‐5433.
- 51. Magirr D, Jaki T, Posch M, Klinglmueller F. Simultaneous confidence intervals that are compatible with closed testing in adaptive designs. Biometrika. 2013;100(4):985‐996.
- 52. Kimani PK, Todd S, Stallard N. Estimation after subpopulation selection in adaptive seamless trials. Stat Med. 2015;34(18):2581‐2601.