Bayesian Design of Superiority Trials: Methods and Applications

Wenlin Yuan; Ming-Hui Chen; John Zhong

doi:10.1080/19466315.2022.2090429

Author manuscript; available in PMC: 2023 Jul 18.

Published in final edited form as: Stat Biopharm Res. 2022 Jul 18;14(4):433–443. doi: 10.1080/19466315.2022.2090429


PMCID: PMC10035591 NIHMSID: NIHMS1840288 PMID: 36968644

Abstract

In this paper, we lay out the basic elements of Bayesian sample size determination (SSD) for the Bayesian design of a two-arm superiority clinical trial. We develop a flowchart of the Bayesian SSD that highlights the critical components of a Bayesian design and provides a practically useful roadmap for designing a Bayesian clinical trial in real world applications. We empirically examine the amount of borrowing, the choice of noninformative priors, and the impact of model misspecification on the Bayesian type I error and power. A formal and statistically rigorous formulation of conditional borrowing within the decision rule framework is developed. Moreover, by extending the partial borrowing power priors, a new borrowing-by-parts power prior for incorporating historical data is proposed. Computational algorithms are also developed to calculate the Bayesian type I error and power. Extensive simulation studies are carried out to explore the operating characteristics of the proposed Bayesian design of a superiority trial.

Keywords: Borrowing-by-Parts Power Prior, Conditional Borrowing, Sample Size Determination, Power Prior

1. Introduction

The Complex Innovative Trial Design (CID) Pilot Meeting Program was initiated by the U.S. Food and Drug Administration (FDA) in 2018 to support the goal of facilitating and advancing the use of complex adaptive, Bayesian, and other novel clinical trial designs (U.S. Food and Drug Administration, 2018). The first study selected by the FDA for the CID Pilot Meeting Program is the DYSTANCE 51 clinical trial sponsored by Wave Life Sciences, a global Phase 2/3, multicenter, randomized, double-blind, placebo-controlled clinical trial that evaluates the efficacy and safety of suvodirsen in ambulatory boys between 5 and 12 years of age (inclusive) with a genetically confirmed diagnosis of Duchenne muscular dystrophy (DMD) amenable to exon 51 skipping (Lake et al., 2021). DMD is a rapidly progressive form of muscular dystrophy that occurs primarily in males and manifests prior to the age of six years, affecting approximately 1 in 3,600 to 9,300 male births worldwide (Mah et al., 2014). According to the experience with the CID Pilot Meeting Program shared by Wave Life Sciences, the DYSTANCE 51 clinical trial incorporates the capability to augment the placebo arm with historical data using Bayesian methods (Lake et al., 2021). Natural history studies can be used to support the development of safe and effective drugs and biological products for rare diseases. There has been a joint effort from the DMD communities and worldwide regulatory agencies to assess the appropriateness of natural history data to supplement clinical development programs. The suitability of borrowing historical DMD data has also been supported by studies (Goemans et al., 2020).

The literature on Bayesian sample size determination (SSD) has been growing in recent years owing to advances in Bayesian computation and Markov chain Monte Carlo sampling. Joseph et al. (1995); Lindley (1997); Rubin and Stern (1998); Katsis and Toman (1999); and Inoue et al. (2005) are the Bayesian SSD articles cited in the FDA guidance for the use of Bayesian statistics in medical device clinical trials (U.S. Food and Drug Administration, 2010). The early literature on Bayesian SSD includes Rahme and Joseph (1998); Simon (1999); Wang and Gelfand (2002); Spiegelhalter et al. (2004); De Santis (2004, 2007); M’Lan et al. (2006); M’Lan et al. (2008); Lee and Liu (2008); and Reyes and Ghosh (2013). Campbell (2011) and Berry et al. (2010) provided a list of Bayesian papers up to 2011. Gamalo-Siebers et al. (2016) gave an excellent review of Bayesian methods for the design and analysis of non-inferiority trials. Chen et al. (2011) developed a new Bayesian design methodology with a focus on controlling the type I error and power for non-inferiority trials. Chen et al. (2014) extended the methodology of Chen et al. (2011) to the Bayesian design of superiority clinical trials for recurrent events data. Li et al. (2015) developed the Bayesian design of non-inferiority clinical trials with co-primary endpoints and multiple dose comparison, and Li et al. (2018) proposed a Bayesian design via Bayes Factor. Bayesian methods for incorporating historical data include the power prior (Ibrahim and Chen, 2000; Ibrahim et al., 2015), the hierarchical prior (Chen et al., 2011), the commensurate prior (Hobbs et al., 2011, 2012), the meta-analytic-predictive (MAP) prior (Neuenschwander et al., 2010), the robust MAP prior (Schmidli et al., 2014), the covariate-adjusted hierarchical model-based prior (Han et al., 2017), and their respective variations. Recent review papers of these methods include Schmidli et al. (2020); Hall et al. (2021); Ghadessi et al. (2020); and van Rosmalen et al. (2018).

Motivated by the DYSTANCE 51 clinical trial, we explore different aspects in borrowing historical data within the Bayesian framework. Using the DMD natural history aggregate data, we develop a formal formulation of conditional borrowing of Allocco et al. (2010) via the decision rule. By extending the partial borrowing power priors (Ibrahim et al., 2012; Chen et al., 2014; Ibrahim et al., 2015), we propose a new borrowing-by-parts power prior for incorporating historical data. In this paper, we also address several critical issues in Bayesian SSD, namely, the amount of borrowing from the historical data, the choice of noninformative priors, and the impact of model misspecification on the Bayesian type I error and power.

The remaining part of the manuscript is organized as follows. In Section 2, we present the critical elements and necessary steps including the computational algorithm of a Bayesian design of a superiority trial and develop a flowchart of Bayesian SSD. In Section 3, we analytically examine the properties of the Bayesian type I error and the power when the variances are known and empirically investigate the choice of noninformative priors and the impact of model misspecification on the Bayesian type I error and power. Section 4 presents a comprehensive treatment of leveraging historical data in Bayesian SSD. In this section, we first introduce the DMD natural history aggregate data, then present two commonly used priors, namely, the power prior and the robust mixture prior, to leverage the historical data, and further investigate the impact of the amount of borrowing on the Bayesian type I error and power. In this very same section, the formal and statistically rigorous formulations of conditional borrowing and the borrowing-by-parts power prior are developed, and extensive simulation studies are conducted to examine the empirical performance of the proposed methodology. We conclude the paper with a brief discussion in Section 5.

2. Bayesian Design of a Superiority Trial

We consider designing a randomized, double-blind, placebo-controlled clinical trial to evaluate the superiority of a drug candidate over placebo with a continuous primary endpoint. Given the large amount of historical data available, we also consider borrowing the historical data to augment the placebo control in the clinical trial design. Let $y_{t} = {(y_{t 1}, y_{t 2}, \dots, y_{t n_{t}})}^{'}$ and $y_{c} = {(y_{c 1}, y_{c 2}, \dots, y_{c n_{c}})}^{'}$ be the primary endpoint data of test drug and placebo control, with sample sizes n_t and n_c, respectively. We assume $y_{t}$ and $y_{c}$ are independent, $y_{t i} \overset{i . i . d .}{\sim} N (μ_{t}, σ_{t}^{2})$ , and $y_{c i} \overset{i . i . d .}{\sim} N (μ_{c}, σ_{c}^{2})$ . The parameter of interest is the expectation of difference in the effects of the test drug and the placebo control, namely, $δ = μ_{t} - μ_{c}$ . The hypotheses of the superiority trial are $H_{0} : δ \leq 0$ versus $H_{1} : δ > 0$ , equivalently, $H_{0} : μ_{t} \leq μ_{c}$ versus $H_{1} : μ_{t} > μ_{c}$ .

Let $θ = {(μ_{c}, μ_{t}, σ_{c}^{2}, σ_{t}^{2})}^{'}$ . We also consider two special cases: (i) $θ = {(μ_{c}, μ_{t})}^{'}$ when the variances are known and (ii) $θ = {(μ_{c}, μ_{t}, σ^{2})}^{'}$ when the variances are equal but unknown, where $σ_{t}^{2} = σ_{c}^{2} = σ^{2}$ . Let $y^{(n)} = {(y_{t}^{'}, y_{c}^{'})}^{'}$ and $n = n_{t} + n_{c}$ . The joint distribution of $y^{(n)}$ and θ is written as

$$f (y^{(n)} ∣ θ) π (θ),$$

where $f (y^{(n)} ∣ θ)$ is the conditional distribution of y⁽ⁿ⁾ given θ, which is the product of the normal densities corresponding to $N (μ_{t}, σ_{t}^{2})$ and $N (μ_{c}, σ_{c}^{2})$ , respectively, and π (θ) denotes a prior distribution. Denote the fitting prior as $π^{(f)} (θ)$ and the sampling prior as $π^{(s)} (θ)$ , where the fitting prior is used to fit the model and the sampling prior is used to generate the data (Wang and Gelfand, 2002). Then, the posterior is given by

$$π (θ ∣ y^{(n)}, π^{(f)}) \propto f (y^{(n)} ∣ θ) π^{(f)} (θ) . \tag{2.1}$$

Let $Θ_{0} = \{θ : μ_{t} \leq μ_{c}, σ_{t}^{2} > 0, σ_{c}^{2} > 0\}$ be the null parameter space corresponding to the null hypothesis and $Θ_{1} = \{θ : μ_{t} > μ_{c}, σ_{t}^{2} > 0, σ_{c}^{2} > 0\}$ be the alternative parameter space corresponding to the alternative hypothesis. The sampling priors $π_{0}^{(s)} (θ)$ and $π_{1}^{(s)} (θ)$ , which are used to generate the data, are defined on $Θ_{0}$ and $Θ_{1}$ such that $\int_{Θ_{ℓ}} π_{ℓ}^{(s)} (θ) d θ = 1$ for $ℓ = 0, 1$ .

Let $T (y^{(n)})$ be a test statistic. We define a decision rule based on $T (y^{(n)})$ to reject H₀ as

$$ψ (y^{(n)}) = \begin{cases} 1, & \text{if } T (y^{(n)}) \geq T_{0}, \\ 0, & \text{if } T (y^{(n)}) < T_{0}, \end{cases} \tag{2.2}$$

where T₀ is a critical value, which only depends on the study design, not on the data $y^{(n)}$ . We define the Bayesian power function as

$$β_{s}^{(n)} = E_{s} [ψ (y^{(n)}) ∣ π^{(s)}], \tag{2.3}$$

where the expectation is taken with respect to the marginal distribution of $y^{(n)}$ under the sampling prior $π^{(s)} (θ)$ . Following the approach of Chen et al. (2011), we take

$$T (y^{(n)}) = P (δ > 0 ∣ y^{(n)}, π^{(f)}), \tag{2.4}$$

where the probability is computed with respect to the posterior distribution given the data $y^{(n)}$ and the fitting prior $π^{(f)} (θ)$ . Let $T_{0} = γ$ be a Bayesian credible level, where 0 < γ < 1. Then the Bayesian power function in (2.3) reduces to

$$β_{s}^{(n)} = E_{π^{(s)}} [1 \{P (δ > 0 ∣ y^{(n)}, π^{(f)}) \geq γ\}], \tag{2.5}$$

where the indicator function $1 \{A\}$ takes the value 1 if A is true and 0 otherwise.

An analytical evaluation of (2.3) or (2.5) is often not available. The following computational algorithm can be used to approximate $β_{s}^{(n)}$ :

Step 0. Set n_t, n_c, γ, and N (the number of simulated datasets);

Step 1. Generate $θ \sim π^{(s)} (θ)$ ;

Step 2. Generate $y^{(n)} \sim f (y^{(n)} ∣ θ)$ ;

Step 3. Calculate $T (y^{(n)}) = P (δ > 0 ∣ y^{(n)}, π^{(f)})$ ;

Step 4. Check whether $T (y^{(n)}) \geq γ$ or not;

Step 5. Repeat Steps 1-4 N times;

Step 6. Compute the proportion of the N runs in which $T (y^{(n)}) \geq γ$ , which gives an estimate of $β_{s}^{(n)}$ in (2.3) or (2.5).
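The steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the algorithm itself: the variances are treated as known and the fitting prior is flat, so that Step 3 has the closed form derived in Section 3.1, and the sample means are simulated directly rather than the individual observations. The function and argument names (`bayesian_power_mc`, `sample_theta`, `point_mass`) are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def bayesian_power_mc(n_t, n_c, sigma2_t, sigma2_c, gamma, sample_theta,
                      n_sims=20000, seed=1):
    """Monte Carlo estimate of the Bayesian power function (2.5).

    Assumes known variances and a flat fitting prior, so that
    T(y) = Phi((ybar_t - ybar_c) / sqrt(sigma2_c/n_c + sigma2_t/n_t)).
    """
    rng = random.Random(seed)                      # Step 0: fix the design inputs
    se_t = sqrt(sigma2_t / n_t)
    se_c = sqrt(sigma2_c / n_c)
    post_sd = sqrt(sigma2_c / n_c + sigma2_t / n_t)
    rejections = 0
    for _ in range(n_sims):                        # Step 5: repeat N times
        mu_c, mu_t = sample_theta(rng)             # Step 1: theta ~ sampling prior
        ybar_t = rng.gauss(mu_t, se_t)             # Step 2: data, reduced to the
        ybar_c = rng.gauss(mu_c, se_c)             #         two sample means
        t_stat = STD_NORMAL.cdf((ybar_t - ybar_c) / post_sd)  # Step 3
        if t_stat >= gamma:                        # Step 4
            rejections += 1
    return rejections / n_sims                     # Step 6

def point_mass(delta, mu_c=0.0):
    """Degenerate sampling prior that puts all mass on a fixed difference delta."""
    return lambda rng: (mu_c, mu_c + delta)

type1 = bayesian_power_mc(50, 25, 25.0, 25.0, 0.975, point_mass(0.0))
power = bayesian_power_mc(50, 25, 25.0, 25.0, 0.975, point_mass(3.5))
print(f"type I error ~ {type1:.4f}, power ~ {power:.4f}")
```

With unknown variances, only Step 3 changes: the posterior probability would then be computed from the posterior under the chosen fitting prior instead of from the normal cdf.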

The Bayesian type I error $β_{s 0}^{(n)}$ is $β_{s}^{(n)}$ with $π^{(s)} = π_{0}^{(s)} (θ)$ in (2.5) while the Bayesian power $β_{s 1}^{(n)}$ is $β_{s}^{(n)}$ with $π^{(s)} = π_{1}^{(s)} (θ)$ . The maximum Bayesian type I error is $\sup_{π_{0}^{(s)}} β_{s 0}^{(n)}$ . For given $α_{0} > 0$ and $α_{1} > 0$ , we compute $n_{α_{0}} = \min \{n : β_{s 0}^{(n)} \leq α_{0}\}$ and $n_{α_{1}} = \min \{n : β_{s 1}^{(n)} \geq 1 - α_{1}\}$ . The Bayesian sample size is then given by $n_{B} = \max \{n_{α_{0}}, n_{α_{1}}\}$ . Common choices of $α_{0}$ and $α_{1}$ are $α_{0} = 0.025$ and $α_{1} = 0.20$ . With the Bayesian sample size of $n_{B}$ , the type I error rate is intended to be less than or equal to 0.025 and the power is intended to be at least 0.80.

We summarize the above Bayesian SSD process in Figure 1. Every Bayesian SSD starts with the trial specification, which primarily consists of the type of trial (superiority, non-inferiority, or equivalence) based on the objective of a study, the number of treatment arms, and the sample size allocation in each arm. The next step of Bayesian SSD shown in Figure 1 includes the specification of a statistical model for the current data, the derivation of the corresponding likelihood, and the mathematical formulation of scientific hypotheses according to the chosen model. According to (2.1), the next step in the flowchart is to specify a fitting prior and then to derive the posterior. One of the key components in Bayesian SSD is to construct a test statistic $T (y^{(n)})$ in (2.2), which leads to the formulation of a decision rule. The key design quantity, i.e., the Bayesian power function $β_{s}^{(n)}$ in (2.3), can be evaluated either analytically or numerically via a Monte Carlo method, under a given sampling prior $π_{ℓ}^{(s)} (θ)$ , which eventually leads to the final determination of the Bayesian sample size. One additional component in Figure 1 is the historical data when available, which can be incorporated in the fitting priors via the decision rule. The inclusion of the historical data typically leads to a reduction of the Bayesian sample size, but it may at the same time increase the type I error. The technical details and potential issues in using a noninformative fitting prior or an informative prior by leveraging historical data are discussed in detail in the next two sections.

3. Bayesian SSD with Noninformative Priors

3.1. Theoretical Properties of the Bayesian Power Function with Known Variances

In this subsection, we assume that the variances $σ_{t}^{2}$ and $σ_{c}^{2}$ are known. Under this assumption, we have $θ = {(μ_{c}, μ_{t})}^{'}$ . The null parameter space is $Θ_{0} = \{θ : μ_{t} \leq μ_{c}\}$ and the alternative parameter space is $Θ_{1} = \{θ : μ_{t} > μ_{c}\}$ . We assume a noninformative uniform prior for the fitting prior, namely, $π^{(f)} (θ) \propto 1$ . Then, the joint density of $y^{(n)}$ given θ is

$$f (y^{(n)} ∣ θ) = \prod_{i = 1}^{n_{t}} \frac{1}{\sqrt{2 π} σ_{t}} \exp \left\{- \frac{{(y_{t i} - μ_{t})}^{2}}{2 σ_{t}^{2}}\right\} \prod_{j = 1}^{n_{c}} \frac{1}{\sqrt{2 π} σ_{c}} \exp \left\{- \frac{{(y_{c j} - μ_{c})}^{2}}{2 σ_{c}^{2}}\right\} .$$

From (2.1), the posterior distribution takes the form

$$π (θ ∣ y^{(n)}, π^{(f)}) \propto f (y^{(n)} ∣ θ) π^{(f)} (θ) \propto \exp \left\{- \frac{{(μ_{c} - {\bar{y}}_{c})}^{2}}{2 σ_{c}^{2} / n_{c}}\right\} \exp \left\{- \frac{{(μ_{t} - {\bar{y}}_{t})}^{2}}{2 σ_{t}^{2} / n_{t}}\right\},$$

where ${\bar{y}}_{c} = \frac{1}{n_{c}} \sum_{i = 1}^{n_{c}} y_{c i}$ and ${\bar{y}}_{t} = \frac{1}{n_{t}} \sum_{j = 1}^{n_{t}} y_{t j}$ . The test statistic $T (y^{(n)})$ in (2.4) is expressed as

$$T (y^{(n)}) = P (δ > 0 ∣ y^{(n)}, π^{(f)}) = \int_{- \infty}^{\infty} \int_{- \infty}^{μ_{t}} π (θ ∣ y^{(n)}, π^{(f)}) \, d μ_{c} \, d μ_{t} = Φ \left(\frac{{\bar{y}}_{t} - {\bar{y}}_{c}}{\sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}}\right),$$

where Φ is the standard normal N(0, 1) cumulative distribution function (cdf). The decision rule in (2.2) is to reject H₀ if

$$Φ \left(\frac{{\bar{y}}_{t} - {\bar{y}}_{c}}{\sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}}\right) \geq γ . \tag{3.1}$$

Applying $Φ^{- 1}$ , the inverse function of Φ, to both sides of (3.1) gives

$${\bar{y}}_{t} - {\bar{y}}_{c} \geq Z_{γ} \sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}, \tag{3.2}$$

where $Z_{γ} = Φ^{- 1} (γ)$ . Using (3.2), the critical function in (2.2) reduces to

$$ψ (y^{(n)}) = \begin{cases} 1, & \text{if } {\bar{y}}_{t} - {\bar{y}}_{c} \geq Z_{γ} \sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}, \\ 0, & \text{otherwise} . \end{cases} \tag{3.3}$$

Assume the sampling prior $π^{(s)} (θ) = π^{(s)} (μ_{c}) π^{(s)} (μ_{t})$ such that $\int_{- \infty}^{\infty} π^{(s)} (μ_{c}) d μ_{c} = 1$ and $\int_{- \infty}^{\infty} π^{(s)} (μ_{t}) d μ_{t} = 1$ . Since ${\bar{y}}_{t} - {\bar{y}}_{c} ∣ μ_{c}, μ_{t} \sim N (μ_{t} - μ_{c}, σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t})$ , using (3.3), the Bayesian power function (2.5) then becomes

$$\begin{aligned} β_{s}^{(n)} &= E_{π^{(s)}} [ψ (y^{(n)})] = E_{π^{(s)}} \left[Φ \left(\frac{μ_{t} - μ_{c}}{\sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}} - Z_{γ}\right)\right] \\ &= \int_{- \infty}^{\infty} \int_{- \infty}^{\infty} π^{(s)} (μ_{c}) π^{(s)} (μ_{t}) \, Φ \left(\frac{μ_{t} - μ_{c}}{\sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}} - Z_{γ}\right) d μ_{c} \, d μ_{t} . \end{aligned} \tag{3.4}$$

Taking $π^{(s)} (θ) = π_{0}^{(s)} (θ)$ such that $\int_{- \infty}^{\infty} \int_{μ_{t}}^{\infty} π^{(s)} (μ_{c}) π^{(s)} (μ_{t}) d μ_{c} d μ_{t} = 1$ , using (3.4), the Bayesian type I error is given by

$$β_{s 0}^{(n)} = \int_{- \infty}^{\infty} \int_{μ_{t}}^{\infty} π^{(s)} (μ_{c}) π^{(s)} (μ_{t}) \, Φ \left(\frac{μ_{t} - μ_{c}}{\sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}} - Z_{γ}\right) d μ_{c} \, d μ_{t} .$$

It is easy to show that

$$β_{s 0}^{(n)} \leq Φ (- Z_{γ}) = 1 - Φ (Z_{γ}) = 1 - γ . \tag{3.5}$$
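The step behind (3.5) can be written out explicitly. The sketch below uses only the monotonicity of Φ and the fact that the null sampling prior integrates to one over the region $μ_{c} \geq μ_{t}$ ; the shorthand $s_{n}$ for the posterior standard deviation of δ is introduced here for brevity and is not notation from the paper.

```latex
% Shorthand (assumed here): s_n = \sqrt{\sigma_c^2/n_c + \sigma_t^2/n_t}.
\begin{aligned}
\mu_t \le \mu_c
  \;&\Longrightarrow\;
  \Phi\!\left(\frac{\mu_t - \mu_c}{s_n} - Z_\gamma\right) \le \Phi(-Z_\gamma), \\
\beta_{s0}^{(n)}
  \;&\le\; \Phi(-Z_\gamma)
  \int_{-\infty}^{\infty}\!\int_{\mu_t}^{\infty}
    \pi^{(s)}(\mu_c)\,\pi^{(s)}(\mu_t)\, d\mu_c\, d\mu_t
  \;=\; \Phi(-Z_\gamma) \;=\; 1 - \gamma .
\end{aligned}
```

Equality holds when the sampling prior places all of its mass on $μ_{t} = μ_{c}$ , since the integrand then equals $Φ (- Z_{γ})$ wherever the prior has mass.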

The maximum type I error is attained at the boundary $δ = μ_{t} - μ_{c} = 0$ by specifying the sampling prior $π_{0}^{(s)} (θ)$ as a degenerate distribution at δ = 0, denoted by $Δ_{\{δ = 0\}}$ , i.e., $P (δ = 0) = 1$ . Furthermore, we take a point mass sampling prior

$$π_{1}^{(s)} = Δ_{\{δ = δ_{1}\}}, \tag{3.6}$$

which is a degenerate distribution at δ = δ₁, where δ₁ > 0. Using (3.4) and (3.6), the Bayesian power is given by

$$β_{s 1}^{(n)} = Φ \left(\frac{δ_{1}}{\sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}} - Z_{γ}\right) . \tag{3.7}$$

We have the following results for the Bayesian type I error and power.

Result 1. The maximum Bayesian type I error is 1 − γ and is attained at the boundary of the parameter space corresponding to the null hypothesis.

Result 2. Using the point mass sampling prior in (3.6) and taking γ = 1 − α₀, for any choice of δ₁ > 0 (that is, any point in the alternative space), the Bayesian power in (3.7) is equal to the frequentist power given by

$$Φ \left(\frac{δ_{1}}{\sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}} - Z_{1 - α_{0}}\right), \tag{3.8}$$

where 0 < α₀ < 1 is the type I error.

Suppose we specify the maximum type I error at α₀ = 0.025 , the power at 1 − α₁ = 0.8, and the randomization ratio with n_t : n_c = 2 : 1. Assume that $σ_{t}^{2} = σ_{c}^{2} = 25$ , and δ₁ = 3.5. Solving

$$β_{s 1}^{(n)} = Φ \left(\frac{δ_{1}}{\sqrt{σ_{c}^{2} / n_{c} + σ_{t}^{2} / n_{t}}} - Z_{γ}\right) > 1 - α_{1}$$

gives $n_{c} > 3 \times 5^{2} \times {(Z_{γ} + Z_{1 - α_{1}})}^{2} / (2 \times δ_{1}^{2}) = 24.03$ . Thus, the required sample sizes are n_t = 50 and n_c = 25.
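The arithmetic in this example can be checked numerically. The sketch below evaluates the closed-form power (3.7) for a 2 : 1 allocation and searches for the smallest control-arm size that reaches the target power; it takes γ = 0.975 (that is, γ = 1 − α₀) and the function name `bayes_power_known_var` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()

def bayes_power_known_var(n_c, ratio, sigma2_t, sigma2_c, delta1, gamma):
    """Closed-form Bayesian power (3.7) with n_t = ratio * n_c and known variances."""
    n_t = ratio * n_c
    se = sqrt(sigma2_c / n_c + sigma2_t / n_t)
    return Z.cdf(delta1 / se - Z.inv_cdf(gamma))

gamma, alpha1 = 0.975, 0.20          # gamma = 1 - alpha_0 with alpha_0 = 0.025
sigma2, delta1, ratio = 25.0, 3.5, 2

# Analytic lower bound on n_c when both variances equal sigma2:
# n_c > (1 + 1/ratio) * sigma2 * (Z_gamma + Z_{1-alpha1})^2 / delta1^2
bound = (1 + 1 / ratio) * sigma2 * (Z.inv_cdf(gamma) + Z.inv_cdf(1 - alpha1)) ** 2 / delta1 ** 2

# Direct search: smallest n_c whose power is at least 1 - alpha1.
n_c = 1
while bayes_power_known_var(n_c, ratio, sigma2, sigma2, delta1, gamma) < 1 - alpha1:
    n_c += 1
n_t = ratio * n_c

print(f"bound on n_c: {bound:.2f}; chosen n_c = {n_c}, n_t = {n_t}")
```

The factor $(1 + 1 / 2) = 3 / 2$ in the bound is the same as the $3 / 2$ in the expression above, and the search and the rounded-up bound agree.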

3.2. Choice of Noninformative Priors and Model Misspecification

Under the unequal variances assumption, $θ = {(μ_{c}, μ_{t}, σ_{c}^{2}, σ_{t}^{2})}^{'}$ , while under the equal variances assumption, $θ = {(μ_{c}, μ_{t}, σ^{2})}^{'}$ with $σ^{2} = σ_{c}^{2} = σ_{t}^{2}$ . For each case, the fitting prior is specified as $π^{(f)} (θ) \propto {(σ_{c}^{2} σ_{t}^{2})}^{- m}$ and $π^{(f)} (θ) \propto {(σ^{2})}^{- m}$ , respectively, where m = 0 corresponds to a uniform prior, m = 1 corresponds to a reference prior, and m = 3/2 corresponds to a Jeffreys’s prior. The detailed derivation of the Bayesian power function in (2.5) under each of these priors is given in Appendix A. We set n_t = 50, n_c = 25, $μ_{c} = 0, γ = 0.975$ and N = 10⁶ in the computational algorithm in Section 2 for calculating $β_{s}^{(n)}$ in the following calculations.
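For the model with unequal variances, the posterior probability $P (δ > 0 ∣ y^{(n)}, π^{(f)})$ needed in Step 3 can be approximated by direct posterior sampling. The sketch below is not taken from Appendix A; it relies on the standard conjugate calculation for a normal sample under the prior $π (μ, σ^{2}) \propto {(σ^{2})}^{- m}$ , which gives $σ^{2} ∣ y \sim IG (n / 2 + m - 3 / 2, (n - 1) S^{2} / 2)$ and $μ ∣ σ^{2}, y \sim N (\bar{y}, σ^{2} / n)$ independently for each arm. All function names are hypothetical.

```python
import random
from math import sqrt

def draw_mu(rng, ybar, s2, n, m):
    """One posterior draw of an arm mean under the prior (sigma^2)^(-m).

    Assumed conjugate form (standard result, not quoted from the paper):
      sigma^2 | y ~ Inverse-Gamma(shape = n/2 + m - 3/2, scale = (n - 1) * s2 / 2)
      mu | sigma^2, y ~ N(ybar, sigma^2 / n)
    """
    shape = n / 2 + m - 1.5
    scale = (n - 1) * s2 / 2
    sigma2 = scale / rng.gammavariate(shape, 1.0)   # inverse-gamma draw
    return rng.gauss(ybar, sqrt(sigma2 / n))

def posterior_prob_superiority(ybar_t, s2_t, n_t, ybar_c, s2_c, n_c,
                               m=1.5, n_draws=4000, seed=1):
    """Monte Carlo estimate of P(mu_t - mu_c > 0 | data) for unequal variances."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        mu_t = draw_mu(rng, ybar_t, s2_t, n_t, m)
        mu_c = draw_mu(rng, ybar_c, s2_c, n_c, m)
        if mu_t > mu_c:
            hits += 1
    return hits / n_draws

# Illustrative summary statistics (made up for the example, not trial data).
p_equal = posterior_prob_superiority(0.0, 25.0, 50, 0.0, 25.0, 25)
p_far = posterior_prob_superiority(5.0, 25.0, 50, 0.0, 25.0, 25)
print(p_equal, p_far)
```

Setting `m` to 0, 1, or 1.5 switches between the uniform, reference, and Jeffreys’s priors described above; plugging this function into Step 3 of the algorithm in Section 2 would give a nested Monte Carlo estimate of the type I error or power.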

The Bayesian type I errors and powers with three noninformative fitting priors are given in Tables 1 and 2, respectively, each under the models assuming unequal and equal variances. Several interesting observations emerge from these two tables. First, we see from the right block labeled with “Assuming Equal Variances” of Table 1 that (i) when $σ_{c}^{2} > σ_{t}^{2}$ (or $σ_{c}^{2} < σ_{t}^{2}$ ), the type I errors are 0.0281, 0.0325, and 0.0275 (or 0.0195, 0.0164, and 0.0199) under the uniform prior; (ii) when $σ_{c}^{2} = σ_{t}^{2}$ , the type I errors are 0.0236, 0.0235, and 0.0236, respectively, under the uniform prior; and (iii) similar results are obtained under the reference prior and the Jeffreys’s prior, respectively. Thus, when the fitted model is misspecified, the type I errors are greater than 0.025 when one arm (“control”) has a smaller sample size coupled with a larger variance than the other arm (“test”). On the contrary, the type I errors are less than 0.025 when one arm (“test”) has a larger sample size coupled with a larger variance than the other arm (“control”). Second, we see from the left block labeled with “Assuming Unequal Variances” of Table 1 that the type I errors are smaller than, slightly smaller than, and almost at 0.025 under the uniform prior, the reference prior, and the Jeffreys’s prior, respectively, when we fit the model with unequal variances while the true model has equal variances. For example, the type I errors are 0.0195, 0.0235, and 0.0257 under these three priors, respectively, when we fit the model with unequal variances while the true model has $σ_{c}^{2} = σ_{t}^{2} = 20.25$ .

Table 1.

Bayesian type I errors with noninformative fitting priors.

	Assuming Unequal Variances			Assuming Equal Variances
	$σ_{t}^{2}$			$σ_{t}^{2}$
	16	20.25	25	16	20.25	25
$σ_{c}^{2}$	Uniform Prior (m = 0)
16	0.0196	0.0196	0.0195	0.0236	0.0195	0.0164
20.25	0.0196	0.0195	0.0197	0.0281	0.0235	0.0199
25	0.0196	0.0196	0.0196	0.0325	0.0275	0.0236
$σ_{c}^{2}$	Reference Prior (m = 1)
16	0.0231	0.0233	0.0232	0.0247	0.0207	0.0176
20.25	0.0234	0.0235	0.0234	0.0299	0.0250	0.0213
25	0.0235	0.0237	0.0234	0.0340	0.0293	0.0249
$σ_{c}^{2}$	Jeffreys’s Prior (m = 3/2)
16	0.0252	0.0253	0.0248	0.0257	0.0216	0.0180
20.25	0.0257	0.0257	0.0252	0.0307	0.0259	0.0220
25	0.0257	0.0257	0.0254	0.0349	0.0302	0.0259


Note that the sampling prior $π_{0}^{(s)} (δ) = Δ_{{δ = 0}}$ is used in all calculations.

Table 2.

Bayesian powers with noninformative fitting priors.

	Assuming Unequal Variances			Assuming Equal Variances
	$σ_{t}^{2}$			$σ_{t}^{2}$
	16	20.25	25	16	20.25	25
$σ_{c}^{2}$	Uniform Prior (m = 0)
16	92.31%	90.17%	87.59%	93.79%	90.48%	86.22%
20.25	87.42%	85.07%	82.37%	90.98%	87.38%	83.01%
25	81.62%	79.30%	76.63%	87.80%	84.05%	79.70%
$σ_{c}^{2}$	Reference Prior (m = 1)
16	93.46%	91.51%	89.09%	94.13%	90.99%	86.92%
20.25	89.10%	86.90%	84.36%	91.40%	87.96%	83.71%
25	83.83%	81.58%	79.10%	88.25%	84.68%	80.50%
$σ_{c}^{2}$	Jeffreys’s Prior (m = 3/2)
16	93.90%	92.01%	89.71%	94.27%	91.19%	87.18%
20.25	89.75%	87.65%	85.24%	91.58%	88.25%	84.14%
25	84.73%	82.62%	80.17%	88.50%	85.02%	80.87%


Note that the sampling prior $π_{1}^{(s)} (δ) = Δ_{{δ = 3.5}}$ is used in all calculations.

With noninformative priors, a decreased type I error typically corresponds to a lower power, while an increased type I error is associated with a higher power. Tables 1 and 2 show exactly these patterns. For example, the powers are 87.80%, 88.25%, and 88.50% under the uniform prior, the reference prior, and the Jeffreys’s prior, respectively, when we fit the model with equal variances while the true model has $σ_{c}^{2} = 25$ and $σ_{t}^{2} = 16$ . For the same true variances, the powers are 81.62%, 83.83%, and 84.73%, respectively, under these three priors when we fit the model with unequal variances, which are lower than those when we fit the model with equal variances. Note that in this case, the model assuming equal variances leads to increased type I errors, as discussed earlier. Thus, when the true variances are unequal, the misspecified model assuming equal variances could lead to a substantial decrease or increase of the type I error and power, depending on whether $σ_{c}^{2} < σ_{t}^{2}$ or $σ_{c}^{2} > σ_{t}^{2}$ . Moreover, the different noninformative priors lead to different type I errors and powers. From Tables 1 and 2, we see that the type I errors and powers under the uniform prior and the reference prior are lower than those under the Jeffreys’s prior, even when the fitted model is correctly specified. Furthermore, the type I errors and powers under the Jeffreys’s prior are very close to those under the frequentist method. Therefore, the Jeffreys’s prior is a more desirable noninformative prior than the other two noninformative priors we consider. Subsequently, the Jeffreys’s prior will be used as the default noninformative initial prior in constructing the informative priors.

Using the frequentist method, we use simulation to calculate the type I error and power under the models with equal and unequal variances. We first introduce the following notation: let $S_{c} = \left\{\frac{1}{n_{c} - 1} \sum_{i = 1}^{n_{c}} {(y_{c i} - {\bar{y}}_{c})}^{2}\right\}^{1 / 2}$ and $S_{t} = \left\{\frac{1}{n_{t} - 1} \sum_{j = 1}^{n_{t}} {(y_{t j} - {\bar{y}}_{t})}^{2}\right\}^{1 / 2}$ denote the sample standard deviations for the control and test arms, respectively. Also let $se ({\bar{y}}_{t} - {\bar{y}}_{c}) = \sqrt{\frac{(n_{c} - 1) S_{c}^{2} + (n_{t} - 1) S_{t}^{2}}{n_{c} + n_{t} - 2} (1 / n_{c} + 1 / n_{t})}$ and $se ({\bar{y}}_{t} - {\bar{y}}_{c}) = \sqrt{S_{c}^{2} / n_{c} + S_{t}^{2} / n_{t}}$ denote the standard errors under the models with equal and unequal variances, respectively, where ${\bar{y}}_{c}$ and ${\bar{y}}_{t}$ are defined in Section 3.1. Denote the critical value by $t_{1 - α_{0}}$ such that $F_{t_{df}} (t_{1 - α_{0}}) = 1 - α_{0}$ , where $0 < α_{0} < 1$ and $F_{t_{df}}$ is the cdf of a central t distribution with df degrees of freedom. Here df is taken as $n_{t} + n_{c} - 2$ for the model with equal variances and as $\frac{{(S_{c}^{2} / n_{c} + S_{t}^{2} / n_{t})}^{2}}{S_{c}^{4} / [n_{c}^{2} (n_{c} - 1)] + S_{t}^{4} / [n_{t}^{2} (n_{t} - 1)]}$ for the model with unequal variances.

Using the corresponding df and $se ({\bar{y}}_{t} - {\bar{y}}_{c})$ for the model assuming equal variances or the model assuming unequal variances, the simulation algorithm is given as follows:

Step 1. Generate random samples y_c from $N (μ_{c}, σ_{c}^{2})$ and y_t from $N (μ_{c}, σ_{t}^{2})$ for the type I error or $N (μ_{t}, σ_{t}^{2})$ for the power;

Step 2. Calculate ${\bar{y}}_{c}$ , S_c, ${\bar{y}}_{t}$ , S_t, and $se ({\bar{y}}_{t} - {\bar{y}}_{c})$ for the corresponding model;

Step 3. Calculate the test statistic $T^{*} = \frac{{\bar{y}}_{t} - {\bar{y}}_{c}}{se ({\bar{y}}_{t} - {\bar{y}}_{c})}$ and its corresponding degrees of freedom df; compare T* with $t_{1 - α_{0}}$ for the corresponding model;

Step 4. Repeat Steps 1-3 10⁷ times. The type I error or power is the proportion of runs with $T^{*} \geq t_{1 - α_{0}}$ .
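A minimal Python sketch of this simulation is given below. It uses far fewer replications than the 10⁷ stated above, and, to stay within the standard library, it replaces the comparison with the critical value by the equivalent check $F_{t_{df}} (T^{*}) \geq 1 - α_{0}$ , with the t cdf computed by a simple Simpson's-rule helper. The helper `t_cdf` and the function `rejection_rate` are illustrative names; in practice a library t distribution would normally be used instead.

```python
import random
from math import exp, lgamma, log, log1p, pi, sqrt

def t_cdf(x, df, steps=200):
    """CDF of a central t distribution via Simpson's rule (illustrative helper)."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)

    def pdf(u):
        return exp(log_const - (df + 1) / 2 * log1p(u * u / df))

    h = abs(x) / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(abs(x))
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    area = total * h / 3                    # integral of the pdf over [0, |x|]
    return 0.5 + area if x >= 0 else 0.5 - area

def mean_and_var(y):
    """Sample mean and unbiased sample variance."""
    m = sum(y) / len(y)
    return m, sum((v - m) ** 2 for v in y) / (len(y) - 1)

def rejection_rate(mu_t, mu_c, sigma2_t, sigma2_c, n_t, n_c, equal_var,
                   alpha0=0.025, n_sims=3000, seed=1):
    """Proportion of simulated trials with T* >= t_{1 - alpha0}."""
    rng = random.Random(seed)
    sd_t, sd_c = sqrt(sigma2_t), sqrt(sigma2_c)
    rejections = 0
    for _ in range(n_sims):                                   # Step 4
        y_c = [rng.gauss(mu_c, sd_c) for _ in range(n_c)]     # Step 1
        y_t = [rng.gauss(mu_t, sd_t) for _ in range(n_t)]
        ybar_c, s2_c = mean_and_var(y_c)                      # Step 2
        ybar_t, s2_t = mean_and_var(y_t)
        if equal_var:   # pooled standard error, df = n_t + n_c - 2
            pooled = ((n_c - 1) * s2_c + (n_t - 1) * s2_t) / (n_c + n_t - 2)
            se = sqrt(pooled * (1 / n_c + 1 / n_t))
            df = n_c + n_t - 2
        else:           # unpooled standard error with approximate df
            a, b = s2_c / n_c, s2_t / n_t
            se = sqrt(a + b)
            df = (a + b) ** 2 / (a * a / (n_c - 1) + b * b / (n_t - 1))
        t_star = (ybar_t - ybar_c) / se                       # Step 3
        # T* >= t_{1-alpha0} is equivalent to F(T*) >= 1 - alpha0.
        if t_star > 0 and t_cdf(t_star, df) >= 1 - alpha0:
            rejections += 1
    return rejections / n_sims

# True variances sigma_c^2 = 25 and sigma_t^2 = 16, with n_t = 50 and n_c = 25.
type1_unequal = rejection_rate(0.0, 0.0, 16.0, 25.0, 50, 25, equal_var=False)
type1_equal = rejection_rate(0.0, 0.0, 16.0, 25.0, 50, 25, equal_var=True)
power_unequal = rejection_rate(3.5, 0.0, 16.0, 25.0, 50, 25, equal_var=False)
print(type1_unequal, type1_equal, power_unequal)
```

With only a few thousand replications the estimates carry visible Monte Carlo error, so they should be read as rough checks rather than as reproductions of the reported values.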

For the case of $α_{0} = 0.025$ , $n_{t} = 50$ , n_c = 25, μ_c = 0, $σ_{c}^{2} = 25$ , and δ = 3.5, Figure 2 (a) shows the type I error under the model assuming equal variances. When the true $σ_{t}^{2}$ differs from $σ_{c}^{2}$ , the type I error under this misspecified model is inflated when $σ_{t}^{2} < σ_{c}^{2} = 25$ , increasingly so as $σ_{c}^{2} - σ_{t}^{2}$ grows, and is deflated when $σ_{t}^{2} > σ_{c}^{2} = 25$ . The type I errors under the model assuming unequal variances stay at 0.025 since the model is correctly specified. Figure 2 (b) shows the power under the models assuming equal variances and unequal variances. Compared with the powers under the model with unequal variances (cyan line), the powers under the model with equal variances (red line) are increased when $σ_{t}^{2} < σ_{c}^{2} = 25$ and decreased when $σ_{t}^{2} > σ_{c}^{2} = 25$ . Thus, the misspecified model may increase or decrease the frequentist type I error and power, which is similar to what is observed in the Bayesian framework.

Figure 2. Plots of frequentist type I errors and powers versus $σ_{t}$ .

4. Bayesian SSD with Informative Priors

4.1. Leveraging Historical Data

Historical data can inform the fitting priors in various ways, including but not limited to full borrowing (Ibrahim and Chen, 2000), partial borrowing (Ibrahim et al., 2012; Chen et al., 2014; Ibrahim et al., 2015), dynamic borrowing (Viele et al., 2014; Pan et al., 2017; Lim et al., 2018), conditional borrowing (Allocco et al., 2010), and propensity score matching (Wang et al., 2019). Suppose that the historical data from a natural history study consist of a sample size of 44, a sample mean of −0.18, and a sample standard deviation of 3.38 for the primary endpoint. We next discuss how to leverage the historical data in Bayesian SSD via informative priors.

We first examine the effect of the amount of borrowing of the historical data on the gain in power within the frequentist SSD framework. For a randomized controlled trial, suppose the starting sample size ratio of the test arm versus the concurrent control arm is 2 : 1. When the concurrent control is augmented using historical data by 50% or 100% of the concurrent control sample size, the sample size ratios of the test arm versus the augmented control arm are then 2 : 1.5 or 2 : 2, respectively. Set δ = 3.5 and $σ_{t}^{2} = 25$ . Table 3 shows the frequentist powers for $σ_{c}^{2} = 11.42$ , 16, 20.25, and 25; n_t = 32, 40, 44, and 52; and n_t : n_c = 2 : 1, n_t : n_c = 2 : 1.5, and n_t : n_c = 2 : 2. Note that we assume (i) all random samples for the control arm in Table 3 are from the same normal distribution $N (μ_{c}, σ_{c}^{2})$ and (ii) the model with unequal variances is used in all power calculations. From Table 3, we observe that (i) when $σ_{c}^{2} = 11.42$ , the powers are 79.8%, 86.5%, and 89.7% for n_t : n_c = 2 : 1, n_t : n_c = 2 : 1.5, and n_t : n_c = 2 : 2, respectively, and the gains in power are 6.7% and 3.2% when the sample size on the control arm increases from 16 to 24 and from 24 to 32, respectively; and (ii) the results remain similar for the other three values of $σ_{c}^{2}$ . Thus, in all four cases, the largest gain in power is achieved at the first 50% increase in sample size on the control arm, i.e., when n_t : n_c = 2 : 1 increases to n_t : n_c = 2 : 1.5.

Table 3.

Frequentist powers with different amounts of borrowing $(δ = 3.5, σ_{t}^{2} = 25)$ .

	n_t : n_c = 2 : 1		n_t : n_c = 2 : 1.5		n_t : n_c = 2 : 2
$σ_{c}^{2}$	N	Power	N	Power	N	Power
11.42	32 + 16 = 48	79.8%	32 + 24 = 56	86.5%	32 + 32 = 64	89.7%
16	40 + 20 = 60	81.9%	40 + 30 = 70	89.4%	40 + 40 = 80	92.7%
20.25	44 + 22 = 66	80.2%	44 + 33 = 77	88.8%	44 + 44 = 88	92.7%
25	52 + 26 = 78	82.1%	52 + 39 = 91	90.5%	52 + 52 = 104	94.2%


We next explore the amount of borrowing within the Bayesian framework. The power prior and the robust mixture prior are considered to leverage the historical data. Let $y_{0} = {(y_{01}, y_{02}, \dots, y_{0 n_{0}})}^{'}$ be the historical data with sample size of n₀ from the control arm, and assume $y_{0 i} \overset{i . i . d .}{\sim} N (μ_{c}, σ_{c}^{2})$ . Let ${\bar{y}}_{0}$ and $S_{0 c}^{2}$ denote the sample mean and the sample variance of y₀. Write the historical data as $D_{0} = (n_{0}, {\bar{y}}_{0}, S_{0 c}^{2})$ . Denote the likelihood of the historical data by $L (μ_{c}, σ_{c}^{2} ∣ D_{0})$ . Under the unequal variances assumption, $θ = {(μ_{c}, μ_{t}, σ_{c}^{2}, σ_{t}^{2})}^{'}$ . The power prior (Ibrahim and Chen, 2000) is given by

$$π^{(f)} (μ_{c}, σ_{c}^{2} ∣ D_{0}, a_{0}) \propto {[L (μ_{c}, σ_{c}^{2} ∣ D_{0})]}^{a_{0}} π_{0}^{(f)} (μ_{c}, σ_{c}^{2}), \tag{4.1}$$

where $π_{0}^{(f)} (μ_{c}, σ_{c}^{2})$ is an initial prior and 0 ≤ a₀ ≤ 1 is the discounting parameter, which determines the amount of borrowing. The robust mixture prior (Greenhouse and Wasserman, 1995; Ye et al., 2020) given p₀ is defined by

$$π^{(f)} (μ_{c}, σ_{c}^{2} ∣ D_{0}, p_{0}) = p_{0} π^{(f)} (μ_{c}, σ_{c}^{2} ∣ D_{0}) + (1 - p_{0}) π_{non}^{(f)} (μ_{c}, σ_{c}^{2}), \tag{4.2}$$

where $π^{(f)} (μ_{c}, σ_{c}^{2} ∣ D_{0}) = π^{(f)} (μ_{c}, σ_{c}^{2} ∣ D_{0}, a_{0} = 1)$ is the power prior in (4.1) with a₀ = 1 and initial prior $π_{0}^{(f)} (μ_{c}, σ_{c}^{2}) \propto {(σ_{c}^{2})}^{- 3 / 2}$ , and the weight $0 \leq p_{0} \leq 1$ determines the amount of borrowing. In (4.2), the skeptical prior, to which the mixture reduces when p₀ = 0, is specified as $π_{non}^{(f)} (μ_{c}, σ_{c}^{2}) = π_{non}^{(f)} (μ_{c} ∣ σ_{c}^{2}) π_{non}^{(f)} (σ_{c}^{2})$ , where $π_{non}^{(f)} (μ_{c} ∣ σ_{c}^{2}) \propto \frac{1}{σ_{c}} \exp \{- μ_{c}^{2} / (2 \times 1000 σ_{c}^{2})\}$ and $π_{non}^{(f)} (σ_{c}^{2}) \propto \frac{1}{{(σ_{c}^{2})}^{1.001}} \times \exp (- 0.001 / σ_{c}^{2})$ , which correspond to the normal distribution $N (0, 1000 σ_{c}^{2})$ and the inverse gamma distribution IG (0.001, 0.001), respectively. We further assume that $(μ_{c}, σ_{c}^{2})$ and $(μ_{t}, σ_{t}^{2})$ are independent a priori. Thus, we have $π^{(f)} (μ_{c}, σ_{c}^{2}, μ_{t}, σ_{t}^{2}) = π^{(f)} (μ_{c}, σ_{c}^{2}) π^{(f)} (μ_{t}, σ_{t}^{2})$ . Since the historical data are available only for the control arm, $π^{(f)} (μ_{c}, σ_{c}^{2})$ is taken as the robust mixture prior given in (4.2) and a noninformative prior, $π^{(f)} (μ_{t}, σ_{t}^{2}) \propto {(σ_{t}^{2})}^{- 3 / 2}$ , is specified for $(μ_{t}, σ_{t}^{2})$ .

Figure 3 (a) and (b) show the type I errors and the powers for different values of μ_c using the power prior given in (4.1) with a₀ = 0 (red lines), a₀ = 0.18 (green lines) and a₀ = 0.36 (blue lines) when n_t = 32, n_c = 16, $n_{0} = 44, σ_{t}^{2} = 25, σ_{c}^{2} = 11.42, {\bar{y}}_{0} = - 0.18, S_{0 c}^{2} = 11.42$ , and δ = 3.5. Comparing the power gain from a₀ = 0 to a₀ = 8 / 44 = 0.18 with the power gain from a₀ = 8 / 44 = 0.18 to a₀ = 16 / 44 = 0.36, we see from Figure 3 (b) that most of the gain in Bayesian power is achieved with the first 50% increase in sample size on the concurrent control arm, i.e., when a₀ increases from 0 to 0.18. Specifically, the powers with a₀ = 0, 0.18, and 0.36 are 79.78%, 85.92%, and 88.73%, respectively, at μ_c = −0.51; 79.79%, 87.99%, and 91.51%, respectively, at μ_c = −0.18; and 79.76%, 91.59%, and 95.63%, respectively, at μ_c = 0.52. Thus, the power gains from a₀ = 0 to a₀ = 0.18 are 6.14%, 8.20%, and 11.83%, respectively, at μ_c = −0.51, −0.18, and 0.52, while the power gains from a₀ = 0.18 to a₀ = 0.36 are 2.81%, 3.52%, and 4.04%, respectively, at μ_c = −0.51, −0.18, and 0.52. Note that the type I errors with a₀ = 0, 0.18, and 0.36 are 0.025, 0.013, and 0.011, respectively, at μ_c = −0.51; 0.025, 0.018, and 0.017, respectively, at μ_c = −0.18; and 0.025, 0.030, and 0.038, respectively, at μ_c = 0.52. In the above discussion, we consider μ_c = −0.51, −0.18 and 0.52, since these values of μ_c lead to a negative bias, no bias, and a positive bias, respectively, compared with the historical mean. In Figure 3 (a), the largest values of μ_c for which the type I error remains below 0.025 are 0.27 and 0.15, respectively, for a₀ = 0.18 and 0.36.
These results indicate that (i) borrowing the historical data may lead to a gain in power and a reduction in type I error at the same time; for example, the power increases but the type I error decreases in a₀ when μ_c = −0.51; (ii) borrowing the historical data may lead to a gain in power and an increase in type I error; for example, both the power and the type I error increase in a₀ when μ_c = 0.52; (iii) the type I error can still be less than 0.025 (a prespecified significance level) even when the historical data and the data from the concurrent control arm are not similar; for example, the type I errors are less than 0.025 for μ_c < 0.27 when a₀ = 0.18; and (iv) when a₀ = 0, the type I errors are around 0.025 and the powers are about 0.80, which may be due to the fact that a noninformative prior is specified as the initial prior.
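The arithmetic behind these comparisons can be retraced with a short script. The sketch below only restates numbers reported in the text: it converts the number of borrowed historical subjects (8 and 16, i.e., 50% and 100% of n_c = 16) into a₀ and recomputes the power gains from the reported powers. The variable names are illustrative.

```python
n0, nc = 44, 16

# Discounting parameter implied by borrowing 8 or 16 of the 44 historical subjects.
a0_values = {borrowed: round(borrowed / n0, 2) for borrowed in (8, 16)}
# Relative increase of the concurrent control sample size for each amount borrowed.
relative_increase = {borrowed: borrowed / nc for borrowed in (8, 16)}

# Reported Bayesian powers (%) for a0 = 0, 0.18, 0.36, keyed by mu_c.
powers = {
    -0.51: (79.78, 85.92, 88.73),
    -0.18: (79.79, 87.99, 91.51),
    0.52: (79.76, 91.59, 95.63),
}

# Gain from the first half of the borrowing and from the second half.
gains = {
    mu_c: (round(p18 - p0, 2), round(p36 - p18, 2))
    for mu_c, (p0, p18, p36) in powers.items()
}

for mu_c, (first, second) in gains.items():
    print(f"mu_c = {mu_c:5.2f}: first half {first:5.2f}%, second half {second:5.2f}%")
```

In each row the first gain exceeds the second, which is the sense in which the first half of the borrowing contributes most of the power gain.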

Fig. 3 — Plots of Bayesian type I errors and powers with the power prior versus μ_c.

Figure 4 (a) and (b) show the plots of the type I error and power using the robust mixture prior in (4.2) for p₀ = 0, 0.18, 0.36, 0.5, and 1. The type I errors for p₀ = 0, 0.18, 0.36, 0.5, and 1 are 0.025, 0.022, 0.019, 0.017, and 0.009, respectively, at μ_c = −0.51; 0.025, 0.024, 0.022, 0.022, and 0.018, respectively, at μ_c = −0.18; and 0.025, 0.031, 0.038, 0.043, and 0.060, respectively, at μ_c = 0.52. The powers for p₀ = 0, 0.18, 0.36, 0.5, and 1 are 79.82%, 81.92%, 84.13%, 85.83%, and 91.82%, respectively, at μ_c = −0.51; 79.79%, 82.56%, 85.29%, 87.39%, and 95.08%, respectively, at μ_c = −0.18; and 79.79%, 83.13%, 86.52%, 89.14% and 98.59%, respectively, at μ_c = 0.52. Compared to those using the power prior, (i) the robust mixture prior leads to less gain in power when p₀ = a₀ ≤ 0.36; (ii) the powers using the robust mixture prior with p₀ = 0.5 are lower than those using the power prior with a₀ = 0.36; (iii) the type I errors are closer to 0.025 using the robust mixture prior than the power prior when p₀ = a₀ ≤ 0.36; and (iv) the skeptical prior given in (4.2) is quite noninformative in the sense that the type I errors are around 0.025 and the powers remain around 80% for all μ_c.

Fig. 4 — Plots of Bayesian type I errors and powers with the robust mixture prior versus μ_c.

4.2. Conditional Borrowing

Under the assumption of compatibility between the natural history data and the concurrent control data, conditions on the primary endpoint can be further imposed in borrowing the historical data. We take

T (y^{(n)}) = P (δ > 0 ∣ y^{(n)}, π^{(f_{1})}) 1 \{ y^{(n)} \in A \} + P (δ > 0 ∣ y^{(n)}, π^{(f_{2})}) 1 \{ y^{(n)} \in A^{C} \},

(4.3)

where $π^{(f_{1})}$ and $π^{(f_{2})}$ are two fitting priors, A denotes a subset of the sample space induced by y⁽ⁿ⁾ , and A^C is the complement of A. We let T₀ = γ , where 0 < γ < 1 is a Bayesian credible level. Thus the Bayesian power function is

β_{s}^{(n)} = E_{π^{(s)}} [1 \{ T (y^{(n)}) \geq γ \}] = E_{π^{(s)}} [1 \{ P (δ > 0 ∣ y^{(n)}, π^{(f_{1})}) \geq γ \} 1 \{ y^{(n)} \in A \} + 1 \{ P (δ > 0 ∣ y^{(n)}, π^{(f_{2})}) \geq γ \} 1 \{ y^{(n)} \in A^{C} \}] .

Suppose we impose a conditional borrowing region of $\pm S_{0 c} / (2 \sqrt{n_{0}})$ for the mean and $\pm S_{0 c}^{2} / \sqrt{2 (n_{0} - 1)}$ for the variance of the concurrent control. Thus, $A = \{ y^{(n)} : {\bar{y}}_{0} - S_{0 c} / (2 \sqrt{n_{0}}) < {\bar{y}}_{c} < {\bar{y}}_{0} + S_{0 c} / (2 \sqrt{n_{0}}), S_{0 c}^{2} - S_{0 c}^{2} / \sqrt{2 (n_{0} - 1)} < S_{c}^{2} < S_{0 c}^{2} + S_{0 c}^{2} / \sqrt{2 (n_{0} - 1)} \}$ . Using the historical data $D_{0} = (n_{0} = 44, {\bar{y}}_{0} = - 0.18, S_{0 c}^{2} = 11.42)$ in Section 4.1, $A = \{ y^{(n)} : - 0.43 < {\bar{y}}_{c} < 0.07, 10.19 < S_{c}^{2} < 12.66 \}$ . In (4.3), we specify the power prior in (4.1) for $π^{(f_{1})}$ and Jeffreys’s prior ${(σ_{t}^{2} σ_{c}^{2})}^{- 3 / 2}$ for $π^{(f_{2})}$ . Figure 5 plots the type I errors and powers versus μ_c for a₀ = 0, 0.18, and 0.36 with conditional borrowing when n_t = 32, n_c = 16, $n_{0} = 44$ , $σ_{t}^{2} = 25$ , $σ_{c}^{2} = 11.42$ , ${\bar{y}}_{0} = - 0.18$ , $S_{0 c}^{2} = 11.42$ , and δ = 3.5.
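The bounds of the region A can be reproduced directly from D₀. The following sketch computes the two half-widths and checks whether a pair of concurrent control statistics falls in A; the function names are illustrative and not part of the paper.

```python
import math

def borrowing_region(n0, ybar0, s2_0):
    """Return ((mean_lo, mean_hi), (var_lo, var_hi)) for the conditional borrowing region A."""
    half_mean = math.sqrt(s2_0) / (2.0 * math.sqrt(n0))   # S_0c / (2 sqrt(n0))
    half_var = s2_0 / math.sqrt(2.0 * (n0 - 1))            # S_0c^2 / sqrt(2 (n0 - 1))
    return ((ybar0 - half_mean, ybar0 + half_mean),
            (s2_0 - half_var, s2_0 + half_var))

def in_region(ybar_c, s2_c, region):
    """True when the concurrent control mean and variance both fall strictly inside A."""
    (m_lo, m_hi), (v_lo, v_hi) = region
    return m_lo < ybar_c < m_hi and v_lo < s2_c < v_hi

region = borrowing_region(n0=44, ybar0=-0.18, s2_0=11.42)
(m_lo, m_hi), (v_lo, v_hi) = region
print(f"mean in ({m_lo:.2f}, {m_hi:.2f}), variance in ({v_lo:.2f}, {v_hi:.2f})")
```

With the rounded inputs shown in the text, the upper variance bound evaluates to about 12.65 rather than the quoted 12.66; the small difference presumably reflects unrounded historical summaries in the original computation.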

Fig. 5 — Plots of Bayesian type I errors and powers with conditional borrowing versus μ_c.

Note that the powers with a₀ = 0, 0.18, and 0.36 are 79.79%, 80.11%, and 80.25%, respectively, at μ_c = −0.51; 79.79%, 80.04%, and 80.13%, respectively, at μ_c = −0.18; and 79.79%, 79.84%, and 79.86%, respectively, at μ_c = 0.52. Thus, the power gains from a₀ = 0 to a₀ = 0.18 are 0.32%, 0.25%, and 0.05%, respectively, at μ_c = −0.51, −0.18, and 0.52 while the power gains from a₀ = 0.18 to a₀ = 0.36 are close to 0 at μ_c = −0.51, −0.18, and 0.52. Also note that the type I errors with a₀ = 0 , 0.18, and 0.36 are 0.025, 0.025, and 0.025, respectively, at μ_c = −0.51; 0.025, 0.025, and 0.025, respectively, at μ_c = −0.18; and 0.025, 0.025, and 0.026, respectively, at μ_c = 0.52. Compared with those obtained using the power prior and the robust mixture prior shown in Figures 3 and 4, (i) the type I errors shown in Figure 5 are better controlled with a small increase toward larger μ_c; and (ii) the powers shown in Figure 5 are lower than those under unconditional borrowing.
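The operating characteristics above are obtained by simulation. A minimal Monte Carlo sketch of the power function with conditional borrowing is given below. It is not the paper's algorithm: it assumes point-mass sampling priors at the design values, δ = μ_t − μ_c, a credible level γ = 0.975, and the conjugate posterior implied by the power prior (4.1) with initial prior ${(σ^{2})}^{- 3 / 2}$; the function names and simulation sizes are illustrative.

```python
import numpy as np

# Historical control summary D0 from Section 4.1.
N0, YBAR0, S2_0 = 44, -0.18, 11.42

def simulate_stats(rng, n, mu, sigma2, n_sim):
    """Sample means and sample variances of n_sim normal samples of size n."""
    ybar = rng.normal(mu, np.sqrt(sigma2 / n), n_sim)
    s2 = sigma2 * rng.chisquare(n - 1, n_sim) / (n - 1)
    return ybar, s2

def mean_draws(rng, n, ybar, s2, a0, draws):
    """Posterior draws of an arm mean, one row per simulated trial.

    a0 may differ by trial; a0 = 0 gives the no-borrowing posterior.
    """
    a0 = np.broadcast_to(np.asarray(a0, dtype=float), ybar.shape)
    n_eff = n + a0 * N0
    mean = (n * ybar + a0 * N0 * YBAR0) / n_eff
    scale = 0.5 * ((n - 1) * s2 + a0 * (N0 - 1) * S2_0
                   + n * a0 * N0 / n_eff * (ybar - YBAR0) ** 2)
    shape = 0.5 * n_eff
    sigma2 = scale[:, None] / rng.gamma(shape[:, None], 1.0, (ybar.size, draws))
    return rng.normal(mean[:, None], np.sqrt(sigma2 / n_eff[:, None]))

def rejection_rate(mu_c, delta, a0, nt=32, nc=16, s2t=25.0, s2c=11.42,
                   gamma=0.975, n_sim=2000, draws=1000, seed=1):
    """Estimate P(T(y) >= gamma) under conditional borrowing.

    Returns (rejection rate, fraction of simulated trials that fall in A).
    delta = 0 gives the type I error; delta > 0 gives the power.
    """
    rng = np.random.default_rng(seed)
    yt, st2 = simulate_stats(rng, nt, mu_c + delta, s2t, n_sim)
    yc, sc2 = simulate_stats(rng, nc, mu_c, s2c, n_sim)
    half_mean = np.sqrt(S2_0) / (2.0 * np.sqrt(N0))
    half_var = S2_0 / np.sqrt(2.0 * (N0 - 1))
    in_a = (np.abs(yc - YBAR0) < half_mean) & (np.abs(sc2 - S2_0) < half_var)
    a0_used = np.where(in_a, a0, 0.0)      # borrow only when the control data fall in A
    mu_t = mean_draws(rng, nt, yt, st2, 0.0, draws)
    mu_c_post = mean_draws(rng, nc, yc, sc2, a0_used, draws)
    post_prob = np.mean(mu_t - mu_c_post > 0, axis=1)
    return float(np.mean(post_prob >= gamma)), float(np.mean(in_a))
```

Calling `rejection_rate(mu_c=-0.18, delta=3.5, a0=0.36)` returns an estimated power together with the fraction of simulated trials in which borrowing is triggered. Under these assumptions that fraction is small, which is in line with the modest power gains reported in the text.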

4.3. Borrowing-by-Parts Power Priors

Let $D = (n_{t}, {\bar{y}}_{t}, S_{t}^{2}, n_{c}, {\bar{y}}_{c}, S_{c}^{2})$ denote the current data, where ${\bar{y}}_{c}$ and ${\bar{y}}_{t}$ are the sample means and $S_{c}^{2}$ and $S_{t}^{2}$ are the sample variances, respectively, for the concurrent control and the test arm. In (4.1), the historical data is borrowed all together via the power prior. A new variation of the power prior is the borrowing-by-parts power prior via distinct discounting parameters a₀₁ and a₀₂, given by

π^{(f)} (μ_{c}, σ_{c}^{2} ∣ D_{0}, a_{0}) \propto {[{(σ_{c}^{2})}^{- 1 / 2} \exp \{ - \frac{n_{0} {({\bar{y}}_{0} - μ_{c})}^{2}}{2 σ_{c}^{2}} \}]}^{a_{01}} \times {[{(S_{0 c}^{2})}^{(n_{0} - 3) / 2} {(σ_{c}^{2})}^{- (n_{0} - 1) / 2} \exp \{ - \frac{(n_{0} - 1) S_{0 c}^{2}}{2 σ_{c}^{2}} \}]}^{a_{02}} π_{0}^{(f)} (μ_{c}, σ_{c}^{2}),

(4.4)

where $0 \leq a_{01}, a_{02} \leq 1$ . In (4.4), the distinct discounting parameters a₀₁ and a₀₂ are used in borrowing the mean and the variance, respectively. In the case where the mean of the concurrent control is consistent with that of the historical data but the variance is not, a₀₁ > 0 and a₀₂ = 0 can be specified to borrow the mean part without borrowing the variance part. Compared with full borrowing, where a₀₁ = a₀₂ > 0, borrowing only the mean part, where a₀₁ > 0 and a₀₂ = 0, allows for achieving a desirable power while controlling the type I error at the same time.
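To illustrate how a₀₁ and a₀₂ enter separately, the sketch below computes the posterior hyperparameters for $(μ_{c}, σ_{c}^{2})$ when the prior (4.4) is combined with the concurrent control data. The closed form is our own conjugate derivation, assuming the initial prior $π_{0}^{(f)} \propto {(σ_{c}^{2})}^{- 3 / 2}$; the factor ${(S_{0 c}^{2})}^{(n_{0} - 3) / 2}$ does not involve the parameters and is dropped. The function name and the dictionary keys are illustrative.

```python
def bbp_posterior(nc, ybar_c, s2_c, n0, ybar0, s2_0, a01, a02):
    """Posterior hyperparameters under the borrowing-by-parts power prior (4.4).

    Assumes initial prior (sigma^2)^(-3/2). The result describes
      mu | sigma^2 ~ N(mean, sigma^2 / n_mean),  sigma^2 ~ IG(shape, scale).
    """
    n_mean = nc + a01 * n0                      # only a01 adds information on the mean
    mean = (nc * ybar_c + a01 * n0 * ybar0) / n_mean
    # a01 contributes one power of (sigma^2)^(-1/2); a02 contributes n0 - 1 of them.
    shape = 0.5 * (nc + a01 + a02 * (n0 - 1))
    scale = 0.5 * ((nc - 1) * s2_c + a02 * (n0 - 1) * s2_0
                   + nc * a01 * n0 / n_mean * (ybar_c - ybar0) ** 2)
    return {"mean": mean, "n_mean": n_mean, "shape": shape, "scale": scale}

# Hypothetical concurrent control statistics with the historical summary of Section 4.1.
mean_only = bbp_posterior(16, 0.0, 11.42, 44, -0.18, 11.42, a01=0.36, a02=0.0)
var_only = bbp_posterior(16, 0.0, 11.42, 44, -0.18, 11.42, a01=0.0, a02=0.36)
full = bbp_posterior(16, 0.0, 11.42, 44, -0.18, 11.42, a01=0.36, a02=0.36)
```

With a₀₁ = a₀₂ = a₀ the shape parameter reduces to (n_c + a₀ n₀)/2, the value obtained from the power prior (4.1). With a₀₁ = 0 the posterior mean location and its precision are those of the concurrent control data alone, and only the information on the variance increases.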

Table 4 reports the type I errors and powers using the borrowing-by-parts power prior with consistent and inconsistent variances of the concurrent control compared with that of the historical control for n_t = 32, n_c = 16, $n_{0} = 44, {\bar{y}}_{0} = - 0.18, S_{0 c}^{2} = 11.42, μ_{c} = 0$ , and Jeffreys’s prior. When $σ_{t}^{2} = σ_{c}^{2} = 25$ , we see from Table 4 that (i) the type I error decreases and the power increases in a₀₁ when a₀₂ = 0; (ii) both the type I error and the power increase in a₀₂ when a₀₁ = 0; and (iii) the type I errors are less than 0.025 but the power increases considerably when a₀₁ = a₀₂ gets larger. When $σ_{t}^{2} = 25$ and $σ_{c}^{2} = 11.42$ , we also see from Table 4 that (i) the type I errors are controlled at 0.025 and the powers under various values of a₀₁ and a₀₂ such that a₀₁ + a₀₂ > 0 (borrowing) are greater than those under a₀₁ = a₀₂ = 0 (no borrowing); (ii) the gain in the power is incremental when a₀₂ > 0 and a₀₁ = 0. These results indicate that borrowing the mean part of the historical data only or the whole historical data is more effective in increasing the power than borrowing the variance part of the historical data only. The results with μ_c = −1 and μ_c = 1 are given in Appendix C. Figure 6(a) plots the type I error against a₀₁ and a₀₂ and Figure 6(b) plots the type I error against $a_{01} = a_{02} = a_{0}$ for $σ_{t}^{2} = σ_{c}^{2} = 25$ . When $a_{01} = a_{02} = a_{0}$ , the type I error stays at or below 0.025 as a₀ increases. When a₀₂ = 0 , the type I error first decreases and then increases slightly as a₀₁ increases. When a₀₁ = 0 , the type I error increases as a₀₂ increases.

Table 4.

Bayesian type I errors and powers using the borrowing-by-parts power prior.

$μ_{c} = 0, σ_{t}^{2} = 25, σ_{c}^{2} = 25$
a₀₁	a₀₂	Type I Error	Power (δ = 3.5)	Power (δ = 4)	Power (δ = 4.5)
0.00	0.00	0.0255	60.44%	71.93%	81.50%
0.18	0.00	0.0162	74.71%	85.46%	92.60%
0.36	0.00	0.0143	83.01%	91.82%	96.62%
0.00	0.18	0.0329	65.78%	76.67%	85.27%
0.00	0.36	0.0376	68.42%	78.88%	86.99%
0.18	0.18	0.0213	78.97%	88.55%	94.50%
0.36	0.36	0.0212	87.43%	94.44%	97.93%
$μ_{c} = 0, σ_{t}^{2} = 25, σ_{c}^{2} = 11.42$
a₀₁	a₀₂	Type I Error	Power (δ = 3.5)	Power (δ = 4)	Power (δ = 4.5)
0.00	0.00	0.0247	79.79%	89.11%	94.84%
0.18	0.00	0.0202	88.68%	95.25%	98.33%
0.36	0.00	0.0205	92.40%	97.30%	99.21%
0.00	0.18	0.0243	80.24%	89.49%	95.09%
0.00	0.36	0.0242	80.45%	89.64%	95.18%
0.18	0.18	0.0202	89.03%	95.47%	98.45%
0.36	0.36	0.0206	92.72%	97.47%	99.29%


Fig. 6 — Bayesian type I errors with borrowing mean and/or variance as a function of $(a_{01}, a_{02})$ (a) and $a_{0} = a_{01} = a_{02}$ (b).

5. Discussion

In this paper, we develop a roadmap of Bayesian SSD as shown in Figure 1. We analytically explore the properties of the Bayesian type I error and power with noninformative priors when the variances are known. We also examine the impact of model misspecification and the choice of noninformative priors on the Bayesian type I error and power. For misspecified models, although it is a common practice that the variances for the test group and the control group are assumed to be equal under the normal distributions for a superiority trial, we empirically show that the type I error and power can be increased or decreased depending on the relationship of $σ_{c}^{2}$ and $σ_{t}^{2}$ . The consequences of model misspecification are consistent for both frequentist and Bayesian methods. Also the choice of the priors matters even with noninformative priors.

For a 2 : 1 randomized controlled trial, we show that the first half of the amount of borrowing leads to more power gain than the second half for both frequentist and Bayesian methods. This is worth considering from both economic and practical points of view. We further demonstrate the risks and benefits of conditional borrowing. Conditional borrowing can protect the type I error; however, the power is lowered at the same time. We note that the conditional borrowing approach can be extended to the case in which multiple historical data sets are available (see the detailed elaboration in Appendix D of the Supplementary Materials).

We develop borrowing-by-parts power priors for incorporating the historical data in Bayesian SSD. The likelihood function is partitioned into the part for the parameter of primary interest and the part for the nuisance parameter, which are the mean and the variance, respectively, in the normal distribution case. By using separate discounting parameters a₀₁ and a₀₂, the historical data can be borrowed through the mean part, the variance part, or both. Although the borrowing-by-parts power priors are developed under the normal models, these priors can also be constructed under more general normal regression models or even more complex joint longitudinal and survival models such as those considered in Zhang et al. (2014, 2017) and Sheikh et al. (2021). As shown in Figure 3 and Table 4, borrowing the historical data can lead to inflation of type I errors when the concurrent control mean exceeds the sample mean of the historical data by a certain magnitude or when the variance of the future outcomes in the concurrent control arm differs substantially from the sample variance of the historical data. The conditional borrowing approach discussed in Section 4.2 is quite effective in preventing inflation of type I errors; however, this approach also leads to a much smaller gain in power. Embedding the robust mixture prior and the borrowing-by-parts power priors into the conditional borrowing framework may yield a more promising approach that has better control of the type I error and at the same time results in more gain in power.

Although we assume that the historical data are available from the control arm, the proposed methodology can be extended to a more general case, in which the historical data are available for both the investigated product and control arms as considered in Chen et al. (2014). In this case, the borrowing-by-parts power priors may be even more attractive, since they allow us to leverage different parts of the historical data within and between the investigated product arm and the control arm. When the historical effect, such as the mean of the historical control, is different from the mean of the concurrent control, the proposed conditional borrowing approach automatically takes this into consideration by essentially leveraging a smaller amount of the historical data. The recently proposed empirical profile approach (Wu et al., 2020) and the scale transformed power prior (Nifong et al., 2021) may be more effective in dealing with effects that differ between the historical control and the concurrent control. These approaches can be integrated into our proposed borrowing-by-parts power priors and the conditional borrowing framework, which is another useful extension for future research.

Supplementary Material

Supp 1

NIHMS1840288-supplement-Supp_1.zip (110.2 KB, zip)

Contributor Information

Wenlin Yuan, Department of Statistics, University of Connecticut at Storrs, CT 06269.

Ming-Hui Chen, Department of Statistics, University of Connecticut at Storrs, CT 06269.

John Zhong, REGENXBIO Inc., 9804 Medical Center Drive, Rockville, MD 20850.

References

Allocco DJ, Cannon LA, Britt A, Heil JE, Nersesov A, Wehrenberg S, Dawkins KD, and Kereiakes DJ (2010). A prospective evaluation of the safety and efficacy of the TAXUS Element paclitaxel-eluting coronary stent system for the treatment of de novo coronary artery lesions: design and statistical methods of the PERSEUS clinical program. Trials 11 (1), 1.
Berry SM, Carlin BP, Lee JJ, and Muller P (2010). Bayesian adaptive methods for clinical trials. CRC Press.
Campbell G (2011). Bayesian statistics in medical devices: innovation sparked by the FDA. Journal of Biopharmaceutical Statistics 21 (5), 871–887.
Chen M-H, Ibrahim JG, Lam P, Yu A, and Zhang Y (2011). Bayesian design of noninferiority trials for medical devices using historical data. Biometrics 67 (3), 1163–1170.
Chen M-H, Ibrahim JG, Zeng D, Hu K, and Jia C (2014). Bayesian design of superiority clinical trials for recurrent events data with applications to bleeding and transfusion events in myelodyplastic syndrome. Biometrics 70 (4), 1003–1013.
De Santis F (2004). Statistical evidence and sample size determination for Bayesian hypothesis testing. Journal of Statistical Planning and Inference 124 (1), 121–144.
De Santis F (2007). Using historical data for Bayesian sample size determination. Journal of the Royal Statistical Society: Series A (Statistics in Society) 170 (1), 95–113.
Gamalo-Siebers M, Gao A, Lakshminarayanan M, Liu G, Natanegara F, Railkar R, Schmidli H, and Song G (2016). Bayesian methods for the design and analysis of noninferiority trials. Journal of Biopharmaceutical Statistics 26 (5), 823–841.
Ghadessi M, Tang R, Zhou J, Liu R, Wang C, Toyoizumi K, Mei C, Zhang L, Deng C, and Beckman RA (2020). A roadmap to using historical controls in clinical trials–by Drug Information Association Adaptive Design Scientific Working Group (DIA-ADSWG). Orphanet Journal of Rare Diseases 15 (1), 1–19.
Goemans N, Signorovitch J, Sajeev G, Yao Z, Gordish-Dressman H, McDonald CM, Vandenborne K, Miller D, Ward SJ, Mercuri E, et al. (2020). Suitability of external controls for drug evaluation in Duchenne muscular dystrophy. Neurology 95 (10), e1381–e1391.
Greenhouse JB and Wasserman L (1995). Robust Bayesian methods for monitoring clinical trials. Statistics in Medicine 14 (12), 1379–1391.
Hall KT, Vase L, Tobias DK, Dashti HT, Vollert J, Kaptchuk TJ, and Cook NR (2021). Historical controls in randomized clinical trials: opportunities and challenges. Clinical Pharmacology & Therapeutics 109 (2), 343–351.
Han B, Zhan J, John Zhong Z, Liu D, and Lindborg S (2017). Covariate-adjusted borrowing of historical control data in randomized clinical trials. Pharmaceutical Statistics 16 (4), 296–308.
Hobbs BP, Carlin BP, Mandrekar SJ, and Sargent DJ (2011). Hierarchical commensurate and power prior models for adaptive incorporation of historical information in clinical trials. Biometrics 67 (3), 1047–1056.
Hobbs BP, Sargent DJ, and Carlin BP (2012). Commensurate priors for incorporating historical information in clinical trials using general and generalized linear models. Bayesian Analysis 7 (3), 639.
Ibrahim JG and Chen M-H (2000). Power prior distributions for regression models. Statistical Science 15 (1), 46–60.
Ibrahim JG, Chen M-H, Gwon Y, and Chen F (2015). The power prior: theory and applications. Statistics in Medicine 34 (28), 3724–3749.
Ibrahim JG, Chen M-H, Xia HA, and Liu T (2012). Bayesian meta-experimental design: evaluating cardiovascular risk in new antidiabetic therapies to treat type 2 diabetes. Biometrics 68 (2), 578–586.
Inoue LY, Berry DA, and Parmigiani G (2005). Relationship between Bayesian and frequentist sample size determination. The American Statistician 59 (1), 79–87.
Joseph L, Wolfson DB, and Berger RD (1995). Sample size calculations for binomial proportions via highest posterior density intervals. Journal of the Royal Statistical Society: Series D (The Statistician) 44 (2), 143–154.
Katsis A and Toman B (1999). Bayesian sample size calculations for binomial experiments. Journal of Statistical Planning and Inference 81 (2), 349–362.
Lake SL, Quintana MA, Broglio K, Panagoulias J, Berry SM, and Panzara MA (2021). Bayesian adaptive design for clinical trials in Duchenne muscular dystrophy. Statistics in Medicine 40 (19), 4167–4184.
Lee JJ and Liu DD (2008). A predictive probability design for phase II cancer clinical trials. Clinical Trials 5 (2), 93–106.
Li W, Chen M-H, Tan H, and Dey DK (2015). Bayesian design of noninferiority clinical trials with co-primary endpoints and multiple dose comparison. In Applied Statistics in Biomedicine and Clinical Trials Design, pp. 17–33. Springer.
Li W, Chen M-H, Wang X, and Dey DK (2018). Bayesian design of noninferiority clinical trials via the Bayes factor. Statistics in Biosciences 10 (2), 439–459.
Lim J, Walley R, Yuan J, Liu J, Dabral A, Best N, Grieve A, Hampson L, Wolfram J, Woodward P, et al. (2018). Minimizing patient burden through the use of historical subject-level data in innovative confirmatory clinical trials: review of methods and opportunities. Therapeutic Innovation & Regulatory Science 52 (5), 546–559.
Lindley DV (1997). The choice of sample size. Journal of the Royal Statistical Society: Series D (The Statistician) 46 (2), 129–138.
Mah JK, Korngut L, Dykeman J, Day L, Pringsheim T, and Jette N (2014). A systematic review and meta-analysis on the epidemiology of Duchenne and Becker muscular dystrophy. Neuromuscular Disorders 24 (6), 482–491.
M’Lan CE, Joseph L, and Wolfson DB (2006). Bayesian sample size determination for case-control studies. Journal of the American Statistical Association 101 (474), 760–772.
M’Lan CE, Joseph L, Wolfson DB, et al. (2008). Bayesian sample size determination for binomial proportions. Bayesian Analysis 3 (2), 269–296.
Neuenschwander B, Capkun-Niggli G, Branson M, and Spiegelhalter DJ (2010). Summarizing historical information on controls in clinical trials. Clinical Trials 7 (1), 5–18.
Nifong B, Psioda MA, and Ibrahim JG (2021). The scale transformed power prior for use with historical data from a different outcome model. arXiv preprint arXiv:2105.05157.
Pan H, Yuan Y, and Xia J (2017). A calibrated power prior approach to borrow information from historical data with application to biosimilar clinical trials. Journal of the Royal Statistical Society: Series C (Applied Statistics) 66 (5), 979–996.
Rahme E and Joseph L (1998). Exact sample size determination for binomial experiments. Journal of Statistical Planning and Inference 66 (1), 83–93.
Reyes EM and Ghosh SK (2013). Bayesian average error-based approach to sample size calculations for hypothesis testing. Journal of Biopharmaceutical Statistics 23 (3), 569–588.
Rubin DB and Stern HS (1998). Sample size determination using posterior predictive distributions. Sankhyā: The Indian Journal of Statistics, Series B 60 (1), 161–175.
Schmidli H, Gsteiger S, Roychoudhury S, O’Hagan A, Spiegelhalter D, and Neuenschwander B (2014). Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics 70 (4), 1023–1032.
Schmidli H, Häring DA, Thomas M, Cassidy A, Weber S, and Bretz F (2020). Beyond randomized clinical trials: use of external controls. Clinical Pharmacology & Therapeutics 107 (4), 806–816.
Sheikh MT, Ibrahim JG, Gelfond JA, Sun W, and Chen M-H (2021). Joint modelling of longitudinal and survival data in the presence of competing risks with applications to prostate cancer data. Statistical Modelling 21 (1-2), 72–94.
Simon R (1999). Bayesian design and analysis of active control clinical trials. Biometrics 55 (2), 484–487.
Spiegelhalter DJ, Abrams KR, and Myles JP (2004). Bayesian approaches to clinical trials and health-care evaluation, Volume 13. John Wiley & Sons.
U.S. Food and Drug Administration (2010). Guidance for the Use of Bayesian Statistics in Medical Device Clinical Trials. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/guidance-use-bayesian-statistics-medical-device-clinical-trials. Accessed December 2020.
U.S. Food and Drug Administration (2018). Complex Innovative Trial Designs Pilot Program. https://www.fda.gov/drugs/development-resources/complex-innovative-trial-designs-pilot-program. Accessed November 2020.
van Rosmalen J, Dejardin D, van Norden Y, Löwenberg B, and Lesaffre E (2018). Including historical data in the analysis of clinical trials: Is it worth the effort? Statistical Methods in Medical Research 27 (10), 3167–3182.
Viele K, Berry S, Neuenschwander B, Amzal B, Chen F, Enas N, Hobbs B, Ibrahim JG, Kinnersley N, Lindborg S, et al. (2014). Use of historical control data for assessing treatment effects in clinical trials. Pharmaceutical Statistics 13 (1), 41–54.
Wang C, Li H, Chen W-C, Lu N, Tiwari R, Xu Y, and Yue LQ (2019). Propensity score-integrated power prior approach for incorporating real-world evidence in single-arm clinical studies. Journal of Biopharmaceutical Statistics 29 (5), 731–748.
Wang F and Gelfand AE (2002). A simulation-based approach to Bayesian sample size determination for performance under a given model and for separating models. Statistical Science 17 (2), 193–208.
Wu Y, Hui J, and Deng Q (2020). Empirical profile Bayesian estimation for extrapolation of historical adult data to pediatric drug development. Pharmaceutical Statistics 19 (6), 787–802.
Ye J, Reaman G, De Claro RA, and Sridhara R (2020). A Bayesian approach in design and analysis of pediatric cancer clinical trials. Pharmaceutical Statistics 19 (6), 814–826.
Zhang D, Chen M-H, Ibrahim JG, Boye ME, and Shen W (2017). Bayesian model assessment in joint modeling of longitudinal and survival data with applications to cancer clinical trials. Journal of Computational and Graphical Statistics 26 (1), 121–133.
Zhang D, Chen M-H, Ibrahim JG, Boye ME, Wang P, and Shen W (2014). Assessing model fit in joint models of longitudinal and survival data with applications to cancer clinical trials. Statistics in Medicine 33 (27), 4715–4733.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supp 1

NIHMS1840288-supplement-Supp_1.zip^{(110.2KB, zip)}

[R1] Allocco DJ, Cannon LA, Britt A, Heil JE, Nersesov A, Wehrenberg S, Dawkins KD, and Kereiakes DJ (2010). A prospective evaluation of the safety and efficacy of the taxus element paclitaxel-eluting coronary stent system for the treatment of de novo coronary artery lesions: design and statistical methods of the perseus clinical program. Trials 11 (1), 1. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R2] Berry SM, Carlin BP, Lee JJ, and Muller P (2010). Bayesian adaptive methods for clinical trials. CRC Press. [Google Scholar]

[R3] Campbell G (2011). Bayesian statistics in medical devices: innovation sparked by the fda. Journal of Biopharmaceutical Statistics 21 (5), 871–887. [DOI] [PubMed] [Google Scholar]

[R4] Chen M-H, Ibrahim JG, Lam P, Yu A, and Zhang Y (2011). Bayesian design of noninferiority trials for medical devices using historical data. Biometrics 67(3), 1163–1170. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] Chen M-H, Ibrahim JG, Zeng D, Hu K, and Jia C (2014). Bayesian design of superiority clinical trials for recurrent events data with applications to bleeding and transfusion events in myelodyplastic syndrome. Biometrics 70 (4), 1003–1013. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R6] De Santis F (2004). Statistical evidence and sample size determination for Bayesian hypothesis testing. Journal of Statistical Planning and Inference 124 (1), 121–144. [Google Scholar]

[R7] De Santis F (2007). Using historical data for Bayesian sample size determination. Journal of the Royal Statistical Society: Series A (Statistics in Society) 170 (1), 95–113. [Google Scholar]

[R8] Gamalo-Siebers M, Gao A, Lakshminarayanan M, Liu G, Natanegara F, Railkar R, Schmidli H, and Song G (2016). Bayesian methods for the design and analysis of noninferiority trials. Journal of Biopharmaceutical Statistics 26 (5), 823–841. [DOI] [PubMed] [Google Scholar]

[R9] Ghadessi M, Tang R, Zhou J, Liu R, Wang C, Toyoizumi K, Mei C, Zhang L, Deng C, and Beckman RA (2020). A roadmap to using historical controls in clinical trials–by drug information association adaptive design scientific working group (diaadswg). Orphanet Journal of Rare Diseases 15 (1), 1–19. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] Goemans N, Signorovitch J, Sajeev G, Yao Z, Gordish-Dressman H, McDonald CM, Vandenborne K, Miller D, Ward SJ, Mercuri E, et al. (2020). Suitability of external controls for drug evaluation in duchenne muscular dystrophy. Neurology 95 (10), e1381–e1391. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R11] Greenhouse JB and Waserman L (1995). Robust Bayesian methods for monitoring clinical trials. Statistics in Medicine 14 (12), 1379–1391. [DOI] [PubMed] [Google Scholar]

[R12] Hall KT, Vase L, Tobias DK, Dashti HT, Vollert J, Kaptchuk TJ, and Cook NR (2021). Historical controls in randomized clinical trials: opportunities and challenges. Clinical Pharmacology & Therapeutics 109 (2), 343–351. [DOI] [PubMed] [Google Scholar]

[R13] Han B, Zhan J, John Zhong Z, Liu D, and Lindborg S (2017). Covariate-adjusted borrowing of historical control data in randomized clinical trials. Pharmaceutical Statistics 16 (4), 296–308. [DOI] [PubMed] [Google Scholar]

[R14] Hobbs BP, Carlin BP, Mandrekar SJ, and Sargent DJ (2011). Hierarchical commensurate and power prior models for adaptive incorporation of historical information in clinical trials. Biometrics 67 (3), 1047–1056. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R15] Hobbs BP, Sargent DJ, and Carlin BP (2012). Commensurate priors for incorporating historical information in clinical trials using general and generalized linear models. Bayesian Analysis 7 (3), 639. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R16] Ibrahim JG and Chen M-H (2000). Power prior distributions for regression models. Statistical Science 15 (1), 46–60. [Google Scholar]

[R17] Ibrahim JG, Chen M-H, Gwon Y, and Chen F (2015). The power prior: theory and applications. Statistics in Medicine 34 (28), 3724–3749. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R18] Ibrahim JG, Chen M-H, Xia HA, and Liu T (2012). Bayesian meta-experimental design: evaluating cardiovascular risk in new antidiabetic therapies to treat type 2 diabetes. Biometrics 68 (2), 578–586. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R19] Inoue LY, Berry DA, and Parmigiani G (2005). Relationship between Bayesian and frequentist sample size determination. The American Statistician 59 (1), 79–87.

[R20] Joseph L, Wolfson DB, and Berger RD (1995). Sample size calculations for binomial proportions via highest posterior density intervals. Journal of the Royal Statistical Society: Series D (The Statistician) 44 (2), 143–154.

[R21] Katsis A and Toman B (1999). Bayesian sample size calculations for binomial experiments. Journal of Statistical Planning and Inference 81 (2), 349–362.

[R22] Lake SL, Quintana MA, Broglio K, Panagoulias J, Berry SM, and Panzara MA (2021). Bayesian adaptive design for clinical trials in Duchenne muscular dystrophy. Statistics in Medicine 40 (19), 4167–4184.

[R23] Lee JJ and Liu DD (2008). A predictive probability design for phase II cancer clinical trials. Clinical Trials 5 (2), 93–106.

[R24] Li W, Chen M-H, Tan H, and Dey DK (2015). Bayesian design of noninferiority clinical trials with co-primary endpoints and multiple dose comparison. In Applied Statistics in Biomedicine and Clinical Trials Design, pp. 17–33. Springer.

[R25] Li W, Chen M-H, Wang X, and Dey DK (2018). Bayesian design of noninferiority clinical trials via the Bayes factor. Statistics in Biosciences 10 (2), 439–459.

[R26] Lim J, Walley R, Yuan J, Liu J, Dabral A, Best N, Grieve A, Hampson L, Wolfram J, Woodward P, et al. (2018). Minimizing patient burden through the use of historical subject-level data in innovative confirmatory clinical trials: review of methods and opportunities. Therapeutic Innovation & Regulatory Science 52 (5), 546–559.

[R27] Lindley DV (1997). The choice of sample size. Journal of the Royal Statistical Society: Series D (The Statistician) 46 (2), 129–138.

[R28] Mah JK, Korngut L, Dykeman J, Day L, Pringsheim T, and Jette N (2014). A systematic review and meta-analysis on the epidemiology of Duchenne and Becker muscular dystrophy. Neuromuscular Disorders 24 (6), 482–491.

[R29] M’Lan CE, Joseph L, and Wolfson DB (2006). Bayesian sample size determination for case-control studies. Journal of the American Statistical Association 101 (474), 760–772.

[R30] M’Lan CE, Joseph L, Wolfson DB, et al. (2008). Bayesian sample size determination for binomial proportions. Bayesian Analysis 3 (2), 269–296.

[R31] Neuenschwander B, Capkun-Niggli G, Branson M, and Spiegelhalter DJ (2010). Summarizing historical information on controls in clinical trials. Clinical Trials 7 (1), 5–18.

[R32] Nifong B, Psioda MA, and Ibrahim JG (2021). The scale transformed power prior for use with historical data from a different outcome model. arXiv preprint arXiv:2105.05157.

[R33] Pan H, Yuan Y, and Xia J (2017). A calibrated power prior approach to borrow information from historical data with application to biosimilar clinical trials. Journal of the Royal Statistical Society: Series C (Applied Statistics) 66 (5), 979–996.

[R34] Rahme E and Joseph L (1998). Exact sample size determination for binomial experiments. Journal of Statistical Planning and Inference 66 (1), 83–93.

[R35] Reyes EM and Ghosh SK (2013). Bayesian average error-based approach to sample size calculations for hypothesis testing. Journal of Biopharmaceutical Statistics 23 (3), 569–588.

[R36] Rubin DB and Stern HS (1998). Sample size determination using posterior predictive distributions. Sankhyā: The Indian Journal of Statistics, Series B 60 (1), 161–175.

[R37] Schmidli H, Gsteiger S, Roychoudhury S, O’Hagan A, Spiegelhalter D, and Neuenschwander B (2014). Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics 70 (4), 1023–1032.

[R38] Schmidli H, Häring DA, Thomas M, Cassidy A, Weber S, and Bretz F (2020). Beyond randomized clinical trials: use of external controls. Clinical Pharmacology & Therapeutics 107 (4), 806–816.

[R39] Sheikh MT, Ibrahim JG, Gelfond JA, Sun W, and Chen M-H (2021). Joint modelling of longitudinal and survival data in the presence of competing risks with applications to prostate cancer data. Statistical Modelling 21 (1-2), 72–94.

[R40] Simon R (1999). Bayesian design and analysis of active control clinical trials. Biometrics 55 (2), 484–487.

[R41] Spiegelhalter DJ, Abrams KR, and Myles JP (2004). Bayesian approaches to clinical trials and health-care evaluation, Volume 13. John Wiley & Sons.

[R42] U.S. Food and Drug Administration (2010). Guidance for the Use of Bayesian Statistics in Medical Device Clinical Trials. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/guidance-use-bayesian-statistics-medical-device-clinical-trials. Accessed December 2020.

[R43] U.S. Food and Drug Administration (2018). Complex Innovative Trial Designs Pilot Program. https://www.fda.gov/drugs/development-resources/complex-innovative-trial-designs-pilot-program. Accessed November 2020.

[R44] van Rosmalen J, Dejardin D, van Norden Y, Löwenberg B, and Lesaffre E (2018). Including historical data in the analysis of clinical trials: Is it worth the effort? Statistical Methods in Medical Research 27 (10), 3167–3182.

[R45] Viele K, Berry S, Neuenschwander B, Amzal B, Chen F, Enas N, Hobbs B, Ibrahim JG, Kinnersley N, Lindborg S, et al. (2014). Use of historical control data for assessing treatment effects in clinical trials. Pharmaceutical Statistics 13 (1), 41–54.

[R46] Wang C, Li H, Chen W-C, Lu N, Tiwari R, Xu Y, and Yue LQ (2019). Propensity score-integrated power prior approach for incorporating real-world evidence in single-arm clinical studies. Journal of Biopharmaceutical Statistics 29 (5), 731–748.

[R47] Wang F and Gelfand AE (2002). A simulation-based approach to Bayesian sample size determination for performance under a given model and for separating models. Statistical Science 17 (2), 193–208.

[R48] Wu Y, Hui J, and Deng Q (2020). Empirical profile Bayesian estimation for extrapolation of historical adult data to pediatric drug development. Pharmaceutical Statistics 19 (6), 787–802.

[R49] Ye J, Reaman G, De Claro RA, and Sridhara R (2020). A Bayesian approach in design and analysis of pediatric cancer clinical trials. Pharmaceutical Statistics 19 (6), 814–826.

[R50] Zhang D, Chen M-H, Ibrahim JG, Boye ME, and Shen W (2017). Bayesian model assessment in joint modeling of longitudinal and survival data with applications to cancer clinical trials. Journal of Computational and Graphical Statistics 26 (1), 121–133.

[R51] Zhang D, Chen M-H, Ibrahim JG, Boye ME, Wang P, and Shen W (2014). Assessing model fit in joint models of longitudinal and survival data with applications to cancer clinical trials. Statistics in Medicine 33 (27), 4715–4733.
