Author manuscript; available in PMC: 2023 Feb 20.
Published in final edited form as: J Biopharm Stat. 2022 Feb 7;31(6):852–867. doi: 10.1080/10543406.2021.1998100

Calibrated Dynamic Borrowing Using Capping Priors

Sharon X Ling 1, Brian P Hobbs 2, Alexander M Kaizer 3, Joseph S Koopmeiners 1,*
PMCID: PMC9940118  NIHMSID: NIHMS1870429  PMID: 35129422

Abstract

Multisource exchangeability models (MEMs), a Bayesian approach for dynamically integrating information from multiple clinical trials, are a promising approach for gaining efficiency in randomized controlled trials. When the supplementary trials are considerably larger than the primary trial, care must be taken when integrating supplementary data to avoid overwhelming the primary trial. In this paper, we propose “capping priors,” which control the extent of dynamic borrowing by placing an a priori cap on the effective supplemental sample size. We demonstrate the behavior of this technique via simulation, and apply our method to four randomized trials of very low nicotine content cigarettes.

Keywords: multisource exchangeability models, prior specification, capping priors, supplementary data, reduced nicotine content cigarettes

1. Introduction

Opportunities often arise to augment randomized controlled trials (RCTs) with data from other sources with the same control arm, treatment arm, or both. For example, interventions previously deemed effective may later be used as control arms of trials for new treatments(1,2). Alternatively, we may have data on the treatment arm from other trials in the comparative effectiveness setting, or on both arms in a different patient population. Integrating information into the analysis of a current trial allows researchers to decrease the study’s targeted enrollment (2), which in turn can reduce patient burden (1) and conserve time and financial resources. From a statistical standpoint, borrowing information can also increase precision and power while reducing type I error, assuming that the data sources are truly similar and exchangeable (2).

While synthesizing data from multiple sources has the potential to increase efficiency, it must be done in a statistically principled manner to avoid introducing bias or increasing the type I error rate (1-3). To address this issue, Pocock (4) proposed a method implementing shrinkage estimators to effectuate an appropriate amount of static borrowing, given an a priori estimate of the inter-trial heterogeneity. While Pocock’s method utilized an estimate of the inter-trial heterogeneity to calibrate borrowing from the supplemental data, the ultimate determination as to whether the primary and supplemental data could be combined was based on the trial settings (i.e. intervention, trial population, etc.). We don’t believe, however, that it is sufficient in practice to consider the trial setting, exclusively, as ambiguity may exist as to whether or not changes in the trial settings would impact the treatment effect or other trial results. Moreover, while participant eligibility may be similar by design, the ultimate enrollment distribution of important prognostic subpopulations may vary between trials. For example, a group of investigators may run multiple trials with the same intervention and control arms, but different trial populations that may or may not alter the treatment or expected outcomes. In these cases, we may alternatively consider dynamic borrowing, which evaluates the similarity between the primary and supplemental data, and borrows more information when the data are similar and less when there is evidence of heterogeneity (1,2).

Dynamic borrowing has been studied extensively in the statistical literature, primarily from a Bayesian perspective, and includes power priors (5), hierarchical models (6), and commensurate priors (7,8). These methods rely on a single parameter to control borrowing, which is estimated from the data (e.g. the power parameter in power priors, the variance parameter in the hierarchical model, and the commensurability parameter in commensurate priors). This results in a simple approach to dynamic borrowing but one that is limited if there are multiple supplemental sources with varying degrees of inter-trial heterogeneity. More recently, Kaizer et al. (3) proposed multisource exchangeability models (MEMs), a flexible Bayesian approach to dynamic borrowing that uses Bayesian model averaging (BMA) (9-11) to effectuate borrowing. MEMs consider all possible exchangeability relationships among a primary data source when combined with supplemental data sources. Information is smoothed over all possible exchangeability models using BMA. A major advantage of MEMs is that they allow the degree of dynamic borrowing to vary across supplemental sources, resulting in more flexible inference for identifying disjointed subpopulations comprised of meta-subtypes or singleton subtypes on the basis of accumulating evidence. This extent of flexibility is not allowed by standard ‘single-source’ hierarchical or commensurate prior modeling (wherein parameters for all sources are assumed necessarily exchangeable) (3).

As with many Bayesian procedures, a key challenge to the practical applications of MEMs is prior specification. An incorrectly specified prior can result in inappropriate borrowing and poor model performance. This phenomenon is particularly true in trials for which the sample size of the current trial is substantially smaller than the sample sizes of supplementary trials. Overborrowing can be disadvantageous for several reasons. From a clinical trial design standpoint, most trials aim to procure similar participant enrollment in the interventional versus control arms (12,13), which may not be achieved in this case. Furthermore, and perhaps more importantly, the larger supplementary cohorts’ information would likely overwhelm the signal from the current data, thus negating the benefits of an RCT in the current population. Ideally, we would like to use the supplemental data to improve precision, while still basing most of our conclusions on the primary data.

Consider the Center for the Evaluation of Nicotine in Cigarettes (CENIC) as a motivating example. CENIC is a multi-institution collaboration that aimed to explore the impact of reducing the nicotine content in cigarettes on tobacco use behavior (14-17). CENIC included four RCTs; the last of which randomized 58 smokers with serious mental illnesses— including schizophrenia, schizoaffective disorder, or bipolar disorder—to either normal nicotine content (NNC) cigarettes or very low nicotine content (VLNC) cigarettes (17). While we can directly estimate the treatment effect from these subjects’ data, data from over 900 additional participants on these same two treatment groups are available from the three other CENIC RCTs in different patient populations (14-16). Borrowing strength from the other three trials has the potential to improve precision, which motivates the use of dynamic borrowing approaches. A direct application of MEMs with default, non-informative priors results in substantial borrowing from the supplemental data to the extent that the supplemental data attenuates the signal from the primary data source. We would like an intuitive approach to prior specification that limits the amount of borrowing, while still improving efficiency.

We address these limitations through the introduction of “capping priors,” which allow us to control the degree of borrowing from supplemental data while still gaining efficiency. Throughout this manuscript, we characterize the extent of borrowing from supplemental data using the effective supplemental sample size (ESSS) (18), which extends the notion of effective sample size (19,20) to dynamic borrowing and summarizes the efficiency gained through borrowing by the increase in sample size that would be needed to achieve the observed improvement in precision. Capping priors allow the investigator to specify the maximum ESSS and derive the prior source inclusion probabilities that limit borrowing to the desired level. This results in an intuitive approach to prior specification, which allows investigators to control the extent of borrowing, while maintaining the efficiency gains that result from dynamic borrowing.

The remainder of this paper proceeds as follows. First, we provide some background and notation, including a brief introduction of MEMs and how to compute the ESSS, in Section 2. Then, we derive a method for specifying capping priors given a desired ESSS threshold in Section 3. Next, we conduct a simulation study to evaluate the ability of capping priors to restrict the ESSS at the desired threshold in Section 4. Finally, we demonstrate our method’s utility in a clinical trial setting by applying capping priors to our motivating data in Section 5 and conclude with a discussion in Section 6.

2. Background and Notation

Suppose we wish to determine the effect of an interventional treatment compared to a control on some outcome and that we have pertinent information from one primary and $H$ supplementary sources. We assume that the response vector $y_p$ for the $n_p$ observations of the primary source is normally distributed with mean $\mu_p$ and variance $\sigma_p^2$; in this source, the true treatment effect is denoted by $\Delta_p$. Similarly, for $h = 1,2,\ldots,H$, we assume that the response vector $y_h$ for the $n_h$ observations of the $h$th supplementary source is normally distributed with mean $\mu_h$ and variance $\sigma_h^2$; in source $h$, the true treatment effect is denoted by $\Delta_h$. We assume that the data are independent within and across all sources and, for the purposes of our theoretical development, that all variances are known. Thus, the vector of all response values, denoted by $y = (y_p, y_1, y_2, \ldots, y_H)^T$, has a multivariate normal distribution with mean $\mu = (\mu_p, \mu_1, \mu_2, \ldots, \mu_H)^T$ and variance-covariance matrix $\Sigma = \mathrm{diag}\{\Sigma_p, \Sigma_1, \Sigma_2, \ldots, \Sigma_H\}$, where $\Sigma_p = \sigma_p^2 I_{n_p}$ and $\Sigma_h = \sigma_h^2 I_{n_h}$ for $h = 1,2,\ldots,H$. Furthermore, for each subject, we assume that we have information on the randomized treatment assignment $x_{Trt}$ and $c$ covariate measures $x_1, x_2, \ldots, x_c$ with no missing data. We denote the collection of all data as $D$.

2.1. Multisource Exchangeability Models

MEMs enable cross-trial data integration using BMA (9-11) to facilitate borrowing from multiple supplemental sources (3). Each of the $H$ supplemental sources can be either exchangeable or nonexchangeable with the primary source. Thus, there are $K = 2^H$ unique combinations of the data, denoted by $\Omega_k$ for $k = 1,2,\ldots,K$, each describing a potential set of supplementary sources that are truly exchangeable with the primary source. MEMs effectuate borrowing by averaging over each of the $K$ models, resulting in an estimate of the parameter of interest that utilizes different amounts of supplemental data from each source (3,11).

To illustrate our concept, we will first consider a simplified case for which we only have one supplementary source, so that H = 1, resulting in two possible ways to combine the data. The first MEM, Ω1, assumes that the supplementary data are not exchangeable with the primary data. Under this model, the baseline value, treatment effect, and covariate effects are allowed to vary across sources. Thus, the conditional mean of the response value can be estimated using a linear model that regresses the response y on a design matrix X1 containing a vector of ones (for the intercept) and the randomized treatment assignment xTrt, covariates x1,x2,…,xc, supplementary source indicator γ1, interaction between the supplementary source indicator and the randomized treatment assignment, and interactions between the supplementary source indicator and each of the covariates. That is, the conditional mean response model is

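$$E(y \mid X_1, \Omega_1) = \beta_0^{(1)} + \beta_{Trt}^{(1)} x_{Trt} + \sum_{i=1}^{c} \beta_i^{(1)} x_i + \gamma_1\left(\beta_{0,1}^{(1)} + \beta_{Trt,1}^{(1)} x_{Trt} + \sum_{j=1}^{c} \beta_{j,1}^{(1)} x_j\right) \tag{1}$$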

The second MEM, Ω2, assumes that the primary and supplementary sources are exchangeable. Our linear model of the conditional mean of the response value regresses the response y on a simplified design matrix X2 containing a vector of ones (for the intercept) and the randomized treatment assignment xTrt, covariates x1,x2,…,xc, supplementary source indicator γ1, and interactions between the supplementary source indicator and each of the covariates, but excludes the interaction between the supplementary source indicator and the treatment effect. This specification allows the covariate effects to vary by source, while keeping the treatment effect the same for the primary and supplemental data sources. In this case, the conditional mean response can be modeled as

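$$E(y \mid X_2, \Omega_2) = \beta_0^{(2)} + \beta_{Trt}^{(2)} x_{Trt} + \sum_{i=1}^{c} \beta_i^{(2)} x_i + \gamma_1\left(\beta_{0,1}^{(2)} + \sum_{j=1}^{c} \beta_{j,1}^{(2)} x_j\right) \tag{2}$$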

More generally, we can express both models above as $E(y \mid X_k, \Omega_k) = X_k \beta^{(k)}$, where the design matrices $X_k$ for $k = 1,2$ are described above, and the vector of regression coefficients is $\beta^{(k)} = (\beta_0^{(k)}, \beta_{Trt}^{(k)}, \beta_1^{(k)}, \ldots, \beta_c^{(k)}, \beta_{0,1}^{(k)}, \beta_{Trt,1}^{(k)}, \beta_{1,1}^{(k)}, \ldots, \beta_{c,1}^{(k)})$. Note that, for the model in Equation 2, the term $\beta_{Trt,1}^{(2)}$ will simply be 0.
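To make the difference between the two design matrices concrete for $H = 1$, the following minimal sketch builds a single row of $X_1$ or $X_2$. The function name `design_row` and its arguments are hypothetical, chosen only for illustration; the column order follows the ordering of $\beta^{(k)}$ given above.

```python
def design_row(x_trt, covs, source, exchangeable):
    """Build one design-matrix row for the H = 1 case.

    x_trt        : 0/1 randomized treatment assignment
    covs         : list of the c covariate values for this subject
    source       : gamma_1 (1 = supplementary source, 0 = primary source)
    exchangeable : False gives a row of X_1 (model Omega_1),
                   True gives a row of X_2 (model Omega_2)
    """
    # Intercept, treatment, covariates
    row = [1.0, float(x_trt)] + [float(v) for v in covs]
    # Source main effect
    row.append(float(source))
    # Source-by-treatment interaction appears only under Omega_1
    if not exchangeable:
        row.append(float(source * x_trt))
    # Source-by-covariate interactions appear under both models
    row.extend(float(source * v) for v in covs)
    return row


# A treated supplementary-source subject with one covariate value of 0.5
print(design_row(1, [0.5], 1, exchangeable=False))
print(design_row(1, [0.5], 1, exchangeable=True))
```

With $c$ covariates, a row of $X_1$ has $2c + 4$ entries and a row of $X_2$ has $2c + 3$, the missing column being the source-by-treatment interaction.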

This method can easily be extended to scenarios with multiple supplementary sources (i.e., when $H > 1$). In these scenarios, we can construct models and design matrices similarly by considering $K = 2^H$ design matrices with corresponding model parameters $\beta_{0,h}, \beta_{Trt,h}, \beta_{1,h}, \ldots, \beta_{c,h}$ for each additional supplementary source $h$.

Regardless of the number of sources, our primary aim is to estimate the element, $\beta_{Trt}$, of the parameter vector, $\beta$, corresponding to the treatment effect in the primary source. The model-specific posterior of $\beta^{(k)}$ given $X_k$, $\Omega_k$, and $\Sigma$ follows a multivariate normal distribution with mean $E(\beta_{Trt}^{(k)} \mid X_k, \Omega_k, \Sigma) = (X_k^T \Sigma^{-1} X_k)^{-1} X_k^T \Sigma^{-1} y$ and variance $\mathrm{Var}(\beta_{Trt}^{(k)} \mid X_k, \Omega_k, \Sigma) = (X_k^T \Sigma^{-1} X_k)^{-1}$.

MEMs use BMA to smooth over the model-specific posteriors for βTrt, resulting in a posterior distribution that is a weighted average of a posterior assuming no borrowing and a posterior assuming borrowing (3,9-11). BMA requires that we specify a prior distribution for each scenario, which is most easily done by assuming prior independence on the prior probability ph that each supplemental data source h (for h = 1,2,…,H) is exchangeable with the primary source. Then, when H = 1, the prior weights are π(Ω1) = 1 – p1 and π(Ω2) = p1. When H > 1, the prior weight of Ωk for k = 1,2,…,K is

$$\pi(\Omega_k) = \prod_{h=1}^{H} p_h^{s_{h,k}} (1 - p_h)^{1 - s_{h,k}}, \tag{3}$$

where $s_{h,k}$ is an indicator that equals 1 if supplementary source $h$ is exchangeable with the primary source under model $\Omega_k$, and 0 otherwise. We note that the indicator, $s_{h,k}$, is neither observed data nor a parameter to be estimated, but rather notation we use to specify the sub-model $\Omega_k$. That is, each sub-model is defined by a different combination of the $s_{h,k}$ values, representing all of the different ways that the data can be combined.
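As an illustration of Equation 3, the sketch below enumerates all $2^H$ inclusion patterns and computes the prior weight of each from the source inclusion probabilities. The function name `prior_model_weights` and the representation of a model as a tuple of 0/1 indicators are assumptions made for this example.

```python
from itertools import product
from math import prod


def prior_model_weights(p):
    """Prior weight of every MEM given inclusion probabilities p[0..H-1].

    Each model is keyed by a tuple s of 0/1 indicators, where s[h] == 1
    means supplementary source h+1 is exchangeable with the primary source.
    """
    H = len(p)
    weights = {}
    for s in product((0, 1), repeat=H):
        # Independent Bernoulli terms: p_h if included, 1 - p_h otherwise
        weights[s] = prod(p[h] if s[h] else 1.0 - p[h] for h in range(H))
    return weights


# Flat inclusion probabilities of 0.5 with three sources give eight equal weights
flat = prior_model_weights([0.5, 0.5, 0.5])
print(flat[(0, 0, 0)], len(flat))
```

The first tuple produced, `(0, ..., 0)`, corresponds to the no-borrowing model $\Omega_1$, and the last, `(1, ..., 1)`, to the model that pools every supplementary source.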

Given the prior weights, the posterior weight for the $k$th model is a function of the prior weights and the marginalized likelihood, $p(D \mid \Omega_k)$, for each MEM, i.e.,

$$p(\Omega_k \mid D) = \frac{\pi(\Omega_k)\, p(D \mid \Omega_k)}{\sum_{j=1}^{K} \pi(\Omega_j)\, p(D \mid \Omega_j)}, \tag{4}$$

as detailed in Raftery (11). Moreover, as described by Kotalik et al. (21) and Burnham and Anderson (22), when the model-specific marginals have no closed form solution, it can be useful to instead approximate the posterior weights by

$$\omega_k = \frac{\pi(\Omega_k) \exp(-0.5\,\eta_k)}{\sum_{j=1}^{K} \pi(\Omega_j) \exp(-0.5\,\eta_j)}, \tag{5}$$

with ηk = BICk – min{BIC1,BIC2,…,BICK} where BICk is the Bayesian Information Criterion for the kth MEM. We opt to utilize this approximation in our analysis.
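The BIC-based approximation can be sketched in a few lines. This is a minimal illustration, assuming the BIC values for the $K$ models have already been computed elsewhere and using the negative exponent that is standard for BIC-based model weights; the function name `bic_posterior_weights` is hypothetical.

```python
from math import exp


def bic_posterior_weights(prior, bic):
    """Approximate posterior model weights from prior weights and BIC values.

    prior : list of prior weights pi(Omega_k), one per model
    bic   : list of BIC_k values in the same order
    """
    best = min(bic)
    # eta_k = BIC_k - min BIC, so the best-fitting model has eta_k = 0
    unnormalized = [pi * exp(-0.5 * (b - best)) for pi, b in zip(prior, bic)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two models with equal prior weight; the first has a BIC that is 2 units lower
print(bic_posterior_weights([0.5, 0.5], [100.0, 102.0]))
```

Subtracting the minimum BIC before exponentiating leaves the weights unchanged after normalization and avoids numerical underflow when the raw BIC values are large.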

Given the posterior weights ωk for k = 1,2,…,K, the overall posterior for the primary source treatment effect, βTrt, is a weighted average of the MEM-specific posteriors, i.e.,

$$P(\beta_{Trt} \mid D) = \sum_{k=1}^{K} \omega_k\, P(\beta_{Trt}^{(k)} \mid X_k, \Omega_k, \Sigma), \tag{6}$$

and is consequently a mixture of normal distributions. That is,

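$$\beta_{Trt} \mid D \sim N\left(\sum_{k=1}^{K} \omega_k E(\beta_{Trt}^{(k)} \mid X_k, \Omega_k, \Sigma),\; \sum_{k=1}^{K} \omega_k E\left((\beta_{Trt}^{(k)})^2 \mid X_k, \Omega_k, \Sigma\right) - \left[\sum_{k=1}^{K} \omega_k E(\beta_{Trt}^{(k)} \mid X_k, \Omega_k, \Sigma)\right]^2\right) \tag{7}$$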

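where $E\left((\beta_{Trt}^{(k)})^2 \mid X_k, \Omega_k, \Sigma\right) = \mathrm{Var}(\beta_{Trt}^{(k)} \mid X_k, \Omega_k, \Sigma) + E(\beta_{Trt}^{(k)} \mid X_k, \Omega_k, \Sigma)^2$.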
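The mean and variance in Equation 7 follow from the usual moment formulas for a mixture. The sketch below computes them from model-specific means and variances; the function name `mixture_moments` and the numeric inputs are illustrative assumptions only.

```python
def mixture_moments(weights, means, variances):
    """Mean and variance of a weighted mixture of model-specific posteriors.

    weights   : posterior model weights omega_k (summing to 1)
    means     : model-specific posterior means of the treatment effect
    variances : model-specific posterior variances of the treatment effect
    """
    mean = sum(w * m for w, m in zip(weights, means))
    # Second moment of each component is its variance plus its squared mean
    second = sum(w * (v + m * m) for w, m, v in zip(weights, means, variances))
    return mean, second - mean ** 2


# Two equally weighted components with means 0 and 2 and unit variances
print(mixture_moments([0.5, 0.5], [0.0, 2.0], [1.0, 1.0]))
```

Note that disagreement between the component means inflates the mixture variance beyond the weighted average of the component variances, which is how heterogeneity between sources reduces the precision gained from borrowing.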

This mixture distribution is sensitive to the MEM-specific prior weights π(Ωk) for k = 1,2,…,K, which dictate the degree of borrowing. Prior weights that favor little to no borrowing (e.g., π(Ω1) = 1, π(Ωk) = 0 ∀ k ≠ 1) will result in a mixture distribution that is close to the distribution of the response variable in the primary source alone. As the prior weights change to encourage more borrowing (e.g., π(ΩK) = 1, π(Ωk) = 0 ∀ k ≠ K), the mixture distribution will become more similar to the response distributions in the supplementary sources and, potentially, more dissimilar from that of the primary source. Thus, in many situations, the investigator may wish to cap the total amount of borrowing from the H supplemental sources to be less than a pre-specified desired threshold. In order to better quantify this threshold and the true level of borrowing, we use a quantity called the effective supplemental sample size (ESSS) and refer to the desired threshold as ESSSt.

2.2. Computation of ESSS Given Priors

As described above, the extent of borrowing can be characterized by the effective supplemental sample size (ESSS) (18,23), which extends the concept of effective sample size (19,20) to account for dynamic borrowing, and refers to the additional number of observations that would be required to achieve the same precision as was observed through dynamic borrowing. When the posterior precision is approximately linear in the sample size (as in the case of linear models), ESSS is defined as

$$ESSS = n_p\left(\frac{P_{Joint}}{P_{Reference}} - 1\right), \tag{8}$$

where PReference is the posterior precision of the treatment effect under the reference model (which does not utilize any borrowing) and PJoint is the posterior precision of the treatment effect under the joint model (which does utilize borrowing).

In our case, the reference model corresponds to Ω1, for which the treatment effect is estimated independently for each source. Thus, the posterior precision for the reference model is

$$P_{Reference} = \left[\mathrm{Var}(\beta_{Trt}^{(1)} \mid X_1, \Omega_1, \Sigma)\right]^{-1} = X_1^T \Sigma^{-1} X_1. \tag{9}$$

The joint model assumes that the overall posterior, P(βTrtD), for the treatment effect βTrt of the primary source has a distribution equal to a weighted mixture of the K MEM specific posteriors for βTrt. As described in Equation 7, this posterior precision is

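$$P_{Joint} = \left\{\sum_{k=1}^{K} \omega_k E\left((\beta_{Trt}^{(k)})^2 \mid X_k, \Omega_k, \Sigma\right) - \left[\sum_{k=1}^{K} \omega_k E(\beta_{Trt}^{(k)} \mid X_k, \Omega_k, \Sigma)\right]^2\right\}^{-1}. \tag{10}$$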

Our goal is to find capping priors π(Ωk) for k = 1,2,…,K that restrict the induced ESSS in Equation 8 to be at most some a priori specified desired threshold ESSSt.
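Equations 8 through 10 can be combined into a single calculation. The following sketch assumes the model-specific posterior means and variances of the treatment effect are already available as scalars and that the first entry of each list corresponds to the reference model $\Omega_1$; the function name `esss` and the numbers in the example are hypothetical.

```python
def esss(n_p, weights, means, variances):
    """Effective supplemental sample size induced by a set of model weights.

    n_p       : sample size of the primary source
    weights   : posterior model weights omega_k, with index 0 = Omega_1
    means     : model-specific posterior means of the treatment effect
    variances : model-specific posterior variances of the treatment effect
    """
    # Reference precision: no-borrowing model Omega_1
    p_reference = 1.0 / variances[0]
    # Joint precision: inverse of the mixture variance
    mix_mean = sum(w * m for w, m in zip(weights, means))
    mix_second = sum(w * (v + m * m) for w, m, v in zip(weights, means, variances))
    p_joint = 1.0 / (mix_second - mix_mean ** 2)
    return n_p * (p_joint / p_reference - 1.0)


# If borrowing triples the precision of a 51-subject trial, the ESSS is 102
print(esss(51, [0.0, 1.0], [1.0, 1.0], [0.3, 0.1]))
```

When all weight sits on $\Omega_1$ the two precisions coincide and the ESSS is zero. The value can also be negative when the component means disagree strongly enough that the mixture is less precise than the reference model.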

3. Capping Priors

In this section, we first consider the simplified scenario for which there is only one supplementary source and thus only one prior inclusion probability, p1, of interest. Note that π(Ω1) = 1 – p1 and π(Ω2) = p1, which implies that the prior weights for each model can be written in terms of p1. Given the direct connection between the prior model weights and the ESSS, we can derive the value of p1 such that ESSS is less than some pre-specified threshold, ESSSt. However, ESSS is also a function of the data, particularly the elements in yp, which are unobserved a priori. Hence, we must derive p1 such that the ESSS is less than ESSSt for all yp. We start by discussing our approach to deriving p1 assuming that all data are known and then discuss how our approach can be relaxed in the next section.

Assuming all data are known, we can conduct a grid search by iterating across a fine grid of possible values over the support of $p_1$, considering values from smallest to largest. At each iteration, we compute the induced ESSS. Then, we flag the largest value for $p_1$ such that the corresponding induced ESSS for that value and for all considered $p_1$ values before it are not greater than $ESSS_t$, and we choose it to be $p_1$. Finally, we assign the capping priors as $\pi(\Omega_1) = 1 - p_1$ and $\pi(\Omega_2) = p_1$.
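The grid search described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `esss_fn` is a hypothetical callable that returns the induced ESSS for a candidate inclusion probability, and the grid is taken to cover the closed interval from 0 to 1.

```python
def capping_prior_grid_search(esss_fn, esss_target, step=0.001):
    """Largest grid value p such that ESSS stays at or below the target
    for p and for every smaller grid value.

    esss_fn     : callable mapping a candidate p in [0, 1] to the induced ESSS
    esss_target : the desired threshold ESSS_t
    step        : grid spacing (assumed to divide 1 evenly)
    """
    n = round(1.0 / step)
    p_star = None
    for i in range(n + 1):
        p = i / n  # integer-indexed grid avoids accumulated floating-point error
        if esss_fn(p) > esss_target:
            break  # stop at the first violation, scanning smallest to largest
        p_star = p
    return p_star  # None if even the smallest grid value exceeds the target


# Toy ESSS curve that grows linearly from 0 to 900 as p goes from 0 to 1
print(capping_prior_grid_search(lambda p: 900.0 * p, 102.0))
```

Stopping at the first violation enforces the requirement that every smaller grid value also respects the cap, which matters if the induced ESSS is not monotone in the inclusion probability.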

3.1. Estimating Responses from Primary Source

In a Bayesian data analysis, the prior cannot be a function of the outcome data. Thus, instead of using the primary source responses, $y_p$, we must use a proxy estimate, $y_p^*$, and use $y^* = (y_p^*, y_1)^T$ as a proxy for $y$ in our computations of the capping prior above. We wish to choose $y^*$ such that the true induced ESSS (based on $y$) will be at most that which we would obtain with $y^*$. Because the ESSS is maximized when the primary data have the same conditional mean as the supplementary data, we wish to generate $y_p^*$ from the conditional mean of the supplemental data.

To find this conditional mean, we first find an estimate, $\hat{\beta}^{(1)}$, of the vector of regression coefficients, $\beta^{(1)} = (\beta_{0,1}, \beta_{Trt,1}, \beta_{1,1}, \ldots, \beta_{c,1})^T$. That is, we simply regress the responses $y_1$ from the one supplementary source on a design matrix, $X^{(1)}$, containing information on the intercept, treatment assignment, and covariate values of the supplementary source. We then let $y_p^* = X^{(P)} \hat{\beta}^{(1)}$, where $X^{(P)}$ is a design matrix containing information on the intercept, treatment assignment, and covariate values in the primary source.

3.2. Multiple Supplementary Sources

If $H > 1$, we must find the prior inclusion probability for each supplementary source one by one, in sequence. Consider source $h$, where $h = 1,2,\ldots,H$. To find $p_h$, we first regress the vector of supplementary source responses $y_h$ on a design matrix $X^{(h)}$ containing information from the $h$th supplementary source on the intercept, treatment assignment, and covariate values to obtain a vector of estimated regression coefficients, $\hat{\beta}^{(h)} = (\hat{\beta}_{0,h}, \hat{\beta}_{Trt,h}, \hat{\beta}_{1,h}, \ldots, \hat{\beta}_{c,h})^T$. Next, we let $y^*_{p,h} = X^{(P)} \hat{\beta}^{(h)}$ and $y^*_{a,h} = X^{(a)} \hat{\beta}^{(h)}$ for $a \in \{1,2,\ldots,H\} \setminus \{h\}$, where $X^{(P)}$ is a design matrix containing information on the intercept, treatment assignment, and covariate values of the primary source and $X^{(a)}$ is a design matrix containing information on the intercept, treatment assignment, and covariate values of the $a$th supplementary source. We then use $y^{*,h} = (y^*_{p,h}, y^*_{1,h}, \ldots, y_h, \ldots, y^*_{H,h})^T$ as a proxy for $y$ while searching for $p_h$. When there are multiple supplementary sources, we typically do not have a priori reason to believe that any source is more likely to be exchangeable with the primary source than the rest. Thus, we assume for simplicity that all prior inclusion probabilities are equal, i.e., that $p_1^{(h)} = p_2^{(h)} = \cdots = p_H^{(h)}$. Then, we can complete a grid search, as described above, setting all prior inclusion probabilities equal to $\tilde{p}_h^{(h)}$, and we let $p_h = \tilde{p}_h^{(h)}$. After repeating the above process for all supplementary sources, we will have found all of the prior inclusion probabilities $p_h$ for $h = 1,2,\ldots,H$. Note that $p_1 = p_2 = \cdots = p_H$ will be true by nature of our algorithm, if the assumptions of normality and homogeneity within each source are met. Finally, we compute the prior model weights using Equation 3.

3.3. Estimating Variance-Covariance Matrix

To compute the capping priors above, we assumed that the variance-covariance matrix $\Sigma$ is fixed and known. In practice, however, this quantity is typically unknown and must be accounted for before we can solve for the capping prior. Ideally, we would specify a prior on $\Sigma$ and then derive the capping priors after integrating out $\Sigma$. However, this approach is not analytically tractable. Thus, we propose the use of a plug-in estimator, $\hat{\Sigma} = \mathrm{diag}(\hat{\sigma}_p^2, \hat{\sigma}_1^2, \ldots, \hat{\sigma}_H^2)$, for $\Sigma$, which we obtain by fitting separate models for each source and setting $\hat{\sigma}_a^2$ equal to the residual variance for model $a$. While this approach is not fully Bayesian, our simulation study illustrates that this simpler estimation technique is adequate for meeting our objectives.

4. Simulation Study

We completed a thorough simulation study to evaluate whether capping priors effectively control borrowing via MEMs. For our simulation study, we consider scenarios for which there are 1, 2, or 3 supplementary sources. For all scenarios, we consider one covariate (so that $c = 1$), with values originating from a standard normal distribution. We also fix the primary cohort’s sample size to be $n_p = 50$, the primary cohort’s variance to be $\sigma_p^2 = 50$, and the total number of supplementary observations to be $\sum_{h=1}^{H} n_h = 900$. In all scenarios, we vary the true treatment effect in the primary source between 1 and 3 in increments of 0.5, and we consider six scenarios where we set $ESSS_t$ at equally spaced values between 0 and 900.

For the different scenarios, we consider when the supplementary sources have equal treatment effects (with $\Delta_h = 2$ for $h = 1,\ldots,H$) or unequal treatment effects (with $\Delta_h = 2^h$ for $h = 1,\ldots,H$). We also consider when the supplementary sources have equal sample sizes (with $n_h = 900/H$ for $h = 1,\ldots,H$) or unequal sample sizes (with $n_h = 900h / \sum_{j=1}^{H} j$ for $h = 1,\ldots,H$). Finally, we also consider scenarios where the primary and supplementary sources all have equal variance (i.e., $\sigma_p^2 = \sigma_1^2 = \cdots = \sigma_H^2 = 50$), as well as scenarios where the variances in the supplemental sources vary. The sets of parameters specific to each scenario are enumerated in Table 1.
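The scenario parameters can be reproduced directly from these rules, with unequal effects doubling from one source to the next so that they match the values 2, 4, 8 listed in Table 1. The helper names below are hypothetical and serve only to check the arithmetic.

```python
def supplementary_sample_sizes(H, equal, total=900):
    """Sample sizes n_1, ..., n_H for the supplementary sources."""
    if equal:
        return [total // H] * H
    # Unequal case: n_h is proportional to h, scaled so the sizes sum to total
    denom = sum(range(1, H + 1))
    return [total * h // denom for h in range(1, H + 1)]


def supplementary_effects(H, equal):
    """True treatment effects Delta_1, ..., Delta_H for the supplementary sources."""
    return [2] * H if equal else [2 ** h for h in range(1, H + 1)]


# Scenario 18 in Table 1: three sources, unequal effects and sample sizes
print(supplementary_effects(3, equal=False))
print(supplementary_sample_sizes(3, equal=False))
```

Integer division is exact here because 900 is divisible by every denominator that arises for one, two, or three sources.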

Table 1:

Scenarios considered in simulation study. We vary the number $H$ of supplementary sources, as well as whether the treatment effects $\Delta_1, \Delta_2, \ldots, \Delta_H$ in the supplementary sources are equal, whether the sample sizes $n_1, n_2, \ldots, n_H$ in the supplementary sources are equal, and whether the variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_H^2$ in the supplementary sources are equal.

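| Scenario | $H$ | Equal $\Delta_h$ | Equal $n_h$ | Equal $\sigma_h^2$ | $\Delta_1, \Delta_2, \ldots, \Delta_H$ | $n_1, n_2, \ldots, n_H$ | $\sigma_1^2, \sigma_2^2, \ldots, \sigma_H^2$ |
|---|---|---|---|---|---|---|---|
| 1 | 1 | T | T | T | 2 | 900 | 50 |
| 2 | 1 | T | T | F | 2 | 900 | 40 |
| 3 | 2 | T | T | T | 2, 2 | 450, 450 | 50, 50 |
| 4 | 2 | T | T | F | 2, 2 | 450, 450 | 40, 60 |
| 5 | 2 | T | F | T | 2, 2 | 300, 600 | 50, 50 |
| 6 | 2 | T | F | F | 2, 2 | 300, 600 | 40, 60 |
| 7 | 2 | F | T | T | 2, 4 | 450, 450 | 50, 50 |
| 8 | 2 | F | T | F | 2, 4 | 450, 450 | 40, 60 |
| 9 | 2 | F | F | T | 2, 4 | 300, 600 | 50, 50 |
| 10 | 2 | F | F | F | 2, 4 | 300, 600 | 40, 60 |
| 11 | 3 | T | T | T | 2, 2, 2 | 300, 300, 300 | 50, 50, 50 |
| 12 | 3 | T | T | F | 2, 2, 2 | 300, 300, 300 | 40, 50, 60 |
| 13 | 3 | T | F | T | 2, 2, 2 | 150, 300, 450 | 50, 50, 50 |
| 14 | 3 | T | F | F | 2, 2, 2 | 150, 300, 450 | 40, 50, 60 |
| 15 | 3 | F | T | T | 2, 4, 8 | 300, 300, 300 | 50, 50, 50 |
| 16 | 3 | F | T | F | 2, 4, 8 | 300, 300, 300 | 40, 50, 60 |
| 17 | 3 | F | F | T | 2, 4, 8 | 150, 300, 450 | 50, 50, 50 |
| 18 | 3 | F | F | F | 2, 4, 8 | 150, 300, 450 | 40, 50, 60 |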

To compose the vector of true regression coefficients, $\beta$, for each scenario, we let the element of $\beta$ corresponding to the primary source’s treatment effect be $\beta_{Trt} = \Delta_p$, the element corresponding to the $h$th source’s indicator be $\beta_{0,h} = \Delta_h$, and the element corresponding to the interaction between the $h$th source indicator and primary source treatment effect be $\beta_{Trt,h} = \Delta_h - \Delta_p$. All other elements of $\beta$ are set at the arbitrarily chosen value of 1.

Data are generated by first simulating covariate values from a standard normal distribution. The outcome vector, $y$, is then simulated from a normal distribution with parameters $\mu = X\beta$ and $\Sigma = \mathrm{diag}(\sigma_p^2, \sigma_1^2, \ldots, \sigma_H^2)$. A grid size of 0.001 was used for determining the capping priors, and 1000 simulated data sets were considered for each scenario.
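A minimal sketch of this data-generating process for a single source with one covariate is given below. The coefficient values follow the description above, with every unspecified coefficient set to 1. The alternating 1:1 treatment allocation, the function names, and the fixed random seed are assumptions made for illustration, since the allocation scheme is not stated in the text.

```python
import random


def mean_response(x_trt, x1, delta_p, delta_h=None):
    """True mean for one subject with a single covariate (c = 1).

    delta_h=None denotes a primary-source subject; otherwise the subject
    belongs to a supplementary source whose treatment effect is delta_h.
    """
    # Intercept 1, primary treatment effect delta_p, covariate coefficient 1
    mu = 1.0 + delta_p * x_trt + 1.0 * x1
    if delta_h is not None:
        # Source indicator delta_h, source-by-treatment interaction
        # delta_h - delta_p, source-by-covariate interaction 1
        mu += delta_h + (delta_h - delta_p) * x_trt + 1.0 * x1
    return mu


def simulate_source(n, sigma2, delta_p, delta_h=None, rng=None):
    """Simulate (y, x_trt, x1) triples for one source."""
    rng = rng or random.Random(0)
    data = []
    for i in range(n):
        x_trt = i % 2            # ASSUMPTION: alternating 1:1 allocation
        x1 = rng.gauss(0.0, 1.0)  # standard normal covariate
        y = rng.gauss(mean_response(x_trt, x1, delta_p, delta_h), sigma2 ** 0.5)
        data.append((y, x_trt, x1))
    return data


primary = simulate_source(50, 50.0, delta_p=2.0)
print(len(primary))
```

Under this parameterization the difference in mean response between arms is $\Delta_p$ in the primary source and $\Delta_h$ in supplementary source $h$, as intended.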

We assess the performance of capping priors by evaluating the extent of borrowing for each simulated data set, as measured by the ESSS. First, we check that our algorithm produces ESSS values of at most each of the desired ESSSt thresholds for all simulations. Next, we verify that a lower ESSSt corresponds to a lower mean and maximum ESSS, across all simulations. Finally, we verify that our algorithm borrows more information (i.e., has a higher induced ESSS) in situations for which the data are truly exchangeable—that is, when the true treatment effect, ΔP, of the primary source is relatively close to the true treatment effects, Δh for h = 1,2,3, of the supplementary sources.

4.1. Results

For each scenario in Table 1, we visualize our results with three plots, which are shown in Figures 1 - 4. The first of each set of plots shows ESSSt versus the maximum induced ESSS among all simulations for various possible values of Δp. From these plots, we note that the maximum induced ESSS across all simulations is always less than or equal to ESSSt, which indicates that the capping priors have successfully limited the amount of borrowing to be at most the desired threshold. We also note that a lower ESSSt value corresponds to a lower induced ESSS, as desired.

Figure 1:

Simulation results for Scenarios 1 – 5. Plots of (1) target ESSS threshold ESSSt versus maximum induced ESSS among all simulations for various primary source treatment effects Δp, (2) target ESSS threshold ESSSt versus mean induced ESSS among all simulations for various primary source treatment effects Δp, (3) treatment effect Δp in primary source versus mean induced ESSS among all simulations for various target ESSS thresholds ESSSt, and (4) target ESSS threshold ESSSt versus mean prior inclusion probabilities ph for h = 1,2,…,H.

Figure 4:

Simulation results for Scenarios 16 – 18. Plots of (1) target ESSS threshold ESSSt versus maximum induced ESSS among all simulations for various primary source treatment effects Δp, (2) target ESSS threshold ESSSt versus mean induced ESSS among all simulations for various primary source treatment effects Δp, (3) treatment effect Δp in primary source versus mean induced ESSS among all simulations for various target ESSS thresholds ESSSt, and (4) target ESSS threshold ESSSt versus mean prior inclusion probabilities ph for h = 1,2,…,H.

In the second of each set of plots, we have ESSSt versus the mean induced ESSS among all simulations for various possible values of Δp. We again note that lower ESSSt values correspond to lower induced ESSS.

The last of each set shows a plot of the treatment effect Δp in the primary source against the mean induced ESSS among all simulations for various possible threshold values, ESSSt. In plots for which the supplementary cohorts’ treatment effects are equal (at Δh = 2 for h = 1,…,H), the average induced ESSS is higher when Δp is closer to the supplementary sources’ shared treatment effect. This illustrates the desired performance of MEMs, which should borrow more information when the treatment effects of the primary and supplementary sources are similar and ignore the supplementary data when there is heterogeneity between the primary and supplemental data set.

These findings demonstrate that the use of capping priors for MEMs is effective at capping the amount of overall supplementary information borrowed in the analysis at the desired ESSSt threshold and that a lower ESSSt value corresponds to a lower induced ESSS. Moreover, our results suggest that our method encourages more borrowing when the treatment effects in the supplementary sources are truly similar to the primary source.

5. Application to CENIC Project

We illustrate the use of capping priors through an application to four RCTs from the Center for the Evaluation of Nicotine in Cigarettes (CENIC), which examined the impact of nicotine reduction on smoking behavior. The last trial, titled Very Low Nicotine Cigarettes in Smokers With Schizophrenia (PS), was a double-blind randomized trial that collected data from 2014-2017 on smokers with serious mental illnesses, such as schizophrenia, schizoaffective disorder, and bipolar disorder (17). Subjects were randomized to receive either normal nicotine content (NNC) cigarettes with 15.8 mg nicotine/g tobacco or very low nicotine content (VLNC) cigarettes with 0.4 mg nicotine/g tobacco. The primary outcome of interest was the total number of cigarettes smoked per day (CPD) at the end of a 6-week period. The primary analysis used linear regression to compare the primary outcome between treatment groups, adjusting for baseline CPD as a covariate. Fifty-eight participants were randomized, and data are available for 51 subjects at study completion.

In addition to PS, CENIC completed three previous trials of VLNC cigarettes that could be used as supplemental data. The first trial, called Project 1, Study 1 (P1S1), explored the effect of VLNC cigarettes compared to NNC cigarettes with respect to tobacco dependence, measured by the total CPD and other variables (14). The second trial, called Project 1, Study 2 (P1S2), examined the impact of VLNC cigarettes with and without transdermal nicotine replacement therapy (i.e., nicotine patch) on cigarette smoking, measured through the same variables as in P1S1 (16). The third trial, called Project 2 (P2), evaluated differences in biomarkers of tobacco exposure and smoking patterns among three participant groups: those given NNC cigarettes, those given VLNC cigarettes, and those given NNC cigarettes at the beginning of the study who then received cigarettes with gradually reduced nicotine content until receiving VLNC cigarettes for the last month of the intervention (15). In total, these trials contained information on 2,329 subjects, 1,111 of whom were in treatment arms that administered NNC or VLNC cigarettes and, thus, were similar to the arms in PS (14,15,24). After removing two subjects’ data for having infeasible measurements of over 180 CPD, there were 978 complete case observations, with information on both baseline and total CPD at the end of 6 weeks (for P1S1 and P1S2) or 8 weeks (for P2, since data on CPD at 6 weeks are unavailable for P2). Table 2 shows a detailed breakdown of these sample sizes, and Figure 5 has boxplots for the distributions of baseline CPD and ending total CPD in each source.

Table 2:

Summary of CENIC data by trial. Information on each trial’s abbreviation, defining characteristics of treatment arms, number of subjects randomized to each arm, and number of subjects in each arm used in our analysis. Subjects in all arms were given cigarettes with constant nicotine content throughout the study duration, except for those in arms labeled “gradual” (for which the cigarettes’ nicotine content was gradually decreased during the study). “NRT” denotes nicotine-replacement therapy.

Abbreviation | Arms | Randomized | Used in Analysis
PS | 15.8 mg/g (NNC) | 28 | 25
PS | 0.4 mg/g (VLNC) | 30 | 26
P1S1 | Usual cigarette brand | 118 | 0
P1S1 | 5.2 mg/g tobacco filler | 122 | 0
P1S1 | 2.4 mg/g tobacco filler | 119 | 0
P1S1 | 1.3 mg/g tobacco filler | 119 | 0
P1S1 | 0.4 mg/g tobacco filler with 13 mg tar | 123 | 0
P1S1 | 15.8 mg/g tobacco filler (NNC) | 119 | 110
P1S1 | 0.4 mg/g tobacco filler (VLNC) | 119 | 109
P1S2 | 15.8 mg/g (NNC) and NRT | 59 | 0
P1S2 | 0.4 mg/g tobacco filler (VLNC) and NRT | 60 | 0
P1S2 | 15.8 mg/g tobacco filler (NNC) | 61 | 54
P1S2 | 0.4 mg/g tobacco filler (VLNC) | 60 | 49
P2 | 15.8 to 0.4 mg/g, gradual | 498 | 0
P2 | 15.8 mg/g tobacco filler (NNC) | 249 | 234
P2 | 0.4 mg/g tobacco filler (VLNC) | 503 | 422
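
The sample sizes quoted in the text (2,329 supplementary subjects, 1,111 of them in NNC or VLNC arms, and 978 used in the analysis) can be recovered from Table 2. The following is a minimal bookkeeping sketch; the shortened arm labels and the variable names are ours, introduced only for illustration.

```python
# Rows transcribed from Table 2: (trial, arm label, randomized, used in analysis).
# Arm labels are shortened here; "NNC" and "VLNC" mark the arms comparable to PS.
ROWS = [
    ("PS", "NNC", 28, 25),
    ("PS", "VLNC", 30, 26),
    ("P1S1", "usual brand", 118, 0),
    ("P1S1", "5.2 mg/g", 122, 0),
    ("P1S1", "2.4 mg/g", 119, 0),
    ("P1S1", "1.3 mg/g", 119, 0),
    ("P1S1", "0.4 mg/g, 13 mg tar", 123, 0),
    ("P1S1", "NNC", 119, 110),
    ("P1S1", "VLNC", 119, 109),
    ("P1S2", "NNC + NRT", 59, 0),
    ("P1S2", "VLNC + NRT", 60, 0),
    ("P1S2", "NNC", 61, 54),
    ("P1S2", "VLNC", 60, 49),
    ("P2", "gradual", 498, 0),
    ("P2", "NNC", 249, 234),
    ("P2", "VLNC", 503, 422),
]
COMPARABLE_ARMS = {"NNC", "VLNC"}

# Split the primary trial (PS) from the three supplementary trials.
primary = [row for row in ROWS if row[0] == "PS"]
supplementary = [row for row in ROWS if row[0] != "PS"]

supp_randomized = sum(row[2] for row in supplementary)
supp_comparable = sum(row[2] for row in supplementary if row[1] in COMPARABLE_ARMS)
supp_used = sum(row[3] for row in supplementary)
primary_randomized = sum(row[2] for row in primary)
primary_used = sum(row[3] for row in primary)

print(supp_randomized, supp_comparable, supp_used)
print(primary_randomized, primary_used)
```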

Figure 5:

Boxplots of data from CENIC project used in this analysis. There are separate boxplots for the primary source PS (labeled “p”) and each of the three supplementary sources P1S1, P1S2, and P2 (labeled “1”, “2”, and “3,” respectively).

As PS was much smaller than the other trials, we hope to gain precision by borrowing information from the other trials using MEMs. We note that borrowing all the information from the three substantially larger data sets could potentially obscure the most important information from the much smaller primary cohort. Thus, we set a desired ESSS threshold of ESSSt = 102, which is twice the number of subjects in the primary cohort. We choose a grid size of 0.001 for finding p1.

For comparison, we conduct two additional analyses. The first analysis uses flat prior inclusion probabilities of ph = 0.5 for h = 1,2,3, which lead to equal priors π(Ωk) = 0.125 for all MEMs k = 1,2,…,8. In this analysis, we anticipate a large ESSS, potentially much larger than the primary source’s sample size. The second analysis is a standard analysis of the PS data that ignores the supplemental information. Using only data from the primary cohort, we fit a linear model regressing the total ending CPD on treatment assignment and baseline CPD. We expect this method’s point estimate for βTrt to be similar to the one obtained using capping priors and MEMs, but with larger uncertainty surrounding the estimate.
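
To make the flat-prior arithmetic concrete, the sketch below enumerates the 2^3 = 8 MEMs for three supplementary sources and computes each model's prior. It assumes that source inclusion indicators are independent a priori, so that a model's prior is the product of ph for included sources and 1 − ph for excluded ones; the function name `mem_priors` is hypothetical and not taken from any package.

```python
from itertools import product


def mem_priors(inclusion_probs):
    """Prior probability of every MEM, assuming independent source inclusion.

    Each MEM is a tuple of 0/1 indicators, one per supplementary source,
    where 1 means the source is treated as exchangeable with the primary.
    """
    priors = {}
    for config in product((0, 1), repeat=len(inclusion_probs)):
        prob = 1.0
        for included, p in zip(config, inclusion_probs):
            # Multiply by p if the source is included, by (1 - p) otherwise.
            prob *= p if included else 1.0 - p
        priors[config] = prob
    return priors


# Flat prior inclusion probabilities for the three supplementary sources.
flat = mem_priors([0.5, 0.5, 0.5])
for config, prior in flat.items():
    print(config, prior)
```

With ph = 0.5 every configuration receives the same weight, 0.5^3 = 0.125. Under the same independence assumption, a common ph below 0.5 shifts prior mass toward the model that borrows from no source.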

5.1. Results

Based on models with only each of the secondary sources (P1S1, P1S2, and P2) separately, estimates of the supplementary-source-specific treatment effects are −6.120, −5.367, and −4.791, respectively. Using our algorithm, we compute the prior inclusion probabilities corresponding to the capping priors to be p1 = p2 = p3 = 0.20. Ultimately, we achieve an ESSS of 58, which is less than the pre-specified threshold of 102. We compute the overall average treatment effect estimate to be E(βTrt | D) = −4.776 and its variance to be Var(βTrt | D) = 1.984. Assuming a normal distribution, this finding corresponds to an entirely negative 95% confidence interval of (−7.537, −2.016), suggesting that the use of VLNC (as opposed to NNC) cigarettes is significantly associated with a reduction in the total number of CPD.
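
The interval above follows from the reported mean and variance under the stated normal approximation. A minimal sketch of that calculation is given here; the helper name `normal_interval` is ours, and the inputs are the rounded values quoted in the text, so the last digit may differ slightly from the published interval.

```python
from statistics import NormalDist


def normal_interval(mean, variance, level=0.95):
    """Two-sided interval mean +/- z * sd under a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level=0.95
    half_width = z * variance ** 0.5
    return mean - half_width, mean + half_width


# Mean and variance of the treatment effect reported for the capping-prior analysis.
lower, upper = normal_interval(-4.776, 1.984)
print(round(lower, 2), round(upper, 2))
```

Both endpoints fall below zero, which is the basis for describing the interval as entirely negative.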

In comparison, an analysis that uses uninformative priors leads to MEM-specific priors and posteriors that yield an ESSS of 797.0, which is over 15 times the sample size in the primary cohort of 51. The average treatment effect is estimated to be −5.063, with a variance of 0.2560 and a confidence interval of (−6.054, −4.071). While this estimation is very precise, the results are essentially an analysis of the supplemental data, thus undermining the purpose of conducting an RCT in the primary cohort.

Furthermore, an analysis using only data from the primary cohort yields an average treatment effect estimate of −4.014, variance of 4.434, and 95% confidence interval of (−12.70, 4.677). As expected, compared to the result using capping priors and MEMs, this point estimate is similar, but the variance is higher and the confidence interval is wider.

6. Discussion

In this paper, we described capping priors as a means for calibrating the amount of supplementary information that is dynamically borrowed using MEMs. Our simulation results suggest that our method allows MEMs to cap the amount of borrowed information at an a priori specified threshold. Furthermore, the use of capping priors preserves an important property of MEMs: more supplementary information is borrowed when the source-specific treatment effects are more similar between the primary and supplementary sources, and less supplementary information is borrowed otherwise.

An application to data from CENIC demonstrates the utility of our method. In Section 5, we used MEMs with capping priors to analyze the last in a series of four trials, while borrowing a proportion of the information from the first three trials. Ultimately, we conclude that, among smokers with serious mental illnesses, the use of very low nicotine content (instead of normal nicotine content) cigarettes is associated with a significant reduction in smoking, with an expected smoking reduction of 2.016 to 7.537 cigarettes (with mean 4.776) per day. These results align with the previous literature, which suggests that decreased nicotine content in cigarettes may encourage smokers (including those with mental health conditions, such as schizophrenia) to smoke less (25,26). Moreover, this analysis using capping priors yielded an ESSS of 58, which is smaller and much more reasonable than the ESSS of 797.0 obtained from an analysis using uninformative priors. Furthermore, compared to an analysis of the primary trial alone (i.e., with no data borrowing), our results using capping priors with MEMs have a similar point estimate and a 23.52% increase in precision.

In our applied example, we specified ESSSt to be twice the size of the primary trial and in our simulation study, we considered scenarios where ESSSt was 18 times the size of the primary trial. In all cases, capping priors were able to control the amount of borrowing, resulting in an ESSS less than the pre-specified threshold. ESSSt is an important parameter that must be pre-specified and that will likely have a substantial impact on trial operating characteristics (e.g., type-I error rate, power, etc.). In practice, we recommend an ESSSt of no more than 2 to 3 times the sample size of the primary trial, which could potentially result in a situation where 66% to 75% of statistical information could come from the supplemental data. Alternatively, ESSSt could be viewed as a tuning parameter, with simulation used to evaluate the impact of different values on trial operating characteristics, as has been proposed in other applications of MEMs (27). Additional research is needed to understand the impact of varying ESSSt on trial operating characteristics. Throughout this manuscript, we made the simplifying assumption of equal prior inclusion probabilities for all supplemental sources. This is justified because, in most cases, there will be little a priori information that would result in a preference for borrowing from one supplemental data source over the others. We previously considered the possibility of allowing the prior inclusion probabilities to vary by source but this resulted in poor performance because there is no unique solution for controlling the total ESSS. This indicates that, while unequal prior inclusion probabilities could be accommodated using the approach outlined in this paper, the investigators would be required to pre-specify the relationship between the prior inclusion probabilities across sources (e.g., specifying the prior inclusion probability for one source to be twice that of the other sources).
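
The 66% to 75% range quoted for an ESSSt of 2 to 3 times the primary sample size can be reproduced with a short calculation. The sketch below assumes that statistical information scales with sample size, so that the supplemental share is ESSS / (n + ESSS), where n is the primary sample size; the function name `supplemental_fraction` is ours.

```python
def supplemental_fraction(primary_n, esss):
    """Share of total effective sample size contributed by supplemental data.

    Assumes information is proportional to sample size, so the total
    effective sample size is primary_n + esss.
    """
    return esss / (primary_n + esss)


PRIMARY_N = 51  # subjects analyzed in the primary (PS) trial

# Worst case when the full cap is used, for caps of 2x and 3x the primary size.
fractions = {m: supplemental_fraction(PRIMARY_N, m * PRIMARY_N) for m in (2, 3)}

# Share actually borrowed in the applied example (ESSS of 58).
achieved = supplemental_fraction(PRIMARY_N, 58)

print(fractions, achieved)
```

Under this assumption, the multipliers 2 and 3 give shares of 2/3 and 3/4 regardless of the primary sample size, while the ESSS of 58 achieved in the CENIC application corresponds to just over half of the total.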

Nevertheless, this study has several limitations. Currently, our method relies on generalized least squares regression and thus assumes that the data from each source are normally distributed. This assumption may be unreasonable in many applications, for which the response variable may be bimodal or have some other non-Gaussian shape. However, assuming that the data are not normally distributed leads to a more complicated problem, where the posterior may no longer be available in closed form, meaning that another method (such as the Expectation-Maximization algorithm or numerical integration) might be required. We will consider extending capping priors to accommodate other data types in future research. Finally, a limitation of MEMs is that the number of sub-models grows quickly with the number of supplemental trials. In practice, we anticipate that the number of supplemental trials will be small (5 or fewer) and that capping priors would work well in most settings. Recently, iterated MEMs were proposed as a solution for applying MEMs with a large number of supplemental data sources (28). In principle, capping priors could be extended to iterated MEMs but further research is needed to understand how this would be implemented in practice.

Figure 2:

Simulation results for Scenarios 6 – 10. Plots of (1) target ESSS threshold ESSSt versus maximum induced ESSS among all simulations for various primary source treatment effects Δp, (2) target ESSS threshold ESSSt versus mean induced ESSS among all simulations for various primary source treatment effects Δp, (3) treatment effect Δp in primary source versus mean induced ESSS among all simulations for various target ESSS thresholds ESSSt, and (4) target ESSS threshold ESSSt versus mean prior inclusion probabilities ph for h = 1,2,…,H.

Figure 3:

Simulation results for Scenarios 11 – 15. Plots of (1) target ESSS threshold ESSSt versus maximum induced ESSS among all simulations for various primary source treatment effects Δp, (2) target ESSS threshold ESSSt versus mean induced ESSS among all simulations for various primary source treatment effects Δp, (3) treatment effect Δp in primary source versus mean induced ESSS among all simulations for various target ESSS thresholds ESSSt, and (4) target ESSS threshold ESSSt versus mean prior inclusion probabilities ph for h = 1,2,…,H.

Acknowledgements

This work was supported in part by the NSF Graduate Research Fellowship Program (to S. Ling) and by NIH grants R01-DA046320 and U54-DA031659 from the National Institute on Drug Abuse and FDA Center for Tobacco Products (CTP), and K01-HL151754 from the National Heart, Lung, and Blood Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or Food and Drug Administration Center for Tobacco Products.

Footnotes

Conflict of Interest: None declared.

References

1. Lim J, Walley R, Yuan J, Liu J, Dabral A, Best N, et al. Minimizing Patient Burden Through the Use of Historical Subject-Level Data in Innovative Confirmatory Clinical Trials: Review of Methods and Opportunities. Ther Innov Regul Sci. 2018 Sep 1;52(5):546–59.
2. Viele K, Berry S, Neuenschwander B, Amzal B, Chen F, Enas N, et al. Use of historical control data for assessing treatment effects in clinical trials. Pharm Stat. 2014 Feb;13(1):41–54.
3. Kaizer AM, Koopmeiners JS, Hobbs BP. Bayesian hierarchical modeling based on multisource exchangeability. Biostatistics. 2018;19(2):169–84.
4. Pocock SJ. The combination of randomized and historical controls in clinical trials. J Chronic Dis. 1976;29:175–188.
5. Chen M-H, Ibrahim JG. Power prior distributions for regression models. Stat Sci. 2000 Feb;15(1):46–60.
6. Neuenschwander B, Capkun-Niggli G, Branson M, Spiegelhalter DJ. Summarizing historical information on controls in clinical trials. Clin Trials Lond Engl. 2010 Feb;7(1):5–18.
7. Hobbs BP, Carlin BP, Mandrekar SJ, Sargent DJ. Hierarchical Commensurate and Power Prior Models for Adaptive Incorporation of Historical Information in Clinical Trials. Biometrics. 2011;67(3):1047–56.
8. Hobbs BP, Sargent DJ, Carlin BP. Commensurate Priors for Incorporating Historical Information in Clinical Trials Using General and Generalized Linear Models. Bayesian Anal. 2012;7(3):639–74.
9. Volinsky CT, Raftery AE, Madigan D, Hoeting JA. Bayesian model averaging: a tutorial. Stat Sci. 1999 Nov;14(4):382–417.
10. Raftery AE, Madigan D, Hoeting JA. Bayesian Model Averaging for Linear Regression Models. J Am Stat Assoc. 1997 Mar;92(437):179–91.
11. Raftery AE. Bayesian Model Selection in Social Research. Sociol Methodol. 1995;25:111.
12. Pocock SJ. Allocation of patients to treatment in clinical trials. Biometrics. 1979 Mar;35(1):183–97.
13. Torgerson DJ, Torgerson CJ. Unequal Randomisation. In: Torgerson DJ, Torgerson CJ, editors. Designing Randomised Trials in Health, Education and the Social Sciences: An Introduction [Internet]. London: Palgrave Macmillan UK; 2008 [cited 2021 Aug 30]. p. 108–13. Available from: 10.1057/9780230583993_10
14. Donny EC, Denlinger RL, Tidey JW, Koopmeiners JS, Benowitz NL, Vandrey RG, et al. Randomized Trial of Reduced-Nicotine Standards for Cigarettes. N Engl J Med. 2015 Oct;373(14):1340–9.
15. Hatsukami DK, Luo X, Jensen JA, Al’Absi M, Allen SS, Carmella SG, et al. Effect of immediate vs gradual reduction in nicotine content of cigarettes on biomarkers of smoke exposure a randomized clinical trial. JAMA. 2018;320(9):880–91.
16. Smith TT, Koopmeiners JS, Tessier KM, Davis EM, Conklin CA, Denlinger-Apte RL, et al. Randomized Trial of Low-Nicotine Cigarettes and Transdermal Nicotine. Am J Prev Med. 2019 Oct;57(4):515–24.
17. Tidey JW, Colby SM, Denlinger-Apte RL, Goodwin C, Cioe PA, Cassidy RN, et al. Effects of 6-Week Use of Very Low Nicotine Content Cigarettes in Smokers With Serious Mental Illness. Nicotine Tob Res Off J Soc Res Nicotine Tob. 2019 Dec 23;21(Suppl 1):S38–45.
18. Hobbs BP, Carlin BP, Sargent DJ. Adaptive adjustment of the randomization ratio using historical control data. Clin Trials. 2013;10:430–40.
19. Morita S, Thall PF, Müller P. Determining the effective sample size of a parametric prior. Biometrics. 2008;64:595–602.
20. Morita S, Thall PF, Müller P. Prior effective sample size in conditionally independent hierarchical models. Bayesian Anal. 2012;7:591–614.
21. Kotalik A, Vock DM, Donny EC, Hatsukami DK, Koopmeiners JS. Dynamic borrowing in the presence of treatment effect heterogeneity. Biostat Oxf Engl. 2020 Jan 24.
22. Burnham KP, Anderson DR. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach [Internet]. 2nd ed. New York: Springer-Verlag; 2002 [cited 2021 Aug 30]. Available from: https://www.springer.com/gp/book/9780387953649
23. Hobbs BP, Landin R. Bayesian basket trial design with exchangeability monitoring. Stat Med. 2018 Nov;37(25):3557–72.
24. Smith TT, Koopmeiners JS, Tessier KM, Davis EM, Conklin CA, Denlinger-Apte RL, et al. Randomized Trial of Low-Nicotine Cigarettes and Transdermal Nicotine. Am J Prev Med. 2019;57(4):515–24.
25. Higgins ST, Heil SH, Sigmon SC, Tidey JW, Gaalema DE, Hughes JR, et al. Addiction Potential of Cigarettes With Reduced Nicotine Content in Populations With Psychiatric Disorders and Other Vulnerabilities to Tobacco Addiction. JAMA Psychiatry. 2017 Oct 1;74(10):1056–64.
26. Tidey JW, Rohsenow DJ, Kaplan GB, Swift RM, Ahnallen CG. Separate and combined effects of very low nicotine cigarettes and nicotine replacement in smokers with schizophrenia and controls. Nicotine Tob Res Off J Soc Res Nicotine Tob. 2013 Jan;15(1):121–9.
27. Kaizer AM, Hobbs BP, Koopmeiners JS. A multi-source adaptive platform design for testing sequential combinatorial therapeutic strategies. Biometrics. 2018 Sep;74(3):1082–94.
28. Brown R, Fan Y, Das K, Wolfson J. Iterated multisource exchangeability models for individualized inference with an application to mobile sensor data. Biometrics. 2021 Jun;77(2):401–12.
