Published in final edited form as: J Biopharm Stat. 2024 Jun 7;35(4):550–564. doi: 10.1080/10543406.2024.2359149

Implementation of statistical features of a Bayesian two-armed responsive adaptive randomization trial with post hoc analysis of time trend drift

Elena Shergina a, Kimber P Richter b, Chuanwu Zhang a,c, Laura Mussulman d, Niaman Nazir b, Byron J Gajewski a,*

Abstract

Bayesian adaptive designs with response adaptive randomization (RAR) have the potential to benefit more participants in a clinical trial. While there are many papers that describe RAR designs and results, there is a scarcity of works reporting the details of RAR implementation from a statistical point of view. In this paper, we introduce the statistical methodology and implementation of the trial Changing the Default (CTD). CTD is a single-center prospective RAR comparative effectiveness trial to compare opt-in to opt-out tobacco treatment approaches for hospitalized patients. The design assumed an uninformative prior, a conservative initial allocation ratio, and a higher threshold for stopping for success to protect results from statistical bias. A particular emerging concern of RAR designs is the possibility that time trends will occur during the implementation of a trial. If there is a time trend and the analytic plan does not prespecify an appropriate model, this could lead to a biased trial. Adjustment for time trend was not pre-specified in CTD, but post hoc time-adjusted analysis showed no presence of influential drift. This trial was an example of a successful two-armed confirmatory trial with a Bayesian adaptive design using response adaptive randomization.

Keywords: drift analysis, comparative effectiveness trial, Bayesian adaptive designs

Background

The European Medicines Agency (EMA) issued a reflection paper in 2007 that recognized that adaptive designs have potential to increase a clinical trial’s efficiency in terms of time and required resources, without sacrificing scientific and regulatory standards (Committee for Medicinal Products for Human Use 2007). In 2019 the Food and Drug Administration (FDA) released guidance for the design, conduct, reporting, and submission process of an adaptive clinical trial, including Bayesian adaptive trials (Center for Drug Evaluation and Research, Center for Biologics Evaluation and Research 2019). The FDA recognizes four major potential advantages of an adaptive design: statistical efficiency, accommodation of ethical considerations, improved understanding of drug effects, and acceptability to stakeholders.

In the last two decades, numerous methodological papers have been published on developments in Bayesian adaptive designs, for example, variants of dose-finding methods, different randomization techniques, and solutions for master protocol trials (Berry et al. 2010; Meyer et al. 2020; Robertson et al. 2020; Giovagnoli 2021). In parallel with these methodological advances, a growing number of Bayesian adaptive designs have been adopted in clinical research (Barker et al. 2009; Kim et al. 2011; Carey and Winer 2016; Papadimitrakopoulou et al. 2016; Park et al. 2016; Rugo et al. 2016; Angus et al. 2020), featuring response adaptive randomization, incorporation or removal of treatment arms, early stopping, and platform trials. Many scientific papers published in medical journals include only a brief section devoted to the conduct of Bayesian trials, and few published reports provide practical details on the specifics of the statistical conduct of Bayesian adaptive trials. Gajewski et al. (2023) describe the specific execution of an RAR trial, but that trial had only a single endpoint and no post hoc time trend analysis.

In adaptive designs, predefined rules and accrued data guide adaptations throughout a trial. In the Bayesian paradigm, the prior distribution is updated at each predetermined step, and the accrued information determines the next planning decision according to prespecified rules. Various aspects of a trial undergo adaptations, including sampling, stopping for futility or efficacy, treatment arm inclusion or exclusion, and randomization rules. In Bayesian response adaptive randomization designs, the randomization is usually driven by one primary endpoint. As more information is gathered at each interim analysis, more participants are randomized to the stronger performing arm. There is a debate in the literature on the potential disadvantages of unequal allocation for participants (Hey and Kimmelman 2015; Thall 2015; Sim 2019). However, as argued by Robertson et al. (2020), Giovagnoli (2021), Villar et al. (2020), and Berry (2015), Bayesian adaptive design is still advantageous if used appropriately in a suitable setting. It provides benefits to participants in comparative effectiveness clinical trials (Connor, Elm, and Broglio 2013) and is considered well-suited for Patient-Centered Outcomes Research Institute (PCORI) pragmatic clinical trials (Mullins et al. 2014). Another obstacle to the use of response-adaptive randomization designs is possible inflation of the false positive error rate caused by an unknown time trend over the course of the trial (Karrison, Huo, and Chappell 2003; Proschan and Evans 2020). Among other reasons, drift occurs when the characteristics of subjects recruited at later stages diverge from those recruited at earlier stages of a trial. As explained by Lipsky and Greenland (2011), a change in background risk over time, combined with a change in the allocation ratio, can lead to a shift in the distributions between the arms. If such confounding is left unadjusted, large enough time trends can lead to an increase in the Type 1 error rate. “Bayesian Time Machine” models include an adjustment parameter for each time interval using hierarchical modelling (Saville et al. 2022). The variance of this parameter controls the amount of dynamic smoothing. With an appropriate level of flexibility, the estimated drift is driven by the observed data. Although different methods have been developed to account for time trends, the lack of reporting on their magnitude in practice has only recently been recognized (Robertson et al. 2020).

In this paper we discuss statistical aspects of Changing the Default (CTD), a confirmatory single-center prospective randomized comparative effectiveness trial designed to determine the effectiveness of offering smoking cessation assistance to all patients who smoke unless they decline it (opt-out) versus only to those who say they are ready to quit (opt-in, the standard approach to tobacco treatment) (Faseru et al. 2017). Tobacco cessation-oriented pharmacotherapy and counseling lead to higher rates of successful quitting (Gonzales et al. 2006; Fiore MC 2008). However, treatment utilization rates remain low (Browning et al. 2008; Freund et al. 2008; Jamal et al. 2012). CTD is a pioneering trial in its goal of testing the impact of changing a treatment default. We chose an innovative Bayesian adaptive design with response adaptive randomization with its greater benefit to participants in mind. We report the statistical methodology, execution, and analysis of the CTD trial and perform a post hoc drift analysis to evaluate the influence of a time trend over the course of the trial. The results of CTD are presented in Richter et al. (2023) but without the details of the implementation of the Bayesian RAR and how it looked at each of the interim analyses; we fill that gap in this paper.

Methods

Here we summarize the trial design, in its final form, as a Bayesian two-armed response adaptive randomization study. We conducted a population-based adaptive randomization trial to compare the effects of opt-out care (presumptively providing medications/counseling to all) versus opt-in care among 1,000 smokers (Faseru et al. 2017). To maximize participation, the study randomized and treated all eligible patients. All the trial’s participants were offered cessation medication and 4 counselling sessions. Consent was obtained 1 month post-randomization during a phone call with a counselor. Delayed consent resulted in successful enrollment of most participants who were initially randomized into the trial, including those not motivated to quit (Faseru et al. 2022). Only consenting participants (74.1% of all randomized) were included in interim and outcome analyses. Outside of the project director and the data coordinating center, all study staff and investigators, including the principal investigator, were blinded to all interim analyses and changes in randomization.

Primary co-Endpoints

Initially the primary endpoint for the trial was the percentage of participants who quit (biochemically confirmed 7-day point prevalence abstinence) at short term (1 month). At the sixth interim analysis, a 6-month co-endpoint was added to optimize the chance of seeing a signal at this important endpoint. This change took place before the trial had an opportunity to stop for success, and the response adaptive randomization (RAR) formula was unchanged. For the remaining interim analyses, the endpoints were the percentages of participants who quit at each time point, 1 month and 6 months post randomization. More details regarding the endpoints were prespecified in the published protocol (Faseru et al. 2017).

Subjects’ smoking status was verified based on a cotinine cut-off of no more than 10 ng/ml and a carbon monoxide (CO) cut-off of no more than 10 ppm. Cotinine was the preferred measure. CO testing was the method of verification for subjects who were taking nicotine replacement therapy or who refused to provide a saliva sample.

Treatment arms

The patients were randomized into one of the two treatment arms. We label the treatment arms by j, where j = 1 indicates the opt-in arm and j = 2 indicates the opt-out arm.

Statistical model

A Bayesian statistical model was used to make inferences in the trial. We assume biochemically confirmed 7-day point prevalence abstinence (S_{Qj}) follows a binomial distribution, i.e., S_{Qj} ~ Binom(n_j, θ_{Qj}), where n_j denotes the sample size of the jth arm and θ_{Qj} denotes the probability of a successful outcome (e.g., confirmed quitting smoking) at 1 month, with prior knowledge in the form of an “uninformative” prior distribution log(θ_{Qj}/(1 − θ_{Qj})) ~ N(0, 100²). An analogous separate model is used for the outcome at 6 months, where δ_{Qj} denotes the probability of a successful outcome.
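As an illustration only (not the trial’s production code), the sketch below approximates this posterior for a single arm in R using a simple grid approximation on the logit scale; the trial itself used MCMC, and the quit counts passed to this function later in the paper are illustrative assumptions.

# Minimal sketch (not the trial's code): posterior draws of one arm's quit probability
# under S ~ Binomial(n, theta) with logit(theta) ~ N(0, 100^2), via a grid approximation
# on the logit scale (the trial itself used MCMC).
posterior_draws <- function(successes, n, K = 10000) {
  grid  <- seq(-10, 10, length.out = 4000)                   # grid on logit(theta)
  theta <- plogis(grid)                                      # back-transform to probabilities
  log_post <- dbinom(successes, n, theta, log = TRUE) +
              dnorm(grid, mean = 0, sd = 100, log = TRUE)    # "uninformative" prior
  w <- exp(log_post - max(log_post))                         # unnormalized posterior weights
  plogis(sample(grid, size = K, replace = TRUE, prob = w))   # K approximate posterior draws
}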

Statistical Quantities

The posterior distribution of the model parameters was calculated using a Markov chain Monte Carlo (MCMC) algorithm. Across the k = 1, …, K draws from the posterior distribution, the proportion of draws in which a treatment arm has the higher quit rate is the posterior probability that the treatment arm is better. The posterior probability was calculated for each arm at 1 and 6 months per formulas (1) and (2) below:

P_j^{1m} = (1/K) Σ_{k=1}^{K} I(θ_{Qj}^{(k)} > θ_{Qj′}^{(k)}),  (1)
P_j^{6m} = (1/K) Σ_{k=1}^{K} I(δ_{Qj}^{(k)} > δ_{Qj′}^{(k)}),  (2)

where j′ denotes the other arm and the superscript (k) indexes the posterior draws.
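A short sketch of equations (1) and (2), continuing the grid-approximation example above; the counts are illustrative assumptions, not the trial’s observed data.

theta1 <- posterior_draws(successes = 23, n = 143)   # opt-in arm,  1-month (illustrative)
theta2 <- posterior_draws(successes = 34, n = 148)   # opt-out arm, 1-month (illustrative)
P1_1m <- mean(theta1 > theta2)   # equation (1): posterior probability opt-in has the higher quit rate
P2_1m <- mean(theta2 > theta1)   # posterior probability opt-out has the higher quit rate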

Trial adaptation and stopping criteria

The first interim analysis was conducted after 400 patients were randomized, the study’s “burn-in” period. During the burn-in period, the randomization scheme had a 1:1 ratio. The data set for the first interim analysis included participants who had consented to be enrolled and whose 1-month survey windows had closed. All analyses followed the intent-to-treat (ITT) principle, in which all participants were included in the arm to which they were originally randomized. After the study’s “burn-in” period, RAR was used for patient allocation (detailed in the following section). At each interim analysis, the posterior probability of having the higher quit rate was calculated for both arms and the maximum of the two was taken (denoted max(P_j^{1m}) and max(P_j^{6m})). The trial’s stopping criterion was that the maximum posterior probability at both the 1-month and 6-month endpoints was greater than 0.9925 (a threshold determined by simulations):

(max_j(P_j^{1m}) > 0.9925) AND (max_j(P_j^{6m}) > 0.9925).

The trial team planned to continue recruitment until all 1000 patients were randomized unless the best arm was identified after 500 patients had been randomized. Subsequent interims occurred every 13 weeks until randomization was stopped, either for early success or because 1000 patients had been randomized. A calendar-based interim schedule is advantageous because the study team can prepare for each interim analysis. If at any point of the trial after these first 500 patients the stopping criteria had been met, randomization would have been stopped and the remaining patients assigned to the more effective arm. Regardless of when the probability cutpoints had been reached, the finding would be confirmed with a subsequent analysis and evaluation (the >0.99 threshold was determined by simulations), which could be at either the 1-month or the 6-month endpoint, after all data from participants were obtained:

(max_j(P_j^{1m}) > 0.99) AND (max_j(P_j^{6m}) > 0.99).
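Continuing the same sketch, the dual-endpoint stopping check can be written as follows; the 6-month draws come from the analogous 6-month model, again with illustrative counts rather than trial data.

delta1 <- posterior_draws(successes = 30, n = 190)   # opt-in arm,  6-month (illustrative)
delta2 <- posterior_draws(successes = 55, n = 280)   # opt-out arm, 6-month (illustrative)
P1_6m <- mean(delta1 > delta2)
P2_6m <- mean(delta2 > delta1)
# Success is declared only if the best arm exceeds 0.9925 at BOTH endpoints.
stop_for_success <- max(P1_1m, P2_1m) > 0.9925 && max(P1_6m, P2_6m) > 0.9925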

Response adaptive randomization

RAR was first conducted after the 1st interim analysis. The best utility probability (i.e., the posterior probability that the treatment arm is better) was evaluated at each interim and impacted the randomization of patients newly enrolled after that interim analysis. The allocation of newly enrolled patients to the jth arm was proportional to

V_j = [ P_j^{1m} · Var(θ_{Qj}) / (n_j + 1) ]^{1/2},  (3)

where θ_{Qj} is the utility parameter (i.e., the smoking quit rate parameter) of the jth arm at 1 month only; the weight takes advantage of the information gained from our analyses up to that point. Simply weighting by the posterior probability carries a risk of unbalancing the sample in favour of the inferior treatment if the initial data are poor (Thall and Wathen 2007). The adjusted weight (3) takes into account an estimate of the information added by one more subject assigned to an arm. This additional information is estimated by the current variance of the posterior distribution and the number of subjects already allocated to the arm. The allocation probability for each arm was calculated as V_j/(V_1 + V_2), which randomized more newly enrolled patients to the more beneficial arm. The allocation probability was set to 0 for values less than 0.05. This RAR process was conducted at each subsequent interim analysis.
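The weight in equation (3) and the resulting allocation probabilities follow directly from the posterior draws, as sketched below (continuing the running example); renormalizing after zeroing probabilities below 0.05 is our assumption, since the protocol states only that such probabilities are set to 0.

n1 <- 143; n2 <- 148                           # current arm sizes (illustrative)
V1 <- sqrt(P1_1m * var(theta1) / (n1 + 1))     # equation (3) for the opt-in arm
V2 <- sqrt(P2_1m * var(theta2) / (n2 + 1))     # equation (3) for the opt-out arm
alloc <- c(V1, V2) / (V1 + V2)                 # allocation probabilities (opt-in, opt-out)
alloc[alloc < 0.05] <- 0                       # protocol rule: probabilities below 0.05 set to 0
alloc <- alloc / sum(alloc)                    # renormalize after zeroing (our assumption)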

Operating characteristics

It was projected that 6.7 patients would be randomized each week. The interim analyses were planned to be conducted every 13 weeks. Three scenarios describing different patient prognoses were set to evaluate the power, sample size, and duration of the trial. Under the assumed scenarios, 100,000 trials were simulated using the Fixed and Adaptive Clinical Trial Simulator (FACTS 6.4). Each simulated trial used 15,000 Markov chain Monte Carlo (MCMC) iterations with a burn-in of 5,000 iterations. Under the “expected” scenario it was estimated that the response rate would be 15.7% in the opt-in arm and 25.2% in the opt-out arm at the 1-month and 6-month endpoints. This scenario had 99% power, with 75% of trials stopping for early success and 24% for late success. One percent of the trials were inconclusive. The average sample size under this scenario was 789 participants, with more than half (546) in the better opt-out arm. The average length of these simulated trials was 145 weeks. Under the “expected/smaller” scenario quit rates were projected to be 15.7% in the opt-in arm and 25.2% in the opt-out arm at 1 month and 15.7% in the opt-in arm and 20.0% in the opt-out arm at 6 months. Under this scenario 68% of trials had early success and 23% had late success, providing 91% power. The trial lasted 167 weeks on average and required 947 participants, 696 of them allocated to the opt-out arm. The “no difference” scenario assumed 15.7% quit rates in both arms at both endpoints. No trials had early success and 5% of the trials had late success. The mean number of participants enrolled was 1000 with a 1:1 allocation ratio. The average trial duration was 175 weeks.

Re-calculated operating characteristics considering the observed dropout rate

Out of 1000 enrolled participants, 261 dropped out at 1 month and 351 had dropped out by 6 months. These participants were not included in the final analysis. The same three scenarios were re-run with the observed dropout rates. In the opt-in arm 21.7% of participants left the trial at 1 month and 11.9% at 6 months. In the opt-out arm 28.4% and 7.5% of participants dropped out at 1 month and 6 months, respectively. Under the “expected” scenario the estimated power was 82%, with 26% of the trials having early success and 56% having late success. It took 961 participants and 169 weeks on average. The “expected/smaller” scenario showed 6% early success trials and 54% late success trials, giving 60% power. The mean trial duration was 173 weeks with a mean of 990 subjects. Under the “no difference” scenario 6% of the trials had late success and none had early success. The trial lasted 175 weeks on average and required 1000 participants. A summary of the simulation results is given in Table 1.

Table 1.

Operating characteristics.

Description Scenario Percentage of successful trials (early success) Participants (in opt-out arm) Duration in weeks
No dropouts
Expected 15.7% vs 25.2% at month 1/ 15.7% vs 25.2% at month 6 99% (75%) 789 (546) 145
Expected/Smaller 15.7% vs 25.2% at month 1/ 15.7% vs 20.0% at month 6 91% (68%) 947 (696) 167
No difference 15.7% vs 15.7% at month 1/ 15.7% vs 15.7% at month 6 5% (0%) 1000 (500) 175
Accounting for dropouts
Expected 15.7% vs 25.2% at month 1/ 15.7% vs 25.2% at month 6 82% (26%) 961 (652) 169
Expected/Smaller 15.7% vs 25.2% at month 1/ 15.7% vs 20.0% at month 6 60% (6%) 990 (676) 173
No difference 15.7% vs 15.7% at month 1/ 15.7% vs 15.7% at month 6 6% (0%) 1000 (500) 175

Post-assessment of “drift” on the 1-month endpoint

Time trend drift is considered to be a major obstacle to the use of RAR. To estimate the degree to which a time trend might have affected trial results, we modelled a drift l_i in θ_{iQj}, counting backwards from the final analysis (i = 11) to the first interim (i = 1):

logit(θ_{iQj}) = γ_j + l_i,
γ_j ~ N(0, 100²),
l_i ~ N(l_{i+1}, τ²).

The hyperprior distribution of τ² is

τ² ~ IG(0.1, 0.001).

The model assumed the same drift in both arms. The “weakly informative” hyperprior of τ has a weight of 0.2, is centred around 0.1, and puts emphasis on the data while allowing dynamic smoothing (Saville et al. 2022). Drift estimation follows the observed probability of a successful outcome in each arm, starting from the assumed zero drift at the final analysis (l_11 = 0), which represents the most recent time point of the trial, and moving back to the beginning of the trial.
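A BUGS implementation of this drift model (the random-walk drift anchored at zero at the final analysis, with the IG(0.1, 0.001) hyperprior) is given in Appendix 1.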

Results

Trial execution

Our standard operating procedure (SOP) for interim analyses is shown in Figure 1. For the duration of each interim analysis, randomization was paused for no more than two weeks. The CTD data management team (DM) from the data coordinating center provided the study project director (PD) with an analytic file of proposed cases for the interim analysis. The PD confirmed the cases to be included, verified participants’ smoking status, excluded any patients who should not be included (for example, those who had refused consent), and returned the revised interim analysis dataset to the DM. Based on the smoking status of consented participants, a statistician checked the stopping criteria and produced a randomization table. Future patients were randomized in blocks. Block sizes differed to accommodate the allocation ratio.

Figure 1. Standard operating procedure for each interim analysis.

Results at interim analyses and final analysis

Figure 2 details the development of the trial through the 10 interim analyses and the final analysis. The study enrolled the first patient in September of 2016 and the last patient in June of 2020. The trial was conducted for 4 years and 2 months, including time for the last tested sample. Starting at the 4th interim analysis, the posterior probability of higher 1-month quit rates in the opt-out arm climbed above 0.9 and stayed above that level throughout the trial, never reaching the 0.9925 threshold. After the 4th interim, a noticeably larger number of subjects were assigned to the “winner” arm. There was a concern that not enough data would be collected to evaluate the long-term effect of the intervention. The team decided to introduce the 6-month co-endpoint into the protocol. It affected only the stopping criteria, conservatively, and was introduced before the trial had an opportunity to stop for success. This had the effect of making it harder to conclude the trial, as the stopping criteria required both the 1-month and 6-month posterior probabilities to be greater than 0.9925.

Figure 2. Key clinical trial metrics across 10 interim analyses and the final analysis.

After the first 500 patients were randomized, the stopping for success criteria could be applied. Even though the 6-month posterior probability climbed above 0.8 at the last two interims, it dropped to 0.591 at the final analysis (Richter et al. 2023). The data analysis for the 10th interim was performed during the first week of May 2020 and the final analysis was done in the last week of March 2021. Data from 208 and 325 participants in the opt-in and opt-out arms were available for the 10th interim analysis. There were 15.9% and 19.1% verified 6-month quitters in the opt-in and opt-out arms, respectively. Using the 10th interim analysis data we calculated posterior predictive probabilities for the future consenting participants abstaining in the opt-in and opt-out arms. The posterior predictive distributions (Figure 3) suggested that 3 out of 21 participants would be abstinent in the opt-in arm and 18 out of 95 participants would be abstinent in the opt-out arm. Instead, 8 patients in the opt-in arm quit, bringing the 6-month quit rate for the last cohort to 38.1%. There was only a 4.5% probability of observing this level of abstinence or higher in the opt-in arm. The opt-out arm had a slightly smaller than expected 16.8% 6-month quit rate, which corresponds to a 71.8% probability of having at least this many quitters. Conversely, the 1-month final analysis did not show any dramatic changes and the final posterior probability was 0.971 (Richter et al. 2023).

Figure 3. Posterior predictive distributions of 6-month abstinence rates for the last cohort using data from prior to the 10th interim analysis.
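As a sketch of how such posterior predictive probabilities can be obtained, the code below reuses the grid-approximation function from the Methods section; the quit counts are reconstructed approximately from the reported 15.9% and 19.1% rates, so this illustrates the mechanics rather than reproducing the published 4.5% and 71.8% figures exactly.

post_in  <- posterior_draws(successes = 33, n = 208)   # ~15.9% of 208 (reconstructed count)
post_out <- posterior_draws(successes = 62, n = 325)   # ~19.1% of 325 (reconstructed count)
pred_in  <- rbinom(length(post_in),  size = 21, prob = post_in)    # predicted quitters, opt-in
pred_out <- rbinom(length(post_out), size = 95, prob = post_out)   # predicted quitters, opt-out
mean(pred_in  >= 8)    # predictive probability of at least the 8 quitters observed in opt-in
mean(pred_out >= 16)   # 16.8% of 95 is about 16 observed quitters in opt-out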

Impact of RAR on Allocation and Outcomes

Allocation was calculated based on 1-month quit rates only. After the first interim analysis, and throughout the trial, the allocation ratio favoured the opt-out arm (Table 2). The variance of the probability of a successful 1-month outcome decreased throughout the trial as the number of enrolled participants increased. The posterior probability was the major contributor to the increased proportion of participants allocated to the better performing arm, but its effect was reduced by the increased number of participants in this arm. The proportion of subjects assigned to the opt-out arm increased from 60.1% after the 1st interim to 83% after the 5th interim. As the maximum posterior probability at the 6th and the 7th interim analyses dipped slightly, so did the RAR allocation to the opt-out arm. At the 8th interim analysis, the maximum posterior probability bounced back to the 5th interim level. Even though only 16.5% of newly enrolled participants were assigned to the opt-in arm after the last interim, the larger initial allocation to that arm kept the overall proportion of subjects in the opt-out arm at 65.5%.

Table 2.

Results of the RAR. The subscripts “1” and “2” represent opt-in and opt-out groups respectively. N is the number of subjects randomized and n is the number with observed outcomes (after dropout).

Interim N1 N2 n1 n2 (P_1^{1m})^{1/2} (P_2^{1m})^{1/2} Var(θ_{Q1})^{1/2} Var(θ_{Q2})^{1/2} V1 V2 RAR1 RAR2
1 201 199 143 148 0.558519 0.829492 0.030558 0.031574 0.001422 0.002146 0.399 0.601
2 225 237 175 174 0.384844 0.922982 0.026389 0.029257 0.000766 0.002041 0.273 0.727
3 245 281 198 207 0.381948 0.924184 0.024312 0.026226 0.000658 0.001681 0.281 0.719
4 266 332 217 244 0.240420 0.970669 0.022313 0.024284 0.000363 0.001506 0.194 0.806
5 282 396 231 277 0.195540 0.980696 0.021368 0.022846 0.000274 0.001344 0.170 0.830
6 295 455 220 273 0.232026 0.97271 0.023983 0.024322 0.000374 0.001429 0.208 0.792
7 309 506 229 314 0.294194 0.955746 0.023661 0.022432 0.000459 0.001208 0.275 0.725
8 325 555 238 351 0.195108 0.980782 0.023155 0.021621 0.000292 0.001130 0.205 0.795
9 335 598 253 390 0.132631 0.991165 0.022416 0.020780 0.000187 0.001042 0.152 0.848
10 342 639 259 422 0.137452 0.990508 0.022385 0.020053 0.000191 0.000966 0.165 0.835
Final 345 655 270 469
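As a check on the table’s notation, the reported V_j values can be reproduced from equation (3) using the tabulated square roots; for example, at the first interim:

V_1 = (P_1^{1m})^{1/2} · Var(θ_{Q1})^{1/2} / (n_1 + 1)^{1/2} = 0.558519 × 0.030558 / √144 ≈ 0.001422,
RAR_1 = V_1 / (V_1 + V_2) = 0.001422 / (0.001422 + 0.002146) ≈ 0.399.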

Comparison of trial results to operating characteristics

Simulation results showed that on average 67.8% of patients would be assigned to the better performing opt-out arm under the “expected” scenario, and 68.3% of patients would be in the opt-out arm under the “expected/smaller” scenario (Table 3). The difference is explained by the fact that a trial would run longer under the “expected/smaller” scenario, allowing more interims to happen. The final analysis resulted in 655 (65.5%) participants assigned to the opt-out arm (469 after dropouts). This ratio was lower than in the “expected” scenario simulations. The observed 1-month quit rates were slightly lower than under the “expected” scenario, affecting the maximum posterior probability and consequently the allocation ratio. In terms of the resulting posterior probabilities (Table 3), CTD followed the “expected/smaller” scenario more closely.

Table 3.

Allocation under different scenarios and trial enrollment rates. P_2^{1m} and P_2^{6m} denote the mean posterior probabilities of the opt-out arm having higher quit rates at 1 and 6 months, respectively.

Description Scenario n Percentage of successful trials based on co-endpoint P_2^{1m} P_2^{6m} Allocation to opt-out arm Excess enrollment in opt-out arm Dropouts
Expected 15.7% vs 25.2% at month 1/ 15.7% vs 25.2% at month 6 961 82% 0.964 0.958 0.678 343 315
Expected/Smaller 15.7% vs 25.2% at month 1/ 15.7% vs 20.0% at month 6 990 60% 0.963 0.779 0.683 362 325
No difference 15.7% vs 15.7% at month 1/ 15.7% vs 15.7% at month 6 1000 6% 0.353 0.432 0.457 −86 322
Trial Results N.A. 1000 N.A. 0.971 0.591 0.655 310 261

Post-hoc drift analysis

The model estimates that the drift was never large enough to be meaningfully influential (Figure 4, Figure 6). There is a slight divergence in the log-odds of the probability of observing a successful outcome at month 1. Appearing after the 5th interim, the negative drift indicates that there was a minor time trend present and that 1-month quit rates increased by 0.013 over the course of the trial. Almost all observed values were covered by the credible intervals in the opt-out case (Figure 5). The observed quit rates in the opt-in arm were only partially covered by the credible intervals, with values falling outside both the upper and lower limits. The fact that allocation started to favour the better performing opt-out arm after the 2nd interim might have influenced trial results for the 1-month outcome, but the estimated magnitude of the time trend does not seem to be a concern in this particular case.

Figure 4. Estimated drift l_i with 95% credible intervals at the 1-month analysis. The dashed grey lines indicate credible intervals. The dotted and dashed black lines indicate the estimated drift under alternative hyperpriors. The assumed zero l_11 at the final analysis is not included in the figure.

Figure 6. Estimated drift l_i with 95% credible intervals at the 6-month analysis. The dashed grey lines indicate credible intervals. The dotted and dashed black lines indicate the estimated drift under alternative hyperpriors. The assumed zero l_11 at the final analysis is not included in the figure.

Figure 5. Estimated 1-month θ_{Qj} with 95% credible intervals. The final analysis is shown as the 11th interim. The dashed lines indicate credible intervals. The dashed-dotted lines indicate observed abstinence.

There is an indication that 6-month quit rates were fairly stable throughout the trial and were not affected by the time trend (Figure 6). Similar to the 1-month outcome model, we estimated that the log-odds increased over the course of the trial. The fitted quit rates increased by no more than 0.08 from the first interim to the last (Figure 7). The 95% credible intervals covered the observed rates well up to the 9th interim in both arms. After the 9th interim, the abstinence rate in the opt-in arm unexpectedly started to climb. However, this was at a late stage of the trial (with 6.7% of the participants still to be randomized), so the change in the trend should not have drastically impacted the trial’s conclusion for the 6-month outcome.

Figure 7. Estimated 6-month θ_{Qj} with 95% credible intervals. The final analysis is shown as the 11th interim. The dashed lines indicate credible intervals. The dashed-dotted lines indicate observed abstinence.

Sensitivity analysis

The sensitivity analysis explored the effects of the choice of the prior on the smoothing parameter. We evaluated two assumptions of more aggressive (τ² ~ IG(1, 0.001)) and less aggressive (τ² ~ IG(1, 0.1)) smoothing. The dashed and dotted black lines in Figures 4 and 6 show the estimated drift under these two assumptions. At the 1-month analysis the chosen model represented a compromise between the two alternative models. The original model (τ² ~ IG(0.1, 0.001)) was closer in level of smoothing to the more aggressive model, but the 95% credible intervals still covered the estimated drift under the less aggressive model. At the 6-month analysis the model with more smoothing showed drift levelling out the sharp drop in quit rates at the final analysis. The less aggressive model put more emphasis on the data, and consequently we observed a jump in the estimated drift from the final analysis. The posterior predictive distributions indicated that the last drastic change in the opt-in arm was unexpected. Otherwise, the estimated drift of the less aggressive model was fairly stable. After the initial jump from 0 at the final analysis it followed the upper limit of the credible interval relatively closely. This particular case demonstrates why the chosen hyperprior provided a good fit to the data without sacrificing assessment of the general trend.

Discussion

Helping people quit smoking is a powerful tool to improve public health. As a majority of smokers are interested in quitting (Babb 2017), even a small “nudge” could help them achieve their goal. CTD essentially compared two effective treatment strategies in a ‘real-world’ pragmatic setting. A Bayesian adaptive design was suitable for this purpose. The uninformative prior, conservative initial allocation ratio, and higher threshold for stopping for success created a conservative setting that kept the possible downsides of the design to a minimum. This conscious decision proved to be justified. Up to the 6th interim analysis, the posterior probability of one arm showing abstinence rates higher than the other was climbing toward the predefined threshold for early stopping for success. Prior to 500 patients being enrolled, study investigators, who were blinded to interim analyses, feared the trial would stop. The 6-month outcome was introduced into the stopping criterion to make it stricter and to facilitate analyses of the long-term effects of opt-out care.

Initially, the protocol did not specify a dropout rate. During trial execution it became evident that dropout rates may have been affecting power. The post hoc assessment of operating characteristics showed that power came down from 99% to 82% under the “expected” scenario and from 91% to 60% under the “expected/smaller” scenario. The trial resulted in a 97.1% posterior probability of the opt-out arm being better at the 1-month analysis and 59.1% at the 6-month analysis. It is likely that the trial, in general, followed the “expected/smaller” scenario. The opt-out arm performed better than opt-in throughout all interims for the 1-month outcome. The 6-month posterior probability for opt-out was around 80% at the first 5 interims at which it was assessed but dropped once the last cohort’s results were included.

The potential presence of a time trend is one of the major arguments against RAR designs. Bayesians now address this risk through a prespecified time trend model (e.g., the “Bayesian Time Machine”). There was no reason to believe there was a drift in the presented case because no changes in trial features were introduced besides the co-endpoint. There was an increase in the treatment effect over time as more patients were simultaneously being randomized to the opt-out arm. The drift model separated out the treatment effect because it assumed that the drift impacted both arms on the log-odds scale with the same magnitude and did not allow a separate drift for each treatment. This paper showed an example of a single trial where, even though the analysis did not account for a time trend, post hoc analysis showed that the trend did not occur at a magnitude that would affect study outcomes. The fact that the initial 1:1 allocation accounted for 40% of patients minimized the possibility of overly skewed results and benefitted participants in the trial. As argued previously (Lipsky and Greenland 2011), time trends could be major concerns, but if managed properly the bias they bring can be minimized while at the same time maximizing benefit to participants.

While the short-term effect of the intervention was evident, the probability that opt-out was better than opt-in at 6 months post randomization was just over 0.5. As mentioned above, participants had 4 counselling sessions during the first months after hospitalization. Following unblinding of the study results, the CTD investigative team hypothesized that prolonged access to counselling could extend cessation. Future research could examine the impact of opt-out versus opt-in care by extended versus short-term treatment, converting CTD into essentially a multi-arm adaptive platform trial. Participants in an “opt-out boost” arm could be offered counselling sessions for 6 months, with quit rates assessed at month 1 and month 6. As shown by Saville et al. (2022), the “Bayesian Time Machine” model is helpful when comparing a newer treatment arm with a former control or treatment arm. Temporal drift is a major argument against pooled analysis in multi-arm platform trials. As we did not find the time trend in the original trial concerning, this novel approach offers a solution for the next trial that builds on the obtained results.

Conclusion

A cautiously planned Bayesian response adaptive randomization design served well for the purpose of comparing two effective treatment strategies. The trial followed the SOP closely and we encountered no operational difficulties that would prevent the team from adopting adaptive designs in future trials. The final analysis was not adjusted for time trend, but we found no evidence that patient drift was a concern for this trial. This would not be applicable to all situations, but we emphasize that the CTD trial was an example of a successful two-armed confirmatory trial with a Bayesian adaptive design.

Acknowledgments

This project is funded by the National Heart, Lung, and Blood Institute (R01HL131512, Richter-PI) and utilized Biostatistics and Informatics Shared Resource (BISR) and Clinical Pharmacology Shared Resource (CPSR) from the Comprehensive University of Kansas Cancer Center (KUCC, P30CA168524).

Appendix 1

# BUGS drift model, written as an R function for use with an R-to-BUGS interface.
model <- function() {
  for (i in 1:11) {                        # i indexes the 10 interim analyses and the final analysis
    for (j in first_pt_in[i]:last_pt_in[i]) {
      dYY1in[j] ~ dbin(P_in[i], 1)         # opt-in outcomes observed in interval i
    }
    P_in[i] <- exp(theta_in[i]) / (1 + exp(theta_in[i]))
    for (j in first_pt_out[i]:last_pt_out[i]) {
      dYY1out[j] ~ dbin(P_out[i], 1)       # opt-out outcomes observed in interval i
    }
    P_out[i] <- exp(theta_out[i]) / (1 + exp(theta_out[i]))
  } # end of i loop
  for (j in 1:11) {
    theta_in[j]  <- Theta[1] + ltheta[j]   # logit scale: arm effect plus common drift
    theta_out[j] <- Theta[2] + ltheta[j]
  }
  ltheta[11] <- 0                          # drift anchored at zero at the final analysis
  for (j in 1:10) {
    ltheta[11 - j] ~ dnorm(ltheta[11 - j + 1], invT)  # random walk backwards in time
  }
  invT ~ dgamma(0.1, 0.001)                # precision 1/tau^2, i.e., tau^2 ~ IG(0.1, 0.001)
  Theta[1] ~ dnorm(0, .0001)               # N(0, 100^2) priors on the arm effects
  Theta[2] ~ dnorm(0, .0001)
  p <- step(Theta[2] - Theta[1])           # posterior probability that opt-out is better
}

# The data list supplies dYY1in, dYY1out, first_pt_in, last_pt_in, first_pt_out, last_pt_out.
# bugs() is provided by R2WinBUGS (R2OpenBUGS also provides a bugs() interface); depending on
# the package, the model function may need to be written to file first with write.model().
out <- bugs(data, inits = NULL,
  parameters.to.save = c("ltheta", "p", "Theta", "invT", "theta_in", "theta_out"),
  model.file = model, n.chains = 1, n.iter = 102000,
  n.burnin = 60000, debug = TRUE)

Footnotes

Ethics approval and consent to participate

The study was approved by the University of Kansas Human Subjects Committee (IRB00006196; STUDY00001774). Consistent with the modified Zelen’s design, consent to participate is received at the 1-month follow-up.

Disclosure Statement

The authors declare they have no competing interests.

References

  1. Angus DC, Berry S, Lewis RJ, Al-Beidh F, Arabi Y, van Bentum-Puijk W, Bhimani Z, Bonten M, Broglio K, Brunkhorst F, Cheng AC, Chiche JD, De Jong M, Detry M, Goossens H, Gordon A, Green C, Higgins AM, Hullegie SJ, Kruger P, Lamontagne F, Litton E, Marshall J, McGlothlin A, McGuinness S, Mouncey P, Murthy S, Nichol A, O’Neill GK, Parke R, Parker J, Rohde G, Rowan K, Turner A, Young P, Derde L, McArthur C, and Webb SA (2020), “The REMAP-CAP (Randomized Embedded Multifactorial Adaptive Platform for Community-acquired Pneumonia) Study. Rationale and Design,” Ann Am Thorac Soc, 17 (7), 879–891. DOI: 10.1513/AnnalsATS.202003-192SD. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Babb S, Malarcher Ann, Schauer Gillian, Asman Kat, Jamal Ahmed (2017), “Quitting Smoking Among Adults — United States, 2000–2015,” MMWR Morb Mortal Wkly Rep, 65, 1457–1464. DOI: 10.15585/mmwr.mm6552a1. [DOI] [PubMed] [Google Scholar]
  3. Barker AD, Sigman CC, Kelloff GJ, Hylton NM, Berry DA, and Esserman LJ (2009), “I-SPY 2: an adaptive breast cancer trial design in the setting of neoadjuvant chemotherapy,” Clin Pharmacol Ther, 86 (1), 97–100. DOI: 10.1038/clpt.2009.68. [DOI] [PubMed] [Google Scholar]
  4. Berry DA (2015), “Commentary on Hey and Kimmelman,” Clinical Trials, 12 (2), 107–109. DOI: 10.1177/1740774515569011. [DOI] [PubMed] [Google Scholar]
  5. Berry SM, Carlin BP, Lee JJ, and Muller P (2010), Bayesian Adaptive Methods for Clinical Trials, Boca Raton, FL, USA: CRC Press. [Google Scholar]
  6. Browning KK, Ferketich AK, Salsberry PJ, and Wewers ME (2008), “Socioeconomic disparity in provider-delivered assistance to quit smoking,” Nicotine & tobacco research : official journal of the Society for Research on Nicotine and Tobacco, 10 (1), 55–61. DOI: 10.1080/14622200701704905. [DOI] [PubMed] [Google Scholar]
  7. Carey LA, and Winer EP (2016), “I-SPY 2 — Toward More Rapid Progress in Breast Cancer Treatment,” New England Journal of Medicine, 375 (1), 83–84. DOI: 10.1056/NEJMe1603691. [DOI] [PubMed] [Google Scholar]
  8. Center for Drug Evaluation and Research, Center for Biologics Evaluation and Research (2019), Adaptive Designs for Clinical Trials of Drugs and Biologics: Guidance for Industry, Silver Spring, MD, USA: Food and Drug Administration, U.S. Department of Health and Human Services. [Google Scholar]
  9. Committee for Medicinal Products for Human Use (2007), “Reflection paper on methodological issues in confirmatory clinical trials planned with an adaptive design,” London: EMEA. [Google Scholar]
  10. Connor JT, Elm JJ, and Broglio KR (2013), “Bayesian adaptive trials offer advantages in comparative effectiveness trials: an example in status epilepticus,” Journal of Clinical Epidemiology, 66 (8, Supplement), S130–S137. DOI: 10.1016/j.jclinepi.2013.02.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Faseru B, Ellerbeck EF, Catley D, Gajewski BJ, Scheuermann TS, Shireman TI, Mussulman LM, Nazir N, Bush T, and Richter KP (2017), “Changing the default for tobacco-cessation treatment in an inpatient setting: study protocol of a randomized controlled trial,” Trials, 18 (1), 379–379. DOI: 10.1186/s13063-017-2119-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Faseru B, Mussulman LM, Nazir N, Ellerbeck EF, Shergina E, Scheuermann TS, Gajewski BJ, Catley D, and Richter KP (2022), “Use of pre-enrollment randomization and delayed consent to maximize participation in a clinical trial of opt-in versus opt-out tobacco treatment,” Subst Abus, 43 (1), 1035–1042. DOI: 10.1080/08897077.2022.2060441. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Fiore MC, Jaén CR, Baker TB, et al. (2008), Treating Tobacco Use and Dependence: 2008 Update, Rockville, MD: U.S. Department of Health and Human Services, Public Health Service. [Google Scholar]
  14. Freund M, Campbell E, Paul C, McElduff P, Walsh RA, Sakrouge R, Wiggers J, and Knight J (2008), “Smoking care provision in hospitals: a review of prevalence,” Nicotine & tobacco research : official journal of the Society for Research on Nicotine and Tobacco, 10 (5), 757–774. DOI: 10.1080/14622200802027131. [DOI] [PubMed] [Google Scholar]
  15. Gajewski BJ, Carlson SE, Brown AR, Mudaranthakam DP, Kerling EH, and Valentine CJ (2023), “The value of a two-armed Bayesian response adaptive randomization trial,” J Biopharm Stat, 33 (1), 43–52. DOI: 10.1080/10543406.2022.2148161. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Giovagnoli A (2021), “The Bayesian Design of Adaptive Clinical Trials,” Int J Environ Res Public Health, 18 (2). DOI: 10.3390/ijerph18020530. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Gonzales D, Rennard SI, Nides M, Oncken C, Azoulay S, Billing CB, Watsky EJ, Gong J, Williams KE, and Reeves KR (2006), “Varenicline, an alpha4beta2 nicotinic acetylcholine receptor partial agonist, vs sustained-release bupropion and placebo for smoking cessation: a randomized controlled trial,” Jama, 296 (1), 47–55. DOI: 10.1001/jama.296.1.47. [DOI] [PubMed] [Google Scholar]
  18. Hey SP, and Kimmelman J (2015), “Are outcome-adaptive allocation trials ethical?,” Clin Trials, 12 (2), 102–106. DOI: 10.1177/1740774514563583. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Jamal A, Dube SR, Malarcher AM, Shaw L, and Engstrom MC (2012), “Tobacco use screening and counseling during physician office visits among adults--National Ambulatory Medical Care Survey and National Health Interview Survey, United States, 2005–2009,” MMWR Suppl, 61 (2), 38–45. [PubMed] [Google Scholar]
  20. Karrison TG, Huo D, and Chappell R (2003), “A group sequential, response-adaptive design for randomized clinical trials,” Controlled Clinical Trials, 24 (5), 506–522. DOI: 10.1016/S0197-2456(03)00092-8. [DOI] [PubMed] [Google Scholar]
  21. Kim ES, Herbst RS, Wistuba II, Lee JJ, Blumenschein GR Jr., Tsao A, Stewart DJ, Hicks ME, Erasmus J Jr., Gupta S, Alden CM, Liu S, Tang X, Khuri FR, Tran HT, Johnson BE, Heymach JV, Mao L, Fossella F, Kies MS, Papadimitrakopoulou V, Davis SE, Lippman SM, and Hong WK (2011), “The BATTLE trial: personalizing therapy for lung cancer,” Cancer Discov, 1 (1), 44–53. DOI: 10.1158/2159-8274.Cd-10-0010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Lipsky AM, and Greenland S (2011), “Confounding due to changing background risk in adaptively randomized trials,” Clinical Trials, 8 (4), 390–397. DOI: 10.1177/1740774511406950. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Meyer EL, Mesenbrink P, Dunger-Baldauf C, Fülle HJ, Glimm E, Li Y, Posch M, and König F (2020), “The Evolution of Master Protocol Clinical Trial Designs: A Systematic Literature Review,” Clin Ther, 42 (7), 1330–1360. DOI: 10.1016/j.clinthera.2020.05.010. [DOI] [PubMed] [Google Scholar]
  24. Mullins CD, Vandigo J, Zheng Z, and Wicks P (2014), “Patient-Centeredness in the Design of Clinical Trials,” Value in Health, 17 (4), 471–475. DOI: 10.1016/j.jval.2014.02.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Papadimitrakopoulou V, Lee JJ, Wistuba II, Tsao AS, Fossella FV, Kalhor N, Gupta S, Byers LA, Izzo JG, Gettinger SN, Goldberg SB, Tang X, Miller VA, Skoulidis F, Gibbons DL, Shen L, Wei C, Diao L, Peng SA, Wang J, Tam AL, Coombes KR, Koo JS, Mauro DJ, Rubin EH, Heymach JV, Hong WK, and Herbst RS (2016), “The BATTLE-2 Study: A Biomarker-Integrated Targeted Therapy Study in Previously Treated Patients With Advanced Non-Small-Cell Lung Cancer,” J Clin Oncol, 34 (30), 3638–3647. DOI: 10.1200/jco.2015.66.0084. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Park JW, Liu MC, Yee D, Yau C, van ‘t Veer LJ, Symmans WF, Paoloni M, Perlmutter J, Hylton NM, Hogarth M, DeMichele A, Buxton MB, Chien AJ, Wallace AM, Boughey JC, Haddad TC, Chui SY, Kemmer KA, Kaplan HG, Isaacs C, Nanda R, Tripathy D, Albain KS, Edmiston KK, Elias AD, Northfelt DW, Pusztai L, Moulder SL, Lang JE, Viscusi RK, Euhus DM, Haley BB, Khan QJ, Wood WC, Melisko M, Schwab R, Helsten T, Lyandres J, Davis SE, Hirst GL, Sanil A, Esserman LJ, and Berry DA (2016), “Adaptive Randomization of Neratinib in Early Breast Cancer,” New England Journal of Medicine, 375 (1), 11–22. DOI: 10.1056/NEJMoa1513750. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Proschan M, and Evans S (2020), “Resist the Temptation of Response-Adaptive Randomization,” Clinical Infectious Diseases, 71 (11), 3002–3004. DOI: 10.1093/cid/ciaa334. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Richter KP, Catley D, Gajewski BJ, Faseru B, Shireman TI, Zhang C, Scheuermann TS, Mussulman LM, Nazir N, Hutcheson T, Shergina E, and Ellerbeck EF (2023), “The Effects of Opt-out vs Opt-in Tobacco Treatment on Engagement, Cessation, and Costs: A Randomized Clinical Trial,” JAMA Intern Med. DOI: 10.1001/jamainternmed.2022.7170. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Robertson DS, Lee KM, Lopez-Kolkovska BC, and Villar SS 2020. Response-adaptive randomization in clinical trials from myths to practical considerations. arXiv. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Rugo HS, Olopade OI, DeMichele A, Yau C, van ‘t Veer LJ, Buxton MB, Hogarth M, Hylton NM, Paoloni M, Perlmutter J, Symmans WF, Yee D, Chien AJ, Wallace AM, Kaplan HG, Boughey JC, Haddad TC, Albain KS, Liu MC, Isaacs C, Khan QJ, Lang JE, Viscusi RK, Pusztai L, Moulder SL, Chui SY, Kemmer KA, Elias AD, Edmiston KK, Euhus DM, Haley BB, Nanda R, Northfelt DW, Tripathy D, Wood WC, Ewing C, Schwab R, Lyandres J, Davis SE, Hirst GL, Sanil A, Berry DA, and Esserman LJ (2016), “Adaptive Randomization of Veliparib–Carboplatin Treatment in Breast Cancer,” New England Journal of Medicine, 375 (1), 23–34. DOI: 10.1056/NEJMoa1513749. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Saville BR, Berry DA, Berry NS, Viele K, and Berry SM (2022), “The Bayesian Time Machine: Accounting for temporal drift in multi-arm platform trials,” Clinical Trials, 17407745221112013. DOI: 10.1177/17407745221112013. [DOI] [PubMed] [Google Scholar]
  32. Sim J (2019), “Outcome-adaptive randomization in clinical trials: issues of participant welfare and autonomy,” Theor Med Bioeth, 40 (2), 83–101. DOI: 10.1007/s11017-019-09481-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Thall PF, Fox PS, and Wathen JK (2015), “Some Caveats for Outcome Adaptive Randomization in Clinical Trials,” in Modern Adaptive Randomized Clinical Trials: Statistical and Practical Aspects, ed. Sverdlov O, Boca Raton: Chapman and Hall/CRC Press. [Google Scholar]
  34. Thall PF, and Wathen JK (2007), “Practical Bayesian adaptive randomisation in clinical trials,” Eur J Cancer, 43 (5), 859–866. DOI: 10.1016/j.ejca.2007.01.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Villar SS, Robertson DS, and Rosenberger WF (2020), “The Temptation of Overgeneralizing Response-adaptive Randomization,” Clinical Infectious Diseases, 73 (3), e842–e842. DOI: 10.1093/cid/ciaa1027. [DOI] [PMC free article] [PubMed] [Google Scholar]
