Abstract
Bayesian adaptive randomization is a heuristic approach that aims to randomize more patients to the putatively superior arms based on the trend of the accrued data in a trial. Many statistical aspects of this approach have been explored and compared with other approaches; yet only a limited number of works have focused on improving its performance and providing guidance on its application to real trials. An undesirable property of this approach is that the procedure can randomize patients to an inferior arm in some circumstances, which has raised concerns about its application. Here, we propose an adaptive clip method to rectify the problem by incorporating a data-driven function to be used in conjunction with the Bayesian adaptive randomization procedure. This function aims to minimize the chance of assigning patients to inferior arms during the early stages of the trial. Moreover, we propose a utility approach to facilitate the selection of a randomization procedure. A cost that reflects the penalty of assigning patients to the inferior arm(s) in the trial is incorporated into our utility function, along with the benefit to all patients, both within and beyond the trial. We illustrate the selection strategy for a wide range of scenarios.
Keywords: Adaptive clip method, adaptive randomization, multi-arm trials, patient horizon, utility
1. Introduction
Bayesian adaptive randomization (BAR), also known as Bayesian outcome adaptive randomization or Bayesian response adaptive randomization, is a heuristic approach that aims to randomize more patients to the putatively superior arms based on the trend of the accrued data in a trial. It is only one type of adaptive randomization, and there are many other families of procedures.1 The idea is based on Thompson sampling,2 where posterior probabilities are computed using the accrued data. For two-arm trials, Thall and Wathen3 initially advocate BAR as an alternative to conventional randomization. For multi-arm trials, Wathen and Thall4 show the application of BAR in conjunction with early stopping for futility, and make comparisons with equal randomization in terms of the trial operating characteristics, with some cautions to be elaborated later. Others, such as Lee et al.5 and Du et al.,6 also compare the performance of equal randomization with some variants of BAR through simulation studies. On the other hand, Wason and Trippa7 and Lin and Bunn8 compare multi-arm multi-stage designs with a design that uses BAR, whereas others such as Villar et al.9 propose a different type of allocation rule and make comparisons with various randomization approaches, including BAR, in multi-arm settings.
Many have also explored various aspects of BAR. For example, Thall et al.10 comment on some inferential problems of implementing BAR. Viele et al.11 compare multiple features of response adaptive randomization based on various metrics. Jiang et al.12 investigate the impact of time trend on treatment estimation when BAR is implemented. Yin et al.13 propose a randomized phase II trial design using BAR with a predictive probability for monitoring purposes. Viele et al.14 investigate the role of some control allocation schemes on the operating characteristics of several randomization procedures, including BAR. Jiang et al.15 combine a Bayesian decision-theoretic framework with BAR to improve trial efficiency and achieve an ethical goal simultaneously.
Despite the vast methodological investigation, BAR has attracted much debate and criticism in the biostatistical community.10,16–23 Most of the discussion focuses on the statistical properties of BAR in comparison with other randomization approaches, such as those that have fixed allocation probabilities. An undesirable property of BAR is that the procedure can randomize patients to an inferior arm in some circumstances,4,10 which has raised concerns about its application. Yet, only a limited number of works have focused on improving the performance of BAR and providing guidance on its application to real trials. For example, Du et al.24 compare several regularization methods for BAR in a two-arm trial setting, with a view to improving the properties of BAR.
The aims of this paper are two-fold: to propose a solution that improves the performance of BAR, and a strategy for identifying situations in which the implementation of BAR outperforms other randomization procedures. More specifically, we propose an adaptive clip method that incorporates a data-driven function to be used in conjunction with BAR. This function aims to minimize the chance of assigning patients to the inferior arms during the early stages of the trial, especially when the uncertainty about the estimates of treatment effects is large and there is a negative bias in parameter estimation.25 Moreover, we propose a utility approach to facilitate the selection of a randomization procedure for implementation purposes. Given the patient horizon, i.e. the total patient population available in the whole society,26 we compare the gains of implementing a trial with different randomization approaches by their potential to maximize the utility. We consider that future patients who are not in the trial will be treated with the best treatment found upon the completion of the trial. A cost that reflects the penalty of assigning patients to the inferior arm(s) during the trial is incorporated into our utility function, along with the benefit to all patients, both within and beyond the trial.
The structure of this paper is as follows. In Section 2, we give examples of real trials that have implemented BAR. In Section 3, we describe some variants of BAR and propose an adaptive clip method for setting the lower bound of the treatment allocation probabilities. We illustrate the operating characteristics of BAR in multi-arm settings through simulation studies. In Section 4, we propose a utility function to reflect the benefit and cost of employing a randomization procedure from the perspective of the whole society. We provide guidance and illustrate the strategy by redesigning a study on ketamine. In Section 5, we summarize our proposals and discuss the findings of our illustrations.
2. Trials that employ Bayesian adaptive randomization
Apart from the two well-known trials that have used BAR, namely I-SPY 227,28 for breast cancer and BATTLE29 for lung cancer, we present other trial examples that have aimed at assigning more patients to the better arms through the implementation of BAR.
An example of two-arm trials is a study on patients with sarcomas. It was a multi-centre, open-label phase II trial without early stopping, which compared gemcitabine alone to the combination of gemcitabine and docetaxel in patients with metastatic soft tissue sarcomas. The primary endpoint was tumour response defined as complete or partial response or stable disease lasting at least 24 weeks. As a result of using BAR, 73 patients (60%) were randomized to gemcitabine-docetaxel and 49 patients (40%) to gemcitabine. The study found that the combination of the two agents resulted in better progression-free and overall survival than gemcitabine alone, but with increased toxicity.30
On the other hand, Cellamare et al.31 built and illustrated a BAR procedure for an ongoing multi-arm multi-stage trial on patients with multi-drug-resistant tuberculosis. They compared the power of BAR with that of non-adaptive randomization and observed that BAR requires fewer patients under several hypothetical scenarios. The study employing this proposal, namely endTB, is a non-inferiority phase III trial with early stopping for futility. It compares five experimental treatments with a control treatment. The study team aims to enrol 750 participants (27% fewer than required by a design that uses balanced randomization) to achieve a power of 80% to detect up to two (of five) novel treatment regimens that are non-inferior to the control treatment. The primary endpoint is treatment success after 73 weeks from randomization.32
Another example is a study on ketamine for treatment-resistant late-life depression (identifier on ClinicalTrials.gov: NCT02556606). This study examines the efficacy, safety and tolerability of three different dosages of intravenous ketamine (single dose of 0.1, 0.25 or 0.5 mg/kg over 40 min) compared to the active placebo midazolam (0.03 mg/kg) in up to 72 patients with late-life treatment-resistant depression. The primary outcome is the time to relapse, measured by repeated administration of the Montgomery-Asberg Depression Rating Scale during a four-week post-infusion follow-up. O’Brien et al.33 present the details of the trial, including the randomization algorithm and prior distributions employed in the design. A specific illustration of the implementation of BAR is given in Section 4.2.
3. Bayesian outcome adaptive randomization
We now present the details of our investigation. We consider the Bayesian paradigm for both the randomization algorithm and the inference procedure. Consider a trial that has N patients and K treatments, k = 1, . . . , K, with a binary endpoint and independent response rates θ = {θ1, . . . , θK}. Consider the use of a conjugate prior distribution, Beta(α, β). Let k = 1 correspond to the control treatment, nk be the sample size of arm k, and Nf be the number of patients outside the trial, so that the patient population size in the whole society, i.e. the patient horizon, is N + Nf. Let Pi(k) denote the allocation probability of patient i to arm k given the responses of patients 1, . . . , i − 1, and let πk = P(θk = max{θ1, . . . , θK} | data) be the posterior probability that θk is the maximum given the accrued data.
BAR has the following procedure. Initially, Nb/K patients are randomized to each arm with a restricted block randomization approach, where Nb with K ≤ Nb < N is known as the burn-in sample size. After observing the outcomes of the Nb patients, the allocation probabilities are updated using the observed binary outcomes. There are various ways to compute the allocation probabilities.
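The burn-in stage described above can be sketched as follows (a minimal illustration; the function name is ours, and we assume K divides Nb):

```python
import random

def burn_in_assignments(Nb, K, seed=None):
    """Restricted block randomization for the burn-in stage:
    exactly Nb/K patients per arm, in a random order."""
    assert K <= Nb and Nb % K == 0, "require K <= Nb and K | Nb"
    # one arm label per burn-in slot, balanced across arms
    arms = [k for k in range(1, K + 1) for _ in range(Nb // K)]
    random.Random(seed).shuffle(arms)
    return arms
```

Each element of the returned list is the arm assignment of one burn-in patient, so every arm receives exactly Nb/K patients regardless of the random order.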
3.1. Variants of the heuristic approach
Thall and Wathen3,4 propose to allocate patient i, i = Nb + 1, . . . , N, to arm k with the sequentially updated probability

Pi(k) = πk^τ / (π1^τ + · · · + πK^τ).
The parameter τ is a tuning parameter that reflects the strength of deviation from the equal allocation probability. Common values considered in the literature include τ = 0.5, 1, and i/(2N). When τ = 0, this corresponds to the simple randomization procedure with a fixed equal allocation probability, i.e. Pi(k) = P(k) = 1/K for all patients i. Note that simple randomization may not result in nk = N/K for all k due to finite random sampling.
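This rule can be sketched with a short Monte Carlo routine (our own illustration with hypothetical function names; Beta(0.2, 0.8) priors as used later in the paper): estimate πk, the posterior probability that arm k has the largest response rate, by sampling from the Beta posteriors, then raise to the power τ and normalize.

```python
import numpy as np

def allocation_probs(successes, failures, tau, alpha=0.2, beta=0.8,
                     n_draws=10_000, seed=None):
    """Thall-Wathen-style allocation: P_i(k) proportional to pi_k^tau,
    where pi_k = P(theta_k is the maximum | data) is estimated by
    Monte Carlo from the Beta(alpha + s_k, beta + f_k) posteriors."""
    rng = np.random.default_rng(seed)
    s, f = np.asarray(successes, float), np.asarray(failures, float)
    draws = rng.beta(alpha + s, beta + f, size=(n_draws, len(s)))
    # fraction of posterior draws in which each arm attains the maximum
    pi = np.bincount(draws.argmax(axis=1), minlength=len(s)) / n_draws
    w = pi ** tau
    return w / w.sum()
```

Note that with τ = 0 every weight becomes 1, so the rule reduces to the fixed equal allocation probability 1/K, as stated above.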
Connor et al.34 suggest accounting for the reduction in the variance of the estimate of θk as more patients are enrolled in the trial. They define

I(k) = Var(θk | data) / (nk′ + 1)

as the expected reduction in the variance of the parameter estimate when an additional patient is enrolled to arm k, where nk′ is the current sample size of arm k. They propose to use

Pi(k) = √(πk I(k)) / Σj √(πj I(j))

as the allocation probability of patient i to arm k given the accrued data.
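A sketch of a variance-adjusted rule of this flavour is given below. This is our own illustration under stated assumptions (the exact formula of Connor et al. may differ): πk is weighted by the Beta posterior variance of θk divided by nk + 1, a proxy for the variance reduction from one additional patient, and the square-rooted weights are normalized.

```python
import numpy as np

def variance_adjusted_probs(successes, failures, alpha=0.2, beta=0.8,
                            n_draws=10_000, seed=None):
    """Sketch of a variance-adjusted allocation rule: weight pi_k
    (posterior probability of being the best arm) by the Beta posterior
    variance of theta_k over (n_k + 1), then normalize."""
    rng = np.random.default_rng(seed)
    s, f = np.asarray(successes, float), np.asarray(failures, float)
    a, b = alpha + s, beta + f
    post_var = a * b / ((a + b) ** 2 * (a + b + 1))  # Var of Beta(a, b)
    draws = rng.beta(a, b, size=(n_draws, len(s)))
    pi = np.bincount(draws.argmax(axis=1), minlength=len(s)) / n_draws
    w = np.sqrt(pi * post_var / (s + f + 1))
    return w / w.sum()
```

The (nk + 1) term shrinks the weight of well-sampled arms, so the rule directs patients towards arms where one more observation is expected to be most informative.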
3.2. Lower bound of the allocation probability: Adaptive clip method
It can happen that Pi(k) becomes very small, so that the chance of a patient being allocated to arm k becomes essentially zero. This is undesirable, as a small Pi(k) may be due to random chance or to an insufficient amount of data failing to guide the allocation to this arm. The latter is true especially when Nb is small, i.e. during the early stage of a trial when limited information is available. To avoid this, some authors have proposed setting a lower bound for Pi(k) when it falls below a threshold. Denote this threshold by c, where 0 ≤ c < 1. When this happens, say one of the Pi(k) is set to c, and the other Pi(k) are normalized such that their sum is equal to 1 − c. This approach is known as a clip method in the literature.24
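The fixed clip step can be sketched as follows (a minimal illustration with our own function name): any Pi(k) below c is raised to c, and the remaining probabilities are rescaled so the vector still sums to 1.

```python
import numpy as np

def clip_probs(p, c):
    """Fixed clip method: raise any allocation probability below c
    to c, then rescale the unclipped probabilities so that the
    vector still sums to 1."""
    p = np.asarray(p, float)
    low = p < c
    out = p.copy()
    out[low] = c
    if (~low).any():
        # the unclipped arms share the remaining mass 1 - c * (#clipped)
        out[~low] = p[~low] / p[~low].sum() * (1.0 - c * low.sum())
    return out
```

For example, with c = 0.1, the vector (0.02, 0.49, 0.49) becomes (0.1, 0.45, 0.45): the clipped arm receives c and the others are normalized to sum to 1 − c.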
Instead of choosing a fixed value c, we propose an adaptive clip method for setting the lower bound of Pi(k). The motivation for this proposal comes from the facts that (i) the estimates of θ tend to be negatively biased when an adaptive randomization procedure is implemented25,35; and (ii) the procedure may assign more patients to the inferior arms with an undesirably high probability in some circumstances, which contradicts the purpose of implementing BAR.4 These unwanted characteristics often occur in the early phase of patient enrolment, when the variability in estimating the treatment effects is large. To rectify the problem, we suggest using a data-driven function in conjunction with BAR. The aim is to reduce the variability of the treatment allocation and hence minimize the probability that patients are assigned to the inferior arm(s) in the trial.
More specifically, we propose a function that takes larger values when the negative bias is non-trivial and decreasing values as more patients are enrolled in the trial. The rationale for the former property is to allow an arm with an under-estimated θk during the early to middle stage of the trial to enrol more patients, aiming to improve the property of the estimate with more data. The latter property reflects the ethical concern that, as the trial progresses towards the end, the allocation probability of an inferior arm should be small, since the accrued data are by then expected to be sufficient for estimating the corresponding θk consistently.
As the true bias is never known in practice, we approximate it by the difference between the posterior mean of θ and a priori values of θ, e.g. the values under the null scenario, the values under the alternative scenario, or values based on clinical evidence. We denote this quantity by Δi|data. When Δi|data < 0, the treatment effect is either under-estimated or the treatment is in fact less efficacious than expected. In this case, we propose to set the lower bound of Pi(k) to max{g(i, Δi), c}, where g(i, Δi) is a decreasing function of the patient index i and u is a user-specified constant such that 0 ≤ g(i, Δi) ≤ 1/K. As before, when one or more of the Pi(k) are set to either c or g(i, Δi), the other Pi(k) are normalized accordingly. Note that g(i, Δi) does not depend only on i and Δi; K and Nb also play a role in the function. For example, if a larger Nb is adopted, the arm that has an under-estimated treatment effect has a higher chance of getting more observations to update the algorithm as more patients are enrolled, whereas having K in the function allows us to have a decreasing function that starts with a value close to 1/K after the burn-in patients have been randomized.
Figure 1 shows the behaviour of g(i, Δi) for K = 3 and N = 150 in panel (a) and K = 4 and N = 200 in panel (b). We see that g(i, Δi) is a decreasing function of i, and that a larger u leads to a larger g(i, Δi). The burn-in sample size, Nb, determines when the adjustment kicks in. By choosing u appropriately for c < 1/K, we can set Pi(k) ≈ 1/K for the first few patients after the burn-in period when Δi|data ≤ 0 and Pi(k) < max{g(i, Δi), c}. By choosing the value of c, we can decide from which patient onward Pi(k) has a minimum bound. In general, a larger c reduces the role of g(i, Δi) in the procedure when Δi|data < 0.
Figure 1.
The proposed adaptive clip function for different choices of u and Nb. (a) K = 3 and N = 150; (b) K = 4 and N = 200.
Using this adaptive clip method appropriately allows higher chances of enrolling more patients to the arms with Δi|data < 0 after the burn-in period. This is a desirable feature especially when Nb is small, since it prevents treatment arms from having very small Pi(k) due to under-estimation of θk. When a larger Nb is chosen (relative to a small Nb), our function gives a larger value of Pi(k); compare the curves with different line types in Figure 1. This feature is also important because the accrued data would be expected to estimate θk more consistently when a large Nb is employed; if θk is still under-estimated, i.e. Δi|data < 0, the corresponding arm deserves a higher Pi(k) so as to obtain more data to correct for the bias.
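The adaptive clip step can be sketched as follows. Since the exact form of g(i, Δi) is not reproduced here, we substitute an illustrative decreasing function with the stated properties (bounded above by 1/K, near 1/K just after the burn-in, larger for larger u and Nb); both the functional form and the names are our own assumptions, not the authors' definition.

```python
import numpy as np

def g_illustrative(i, K, Nb, u):
    """Illustrative stand-in for g(i, Delta_i): decreasing in i, close
    to 1/K just after the burn-in, larger for larger u and Nb, bounded
    above by 1/K. NOT the paper's exact function."""
    if i <= Nb:
        return 1.0 / K
    return (1.0 / K) * min(1.0, u * Nb / i)

def adaptive_clip(p, i, delta, K, Nb, u, c=0.0):
    """For arms with approximate bias delta_k < 0, lower-bound P_i(k)
    by max{g(i, .), c}; then renormalize the remaining arms."""
    p = np.asarray(p, float)
    bound = max(g_illustrative(i, K, Nb, u), c)
    low = (np.asarray(delta) < 0) & (p < bound)
    out = p.copy()
    out[low] = bound
    if (~low).any():
        out[~low] = p[~low] / p[~low].sum() * (1.0 - bound * low.sum())
    return out
```

Shortly after the burn-in, the bound is close to 1/K, so an arm with a negatively biased estimate is pulled back towards equal allocation; as i grows the bound decays and BAR's usual behaviour takes over.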
In what follows, we denote by BAR(τ, c) the randomization procedure of Thall and Wathen3,4 and by BAR(I(k), c) that of Connor et al.,34 with c as the threshold of the lower bound of Pi(k).
3.3. Performance of BAR
We now compare the adaptive clip method (used in conjunction with the BAR approach of Thall and Wathen with τ = 1) with the BAR approach that has a fixed c, BAR(1, c). We mimic the trial settings of Wathen and Thall4 for multi-arm designs. We defer the comparison with simple randomization to the next section, since the comparison between that approach and BAR has already been studied.4
More specifically, we consider four treatment arms and a control arm, a burn-in sample size of Nb = 50, and trial sample sizes N = 250 and 500. We generate responses under two types of alternative scenario: the least favourable configuration (LFC), i.e. θ = {0.2, 0.2, 0.2, 0.2, 0.4}, and the “staircase” scenario with θ = {0.2, 0.25, 0.3, 0.35, 0.4}. The prior Beta(0.2, 0.8) for θk, k = 1, . . . , 5, is used as in Wathen and Thall.4 Each trial scenario is replicated 10,000 times in the simulation, with 10,000 posterior samples of θk generated each time Pi(k) is updated and each time Δi|data is computed. The values of θ in the LFC and the staircase scenarios are first used in the respective simulations when computing Δi|data. We then show the impact of approximating Δi|data using θ = {0.2, 0.2, 0.2, 0.2, 0.2} on the performance of the randomization approaches.
One of the performance measures considered in the literature is the probability that the number of patients randomized to the control arm is at least m larger than the number of patients randomized to arm k, k > 1. This probability is denoted by ηm = P(n1 > nk + m) for k > 1 in the original papers.3,4 For ease of presentation, we denote this probability for treatment k by ηk,m, and call ηk,m × 100% the percentage of adverse imbalance in treatment arm k. When a randomization procedure aims at randomizing more patients to the putatively superior treatment arms, a small ηk,m is desirable when the treatments are truly more efficacious than the control treatment, as it indicates that the procedure rarely fails to achieve its objective. In other words, the percentage of adverse imbalance, ηk,m, can be viewed as a “false negative” rate at which the procedure identifies a treatment arm as inferior or equivalent to the control arm (and hence assigns relatively fewer patients to this arm) when the treatment is truly more efficacious.
Apart from ηk,m, we consider the probability that n1 is larger than the sample sizes of several other treatment arms by more than m patients. We denote this probability by Pm. It provides a different perspective for measuring the performance of BAR at the design stage of a trial, since several configurations of θ are often considered when finding a design for implementation; judgement based on the individual ηk,m alone may be over-conservative. For the illustrations here, we consider Pm = P(n1 > n2 + m, n1 > n3 + m, n1 > n4 + m) because treatment k = 5 is the most efficacious of the five arms in both the LFC and the staircase scenarios; including n1 > n5 + m in the probability statement would reduce the magnitude of Pm considerably (and hence make it a less meaningful metric), since η5,m would be the smallest when BAR is implemented.
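Given a matrix of final sample sizes across simulated trials (the control arm in the first column), both metrics can be estimated as follows (a sketch with our own function name; Pm here excludes the best arm k = 5, as in the text):

```python
import numpy as np

def imbalance_metrics(n_final, m):
    """n_final: (replications, K) array of final sample sizes, with the
    control arm in column 0. Returns eta_{k,m} = P(n_1 > n_k + m) for
    each k > 1, and P_m = P(n_1 > n_k + m for arms 2, 3, 4), i.e.
    excluding the best arm (the last column)."""
    n = np.asarray(n_final)
    n1 = n[:, [0]]                                  # control sample size
    eta = (n1 > n[:, 1:] + m).mean(axis=0)          # eta_{k,m}, k = 2..K
    Pm = (n1 > n[:, 1:4] + m).all(axis=1).mean()    # joint event, arms 2-4
    return eta, Pm
```

Each simulated trial contributes one indicator per event, so the returned values are Monte Carlo estimates of the probabilities defined above.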
Tables 1 and 2 show the simulation results for the LFC scenario with N = 250 and N = 500, respectively; Tables 3 and 4 show the simulation results for the staircase scenario with N = 250 and N = 500, respectively. Each column of values (from left to right) corresponds to the output of BAR(1, c) with fixed c and the adaptive clip method with u = 2, 3, 4, respectively. The top (and bottom) panel corresponds to using c = 0 (and c = 0.1) in the randomization procedures. Note that the adaptive clip method also uses c in updating the lower bound of Pi(k). All the presented probabilities, ηk,m and Pm for m = 10, 20, 30, and k = 2, . . . , 5, are given in % for ease of presentation. We also report the mean and standard deviation (SD) of the total number of respondents in the trial (denoted by R).
Table 1.
Simulation results for the least favourable configuration (LFC) scenario where θ1 = θ2 = θ3 = θ4 = 0.2, θ5 = 0.4, and trial sample size N = 250.
| | BAR(1, c) | Adaptive clip, u = 2 | Adaptive clip, u = 3 | Adaptive clip, u = 4 |
|---|---|---|---|---|
| c = 0, Multi-arm trials with N = 250 | ||||
| η.,10, η.,20, η.,30 | 22.9, 11.3, 5.8 | 15.4, 4.7, 2.1 | 14.5, 6.4, 3.0 | 16, 7.2, 3.9 |
| η5,10, η5,20, η5,30 | 1.9, 1.5, 1.2 | 1.3, 0.9, 0.5 | 1.6, 1.1, 0.7 | 2, 1.4, 1.0 |
| P10, P20, P30 | 11.4, 5.6, 2.8 | 6.7, 2.5, 1.1 | 7.7, 3.5, 1.8 | 8.9, 4.1, 2.2 |
| Mean of R (SD) | 81.4 (10.6) | 74.6 (9.9) | 77.1 (9.8) | 77.6 (9.9) |
| c = 0.1, Multi-arm trials with N = 250 | ||||
| η.,10, η.,20, η.,30 | 13.6, 4.7, 2.0 | 12.9, 3.4, 1.4 | 13.7, 4.6, 2.1 | 14, 5.0, 2.4 |
| η5,10, η5,20, η5,30 | 1.5, 1.0, 0.7 | 1.4, 0.9, 0.5 | 1.7, 1.1, 0.7 | 1.7, 1.2, 0.8 |
| P10, P20, P30 | 6.2, 2.5, 1.1 | 5, 1.7, 0.7 | 6.1, 2.7, 1.2 | 6.4, 2.7, 1.4 |
| Mean of R (SD) | 72.8 (8.5) | 71.0 (8.3) | 72.3 (8.3) | 72.6 (8.5) |
Each column corresponds to a different clip method. The probabilities are presented in %.
Table 2.
Simulation results for the least favourable configuration (LFC) scenario where θ1 = θ2 = θ3 = θ4 = 0.2, θ5 = 0.4, and trial sample size N = 500.
| | BAR(1, c) | Adaptive clip, u = 2 | Adaptive clip, u = 3 | Adaptive clip, u = 4 |
|---|---|---|---|---|
| c = 0, Multi-arm trials with N = 500 | ||||
| η.,10, η.,20, η.,30 | 28.7, 16.6, 9.8 | 19.2, 7.2, 3.4 | 18.2, 8.5, 4.7 | 19.7, 10.1, 5.6 |
| η5,10, η5,20, η5,30 | 0.5, 0.5, 0.4 | 0.2, 0.2, 0.2 | 0.2, 0.2, 0.2 | 0.3, 0.2, 0.2 |
| P10, P20, P30 | 14.1, 7.9, 4.6 | 8.3, 3.5, 1.9 | 9.4, 4.6, 2.5 | 10.2, 5.3, 2.9 |
| Mean of R (SD) | 176.6 (15.6) | 166.9 (14.2) | 170.6 (14.0) | 172.2 (14.1) |
| c = 0.1, Multi-arm trials with N = 500 | ||||
| η.,10, η.,20, η.,30 | 20.2, 7.0, 2.8 | 19, 5.7, 2.1 | 20, 6.8, 3.0 | 19.9, 7, 3 |
| η5,10, η5,20, η5,30 | 0.2, 0.2, 0.1 | 0.2, 0.1, 0.1 | 0.2, 0.2, 0.1 | 0.2, 0.2, 0.1 |
| P10, P20, P30 | 7.6, 2.8, 1.3 | 6.6, 2.2, 1.1 | 7.2, 3.0, 1.6 | 7.5, 3.0, 1.6 |
| Mean of R (SD) | 152.4 (11.6) | 150.7 (11.5) | 152.0 (11.5) | 152.5 (11.5) |
Each column corresponds to a different clip method. The probabilities are presented in %.
Table 3. Simulation results for the staircase scenario where θ = {0.2, 0.25, 0.3, 0.35, 0.4}, and trial sample size N = 250.
| | BAR(1, c) | Adaptive clip, u = 2 | Adaptive clip, u = 3 | Adaptive clip, u = 4 |
|---|---|---|---|---|
| c = 0, Multi-arm trials with N = 250 | ||||
| η2,10, η2,20, η2,30 | 14.1, 5.4, 2.0 | 7.6, 1.4, 0.4 | 6.6, 2.2, 0.9 | 7.7, 2.5, 0.9 |
| η3,10, η3,20, η3,30 | 9.2, 3.5, 1.4 | 3.6, 0.8, 0.3 | 4.9, 1.7, 0.7 | 5.5, 1.7, 0.6 |
| η4,10, η4,20, η4,30 | 6.1, 2.5, 1.0 | 2, 0.6, 0.2 | 3.1, 1.1, 0.5 | 3.6, 1.4, 0.6 |
| η5,10, η5,20, η5,30 | 2.9, 1.6, 0.8 | 1.1, 0.4, 0.2 | 1.8, 0.9, 0.4 | 2.1, 0.9, 0.5 |
| P10, P20, P30 | 1.5, 0.5, 0.2 | 0.6, 0.2, 0.1 | 1, 0.4, 0.2 | 1.1, 0.4, 0.2 |
| Mean of R (SD) | 85.1 (8.9) | 82.1 (8.4) | 83.1 (8.5) | 83.7 (8.6) |
| c = 0.1, Multi-arm trials with N = 250 | ||||
| η2,10, η2,20, η2,30 | 7.6, 1.7, 0.5 | 7.3, 0.9, 0.3 | 7.9, 1.2, 0.3 | 8, 1.7, 0.4 |
| η3,10, η3,20, η3,30 | 5.7, 1.2, 0.3 | 4.4, 0.7, 0.2 | 5.6, 0.9, 0.3 | 5.8, 1.2, 0.4 |
| η4,10, η4,20, η4,30 | 3.6, 0.8, 0.2 | 3, 0.5, 0.1 | 3.6, 0.8, 0.3 | 3.5, 0.8, 0.3 |
| η5,10, η5,20, η5,30 | 1.8, 0.6, 0.2 | 1.1, 0.3, 0.1 | 1.5, 0.4, 0.2 | 1.9, 0.6, 0.2 |
| P10, P20, P30 | 0.9, 0.3, 0.1 | 0.7, 0.1, 0.0 | 0.7, 0.2, 0.1 | 0.8, 0.3, 0.1 |
| Mean of R (SD) | 81.2 (7.9) | 80.4 (7.8) | 81.1 (7.9) | 81 (8) |
Each column corresponds to a different clip method. The probabilities are presented in %.
Table 4. Simulation results for the staircase scenario where θ = {0.2, 0.25, 0.3, 0.35, 0.4}, and trial sample size N = 500.
| | BAR(1, c) | Adaptive clip, u = 2 | Adaptive clip, u = 3 | Adaptive clip, u = 4 |
|---|---|---|---|---|
| c = 0, Multi-arm trials with N = 500 | ||||
| η2,10, η2,20, η2,30 | 18.2, 9.4, 5.1 | 8.7, 2.6, 1.0 | 10.1, 4.3, 2.0 | 11.7, 5.0, 2.3 |
| η3,10, η3,20, η3,30 | 11.7, 5.9, 2.8 | 3.7, 1.2, 0.5 | 5.9, 2.5, 1.3 | 6.6, 2.9, 1.6 |
| η4,10, η4,20, η4,30 | 5.4, 2.8, 1.5 | 2, 0.8, 0.4 | 2.7, 1.2, 0.6 | 3.7, 1.7, 1.0 |
| η5,10, η5,20, η5,30 | 2.3, 1.5, 0.9 | 0.5, 0.3, 0.2 | 1.1, 0.6, 0.3 | 1.3, 0.7, 0.4 |
| P10, P20, P30 | 1.3, 0.6, 0.3 | 0.5, 0.2, 0.2 | 0.7, 0.4, 0.2 | 0.8, 0.5, 0.2 |
| Mean of R (SD) | 177.3 (14.4) | 172.5 (12.9) | 174.2 (13.1) | 174.8 (13.5) |
| c = 0.1, Multi-arm trials with N = 500 | ||||
| η2,10, η2,20, η2,30 | 13.8, 3.3, 0.9 | 13.4, 2.7, 0.5 | 13.6, 3.2, 0.8 | 13.8, 3.3, 1.0 |
| η3,10, η3,20, η3,30 | 9.5, 2.2, 0.6 | 8.2, 1.8, 0.4 | 9.3, 2.0, 0.5 | 9, 2.2, 0.6 |
| η4,10, η4,20, η4,30 | 4.7, 1.1, 0.3 | 4.1, 1.0, 0.2 | 4.7, 1.1, 0.3 | 4.4, 1.2, 0.3 |
| η5,10, η5,20, η5,30 | 1.1, 0.3, 0.1 | 0.9, 0.2, 0.1 | 1, 0.3, 0.2 | 1.1, 0.3, 0.1 |
| P10, P20, P30 | 0.9, 0.3, 0.1 | 1, 0.2, 0.1 | 0.9, 0.2, 0.1 | 0.9, 0.2, 0.1 |
| Mean of R (SD) | 166.1 (11.7) | 165.4 (11.7) | 166.0 (11.7) | 166.0 (11.5) |
Each column corresponds to a different clip method. The probabilities are presented in %.
When the LFC scenario is considered in the simulation studies, we find that ηk,m for k = 2, 3, 4, are similar, since these treatment effects are the same. Hence, we present the average of η2,m, η3,m and η4,m, and label this value η.,m in Tables 1 and 2. Comparing all four set-ups of c and N (i.e. both Tables 1 and 2), we find that the smallest η.,10 under the LFC scenario is still as high as 12.9%, as seen in Table 1. This is obtained when the adaptive clip method with u = 2 and c = 0.1 is implemented for the trial with N = 250. In addition, we find that η.,m and η5,m for m = 10, 20, 30, are larger for the trial with N = 500 than for that with N = 250.
These observations may be due to the fact that the allocation probabilities for arms k = 1, . . . , 4, are similar; in replications where arm k = 5 has a lower allocation probability than the other four arms, relatively more patients are randomized across arms k = 1, . . . , 4, which leads to a larger value of η.,m (one can view this as similar to implementing simple randomization in a four-arm trial, where the arms may not have equal sample sizes at the end of the trial due to random variation in treatment allocation). Moreover, having more patients in these particular replications may not increase the allocation probability of arm k = 5 quickly enough for Pi(5) to become much larger than all other Pi(k), k = 1, . . . , 4. A further explanation for the latter observation is that in some replications, the algorithm faces the aforementioned situation only after 250 patients have been randomized; these replications thus do not arise in the simulation with N = 250 but are present in that with N = 500, leading to higher rates of η.,m and η5,m when N is large.
Considering P10 for the LFC scenario in Tables 1 and 2, the smallest value is about 5%, obtained for the trial with N = 250 when the adaptive clip method with u = 2 and c = 0.1 is implemented in conjunction with the BAR procedure. This rate indicates that if 100 trials of the same set-up were conducted, five of them would have a control arm that is larger than each of the three treatment arms by an extra 10 patients. We see from Table 1 that η5,10 is 1.4% for this particular case, meaning that the best arm would have fewer patients than the control arm in no more than two trials out of the 100 on average. Nonetheless, the mean of R is also the smallest when the adaptive clip method with u = 2 and c = 0.1 is implemented for the trial with N = 250. This shows that the adaptive clip method reduces the percentage of adverse imbalance at the cost of reducing the total number of respondents in the trial.
Considering the trade-off among the mean (and SD) of R, η.,10 and P10, we think that the adaptive clip method with u = 3 and c = 0 is the most appropriate approach for this LFC scenario compared to BAR(1, c) with fixed c = 0: it reduces both η.,10 and P10 at the cost of reducing the mean of R by about 4 and 6 units for the trials with N = 250 and N = 500, respectively, while also giving less variability in the distribution of R. For example, η.,10 is reduced from 22.9% and 28.7% to 14.5% and 18.2%, respectively, for the trials with N = 250 and N = 500. The adaptive clip method with u = 4 and c = 0 might be another reasonable option, though it produces slightly larger η.,10 and P10 than that with u = 3 and c = 0 (in both Tables 1 and 2), with only a small gain in the mean of R. Note that the expected numbers of respondents under equal randomization are R = 60 and 120 for N = 250 and 500, respectively, under the LFC; all the adaptive randomization methods result in higher numbers of respondents, as expected.
We now consider the staircase scenario in Tables 3 and 4 simultaneously. We find that the adaptive clip method with c = 0 (and u = 2 or 3) in general outperforms BAR(1, 0) and BAR(1, 0.1) for this alternative scenario. The smallest η2,10 for the least efficacious treatment, i.e. k = 2, is 6.6% (Table 3), about half of what we observed for the LFC scenario. This is obtained for the trial with N = 250 when the adaptive clip method with u = 3 is implemented along with c = 0. All η5,m for the best treatment, i.e. k = 5, are less than 3% in both tables. The smallest value of P10 is 0.5% and the largest is 1.5%. These small numbers indicate that BAR performs well in terms of randomizing patients to the putatively more efficacious arms in this scenario. As in the previous scenario, ηk,m are in general larger for trials with larger sample sizes, but the magnitudes are smaller than in the corresponding settings where the LFC scenario is the alternative. Moreover, Pm for m = 10, 20, 30, are considerably smaller in Table 4 (the staircase scenario) than in Table 2 (the LFC scenario). As before, all the adaptive randomization methods result in higher numbers of respondents; the expected numbers under equal randomization are R = 75 and 150 for N = 250 and 500, respectively, under the staircase scenario.
To illustrate the impact of using values different from the true θ in the adaptive clip method, we repeat the simulation with θ = {0.2, 0.2, 0.2, 0.2, 0.2} in computing Δi|data for both the LFC and the staircase scenarios. The counterparts of Tables 1 to 4 are presented in the supplemental material (Tables S1–S4). For the LFC scenario, we see that all the considered metrics in Tables 1 and 2 for a given u are similar to the resulting values when these values of θ are employed in approximating the bias term. This is because in the LFC scenario, only one of the interventions is effective, with an effect of 0.4. For the staircase scenario, implementing the adaptive clip approach with this θ in the bias term in general leads to higher ηk,m and Pm than those in Tables 3 and 4 for the same u. Nevertheless, the adaptive clip method with a specific u can still lead to lower ηk,m and Pm than those from BAR(1, c). For example, the second column of the top panel of Table S3 in the supplemental material shows that the adaptive clip method with u = 3 and c = 0 leads to lower ηk,m and Pm than the first column of the top panel in Table 3; this comes at a cost of 1.7 (85.1 − 83.4) fewer respondents on average than when BAR(1, 0) is implemented.
In summary, the above illustrations show that BAR performs better (in terms of the considered metrics) for the staircase scenario than for the LFC scenario, and in smaller trials. When the BAR procedure with τ = 1 is implemented in conjunction with the adaptive clip method, the percentage of adverse imbalance and Pm can be reduced below what BAR(1, c) with fixed c can achieve. We also find that both ηk,m and Pm for m = 20, 30 are within a reasonable range in most cases when BAR is implemented in conjunction with an appropriate adaptive clip method. In general, we think that c = 0 is more appropriate than c > 0, since the latter is likely to sacrifice unnecessarily many respondents to achieve the desirable allocations (e.g. compare the last rows of the top and bottom panels of Tables 2 and 4). In this case, when the true treatment effect is much larger than the assumed value in the adaptive clip method, the resulting Δi|data > 0 and the updated allocation probability is used. On the contrary, when the true treatment effect is much smaller than the assumed value, the resulting Δi|data < 0 and the allocation probability is set to be at least g(i, Δi); the behaviour of the adaptive clip method is then similar to that of BAR(1, 0) if u is chosen such that g(i, Δi) is close to zero for most i and small Δi|data.
In the next section, we illustrate the strategy for choosing a randomization procedure for implementation purposes. The adaptive clip method with c = 0 is used in conjunction with several variants of BAR in what follows.
4. Strategy for choosing a randomization procedure
To assess the benefits of BAR and to help us choose a randomization procedure, we consider the following approach for making inference, and subsequently define a utility function for the evaluation of randomization procedures.
Given the data of a completed trial, treatment k > 1 is declared more efficacious than the control treatment when the posterior probability P(θk > θ1 + δ | data) > ν, where δ is the minimal clinically significant improvement over the control treatment and ν is a threshold chosen by the user (see the next paragraph).
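As a concrete illustration, the posterior probability in this decision rule can be estimated by Monte Carlo. The sketch below assumes independent Beta(1, 1) priors on the response rates (the priors used later in Section 4.2); the counts, δ and ν shown are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def prob_superior(s_k, n_k, s_1, n_1, delta, n_draws=10_000):
    """Monte Carlo estimate of P(theta_k > theta_1 + delta | data)
    under independent Beta(1, 1) priors on each response rate."""
    theta_k = rng.beta(1 + s_k, 1 + n_k - s_k, n_draws)   # treatment arm
    theta_1 = rng.beta(1 + s_1, 1 + n_1 - s_1, n_draws)   # control arm
    return float(np.mean(theta_k > theta_1 + delta))

# hypothetical data: 20/45 responders on treatment vs 5/45 on control
p = prob_superior(20, 45, 5, 45, delta=0.10)
declare_superior = p > 0.95   # nu = 0.95 is an illustrative threshold
```

The threshold ν and the margin δ would in practice be set as described in the text, not hard-coded as here.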
A trial success, S, is defined as the situation when at least one treatment is shown to be more efficacious than the control treatment, i.e.
The value of ν is determined numerically to ensure that the probability of declaring a trial success is equal to a pre-specified false positive rate under the null case, e.g. 10% when all θk are the same. The trial sample size can be chosen numerically under the alternative case such that the probability of declaring a trial success, P(S = 1), is close to the desired level, e.g. 90%. Examples of an alternative case are the LFC scenario or the staircase scenario.4
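The numerical calibration of ν can be sketched as follows: simulate many trials under the null with equal allocation, record the largest posterior probability across the treatment arms in each trial, and take the upper 10% quantile. This is a minimal sketch assuming Beta(1, 1) priors and an illustrative per-arm sample size, not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def calibrate_nu(theta_null, n_per_arm, alpha=0.10, delta=0.0,
                 n_trials=1000, n_draws=4000):
    """Find nu so that P(S = 1) is close to alpha under the null, for a
    trial with equal allocation and Beta(1, 1) priors on each arm."""
    K = len(theta_null)
    stats = np.empty(n_trials)
    for t in range(n_trials):
        s = rng.binomial(n_per_arm, theta_null)            # responses per arm
        draws = rng.beta(1 + s[:, None],
                         1 + n_per_arm - s[:, None], (K, n_draws))
        # posterior P(theta_k > theta_1 + delta | data) for each k > 1
        probs = (draws[1:] > draws[0] + delta).mean(axis=1)
        stats[t] = probs.max()                             # trial-level statistic
    # S = 1 when the max exceeds nu, so nu is the upper alpha quantile
    return float(np.quantile(stats, 1 - alpha))

nu = calibrate_nu([0.09, 0.09, 0.09, 0.09], n_per_arm=45)
```

The trial sample size can then be chosen by repeating the same loop under an alternative scenario and increasing N until P(S = 1) reaches the desired power.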
Note that an alternative formulation is to identify whether treatment k is superior when compared with all the other treatments simultaneously. In this setting, S = 1 if that criterion is met for at least one k, and S = 0 otherwise. The procedure of sample size calculation described above remains the same. The choice of formulation depends on the clinical context of the trial.
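One common way to operationalise a simultaneous comparison of treatment k against all other arms is via the posterior probability that θk is the largest; whether this matches the authors' exact formulation is not shown in this excerpt, so the sketch below is illustrative only, with hypothetical counts and Beta(1, 1) priors.

```python
import numpy as np

rng = np.random.default_rng(3)

# hypothetical interim counts for K = 4 arms
successes = np.array([5, 9, 14, 22])
n_per_arm = np.array([45, 45, 45, 45])

# joint posterior draws under independent Beta(1, 1) priors
draws = rng.beta(1 + successes[:, None],
                 1 + n_per_arm[:, None] - successes[:, None], (4, 10_000))

# P(theta_k = max_j theta_j | data): share of joint draws where arm k is best
p_best = np.bincount(draws.argmax(axis=0), minlength=4) / draws.shape[1]
```

Because the arm-wise probabilities are computed from the same joint draws, they sum to one by construction.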
Without loss of generality, we propose the following utility function to reflect the benefit of implementing a trial with a randomization procedure ξ:
where R is the number of respondents, Nf is the number of patients outside the trial, {θ1, . . . , θK} are the parameter values under the alternative scenario, w ≥ 0 is a weight parameter, and nk− quantifies the number of 'extra' patients assigned to the control arm.
The definition of nk− reflects the negative impact of assigning more patients to the control arm; this aligns with the goal of BAR, especially when there is at least one efficacious intervention. Nevertheless, it is important to assign patients to the control arm throughout the study period, as this might mitigate the impact of a possible trend on the final inference (see e.g. Lee and Wason,36 who explore the impact of trends on the analysis when nonconcurrent control data are used for the analysis of a platform trial). The second term of the first expression in the utility function represents the expected number of future respondents after the trial is declared successful; we assume that patients who are not in the trial will be treated with the most efficacious treatment following the results of the trial. The third term reflects the cost, or the undesirable feature, of employing a randomization procedure, i.e. the penalty of assigning more patients unnecessarily to the control arm. Specifically, w = 0 implies that assigning more patients to the control arm is not an issue from the perspective of the patients or the trial teams. By varying w, we can explore the trade-off between R and the undesirable events expected among the 'extra' patients in the control group, up to the point where Uξ(Nf, w) = 0, which indicates that implementing a trial with the corresponding randomization procedure provides no benefit when Nf = 0.
At the design stage of a trial, we can conduct simulation studies to compare randomization procedures based on the expected utility value under the alternative trial scenarios. As the parameters of the inference approach vary according to the randomization procedure, we recommend that practitioners choose one procedure as the benchmark for the comparisons. An optimal randomization approach can then be sought by maximizing the utility given the requirements and resources of the trial.
4.1. Procedure of making comparison
We now present the procedure for choosing a randomization approach.
1. Identify the requirements and setting of the trial, for example, the number of arms, the effect sizes under the alternative and the null scenarios, the definition of a trial success, S, and the prior distributions used in the randomization approach as well as in the final inference.
2. Choose a benchmark randomization approach (e.g. simple randomization or restricted block randomization) and compute the sample size and the cut-off point of the decision rule such that the power and the false positive rate are close to the required rates.
3. Identify the randomization procedures that are to be compared and the corresponding set-up. For Bayesian adaptive randomization approaches, choose an initial burn-in sample size, Nb, the values used in computing the bias element, Δi|data, and some values of u in the clip method. Plots such as those presented in Figure 1 can assist the decisions on Nb and u, such that the minimum allocation rate for future patients is at the desired level when the treatment effect is either under-estimated or the treatment is in fact less efficacious than expected.
4. Run simulations for each of the randomization procedures under the alternative scenario with the sample size and cut-off point selected in step 2. Record the expected number of respondents, R, the expected number of trial successes, S, and the expected number of future respondents from the trial.
5. Compare the estimated utility values of the procedures across a range of Nf and w. Examine the values of ηk,m and Pm for some values of m if they are of concern. Select the randomization approach that gives the highest E(Uξ(Nf, w)) for the range of Nf and w that are of interest to the trial.
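Steps 4 and 5 can be organized as a generic simulation loop. In this sketch, `simulate_trial` and `utility` are placeholders for the trial-specific simulator and the chosen utility function; the toy inputs in the demonstration are purely illustrative.

```python
import numpy as np

def compare_procedures(procedures, simulate_trial, utility,
                       n_sims, Nf_grid, w_grid):
    """Average the utility of simulated trials over a grid of patient
    horizons Nf and penalty weights w, for each randomization procedure.
    `simulate_trial(proc)` returns a dict of recorded trial summaries
    (e.g. R, S, future respondents); `utility(summary, Nf, w)` encodes
    the chosen utility function. Both are trial-specific placeholders."""
    expected_u = {}
    for proc in procedures:
        sims = [simulate_trial(proc) for _ in range(n_sims)]
        expected_u[proc] = np.array(
            [[np.mean([utility(s, Nf, w) for s in sims])
              for Nf in Nf_grid] for w in w_grid])
    # winner[i, j] names the procedure maximizing E[U] at
    # (w_grid[i], Nf_grid[j]), i.e. one cell of the heatmaps in Section 4.2
    stacked = np.stack([expected_u[p] for p in procedures])
    winner = np.array(procedures)[stacked.argmax(axis=0)]
    return expected_u, winner

# toy demonstration with two procedures and a deterministic "simulator"
procs = ["BR", "BAR(1, 0)"]
eu, winner = compare_procedures(
    procs,
    simulate_trial=lambda p: {"R": 12.0 if p == "BAR(1, 0)" else 10.0},
    utility=lambda s, Nf, w: s["R"] + 0.1 * Nf - w,
    n_sims=20, Nf_grid=[0, 100], w_grid=[0.0, 2.0])
```

Plotting `winner` over the (Nf, w) grid gives a heatmap of the kind shown in Figures 2(b) and 3(b).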
4.2. Illustration: Which randomization approach is favourable
We now illustrate the strategy by redesigning the study on ketamine mentioned in Section 2. In this example, a trial scenario is replicated with a randomization procedure 10,000 times; 10,000 posterior samples of θk are generated for each k when updating Pi(k) and when computing Δi|data.
Step 1. Suppose the primary interest is in the proportion of participants demonstrating a > 50% reduction in Montgomery–Asberg Depression Rating Scale scores at 72 h post-infusion, and that this outcome is available for updating the randomization allocation before the next patient is enrolled. Following O’Brien et al.,33 we consider a null scenario with θ = {0.09, 0.09, 0.09, 0.09} and an alternative scenario with θ = {0.09, 0.2, 0.3, 0.5}. An uninformative prior, Beta(1, 1) for θk, k = 1, . . . , 4, is incorporated into the Bayesian randomization approaches as well as into the final inference. The trial is declared successful when at least one of the intravenous ketamine doses is shown to be superior to the active placebo midazolam, i.e. S = 1 if the decision criterion is met for at least one k ∈ {2, 3, 4}.
Step 2. Consider simple randomization, BAR(0, 0), which has Pi(k) = 1/4 for all k, as the benchmark approach. A sample size of N = 180 and a cut-off point ν = 0.847 were identified for a power close to 80% and a false positive rate of 5%.
Step 3. Let BR denote the restricted block randomization. Consider the following randomization procedures to be compared: ξ = {BR, BAR(0, 0), BAR(0.5, 0), BAR(1, 0), BAR(i/(2N), 0), BAR(I(k), 0)}. For each of these approaches, Nb = 40 is selected and u = 2, 3 and 4 are considered in the clip method. The dashed and dotted lines in Figure 1(b) show the minimum allocation rates for u = 2, 3, 4, respectively. We see that for u = 2 the black lines for patients i = 40, . . . , 60 are rather high, i.e. Pi(k) ≥ 0.25. If the treatment is in fact less efficacious than expected, having such Pi(k) might lead to assigning more patients to this arm, which would then reduce the number of respondents in the trial.
Step 4. We run simulations for each of the randomization procedures under the alternative scenario. In this particular example, we estimate Δi|data by taking the difference between the posterior mean of θ and the value of θ under the null scenario, i.e. θ = {0.09, 0.09, 0.09, 0.09}.
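With a Beta(1, 1) prior, the posterior mean of θk after sk responses in nk patients is (1 + sk)/(2 + nk), so the bias element described in this step reduces to the following one-liner; the interim counts used here are hypothetical.

```python
# bias element: posterior mean of theta_k minus its null value of 0.09,
# with a Beta(1, 1) prior and illustrative interim counts
theta_null = 0.09
s_k, n_k = 11, 30                      # hypothetical responses so far on arm k
posterior_mean = (1 + s_k) / (2 + n_k)
delta_k = posterior_mean - theta_null  # > 0: arm looks better than the null
```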
Step 5. We examine the values of ηk,10 for k = 2, . . . , 4, and identify the optimal randomization approaches for a range of Nf and w under the alternative scenario. Table S5 in the supplemental material shows P(S = 1), the mean (and SD) of R, and ηk,10 for k = 2, . . . , 4, for the four BAR procedures implemented in conjunction with the adaptive clip method with u = 2, 3, 4, respectively. As expected, when the adaptive clip method is implemented with u = 2, the means of R across the four procedures are consistently smaller than when the method is implemented with u = 3 or u = 4; ηk,10 are also higher when u = 2. These observations suggest that u = 2 is unlikely to be a good option to consider further. In what follows, we compare the optimal randomization procedures when the adaptive clip method is implemented with u = 3 and with u = 4, respectively.
Consider the case where the adaptive clip method is implemented with u = 3. The box plots in Figure 2(a) show the distributions of R, where R > 0 when S = 1 in the simulation and R = 0 otherwise. The numbers under the box plots are the power of the trial, P(S = 1), when the randomization procedure shown on the x-axis is implemented. We see that all the adaptive randomization approaches have higher P(S = 1) than both BR and BAR(0, 0) in this example, with BAR(0.5, 0) and BAR(I(k), 0) leading to the highest P(S = 1). Looking at the box plots, we also see that all BAR procedures result in a larger median and interquartile range of R compared with BR and BAR(0, 0), and that implementing BAR(1, 0) leads to the largest values of R on average. The heatmap in Figure 2(b) indicates the randomization procedure that maximizes E[Uξ(Nf, w)] for a given pair of Nf (x-axis) and w (y-axis): green for BAR(1, 0), peach-orange for BAR(0.5, 0) and light-blue for BAR(I(k), 0). Some examples of the interpretation of the heatmap are as follows. Given Nf = 50 and w = 2, BAR(1, 0) yields the largest E[Uξ(Nf, w)] compared with the other procedures; when w = 3, BAR(0.5, 0) provides the largest E[Uξ(Nf, w)]; but when w ≥ 6.5 (and also when Nf > 2800 for all w ≥ 0), BAR(I(k), 0) leads to the largest E[Uξ(Nf, w)].
Figure 2.
Scenario where K = 4, Nb = 40, N = 180 and the adaptive clip method has u = 3 and θ = {0.09, 0.09, 0.09, 0.09} in Δi|data. (a) Box plots of R, with the numbers under the box plots corresponding to P(S = 1). (b) Colour code reflects the randomization procedure that maximizes the expected utility for a given pair of Nf (x-axis) and w (y-axis). The x-axis ranges from 1 to 3000.
Consider the case where the adaptive clip method is implemented with u = 4. Figure 3(a) shows that BAR(I(k), 0) now has the highest power across the considered procedures, while the other observations from Figure 2(a) remain the same. On the other hand, Figure 3(b) shows that BAR(I(k), 0) maximizes the utility for Nf ≥ 500 and w ≥ 0, and that the other two adaptive approaches are optimal when Nf is small.
Figure 3.
Scenario where K = 4, Nb = 40, N = 180 and the adaptive clip method has u = 4 and θ = {0.09, 0.09, 0.09, 0.09} in Δi|data. (a) Box plots of R, with the numbers under the box plots corresponding to P(S = 1). (b) Colour code reflects the randomization procedure that maximizes the expected utility for a given pair of Nf (x-axis) and w (y-axis). The x-axis ranges from 1 to 500.
To decide the final setting of the optimal randomization approach for a given pair of values of Nf and w, we compare the resulting expected utility values from the optimal adaptive clip method with u = 3 against the corresponding values from the optimal adaptive clip method with u = 4. Figure 4 shows the optimal approaches from this final comparison. We find that there is a triangular region (defined by the black lines in the figure), covering combinations of Nf and w of relatively small magnitude, where implementing the optimal adaptive clip method with u = 3 leads to a higher expected utility value. For w ≥ 4 and Nf that is not too small, the figure suggests that implementing BAR(I(k), 0) with u = 4 is the best option among the considered approaches in terms of maximizing the expected utility from the trial.
Figure 4.
Optimal randomization approach: green for BAR(1, 0), peach-orange for BAR(0.5, 0) and light-blue for BAR(I(k), 0). The black lines separate the region where adaptive clip method with u = 3 or with u = 4 is the optimal approach.
5. Discussion
We have proposed an adaptive clip method to (i) prevent the treatment allocation probabilities from becoming infinitesimal during the early to middle stages of the trial and (ii) reduce the frequency with which the algorithm randomizes more patients to the inferior arm on average. The simulation studies in Section 3.3 show that, in terms of the percentage of adverse imbalance in the treatment arms, BAR performs better when the efficacies of the treatments are dissimilar and when the trial sample size is small. When BAR with τ = 1 is implemented in conjunction with our clip method, the frequency of assigning patients to inferior arms can be reduced.
Since there are different variants of BAR in the literature, it can be difficult to envisage which one is the most relevant for application to real trials. We have proposed a utility function to aid the selection and provided step-by-step guidance on implementation. Instead of focusing only on the power and the total number of respondents (known as patient benefit), we consider the treatment benefits for future patients who are not in the trial and include a penalty element for assigning more patients unnecessarily to the control arm. For a range of penalty weights and patient horizons, we have illustrated how to identify the optimal randomization procedure that maximizes the expected utility. In our illustrations, we see that both simple randomization and restricted block randomization are poor options from the perspective of maximizing the utility of implementing a trial. This is likely because simple randomization leads to large percentages of adverse imbalance in all treatment arms (see, for example, Wathen and Thall4) and because BAR leads to higher power when a staircase scenario is considered.
We did not make analogous comparisons for other variants of BAR. We conjecture that the adaptive clip method with c = 0 would complement BAR variants that have high variability in randomizing patients to the treatment arms. If the utility approach is not of primary interest, we suggest choosing the tuning parameter u based on the setting of the trial. For example, when the burn-in sample size of the trial is large, one can choose u such that the clip function decays more quickly towards 0 after observing a few patients following the burn-in period. A future investigation could explore how to choose the burn-in period when the adaptive clip method is used in conjunction with a BAR procedure.
One limitation of our illustrations is that, in approximating the bias term of the adaptive clip method, we have only considered either the true values under the alternative scenario or the values under the null scenario. In practice, one might conduct sensitivity analyses with different values of θ in generating the responses and in approximating the bias element of the adaptive clip method to explore the impact of such deviations. We remind readers that different settings and benchmark approaches can be considered in the selection strategy, e.g. given a trial sample size and a benchmark approach, identify the error rates accordingly before comparing different randomization procedures through simulation studies.
We note that other formulations of the utility function can be explored to suit the needs or the context of a trial, for example, including monetary values to reflect the cost-effectiveness of the approach. Future work could investigate such a selection strategy for implementing BAR in trials without a control arm and in platform trials, such as the setting considered by Lee et al.37 One may also extend the strategy to compare different classes of adaptive randomization where the goal of the randomization approach includes maintaining covariate balance; see Rosenberger and Lachin,38 for example.
Supplementary Material
Acknowledgements
We are grateful to the reviewers for their helpful comments on an earlier version of this paper.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work and KML were funded by the Medical Research Council in the United Kingdom (grant code MR/N028171/1). JJL’s research was supported in part by grants CA016672 and CA221703 from the National Cancer Institute in the United States.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
References
- 1. Robertson DS, Lee KM, Lopez-Kolkovska BC, et al. Response-adaptive randomization in clinical trials: from myths to practical considerations. arXiv:2005.00564, http://arxiv.org/abs/2005.00564 (accessed 17 February 2021). doi: 10.1214/22-STS865.
- 2. Thompson WR. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika. 1933;25:285–294.
- 3. Thall PF, Wathen JK. Practical Bayesian adaptive randomisation in clinical trials. Eur J Cancer. 2007;43:859–866. doi: 10.1016/j.ejca.2007.01.006.
- 4. Wathen JK, Thall PF. A simulation study of outcome adaptive randomization in multi-arm clinical trials. Clin Trials J Soc Clin Trials. 2017;14:432–440. doi: 10.1177/1740774517692302.
- 5. Lee JJ, Chen N, Yin G. Worth adapting? Revisiting the usefulness of outcome-adaptive randomization. Clin Cancer Res. 2012;18:4498–4507. doi: 10.1158/1078-0432.CCR-11-2555.
- 6. Du Y, Wang X, Jack Lee J. Simulation study for evaluating the performance of response-adaptive randomization. Contemp Clin Trials. 2015;40:15–25. doi: 10.1016/j.cct.2014.11.006.
- 7. Wason JMS, Trippa L. A comparison of Bayesian adaptive randomization and multi-stage designs for multi-arm clinical trials. Stat Med. 2014;33:2206–2221. doi: 10.1002/sim.6086.
- 8. Lin J, Bunn V. Comparison of multi-arm multi-stage design and adaptive randomization in platform clinical trials. Contemp Clin Trials. 2017;54:48–59. doi: 10.1016/j.cct.2017.01.003.
- 9. Villar SS, Wason J, Bowden J. Response-adaptive randomization for multi-arm clinical trials using the forward looking Gittins index rule. Biometrics. 2015;71:969–978. doi: 10.1111/biom.12337.
- 10. Thall P, Fox P, Wathen J. Statistical controversies in clinical research: scientific and ethical problems with adaptive randomization in comparative clinical trials. Ann Oncol. 2015;26:1621–1628. doi: 10.1093/annonc/mdv238.
- 11. Viele K, Saville BR, McGlothlin A, et al. Comparison of response adaptive randomization features in multiarm clinical trials with control. Pharm Stat. 2020;19:602–612. doi: 10.1002/pst.2015.
- 12. Jiang Y, Zhao W, Durkalski-Mauldin V. Time-trend impact on treatment estimation in two-arm clinical trials with a binary outcome and Bayesian response adaptive randomization. J Biopharm Stat. 2020;30:69–88. doi: 10.1080/10543406.2019.1607368.
- 13. Yin G, Chen N, Jack Lee J. Phase II trial design with Bayesian adaptive randomization and predictive probability. J R Stat Soc Ser C Appl Stat. 2012;61:219–235. doi: 10.1111/j.1467-9876.2011.01006.x.
- 14. Viele K, Broglio K, McGlothlin A, et al. Comparison of methods for control allocation in multiple arm studies using response adaptive randomization. Clin Trials (London, England). 2020;17:52–60. doi: 10.1177/1740774519877836.
- 15. Jiang F, Jack Lee J, Müller P. A Bayesian decision-theoretic sequential response-adaptive randomization design. Stat Med. 2013;32:1975–1994. doi: 10.1002/sim.5735.
- 16. Sim J. Outcome-adaptive randomization in clinical trials: issues of participant welfare and autonomy. Theor Med Bioethics. 2019;40:83. doi: 10.1007/s11017-019-09481-0.
- 17. Fillion N. Clinical equipoise and adaptive clinical trials. Topoi. 2019;38:457–467.
- 18. London AJ. Learning health systems, clinical equipoise and the ethics of response adaptive randomisation. J Med Ethics. 2018;44:409–415. doi: 10.1136/medethics-2017-104549.
- 19. Bothwell LE, Kesselheim AS. The real-world ethics of adaptive-design clinical trials. Hastings Center Rep. 2017;47:27–37. doi: 10.1002/hast.783.
- 20. Freidlin B, Korn EL. Ethics of outcome adaptive randomization. In: Wiley StatsRef: Statistics Reference Online. John Wiley; 2016.
- 21. Saxman SB. Ethical considerations for outcome-adaptive trial designs: a clinical researcher’s perspective. Bioethics. 2015;29:59–65. doi: 10.1111/bioe.12084.
- 22. Hey SP, Kimmelman J. Are outcome-adaptive allocation trials ethical? Clin Trials J Soc Clin Trials. 2015;12:102–106. doi: 10.1177/1740774514563583.
- 23. Korn EL, Freidlin B. Outcome-adaptive randomization: is it useful? J Clin Oncol. 2011;29:771–776. doi: 10.1200/JCO.2010.31.1423.
- 24. Du Y, Cook JD, Lee JJ. Comparing three regularization methods to avoid extreme allocation probability in response-adaptive randomization. J Biopharm Stat. 2018;28:309. doi: 10.1080/10543406.2017.1293077.
- 25. Bowden J, Trippa L. Unbiased estimation for response adaptive clinical trials. Stat Meth Med Res. 2017;26:2376–2388. doi: 10.1177/0962280215597716.
- 26. Berry DA, Eick SG. Adaptive assignment versus balanced randomization in clinical trials: a decision analysis. Stat Med. 1995;14:231–246. doi: 10.1002/sim.4780140302.
- 27. Park JW, Liu MC, Yee D, et al. Adaptive randomization of neratinib in early breast cancer. New Engl J Med. 2016;375:11–22. doi: 10.1056/NEJMoa1513750.
- 28. Rugo HS, Olopade OI, DeMichele A, et al. Adaptive randomization of veliparib-carboplatin treatment in breast cancer. New Engl J Med. 2016;375:23–34. doi: 10.1056/NEJMoa1513749.
- 29. Kim ES, Herbst RS, Wistuba II, et al. The BATTLE trial: personalizing therapy for lung cancer. Cancer Discov. 2011;1:44–53. doi: 10.1158/2159-8274.CD-10-0010.
- 30. Maki RG, Wathen JK, Patel SR, et al. Randomized phase II study of gemcitabine and docetaxel compared with gemcitabine alone in patients with metastatic soft tissue sarcomas: results of sarcoma alliance for research through collaboration study 002 [corrected]. J Clin Oncol. 2007;25:2755–2763. doi: 10.1200/JCO.2006.10.4117.
- 31. Cellamare M, Ventz S, Baudin E, et al. A Bayesian response-adaptive trial in tuberculosis: the end TB trial. Clin Trials J Soc Clin Trials. 2017;14:17–28. doi: 10.1177/1740774516665090.
- 32. Cellamare M, Milstein M, Ventz S, et al. Bayesian adaptive randomization in a clinical trial to identify new regimens for MDR-TB: the endTB trial. Int J Tuberculosis Lung Dis. 2016;20:8–12. doi: 10.5588/ijtld.16.0066.
- 33. O’Brien B, Green CE, Al-Jurdi R, et al. Bayesian adaptive randomization trial of intravenous ketamine for veterans with late-life, treatment-resistant depression. Contemp Clin Trials Commun. 2019;16:100432. doi: 10.1016/j.conctc.2019.100432.
- 34. Connor JT, Elm JJ, Broglio KR, et al. Bayesian adaptive trials offer advantages in comparative effectiveness trials: an example in status epilepticus. J Clin Epidemiol. 2013;66(8 Suppl):S130–S137. doi: 10.1016/j.jclinepi.2013.02.015.
- 35. Wang Y, Zhu H, Lee JJ. Evaluation of bias for outcome adaptive randomization designs with binary endpoints. Stat Interf. 2020;13:287–315.
- 36. Lee KM, Wason J. Including non-concurrent control patients in the analysis of platform trials: is it worth it? BMC Med Res Methodol. 2020;20:165. doi: 10.1186/s12874-020-01043-6.
- 37. Lee KM, Wason J, Stallard N. To add or not to add a new treatment arm to a multiarm study: a decision theoretic framework. Stat Med. 2019;38:3305–3321. doi: 10.1002/sim.8194.
- 38. Rosenberger WF, Lachin JM. Randomization in clinical trials: theory and practice. Wiley Series in Probability and Statistics. Hoboken, NJ: John Wiley; 2002.