Author manuscript; available in PMC: 2026 Apr 7.
Published in final edited form as: Pharm Stat. 2016 Jun 15;15(5):396–404. doi: 10.1002/pst.1755

Hyperbaric oxygen brain injury treatment (HOBIT) trial: a multifactor design with response adaptive randomization and longitudinal modeling

Byron J Gajewski a,*, Scott M Berry a,b, William G Barsan c, Robert Silbergleit c, William J Meurer c,d, Renee Martin e, Gaylan L Rockswold f,g
PMCID: PMC13051609  NIHMSID: NIHMS2129641  PMID: 27306921

Abstract

The goals of phase II clinical trials are to gain important information about the performance of novel treatments and decide whether to conduct a larger phase III trial. This can be complicated in cases when the phase II trial objective is to identify a novel treatment having several factors. Such multifactor treatment scenarios can be explored using fixed sample size trials. However, the alternative design could be response adaptive randomization with interim analyses and additionally, longitudinal modeling whereby more data could be used in the estimation process. This combined approach allows a quicker and more responsive adaptation to early estimates of later endpoints. Such alternative clinical trial designs are potentially more powerful, faster, and smaller than fixed randomized designs. Such designs are particularly challenging, however, because phase II trials tend to be smaller than subsequent confirmatory phase III trials. The phase II trial may need to explore a large number of treatment variations to ensure that the efficacy of optimal clinical conditions is not overlooked. Adaptive trial designs need to be carefully evaluated to understand how they will perform and to take full advantage of their potential benefits. This manuscript discusses a Bayesian response adaptive randomization design with a longitudinal model that uses a multifactor approach for predicting phase III study success via the phase II data. The approach is based on an actual clinical trial design for the hyperbaric oxygen brain injury treatment trial. Specific details of the thought process and the models informing the trial design are provided.

Keywords: Bayesian adaptive design, multiple factors, phase II clinical trial

1. INTRODUCTION

This manuscript describes a phase II clinical trial adaptive design for selecting the combination of hyperbaric oxygen (HBO2) treatment dose parameters [pressure, frequency, and intervening normobaric hyperoxia (NBH)] that provides the greatest improvement in the rate of good neurological outcome versus standard care for subjects with severe traumatic brain injury (TBI). A second goal of this phase II trial is to determine if there is any factor combination of HBO2 treatment that has at least a 50% probability of demonstrating improvement in the rate of good neurological outcome versus a control (i.e., standard care) in a subsequent phase III confirmatory trial, assumed to enroll 500 subjects in the control arm and 500 in the novel treatment arm.

Despite numerous clinical trials for treatment of TBI, subjects with TBI have high mortality and poor outcomes [1]. Preclinical and clinical investigations indicate that HBO2 is physiologically active in reducing brain injury and improving outcomes in severe TBI [2]. There are peer-reviewed published animal studies from well-established research laboratories that indicate that HBO2 potentially improves outcome from TBI by multiple mechanisms [3]. By markedly increasing oxygen (O2) delivery to the traumatized brain, HBO2 can improve cellular energy metabolism, attenuate cell signaling and cytosolic ischemic cascades, and reduce subsequent necrotic and programmed cell death. Many clinical investigations in HBO2 have steadily corroborated that HBO2 in comparison with standard care significantly improves markers of oxidative metabolism in relatively uninjured brain as well as pericontusional tissue, reduces intracranial hypertension, demonstrates improvements in markers of cerebral toxicity, and improves clinical outcome. However, important information, optimizing the HBO2 treatment factor combination in terms of pressure, frequency, and whether NBH delivered following the HBO2 treatment, is required prior to a definitive clinical efficacy study. Preclinical investigations working with TBI models have used pressures varying from 1.5 to 3.0 atmospheres absolute (ATA). Clinical investigators have used pressures varying from 1.5 to 2.5 ATA. However, the lungs in severe TBI subjects have frequently been compromised by direct lung injury and/or acquired ventilator pneumonia and are susceptible to O2 toxicity. Working within these constraints, it is essential to determine the most effective HBO2 dose schedule without producing O2 toxicity and clinical complications. This proposed clinical trial is designed to answer these questions and to provide important information for a confirmatory efficacy phase III trial.

The primary endpoint is the severity adjusted Glasgow Outcome Scale - Extended (GOS-E, binary response) at 6 months after the patient has enrolled in this study. Moreover, GOS-E at 30 days could be used for predicting 6-month GOS-E, allowing for accelerated learning of the primary endpoint through the longitudinal modeling. The combination of the three HBO2 treatment factors with different levels – pressure (1.5, 2.0, and 2.5 ATA), dose frequency [once a day (QD) and twice daily (BID)], and NBH (with and without) – will be studied. Not all possible combinations are used because of potential O2 toxicity, and the trial will explore the efficacy of nine different active arms in comparison with the control arm. If there is at least one experimental arm with sufficient probability of being better than control at phase II, the combination of that active arm will be selected for the future phase III trial to confirm the HBO2 efficacy. Another design constraint is the sample size of about 200 enrolled subjects from approximately 15 clinical centers. The sample size is relatively small because of budget constraints.

Given these constraints, this multifactor treatment design is challenging: there are three dimensions to study, the sample size is small, and the tool that provides the endpoint (GOS-E) is subject to variability from the heterogeneity of the disease. In addition, the added variability from posterior predictive distributions used for forecasting phase III success contributes noise.

Therefore, it would be quite attractive to garner model efficiencies to improve later prediction. Using a detailed phase II clinical trial design as an example, tailored to the case of multiple factors and to predicting phase III success from the phase II data, this manuscript illustrates that a Bayesian adaptive design is useful. Bayesian predictive probabilities are useful to drive learning; this is not in itself novel for a phase II trial but has been underutilized. A model’s efficiency can potentially be improved with a main effects factor model based on the strong assumption that there are no interactions. We also use the main effects model in this design to borrow information from the other arms. This could have other applications, as studying related treatments is common and could be done more efficiently if such treatments are not assumed to be independent. Every subject will still help in learning about all three factors and predicting phase III success. However, because interactions among the treatment factors are believed to be possible, a pairwise independent model is used for the primary analysis. The result can be considered a hybrid design that uses both a pairwise independent model and a main effects model. The approach is based on an actual emergency medicine clinical trial design, the hyperbaric oxygen brain injury treatment (HOBIT) trial.

1.1. Design choice

Possible choices for the design are fixed sample size trials or response adaptive randomization (RAR) with interim analyses for possible success and/or futility. Other options include RAR with longitudinal (RAR+L) modeling because it can increase efficiency of the trial, particularly when the short-term endpoint is relatively predictive of the long-term endpoint and some relationship (linear, quadratic, etc.) between the two is approximately correct [4–7].

Response adaptive randomization may provide clinical trial designs that are more powerful, faster, and smaller than fixed randomized designs. However, phase II trials tend to have a smaller sample size than their subsequent confirmatory phase III trials and may require exploration of numerous treatment options to identify the combination of treatment parameters most likely to improve clinical outcome. Therefore, trial designs need to be carefully studied in order to take full advantage of the RAR approach [8,9]. For example, a multifactor treatment clinical trial quickly reduces to a small number of human subjects per treatment combination. In such cases, efficiency is achieved by treating the information from adjacent dose cells as informative rather than independent. The RAR dose-finding strategy provides a useful approach for optimal clinical trial design in the case of a single factor with multiple doses [10]; there is also literature on the two-factor clinical trial design [11–14]; however, there is little literature on trial design and sample size computations using multifactor (e.g., three-factor) designs having RAR.

The trial will utilize RAR to favor the better performing experimental arms and possibly early stopping for success or futility. RAR implies using a predefined algorithm for changing the treatment allocation during the trial based on efficacy data, while leaving the clinical investigators blinded. In some situations, RAR allows for substantially smaller sample sizes and provides better conclusions by favoring arms that are performing better and slowing enrollment to the arms that are performing relatively poorly [15]. Although unlikely in the phase II multi-treatment space, early termination of the study can allow more rapid development of a promising treatment or, more commonly, efficient identification and rejection of less or not effective treatments that are unlikely to be beneficial if pursued further.

2. METHODS

This section presents the details for the phase II HOBIT trial. The goal of this phase II trial is to identify the best HBO2 treatment for subjects with severe TBI, which would optimally combine three HOBIT treatment factors with different levels: pressure (1.5, 2.0, and 2.5 ATA), dose frequency (BID and QD), and NBH (with and without). Furthermore, this manuscript also tries to predict the phase III success based on the phase II data.

2.1. Treatment arms

There are 10 treatment arms in the trial:

We label the control arm as a = 1, and the experimental arms as a = 2, 3, 4, 5, 6, 7, 8, 9, and 10, respectively.

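| Arm (a) | Pressure (ATA) | Frequency | NBH |
|---|---|---|---|
| 1 | 0 | 0 | Without |
| 2 | 2.0 | QD | Without |
| 3 | 2.5 | QD | Without |
| 4 | 1.5 | QD | With |
| 5 | 2.0 | QD | With |
| 6 | 2.5 | QD | With |
| 7 | 1.5 | BID | Without |
| 8 | 2.0 | BID | Without |
| 9 | 2.5 | BID | Without |
| 10 | 1.5 | BID | With |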

ATA, atmospheres absolute; NBH, normobaric hyperoxia; QD, once a day; BID, twice daily.

2.2. Primary endpoint

The primary endpoint is the GOS-E assessed at 6 months after the subject is enrolled in the clinical trial. The GOS-E is a binary response variable with the value of success or failure. We label it as $Y_6$. Additionally, the GOS-E assessed 1 month after enrollment will be used to inform a longitudinal model that predicts the 6-month outcome. We label it as $Y_1$.

2.3. Primary analyses

In the first analysis, we define the maximally effective treatment ($a_{\max}$) as the treatment with the greatest effect. For each experimental arm, we calculate the posterior probability of being superior to control, $\Pr(\theta_a - \theta_1 > 0)$, where $\theta_a$ is the 6-month GOS-E response rate for experimental arm $a$ and $\theta_1$ is that for the control arm. As clinical data are analyzed by the pairwise independent model, the phase II trial will be stopped if one of the three following cases occurs:

  1. Early success: At each interim analysis, the trial may stop accrual for expected success if $\Pr(\theta_a - \theta_1 > 0) > 0.975$ for $a_{\max}$. There must be at least 150 total subjects enrolled before the trial may stop for success. If a success stopping rule is met, then a final analysis will be conducted after all currently enrolled subjects have been followed to their final endpoint.

  2. Early futility: At each interim analysis, the trial may stop accrual for early futility if $\Pr(\theta_a - \theta_1 > 0) < 0.55$ for $a_{\max}$; therefore, all arms would meet this inequality for futility.

  3. Final success: At the final analysis, the trial will be considered a success if $\Pr(\theta_a - \theta_1 > 0) > 0.94$ for any $a$; this inequality would need to occur only for one arm. Otherwise, the phase II trial will fail.
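The three stopping rules can be sketched as a small decision function. This is a minimal illustration rather than the trial's implementation: the function names and the dictionary input are hypothetical, and it assumes that the arm checked against each threshold is the one with the largest posterior probability of superiority, which is consistent with the remark that all arms would then meet the futility inequality.

```python
def interim_decision(pr_superior, n_enrolled,
                     success_cut=0.975, futility_cut=0.55, min_n_success=150):
    """Apply the interim stopping rules.

    pr_superior: dict mapping experimental arm -> Pr(theta_a - theta_1 > 0 | data).
    Assumption: the best arm is the one with the largest such probability.
    """
    best = max(pr_superior.values())
    if best > success_cut and n_enrolled >= min_n_success:
        return "stop_success"
    if best < futility_cut:
        return "stop_futility"
    return "continue"


def final_decision(pr_superior, final_cut=0.94):
    """Final analysis: success if any experimental arm exceeds the cut."""
    return "success" if max(pr_superior.values()) > final_cut else "failure"


# Hypothetical posterior probabilities for two arms at an interim look.
print(interim_decision({2: 0.98, 3: 0.60}, n_enrolled=160))  # stop_success
print(interim_decision({2: 0.98, 3: 0.60}, n_enrolled=100))  # continue
```

Note that the early success rule is gated on enrollment, so a strong signal before 150 subjects leads to continued accrual rather than a stop.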

We now explain the second analysis, which uses the phase II data to predict phase III success. Phase III success is predicted if the maximally effective treatment has a greater than 50% probability of demonstrating improvement versus control in a subsequent confirmatory trial with 500 subjects in the control arm and 500 in the novel treatment arm. A prediction of phase III success is only calculated if $\Pr(\theta_a - \theta_1 > 0) > 0.94$ for the arm $a_{\max}$. Note that this calculation is made using the main effects model rather than the independent model previously mentioned. This should increase efficiency in this prediction.

2.4. Analysis population

The intent-to-treat population will be used to analyze the data. The intent-to-treat population will include all randomized subjects, and they will be assigned to different arms based on the randomization information regardless of the treatment received.

2.5. Adaptive design

The purpose of this phase II adaptive clinical trial design is to explore the efficacy of different active arms in comparison with the control arm. The trial will not compare the active arms with each other. The trial will utilize RAR to favor the better performing experimental arms. If the efficacy within at least one experimental arm is promising enough, it will advance to a phase III trial and be compared for superiority with the control arm (Figure 1). The trial will use both a pairwise independent and main effects multiple factor models to analyze the data based on the study objective (more detailed information will be described in Section 2.7).

Figure 1. Study design.

2.6. Randomization introduction

  1. Burn-in phase: An initial burn-in period of 50 subjects is used in which these subjects are enrolled in a fixed randomization to each arm. A ratio of 1:1:1:1:1:1:1:1:1:1 will be used for the burn-in period.

  2. Adaptive randomization phase: After the initial burn-in period, adaptive randomization will be utilized. A vector of probabilities, $q = (q_2, q_3, \ldots, q_{10})$, is created for randomizing to the experimental arms. A constant proportion of 20% of subjects will be enrolled to the control arm through the rest of phase II. Interim analyses will be executed quarterly to adjust the randomization probabilities based on the current interim analysis results. The probability for each experimental arm will be set in proportion to the posterior probability that the arm is $a_{\max}$ (details are given in Section 2.7.5).

2.7. Statistical modeling

This section describes the statistical modeling used in the adaptive design and the primary analysis. The modeling is Bayesian in nature.

Two models, a pairwise independent response model and a main effects response model, are utilized in this study. All of the trial adaptations are driven by the pairwise independent model, and we also identify the maximally effective experimental arm with this model. The main effects model will be used to predict phase III success; it assumes no interactions among the HOBIT treatment factors, which will likely make the prediction more efficient.

2.7.1. Pairwise independent response model for 6-month Glasgow Outcome Scale - Extended response.

The primary outcome is the 6-month GOS-E response, and we label it for subject $i$ as $Y_{i,6}$. We model the 6-month primary outcomes as Bernoulli distributed. The model is $Y_{i,6} \sim \mathrm{Bernoulli}(\theta_{a_i})$, where $a_i = 1, 2, 3, \ldots, 10$ is the arm for subject $i$.

We label the probability of the 6-month GOS-E response for arm $a$ as $\theta_a$. The GOS-E response rates for the control and novel arms are given the following prior distributions: $\mathrm{logit}(\theta_1) \sim N(-0.41, 0.75^2)$ for the control arm; $\mathrm{logit}(\theta_a) \sim N(0, 1.75^2)$ for the experimental arms, where $a = 2, 3, 4, \ldots, 10$. According to the previous clinical trials with the same endpoint, the control arm’s prior on the response scale ($\theta_1$) has a median of 0.40. If simulated data are fitted to a beta distribution, the control arm’s prior is equivalent to eight patients, that is, $\alpha_0 + \beta_0 \approx 8$, where $\alpha_0$ and $\beta_0$ are beta parameters. The novel arm’s prior median on the response scale is higher than that of the control arm at 0.50 but is much more vague because it is equivalent to only two patients (similar to a uniform distribution, $\alpha_0 + \beta_0 \approx 2$).
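The prior medians and approximate prior sample sizes quoted above can be checked by simulation. The sketch below draws from each logit-normal prior and matches a beta distribution by moments; moment matching is an assumption here, since the text does not say how the beta distribution was fitted, and the helper name `prior_summary` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def prior_summary(mu, sd, n_draws=200_000):
    """Median and moment-matched beta 'sample size' of a logit-normal prior."""
    theta = 1.0 / (1.0 + np.exp(-rng.normal(mu, sd, n_draws)))
    m, v = theta.mean(), theta.var()
    # For Beta(alpha0, beta0): var = m(1 - m) / (alpha0 + beta0 + 1).
    ess = m * (1.0 - m) / v - 1.0
    return float(np.median(theta)), float(ess)


control_median, control_ess = prior_summary(-0.41, 0.75)
novel_median, novel_ess = prior_summary(0.0, 1.75)
print(f"control: median {control_median:.2f}, alpha0 + beta0 about {control_ess:.1f}")
print(f"novel:   median {novel_median:.2f}, alpha0 + beta0 about {novel_ess:.1f}")
```

Under this moment-matching assumption the control prior comes out at a median near 0.40 with a prior sample size in the neighborhood of eight, and the novel-arm prior at a median near 0.50 with a prior sample size of roughly two.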

2.7.2. Main effects model for 6-month Glasgow Outcome Scale - Extended response used for phase III prediction.

The main effects model is $Y_{i,6} \sim \mathrm{Bernoulli}(P_i)$ for subject $i$. We construct a main effects model for the GOS-E response rate that is a function of pressure, NBH, and dose frequency. The logit transformation of $P_i$ is modeled with a linear equation. By assuming no interaction among the main factors, this model has fewer parameters and is designed to increase confidence in predicting phase III success. However, if there is in fact an interaction, the predicted value would carry less uncertainty but would be biased. This scenario is explored later in trial simulations. The structure is

$$\mathrm{logit}(P_i) = X_{i1}\mu + X_{i2}\alpha_{1.5\mathrm{ATA}} + X_{i3}\alpha_{2.0\mathrm{ATA}} + X_{i4}\alpha_{2.5\mathrm{ATA}} + X_{i5}\gamma_{\mathrm{NBH}} + X_{i6}\beta_{\mathrm{BID}}.$$

The $X$s are 0 or 1 depending on the treatment factor combination assigned to subject $i$; $\mu$ represents the effect of control. The $\alpha$’s represent the additional effect of pressure relative to control. $\gamma_{\mathrm{NBH}}$ and $\beta_{\mathrm{BID}}$ represent the additional effects of NBH and BID, respectively. The main effects model relates to the control and experimental arms in the following way:

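  1. control: $\mu$

  2. 2.0 ATA, without NBH, QD: $\mu + \alpha_{2.0\mathrm{ATA}}$

  3. 2.5 ATA, without NBH, QD: $\mu + \alpha_{2.5\mathrm{ATA}}$

  4. 1.5 ATA, with NBH, QD: $\mu + \alpha_{1.5\mathrm{ATA}} + \gamma_{\mathrm{NBH}}$

  5. 2.0 ATA, with NBH, QD: $\mu + \alpha_{2.0\mathrm{ATA}} + \gamma_{\mathrm{NBH}}$

  6. 2.5 ATA, with NBH, QD: $\mu + \alpha_{2.5\mathrm{ATA}} + \gamma_{\mathrm{NBH}}$

  7. 1.5 ATA, without NBH, BID: $\mu + \alpha_{1.5\mathrm{ATA}} + \beta_{\mathrm{BID}}$

  8. 2.0 ATA, without NBH, BID: $\mu + \alpha_{2.0\mathrm{ATA}} + \beta_{\mathrm{BID}}$

  9. 2.5 ATA, without NBH, BID: $\mu + \alpha_{2.5\mathrm{ATA}} + \beta_{\mathrm{BID}}$

  10. 1.5 ATA, with NBH, BID: $\mu + \alpha_{1.5\mathrm{ATA}} + \gamma_{\mathrm{NBH}} + \beta_{\mathrm{BID}}$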

The parameters of the main effects model, which are all on the logit scale, have the following prior distributions: $\mu \sim N(-0.41, 0.75^2)$ for the control arm, and $N(0, 10^2)$ for all other parameters. The intuition regarding the control arm’s prior is the same as for the control’s prior in the independent model. The prior for the additional parameters is essentially flat (i.e., uniform on the real line).
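As an illustration of how the main effects model maps each arm to a response probability, the following sketch evaluates the linear predictor for a given arm. The `ARMS` lookup follows the treatment arm table in Section 2.1; the function name and the parameter values in the usage lines are hypothetical and chosen only for demonstration.

```python
import math

# arm -> (pressure in ATA, or None for control; NBH given; BID dosing)
ARMS = {
    1: (None, False, False),
    2: (2.0, False, False), 3: (2.5, False, False),
    4: (1.5, True, False), 5: (2.0, True, False), 6: (2.5, True, False),
    7: (1.5, False, True), 8: (2.0, False, True), 9: (2.5, False, True),
    10: (1.5, True, True),
}


def response_probability(arm, mu, alpha, gamma_nbh, beta_bid):
    """Inverse logit of the main effects linear predictor (no interactions)."""
    pressure, nbh, bid = ARMS[arm]
    eta = mu
    if pressure is not None:
        eta += alpha[pressure]          # pressure effect relative to control
    eta += gamma_nbh * nbh + beta_bid * bid
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical parameter values on the logit scale.
alpha = {1.5: 0.2, 2.0: 0.5, 2.5: 0.3}
for arm in (1, 5, 10):
    print(arm, round(response_probability(arm, -0.41, alpha, 0.1, 0.1), 3))
```

Because the effects add on the logit scale, every subject contributes to the estimate of each factor that applies to their arm, which is the source of the efficiency gain discussed in the text.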

2.7.3. Longitudinal model.

In addition to the final 6-month endpoint, subjects will have a scheduled visit at 1 month. At any interim analysis, there may be subjects in each of the following categories: subjects who have completed all the visits with known final endpoint; subjects who are still in the visit process without final endpoint value; and subjects without data at all.

Let $Y_{i,6}$ be the final endpoint value for subject $i$ and let $Y_{i,1}$ be the 1-month response. We construct a longitudinal model to allow the unobserved 6-month endpoint to be imputed from the partial data. The beta-binomial longitudinal model updates two beta distributions:

  1. The posterior probability that a subject who is a responder at 1 month will be a responder at the 6-month endpoint:
    $\Pr(Y_{i,6} = 1 \mid Y_{i,1} = 1) \sim \mathrm{Beta}(\alpha_1, \beta_1)$,
    where
    $\alpha_1 = 20 + |Y_{i,6} = 1, Y_{i,1} = 1|$ and $\beta_1 = 5 + |Y_{i,6} = 0, Y_{i,1} = 1|$.
  2. The posterior probability that a subject who is a failure at 1 month will be a responder at the 6-month endpoint:
    $\Pr(Y_{i,6} = 1 \mid Y_{i,1} = 0) \sim \mathrm{Beta}(\alpha_2, \beta_2)$,
    where
    $\alpha_2 = 5 + |Y_{i,6} = 1, Y_{i,1} = 0|$ and $\beta_2 = 20 + |Y_{i,6} = 0, Y_{i,1} = 0|$.

In the aforementioned notation, $|Y_{i,6} = 1, Y_{i,1} = 1|$ indicates a count of the number of subjects whose 6-month endpoint was observed to be a response and whose intermediate outcome at 1 month was also a response. The other formulas with absolute values have similar correspondence. We fit a set of models pooling data across all arms. Note that our priors are fairly diffuse, and each has a prior sample size equivalent to 25 subjects. This prior was informed from previous TBI studies.
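The beta-binomial update and the imputation step it supports can be sketched as follows. This is an illustrative reading of the model above, not the FACTS implementation; the function names are hypothetical, and imputing by first drawing a transition probability and then a Bernoulli outcome is an assumption about how the imputation is carried out.

```python
import random


def update_longitudinal(pairs):
    """Update the two beta distributions from completed subjects.

    pairs: iterable of (y1, y6), the 1-month and 6-month binary responses,
    pooled across arms. Priors are Beta(20, 5) and Beta(5, 20) as in the text.
    """
    a1, b1 = 20, 5    # Pr(Y6 = 1 | Y1 = 1)
    a2, b2 = 5, 20    # Pr(Y6 = 1 | Y1 = 0)
    for y1, y6 in pairs:
        if y1 == 1:
            a1, b1 = a1 + y6, b1 + (1 - y6)
        else:
            a2, b2 = a2 + y6, b2 + (1 - y6)
    return (a1, b1), (a2, b2)


def impute_y6(y1, posteriors, rng):
    """Draw one imputed 6-month outcome for a subject with only Y1 observed."""
    a, b = posteriors[0] if y1 == 1 else posteriors[1]
    p = rng.betavariate(a, b)             # draw the transition probability
    return 1 if rng.random() < p else 0   # then the outcome itself


posteriors = update_longitudinal([(1, 1), (1, 0), (0, 0), (0, 1), (1, 1)])
print(posteriors)  # ((22, 6), (6, 21))
print(impute_y6(1, posteriors, random.Random(1)))
```

Each prior has $\alpha + \beta = 25$, which is the prior sample size mentioned in the text.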

2.7.4. Bayesian quantities.

The following Bayesian quantities used in this adaptive design are calculated at each interim analysis. From the joint posterior distribution, the posterior probability that each arm $a = 2, 3, 4, \ldots, 10$ is the maximally effective arm, $P_a^{\max}$, is calculated. The arm with the largest $P_a^{\max}$ is called the most likely maximally effective novel treatment. The posterior mean and variance of each GOS-E response rate are also calculated. We label $V(\theta_a)$ as the posterior variance of the parameter $\theta_a$. For the GOS-E response rate, the posterior probability that each experimental arm is superior to (has a larger response rate than) the control arm is calculated: $\Pr(\theta_a > \theta_1 \mid \text{data})$, where $a = 2, 3, 4, \ldots, 10$ and the data are the observed data from the phase II trial. At each interim analysis, each of these Bayesian quantities is calculated using the data of all subjects who have either completed the trial or are still in the visit process.
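Given posterior draws of the response rates, these quantities reduce to simple summaries over the draws. The sketch below assumes a hypothetical array layout (one row per posterior draw, column 0 for control, columns 1 to 9 for arms 2 to 10) and is not tied to the output format of any particular software.

```python
import numpy as np


def bayesian_quantities(draws):
    """Summarize posterior draws of theta (shape: n_draws x 10).

    Column 0 is the control arm; columns 1..9 are arms 2..10 (assumed layout).
    Returns P_a^max, Pr(theta_a > theta_1 | data), posterior means, variances.
    """
    experimental = draws[:, 1:]
    best = experimental.argmax(axis=1)    # best experimental arm in each draw
    p_max = np.bincount(best, minlength=experimental.shape[1]) / len(draws)
    p_superior = (experimental > draws[:, [0]]).mean(axis=0)
    return p_max, p_superior, draws.mean(axis=0), draws.var(axis=0)


# Tiny hand-made example: arm 2 is best in half the draws, arm 3 in the rest.
draws = np.full((4, 10), 0.3)
draws[:, 0] = 0.4
draws[:2, 1] = 0.9
draws[2:, 2] = 0.8
p_max, p_superior, post_mean, post_var = bayesian_quantities(draws)
print(p_max[:3], p_superior[:3])
```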

2.7.5. Adaptive randomization details.

The specification of the vector of probabilities for RAR is described in this section. The randomization vector is created by selection based on the posterior distribution of the GOS-E response for each arm.

The purpose of the adaptive randomization is to allocate subjects to the arms most likely to be maximally effective. In addition, the adaptive randomization could improve learning about how the maximally effective arm compares with the control arm.

A component, labeled $V_a$, is constructed for each arm. Set $V_1 = 1$, assuring a 1/5 probability for the control arm throughout the trial. The component for the experimental arms is $V_a = 4P_a^{\max}$ for $a = 2, 3, 4, \ldots, 10$. The randomization vector, $q$, is set as $q_a = V_a/5$ for $a = 1, 2, \ldots, 10$.
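The construction of the randomization vector is a direct calculation once the $P_a^{\max}$ values are available. A minimal sketch, with a hypothetical function name and made-up input probabilities:

```python
def randomization_vector(p_max):
    """Return q_1..q_10 from P_a^max for arms 2..10.

    V_1 = 1 fixes the control allocation at 1/5; V_a = 4 * P_a^max otherwise.
    """
    v = [1.0] + [4.0 * p for p in p_max]
    return [x / 5.0 for x in v]


# Hypothetical interim result: arms 2 and 5 share most of the probability.
q = randomization_vector([0.45, 0.05, 0.0, 0.40, 0.10, 0.0, 0.0, 0.0, 0.0])
print([round(x, 2) for x in q])
```

Because the $P_a^{\max}$ values sum to 1, the experimental arms always share the remaining 80% of the allocation and the vector sums to 1.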

2.7.6. Phase III predictions.

This adaptive design phase II clinical trial provides valuable information to predict the success probability of a phase III clinical trial that aims to confirm the efficacy and safety of the optimal combination of HBO2 treatment factors for severe TBI in comparison with standard care. The primary endpoint of the phase III trial will be the same as the one in phase II, that is, sliding dichotomized GOS-E at 6 months, which will be primarily analyzed by a chi-square test in the phase III study. The sample size for phase III is 500 in the control arm and 500 in the novel arm (n = 1000 in total), based on a two-tailed test at α = 0.05.

Taking the maximally effective arm from phase II trial simulations, we calculate the posterior predictive probability that there is a >50% probability of hyperbaric treatment demonstrating improvement in the rate of good neurological outcome versus standard treatment in a subsequent phase III confirmatory trial. This is calculated by the main effects model for the experimental arm that shows the maximum efficacy. To accomplish this, we use the posterior predictive distribution for the future responses of the control and novel arms, which we label $y_1^P$ and $y_a^P$, for arm $a = 2, 3, \ldots, 10$, respectively. It can be calculated via the following formula:

$$p(y_1^P, y_a^P \mid y) = \int \binom{500}{y_1^P} P_1^{y_1^P} (1 - P_1)^{500 - y_1^P} \binom{500}{y_a^P} P_a^{y_a^P} (1 - P_a)^{500 - y_a^P} \, p(\theta \mid y) \, d\theta,$$

where $y$ is the observed data from the phase II trial and $\theta$ is a vector of parameters from the main effects model. This distribution is calculated using simulation. Suppose that $a$ is the best HBO2 experimental arm identified in phase II; then the appropriate predictive chi-square test statistic is $X_a^{2P}$, which is calculated from $y_1^P$ and $y_a^P$. To achieve a >50% success probability of phase III, two conditions must be satisfied: (1) the main effects model indicates a >94% probability that the treatment of arm $a$ is better than that in the control arm; (2) there is a >50% probability that $X_a^{2P} > 3.841$.
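The simulation of this predictive probability can be sketched as follows: for each posterior draw of $(P_1, P_a)$, simulate the two phase III binomial counts, compute the chi-square statistic, and report the fraction of simulated trials exceeding 3.841. The function name and the inputs are hypothetical, and the Pearson statistic without continuity correction is an assumption, since the text does not specify the form of the test.

```python
import numpy as np


def phase3_success_probability(p1_draws, pa_draws, n_per_arm=500,
                               critical=3.841, seed=0):
    """Posterior predictive Pr(X^2 > critical) for a future two-arm trial.

    p1_draws, pa_draws: posterior draws of the control and novel response
    rates (for example, from the main effects model). Exceedances are counted
    in either direction, as in a two-tailed test.
    """
    rng = np.random.default_rng(seed)
    y1 = rng.binomial(n_per_arm, p1_draws)    # future control responders
    ya = rng.binomial(n_per_arm, pa_draws)    # future novel-arm responders
    pooled = (y1 + ya) / (2.0 * n_per_arm)
    denom = pooled * (1.0 - pooled) * (2.0 / n_per_arm)
    diff = (ya - y1) / n_per_arm
    with np.errstate(divide="ignore", invalid="ignore"):
        chi2 = np.where(denom > 0, diff ** 2 / denom, 0.0)
    return float(np.mean(chi2 > critical))


# Hypothetical posterior draws centered at 0.40 (control) and 0.50 (novel).
rng = np.random.default_rng(1)
p1 = rng.beta(40, 60, size=2000)
pa = rng.beta(50, 50, size=2000)
print(phase3_success_probability(p1, pa))
```

For two arms of equal size, the Pearson chi-square statistic equals the square of the pooled two-proportion z statistic, which is what the function computes.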

2.8. A comparison of pairwise independent and main effects models

For the purposes of comparing the two models in a closed-form situation, the primary endpoint is treated as a continuous response. Using a flat prior for the regression parameters and a known variance $\sigma^2$, the posterior standard errors of the model parameters can be calculated easily. Consider the burn-in period, in which each of the 10 arms has five subjects. It can be shown, for example, that the posterior standard error under the pairwise independent model is $\sigma\sqrt{1/5 + 1/5} = 0.6325\sigma$ when the pressure is 2.0 ATA. The corresponding standard error under the main effects model is $0.5578\sigma$; thus, the main effects standard error is 88% of that of the pairwise independent model, indicating the potential increase in efficiency from using the main effects model for the phase III prediction. Further detail is given with Table V in the results section. This reduction comes with added bias if an interaction does exist.
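The two standard errors can be reproduced numerically from the design matrix of the burn-in period. The sketch assumes that the quantity being compared is the effect of 2.0 ATA (QD, without NBH) relative to control, that is, $\theta_2 - \theta_1$ in the pairwise model and $\alpha_{2.0\mathrm{ATA}}$ in the main effects model; this reading is an assumption, although it reproduces the reported figures.

```python
import numpy as np

# One row per arm; columns: mu, alpha_1.5, alpha_2.0, alpha_2.5, NBH, BID.
arms = np.array([
    [1, 0, 0, 0, 0, 0],   # 1  control
    [1, 0, 1, 0, 0, 0],   # 2  2.0 ATA, QD, without NBH
    [1, 0, 0, 1, 0, 0],   # 3  2.5 ATA, QD, without NBH
    [1, 1, 0, 0, 1, 0],   # 4  1.5 ATA, QD, with NBH
    [1, 0, 1, 0, 1, 0],   # 5  2.0 ATA, QD, with NBH
    [1, 0, 0, 1, 1, 0],   # 6  2.5 ATA, QD, with NBH
    [1, 1, 0, 0, 0, 1],   # 7  1.5 ATA, BID, without NBH
    [1, 0, 1, 0, 0, 1],   # 8  2.0 ATA, BID, without NBH
    [1, 0, 0, 1, 0, 1],   # 9  2.5 ATA, BID, without NBH
    [1, 1, 0, 0, 1, 1],   # 10 1.5 ATA, BID, with NBH
], dtype=float)

X = np.repeat(arms, 5, axis=0)            # burn-in: 5 subjects per arm
cov = np.linalg.inv(X.T @ X)              # posterior covariance / sigma^2

se_pairwise = float(np.sqrt(1 / 5 + 1 / 5))   # arm 2 minus control, 5 vs 5
se_main = float(np.sqrt(cov[2, 2]))           # alpha_2.0ATA, main effects
print(round(se_pairwise, 4), round(se_main, 4), round(se_main / se_pairwise, 2))
# 0.6325 0.5578 0.88
```

The main effects estimate of the 2.0 ATA effect draws on all three arms dosed at 2.0 ATA (arms 2, 5, and 8) rather than arm 2 alone, which is why its standard error is smaller.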

Table V. Simulated trial operating characteristics.

| Case | Power phase II | Futility prob. | Size (n) | Duration (weeks) | % Patients allocated to 2.0 ATA pressure arms | Probability (>50%) of phase III success* |
|---|---|---|---|---|---|---|
| 1. Null hypothesis | 0.20 | 0.34 | 176 | 118 | 33% | 0.20 |
| 2. Small treatment effect | 0.48 | 0.13 | 186 | 129 | 38% | 0.51 |
| 3. Medium treatment effect | 0.65 | 0.06 | 187 | 131 | 36% | 0.71 |
| 4. Large treatment effect | 0.96 | 0.01 | 174 | 125 | 45% | 0.98 |
| 5. Harmful treatment effect | 0.09 | 0.57 | 158 | 102 | 33% | 0.08 |
| 6. Medium interaction | 0.63 | 0.09 | 185 | 106 | 33% | 0.57 |

ATA, atmospheres absolute.

* New calculation based on main effects model (S = 1000).

2.9. Software and computations

Computations were performed using three software packages: Fixed and Adaptive Clinical Trial Simulator (FACTS) [16], R [17], and Windows Bayesian inference Using Gibbs Sampling (WINBUGS) [18]. General functions of these packages are specified in Table I. The pairwise independent model with the longitudinal modeling and RAR was run in FACTS. The main effects model was fitted by simulation in R2WINBUGS with custom code.

Table I. Software applied.

| Software | Function |
|---|---|
| FACTS | Simulate data and accrual pattern. Fit model for response and longitudinal modeling. Calculate all Bayesian quantities. Perform response adaptive randomization and allocate virtual patients to appropriate arms. Tally all trial operating characteristics of phase II. |
| R | Pull data from FACTS into R. Call WinBUGS program. Tally all trial operating characteristics of phase III prediction. |
| WinBUGS | Fit main effects model and predict the posterior probability of phase III success. |

First, FACTS is software used to rapidly design, compare, and simulate both fixed and adaptive trials. It is built on compiled low-level languages, such as Fortran and C++; thus, it runs quite fast. Moreover, FACTS is flexible and easy to use because it can be accessed through an interactive graphical user interface. However, FACTS currently does not have the capability to implement a main effects model. We therefore decided to use FACTS to simulate the pairwise independent model, taking advantage of its flexibility and speed, and then to pass its data output to a program written specifically for phase III predictions in R2WINBUGS. The posterior simulation in FACTS used 1000 burn-in draws followed by 2500 draws for inference. In WINBUGS, 1000 burn-in draws were followed by 1000 draws for inference.

2.9.1. Simulations.

In FACTS, there is an option for defining dichotomous longitudinal response profiles that allows us to specify the overall transition probabilities between responder (“1”) and non-responder (“0”). The transition probabilities method for generating longitudinal responses simulates the response observed at each visit by using the probability that a subject becomes or remains a “1” from one visit to the next – and all subjects start with a response of 0.

We specify for each visit

  • the probability of a subject whose response was a “0” at the previous visit having a response of “1” at this visit

  • the probability of a subject whose response was a “1” at the previous visit having a response of “1” at this visit.

However, these probabilities imply one particular probability that a subject has a response of “1” at the final visit, so they need to be modified for each arm in each dose response profile to give the desired final probability of response. This is done by numerically determining, for each final response rate to be simulated, a single value that, when added to all the specified transition probabilities in the log-odds space, yields the desired probability of final response.

With two visits (1-month and 6-months) and first visit probability of 0->1 of 0.5 and then at the second visit probabilities of 0->1 of 0.2 and of 1->1 of 0.8, the probability of a final response is 0.5 (Table II).

Table II. Longitudinal data profile with initial transition probabilities.

| Visit | Month | Prob 1->1 | Prob 0->1 |
|---|---|---|---|
| 1 | 1 | | 0.5 |
| 2 | 6 | 0.8 | 0.2 |

If a response profile calls for the probability of a final response to be simulated with a probability of 0.7, a fixed offset in log-odds is found that when applied to all the transition probabilities results in the desired final probability of a response. Replicating this by hand yields an offset of 0.68, which results in the adjustment in Table III. All scenarios are adjusted in this fashion.

Table III. Longitudinal data profile after adjustment.

| Visit | Month | Prob 1->1 | Prob 0->1 |
|---|---|---|---|
| 1 | 1 | | 0.664 |
| 2 | 6 | 0.888 | 0.330 |
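The log-odds offset can be found with a simple one-dimensional root search. The sketch below uses bisection, which is an assumption since the numerical method is not specified in the text; the function names are hypothetical, and the default transition probabilities are the initial values of Table II.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def final_response_probability(offset, p_first=0.5, p11=0.8, p01=0.2):
    """Final-visit response probability after shifting all transitions."""
    first = expit(logit(p_first) + offset)   # 0->1 at the 1-month visit
    stay = expit(logit(p11) + offset)        # 1->1 at the 6-month visit
    gain = expit(logit(p01) + offset)        # 0->1 at the 6-month visit
    return first * stay + (1.0 - first) * gain


def solve_offset(target, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection: the final probability is increasing in the offset."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if final_response_probability(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


offset = solve_offset(0.7)
print(round(offset, 2))                                     # 0.68
print(round(expit(logit(0.5) + 0.68), 3),
      round(expit(logit(0.8) + 0.68), 3),
      round(expit(logit(0.2) + 0.68), 3))                   # 0.664 0.888 0.33
```

With an offset of zero the function returns the unadjusted final response probability of 0.5, and applying the rounded offset of 0.68 reproduces the adjusted transition probabilities in Table III.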

Glasgow Outcome Scale - Extended responses at 6-months of all the treatment arms within different case scenarios are presented in Table IV. The first case is referred to as the null hypothesis as each of the arms has identical GOS-E responses. All the novel arms have no improvement in terms of GOS-E response in comparison with the control arm. The remaining case scenarios explore the different GOS-E responses for the experimental arms including one case where harm is exhibited. The final case investigates the interactions of medium effect of pressure associated with NBH and frequency, which aims to comprehend the possible robustness of the main effects model.

Table IV. The adaptive clinical trial design evaluation via six case scenarios.

| Case scenario | 1.0, Control | 1.5, NBH | 2.0, NBH | 2.5, NBH | 1.5, BID | 2.0, BID | 2.5, BID | 1.5, NBH, BID | 2.0 | 2.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. Null hypothesis | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 |
| 2. Small treatment effect | 0.4 | 0.45 | 0.5 | 0.43 | 0.45 | 0.5 | 0.43 | 0.48 | 0.48 | 0.4 |
| 3. Medium treatment effect | 0.4 | 0.5 | 0.55 | 0.48 | 0.5 | 0.5 | 0.48 | 0.55 | 0.5 | 0.43 |
| 4. Large treatment effect | 0.4 | 0.57 | 0.7 | 0.52 | 0.57 | 0.7 | 0.52 | 0.65 | 0.63 | 0.45 |
| 5. Harmful treatment effect | 0.4 | 0.35 | 0.35 | 0.35 | 0.35 | 0.35 | 0.35 | 0.35 | 0.35 | 0.35 |
| 6. Medium interaction | 0.4 | 0.4 | 0.55 | 0.4 | 0.55 | 0.40 | 0.55 | 0.4 | 0.4 | 0.4 |

NBH, normobaric hyperoxia; BID, twice daily.

3. RESULTS

In this section, we summarize the results of several simulation cases as well as one null scenario used to assess the type I error of the design. For each case, 1000 trials are simulated. We present the results as a function of the GOS-E response for each arm at 6 months.

For all simulations in this section, we assume an accrual rate of 1.75 subjects per week. No drop outs are assumed.

The study is considered a success if a target duration arm is identified and phase III is recommended to be carried out. In the simulations, if a trial enters a possible success stage, the trial is stopped in the simulation.

We performed six sets of trial simulations based on the various cases of response to calculate the trial operating characteristics, that is, power, futility probability, sample size, duration, and subject allocation, which are presented in Table V. The first four cases range from no effect (null) to a large treatment effect, and none of them has an interaction between factors. Power increases (starting from a 20% type I error rate under the null) and the futility rate decreases as the effect increases: the null trial has a higher chance of stopping for futility, and the balance shifts toward a higher probability of stopping for success as the benefit becomes large. Both the sample size and duration increase from the null hypothesis to the medium treatment effect and then decrease for the large treatment effect. The percentage of patients placed in the best pressure (2.0 ATA) arms generally increases as the treatment effect increases. For the small, medium, and large treatment effect cases, the probability of phase III success is larger than the power for phase II because of the added efficiency of the main effects model; under the null, the two are equal (0.20). The fifth case explores the harmful treatment effect, for which the futility rate is high (0.57). Additionally, the medium interaction case has operating characteristics similar to those of the medium treatment effect case without interaction, except that its probability (>50%) of phase III success prediction (0.57) is lower than its phase II power (0.63). Of note, a fixed trial giving similar power would require an estimated sample size of 325, which is substantially larger.

Figure 2 compares the patient allocation of the RAR+L design with that of the fixed design, based on a single simulation’s MCMC results. RAR+L places more subjects on control. It also places more subjects on the better experimental arms, the top two being the third and sixth arms. Further, the incremental improvement in efficiency is shown by the decrease in the posterior standard deviation from the pairwise independent model to the main effects model, illustrating the increased predictive ability of the latter. The posterior standard deviation (Figure 3) for the main effects model was on average 79% of that for the pairwise independent model, which is very similar to the approximate approach discussed in Section 2.9.

Figure 2.


Allocation of subjects across arms for one simulated medium effect case for RAR with longitudinal (RAR+L) versus the fixed designs. Probability that arm a = 2, 3, 4, ..., 10 is better than control.

Figure 3.


Posterior mean and posterior standard deviation of Glasgow Outcome Scale - Extended from subjects across treatment arms for one simulated medium effect case for pairwise independent and main effects models.

3.1. RAR with longitudinal versus response adaptive randomization versus fixed

We compare the adaptive designs, RAR (no longitudinal modeling) and RAR+L (adding longitudinal modeling), with the fixed design (no early stopping, no longitudinal modeling, and no RAR) to illustrate the incremental effect of each design component (Table VI). Note that the two adaptive designs were recalibrated (i.e., the decision rules for success and/or futility were changed) so that all designs have the same type I error of 20% and the same futility rate of 34% in the “null hypothesis” case (Table V). This allows for a fair comparison across designs. RAR and RAR+L show power increases of 6% and 9% over the fixed design, respectively. Relative to the fixed design, RAR and RAR+L reduce the sample size by 24 and 26 subjects and the trial duration by 14 and 15 weeks, respectively.

Table VI.

Simulated trial operating characteristics for different designs using large effect.

Design      Power (phase II)   Futility prob.   Size (n)   Duration (weeks)

1. Fixed    0.88               0.00             200        140
2. RAR      0.94               0.02             176        126
3. RAR+L    0.96               0.01             174        125

RAR, response adaptive randomization; RAR+L, RAR with longitudinal.
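The comparison in Table VI can be reproduced with simple differences against the fixed design. The sketch below is a minimal illustration using only the tabulated values; the dictionary layout and the function name `gains_over_fixed` are hypothetical. Because the tabulated powers are rounded to two decimals, differences computed this way need not match the percentages quoted in the text exactly.

```python
# Values copied from Table VI (large effect case).
table_vi = {
    "Fixed": {"power": 0.88, "futility": 0.00, "n": 200, "weeks": 140},
    "RAR":   {"power": 0.94, "futility": 0.02, "n": 176, "weeks": 126},
    "RAR+L": {"power": 0.96, "futility": 0.01, "n": 174, "weeks": 125},
}


def gains_over_fixed(table, baseline="Fixed"):
    """Absolute differences of each design relative to the baseline design."""
    base = table[baseline]
    gains = {}
    for name, row in table.items():
        if name == baseline:
            continue
        gains[name] = {
            # Rounded to two decimals to match the precision of the table.
            "power_gain": round(row["power"] - base["power"], 2),
            "subjects_saved": base["n"] - row["n"],
            "weeks_saved": base["weeks"] - row["weeks"],
        }
    return gains


gains = gains_over_fixed(table_vi)
for design, g in gains.items():
    print(design, g)
```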

4. DISCUSSION

This manuscript presents a phase II clinical trial design that applies both pairwise independent and main effects models to explore whether one treatment for severe TBI is good enough to warrant a further phase III trial, and to predict the probability of future success. This approach can identify the optimal treatment factor combinations in terms of pressure, duration, and addition of NBH to HBO2 for severe TBI. The pairwise independent model “drives” the RAR+L modeling, and the main effects model uses the data from the trial to predict phase III success.

The main effects model is applied here to the specific HBO2 treatment, in service of the trial’s goal of choosing an arm and predicting that arm’s phase III success probability. Moreover, the approach can be used in the same way in any clinical trial in which investigators want to make the prediction more efficient by suppressing the interaction terms and keeping only the main effects. This type of modeling is also not limited to a binary endpoint. For example, external reviewers of our full clinical trial research plan wanted biological ties to the trial rather than reliance solely on the clinical endpoint provided by the GOS-E. Therefore, as a secondary outcome, it was proposed to measure intracranial pressure and model it as a continuous biological response. A main effects model was used here as well, but with a normally distributed response rather than a binary one. We used the expected sample sizes for the power calculations for the secondary endpoints.

The investigation in this manuscript is distinct from dose-finding models, in which investigators may be interested in a similar optimization but of a single continuous factor. Dose-finding trials of that type have been investigated with RAR+L modeling and have contributed useful innovations to the clinical trials toolbox. However, investigators commonly face the problem of selecting among multiple factors, and the approach explored in this manuscript provides a reference for identifying a novel treatment in that setting.

The model that we propose assumes that there is no interaction between the multiple factors in the step that predicts phase III success. We make this assumption because the sample size of the phase II trial is relatively small and the GOS-E endpoint is a noisy binary variable (e.g., relative to a continuous endpoint). Therefore, it is beneficial to seek efficiencies through the statistical model. The main effects model “collapses” the factors together so that each level of pressure provides information in three cells of the model. Of note, we do not investigate every experimental arm that the 3×2×2 factorial model could provide, which together with the control arm would be 13 arms. Although it would be more efficient from a statistical viewpoint to keep all of the factor combinations in the design, clinical trials must balance the risk–benefit trade-off against casting a net wide enough to capture potentially important novel treatments. Thus, three factor combinations with too high an oxygen dose were eliminated based on medical considerations. A second note is that while the phase III calculations suppress the interaction, the phase II portion uses the pairwise independent means model, which does allow for interactions among the multiple factors.
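The arm counts mentioned above can be checked by enumerating the factorial structure. The sketch below uses generic level indices because the specific factor levels and the identity of the three eliminated combinations are not given in this passage; only the counts are taken from the text, and all variable names are hypothetical.

```python
from itertools import product

# Number of levels per factor, from the 3x2x2 structure described in the text.
N_PRESSURE, N_DURATION, N_NBH = 3, 2, 2
N_ELIMINATED = 3  # combinations removed on medical grounds (count only)

# Enumerate every cell of the full factorial with generic level indices.
full_factorial = list(product(range(N_PRESSURE),
                              range(N_DURATION),
                              range(N_NBH)))

n_full = len(full_factorial)            # experimental cells in the full factorial
n_with_control = n_full + 1             # add the control arm
n_experimental = n_full - N_ELIMINATED  # experimental arms kept in the design
n_arms_in_design = n_experimental + 1   # kept experimental arms plus control

print(n_full, n_with_control, n_experimental, n_arms_in_design)  # 12 13 9 10
```

The resulting total of 10 arms is consistent with the arm indexing a = 2, ..., 10 for the experimental arms in the caption of Figure 2.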

