2025 Jun 20;81(2):ujaf073. doi: 10.1093/biomtc/ujaf073

Design of platform trials with a change in the control treatment arm

Peter Greenstreet 1,2, Thomas Jaki 3,4, Alun Bedding 5, Pavel Mozgunov 6
PMCID: PMC12204708  PMID: 40539327

ABSTRACT

Platform trials are an efficient way of testing multiple treatments. We consider platform trials where, if a treatment is found to be superior to the control, it will become the new standard of care. The remaining treatments are then tested against this new control. In this setting, one can either keep the information on both the new standard of care and the other active treatments before the control is changed or discard this information when testing for benefit of the remaining treatments. We show analytically and numerically that retaining the information collected before the change in control can be detrimental to the power in a frequentist multi-arm multi-stage trial. Specifically, we consider the overall power, the probability that the active treatment with the greatest treatment effect is found during the trial, and the conditional power, the probability that a given treatment is found superior against the current control. Also studied is the conditional type I error, the probability that a given treatment is incorrectly found superior against the current control. We prove when retaining the information decreases both the overall and conditional power but also decreases the conditional type I error. A motivating example is then studied. Based on these observations, we discuss different aspects to consider when deciding whether to run a continuous platform trial or run an inherently new trial using the same trial infrastructure.

Keywords: change in control, frequentist trials, multi-arm, multi-stage, platform trials

1. INTRODUCTION

Clinical trials take many years and are very costly to run (Mullard, 2018), which has led to multiple developments in methodology on how to efficiently design them (Pallmann et al., 2018). One of these developments has been the idea of platform trials in which multiple treatments are tested against a common control group (Wason and Jaki, 2012). Platform trials can be advantageous due to having a shared trial infrastructure and shared control groups (Burnett et al., 2024). The interest in these types of trials has increased since the beginning of the COVID-19 pandemic (Stallard et al., 2020), as platform trials can result in therapies being identified faster while reducing cost and time (Cohen et al., 2015).

One additional ability one may want from a platform trial’s design is to be able to change the control group to a beneficial new treatment found within the trial. A change of control has happened in multiple platform trials such as STAMPEDE (Sydes et al., 2012) and RECOVERY (Horby et al., 2021). In STAMPEDE, the control was changed as the standard of care changed during the trial, and in RECOVERY, there was no standard treatment, so it was changed once a treatment was shown to be superior. When changing the control group, one may think of using all the data collected to calculate the future test statistics. There is little work currently investigating whether keeping the data collected prior to the change of the control group treatment is the most efficient approach when allowing for a single change in control. In this paper, we will consider 2 settings: (1) we keep all the data from before the change in control and (2) we do not keep any of the data prior to the control changing. The 2 settings lead to the question: when considering power, would one be better off starting a new trial or continuing with the original trial, assuming that the trial objective remains the same and the established trial infrastructure is retained?

This work will focus on frequentist multi-arm multi-stage trials (MAMS). Multi-arm trials allow for multiple treatments to be compared at once against a common control treatment. Multi-stage trials have interim analyses, which allow for ineffective treatments to be dropped for futility (or lack of benefit) earlier. As a result, interim analyses can improve a trial’s operating characteristics (Pocock, 1977; Todd et al., 2001). They can also allow treatments to stop early if a superior treatment is found; however, in the case studied here, the first time this happens this superior treatment will become the new control.

We will focus our investigation on 2 types of power: (1) Conditional power of a treatment—the probability a given treatment can be found superior against the current control. (2) Overall power of the trial—the probability that the active treatment with the greatest treatment effect is found during the trial. The conditional power is considered as it is of interest under the assumption that one of the remaining treatments may be superior to the new control but has not yet been declared the winner. The overall power is also considered as a metric that the trial results in the best treatment being found in the trial overall. Additionally, in this manuscript, the conditional type I error is investigated, which is the probability that a given treatment is incorrectly found superior against the current control.

As seen in the RECOVERY trial (Horby et al., 2021), when a new standard of care was found within the trial, the objective of the trial remained testing for superiority for the remaining treatments. The motivating example studied in this paper, in Section 5, is based on the TAILoR trial (Pushpakom et al., 2015). However, the results in this manuscript are generalizable to other frequentist platform trials. The TAILoR trial was motivated by combination antiretroviral therapy increasing the risk of insulin resistance, obesity, and type 2 diabetes, predisposing factors for cardiovascular disease, in human immunodeficiency virus (HIV)-positive individuals. The TAILoR study was an adaptive trial studying 3 different doses of telmisartan compared to control to see which (if any) reduces insulin resistance in HIV-positive individuals on antiretrovirals. The inclusion/exclusion criteria detailed in Pushpakom et al. (2015) included requiring the participants to be HIV-positive adults; receiving antiretroviral therapy for at least 6 months; having no pre-existing diagnosis of type 1 or 2 diabetes; and not known to have consistently low blood pressure. The primary outcome measure was a reduction in HOMA-IR (Homeostatic Model Assessment of Insulin Resistance) (Singh and Saxena, 2010). Superiority comparisons continue to be of interest if one finds that one of the lower doses is superior to the control as one would only want to consider using a higher dose if it is superior to the current dose. The results of this work, however, apply to any frequentist platform trial where continued testing for superiority is of interest.

In Section 2, we introduce the notation and the null hypotheses of interest and discuss type I error. Section 3 studies the conditional power and conditional type I error for MAMS trials and gives theorems stating when keeping the old data is guaranteed to be detrimental to conditional power but beneficial to conditional type I error. In Section 4, we define the overall power, give its formulation, and give theorems stating when keeping the old data is guaranteed to be detrimental to the overall power. Section 5 studies the motivating example. Finally, we discuss the considerations one needs to make when deciding whether to use the pre-change data or not.

2. NOTATION AND TYPE I ERROR CONTROL

Consider a clinical trial with up to Inline graphic experimental arms that will be tested against 1 common control arm. For simplicity, we assume that the primary outcome for each patient is independent and normally distributed with known variance Inline graphic. In Web Appendix D, the findings are generalized to the case where the test statistics used for the primary endpoint are asymptotically normal. Each active treatment is tested at Inline graphic analyses with Inline graphic interim analyses. Let Inline graphic denote the number of patients recruited to treatment Inline graphic by the end of stage Inline graphic. For this paper, the focus will be on equal sample size and allocation ratio for each treatment and equally spaced interim analyses for all treatments, as this ensures equal pairwise error for each treatment without needing to have multiple boundary shapes (Greenstreet et al., 2024). Therefore, the number of patients recruited between interim analyses is equal for each treatment, so we define Inline graphic for all Inline graphic and Inline graphic. The focus of this work will be on changing the control group once. We have Inline graphic denoting the current control treatment at stage Inline graphic, where Inline graphic. At the beginning of the trial Inline graphic, we let Inline graphic denote the number of patients recruited prior to treatment Inline graphic becoming the control at stage Inline graphic for all Inline graphic.

The null hypotheses of interest are Inline graphic where Inline graphic are the mean responses on the Inline graphic experimental treatments and Inline graphic is the mean response of the current control group with Inline graphic being the mean response of the initial control. Each of the Inline graphic hypotheses is potentially tested at a series of analyses indexed by Inline graphic where Inline graphic is the stage treatment Inline graphic becomes the control. At analysis Inline graphic for treatment Inline graphic, to test Inline graphic, it is assumed that responses, Inline graphic and Inline graphic, from patients Inline graphic on treatments Inline graphic and Inline graphic, are observed, respectively. These hypotheses are tested at a given analysis Inline graphic using the test statistic

2.

If only the data post the change in the control is used, then the test statistics are

2.

These test statistics are used to test Inline graphic, so are used for the decision-making. Upper and lower stopping boundaries, Inline graphic and Inline graphic, are used as follows. If Inline graphic, then Inline graphic is rejected and the conclusion that treatment Inline graphic is superior to the current control is made. If Inline graphic, then treatment Inline graphic is dropped from all subsequent stages of the trial. If the Inline graphic statistics for all the treatments fall below their lower boundary, then the trial stops for futility. Treatment Inline graphic and the control continue to the next stage if Inline graphic. If only the post-change data are used, then the same rules apply with Inline graphic replaced by Inline graphic. If multiple treatments exceed the upper boundary at the same time point, then, following Magirr et al. (2012), the treatment with the largest test statistic becomes the new control.
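The stopping rules above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: the function name `apply_stopping_rules`, the treatment labels, and the boundary values are hypothetical, and the strictness of the inequalities (strictly above the upper boundary to reject, at or below the lower boundary to drop) is an assumption, as is the choice to let a treatment that crosses the upper boundary without having the largest statistic continue against the new control.

```python
def apply_stopping_rules(z, upper, lower):
    """Classify the active treatments at one analysis.

    z     : dict mapping a treatment label to its test statistic
            against the current control
    upper : upper stopping boundary at this analysis
    lower : lower stopping boundary at this analysis
    """
    # Treatments whose statistic exceeds the upper boundary are superior.
    crossed = [k for k, v in z.items() if v > upper]
    # If several cross at once, the largest statistic becomes the new control.
    new_control = max(crossed, key=lambda k: z[k]) if crossed else None
    # Treatments at or below the lower boundary are dropped (assumed rule).
    dropped = [k for k, v in z.items() if v <= lower]
    # Everything else stays in the trial for the next stage.
    continuing = [k for k in z if k != new_control and k not in dropped]
    return {
        "new_control": new_control,
        "dropped": dropped,
        "continuing": continuing,
        "stop_for_futility": len(dropped) == len(z),
    }


# Hypothetical statistics for three doses at the first interim analysis.
example = apply_stopping_rules(
    {"dose 1": 2.9, "dose 2": 3.1, "dose 3": 0.2}, upper=2.5, lower=0.5
)
print(example)
```

In this hypothetical call, two doses cross the upper boundary, so the one with the larger statistic becomes the new control and the other continues to be tested against it.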

These upper and lower stopping boundaries are group-sequential bounds that are pre-defined to achieve the type I error control targeted in the original trial. For example, they could aim to control the pairwise error rate (Choodari-Oskooei et al., 2020), the family-wise error rate (FWER) (Burnett et al., 2024; Magirr et al., 2012), or the false discovery rate (Robertson et al., 2022). Typically, when continuing to use the same pre-defined boundaries, there is no longer a guarantee that the type I error of interest is controlled after the change, because the original bounds were not designed for this. In Section 3, we study the effect on conditional type I error, which is the probability of making a type I error for a given active treatment after the control has changed.

3. CONDITIONAL POWER AND CONDITIONAL TYPE I ERROR

The conditional power is the probability that a treatment is found to be superior to the new control treatment. The conditional power is considered as one should only continue the trial if they believe that one of the remaining treatments may be superior to the new control but has not yet been declared the winner, and the conditional power is the probability of finding this winner.

Definition 1:

The conditional power for treatment Inline graphic, given Inline graphic becomes the new standard of care at stage Inline graphic, is the probability that treatment Inline graphic is found superior to treatment Inline graphic by the end of the trial, so by stage Inline graphic.

The conditional power can be split into 3 events. Event 1 (Inline graphic) is the event that treatment Inline graphic becomes the control at stage Inline graphic. Event 2 (Inline graphic) is that treatment Inline graphic is still in the trial when treatment Inline graphic becomes the control. Event 3 (Inline graphic) is that none of the other Inline graphic treatments become the control. The detailed formulations for Inline graphic, Inline graphic, Inline graphic are given in Web Appendix A. The conditional power is Inline graphic

To evaluate the conditional power, one can use the definition of conditional probability to remove the need to calculate any highly truncated multivariate normal distributions. The conditional power is

3.

where Inline graphic is the event that we Inline graphic within the rest of the trial. The formulations for Inline graphic, Inline graphic, Inline graphic, and Inline graphic are given in Web Appendix A. This can be calculated using multivariate normal distributions as discussed for the motivating example.

In the case of only considering the data post changing the control, the test statistics before the change are now independent of the test statistics post the change. Therefore, one only needs the event that we Inline graphic within the rest of the trial. For the case where only the post-change data are used, we define this as Inline graphic. The formulation for Inline graphic is given in Web Appendix A. The conditional power in this case is Inline graphic Once again this can be calculated using multivariate normal distributions as discussed for the motivating example.

It can be proven that, in many cases, there is never a benefit to retaining the information pre-change in control treatment when considering conditional power and using the predefined boundaries. The first theorem (Theorem 1) states that if there is only 1 stage left and the upper boundary is positive, then keeping the historic data is detrimental to the conditional power.

Theorem 1:

If a treatment Inline graphic becomes the control group treatment at stage Inline graphic (Inline graphic) and Inline graphic, then the conditional power for treatment Inline graphic, when retaining the data before the control changed, is less than or equal to the conditional power for treatment Inline graphic when not retaining the pre-change data.

The proof of Theorem 1 is given in Web Appendix B. Theorem 1 uses the fact that the active treatment of interest must have been no better than the new control group at the point of changing the control group. If this was not the case, then the active treatment of interest would be the new control. Therefore, by keeping the data before changing control, one is disadvantaging the active treatment as one retains the fact that the active treatment has so far been found worse than the new control treatment. This theorem can be further extended. The first extension is Theorem 2, which states that if there are multiple stages of the trial left and both the upper and lower boundaries are greater than or equal to 0, then retaining the pre-change data is detrimental to the conditional power. The second extension is Theorem 3, which states that if there are multiple stages of the trial left, the upper boundaries are positive, and there are no lower boundaries, then retaining the pre-change data is detrimental to the conditional power.

Theorem 2:

If a treatment Inline graphic becomes the new control group treatment at stage Inline graphic (Inline graphic) and Inline graphic and Inline graphic for all Inline graphic, then the conditional power for treatment Inline graphic when retaining the data before the control changed is less than or equal to the conditional power for treatment Inline graphic when not retaining the pre-change data.

Theorem 3:

If a treatment Inline graphic becomes the new control group treatment at stage Inline graphic (Inline graphic) and Inline graphic and there are no lower boundaries for all Inline graphic, then the conditional power for treatment Inline graphic, when retaining the data before the control changed, is less than or equal to the conditional power for treatment Inline graphic when not retaining the pre-change data.

The proofs of Theorems 2 and 3 are given in Web Appendix B. Furthermore, as shown in Web Appendix I, even if Inline graphic for some Inline graphic, retaining the old information is likely to be detrimental to the conditional power. However, in Web Appendix J, it is shown that there are cases when Inline graphic where keeping the old data can be beneficial for conditional power. It is worth noting that a lower conditional power does not necessarily imply generally worse properties of a design. For example, in the case of treatment Inline graphic being inferior to treatment Inline graphic, there is an increased chance of the wrong decision being made if the data pre-change is not retained. We define the probability of this event as the conditional type I error for a treatment.

Definition 2:

The conditional type I error for treatment Inline graphic, given Inline graphic becomes the new standard of care at stage Inline graphic, is the probability that treatment Inline graphic is found superior to treatment Inline graphic by the end of the trial, given that Inline graphic.

The conditional type I error is therefore the probability that a given treatment is found superior to a new control when in fact it is not superior. The conditional type I error is calculated using the same equations as the conditional power when Inline graphic, so Theorems 1, 2, and 3 also hold. Therefore, when these theorems are true, the conditional type I error is decreased when retaining the pre-change data. Section 5 shows how much both the conditional power and conditional type I error change between retaining the data or not for the motivating example.
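The single-remaining-stage setting of Theorem 1 can be explored with a small Monte Carlo sketch. It is an illustration under stated assumptions rather than the paper's calculation (which uses multivariate normal probabilities): the z-statistics are built from arm sample means with known variance and equal allocation, there are 2 active treatments and 2 stages, and the boundary values `upper=(2.5, 2.0)` and `lower_first=0.5` are hypothetical placeholders, not the triangular boundaries of the motivating example. The function name and arguments are likewise hypothetical. Setting `theta_k` above `theta_new_control` estimates a conditional power; setting it at or below estimates a conditional type I error.

```python
import math
import random


def simulate_conditional(theta_new_control, theta_k, n=43, sigma=1.0,
                         upper=(2.5, 2.0), lower_first=0.5,
                         n_sim=50_000, seed=1):
    """Estimate P(treatment k beats the new control at stage 2), given that
    the other active treatment became the control at stage 1 while
    treatment k was still in the trial."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # standard error of one stage-wise arm mean
    root2 = math.sqrt(2.0)
    n_change = hits_keep = hits_discard = 0
    for _ in range(n_sim):
        # Stage 1 sample means: original control, future control, treatment k.
        m0 = rng.gauss(0.0, se)
        mc = rng.gauss(theta_new_control, se)
        mk = rng.gauss(theta_k, se)
        zc = (mc - m0) / (se * root2)
        zk = (mk - m0) / (se * root2)
        # Condition on the change: the future control crosses the upper
        # boundary with the largest statistic and treatment k is not dropped.
        if not (zc > upper[0] and zc >= zk and zk > lower_first):
            continue
        n_change += 1
        # Stage 2 sample means for the two remaining arms.
        sc = rng.gauss(theta_new_control, se)
        sk = rng.gauss(theta_k, se)
        # Retained data: pooled means over both stages (2n patients per arm).
        z_keep = ((mk + sk) / 2 - (mc + sc) / 2) / se
        # Discarded data: stage 2 means only (n patients per arm).
        z_discard = (sk - sc) / (se * root2)
        hits_keep += z_keep > upper[1]
        hits_discard += z_discard > upper[1]
    if n_change == 0:
        return {"n_change": 0, "keep": float("nan"), "discard": float("nan")}
    return {"n_change": n_change,
            "keep": hits_keep / n_change,
            "discard": hits_discard / n_change}


power = simulate_conditional(theta_new_control=0.545, theta_k=0.8)
type1 = simulate_conditional(theta_new_control=0.545, theta_k=0.545)
print("conditional power:", power)
print("conditional type I error:", type1)
```

Because both versions are evaluated on the same simulated paths and treatment k starts no better than the new control, every path that rejects with the retained data also rejects with the post-change data only when the final upper boundary is non-negative, so the "keep" estimate cannot exceed the "discard" estimate in this sketch.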

4. OVERALL POWER

The overall power is the probability that during the trial, the active treatment with the greatest positive treatment effect is either taken forward as the new control or is declared superior compared to a new control, if the control has already changed.

Definition 3:

The overall power for the treatment Inline graphic which has the greatest treatment effect, Inline graphic equals the probability treatment Inline graphic is found to be the new control or subsequently treatment Inline graphic is found to be superior to a new control Inline graphic, where Inline graphic.

Therefore, overall power for the treatment Inline graphic which has the greatest treatment effect, Inline graphic can be calculated as

4.

Due to the multiple disjoint sets within Definition 3, the overall power can be split into multiple, easy to compute, parts. The first of these is the probability that at each interim Inline graphic, treatment Inline graphic becomes the control (Inline graphic) and this equals

4. (1)

The probability that another treatment becomes the new control and then this treatment is found to be better than the new control (Inline graphic) can be split into every possible Inline graphic and Inline graphic

4. (2)

Combining Equations (1) and (2), the overall power is Inline graphic When we consider only using the data post change in control, the probability that another treatment becomes the new control and then this treatment is found to be better than the new control (Inline graphic) becomes, Inline graphic This is due to the independence of event 4 with the rest of the events. Therefore the overall power is Inline graphic One can prove when it is guaranteed that there is no benefit to retaining the information pre-change in control treatment when considering overall power. Based on Theorems 2 and 3 for the conditional power, one can prove similar results for the overall power.

Theorem 4:

If Inline graphic and Inline graphic for all Inline graphic, then the overall power when retaining the data before the control changed is less than or equal to the overall power when not retaining the pre-change data.

Theorem 5:

If Inline graphic and there are no lower boundaries for all Inline graphic, then the overall power when retaining the data before the control changed is less than or equal to the overall power when not retaining the pre-change data.

The proofs of Theorems 4 and 5 are given in Web Appendix C. Furthermore, as is shown in Web Appendix I, even if Inline graphic for any Inline graphic, there are cases in which retaining information pre-change in the control group reduces overall power. This is shown in the example in Web Appendix I: the difference in conditional power between keeping and discarding the pre-change data is negative, and therefore so is the difference in overall power.
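The overall power of Definition 3 can also be approximated by simulation in a simplified setting. The sketch below assumes 2 active treatments, 2 stages, z-statistics built from arm sample means with known variance, and equal allocation; the function name, arguments, and the boundary values are hypothetical and are not the design of the motivating example. A success is counted when the treatment with the greatest effect becomes the control at either analysis, or is found superior to the other treatment after that treatment has become the control.

```python
import math
import random


def simulate_overall_power(theta_other, theta_best, n=43, sigma=1.0,
                           upper=(2.5, 2.0), lower_first=0.5,
                           n_sim=50_000, seed=1):
    """Estimate overall power when pre-change data are kept or discarded."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # standard error of one stage-wise arm mean
    root2 = math.sqrt(2.0)
    effects = (0.0, theta_other, theta_best)  # control, other, best
    wins_keep = wins_discard = 0
    for _ in range(n_sim):
        m = [rng.gauss(mu, se) for mu in effects]  # stage 1 means
        s = [rng.gauss(mu, se) for mu in effects]  # stage 2 means
        z_other = (m[1] - m[0]) / (se * root2)
        z_best = (m[2] - m[0]) / (se * root2)
        if max(z_other, z_best) > upper[0]:
            # The control changes at stage 1.
            if z_best >= z_other:
                # The best treatment itself becomes the new control.
                wins_keep += 1
                wins_discard += 1
            elif z_best > lower_first:
                # The other treatment is the new control; test best against it.
                z_keep = ((m[2] + s[2]) / 2 - (m[1] + s[1]) / 2) / se
                z_discard = (s[2] - s[1]) / (se * root2)
                wins_keep += z_keep > upper[1]
                wins_discard += z_discard > upper[1]
            continue
        if z_best <= lower_first:
            continue  # best treatment dropped for futility at stage 1
        # No change at stage 1: final cumulative test against the original control.
        pooled_control = (m[0] + s[0]) / 2
        zf_best = ((m[2] + s[2]) / 2 - pooled_control) / se
        if z_other > lower_first:
            zf_other = ((m[1] + s[1]) / 2 - pooled_control) / se
        else:
            zf_other = -math.inf  # other treatment was dropped at stage 1
        if zf_best > upper[1] and zf_best >= zf_other:
            wins_keep += 1
            wins_discard += 1
    return {"keep": wins_keep / n_sim, "discard": wins_discard / n_sim}


print(simulate_overall_power(theta_other=0.3, theta_best=0.545))
```

The two approaches differ only on paths where the other treatment becomes the control while the best treatment is still in the trial, which is why the difference in overall power is driven by how often the wrong treatment is taken forward first.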

5. MOTIVATING TRIAL EXAMPLE

We consider the motivating trial of TAILoR (Pushpakom et al., 2015). The TAILoR trial was a 4-arm trial that studied the effect of 3 different doses of telmisartan, compared to control, on insulin resistance in HIV-positive individuals. The study had 1 interim analysis. We use the operating characteristics from this study to see the effects on overall and conditional power if the control were changed mid-trial once a treatment was found superior. In the original design, the FWER (Pushpakom et al., 2015) was controlled at 5% one sided for a normal continuous endpoint and there was a planned 90% power. The trial was planned to have equal allocation across stages. In addition, the clinically relevant effect, Inline graphic, was set to Inline graphic and uninteresting effect, Inline graphic, set to 0.178 with variance Inline graphic.

Triangular stopping boundaries will be used (Whitehead, 1997) as recommended in Wason and Jaki (2012). The stopping boundaries will be calculated using the approach given in Magirr et al. (2012) to control FWER for the design before the change in control. The calculations of the power will be done using Greenstreet et al. (2025) in order to control the pairwise power for each treatment. This is chosen as it is similar to that used in the original trial but is designed for trials that continue after a treatment is taken forward. The calculations were carried out using R (R Core Team, 2021). The multivariate normal probabilities were calculated using the package mvtnorm (Genz et al., 2021), the upper and lower boundaries were found using MAMS (Jaki et al., 2019), and the code was parallelized using the packages doParallel (Daniel et al., 2022a) and foreach (Daniel et al., 2022b). The code is available in the Supplementary Materials.

5.1. Boundaries and sample size

Using the approach by Magirr et al. (2012), the triangular upper and lower stopping boundaries are found to be

5.1.

Using Greenstreet et al. (2025), the maximum sample size is 344 based on 43 patients per arm per stage to ensure pairwise power of 90%.
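As a quick check of the arithmetic, the maximum sample size follows from the per-arm, per-stage sample size, the number of arms, and the number of stages. The variable names below are illustrative only; the 4 arms and 2 stages (1 interim analysis plus the final analysis) are taken from the description of the motivating trial.

```python
patients_per_arm_per_stage = 43
n_arms = 4    # 3 active doses plus the control arm
n_stages = 2  # 1 interim analysis plus the final analysis

# Maximum sample size if no arm is dropped and the trial runs to the end.
max_sample_size = patients_per_arm_per_stage * n_arms * n_stages
print(max_sample_size)  # 344
```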

5.2. Conditional power and conditional type I error

There is only 1 place in the trial where either the conditional power or conditional type I error is not 0 as there is only 1 interim analysis in the study. This happens when a treatment becomes the new control at the first stage. The conditional power for treatment Inline graphic when treatment Inline graphic becomes the new control at stage 1 is

5.2. (3)

When we only retain the new information, the conditional power is Inline graphic In Web Appendix E, the formulations used to calculate Inline graphic and Equation (3) are given. Both equations are also used to calculate the conditional type I error.

5.3. Overall power

The overall power for treatment Inline graphic when the old information is retained is Inline graphic When only new data is used, the overall power is Inline graphic The formulations for Inline graphic, Inline graphic, and Inline graphic are given in Web Appendix F.

5.4. Results

In Figure 1, the difference between conditional power when retaining all the old data and not retaining the data can be seen. The conditional power for treatment 2, when treatment 1 is the new control after the first stage, is studied. The y-axis gives the treatment effect of treatment 2 compared to the original control and the x-axis gives the effect of treatment 1 compared to the original control. The color, as given on the scale to the right of the figure, shows the difference in conditional power between retaining the pre-change information and not. The effect of different values of Inline graphic is very small, as it has very little effect on the probability that treatment 2 is found superior to treatment 1 in the final stage or vice versa. Therefore, we will focus on the results for when Inline graphic. However, in Web Appendix G, the effect of Inline graphic having both a negative and positive uninteresting treatment effect is shown.

FIGURE 1. The difference in conditional power between keeping the data pre-change and not, for treatment 2 given that treatment 1 has gone forward at the first stage.

The difference between conditional power when retaining all the old data and not retaining the data can be seen in Figure 1 when Inline graphic. As can be seen in Figure 1, the loss in conditional power is maximized when Inline graphic is around 0.5, where it can be greater than 50%. However, as this difference becomes much more extreme, the loss becomes close to 0, because at this point either approach has almost a 100% chance of finding treatment 2 superior to treatment 1. In Web Appendix H, we study the effect on conditional power of different possible values of Inline graphic and Inline graphic for one of these points, Inline graphic and Inline graphic. Here, we can ignore the value of Inline graphic as this does not influence the probabilities as shown in the proof to Theorem 1 in Web Appendix B. It is shown here that even in this case where there is on average very little benefit in only retaining the new information, there are potential values of Inline graphic and Inline graphic where there is large benefit in only using the new data. However, the probability of these Inline graphic values happening is very small for the given Inline graphic and Inline graphic.

Figure 2 gives the difference in conditional type I error when comparing keeping or disregarding the data pre-change in control. The top left of the figure is grayed out for all the values of Inline graphic as no conditional type I error would be made. The inflation shown in Figure 2 is smaller than the increase seen for the conditional power with the maximum inflation in conditional type I error of 1.40% which happens when Inline graphic. This is because, for both approaches, the probability that treatment 2 is found superior to treatment 1 when in fact it is not, is small. However, this clearly demonstrates that disregarding the data pre-change can increase the conditional type I error.

FIGURE 2. The difference in conditional type I error between keeping the data pre-change and not, for treatment 2 given that treatment 1 has gone forward at the first stage. The gray area defines the section where Inline graphic so no conditional type I error can be made.

In Figure 3, the difference between overall power when retaining all the old data and not retaining the data can be seen. As it was for conditional power, the effect of different values of Inline graphic is very small, so we will focus on the results for when Inline graphic.

FIGURE 3. The difference in overall power between keeping the data pre-change and not.

The maximum difference in overall power is 1.7%. When calculating the overall power, most of the time, the correct treatment will be taken forward compared to the original control instead of one of the other treatments, so there is not a huge difference between retaining the pre-change data or not. This effect can be seen in Figure 4. This figure gives the probability of the treatment which does not have the greatest treatment effect becoming the control at the first stage.

FIGURE 4. The probability of the treatment which does not have the greatest treatment effect becoming the control at the first stage.

6. DISCUSSION

In this paper, we have studied the effect of keeping or discarding the pre-change data after a change in control in a specific type of platform trial. The platform trial type of focus has been frequentist MAMS studies where all the arms begin at once. This work has shown that, for this trial design, one is likely to be better off not retaining the data, when changing the control treatment, with respect to overall and conditional power. The reason for this result is that the active treatment of interest must have been found no better than the new control group at the time of changing the control. If this was not the case, then the active treatment of interest would be the new control. By retaining the data from before the change of control, one is disadvantaging the active treatment as it has so far been found worse than the new control treatment. It would likely be beneficial to start a new trial instead, under the assumption that one is able to use the established infrastructure. There are likely to be other benefits in starting a new trial, including being able to adjust the research question, the treatments of interest, the target population, and the treatment doses. However, a new trial may involve more administrative and logistical work compared to continuing the trial, even if one can keep the same established infrastructure. Additionally, starting a new trial can increase the conditional type I error compared to retaining all the data, therefore increasing the chance of an inferior treatment being incorrectly declared superior. Other performance metrics of a trial that are not considered here will also be affected by the choice to retain or not, which is an area for further research. It is worth noting that one may be required to start a new trial instead of continuing the platform trial, in which any further treatments are tested against the new standard of care and the original standard of care, which is an area for further research.

This work has focused on the situation when all treatments start recruitment at the same time; however, one may be interested in the setting of adding additional arms (Burnett et al., 2024; Greenstreet et al., 2024, 2025). The methodology section in this paper is extended in Web Appendix L to the setting of finding the overall and conditional power when additional arms are added in a pre-planned manner. Concurrent data means only participants recruited to the current control arm at the same time as the active arm of interest are used in the comparisons. In the setting discussed in Web Appendix L, there is now unequal information per treatment when changing the control; therefore, it is not guaranteed that retaining the information pre-change will be detrimental. One should calculate the conditional and overall power for their given example to establish which approach is favorable. This is shown in 2 examples in Web Appendix L. In the first case, there is no benefit in keeping the data, but for the second case, there can be benefit in keeping the data, with respect to the power of the study.

In this paper, it has been assumed that the primary outcome on each patient is independent and normally distributed with known variance Inline graphic. However, this work can be applied more generally to multiple different endpoints (Stallard and Todd, 2003; Jaki and Magirr, 2013) assuming asymptotic normality of the test statistics. In the Web Appendix D, the methodology for this is presented along with the additional assumptions required for the theorems in this work, when generalizing the results.

The findings of this work should also be extended to alternative platform trial designs, including using a Bayesian framework, using other adaptive features, and using other modes of pooling information between pre- and post-change in control, or changing to non-inferiority boundaries post the change in control. We believe that the results presented here will generalize to several other settings, as the active treatments will often still start with a disadvantage over the new control if the data is retained; however, further work is needed to study this.

Throughout this work, we considered the case of the original pre-defined stopping rules and sample size being retained. This is a realistic setting, as the stopping rules were deemed the best for the objectives of the trial at the beginning of the study and hence continue being used in the light of unchanged objectives (ie, showing improvement over standard of care). Additionally, funding limits may restrict the ability to change the sample size of a study. It is, however, worth noting that the results presented in Sections 3 and 4 still hold if one uses different boundary shapes and sample sizes post changing the control, as long as these are the same regardless of whether the data pre-change is retained or not.

Furthermore, in this work, we have looked at an ideal example where the trial has equal allocation as planned. However, in reality, the probability of achieving exactly equal allocation can be very slim, depending on the treatment allocation method. Therefore, we have also considered the effect of using simple random allocation. In this case, the criteria of Theorem 2 or Theorem 3 are no longer met. We have investigated this for 3 cases. We studied the number of times out of 100 000 000 simulations that keeping the data has resulted in the treatment of interest being taken forward when this would not have been the case using only the new data. This probability is still very small (0.0006% in the example studied) relative to the probability that discarding the old data has resulted in the treatment of interest being taken forward when this would not have been the case using all the data. This can be seen in Web Appendix K.
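To give a sense of how unlikely exactly equal allocation is under simple random allocation, the following sketch computes that probability exactly. It assumes each participant is independently assigned to one of the arms with equal probability; the function name and the sample sizes are illustrative and are not those of the examples in this paper.

```python
from fractions import Fraction
from math import factorial


def prob_equal_allocation(n_per_arm, n_arms):
    """Exact probability that simple random allocation gives every arm n_per_arm patients.

    Assumes n_per_arm * n_arms participants, each assigned independently and
    with equal probability to one of n_arms arms (multinomial model).
    """
    total = n_per_arm * n_arms
    # Multinomial coefficient: number of assignments with equal arm sizes.
    ways = factorial(total) // factorial(n_per_arm) ** n_arms
    return Fraction(ways, n_arms ** total)


print(prob_equal_allocation(2, 2))          # 3/8
print(float(prob_equal_allocation(50, 2)))  # two arms, 100 participants in total
print(float(prob_equal_allocation(50, 3)))  # three arms, 150 participants in total
```

With 4 participants and 2 arms, 6 of the 16 equally likely assignments are balanced, giving 3/8. The probability falls as the sample size or the number of arms grows, so exactly equal allocation is the exception rather than the rule under this allocation method.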

Overall, this paper has highlighted the importance of considering what to do when the control in a frequentist MAMS trial is changed to a treatment that has been found superior to the original control. In particular, one should consider whether to continue the current trial or to stop and start a new trial with the new control using the same trial infrastructure.

Supplementary Material

ujaf073_Supplemental_Files

Web Appendices, data and code referenced in Sections 2, 3, 4, 5, 6 are available with this paper at the Biometrics website on Oxford Academic.

ACKNOWLEDGMENTS

We would like to thank the Editor, Associate Editor, and the two reviewers for their useful comments and suggestions. For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising.

Contributor Information

Peter Greenstreet, Ottawa Methods Centre, Ottawa Hospital Research Institute, Ottawa, ON K1H 8L6, Canada; Department of Mathematics and Statistics, Lancaster University, Lancaster, LA1 4YF, United Kingdom.

Thomas Jaki, MRC Biostatistics Unit, University of Cambridge, Cambridge, CB2 0RS, United Kingdom; Department of Machine Learning and Data Science, University of Regensburg, Regensburg, 93053, Germany.

Alun Bedding, Alun Bedding Coaching & Consulting Ltd, Bury St Edmunds, IP28 8PU, United Kingdom.

Pavel Mozgunov, MRC Biostatistics Unit, University of Cambridge, Cambridge, CB2 0RS, United Kingdom.

FUNDING

This report is independent research supported by the National Institute for Health Research (NIHR300576). The views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research, or the Department of Health and Social Care (DHSC). T.J. and P.M. also received funding from UK Medical Research Council (MC_UU_00002/14, MC_UU_00002/19, MC_UU_00040/03). This paper is based on work completed while P.G. was part of the EPSRC funded STOR-i centre for doctoral training (EP/S022252/1). P.G. is supported by a CANSTAT trainee award funded by CIHR grant #262556.

CONFLICT OF INTEREST

None declared.

DATA AVAILABILITY

The simulated data that support the findings in this paper are available from the code provided in the Supplementary Materials.

REFERENCES

  1. Burnett T., König F., Jaki T. (2024). Adding experimental treatment arms to multi-arm multi-stage platform trials in progress. Statistics in Medicine, 43, 3447–3462.
  2. Choodari-Oskooei B., Bratton D. J., Gannon M. R., Meade A. M., Sydes M. R., Parmar M. K. (2020). Adding new experimental arms to randomised clinical trials: Impact on error rates. Clinical Trials (London, England), 17, 273–284.
  3. Cohen D. R., Todd S., Gregory W. M., Brown J. M. (2015). Adding a treatment arm to an ongoing clinical trial: a review of methodology and practice. Trials, 16, 1–9.
  4. Daniel F., Weston S., Tenenbaum D., Microsoft Corporation (2022a). doParallel: Foreach Parallel Adaptor for the ‘parallel’ Package. https://cran.r-project.org/web/packages/doParallel/doParallel.pdf. [Accessed 09 April 2025].
  5. Daniel F., Ooi H., Microsoft R. C., Weston S. (2022b). foreach: Provides Foreach Looping Construct. https://cran.r-project.org/web/packages/foreach/foreach.pdf. [Accessed 09 April 2025].
  6. Genz A., Bretz F., Miwa T., Mi X., Leisch F., Scheipl F. et al. (2021). mvtnorm: Multivariate Normal and t Distributions. R package version 1.1-2. https://cran.r-project.org/web/packages/mvtnorm/mvtnorm.pdf. [Accessed 09 April 2025].
  7. Greenstreet P., Jaki T., Bedding A., Harbron C., Mozgunov P. (2024). A multi-arm multi-stage platform design that allows preplanned addition of arms while still controlling the family-wise error. Statistics in Medicine, 43, 3613–3632.
  8. Greenstreet P., Jaki T., Bedding A., Mozgunov P. (2025). A preplanned multi-stage platform trial for discovering multiple superior treatments with control of FWER and power. Biometrical Journal, 67, e70025.
  9. Horby P., Lim W. S., Emberson J. R., Mafham M., Bell J. L., Linsell L. et al. (2021). Dexamethasone in hospitalized patients with Covid-19. The New England Journal of Medicine, 384, 693–704.
  10. Jaki T., Magirr D. (2013). Considerations on covariates and endpoints in multi-arm multi-stage clinical trials selecting all promising treatments. Statistics in Medicine, 32, 1150.
  11. Jaki T. F., Pallmann P. S., Magirr D. (2019). The R package MAMS for designing multi-arm multi-stage clinical trials. Journal of Statistical Software, 88, 1–25.
  12. Magirr D., Jaki T., Whitehead J. (2012). A generalized Dunnett test for multi-arm multi-stage clinical studies with treatment selection. Biometrika, 99, 494–501.
  13. Mullard A. (2018). How much do phase III trials cost? Nature Reviews Drug Discovery, 17, 777.
  14. Pallmann P., Bedding A. W., Choodari-Oskooei B., Dimairo M., Flight L., Hampson L. V. et al. (2018). Adaptive designs in clinical trials: why use them, and how to run and report them. BMC Medicine, 16, 1–15.
  15. Pocock S. J. (1977). Group sequential methods in the design and analysis of clinical trials. Biometrika, 64, 191–199.
  16. Pushpakom S. P., Taylor C., Kolamunnage-Dona R., Spowart C., Vora J., García-Fiñana M. et al. (2015). Telmisartan and Insulin Resistance in HIV (TAILoR): protocol for a dose-ranging phase II randomised open-labelled trial of telmisartan as a strategy for the reduction of insulin resistance in HIV-positive individuals on combination antiretroviral therapy. BMJ Open, 5, e009566.
  17. R Core Team (2021). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.
  18. Robertson D. S., Wason J., König F., Posch M., Jaki T. (2022). Online error control for platform trials. arXiv, arXiv:2202.03838, preprint.
  19. Singh B., Saxena A. (2010). Surrogate markers of insulin resistance: A review. World Journal of Diabetes, 1, 36.
  20. Stallard N., Hampson L., Benda N., Brannath W., Burnett T., Friede T. et al. (2020). Efficient adaptive designs for clinical trials of interventions for COVID-19. Statistics in Biopharmaceutical Research, 12, 483–497.
  21. Stallard N., Todd S. (2003). Sequential designs for phase III clinical trials incorporating treatment selection. Statistics in Medicine, 22, 689–703.
  22. Sydes M. R., Parmar M. K. B., Mason M. D., Clarke N. W., Amos C., Anderson J. et al. (2012). Flexible trial design in practice—stopping arms for lack-of-benefit and adding research arms mid-trial in STAMPEDE: a multi-arm multi-stage randomized controlled trial. Current Controlled Trials in Cardiovascular Medicine, 13, 168–168.
  23. Todd S., Whitehead A., Stallard N., Whitehead J. (2001). Interim analyses and sequential designs in phase III studies. British Journal of Clinical Pharmacology, 51, 394–399.
  24. Wason J. M. S., Jaki T. (2012). Optimal design of multi-arm multi-stage trials. Statistics in Medicine, 31, 4269–4279.
  25. Whitehead J. (1997). The design and analysis of sequential clinical trials. Biometrics, 53, 1564.

Articles from Biometrics are provided here courtesy of Oxford University Press
