Multi-Arm Multi-Stage Survival Trial Design with Arm-Specific Stopping Rule

Jianrong Wu; Yimei Li; Liang Zhu; Tushar Panti

doi:10.1080/10543406.2024.2398036

Author manuscript; available in PMC: 2026 Feb 27.

Published in final edited form as: J Biopharm Stat. 2024 Sep 16:1–12. doi: 10.1080/10543406.2024.2398036

PMCID: PMC11911941 NIHMSID: NIHMS2023588 PMID: 39282887

Abstract

Traditional two-arm randomized trial designs have played a pivotal role in establishing the efficacy of medical interventions. However, their efficiency is often compromised when confronted with multiple experimental treatments or limited resources. In response to these challenges, multi-arm multi-stage designs have emerged, enabling the simultaneous evaluation of multiple treatments within a single trial. In such an approach, if an arm meets the efficacy success criterion at an interim stage, the whole trial stops and the arm is selected for further study. However, when multiple treatment arms are active, stopping the trial at the moment one arm achieves success diminishes the probability of selecting the best arm. To address this issue, we have developed a group sequential multi-arm multi-stage survival trial design with an arm-specific stopping rule. The proposed method controls the familywise type I error in a strong sense and selects the most promising treatment arm with a high probability.

Keywords: Multi-arm multi-stage, group sequential design, log-rank test, sequential conditional probability ratio test, time-to-event

1. Introduction

Traditional clinical trial designs, exemplified by the two-arm parallel design, have played a pivotal role in establishing the efficacy of medical interventions. However, their efficiency is often compromised when confronted with multiple experimental treatments or limited resources. In response to these challenges, the multi-arm multi-stage (MAMS) designs have emerged, enabling the simultaneous evaluation of multiple treatments within a single trial. This innovation significantly reduces the overall time, cost, and patient burden required to bring new therapies to market.

Several group sequential MAMS designs have been proposed, including those of Stallard and Friede (2008), Magirr, Jaki and Whitehead (2012; referred to as MJW hereafter), Wason and Jaki (2012), Magirr et al. (2014), and Wu et al. (2023). Notably, Jaki and Magirr (2013) and Wu and Li (2023) introduced group sequential MAMS trial designs with time-to-event endpoints. In their approach, all treatment arms commence simultaneously. If an arm meets predefined futility criteria at an interim stage, it is discontinued from the trial, and the future subjects allocated to that arm are not reassigned to the remaining arms. Conversely, if an arm meets efficacy success criteria at an interim stage, the entire trial is halted, and the arm is selected and declared as the success arm.

However, this design is inadequate when multiple arms could meet the efficacy success criteria if given the opportunity to continue. Stopping the trial at the moment one arm achieves success diminishes the probability of selecting the best arm. To address this issue, we propose a new group sequential MAMS survival trial design that discontinues an arm once it demonstrates efficacy at an interim analysis but continues the evaluation of all remaining arms; we refer to this as the arm-specific stopping rule design. In such MAMS trial designs, arm selection is made after the final analysis. Either the most effective arm or multiple arms may be selected. Arm selection is therefore more flexible and can take other clinical factors into account, e.g., treatment cost or toxicity information.

Our proposed method is based on the sequential conditional probability ratio test (SCPRT) procedure (Xiong, 1995) and employs an event-driven approach. The SCPRT procedure offers analytical solutions for the futility and efficacy boundaries and is adaptable to trials with any number of stages and arms. Importantly, it controls the familywise error rate (FWER) through the Dunnett correction under a global null hypothesis. Furthermore, because the approach is event-driven, the proposed group sequential MAMS designs are robust to variations in the accrual rate, the censoring pattern, and misspecification of the survival distributions, all of which are difficult to specify accurately at the design stage.

2. Log-rank test

We consider a $(K + 1)$ -arm trial that compares $K$ treatment arms to a common control arm and label the arms $k = 0,1, \dots, K$ , where 0 represents the control arm. Let $λ_{k} (x)$ and $S_{k} (x)$ be the hazard and cumulative survival functions of the $k$ th arm, respectively. Assume proportional hazards models between each treatment arm and the control, $S_{k} (x) = {[S_{0} (x)]}^{δ_{k}}$ or $λ_{k} (x) = δ_{k} λ_{0} (x), k = 1, \dots, K$ , where $δ_{k}$ is the hazard ratio of the $k$ th treatment arm to the control arm. Assume that a total of $n$ patients are randomized to $K + 1$ arms with the allocation ratio $ω_{k} = n_{k} / n$ for the $k$ th arm ( $k = 0,1, \dots, K$ ), where $n_{k}$ is the sample size for $k$ th arm and $n = \sum_{k = 0}^{K} n_{k}$ is the total sample size. Let $Z_{k} (k = 1, \dots, K)$ be the standardized two-sample log-rank test for comparing the $k$ th treatment arm to the control arm. It has been shown (Schaid et al., 1990; or Appendix 1 of supplemental material) that the asymptotic joint distribution of $Z = (Z_{1}, \dots, Z_{K})$ is a multivariate normal distribution with mean $μ = (μ_{1}, \dots, μ_{K})$ , where

$$μ_{k} = - \sqrt{d} \, \log δ_{k} \, \frac{(ω_{0} ω_{k})^{1 / 2}}{(ω_{0} + ω_{k})^{1 / 2}}, \quad k = 1, \dots, K \tag{1}$$

and variance-covariance matrix $Σ = (σ_{k k^{'}})$ , where

$$σ_{k k^{'}} = \begin{cases} 1, & k = k^{'} \\ \dfrac{(ω_{k} ω_{k^{'}})^{1 / 2}}{(ω_{0} + ω_{k})^{1 / 2} (ω_{0} + ω_{k^{'}})^{1 / 2}}, & k \neq k^{'} \end{cases} \tag{2}$$

with $d = n p$ as the total number of events and $p$ as the overall failure probability given by equations (3) and (4) below. Assume that patients are recruited over an accrual duration $t_{a}$ with accrual distribution $A (\cdot)$ and followed for an additional time $t_{f}$, and that the loss to follow-up distribution is $G_{L} (\cdot)$. Then the failure probability $p_{k}$ for the $k$th arm ($k = 0,1, \dots, K$) can be calculated as follows:

$$p_{k} = \int_{0}^{τ} A (τ - x) S_{k} (x) λ_{k} (x) G_{L} (x) \, d x, \tag{3}$$

where $τ = t_{a} + t_{f}$ is the study duration and $λ_{k} (\cdot)$ is the hazard function for the $k$th arm. The overall failure probability $p$ is then given by

$$p = ω_{0} p_{0} + ω_{1} p_{1} + \dots + ω_{K} p_{K} . \tag{4}$$
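As an illustration of equations (3) and (4), the following R sketch evaluates the failure probabilities numerically. The assumptions are made only for this example: exponential survival, uniform accrual over $[0, t_{a}]$, and no loss to follow-up ($G_{L} \equiv 1$). The function name `event_prob` and the use of `integrate` are choices of this sketch, not the authors' supplemental code; the inputs ($m_{0} = 7.3$, $t_{a} = 24$, $t_{f} = 6$, $δ = 1 / 1.4$, $d = 383$) are taken from the first row of Table 3.

```r
# Failure probability for one arm from equation (3).
# Assumptions (illustrative): exponential survival with hazard `lambda`,
# uniform accrual on [0, ta], no loss to follow-up (G_L = 1).
event_prob <- function(lambda, ta, tf) {
  tau <- ta + tf
  accrual_cdf <- function(u) pmin(1, pmax(0, u / ta))   # A(u) for uniform accrual
  integrand <- function(x) accrual_cdf(tau - x) * exp(-lambda * x) * lambda
  integrate(integrand, lower = 0, upper = tau)$value
}

m0 <- 7.3; ta <- 24; tf <- 6
lambda0 <- log(2) / m0          # control hazard from the median survival time
delta   <- 1 / 1.4              # hazard ratio of each treatment arm

p0 <- event_prob(lambda0, ta, tf)           # control arm
p1 <- event_prob(delta * lambda0, ta, tf)   # each treatment arm

omega <- rep(1 / 3, 3)                      # equal allocation, K = 2
p <- sum(omega * c(p0, p1, p1))             # equation (4)

n_approx <- 383 / p                         # n = d / p for d = 383 events
print(c(p0 = p0, p1 = p1, p = p, n = n_approx))
```

Under these assumptions the implied total sample size comes out close to the $n = 542$ reported in Table 3; the exact value depends on how $d$ and $n$ are rounded.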

In this article, we assume that the sample size allocation is equal across treatment arms and that the allocation ratio between each treatment arm and the control arm is $r$ ($n_{k} = r n_{0}$). It is then easy to verify from equations (1) and (2) that, under the global null hypothesis $H_{G} : δ_{1} = \dots = δ_{K} = 1$, the joint distribution of $Z = (Z_{1}, \dots, Z_{K})^{'}$ is a $K$-dimensional multivariate normal $M N (0, Σ)$, where the variance-covariance matrix is $Σ = (ρ_{k k^{'}})$ with

$$ρ_{k k^{'}} = \begin{cases} 1, & k = k^{'} \\ \dfrac{r}{1 + r}, & k \neq k^{'} \end{cases} \tag{5}$$

3. Familywise error rate

For a multi-arm trial, to control the overall type I error due to multiple comparisons, it is common to consider the familywise error rate (FWER), which is the probability of rejecting at least one true null hypothesis across a set of null hypotheses $H_{0}^{(k)} : δ_{k} \geq 1, k = 1, \dots, K$. Strong control of the FWER at level $α$ means that the FWER does not exceed $α$ for all possible values $\{δ_{k}, k = 1, \dots, K\}$ in the region of the null hypothesis. Magirr et al. (2012) have shown that, for the simultaneous stopping rule (under which stopping for efficacy terminates the trial), the FWER is maximized under the global null $H_{G} : δ_{1} = \dots = δ_{K} = 1$. Therefore, controlling the FWER under the global null hypothesis provides strong control of the FWER. A similar result applies to the arm-specific stopping rule design (Halabi and Michiels, 2019). In this paper, we consider controlling the FWER under the global null hypothesis at level $α$. We can use the Dunnett correction (Dunnett, 1955) by choosing $c_{α}$ satisfying

$$\begin{aligned} α &= P (\cup_{k = 1}^{K} (Z_{k} > c_{α}) \mid H_{G}) \\ &= 1 - P (Z_{1} < c_{α}, \dots, Z_{K} < c_{α} \mid H_{G}) \\ &= 1 - \int_{- \infty}^{c_{α}} \dots \int_{- \infty}^{c_{α}} ϕ (x_{1}, \dots, x_{K}; Σ) \, d x_{1} \dots d x_{K}, \end{aligned} \tag{6}$$

where $ϕ (x_{1}, \dots, x_{K}; Σ)$ is a multivariate normal density function with mean 0 and variance-covariance matrix $Σ$ given by equation (5). Using numeric integration, such as in the method of Genz and Bretz (2009) which is implemented in R, critical values $c_{α}$ can be numerically solved from the above equation (6).
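A minimal base-R sketch of solving equation (6) is given below. It is not the authors' ‘Dunnett’ function: instead of general multivariate integration, it exploits the equicorrelated structure of equation (5), under which $Z_{k} = \sqrt{ρ} \, W + \sqrt{1 - ρ} \, E_{k}$ with independent standard normal $W, E_{1}, \dots, E_{K}$ and $ρ = r / (1 + r)$, so the $K$-dimensional integral reduces to a one-dimensional one. The function name `dunnett_crit` and the search interval are assumptions of this sketch.

```r
# One-sided Dunnett-type critical value under the global null, assuming the
# equicorrelated covariance of equation (5) with rho = r / (1 + r).
dunnett_crit <- function(alpha, K, r = 1) {
  rho <- r / (1 + r)
  # P(Z_1 < crit, ..., Z_K < crit) via the one-factor representation
  joint_cdf <- function(crit) {
    integrand <- function(w) {
      dnorm(w) * pnorm((crit - sqrt(rho) * w) / sqrt(1 - rho))^K
    }
    integrate(integrand, lower = -Inf, upper = Inf, rel.tol = 1e-8)$value
  }
  # Solve 1 - P(all Z_k < crit) = alpha for crit
  uniroot(function(crit) 1 - joint_cdf(crit) - alpha,
          interval = c(0, 6), tol = 1e-8)$root
}

crit_K2 <- dunnett_crit(alpha = 0.05, K = 2)
crit_K3 <- dunnett_crit(alpha = 0.05, K = 3)
crit_K4 <- dunnett_crit(alpha = 0.05, K = 4)
print(round(c(K2 = crit_K2, K3 = crit_K3, K4 = crit_K4), 3))
```

For $r = 1$ and $α = 0.05$ these values can be compared with the final-stage critical values $c_{α}$ listed in Table 2.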

4. Power for multi-arm trial

The power of a multi-arm trial is more complex and has several definitions. Let the hazard ratio $δ (< 1)$ be a minimal clinically relevant treatment effect that we want to detect. We define an alternative hypothesis for a particular treatment arm $k$ as $H_{a}^{(k)} : δ_{k} = δ$ and the global alternative hypothesis for the $K$ -treatment arm trial as $H_{a} = \cap_{k = 1}^{K} H_{a}^{(k)}$ , that is $H_{a} : δ_{1} = \dots = δ_{K} = δ$ . In this paper, we consider two types of powers for multi-arm trial designs as outlined in the following subsections.

4.1. Disjunctive power

The disjunctive power (Wason and Jaki, 2012) is the probability of rejecting at least one null hypothesis under the global alternative hypothesis. Let $n$ be the total sample size and $c_{α}$ be the critical value to reject the null hypothesis based on Dunnett correction. Then the disjunctive power $1 - β$ is given as follows:

$$\begin{aligned} 1 - β &= P (\cup_{k = 1}^{K} (Z_{k} > c_{α}) \mid H_{a}) \\ &= 1 - \int_{- \infty}^{c_{α}} \dots \int_{- \infty}^{c_{α}} ϕ (x_{1}, \dots, x_{K}; μ, Σ) \, d x_{1} \dots d x_{K}, \end{aligned} \tag{7}$$

where $ϕ (x_{1}, \dots, x_{K}; μ, Σ)$ is a multivariate normal density function with mean $μ$ and variance-covariance matrix $Σ$ under the global alternative hypothesis $H_{a} : δ_{1} = \dots = δ_{K} = δ$ , where $μ = (μ_{1}, \dots, μ_{K})$ and $Σ$ are given by equations (1) and (5).

4.2. Power under least favorable configuration

We also consider the alternative hypothesis $H_{L F C} : δ_{1} = δ^{(1)}, δ_{2} = δ^{(0)}, \dots, δ_{K} = δ^{(0)}$, where the hazard ratio $δ^{(1)}$ represents a clinically relevant improvement and $δ^{(0)} (> δ^{(1)})$ is an effect size such that if $δ_{k} \geq δ^{(0)}$, then we would prefer not to proceed further in investigating treatment $k$. This is known as the least favorable configuration (LFC) (Dunnett, 1984). Let $n$ be the total sample size and $c_{α}$ be the critical value to reject the null hypothesis based on the Dunnett correction. Then the power under the LFC is given as follows:

$$\begin{aligned} 1 - β &= P (\cup_{k = 1}^{K} (Z_{k} > c_{α}) \mid H_{L F C}) \\ &= 1 - \int_{- \infty}^{c_{α}} \dots \int_{- \infty}^{c_{α}} ϕ (x_{1}, \dots, x_{K}; μ_{L F C}, Σ) \, d x_{1} \dots d x_{K}, \end{aligned} \tag{8}$$

where $ϕ (x_{1}, \dots, x_{K}; μ_{L F C}, Σ)$ is a multivariate normal density function with mean $μ_{L F C}$ which is calculated using equation (1) under the LFC alternative hypothesis $H_{L F C}$ and variance-covariance matrix $Σ$ is given by equation (5).
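Equations (7) and (8) differ only in the mean vector, so both can be evaluated by one routine. The sketch below is a base-R approximation under stated assumptions, not the authors' ‘DisSize’ or ‘LFCSize’ code: it takes the means from equation (1) with equal allocation among treatment arms, uses the equicorrelated covariance of equation (5), and reduces the $K$-dimensional integral to one dimension through a shared standard normal factor. The function name `multiarm_power` is hypothetical; the critical value 1.916 and the event counts 383 and 542 are the $K = 2$ values reported in Section 5.4 and Table 3.

```r
# Power of rejecting at least one null hypothesis, equations (7) and (8).
# d:     total number of events
# delta: vector of hazard ratios, one per treatment arm
# crit:  Dunnett-type critical value c_alpha
# r:     allocation ratio of each treatment arm to control
multiarm_power <- function(d, delta, crit, r = 1) {
  K <- length(delta)
  rho <- r / (1 + r)                       # correlation, equation (5)
  omega0 <- 1 / (1 + K * r)                # control allocation fraction
  omegak <- r * omega0                     # per-treatment-arm fraction
  # Means of the log-rank statistics, equation (1)
  mu <- -sqrt(d) * log(delta) * sqrt(omega0 * omegak / (omega0 + omegak))
  # P(all Z_k < crit) via the one-factor representation of the correlation
  integrand <- function(w) {
    sapply(w, function(wi) {
      dnorm(wi) * prod(pnorm((crit - mu - sqrt(rho) * wi) / sqrt(1 - rho)))
    })
  }
  1 - integrate(integrand, lower = -Inf, upper = Inf)$value
}

# Disjunctive power: both arms at hazard ratio 1/1.4, d = 383
power_dis <- multiarm_power(d = 383, delta = c(1 / 1.4, 1 / 1.4), crit = 1.916)
# LFC power: arm 1 at 1/1.4, arm 2 at 1, d = 542
power_lfc <- multiarm_power(d = 542, delta = c(1 / 1.4, 1), crit = 1.916)
print(round(c(disjunctive = power_dis, LFC = power_lfc), 3))
```

Both values should land near the 90% target used for Table 3, which gives a quick consistency check on the event counts.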

5. Group sequential MAMS design

5.1. SCPRT group sequential procedure

We now consider a group sequential MAMS trial with $J$ stages and $K + 1$ arms. Let $Z_{k, j}$ be the two-sample log-rank test statistic, computed from cumulative data up to the $j$th interim look, for comparing the $k$th treatment arm to the common control, and let $t_{j}^{*}$ be the information time at the $j$th interim look, where $j = 1, \dots, J$ (including the final analysis). Based on the SCPRT procedure (Xiong, 1995), the sequential futility and efficacy boundaries at the $j$th interim look are given as follows:

$$a_{j} = \frac{c_{α} t_{j}^{*} - \{2 a t_{j}^{*} (1 - t_{j}^{*})\}^{1 / 2}}{\sqrt{t_{j}^{*}}}, \tag{9}$$

$$b_{j} = \frac{c_{α} t_{j}^{*} + \{2 a t_{j}^{*} (1 - t_{j}^{*})\}^{1 / 2}}{\sqrt{t_{j}^{*}}}, \tag{10}$$

where $c_{α}$ is the critical value at the final stage with one-sided FWER $α$, and $a$ is a boundary coefficient determined by the maximum conditional probability of discordance $ρ$ (Xiong et al., 2003), that is, the probability that the conclusion reached by the sequential test at an interim look is reversed by the test at the end of the study. At the $j$th interim look, if the two-sample log-rank test statistic satisfies $Z_{k, j} < a_{j}$, the $k$th treatment arm is dropped (stopped for futility); if $Z_{k, j} > b_{j}$, it is graduated (stopped for efficacy). Otherwise, the $k$th treatment arm continues to the next stage. At the final analysis (with $t_{J}^{*} = 1$ and $a_{J} = b_{J} = c_{α}$), if $Z_{k, J} > c_{α}$, the $k$th treatment arm is declared active; otherwise, it is declared futile. The nominal significance levels at the $j$th interim look for testing hypothesis $H_{0}^{(k)}$ are given by

$$P_{a_{j}} = 1 - Φ (a_{j}); \quad P_{b_{j}} = 1 - Φ (b_{j}),$$

where $Φ (\cdot)$ is the cumulative distribution function of the standard normal. At the $j$th interim analysis for the $k$th treatment arm, we accept the null hypothesis if the observed $p$-value of the test $Z_{k, j}$ is greater than $P_{a_{j}}$ and reject it if the $p$-value is less than $P_{b_{j}}$; otherwise the arm continues to the next stage.
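The boundary formulas (9) and (10) and the nominal significance levels are simple enough to compute directly. The following R sketch does so; the function name `scprt_bounds` and its argument names are hypothetical and are not those of the authors' ‘SCPRT’ function. The inputs $c_{α} = 1.916$, $a = 2.645$ and $t_{j}^{*} = (1/3, 2/3, 1)$ are the $K = 2$, $J = 3$ values quoted in Section 5.4.

```r
# SCPRT futility/efficacy boundaries (equations 9 and 10) and the
# corresponding nominal one-sided significance levels.
# crit: final-stage critical value c_alpha
# a:    SCPRT boundary coefficient (Table 1)
# info: information times t_j*, the last one equal to 1
scprt_bounds <- function(crit, a, info) {
  half_width <- sqrt(2 * a * info * (1 - info))
  lower <- (crit * info - half_width) / sqrt(info)   # a_j
  upper <- (crit * info + half_width) / sqrt(info)   # b_j
  data.frame(info = info,
             lower = lower,
             upper = upper,
             p_futility = 1 - pnorm(lower),          # P_{a_j}
             p_efficacy = 1 - pnorm(upper))          # P_{b_j}
}

bounds <- scprt_bounds(crit = 1.916, a = 2.645, info = c(1 / 3, 2 / 3, 1))
print(round(bounds, 3))
```

At $t_{J}^{*} = 1$ the half-width term vanishes, so both boundaries collapse to $c_{α}$, as stated in the text.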

It is crucial to choose an appropriate boundary coefficient $a$ for the design such that the probability of the conclusion from the sequential test being reversed by the test at the planned end is small, but not unnecessarily small. Specifically, let $D$ be the event that the conclusion at an interim time is reversed at the final time and $θ$ be the drift parameter of the Brownian motion $B_{t}$. Let $ρ_{s} = P_{θ} (D | B_{1} = s)$, which is the conditional probability of discordance given the last stage observation $B_{1} = s$, and let $ρ = \max_{s} ρ_{s}$, which is the maximum conditional probability of discordance. The boundary coefficient $a$ in equations (9) and (10) is determined by choosing an appropriate $ρ$. A smaller $ρ$ results in a larger $a$, so that the upper and lower boundaries are further apart, which leads to a larger expected sample size. We recommend $ρ = 0.02$, as it leads to a maximum discordance probability of 0.0054 and results in an SCPRT boundary that is efficient while preserving the agreement of conclusions between the test at early stopping and the test at the planned end. For balanced information times, the boundary coefficient $a$ for $ρ = 0.02$ is given in Table 1 for $J = 2, \dots, 10$. For unbalanced information times, we can still use the value of $a$ for balanced information times; this results in only slight changes in the discordance probability (Xiong et al., 2003).

Table 1:

The maximum conditional probability of discordance $ρ$ and boundary coefficient $a$ for a $J$-stage (including the final analysis) group sequential SCPRT procedure with balanced information times.

	SCPRT boundary coefficient $a$ for $J$ interim analyses
$ρ$	$J = 2$	$J = 3$	$J = 4$	$J = 5$	$J = 6$	$J = 7$	$J = 8$	$J = 9$	$J = 10$
0.02	2.109	2.645	2.953	3.166	3.327	3.456	3.562	3.652	3.729

5.2. Interim analysis

The timing of each interim analysis is determined by the information accumulated in the trial, or information time. For a trial with a time-to-event endpoint, the information is determined by the number of events rather than the sample size. Let $d = d_{0} + d_{1} + \dots + d_{K}$ be the total number of events required for the trial, where $d_{k} = n ω_{k} p_{k}$ is the number of events for the $k$th arm, $k = 0,1, \dots, K$. Once a combined number of events $d_{j}^{(k)} = (d_{0} + d_{k}) t_{j}^{*}$ has been observed in the $k$th treatment arm and the control at the $j$th interim look, we conduct an interim analysis for that treatment arm (vs the control), where $t_{j}^{*}$ is the pre-specified information time at the $j$th look, e.g., $t_{j}^{*} = j / J$ for an equal number of events per arm per stage. This procedure is performed for all remaining treatment arms at the $j$th stage (these interim analyses are not conducted at the same calendar time). The approach depends only on the combined number of events in the treatment arm and the control at each interim analysis. Hence, dropping arm(s) for futility or graduating arm(s) for efficacy has no impact on the information time, and the proposed procedure for interim analyses preserves the power of the group sequential MAMS trial when arms are dropped for futility and/or graduated for efficacy during the trial. Because the interim analysis occurs at different time points for different experimental treatment arms, each interim analysis takes place over a window of time (the interim-window) rather than at a single calendar time.
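The event-driven timing rule can be sketched in a few lines of R. The inputs below are illustrative assumptions rather than values reported by the authors: $n = 542$ and equal allocation follow the first row of Table 3, while the per-arm failure probabilities are approximate values obtained from equation (3) under exponential survival, uniform accrual and no loss to follow-up. Rounding the thresholds up with `ceiling` is also a choice of this sketch.

```r
# Combined event counts d_j^(k) = (d_0 + d_k) * t_j* that trigger the j-th
# analysis of treatment arm k against the control.
n     <- 542                          # total sample size (Table 3, first row)
omega <- c(1 / 3, 1 / 3, 1 / 3)       # allocation: control, arm 1, arm 2
p_arm <- c(0.7772, 0.6713, 0.6713)    # assumed failure probabilities p_0, p_1, p_2
info  <- c(1 / 3, 2 / 3, 1)           # balanced information times, J = 3

d_arm <- n * omega * p_arm            # expected events per arm, d_k = n * omega_k * p_k

# One row per treatment arm, one column per look
thresholds <- t(sapply(d_arm[-1], function(dk) ceiling((d_arm[1] + dk) * info)))
rownames(thresholds) <- paste0("arm", seq_len(nrow(thresholds)))
colnames(thresholds) <- paste0("look", seq_along(info))
print(thresholds)
```

Each row depends only on the control arm and the treatment arm in question, which is why dropping or graduating another arm leaves these thresholds unchanged.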

5.3. Group sequential FWER and power

To assess the FWER and power for a group sequential MAMS trial with $K + 1$ arms and $J$ stages, we define the events $C_{k, j} = (Z_{k, j} > b_{j})$ and $B_{k, j} = (a_{j} < Z_{k, j} < b_{j})$ for the $k$th treatment arm at the $j$th interim look. The event of rejecting the null hypothesis $H_{0}^{(k)}$ within $J$ stages for the $k$th treatment arm is then given by

$$\begin{aligned} R_{k} &= \cup_{j = 1}^{J} \{[\cap_{i = 1}^{j - 1} B_{k, i}] \cap C_{k, j}\} \\ &= \cup_{j = 1}^{J} (a_{1} < Z_{k,1} < b_{1}, \dots, a_{j - 1} < Z_{k, j - 1} < b_{j - 1}, Z_{k, j} > b_{j}), \end{aligned} \tag{11}$$

where $\cap_{i = 1}^{0} B_{k, i}$ represents $Ω$ (full set). Thus, one-sided FWER $α$ under the global null $H_{G}$ can be calculated as follows:

$$\begin{aligned} α &= P (\text{Reject at least one null hypothesis} \mid H_{G}) \\ &= P (\cup_{k = 1}^{K} R_{k} \mid H_{G}) . \end{aligned} \tag{12}$$

The disjunctive power $1 - β$ under global alternative $H_{a}$ can be calculated as follows:

$$\begin{aligned} 1 - β &= P (\text{Reject at least one null hypothesis} \mid H_{a}) \\ &= P (\cup_{k = 1}^{K} R_{k} \mid H_{a}) . \end{aligned} \tag{13}$$

The power $1 - β$ under the LFC alternative $H_{L F C}$ can be calculated as follows:

$$\begin{aligned} 1 - β &= P (\text{Reject at least one null hypothesis} \mid H_{L F C}) \\ &= P (\cup_{k = 1}^{K} R_{k} \mid H_{L F C}) . \end{aligned} \tag{14}$$

Using these formulae combined with the joint multivariate normal distribution of $(Z_{k, 1}, \dots, Z_{k, j})$ for $j = 1, \dots, J$ and $k = 1, \dots, K$, the FWER and power in the group sequential setting can be calculated through multivariate integration. It can be shown that the power function based on the SCPRT procedure is approximately the same as that for the fixed sample design. Specifically, for a group sequential procedure with $J$ interim analyses, let ${\bar{β}}_{0} (θ)$ and ${\bar{β}}_{J} (θ)$ be the power functions for the fixed sample test and the $J$-stage group sequential SCPRT test, respectively, where $θ$ is the drift parameter of the Brownian motion, and let $ρ_{\max}$ be the maximum discordance probability of the group sequential SCPRT procedure. Following Theorem 4.1 in Xiong (1995), for any $θ$ we have

$$(1 - ρ_{\max}) {\bar{β}}_{0} (θ) \leq {\bar{β}}_{J} (θ) \leq (1 - ρ_{\max}) {\bar{β}}_{0} (θ) + ρ_{\max},$$

which implies that the difference between the two power functions is less than $ρ_{\max}$. Thus, with a small maximum discordance probability $ρ_{\max}$, the power of the fixed sample design is approximately the same as the power of the group sequential trial based on the SCPRT procedure. The recommended maximum conditional probability of discordance $ρ = 0.02$ leads to a maximum discordance probability $ρ_{\max} = 0.0054$. More details on the computation of the maximum discordance probability $ρ_{\max}$ can be found in Xiong et al. (2002). We also conduct simulation studies to demonstrate that the proposed group sequential MAMS design preserves the nominal FWER and power (see Section 6).
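As a worked instance of this bound, take a fixed sample power of ${\bar{β}}_{0} (θ) = 0.90$ (the design power used in this paper) together with a maximum discordance probability of 0.0054:

$$(1 - 0.0054) \times 0.90 = 0.8951 \leq {\bar{β}}_{J} (θ) \leq 0.8951 + 0.0054 = 0.9005 ,$$

so, for these values, the power of the group sequential design would lie within about half a percentage point of the fixed sample power.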

5.4. Implementation of trial design

The proposed group sequential MAMS design has been implemented in R code (see supplemental materials). The SCPRT procedure has been implemented in the user-friendly software SCPRTinfWin, which can be downloaded at https://www.stjude.org/research/departments/biostatistics/software/scprt (Xiong, 2017). The futility and efficacy boundaries $a_{j}$ and $b_{j}$ of the SCPRT procedure with Dunnett correction for designs with up to $K = 4$ treatment arms and up to $J = 4$ stages are given in Table 2. For example, for a MAMS trial with three arms (two treatment arms and a control arm; $K = 2$) and three stages ($J = 3$, including the final analysis) with balanced information times $(t_{1}^{*}, t_{2}^{*}, t_{3}^{*}) = (1 / 3, 2 / 3, 1)$ and maximum conditional probability of discordance $ρ = 0.02$, we can calculate the boundary coefficient $a = 2.645$ using the SCPRTinfWin software (also see Table 1). Then, given an FWER of $α = 5\%$, we use the following R function ‘Dunnett’ to calculate the critical value $c_{α} = 1.916$ with the Dunnett correction, and the R function ‘SCPRT’ to calculate the futility and efficacy boundaries $(a_{1}, a_{2}, a_{3}) = (- 0.772, 0.237, 1.916)$ and $(b_{1}, b_{2}, b_{3}) = (2.984, 2.892, 1.916)$. The required total sample size ($n$) and total number of events ($d$) for the study designs given in Table 3 can be calculated using the R functions ‘DisSize’ and ‘LFCSize’ for disjunctive power and LFC power, respectively.

Table 2:

The SCPRT boundaries $a_{j}$ and $b_{j}$ for group sequential $J$ -stage $(K + 1)$ -arm designs with Dunnett type adjustment for one-sided FWER $α = 0.05$ .

		$K = 2$		$K = 3$		$K = 4$
$J$	$t_{j}^{*}$	$a_{j}$	$b_{j}$	$a_{j}$	$b_{j}$	$a_{j}$	$b_{j}$
2	0.5	−0.097	2.807	0.006	2.910	0.076	2.980
	1	1.916	1.916	2.062	2.062	2.161	2.161
3	1/3	−0.772	2.984	−0.687	3.068	−0.630	3.126
	2/3	0.237	2.892	0.356	3.012	0.437	3.092
	1	1.916	1.916	2.062	2.062	2.161	2.161
4	0.25	−1.147	3.063	−1.074	3.136	−1.024	3.185
	0.5	−0.364	3.073	−0.260	3.176	−0.190	3.246
	0.75	0.444	2.874	0.571	3.001	0.656	3.087
	1	1.916	1.916	2.062	2.062	2.161	2.161

Table 3:

Total sample size ($n$) and number of events ($d$) calculated for multi-arm trials with one-sided FWER of 5% and 90% disjunctive or LFC power under exponential distributions for various design scenarios: median survival time of the control $m_{0} = 7.3$ (months), accrual duration $t_{a} = 24$ (months), follow-up time $t_{f} = 6, 12$ and 16 (months), various values of the hazard ratio $δ$, and number of treatment arms $K = 2$, with $δ_{1} = δ_{2} = δ$ for disjunctive power and $δ_{1} = δ$ and $δ_{2} = 1$ for LFC power. Simulations are conducted to estimate the empirical power (EP) and FWER based on 10,000 simulation runs.

		Disjunctive				LFC
$δ^{- 1}$	$t_{f}$	$n$	$d$	FWER	EP	$n$	$d$	FWER	EP
1.4	6	542	383	0.049	0.905	731	542	0.046	0.893
1.5	6	382	264	0.050	0.905	509	374	0.052	0.889
1.6	6	290	197	0.052	0.908	382	278	0.050	0.882
1.4	12	472	383	0.050	0.905	643	542	0.051	0.893
1.5	12	331	264	0.048	0.908	447	374	0.054	0.897
1.6	12	251	197	0.052	0.911	335	278	0.052	0.893
1.4	16	446	383	0.054	0.908	611	542	0.054	0.893
1.5	16	312	264	0.051	0.906	424	374	0.053	0.892
1.6	16	236	197	0.049	0.909	318	278	0.051	0.892

```r
# The functions Dunnett, SCPRT, DisSize and LFCSize are defined in the
# supplemental material (Appendix 2). Printed output is shown as comments.
library(mvtnorm)

# R function 'Dunnett' calculates the critical value with Dunnett correction
Dunnett(alpha=0.05, r=1, K=2)
# $critical.value
# [1] 1.916

# R function 'SCPRT' calculates the SCPRT lower and upper boundaries
SCPRT(alpha=0.05, r=1, K=2, frac=c(1/3,2/3,1))
# $critical.value
# [1] 1.916
# $lshape
# [1] -0.772 0.237 1.916
# $ushape
# [1] 2.984 2.892 1.916

# R function 'DisSize' calculates the total number of events and sample size
# for disjunctive power
DisSize(kappa=1, m0=7.3, delta=1/1.4, ta=24, tf=6, beta=0.1, r=1, eta=0, K=2)
# $total.number.events
# [1] 383
# $total.sample.size
# [1] 542

# R function 'LFCSize' calculates the total number of events and sample size
# for LFC power
LFCSize(kappa=1, m0=7.3, delta=1/1.4, delta0=1, ta=24, tf=6, beta=0.1, r=1, eta=0, K=2)
# $total.number.events
# [1] 542
# $total.sample.size
# [1] 731
```

All R functions, including those for group sequential MAMS trial designs and for simulating operating characteristics, are available in Appendix 2 of the supplemental material.

6. Simulation

In this section, simulation studies are conducted to assess the performance of the proposed fixed sample designs and the operating characteristics of the proposed group sequential MAMS designs. We also compare the probabilities of selecting the most effective arm(s) between two designs: one using a simultaneous stopping rule and the other using an arm-specific stopping rule.

6.1. Performance of fixed sample design

The first simulation study verifies the accuracy of the proposed sample size (or number of events) calculations for multi-arm fixed sample designs. For the sample size calculation, we assume equal sample size allocation ($r = 1$). Sample sizes are calculated under exponential survival distributions $S_{k} (t) = e^{- λ_{k} t} (k = 0,1, \dots, K)$. The design parameter configurations are as follows: number of treatment arms $K = 2$; uniform accrual with accrual duration $t_{a} = 24$ (months) and follow-up time $t_{f} = 6,12$ and 16 (months); the value of $λ_{0}$ is selected to reflect a median survival time $m_{0} = 7.3$ (months) for the control, and the inverse hazard ratio $δ_{k}^{- 1} = λ_{0} / λ_{k}$ is set to 1.4, 1.5 and 1.6 $(k = 1, \dots, K)$. We further assume no loss to follow-up (administrative censoring only). Thus, the censoring time due to the patient’s staggered entry follows a uniform distribution on the interval $[t_{f}, t_{a} + t_{f}]$. Table 3 presents the total number of events and sample size calculated under various scenarios with one-sided FWER of 5% and disjunctive power and LFC power of 90%. The results in Table 3 show that studies designed for disjunctive power require smaller total sample sizes (numbers of events) than studies designed under the LFC. Increasing the duration of follow-up reduces the total sample size but does not change the total number of events. Simulation results (based on 10,000 simulated trials) show that the empirical FWERs and powers are all close to their nominal levels of 5% and 90%, respectively. Therefore, the proposed sample size formula provides an accurate sample size (number of events) estimate for fixed sample designs.

6.2. Operating characteristics of MAMS design

The second simulation studies the operating characteristics of the proposed group sequential MAMS design. We considered trials with $K = 2,3$ treatment arms and $J = 2,3$ stages. Sample sizes were calculated for fixed sample designs under the exponential distribution with one-sided FWER of 5% and power of 90% by assuming a median survival time of 7.3 months for the control group and an inverse hazard ratio $δ^{- 1} = 1.4$; uniform accrual with accrual duration $t_{a} = 24$ (months), follow-up time $t_{f} = 12$ (months) and no loss to follow-up. The empirical FWER and power for the corresponding group sequential MAMS design were obtained from 10,000 simulation runs. The results in Table 4 showed that all simulated empirical FWERs and powers were close to their nominal values. Thus, we have empirically verified that the proposed sample size calculation for the fixed sample design preserves the FWER and power for the group sequential MAMS design. Table 4 also provides the operating characteristics of the proposed group sequential MAMS designs, such as expected total sample size, number of events and study duration. As expected, two-stage MAMS designs reduced the expected sample size, number of events, and study duration compared to the fixed sample design, and three-stage designs showed only marginal further reductions in these quantities compared to the two-stage designs.

Table 4:

The operating characteristics of the proposed group sequential MAMS designs with number of treatment arms $K = 2,3$, number of stages $J = 2,3$, accrual duration $t_{a} = 24$ (months), follow-up time $t_{f} = 12$ (months) and total study duration $τ = 36$ (months). Total sample size ($n_{\max}$) and number of events ($d_{\max}$) are calculated with one-sided FWER of 5% and power of 90% under an exponential model with median survival time 7.3 months for the control group to detect a hazard ratio $δ = 1 / 1.4$. The corresponding empirical FWER, disjunctive (Dis) and LFC powers, expected sample size, number of events and study duration are estimated based on 10,000 simulated trials.

$K$	$J$	Type	$n_{\max}$	$E (n_{\max})$		$d_{\max}$	$E (d_{\max})$		$τ$	$E (τ)$		FWER	Power
				$H_{0}$	$H_{a}$		$H_{0}$	$H_{a}$		$H_{0}$	$H_{a}$
2	2	Dis	474	429	460	384	325	352	36	31.1	34.8	0.054	0.907
		LFC	645	590	610	543	447	466	36	31.1	34.7	0.054	0.894
2	3	Dis	474	427	460	384	305	344	36	29.2	34.2	0.055	0.910
		LFC	645	589	611	543	423	448	36	29.5	33.7	0.054	0.894
3	2	Dis	592	531	577	476	396	441	36	31.7	35.6	0.051	0.910
		LFC	932	847	870	792	634	657	36	31.9	35.3	0.052	0.893
3	3	Dis	592	525	578	476	369	432	36	30.0	35.4	0.053	0.913
		LFC	932	843	868	792	595	628	36	30.3	34.5	0.051	0.888
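To give a feel for how the arm-specific stopping rule behaves in simulations such as those summarized in Table 4, the following R sketch estimates the FWER under the global null by Monte Carlo. It is a simplification and not the authors' simulation code: instead of generating survival data, it draws the log-rank statistics directly from their asymptotic Brownian-motion approximation with between-arm correlation $r / (1 + r) = 0.5$, and it evaluates all arms at common information times. The boundaries are the $K = 2$, $J = 3$ values from Table 2; the seed and the number of replicates are arbitrary.

```r
# Monte Carlo estimate of the FWER under the global null for K = 2, J = 3,
# using a Brownian-motion approximation to the log-rank statistics (assumption).
set.seed(2024)
n_sim <- 20000
K     <- 2
rho   <- 0.5                          # r / (1 + r) with r = 1, equation (5)
info  <- c(1 / 3, 2 / 3, 1)           # balanced information times
lower <- c(-0.772, 0.237, 1.916)      # futility boundaries a_j (Table 2)
upper <- c( 2.984, 2.892, 1.916)      # efficacy boundaries b_j (Table 2)
J     <- length(info)
dt    <- diff(c(0, info))             # information increment per stage

any_reject <- logical(n_sim)
for (s in seq_len(n_sim)) {
  shared <- rnorm(J, sd = sqrt(dt))   # increments common to all comparisons
  for (k in seq_len(K)) {
    own <- rnorm(J, sd = sqrt(dt))    # arm-specific increments
    path <- cumsum(sqrt(rho) * shared + sqrt(1 - rho) * own)
    z <- path / sqrt(info)            # standardized statistics Z_{k,j}
    stop_at <- which(z < lower | z > upper)[1]   # first boundary crossing
    # Arm k graduates if it stops by crossing the efficacy boundary;
    # the other arm is evaluated regardless (arm-specific stopping).
    if (z[stop_at] > upper[stop_at]) any_reject[s] <- TRUE
  }
}

fwer_hat <- mean(any_reject)
print(fwer_hat)
```

With 20,000 replicates the Monte Carlo standard error is about 0.0015, so the estimate should be read as approximate; it is intended only to show the mechanics of applying the boundaries arm by arm.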

6.3. Comparison of selection probabilities

The goal of a phase II MAMS trial is to select one or more arms for advancement to a phase III trial. Thus, the probability of selecting the most effective arm(s) is an important operating characteristic of the trial design. However, theoretical calculation of the selection probability for a MAMS trial is difficult; therefore, we conducted a third simulation to compare the selection probabilities between two designs: one using a simultaneous stopping rule (MJW method), which selects the most effective arm only, and the other using an arm-specific stopping rule (proposed method), which can select one or more effective arm(s). For both methods, the selection of treatment arm(s) is based on the observed values of the two-sample log-rank test statistics.

For the simulation, we considered trials with three experimental arms ($K = 3$), each undergoing three analyses ($J = 3$ stages) and compared to a common control. We evaluated the performance of the established MJW method using the O’Brien and Fleming boundary (with a fixed lower bound of zero, Magirr et al., 2012) versus the proposed method across the four distinct LFC design settings outlined in Table 5. For each design setting, multiple trials were generated under varying ground truth hazard ratios, which could either align with the design setting or deviate from it to different degrees. Survival times were generated from an exponential distribution with a median survival of 7.3 (months) in the control arm, uniform accrual with accrual duration $t_{a} = 24$ (months), length of follow-up $t_{f} = 12$ (months), and no loss to follow-up. The targeted one-sided FWER and power were set at 5% and 90%, respectively. Each scenario was based on 10,000 simulation runs.

The simulation results in Table 5 showed that, within each design setting, the MJW method had the highest probability of selecting the most efficacious arm (arm 1) when the data truly matched the design setting, but its performance declined as the second and third arms became more efficacious. In contrast, the proposed method consistently selected the most efficacious arm 1, and chose multiple arms, including arm 1, when the efficacies of the arms were more similar. For instance, in the first design setting with hazard ratios of $(δ^{(1)}, δ^{(0)}, δ^{(0)}) = (0.65, 1, 1)$, when the ground truth hazard ratios were $(δ_{1}, δ_{2}, δ_{3}) = (0.65, 0.7, 0.75)$, the MJW method identified arm 1 as the most efficacious arm with only a 57.6% chance, whereas the proposed method selected arm 1 as one of the efficacious arms with an 88.57% chance.
This discrepancy was attributed to the inherent limitations of the MJW method, which hinder its ability to differentiate between the most efficacious arm and other competing arms when their efficacies are close. In contrast, our proposed method, free from such restrictions, demonstrated superior performance across all scenarios, even in the presence of other competing arms. Additionally, the results indicated that the MJW method was underpowered, particularly for a relatively small hazard ratio (large effect size).
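
The data-generating mechanism described above can be sketched in a few lines. The following is a minimal illustration, not the authors' simulation code: the function names, the per-arm sample size of 200, the random seed, and the sign convention (positive $Z$ favors the experimental arm) are all assumptions made for the example, and only a single final analysis at $t_{a} + t_{f}$ is shown rather than the full group sequential procedure.

```python
import numpy as np

def simulate_arm(n, median, t_a, t_f, rng):
    """Exponential survival with uniform accrual on [0, t_a] and
    administrative censoring at calendar time t_a + t_f (no loss to follow-up)."""
    hazard = np.log(2.0) / median                 # exponential hazard from the median
    entry = rng.uniform(0.0, t_a, size=n)         # calendar time of enrollment
    surv = rng.exponential(1.0 / hazard, size=n)  # latent survival time
    max_follow_up = (t_a + t_f) - entry           # follow-up available per patient
    time = np.minimum(surv, max_follow_up)
    event = surv <= max_follow_up
    return time, event

def logrank_z(time_c, event_c, time_e, event_e):
    """Standardized two-sample log-rank statistic.
    Assumed convention: positive values favor the experimental arm."""
    time = np.concatenate([time_c, time_e])
    event = np.concatenate([event_c, event_e])
    is_ctrl = np.concatenate([np.ones(len(time_c), bool),
                              np.zeros(len(time_e), bool)])
    o_minus_e, var = 0.0, 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        n = at_risk.sum()                          # total at risk just before t
        n_c = (at_risk & is_ctrl).sum()            # control patients at risk
        d = (event & (time == t)).sum()            # events at t, both arms
        d_c = (event & (time == t) & is_ctrl).sum()
        o_minus_e += d_c - d * n_c / n             # observed minus expected (control)
        if n > 1:
            var += d * (n_c / n) * (1 - n_c / n) * (n - d) / (n - 1)
    return o_minus_e / np.sqrt(var)

rng = np.random.default_rng(2024)                  # arbitrary seed
ctrl = simulate_arm(200, 7.3, 24, 12, rng)         # control: median 7.3 months
arm1 = simulate_arm(200, 7.3 / 0.65, 24, 12, rng)  # hazard ratio 0.65 vs control
z = logrank_z(*ctrl, *arm1)
print(f"log-rank Z for arm 1 vs control: {z:.2f}")
```

Under the exponential model a hazard ratio of $δ$ corresponds to a median of $7.3 / δ$ months in the experimental arm, which is how the ground truth hazard ratios enter the sketch.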

Table 5:

Simulations to compare the selection probability of the most effective arm(s) between the MAMS designs using simultaneous stopping rule (MJW, select arm 1 only) or arm-specific stopping rule (select all effective arms) with number of treatment arms $K = 3$ , and number of stages $J = 3$ .

Design parameter	Ground truth	Selection probability
$(δ^{(1)}, δ^{(0)}, δ^{(0)})$	$(δ_{1}, δ_{2}, δ_{3})$	MJW (arm 1 only)	Proposed: arm 1	Proposed: arms 1 & 2	Proposed: all arms
(0.65, 1, 1)	(0.65, 1, 1)	0.891	0.8857	0.0200	0.0037
${(129, 121)}^{*}$	(0.65, 0.8, 1)	0.836	0.8857	0.3609	0.0166
${(383.11, 0.03)}^{†}$	(0.65, 0.8, 0.9)	0.831	0.8857	0.3609	0.0780
	(0.65, 0.7, 1)	0.639	0.8857	0.6960	0.0192
	(0.65, 0.7, 0.75)	0.576	0.8857	0.6960	0.4688
(0.65, 0.8, 0.8)	(0.65, 0.8, 0.8)	0.876	0.8826	0.3552	0.2048
${(180, 114)}^{*}$	(0.65, 0.7, 1)	0.695	0.8826	0.6913	0.0183
${(550.91, 0.05)}^{†}$	(0.65, 0.7, 0.75)	0.635	0.8826	0.6913	0.4593
(0.5, 1, 1)	(0.5, 1, 1)	0.875	0.8769	0.0197	0.0058
${(51, 47)}^{*}$	(0.5, 0.7, 1)	0.812	0.8769	0.3560	0.0194
${(208.02, 0.01)}^{†}$	(0.5, 0.7, 0.8)	0.802	0.8769	0.3560	0.1166
	(0.5, 0.6, 1)	0.703	0.8769	0.6035	0.0221
	(0.5, 0.6, 0.8)	0.696	0.8769	0.6035	0.150
(0.5, 0.8, 0.8)	(0.5, 0.8, 0.8)	0.865	0.8844	0.1659	0.0643
${(54, 46)}^{*}$	(0.5, 0.6, 1)	0.717	0.8844	0.6223	0.0171
${(225.97, 0.05)}^{†}$	(0.5, 0.6, 0.7)	0.682	0.8844	0.6223	0.3065


Note: ${(d_{1}, d_{2})}^{*}$ gives the number of events per group under the MJW method ($d_{1}$) and under the proposed method ($d_{2}$), both calculated under the corresponding design parameter $(δ^{(1)}, δ^{(0)}, δ^{(0)})$. ${(t_{1}, t_{2})}^{†}$ gives the time (sec) for the corresponding design under the MJW method with a simultaneous stopping rule ($t_{1}$) and under the proposed method with an arm-specific stopping rule ($t_{2}$).

With our proposed method, the probability of selecting both Arm 1 and Arm 2 increased in tandem with the efficacy of the second arm. Notably, this probability remained constant even when the third arm exhibited comparatively higher efficacy. This flexibility was attributed to the ability of the proposed algorithm to include all efficacious arms, regardless of their relative efficacy. Furthermore, the probability of selecting all arms increased as the efficacy of all three arms improved, highlighting the robustness and adaptability of our proposed method in the presence of other competing arms.
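
The difference between the two stopping rules can be made concrete with a small sketch. It is an illustration under simplifying assumptions rather than the MJW or proposed algorithm as implemented: futility boundaries are omitted, the efficacy boundaries and $Z$ values are hypothetical numbers, the same boundaries are used for both rules, and under the simultaneous rule the arm with the largest statistic among those crossing is taken as the selected arm.

```python
import numpy as np

def select_arms(z, upper, rule):
    """Return the 0-based indices of the selected arms.

    z     : (K, J) array of log-rank Z statistics, arm k at stage j
    upper : length-J sequence of efficacy boundaries (hypothetical values)
    rule  : "simultaneous" or "arm_specific"
    """
    if rule not in ("simultaneous", "arm_specific"):
        raise ValueError(f"unknown rule: {rule}")
    n_arms, n_stages = z.shape
    active = set(range(n_arms))
    selected = []
    for j in range(n_stages):
        crossed = [k for k in sorted(active) if z[k, j] >= upper[j]]
        if rule == "simultaneous":
            if crossed:
                # The whole trial stops; keep only the best-looking arm.
                return [max(crossed, key=lambda k: z[k, j])]
        else:
            # Arms that cross stop for efficacy; the rest continue.
            selected.extend(crossed)
            active -= set(crossed)
            if not active:
                break
    return sorted(selected)

# Hypothetical statistics for K = 3 arms over J = 3 stages.
z = np.array([[1.0, 2.6, 2.9],
              [0.8, 1.9, 2.4],
              [0.2, 0.5, 0.9]])
upper = [3.0, 2.5, 2.2]

print(select_arms(z, upper, "simultaneous"))  # [0]
print(select_arms(z, upper, "arm_specific"))  # [0, 1]
```

In this example arm 1 (index 0) crosses at stage 2. The simultaneous rule ends the trial there, so arm 2 (index 1) never gets the chance to cross at stage 3, whereas the arm-specific rule lets it continue and selects both.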

7. Mixed treatment effects

In this article, the power of the trial is defined under either a global alternative (disjunctive) or the LFC. However, in a real trial it is unlikely that only one experimental arm is effective or that all experimental arms are effective. It is more likely that there are mixed effects among the experimental arms. Considering a general mixed treatment effect, without loss of generality, we assume the first $s$ treatments are effective and the others are ineffective, that is, $δ_{1} = δ_{1}^{(1)}, \dots, δ_{s} = δ_{s}^{(1)}$ and $δ_{s + 1} = \dots = δ_{K} = δ^{(0)}$, where $δ_{k}^{(1)} < δ^{(0)}, k = 1, \dots, s$. We conducted simulations under various mixed treatment effect scenarios with the number of treatment arms $K = 2, 3$ and the number of stages $J = 2, 3, 4$. Survival times were generated from an exponential distribution with a median survival of $m_{0} = 10$ (months) in the control arm, uniform accrual with accrual duration $t_{a} = 40$ (months), and length of follow-up $t_{f} = 20$ (months), with all other design parameters remaining the same as in the previous section. The simulation results in Table 6 showed that the proposed method controls the FWER well and provides sufficient power under various mixed treatment effect scenarios. Therefore, when multiple treatment arms are effective, a group sequential MAMS trial design with an arm-specific efficacy stopping rule is more suitable for finding the most effective arm and selecting one or multiple effective arm(s).
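
Table 6 reports the treatment effects on the log scale, $ψ_{k} = -\log δ_{k}$. As a small worked example of how such a value maps back to the exponential model used in the simulation, the sketch below converts $ψ_{k}$ into a hazard ratio, a hazard rate, and a median survival time. The function name is hypothetical, and the conversion relies only on the exponential relationship between median and hazard ($\text{median} = \log 2 / \text{hazard}$).

```python
import math

def arm_parameters(psi, control_median):
    """Convert psi = -log(hazard ratio) into exponential-model quantities."""
    delta = math.exp(-psi)                          # hazard ratio vs control
    control_hazard = math.log(2) / control_median   # control hazard per month
    hazard = delta * control_hazard                 # treatment-arm hazard
    median = math.log(2) / hazard                   # treatment-arm median (months)
    return delta, hazard, median

for psi in (0.3, 0.25, 0.2, 0.1):
    delta, hazard, median = arm_parameters(psi, control_median=10.0)
    print(f"psi={psi:.2f}  HR={delta:.3f}  median={median:.2f} months")
```

For instance, $ψ = 0.3$ with $m_{0} = 10$ months corresponds to a hazard ratio of about 0.741 and a median survival of about 13.5 months in the treatment arm.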

Table 6:

Simulations conducted to study the FWER and empirical power (EP) under various mixed treatment effects with number of stages $J = 2, 3, 4$ and number of arms $K = 2, 3$, where $ψ_{k} = -\log δ_{k}$ is the log hazard ratio (HR) of the $k^{th}$ treatment arm vs the control. The FWER and power are estimated based on 10,000 simulation runs.

# of arms	log hazard ratio	sample size	# of events	$J = 2$		$J = 3$		$J = 4$
$K$	$(ψ_{1}, ψ_{2})$	$n$	$d$	FWER	EP	FWER	EP	FWER	EP
2	(0.3,0.1)	762	678	0.0494	0.8975	0.0482	0.8981	0.0492	0.8972
2	(0.3,0.2)	708	624	0.0530	0.9067	0.0540	0.9062	0.0535	0.9070
2	(0.3,0.25)	642	561	0.0498	0.9114	0.0494	0.9122	0.0500	0.9124
2	(0.3,0.3)	555	483	0.0532	0.9042	0.0536	0.9043	0.0530	0.9044
$K$	$(ψ_{1}, ψ_{2}, ψ_{3})$	$n$	$d$	FWER	EP	FWER	EP	FWER	EP
3	(0.3,0.1,0.1)	1104	984	0.0497	0.8954	0.0497	0.8953	0.0505	0.8953
3	(0.3,0.2,0.1)	1036	916	0.0490	0.9039	0.0499	0.9041	0.0498	0.9037
3	(0.3,0.25,0.1)	940	828	0.0498	0.9030	0.0504	0.9031	0.0508	0.9032
3	(0.3,0.2,0.2)	984	864	0.0510	0.8994	0.0501	0.9003	0.0517	0.8992
3	(0.3,0.25,0.2)	904	792	0.0494	0.9014	0.0503	0.9005	0.0496	0.9022
3	(0.3,0.25,0.25)	844	736	0.0488	0.9071	0.0490	0.9078	0.0503	0.9065
3	(0.3,0.3,0.3)	692	600	0.0541	0.9044	0.0540	0.9047	0.0545	0.9033
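
When reading FWER estimates such as those in Table 6, it helps to keep the Monte Carlo error of a proportion estimated from 10,000 runs in mind. The following sketch computes the binomial standard error at the nominal 5% level and the corresponding approximate 95% range; the function name is hypothetical and the normal approximation is an assumption of the example, not part of the reported analysis.

```python
import math

def mc_standard_error(p, n_runs):
    """Binomial standard error of a proportion estimated from n_runs simulations."""
    return math.sqrt(p * (1.0 - p) / n_runs)

nominal = 0.05
se = mc_standard_error(nominal, 10_000)
lower, upper = nominal - 1.96 * se, nominal + 1.96 * se
print(f"SE = {se:.5f}, approximate 95% range = ({lower:.4f}, {upper:.4f})")
```

The standard error is about 0.0022, so estimates between roughly 0.0457 and 0.0543 are compatible with a true FWER of 5%. Nearly all entries in Table 6 fall inside this range; the largest, 0.0545, lies marginally above it.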


8. Conclusion

In this paper, we address the need for more efficient and adaptable clinical trial designs, particularly in the evaluation of multiple experimental treatments with time-to-event endpoints. The existing literature employs a simultaneous stopping rule, which halts the trial if any arm meets the predefined efficacy criteria. However, this may prematurely terminate the trial and fail to capture the true efficacy of individual treatments, particularly when multiple arms show potential for success.

To address this issue, we propose a novel group sequential MAMS survival trial design that incorporates an arm-specific stopping rule. Our approach discontinues an arm upon demonstrating efficacy at an interim analysis but continues the evaluation of all remaining arms. This allows for a more nuanced assessment of treatment efficacy and enhances the probability of selecting the best-performing arm.

We conducted extensive simulation studies to assess the operating characteristics of our proposed MAMS design with an arm-specific stopping rule, and to compare it to trial designs employing a simultaneous stopping rule, such as the MJW method. Our simulations demonstrated that the proposed method effectively controlled the FWER and maintained the designed power. Moreover, while the MJW method excelled in selecting the most efficacious arm under ideal conditions, its performance diminished as the efficacy of other arms increased. In contrast, our proposed method consistently identified the most efficacious arm or accurately selected multiple arms, especially when their efficacies closely matched. This robust pattern held true across scenarios where either a single experimental arm was effective or where efficacy was observed across multiple arms in practical settings.

Finally, adaptive seamless phase II/III designs, which are conducted in two stages with treatment selection at the first stage, have become increasingly popular (Stallard and Friede, 2008; Jenkins et al., 2011). The proposed MAMS design provides a high probability of correctly selecting the most effective arm(s), making it suitable for the first stage of treatment selection in an adaptive seamless phase II/III trial and increasing the likelihood of trial success. This will be a future research topic.

Supplementary Material

NIHMS2023588-supplement-1.pdf (177.4 KB, pdf)

Acknowledgments

Dr. Wu’s research was supported by the University of New Mexico Comprehensive Cancer Center Support Grant National Cancer Institute (NCI) P30CA118100 and Dr. Li’s research was supported by the Comprehensive Cancer Center at St. Jude Children’s Research Hospital and American Lebanese Syrian Associated Charities (ALSAC).

Footnotes

CONFLICT OF INTEREST

The authors have declared no conflict of interest.

DATA AVAILABILITY STATEMENT

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

References

Dunnett C. A multiple comparison procedure for comparing several treatments with a control. Journal of the American Statistical Association 1955; 50:1096–1121.
Genz A, Bretz F. Computation of Multivariate Normal and t Probabilities. Heidelberg: Springer; 2009.
Halabi S, Michiels S. Textbook of Clinical Trials in Oncology. Chapter 9. New York: CRC Press; 2019.
Jaki T, Magirr D. Considerations on covariates and endpoints in multi-arm multi-stage clinical trials selecting all promising treatment. Statistics in Medicine 2013; 32:1150–1163.
Jaki T, Magirr D, Pallmann P. MAMS: Designing Multi-Arm Multi-Stage Studies. R package version 1.2, 2017; URL http://CRAN.R-project.org/package=MAMS
Jenkins M, Stone A, Jennison C. An adaptive seamless phase II/III design for oncology trials with subpopulation selection using correlated survival endpoints. Pharmaceutical Statistics 2011; 10:347–356.
Magirr D, Jaki T, Whitehead J. A generalised Dunnett test for multi-arm, multi-stage clinical studies with treatment selection. Biometrika 2012; 99:494–501.
Magirr D, Stallard N, Jaki T. Flexible sequential designs for multi-arm clinical trials. Statistics in Medicine 2014; 33:3269–3279.
Schaid DJ, Wieand S, Therneau TM. Optimal two-stage screening designs for survival comparisons. Biometrika 1990; 77:659–663.
Stallard N, Friede T. A group-sequential design for clinical trials with treatment selection. Statistics in Medicine 2008; 27:6209–6227.
Stallard N, Todd S. Sequential designs for Phase III clinical trials incorporating treatment selection. Statistics in Medicine 2003; 22:689–703.
Wason JMS, Jaki T. Optimal design of multi-arm multi-stage trials. Statistics in Medicine 2012; 31:4269–4279.
Wu J, Li Y. Group sequential multi-arm multi-stage survival trial design with treatment selection. Journal of Biopharmaceutical Statistics 2023; DOI: 10.1080/10543406.2023.2235409.
Wu J, Li Y, Zhu L. Group sequential multi-arm multi-stage trial design with treatment selection. Statistics in Medicine 2023; 42:1480–1491.
Xiong X. A class of sequential conditional probability ratio tests. Journal of the American Statistical Association 1995; 90:1463–1473.
Xiong X. A computer program for SCPRT on information time. Version 1.0, 2017.
Xiong X, Tan M, Boyett J. Sequential conditional probability ratio tests for normalized test statistic on information time. Biometrics 2003; 59:624–631.
Xiong X, Tan M, Kutner MH. Computational methods for evaluating sequential tests and post-test estimations via sufficiency principle. Statistica Sinica 2002; 12(4):1027–1041.

