Author manuscript; available in PMC: 2021 Apr 1.
Published in final edited form as: Lifetime Data Anal. 2019 Apr 12;26(2):266–291. doi: 10.1007/s10985-019-09470-4

Group-sequential logrank methods for trial designs using bivariate non-competing event-time outcomes

Tomoyuki Sugimoto 1, Toshimitsu Hamasaki 2, Scott R Evans 3, Susan Halabi 4
PMCID: PMC7517875  NIHMSID: NIHMS1629114  PMID: 30980317

Abstract

We discuss the multivariate (2L-variate) correlation structure and the asymptotic distribution of the group-sequential weighted logrank statistics formulated when monitoring two correlated event-time outcomes in clinical trials. The asymptotic distribution and the variance-covariance matrix of the 2L-variate weighted logrank statistic are derived in forms applicable to various group-sequential trial designs. These results are used to construct a group-sequential testing procedure based on calendar times or information fractions. We apply the theoretical results to a group-sequential method for monitoring a clinical trial with early stopping for efficacy when the trial is designed to evaluate the joint effect on two correlated event-time outcomes. We illustrate the method with an application to a clinical trial and describe how to calculate the required sample sizes and numbers of events.

Keywords: Bivariate dependence, Error-spending method, Independent censoring, Logrank statistic, Non-fatal events, Normal approximation

1. Introduction

Event-time outcomes are commonly used for evaluating the effect of a test intervention compared with a control. In some disease areas, e.g. HIV, oncology or cardiovascular disease, several event-time outcomes are used as the primary endpoints to more completely characterize the effect of an intervention on participants. Clinical trials with more than one primary endpoint can be designed to evaluate effects for all of the outcomes (i.e. co-primary endpoints) or to evaluate effects for at least one outcome (i.e. multiple primary endpoints). However, clinical trials with multiple event-time outcomes can be expensive and resource intensive as they often require large numbers of participants, collection of massive amounts of data, and long duration of follow-up. The use of group-sequential designs has the potential to improve efficiency, i.e. offering potentially fewer required trial participants, shortening the duration of clinical trials, and thus reducing the costs. Several authors have discussed group-sequential designs for multiple continuous or binary endpoints (e.g., Tang et al. 1989; Cook and Farewell 1994; Jennison and Turnbull 2000; Kosorok et al. 2004; Hung et al. 2007; Glimm et al. 2009; Tamhane et al. 2010, 2012; Asakura et al. 2014). Group-sequential theory and methods for single event-time outcomes have been studied (e.g., Tsiatis 1982; Slud and Wei 1982; Gordon and Lachin 1990; Gu and Lai 1991; Tsiatis et al. 1995; Lin et al. 1996; Lai and Shih 2004; Gombay 2008; Wu and Xiong 2017), and extended for multiple event-time outcomes (e.g., Wei and Lachin 1984; Pocock et al. 1987; Wei et al. 1990; Lin 1991; Cook and Farewell 1994) and for paired event-time data (e.g., Murray 2000; Andrei and Murray 2005; Jung 2008). Despite the extensive literature in group-sequential methods, there is a lack in the theory regarding the asymptotic structure of the weighted logrank statistics when group-sequentially comparing multiple event-time outcomes. 
The absence of this theory slows the implementation and application of group-sequential methodologies and creates challenges in calculating the power and the required sample size for multiple event-time outcomes.

We discuss a fundamental theory and methodology for group-sequential designs based on the weighted logrank statistic when monitoring several correlated event-time outcomes in clinical trials. We focus on bivariate event-time data rather than general multivariate event times, and consider a scenario where both events are non-fatal, as an extension of an existing method (Sugimoto et al. 2013). When deriving the asymptotic distribution of the group-sequential logrank statistics, the two martingale components associated with the event-time outcomes are correlated on different time axes, so it is difficult to directly apply standard martingale theory for survival analysis, such as Rebolledo's central limit theorem. We overcome this challenge by combining a martingale approach with Itô's formula, and provide an asymptotic formula for the group-sequential bivariate logrank statistic. We then apply the asymptotic result to group-sequential designs to evaluate a joint effect on both outcomes. We illustrate the design methodology with a clinical trial example.

This paper is organized as follows: in Sect. 2 we describe how the group-sequential weighted logrank statistic is applied to bivariate event-time data in a clinical trial. In Sect. 3, we discuss the asymptotic distribution with an explicit variance-covariance form for the bivariate version of group-sequential weighted logrank statistic, fundamental for determining the information fraction for each outcome and evaluating the probability of rejecting the null hypotheses. In Sect. 4, we apply the asymptotic result to a group-sequential clinical trial evaluating the joint effect on the co-primary endpoints. We outline how both or one of the outcomes are monitored and evaluated. In Sect. 5, we summarize the findings and discuss their implications.

2. Group-sequential bivariate event-time data and the logrank statistic

Consider designing a randomized group-sequential clinical trial comparing two interventions evaluating bivariate event-time outcomes. Suppose that up to the planned maximum number of participants nL will be recruited during an entry period and followed to observe the bivariate survival outcomes. Further, suppose interim analyses are planned with the pre-specified maximum number of analyses L. Let nℓ and τℓ be the cumulative total number of participants and the analysis time at the ℓth interim analysis, respectively, with n1 ≤ ⋯ ≤ nL and τ1 < ⋯ < τL, and let [0, τA] be the period during which trial recruitment is performed or planned in advance. The group index of intervention is denoted by j = 2 if the ith participant belongs to the test group and j = 1 otherwise. Let n1ℓ and n2ℓ denote the numbers of participants assigned to the control and test interventions at the ℓth analysis, respectively (nℓ = n1ℓ + n2ℓ), where the fractions nj1/n1, …, njL/nL may often be assumed to be approximately equal within each intervention. For i = 1, …, nL and k = 1, 2, let Oi be the ith participant's entry time into the trial, let Tik be the ith participant's underlying continuous event time for the kth outcome, and let Ci be the ith participant's underlying censoring time common to the two outcomes, where Oi is the time origin of Tik and Ci and is usually generated from the uniform distribution on the entry period [0, τA]. The bivariate time (Ti1, Ti2) follows the joint survival distribution denoted by

$S_j(t, s) = P(t < T_{i1},\, s < T_{i2} \mid g_i = j),$

gi is the ith group index of intervention, and all of the Ci's follow the identical survival distribution C(t) = P(t < Ci) independently of (Ti1, Ti2). Thus, the ith right-censoring time at the ℓth analysis is Ci(ℓ), where

$C_i^{(\ell)} = \min\bigl(C_i,\ \max(\tau_\ell - O_i,\, 0)\bigr).$

We will assume no dropouts, so that Ci = τL − Oi is observed in a well-controlled trial. Suppose that Ti1 and Ti2 are non-competing event times, that is, neither event time is censored by the occurrence of the other event, which is typical in the case of non-fatal events (Sugimoto et al. 2013). For simplicity of notation, we write O1 ≤ ⋯ ≤ OnL, although we assume that Oi and Oi′ for i ≠ i′ are mutually independent. Hence, we have a series of cumulative data sets denoted by {(Ti1(ℓ), Ti2(ℓ), Δi1(ℓ), Δi2(ℓ), gi)}, i = 1, …, nℓ, ℓ = 1, …, L, where Tik(ℓ) = min(Tik, Ci(ℓ)) and Δik(ℓ) = 1{Tik < Ci(ℓ)} are the ith observable time and censoring indicator for the kth outcome at the ℓth analysis, respectively, and 1{·} is the indicator function. The information in (Tik(ℓ), Δik(ℓ)) is also represented by the counting process Nik(ℓ)(t) = 1{Tik(ℓ) ≤ t, Δik(ℓ) = 1} and the at-risk process Yik(ℓ)(t) = 1{Tik(ℓ) ≥ t}. Denote their sums over group j for the kth outcome by

$\bar N_{jk}^{(\ell)}(t) = \sum_{i=1}^{n_\ell} 1\{g_i = j\}\, N_{ik}^{(\ell)}(t), \qquad \bar Y_{jk}^{(\ell)}(t) = \sum_{i=1}^{n_\ell} 1\{g_i = j\}\, Y_{ik}^{(\ell)}(t),$

$\bar N_k^{(\ell)}(t) = \bar N_{1k}^{(\ell)}(t) + \bar N_{2k}^{(\ell)}(t)$ and $\bar Y_k^{(\ell)}(t) = \bar Y_{1k}^{(\ell)}(t) + \bar Y_{2k}^{(\ell)}(t)$.

Also, let λjk(t) and Λjk(t) be the marginal hazard function and its cumulative function for the kth event time Tik in the group j, respectively. Denote the marginal hazard ratio for the kth outcome between the two groups by ψk(t) = λ2k(t)/λ1k(t) and let ψ(t) = (ψ1(t), ψ2(t))T.
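As a small numerical illustration of the counting-process notation above, the pooled processes can be computed directly from observable times and censoring indicators. This is a sketch with hypothetical data (the array values below are made up for illustration):

```python
import numpy as np

# Hypothetical observable data for one outcome k at one analysis:
# times T~_ik (event or censoring, whichever came first) and indicators
# Delta_ik (1 = event observed before censoring).
times = np.array([2.0, 5.0, 3.0, 7.0])
deltas = np.array([1, 0, 1, 1])

def counting_process(t, times, deltas):
    """Pooled counting process N(t): number of events observed by time t."""
    return int(np.sum((times <= t) & (deltas == 1)))

def at_risk(t, times):
    """Pooled at-risk process Y(t): participants with observable time >= t."""
    return int(np.sum(times >= t))
```

N(t) is a step function jumping at observed event times, while Y(t) is non-increasing as participants fail or are censored.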

We are interested in testing sequentially either the hypothesis H0cp = H01 ∪ H02 (for a joint effect) or H0mp = H01 ∩ H02 (for at least one effect) using the weighted logrank statistics, where H0k is the single null hypothesis for the kth outcome, "ψk(t) = 1 for all t". For the bivariate event-time outcome with L maximum analyses, we have a set of 2L group-sequential weighted logrank statistics,

$\hat Z = (\hat Z_1(\tau_1), \ldots, \hat Z_1(\tau_L),\ \hat Z_2(\tau_1), \ldots, \hat Z_2(\tau_L))^{\mathsf T}$

composed of

$\hat Z_k(\tau_\ell) = \frac{\sqrt{n_\ell}\; U_k^{(\ell)}(\tau_\ell)}{\sqrt{\hat V_{kk}^{0(\ell)}(\tau_\ell)}}, \qquad k = 1, 2,\ \ell = 1, \ldots, L,$

where √nℓ Uk(ℓ)(t) is the weighted logrank process associated with the analysis time τℓ,

$U_k^{(\ell)}(t) = \int_0^t \hat H_k^{(\ell)}(s)\,\bigl\{d\hat\Lambda_{1k}^{(\ell)}(s) - d\hat\Lambda_{2k}^{(\ell)}(s)\bigr\},$

V̂kk0(ℓ)(t) is the conditional variance of √nℓ Uk(ℓ)(t) under the null hypothesis H0k,

$\hat V_{kk}^{0(\ell)}(t) = \int_0^t \hat H_k^{(\ell)}(s)^2 \left\{1 - \frac{d\bar N_k^{(\ell)}(s) - 1}{\bar Y_k^{(\ell)}(s) - 1}\right\} \left\{ \frac{d\hat\Lambda_{1k}^{(\ell)}(s)}{n_\ell^{-1}\,\bar Y_{2k}^{(\ell)}(s)} + \frac{d\hat\Lambda_{2k}^{(\ell)}(s)}{n_\ell^{-1}\,\bar Y_{1k}^{(\ell)}(s)} \right\}.$

Also, $\hat\Lambda_{jk}^{(\ell)}(t) = \int_0^t d\bar N_{jk}^{(\ell)}(s)/\bar Y_{jk}^{(\ell)}(s)$ is the Nelson-Aalen estimator at the ℓth analysis for the kth outcome in group j, and Ĥk(ℓ)(s) is the following function involving the weight Ŵk(ℓ) of the class K (Fleming and Harrington 1991):

$\hat H_k^{(\ell)}(s) = n_\ell^{-1}\, \hat W_k^{(\ell)}(s)\, \frac{\bar Y_{1k}^{(\ell)}(s)\, \bar Y_{2k}^{(\ell)}(s)}{\bar Y_k^{(\ell)}(s)},$

Ŵk(ℓ)(s) = f(Ŝk(ℓ)(s−)) or Ŵk(ℓ)(s) = f(nℓ−1Ȳk(ℓ)(s)), f(·) is a nonnegative bounded continuous function with bounded variation on [0, 1], and Ŝk(ℓ)(s) is the Kaplan-Meier estimator for the kth outcome in the pooled sample at the ℓth analysis time τℓ. A well-known fact is that the logrank and Prentice-Wilcoxon statistics use Ŵk(ℓ)(s) = 1 and Ŵk(ℓ)(s) = Ŝk(ℓ)(s−), respectively, where s− denotes a time just prior to s. The weight Ŵk(ℓ) should be selected to detect a clinically significant difference effectively. If there is no prior assumption on a specific clinically significant difference, the logrank statistic may be adopted, which can be interpreted as detecting a difference in the mean hazard rate. Also, one can build an optimal weight for testing into the design if pilot data or a registry database are available.
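A single-outcome, single-analysis version of Ẑk(τℓ) can be sketched as follows. This is a minimal illustrative implementation (not the authors' code) using the standard tie-corrected variance; the weight function receives the pooled Kaplan-Meier estimate just prior to each event time, so the default gives the logrank statistic and weight(s) = s the Prentice-Wilcoxon statistic:

```python
import numpy as np

def weighted_logrank(time, delta, group, weight=lambda s: 1.0):
    """Two-sample weighted logrank Z-statistic; group coded 1 (control) / 2 (test).
    `weight` is applied to the pooled Kaplan-Meier estimate just prior to each
    event time."""
    time, delta, group = map(np.asarray, (time, delta, group))
    U, V, km = 0.0, 0.0, 1.0
    for t in np.unique(time[delta == 1]):
        y1 = np.sum((time >= t) & (group == 1))   # at risk, control
        y2 = np.sum((time >= t) & (group == 2))   # at risk, test
        y = y1 + y2
        d1 = np.sum((time == t) & (delta == 1) & (group == 1))
        d = np.sum((time == t) & (delta == 1))    # pooled events at t
        w = weight(km)                            # KM just prior to t
        U += w * (d1 - d * y1 / y)                # observed minus expected
        if y > 1:
            V += w ** 2 * d * (y1 / y) * (y2 / y) * (y - d) / (y - 1)
        km *= 1.0 - d / y                         # update pooled KM
    return U / np.sqrt(V)
```

A positive value indicates a lower hazard in the test group, matching the one-sided rejection regions used later in the paper.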

3. Asymptotic structure of the group-sequential bivariate logrank statistic

Asymptotic results regarding the univariate statistic Ẑk(τℓ) and its group-sequential version (Ẑk(τ1), …, Ẑk(τL))T have been well developed (e.g., Andersen et al. 1993, X.2). For example, Lin (1991) shows that Ẑ converges to a multivariate normal distribution with zero means and discusses the estimated variance-covariance matrix, although an explicit form for the asymptotic covariance of Ẑ is not provided. Andrei and Murray (2005) provide a more detailed expression for the asymptotic covariance among weighted logrank statistics, but it is in the context of paired event-time data on the same time axes. To the best of our knowledge, a computable explicit form for the asymptotic variance-covariance of Ẑ is not available in the literature. Extending the result for Ẑ when L = 1, i.e. for the fixed-sample design (Sugimoto et al. 2013), we provide the asymptotic distribution of Ẑ with an explicit variance-covariance structure for the group-sequential design (Theorem 1).

We next provide details for expressing the asymptotic distribution of Ẑ. The limit forms of Ĥk(ℓ)(t) and nℓ−1Ȳjk(ℓ)(s) differ among the analysis time points, as the censoring distributions vary with each analysis time τℓ. Let Hk(ℓ)(t), yjk(ℓ)(t) and yk(ℓ)(t) denote the limit forms of Ĥk(ℓ)(t), njℓ−1Ȳjk(ℓ)(t) and nℓ−1Ȳk(ℓ)(t), respectively. Denote by âjℓ = njℓ/nℓ the sample rate of participants assigned to group j at the ℓth analysis, and by γ̂ℓ = nℓ/nL the sample size ratio between the ℓth and final analyses. Let →P denote convergence in probability. We assume the following regularity conditions.

  • Condition 1. For each j, ℓ, 0 < ajℓ < 1 is satisfied, where ajℓ is a constant such that âjℓ →P ajℓ as nℓ → ∞.

  • Condition 2. For each ℓ, 0 < γℓ ≤ 1 is satisfied with γ1 ≤ ⋯ ≤ γL, where γℓ is a constant such that γ̂ℓ →P γℓ as nL → ∞.

  • Condition 3. For each j, k, ℓ, yjk(ℓ)(t) > 0 on [0, τℓ) is satisfied with τℓ = sup{t : yjk(ℓ)(t) > 0}, where yjk(ℓ) is a deterministic function such that, as njℓ → ∞,

$\sup_{t \in [0, \tau_\ell]} \bigl| n_{j\ell}^{-1}\, \bar Y_{jk}^{(\ell)}(t) - y_{jk}^{(\ell)}(t) \bigr| \to_P 0.$

Under our setting, the convergences in Conditions 1–2 and Condition 3 are derived from the law of large numbers and the Glivenko-Cantelli theorem, respectively. Hence, we have γℓ = E(γ̂ℓ), ajℓ = E(âjℓ) and yk(ℓ)(t) = a1ℓ y1k(ℓ)(t) + a2ℓ y2k(ℓ)(t). Note that ajℓ may change with the analysis time τℓ, but each ajℓ should be fixed at the design stage to control the Type I error rate. The type of convergence in Condition 1 is usually replaced with a non-probabilistic version based on an allocation procedure. Condition 3 provides lim_{t→τℓ+0} yjk(ℓ)(t) = 0, which means that all individuals still at risk are censored at the ℓth analysis time τℓ.

Let Cℓ(t) be the survival function of the censoring times Ci(ℓ) when the analysis time is τℓ. Under the independent censoring assumption, we can easily show that

$y_{jk}^{(\ell)}(t) = C_\ell(t)\, S_{jk}(t) \quad \text{and} \quad y_k^{(\ell)}(t) = C_\ell(t)\, S_k^{(\ell)}(t), \quad (1)$

where Sjk(t) = P(t < Tik ∣ gi = j) is the marginal survival function of Tik in group j, and Sk(ℓ)(t) = a1ℓ S1k(t) + a2ℓ S2k(t). Hence, given the condition that the bivariate event-time outcomes are non-fatal, for t ≤ τℓ, we have

$H_k^{(\ell)}(t) = W_k^{(\ell)}(t)\, C_\ell(t)\, \frac{a_{1\ell}\, S_{1k}(t)\; a_{2\ell}\, S_{2k}(t)}{S_k^{(\ell)}(t)}, \quad (2)$

where Wk(ℓ)(t) is either f(Sk(ℓ)(t)) or f(yk(ℓ)(t)), corresponding to the selection of Ŵk(ℓ) in the class K, so that Hk(ℓ)(t) is a deterministic continuous function of bounded variation. In particular, when considering a typical group-sequential trial, we will assume that participants are recruited uniformly on [0, τA], followed up with no dropouts, and then analyzed at the times t = τ1, …, τL. Then we can specify the censoring survival distribution as

$C_\ell(t) = \begin{cases} 1, & 0 \le t \le \tau_\ell - \min(\tau_\ell, \tau_A), \\ (\tau_\ell - t)/\min(\tau_\ell, \tau_A), & \tau_\ell - \min(\tau_\ell, \tau_A) < t \le \tau_\ell, \\ 0, & \tau_\ell < t, \end{cases} \quad (3)$

(recall τA is the length of the entry period planned in advance). Hence, we have

$\gamma_\ell = \frac{\min(\tau_\ell, \tau_A)}{\tau_A} \quad (4)$

under the censoring assumption (3), because γℓ is the expected fraction of participants recruited by the analysis time τℓ.
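The censoring survival function (3) and the sample size fraction (4) are simple piecewise expressions under the uniform-entry, no-dropout assumption; a minimal sketch:

```python
def censoring_survival(t, tau_ell, tau_A):
    """C_ell(t) of Eq. (3): uniform entry on [0, tau_A], no dropouts,
    administrative censoring at the analysis time tau_ell.
    The tau_A = 0 branch degenerates to a step function at tau_ell."""
    m = min(tau_ell, tau_A)
    if t > tau_ell:
        return 0.0
    if t <= tau_ell - m:
        return 1.0
    return (tau_ell - t) / m

def gamma_ell(tau_ell, tau_A):
    """Eq. (4): expected fraction of the final sample recruited by tau_ell
    (assumes a positive entry period tau_A)."""
    return min(tau_ell, tau_A) / tau_A
```

For example, with an entry period of τA = 96 and an interim analysis at τℓ = 48, half of the planned maximum sample is expected to have been recruited.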

Suppose that Z* = (Z1*(τ1), …, Z1*(τL), Z2*(τ1), …, Z2*(τL))T follows the 2L-variate normal distribution N(Dnμ, Σ) with mean vector

$D_n \mu = D_n \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} = (\sqrt{n_1}\,\mu_{11}, \ldots, \sqrt{n_L}\,\mu_{1L},\ \sqrt{n_1}\,\mu_{21}, \ldots, \sqrt{n_L}\,\mu_{2L})^{\mathsf T}$

and variance-covariance matrix

$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$

where Dn = diag(√n1, √n2, …, √nL, √n1, √n2, …, √nL), μk = (μk1, …, μkL)T and Σkk′ = (σkk′ℓℓ′)ℓ,ℓ′=1,…,L. That is, for k, k′ = 1, 2 and ℓ, ℓ′ = 1, …, L, the means and covariances of Zk*(τℓ) and Zk′*(τℓ′) are written as

$E(Z_k^*(\tau_\ell)) = \sqrt{n_\ell}\,\mu_{k\ell} = \frac{\sqrt{n_\ell}\; m_k^{(\ell)}(\tau_\ell)}{\sqrt{V_{kk}^{0(\ell)}(\tau_\ell)}}, \qquad \mathrm{Cov}(Z_k^*(\tau_\ell), Z_{k'}^*(\tau_{\ell'})) = \sigma_{kk'\ell\ell'} = \frac{V_{kk'}(\tau_\ell, \tau_{\ell'} \mid \tau_\ell, \tau_{\ell'})}{\sqrt{V_{kk}^{0(\ell)}(\tau_\ell)\; V_{k'k'}^{0(\ell')}(\tau_{\ell'})}},$

where the elements mk(ℓ), Vkk0(ℓ) and Vkk′ are defined by

$m_k^{(\ell)}(t) = \int_0^t H_k^{(\ell)}(x)\,\{d\Lambda_{1k}(x) - d\Lambda_{2k}(x)\},$

$V_{kk}^{0(\ell)}(t) = \int_0^t H_k^{(\ell)}(x)^2 \left\{ \frac{d\Lambda_{1k}(x)}{a_{2\ell}\, y_{2k}^{(\ell)}(x)} + \frac{d\Lambda_{2k}(x)}{a_{1\ell}\, y_{1k}^{(\ell)}(x)} \right\},$

$V_{kk}(t, s \mid \tau_\ell, \tau_{\ell'}) = \sqrt{\frac{\gamma_{\ell \wedge \ell'}}{\gamma_{\ell \vee \ell'}}} \int_0^{t \wedge s} H_k^{(\ell)}(x)\, H_k^{(\ell')}(x) \left\{ \frac{d\Lambda_{1k}(x)}{a_1\, y_{1k}^{(\ell \vee \ell')}(x)} + \frac{d\Lambda_{2k}(x)}{a_2\, y_{2k}^{(\ell \vee \ell')}(x)} \right\},$

$V_{12}(t, s \mid \tau_\ell, \tau_{\ell'}) = \sqrt{\frac{\gamma_{\ell \wedge \ell'}}{\gamma_{\ell \vee \ell'}}} \int_0^t \!\! \int_0^s H_1^{(\ell)}(x)\, H_2^{(\ell')}(y)\, C_{\ell \wedge \ell'}(x \vee y) \left\{ \frac{A_1(dx, dy)}{a_1\, y_{11}^{(\ell)}(x)\, y_{12}^{(\ell')}(y)} + \frac{A_2(dx, dy)}{a_2\, y_{21}^{(\ell)}(x)\, y_{22}^{(\ell')}(y)} \right\},$

$A_j(dx, dy) = S_j(dx, dy) + S_j(x, dy)\, d\Lambda_{j1}(x) + S_j(dx, y)\, d\Lambda_{j2}(y) + S_j(x, y)\, d\Lambda_{j1}(x)\, d\Lambda_{j2}(y),$

$S_j(dx, dy) = S_j(x, y) - S_j(x{-}, y) - S_j(x, y{-}) + S_j(x{-}, y{-}),$

Sj(dx, y) = Sj(x, y) − Sj(x−, y), Sj(x, dy) = Sj(x, y) − Sj(x, y−), x ∨ y = max(x, y) and x ∧ y = min(x, y). The forms provided in (4), (1) and (2) are applied to these elements mk(ℓ), Vkk0(ℓ) and Vkk′. Under Conditions 1 and 3, it is well known that the univariate weighted logrank statistic can be normally approximated (e.g., Fleming and Harrington 1991, Theorem 7.2.1). We have the following asymptotic result for the group-sequential weighted logrank statistic Ẑ with two correlated outcomes.

Theorem 1 Suppose that Conditions 1–3 are satisfied (ajℓ ∈ (0, 1), γℓ ∈ (0, 1], τℓ = sup{t : y1k(ℓ)(t) y2k(ℓ)(t) > 0}), and that Sj(t, s), j = 1, 2 are continuous on (0, τL] × (0, τL]. Suppose that f(·) is a nonnegative bounded continuous function with bounded variation on [0, 1]. For sufficiently large nℓ's (n1 ≤ ⋯ ≤ nL), the distribution of the 2L-variate weighted logrank statistic Ẑ can then be approximated by N(Dnμ, Σ). That is, as nL ≥ ⋯ ≥ n1 → ∞, Ẑ − Dnμ̂ converges in distribution to Z* − Dnμ distributed as N(0, Σ), where μ̂ converges in probability to μ, μ̂ = (μ̂1T, μ̂2T)T, μ̂k = (μ̂k1, …, μ̂kL)T, μ̂kℓ = m̂k(ℓ)(τℓ)/√(V̂kk0(ℓ)(τℓ)),

$\hat m_k^{(\ell)}(t) = \int_0^t \hat H_k^{(\ell)}(x)\,\{d\Lambda_{1k}(x) - d\Lambda_{2k}(x)\},$

and 0 is the 2L-dimensional zero vector.

The proof is provided in Appendix A. In simulation studies evaluating the finite-sample behavior relevant to Theorem 1, we found that the asymptotic distribution works well in most practical situations, provided the event rate and sample size are not too small.

Several authors (e.g., Wei and Lachin 1984; Lin 1991) have indicated that the proof can be completed by the multivariate central limit theorem and the Cramér-Wald device, leading to asymptotic normality, but the asymptotic form of the variance-covariance was not clearly defined. The asymptotic form of variance-covariance as described in Theorem 1 has not been provided in the context of comparing independent groups with respect to several possibly correlated co-primary endpoints. In fact, when two martingale components with event-time outcomes are correlated on the different time axes as in this context, it is difficult to directly apply standard martingale theory, such as Rebolledo’s central limit theorem, for survival analysis (Fleming and Harrington 1991) considering how the covariance of martingale components converges. As a reference to overcome the problem, we provide our solution based on a martingale approach through the proof of Theorem 1 in Appendix A.

Based on the result of Theorem 1 that the distribution of the weighted logrank statistics Ẑ can be approximated by N(Dnμ, Σ), we can consider a group-sequential design and the asymptotic power for the testing procedure. In our setting, the distribution parameters of the mean vector μ and the diagonal blocks Σkk of Σ are determined by the marginal survival distributions Sjk(t), the censoring survival distributions Cℓ(t), and the sample rates ajℓ (k = 1, 2, j = 1, 2, ℓ = 1, …, L). In fact, the proportions γ1, …, γL of sample sizes are determined by τ1, …, τL and τA under the censoring assumption (3). On the other hand, determining the off-diagonal block Σ12 (and Σ21) of Σ requires an assumption on the joint survival distributions Sj(t, s), j = 1, 2. At the design stage of a trial, one convenient setting is to model Sj(t, s) by

$S_j(t, s) = C(S_{j1}(t), S_{j2}(s); \theta), \quad (5)$

where C(·, ·; θ) is a copula function (such as the Clayton, Gumbel and Frank models), and the association parameter θ characterizes the level of dependence between Sj1(t) and Sj2(s) and is a one-to-one function of a dependence measure (Hsu and Prentice 1996)

$\rho_j = \mathrm{Corr}[\Lambda_{j1}(T_{i1}), \Lambda_{j2}(T_{i2})] = \int_0^\infty \!\! \int_0^\infty S_j(t, s)\, d\Lambda_{j1}(t)\, d\Lambda_{j2}(s) - 1.$

The mean vector μ and the diagonal blocks Σkk depend on the assumptions about the censoring distribution and the hazard ratios ψ1(t) and ψ2(t). The weighted logrank statistic is nonparametric, so it is reasonable to assume the exponential distribution for the marginals Sj1(t) and Sj2(t) in one group. Given hazard ratios independent of time, such as under the proportional hazards hypothesis {ψk(t) ≡ ψk for t ∈ (0, τL], k = 1, 2} (μ = 0 if ψ1 = ψ2 = 1), the marginals Sj1(t) and Sj2(t) of one group determine those of the other group. Hence, a typical design calculation may be based on four exponential marginals Sjk(t), j = 1, 2, k = 1, 2, together with the setting of the analysis times (τ1, …, τL), the entry period (τA), the dependence measures (ρ1, ρ2) and the selection of a copula function. Numerically, the calculations involved in μ, Σ and ρj can be performed with sufficient precision using numerical integration methods, such as the trapezoidal rule or Simpson's rule (e.g., Sugimoto et al. 2013).
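As an example of such a numerical-integration step, the dependence measure ρj can be approximated with the trapezoidal rule for exponential margins, where dΛjk(t) = λjk dt. The rates, the Clayton parameter θ, and the grid/truncation settings below are illustrative choices, not values from any specific design:

```python
import numpy as np

def rho_dependence(S, lam1, lam2, upper=400.0, n=2001):
    """Approximate rho = lam1*lam2 * double-integral of S(t, s) dt ds - 1
    for exponential margins, via the composite trapezoidal rule on a
    truncated [0, upper] x [0, upper] grid."""
    t = np.linspace(0.0, upper, n)
    s = np.linspace(0.0, upper, n)
    T, Sg = np.meshgrid(t, s, indexing="ij")
    vals = S(T, Sg)
    h_s, h_t = s[1] - s[0], t[1] - t[0]
    inner = (vals[:, :-1] + vals[:, 1:]).sum(axis=1) * h_s / 2.0  # over s
    outer = (inner[:-1] + inner[1:]).sum() * h_t / 2.0            # then over t
    return lam1 * lam2 * outer - 1.0

lam1, lam2, theta = 0.03, 0.05, 2.0  # hypothetical rates and Clayton parameter

def clayton(t, s):
    # Clayton joint survival with exponential margins (late time-dependency)
    return (np.exp(theta * lam1 * t) + np.exp(theta * lam2 * s) - 1.0) ** (-1.0 / theta)

def independent(t, s):
    # product (independence) case: rho should be approximately 0
    return np.exp(-lam1 * t - lam2 * s)
```

The same routine applies to any copula-based joint survival function of form (5); only the callable S changes.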

In group-sequential designs, the concept of information fraction is important in determining the critical boundaries to preserve the overall Type I error rate. This concept can be generalized to the bivariate event-time setting by analogy with a single event-time outcome. The information at τℓ for each outcome can be characterized using an asymptotic form of the Fisher information, i.e., Ikℓ = nℓ Vkk0(ℓ)(τℓ), k = 1, 2, which corresponds to the information under the null hypothesis for the log of the hazard ratios (e.g., Jennison and Turnbull 2000; Yin 2012). As information accumulates up to τL, the standardized internal time RIkℓ for each outcome is defined as the fraction of the maximum information IkL, i.e., RIkℓ = Ikℓ/IkL. Theorem 1 provides that the components of Σ, under the null hypothesis ψ1 = ψ2 = 1, are obtained as

$V_{kk}^{0(\ell)}(t) = a_{1\ell}\, a_{2\ell} \int_0^t W_k^{(\ell)}(x)^2\, C_\ell(x)\, S_k(x)\, d\Lambda_k(x),$

$V_{kk'}(t, s \mid \tau_\ell, \tau_{\ell'}) = a_1 a_2 \sqrt{\frac{\gamma_{\ell \wedge \ell'}}{\gamma_{\ell \vee \ell'}}} \times \begin{cases} \displaystyle \int_0^{t \wedge s} W_k^{(\ell)}(x)\, W_k^{(\ell')}(x)\, C_{\ell \wedge \ell'}(x)\, S_k(x)\, d\Lambda_k(x), & \text{if } k = k', \\[2ex] \displaystyle \int_0^t \!\! \int_0^s W_k^{(\ell)}(x)\, W_{k'}^{(\ell')}(y)\, C_{\ell \wedge \ell'}(x \vee y)\, A(dx, dy), & \text{if } k \ne k', \end{cases}$

where we have Λk(x) = Λ1k(x) = Λ2k(x), Sk(x) = S1k(x) = S2k(x) and A(x, y) = A1(x, y) = A2(x, y) because ψ1 = ψ2 = 1. The result given for the single endpoint (e.g., Andersen et al. 1993, X.2) is that the asymptotic correlation between group-sequential weighted logrank statistics

$\mathrm{Corr}[Z_k^*(\tau_\ell), Z_k^*(\tau_{\ell'})] = \frac{V_{kk}(\tau_\ell, \tau_{\ell'} \mid \tau_\ell, \tau_{\ell'})}{\sqrt{V_{kk}(\tau_\ell, \tau_\ell \mid \tau_\ell, \tau_\ell)\; V_{kk}(\tau_{\ell'}, \tau_{\ell'} \mid \tau_{\ell'}, \tau_{\ell'})}},$

reduces to √(RIkℓ/RIkℓ′) for ℓ ≤ ℓ′ when the null hypothesis is true (ψ1 = ψ2 = 1) and Wk(ℓ)(s) is independent of ℓ, such as Wk(ℓ)(s) = 1. Theorem 1 describes that the correlation of (Ẑk(τℓ), Ẑk′(τℓ′)), including between different endpoints, for k, k′ = 1, 2 and 1 ≤ ℓ, ℓ′ ≤ L can be approximated by

$\mathrm{Corr}[Z_k^*(\tau_\ell), Z_{k'}^*(\tau_{\ell'})] = \frac{V_{kk'}(\tau_\ell, \tau_{\ell'} \mid \tau_\ell, \tau_{\ell'})}{\sqrt{V_{kk}(\tau_\ell, \tau_\ell \mid \tau_\ell, \tau_\ell)\; V_{k'k'}(\tau_{\ell'}, \tau_{\ell'} \mid \tau_{\ell'}, \tau_{\ell'})}},$

which is {1(k = k′) + 1(k ≠ k′) ρZ(τℓ, τℓ′)} √(RIkℓ/RIkℓ′) for ℓ ≤ ℓ′ if the null hypothesis is true and Wk(ℓ)(s) is independent of ℓ, where

$\rho_Z(\tau_\ell, \tau_{\ell'}) = \frac{V_{12}(\tau_\ell, \tau_{\ell'} \mid \tau_\ell, \tau_{\ell'})}{\sqrt{V_{11}(\tau_\ell, \tau_{\ell'} \mid \tau_\ell, \tau_{\ell'})\; V_{22}(\tau_\ell, \tau_{\ell'} \mid \tau_\ell, \tau_{\ell'})}}.$

4. Application to group-sequential design

We provide an application of the group-sequential design based on the results discussed in Sect. 3. As a motivating example, consider a major HIV treatment trial within the AIDS Clinical Trials Group, "A Phase III Randomized Comparative Study of Three Non-Nucleoside Reverse Transcriptase Inhibitor (NNRTI)-Sparing Antiretroviral Regimens for Treatment-Naïve HIV-1-Infected Volunteers (The ARDENT Study: Atazanavir, Raltegravir, or Darunavir with Emtricitabine/Tenofovir for Naïve Treatment)" (Lennox et al. 2014). The planned total sample size of 1800 (equally sized groups) was calculated for the pairwise comparison of the three regimens with respect to the two co-primary endpoints, "virologic failure" and "regimen failure due to tolerability", without taking into account the potential correlation, with 3% inflation as an adjustment for interim monitoring, under a study duration of 96 weeks after enrollment of the last subject, where the two failures are non-fatal. The study had (i) a power of 0.90 to establish non-inferiority in the risk reduction of virologic failure with a non-inferiority margin of 10% at α = 0.0125 for a one-sided test, assuming a virologic failure rate of 25% at 96 weeks, and (ii) a power of 0.85 to detect a 10% difference in regimen failure at α = 0.025 for a two-sided test, assuming a regimen failure rate of 45% at 96 weeks.

For illustrative purposes, suppose that the objective of the ARDENT trial was to test for the superiority of one intervention over the other on both co-primary endpoints (OC1: virologic failure, OC2: regimen failure). The allocation ratios are assumed to be constant across analyses (aj1 = ⋯ = ajL) and are not changed arbitrarily during the trial, as arbitrary changes may affect the Type I error rate and power. The significance level of 2.5% (α = 0.025) is allocated to each endpoint using a one-sided logrank test in a group-sequential setting, where the group sizes at each analysis are equal (a1ℓ = 0.5), and the survival rate at 96 weeks is assumed to be 75% and 85% for OC1, and 55% and 65% for OC2, in the control and test intervention groups, respectively (S11(96) = 0.75, S21(96) = 0.85; S12(96) = 0.55, S22(96) = 0.65). Two analyses are planned: the first at τ1 = 48 + τA and the final at τ2 = 96 + τA (L = 2). Letting ψ* = (ψ1*, ψ2*) be the hazard ratios of interest as a true value of ψ = (ψ1, ψ2), typical exponential assumptions lead to ψ* ≐ (0.565, 0.721) based on S2k(t) = S1k(t)^{ψk*} for the ARDENT study.

A superiority clinical trial with two event-time outcomes (OC1 and OC2) as "co-primary" endpoints is often designed to evaluate whether the test intervention is superior to the control on both outcomes. For two co-primary endpoints, the testing procedure is to test the union H0cp = H01 ∪ H02 of the two individual nulls against the alternative H1cp = H11 ∩ H12. For simplicity, suppose that the proportional hazards hypothesis holds, ψ1(t) ≡ ψ1 and ψ2(t) ≡ ψ2, and that the single null hypothesis H0k : ψk = 1 is tested versus H1k : ψk < 1 at significance level α for each k. When evaluating a joint effect on both endpoints within the context of group-sequential designs, one decision-making framework associated with hypothesis testing is to reject H0cp if statistical significance of the test intervention relative to the control is achieved for both endpoints, not necessarily simultaneously, at any analysis up to the final analysis (Asakura et al. 2014; Hamasaki et al. 2015). The power corresponding to this decision-making framework at ψ = ψ* is

$1 - \beta = P\Bigl(\Bigl\{\textstyle\bigcup_{\ell=1}^{L}\{Z_1(\tau_\ell) > c_{1\ell}(\alpha)\}\Bigr\} \cap \Bigl\{\textstyle\bigcup_{\ell=1}^{L}\{Z_2(\tau_\ell) > c_{2\ell}(\alpha)\}\Bigr\}\Bigr)$
$= 1 - P\Bigl(\textstyle\bigcap_{\ell=1}^{L}\{Z_1(\tau_\ell) \le c_{1\ell}(\alpha)\};\ \psi_1 = \psi_1^*\Bigr) - P\Bigl(\textstyle\bigcap_{\ell=1}^{L}\{Z_2(\tau_\ell) \le c_{2\ell}(\alpha)\};\ \psi_2 = \psi_2^*\Bigr) + P\Bigl(\textstyle\bigcap_{\ell=1}^{L}\{Z_1(\tau_\ell) \le c_{1\ell}(\alpha),\ Z_2(\tau_\ell) \le c_{2\ell}(\alpha)\};\ \psi = \psi^*\Bigr), \quad (6)$

where ckℓ(α) is the critical boundary at the ℓth analysis for the kth outcome, specified and determined in advance using any group-sequential method, as if each endpoint were a single primary endpoint, ignoring the other endpoint, analogously to the single-endpoint case. Note that only the marginal results of Theorem 1 are required for the standardized internal times RIkℓ, where RIkℓ does not depend on the correlation between OC1 and OC2 in the situation where both outcomes are non-fatal. Once RIkℓ, k = 1, 2 are determined, the critical boundaries can be calculated using group-sequential methods to control the overall Type I error rate in each marginal. Using the result of Theorem 1 that the distribution of Ẑ can be approximated by that of Z* under a large sample size, the power (6) can be approximately calculated as

$1 - \beta = 1 - \int_{-\infty}^{c_{11}^*} \!\!\cdots\!\! \int_{-\infty}^{c_{1L}^*} f_L(z_{11}, \ldots, z_{1L}; R_{11})\, dz_{11} \cdots dz_{1L} - \int_{-\infty}^{c_{21}^*} \!\!\cdots\!\! \int_{-\infty}^{c_{2L}^*} f_L(z_{21}, \ldots, z_{2L}; R_{22})\, dz_{21} \cdots dz_{2L} + \int_{-\infty}^{c_{11}^*} \!\!\cdots\!\! \int_{-\infty}^{c_{2L}^*} f_{2L}(z_{11}, \ldots, z_{1L}, z_{21}, \ldots, z_{2L}; R)\, dz_{11} \cdots dz_{1L}\, dz_{21} \cdots dz_{2L}, \quad (7)$

where fm(·; A) is the m-variate normal density function with zero mean vector and variance-covariance matrix A, and R is the correlation matrix given by

$R = \begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix} = S^{-1/2}\, \Sigma\, S^{-1/2},$

S = diag(σ1111, σ1122, …, σ11LL, σ2211, σ2222, …, σ22LL), and the integration limits ckℓ* are

$c_{k\ell}^* = \frac{1}{\sqrt{\sigma_{kk\ell\ell}}}\,\bigl\{c_{k\ell}(\alpha) - \sqrt{\gamma_\ell\, n_L}\; \mu_{k\ell}\bigr\}, \qquad k = 1, 2;\ \ell = 1, \ldots, L,$

and recall nℓ = γℓ nL.
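For L = 2, the three multivariate normal integrals in (7) can be evaluated with standard routines. The sketch below (not the authors' code) uses scipy and, for illustration, the within-endpoint correlations and OF-type boundaries reported in Table 1 together with a zero between-endpoint block (the ρ = 0 case):

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def joint_power(c1, c2, mu1, mu2, R):
    """Evaluate the L = 2 version of (7):
    1 - P(no rejection for OC1) - P(no rejection for OC2) + P(no rejection for either),
    where 'no rejection' means Z stays at or below its boundary at both analyses.
    Ordering of R: (Z1(tau1), Z1(tau2), Z2(tau1), Z2(tau2))."""
    up1 = np.asarray(c1, dtype=float) - np.asarray(mu1, dtype=float)
    up2 = np.asarray(c2, dtype=float) - np.asarray(mu2, dtype=float)
    p1 = mvn.cdf(up1, mean=np.zeros(2), cov=R[:2, :2])
    p2 = mvn.cdf(up2, mean=np.zeros(2), cov=R[2:, 2:])
    p12 = mvn.cdf(np.concatenate([up1, up2]), mean=np.zeros(4), cov=R)
    return 1.0 - p1 - p2 + p12

# Within-endpoint correlations and OF-type boundaries as reported for the
# ARDENT illustration; the between-endpoint block is set to zero (rho = 0).
R = np.eye(4)
R[0, 1] = R[1, 0] = 0.7260
R[2, 3] = R[3, 2] = 0.7507
of1 = [2.8616, 1.9718]  # boundaries for OC1
of2 = [2.7576, 1.9761]  # boundaries for OC2

# With zero drift (the null), joint_power returns the probability of
# declaring joint significance, roughly alpha**2 in this independent case.
null_prob = joint_power(of1, of2, [0.0, 0.0], [0.0, 0.0], R)
```

Under the alternative, the drift vectors mu1 and mu2 would be set to √(γℓ nL) μkℓ from the display above.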

Returning to the ARDENT study, let τA = 0, similarly to the manner assumed by Lennox et al. (2014). Although τA is not zero in fact, this selection of τA provides a conservative result and is reasonable in practice because of the difficulty of estimating the feasible entry period. The two fixed analysis times are (τ1, τ2) = (48, 96), and the censoring distribution (3) under τA = 0 simplifies to

$C_1(t) = \begin{cases} 1, & 0 \le t \le \tau_1 = 48, \\ 0, & \tau_1 < t, \end{cases} \qquad C_2(t) = \begin{cases} 1, & 0 \le t \le \tau_2 = 96, \\ 0, & \tau_2 < t. \end{cases} \quad (8)$

We select the weight function Wk(ℓ)(s) = 1, corresponding to the logrank statistic. Under these configurations with the exponential marginal assumption, we calculate RIkℓ, whose values at 48 weeks are 0.5314 for OC1 and 0.5669 for OC2, and then determine ckℓ(α) by the O'Brien-Fleming-type function (O'Brien and Fleming 1979) using the Lan-DeMets error-spending method (Lan and DeMets 1983), as shown in Table 1, which also includes the Pocock-type boundary (Pocock 1977). The power (7) is then calculated, given the settings of the joint survival functions Sj(t, s) and the correlations ρj between OC1 and OC2. We use the copula model (5) to identify the joint survival distribution Sj(t, s). In particular, we utilize the Clayton copula (late time-dependency) (Clayton 1976) and the Gumbel copula (early time-dependency) (Hougaard 1986); that is, we set, under the Clayton copula,

$S_j(t, s) = \bigl(e^{\theta_j \lambda_{j1} t} + e^{\theta_j \lambda_{j2} s} - 1\bigr)^{-1/\theta_j}$

and, under the Gumbel copula,

$S_j(t, s) = \exp\Bigl(-\bigl\{(\lambda_{j1} t)^{1/\theta_j} + (\lambda_{j2} s)^{1/\theta_j}\bigr\}^{\theta_j}\Bigr),$

the marginal hazard rates are given by λ1k = − log S1k(96)/96 and λ2k = λ1k ψk*, and the association parameter θj is determined by the value of ρj (see Sugimoto et al. (2013) for more details). For simplicity, we set the correlations as ρ1 = ρ2 ≡ ρ and consider ρ = 0, 0.1, …, 0.9 and 0.95. Based on (7), the total maximum sample size (MSS) required for the final analysis is the smallest integer nL for which (7) is not less than the desired power at the prespecified ψ = ψ*. For example, using the method with the above parameter configuration and setting, for ρ = 0, R11, R22, and R12 are approximately calculated as

$R_{11} = \begin{pmatrix} 1 & 0.7260 \\ 0.7260 & 1 \end{pmatrix}, \quad R_{22} = \begin{pmatrix} 1 & 0.7507 \\ 0.7507 & 1 \end{pmatrix} \quad \text{and} \quad R_{12} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},$

respectively, and for ρ = 0.8

$R_{11} = \begin{pmatrix} 1 & 0.7260 \\ 0.7260 & 1 \end{pmatrix}, \quad R_{22} = \begin{pmatrix} 1 & 0.7507 \\ 0.7507 & 1 \end{pmatrix} \quad \text{and} \quad R_{12} = \begin{pmatrix} 0.2159 & 0.1569 \\ 0.1622 & 0.3341 \end{pmatrix},$

respectively. Once the MSS is computed, the maximum event number (MEN) dkL is calculated using dkL = nL PkL(event), where Pkℓ(event) is the probability that the event of the kth outcome occurs in the time interval (0, τℓ] and can be calculated, for example, based on Collett (2003) or Sugimoto et al. (2017, Appendix B). Also, the average event number (AEN) d̄k is calculated using hypothetical reference values, similarly to Asakura et al. (2014), by

$\bar d_k = \sum_{\ell=1}^{L-1} d_{k\ell}\, P(\mathrm{stop}_\ell) + d_{kL}\Bigl(1 - \sum_{\ell=1}^{L-1} P(\mathrm{stop}_\ell)\Bigr),$

where dkℓ = nℓ Pkℓ(event), and P(stopℓ) is the stopping probability, defined as the probability of crossing the critical boundaries at the ℓth interim analysis under the true values ψ* of the intervention effects. The AEN provides information on the number of events anticipated in a group-sequential design in order to reach a decision point.
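The AEN formula above is simply a weighted average of the planned event numbers; a minimal sketch with hypothetical inputs:

```python
def average_event_number(d, p_stop):
    """AEN d-bar_k: d = [d_k1, ..., d_kL] planned event numbers at each analysis,
    p_stop = [P(stop_1), ..., P(stop_{L-1})] early-stopping probabilities at the
    interim analyses; the remaining probability mass is assigned to the final
    analysis."""
    interim = sum(dk * p for dk, p in zip(d[:-1], p_stop))
    return interim + d[-1] * (1.0 - sum(p_stop))
```

For example, with L = 2, 100 events planned at the interim, 200 at the final analysis, and a 40% chance of early stopping, the AEN is 0.4 × 100 + 0.6 × 200 = 160 events.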

Table 1.

Calculated information fractions and the corresponding O’Brien-Fleming-type (OF) and Pocock-type (PC) critical boundaries.

Analysis #   Calendar Time   OC1 Information   OC1 OF-type   OC1 PC-type   OC2 Information   OC2 OF-type   OC2 PC-type
             (weeks)         Fraction          Bound         Bound         Fraction          Bound         Bound
1            48              0.5314            2.8616        2.1390        0.5669            2.7576        2.1200
2            96              1.0000            1.9718        2.2110        1.0000            1.9761        2.2215
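The OF-type bounds in Table 1 can be reproduced approximately from the Lan-DeMets error-spending construction. The sketch below assumes the O'Brien-Fleming-type spending function and, as a simplification, the null correlation √RI between the two analyses:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal as mvn
from scipy.optimize import brentq

def of_spending(t, alpha):
    """O'Brien-Fleming-type spending function of Lan-DeMets:
    alpha*(t) = 2(1 - Phi(z_{1-alpha/2} / sqrt(t)))."""
    return 2.0 * (1.0 - norm.cdf(norm.ppf(1.0 - alpha / 2.0) / np.sqrt(t)))

def two_stage_boundaries(t1, alpha, corr):
    """One-sided boundaries (c1, c2) for L = 2 analyses at information
    fractions (t1, 1): spend of_spending(t1, alpha) at stage 1 and the
    remainder at stage 2, given Corr(Z(tau_1), Z(tau_2)) = corr."""
    a1 = of_spending(t1, alpha)
    c1 = norm.ppf(1.0 - a1)
    cov = np.array([[1.0, corr], [corr, 1.0]])
    # choose c2 so that the cumulative one-sided error equals alpha
    g = lambda c2: 1.0 - mvn.cdf(np.array([c1, c2]), mean=np.zeros(2), cov=cov) - alpha
    return c1, brentq(g, 0.5, 5.0)

# OC1 setting of Table 1: information fraction 0.5314 at 48 weeks
c1, c2 = two_stage_boundaries(0.5314, 0.025, np.sqrt(0.5314))
# c1 and c2 land near the reported OF-type bounds 2.8616 and 1.9718
```

The same call with t1 = 0.5669 gives the OC2 bounds; a Pocock-type spending function would replace of_spending.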

Table 2 summarizes the MSS, MEN, AEN, and empirical power for the late time-dependent association. The empirical power under the calculated MSS achieves the targeted power. First, the group-sequential design provides a considerably smaller AEN than the fixed-sample design in every case, which is preferable in terms of cost savings. As expected, the MSS decreases with higher positive correlation, but the reduction is small. Power and sample size are less impacted by the correlation than by the hazard ratios. The MSS is largely determined by the hazard ratio closer to 1, and it does not vary with the correlation when one hazard ratio is considerably smaller (or larger) than the other. Similarly, there is little difference in the MEN between the group-sequential and fixed-sample designs. Based on these results, for the ARDENT study, the MSS is nearly determined by OC2. We describe only the results assuming the late time-dependent association. Similar patterns are observed in the case of an early time-dependent association (Gumbel copula); the design planning results under the Gumbel copula are provided in Table B.1 of Appendix B. We also provide considerations and results on Type I error rate control in Appendix B (Tables B.2 and B.3).

Table 2.

Sample sizes, numbers of events, and empirical powers in a group-sequential trial with two co-primary outcomes (Clayton copula).

Corr ρj   FSS*   MSS   MEN (OC1)   MEN (OC2)   AEN (OC1)   AEN (OC2)   Both EP   At least one EP   Single EP (OC1)   Single EP (OC2)
0.0       830    835   168         335         141         293         80.6      99.3              95.3              84.6
0.1       829    833   167         334         140         292         80.5      99.2              95.2              84.5
0.2       827    832   167         333         140         291         80.4      99.2              95.2              84.4
0.3       826    831   167         333         140         291         80.7      99.1              95.3              84.5
0.4       824    829   166         332         139         291         80.5      99.0              95.2              84.3
0.5       822    827   166         331         139         290         80.6      99.0              95.1              84.3
0.6       820    825   166         331         139         290         80.6      99.0              95.1              84.2
0.7       816    821   165         329         138         288         80.5      98.5              95.0              83.9
0.8       811    816   164         327         137         287         80.5      98.2              95.0              83.7
0.9       801    806   162         323         136         284         80.3      97.5              94.6              83.2
0.95      792    797   160         319         134         280         80.4      96.8              94.4              82.8
*

FSS:Sample sizes required for fixed-sample design.

The trial is designed to evaluate whether the intervention is superior to the control with respect to both virologic failure (OC1) and regimen failure (OC2), with 80% power at the 2.5% significance level of a one-sided logrank test, where two analyses are planned at fixed calendar times of 48 and 96 weeks. For both outcomes, the critical boundaries are determined using the Lan-DeMets error-spending method with the O'Brien-Fleming-type function. The bivariate exponential distribution is modeled using the Clayton copula. Empirical power is calculated using 100,000 repetitions. The marginal powers for OC1 and OC2 are calculated under the calculated maximum sample size.
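The simulation setup above requires bivariate exponential event times linked by a Clayton copula. The following sketch is our own illustration of the standard conditional-inversion sampler (the hazards `lam1`, `lam2` and the Clayton parameter `theta` are placeholder arguments, not the paper's configuration); for the Clayton family, Kendall's tau equals θ/(θ + 2), so θ can be chosen to match a target rank correlation.

```python
import numpy as np

def clayton_bivariate_exponential(n, lam1, lam2, theta, rng=None):
    """Draw n pairs of exponential event times whose dependence is a
    Clayton copula with parameter theta > 0 (late time-dependent association)."""
    rng = np.random.default_rng(rng)
    u1 = rng.uniform(size=n)
    v = rng.uniform(size=n)
    # Conditional inversion: sample u2 given u1 from the Clayton copula.
    u2 = ((v ** (-theta / (1.0 + theta)) - 1.0) * u1 ** (-theta) + 1.0) ** (-1.0 / theta)
    # Invert the exponential survival functions S(t) = exp(-lam * t).
    t1 = -np.log(u1) / lam1
    t2 = -np.log(u2) / lam2
    return t1, t2
```

Since Kendall's tau is invariant under the monotone marginal transformations, the event-time pairs inherit the copula's rank correlation; for example, θ = 2 implies a Kendall's tau of 0.5 (note that the paper's ρj may be measured on a different correlation scale).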

In this illustration, the interim analysis was planned at prespecified calendar times because participants are recruited in calendar time. Alternatively, one may design a survival trial based on information fractions, as the interim summary statistics depend on the amount of information available; for example, the first analysis may be planned when 50% of the maximum number of events for one endpoint has been observed. The proposed method can be applied to information-fraction-based designs as well. Table B.4 in Appendix B summarizes the statistics required for such designs, including the corresponding calendar time, variance, and information fraction for one endpoint relative to the information fraction for the other endpoint.

5. Discussion

A single primary endpoint may or may not provide a comprehensive picture of the important effects of the intervention. For this reason, many investigators prefer to design clinical trials with more than one primary endpoint (Dmitrienko et al. 2009). Multiple primary endpoints offer an attractive design feature as they capture a more complete characterization of the effect of an intervention on short- and long-term outcomes. For example, the Ambassador trial (NCT03244384) was designed to test the effect of pembrolizumab on overall survival and disease-free survival in patients with bladder cancer. In addition, it is common in oncology trials to use two primary endpoints to study the effect of treatment in different patient populations. For example, SWOG S0819 (Herbst et al. 2018) was designed to test the effect of cetuximab plus chemotherapy on overall survival in all patients with lung cancer and to study the impact of the combination therapy on progression-free survival in patients who were EGFR positive. However, for both multiple primary and co-primary endpoints, it is non-trivial to control the Type I and Type II error rates when the endpoints are correlated. Evaluating the impact of the correlations among the endpoints is therefore important in the design and analysis of clinical trials with multiple endpoints. Although methodologies to address continuous or binary endpoints in fixed-sample designs are well developed, methodologies for event-time endpoints are limited (Halabi 2012; Rauch et al. 2016), especially in the group-sequential setting.

In this paper, we discuss basic theory and methods for group-sequential designs in clinical trials with two non-fatal event-time outcomes. We present the asymptotic form and a computing method for the variance-covariance function of the two sets of group-sequential weighted logrank statistics, which is fundamental for determining the information fraction for each outcome and for evaluating the probability of rejecting the null hypotheses. Several authors have developed methods for group-sequential designs; however, in the context of comparing co-primary or multiple endpoints between groups, the form of the asymptotic variance-covariance matrix had not been derived from the data correlation structure between the two event times. Descriptions based on the multivariate central limit theorem and the Cramer-Wold device did not clearly provide the asymptotic form of the variance-covariance matrix or its connection with a martingale approach, which causes challenges when calculating the power and required sample size for a trial design. Although a covariance form similar to Theorem 1 has been reported in Murray (2000) and Andrei and Murray (2005), their contexts differ from ours, being paired logrank statistics on the same time axis. When two martingale components with event-time outcomes are correlated on different time axes, it is difficult to apply the standard martingale theory of survival analysis directly. We overcome these difficulties by deriving a two-dimensional Volterra integral equation using the discrete Ito formula (Jacod and Shiryaev 2003) within a martingale approach, as provided in Appendix A in the proof of Theorem 1. The simulation results indicate that the asymptotic distribution of Theorem 1 works well in most practical situations as long as the event rate and sample size are not too small.

We apply the asymptotic result to group-sequential methodology for monitoring both or one of the event-time outcomes when the trial is designed to evaluate a joint effect on both outcomes. The developed methods have several advantages. First, they provide an approach for determining the information and information fraction for two event-time outcomes. Second, they offer an opportunity to evaluate the relationship between two event-time endpoints and how it impacts the decision to reject the null hypothesis, in terms of the Type I error rate, power, sample size, and number of events. Finally, they provide insights on how to optimally choose a strategy for monitoring two event-time endpoints. We outline how to calculate the rejection probability, sample size, and number of events, and illustrate the methods using a clinical trial example in HIV. Under the calculated total maximum sample size for a joint effect on the two outcomes, the monitoring method achieves the targeted power and adequately controls the Type I error rate; the empirical Type I error rate was evaluated using Monte-Carlo simulation, and the methods also behaved well in the other practical situations examined. The objectives of the methods are to incorporate the correlation between the two event-time outcomes into the power, Type I error evaluation, and sample size calculation, and to investigate how these behave as the correlation varies. The strength and shape of the association may be estimated from external or internal pilot data, but are usually unknown.

We discuss the situation where both event-time outcomes are non-fatal. Sugimoto et al. (2017) discussed the fixed-sample design when one event is fatal, and when both are fatal. An extension of their work to a group-sequential setting will require an extensive study to modify the variance-covariance structure of the group-sequential logrank statistics in order to handle dependent censoring. Research on group-sequential designs under such situations is an important area for future studies.

Acknowledgements

We thank one reviewer and the Associate Editor for their comments. Research reported in this publication was supported by JSPS KAKENHI grant numbers JP17K00054 and JP17K00069, the Project Promoting Clinical Trials for Development of New Drugs (18lk0201061h0002/18lk0201061h0202) from the Japan Agency for Medical Research and Development (AMED) and the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number UM1AI068634. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Appendix

A. Proof of Theorem 1

Let $M_{ik}^{(\ell)}(t)=N_{ik}^{(\ell)}(t)-\int_0^t Y_{ik}^{(\ell)}(x)\,d\Lambda_{g_i k}(x)$ and let $\{\mathcal{F}_{k,t}^{(\ell)}:t\ge 0\}$ be the standard filtration generated by the history through time $t$ for the $k$th outcome and the $\ell$th analysis ($\mathcal{F}_{k,t}^{(\ell)}$ is the smallest $\sigma$-algebra generated by $\{N_{ik}^{(\ell)}(x),\,N_{ik}^{C(\ell)}(x):0\le x\le t,\ i=1,\dots,n_\ell\}$, where $N_{ik}^{C(\ell)}(t)=1\{T_{ik}^{(\ell)}\le t,\,\Delta_{ik}^{(\ell)}=0\}$ is the censoring counting process). As is well known, $M_{ik}^{(\ell)}(t)$ has the $\mathcal{F}_{k,t}^{(\ell)}$-martingale property. We discuss the asymptotic behavior using the decomposition of the weighted logrank process $U_k^{(\ell)}(t)=\hat m_k^{(\ell)}(t)+n_\ell^{-1/2}M_k^{(\ell)}(t)$ from the definition of $U_k^{(\ell)}$, where

$$M_k^{(\ell)}(t)=\int_0^t \hat H_k^{(\ell)}(x)\,n_\ell^{1/2}\sum_{i=1}^{n_L}d\tilde M_{ik}^{(\ell)}(x)=\int_0^t \hat H_k^{(\ell)}(x)\,n_\ell^{1/2}\,d\tilde M_k^{(\ell)}(x),$$
$$d\tilde M_k^{(\ell)}(x)=\frac{d\bar M_{2k}^{(\ell)}(x)}{\bar Y_{2k}^{(\ell)}(x)}-\frac{d\bar M_{1k}^{(\ell)}(x)}{\bar Y_{1k}^{(\ell)}(x)},\qquad d\bar M_{jk}^{(\ell)}(x)=\sum_{i=1}^{n_\ell}1\{g_i=j\}\,dM_{ik}^{(\ell)}(x),$$
$$d\tilde M_{ik}^{(\ell)}(x)=1\{i\le n_\ell\}\left\{\frac{1\{g_i=2\}}{\bar Y_{2k}^{(\ell)}(x)}-\frac{1\{g_i=1\}}{\bar Y_{1k}^{(\ell)}(x)}\right\}dM_{ik}^{(\ell)}(x),$$

and $M_k^{(\ell)}(t)$ is an $\mathcal{F}_{k,t}^{(\ell)}$-martingale because $\hat H_k^{(\ell)}(t)$ is $\mathcal{F}_{k,t}^{(\ell)}$-predictable.

Let $\hat Z=(\hat Z_1(\tau_1),\dots,\hat Z_1(\tau_L),\hat Z_2(\tau_1),\dots,\hat Z_2(\tau_L))^{\mathsf T}$ and let $\hat Z_k^{*}(\tau_\ell)$ be $\hat Z_k(\tau_\ell)$ whose denominator is replaced by its limit version,

$$\hat Z_k^{*}(\tau_\ell)=\frac{n_\ell^{1/2}\,U_k^{(\ell)}(\tau_\ell)}{\sqrt{V_{kk}^{0(\ell)}(\tau_\ell)}}=n_\ell^{1/2}\hat\mu_{k\ell}+\xi_{k\ell}M_k^{(\ell)}(\tau_\ell),$$

where we write $\xi_{k\ell}=1/\sqrt{V_{kk}^{0(\ell)}(\tau_\ell)}$ for simplicity. The distribution of $\hat Z-D_n\hat\mu$ is asymptotically equivalent to that of

$$\big(\xi_{11}M_1^{(1)}(\tau_1),\dots,\xi_{1L}M_1^{(L)}(\tau_L),\,\xi_{21}M_2^{(1)}(\tau_1),\dots,\xi_{2L}M_2^{(L)}(\tau_L)\big)^{\mathsf T}$$

because the dominated convergence theorem applies under the uniform convergence $\hat V_{kk}^{0(\ell)}(\tau_\ell)\xrightarrow{P}V_{kk}^{0(\ell)}(\tau_\ell)$ over $\ell=1,\dots,L$ as $n_L\ge\cdots\ge n_1\to\infty$. It therefore suffices to study the covariances of the $M_k^{(\ell)}$'s to characterize the distribution of $\hat Z-D_n\hat\mu$.

In the proof hereafter, it is sufficient to consider the case $L=2$. As a function related to the characteristic function of $M_k^{(\ell)}(t)$, define

$$G_k^{(\ell)}(t)=\exp\!\left(\mathrm{i}\,z_{k\ell}M_k^{(\ell)}(t)+\frac{z_{k\ell}^2}{2}\big\langle M_k^{(\ell)},M_k^{(\ell)}\big\rangle(t)\right)$$

for a real non-zero $z_{k\ell}$ and $\mathrm{i}=\sqrt{-1}$, where $\langle m_1,m_2\rangle$ denotes the predictable covariation process of two martingales $m_1$ and $m_2$. In this case we have

$$\big\langle M_k^{(\ell)},M_k^{(\ell)}\big\rangle(t)=n_\ell\int_0^t \hat H_k^{(\ell)}(x)^2\left\{\frac{d\Lambda_{1k}(x)}{\bar Y_{1k}^{(\ell)}(x)}+\frac{d\Lambda_{2k}(x)}{\bar Y_{2k}^{(\ell)}(x)}\right\},$$

following the standard martingale theory of survival analysis (see Fleming and Harrington (1991)). The consistency of $\hat S_{jk}^{(\ell)}$, the Glivenko-Cantelli theorem, and Conditions 1 and 3 imply $\sup_{0\le x\le\tau_\ell}|\hat H_k^{(\ell)}(x)-H_k^{(\ell)}(x)|\xrightarrow{P}0$ and

$$\sup_{0\le x\le\tau_\ell}\left|\frac{\hat H_k^{(\ell)}(x)}{n_\ell^{-1}\bar Y_{jk}^{(\ell)}(x)}-h_{jk}^{(\ell)}(x)\right|\xrightarrow{P}0\quad\text{as }n_\ell\to\infty,\qquad(9)$$

where

$$h_{jk}^{(\ell)}(x)=\frac{H_k^{(\ell)}(x)}{a_j\,y_{jk}^{(\ell)}(x)}=W_k^{(\ell)}(x)\,\frac{a_{j'}S_{j'k}(x)}{S_k^{(\ell)}(x)},\qquad j'=3-j,$$

and note that $0\le H_k^{(\ell)}(x)<\infty$ for $x\in[0,\tau_\ell]$, $H_k^{(\ell)}(x)=0$ for $\tau_\ell<x$, and $0\le h_{jk}^{(\ell)}(x)<\infty$ for all $x$. The univariate asymptotic result provides $E(e^{\mathrm{i}z_{k\ell}M_k^{(\ell)}(t)})\to\exp\!\big(-\frac{z_{k\ell}^2}{2}V_{kk}(t\wedge\tau_\ell,\,t\wedge\tau_\ell)\big)$ as $n_\ell\to\infty$, which corresponds to the convergences

$$E\big(G_k^{(\ell)}(t)\big)\to1\quad\text{and}\quad\big\langle M_k^{(\ell)},M_k^{(\ell)}\big\rangle(t)\xrightarrow{P}V_{kk}(t\wedge\tau_\ell,\,t\wedge\tau_\ell)$$

(Nishiyama 2011). For different outcomes $k\ne k'$, it is difficult to show joint normality with correlation between $M_k^{(\ell)}$ and $M_{k'}^{(\ell')}$ using the standard martingale theory of counting processes (Fleming and Harrington 1991; Andersen et al. 1993). However, we overcome this challenge by applying Ito's formula. The discrete Ito formula (Jacod and Shiryaev 2003; Huang and Strawderman 2006) provides the decomposition of $G_k^{(\ell)}(t)$,

$$G_k^{(\ell)}(t)-1=\sum_{j=1,2}\int_0^t G_k^{(\ell)}(x)\,\tilde H_{jk}^{a(\ell)}(x)\,d\bar M_{jk}^{(\ell)}(x)+\sum_{j=1,2}\int_0^t G_k^{(\ell)}(x)\,\tilde H_{jk}^{(\ell)}(x)\,\bar Y_{jk}^{(\ell)}(x)\,d\Lambda_{jk}(x),\qquad(10)$$

where, with $\mathrm{i}_1=-\mathrm{i}$ and $\mathrm{i}_2=\mathrm{i}$,

$$\tilde H_{jk}^{a(\ell)}(x)=\exp\!\left(\mathrm{i}_j z_{k\ell}\frac{\sqrt{n_\ell}\,\hat H_k^{(\ell)}(x)}{\bar Y_{jk}^{(\ell)}(x)}\right)-1,$$
$$\tilde H_{jk}^{(\ell)}(x)=\exp\!\left(\mathrm{i}_j z_{k\ell}\frac{\sqrt{n_\ell}\,\hat H_k^{(\ell)}(x)}{\bar Y_{jk}^{(\ell)}(x)}\right)-1-\mathrm{i}_j z_{k\ell}\frac{\sqrt{n_\ell}\,\hat H_k^{(\ell)}(x)}{\bar Y_{jk}^{(\ell)}(x)}+\frac{z_{k\ell}^2}{2}\left(\frac{\sqrt{n_\ell}\,\hat H_k^{(\ell)}(x)}{\bar Y_{jk}^{(\ell)}(x)}\right)^2.$$

The expectation of the right-hand side of (10) converges to zero as $n_\ell\to\infty$, because

$$E\!\left(\int_0^t G_k^{(\ell)}(x)\tilde H_{jk}^{a(\ell)}(x)\,d\bar M_{jk}^{(\ell)}(x)\right)=0\quad\text{and}\quad E\!\left(\int_0^t G_k^{(\ell)}(x)\tilde H_{jk}^{(\ell)}(x)\bar Y_{jk}^{(\ell)}(x)\,d\Lambda_{jk}(x)\right)\to0\qquad(11)$$

by the martingale property of $\bar M_{jk}^{(\ell)}$ and the Lindeberg condition, respectively. In fact, using the integrable martingale property of $G_k^{(\ell)}(x)$ and the well-known inequality

$$\left|e^{\mathrm{i}c}-1-\mathrm{i}c+\tfrac{1}{2}c^2\right|\le 1\{|c|\le\varepsilon\}\,|c|^3+1\{|c|>\varepsilon\}\,c^2$$

for any real $c$, the latter result of (11) is obtained as

$$\left|E\!\left(\int_0^t G_k^{(\ell)}(x)\tilde H_{jk}^{(\ell)}(x)\bar Y_{jk}^{(\ell)}(x)\,d\Lambda_{jk}(x)\right)\right|\le\exp\!\left(\frac{z_{k\ell}^2}{2}\big\langle M_k^{(\ell)},M_k^{(\ell)}\big\rangle(t)\right)\times\left\{E\!\left(\int_0^t |c_{jk\ell}(x)|^3\,1\{|c_{jk\ell}(x)|\le\varepsilon\}\,\bar Y_{jk}^{(\ell)}(x)\,d\Lambda_{jk}(x)\right)+E\!\left(\int_0^t c_{jk\ell}(x)^2\,1\{|c_{jk\ell}(x)|>\varepsilon\}\,\bar Y_{jk}^{(\ell)}(x)\,d\Lambda_{jk}(x)\right)\right\}\to0$$

as $n_\ell\to\infty$, where $\varepsilon$ is an arbitrary positive number, $c_{jk\ell}(x)=z_{k\ell}\sqrt{n_\ell}\,\hat H_k^{(\ell)}(x)/\bar Y_{jk}^{(\ell)}(x)$, and $\sqrt{n_\ell}\,c_{jk\ell}(x)\xrightarrow{P}z_{k\ell}h_{jk}^{(\ell)}(x)$ uniformly on $(0,\tau_\ell]$ from (9). Hence, we have

$$E\big((G_1^{(\ell)}(t)-1)(G_2^{(\ell')}(s)-1)\big)-\big\{E\big(G_1^{(\ell)}(t)\,G_2^{(\ell')}(s)\big)-1\big\}\to0\qquad(12)$$

as $n_\ell,n_{\ell'}\to\infty$ by the univariate results $E(G_k^{(\ell)}(t))\to1$, while using the formula (10) we can also find

$$E\big((G_1^{(\ell)}(t)-1)(G_2^{(\ell')}(s)-1)\big)-\sum_{j=1,2}\int_0^t\!\!\int_0^s E\big(G_1^{(\ell)}(x)\,G_2^{(\ell')}(y)\,\tilde H_{j1}^{a(\ell)}(x)\,\tilde H_{j2}^{a(\ell')}(y)\,d\bar M_{j1}^{(\ell)}(x)\,d\bar M_{j2}^{(\ell')}(y)\big)\to0\qquad(13)$$

as $n_\ell,n_{\ell'}\to\infty$. Similarly to showing the latter result of (11), up to asymptotic equality we can replace the terms $e^{\mathrm{i}_j(c_{j1\ell}(x)+c_{j2\ell'}(y))}$ and $e^{\mathrm{i}_j c_{jk\ell}(\cdot)}$ included in (13) by $1+\mathrm{i}_j\{c_{j1\ell}(x)+c_{j2\ell'}(y)\}-\frac12\{c_{j1\ell}(x)+c_{j2\ell'}(y)\}^2$ and $1+\mathrm{i}_j c_{jk\ell}(\cdot)-\frac12 c_{jk\ell}(\cdot)^2$, respectively. In fact, we can show that

$$\tilde H_{j1}^{a(\ell)}(x)\,\tilde H_{j2}^{a(\ell')}(y)=e^{\mathrm{i}_j(c_{j1\ell}(x)+c_{j2\ell'}(y))}-e^{\mathrm{i}_j c_{j1\ell}(x)}-e^{\mathrm{i}_j c_{j2\ell'}(y)}+1=-c_{j1\ell}(x)\,c_{j2\ell'}(y)+o_P\!\left(\frac{1}{\sqrt{n_\ell n_{\ell'}}}\right)$$

from the convergence result for $\sqrt{n_\ell}\,c_{jk\ell}(x)$. Hence, we have

$$\sqrt{n_\ell n_{\ell'}}\,\tilde H_{j1}^{a(\ell)}(x)\,\tilde H_{j2}^{a(\ell')}(y)\xrightarrow{P}-z_{1\ell}\,z_{2\ell'}\,h_{j1}^{(\ell)}(x)\,h_{j2}^{(\ell')}(y)\qquad(14)$$

as $n_\ell,n_{\ell'}\to\infty$, so that we can apply this result to (13). Also, similarly to Prentice and Cai (1992) and Sugimoto et al. (2013, 2017), we can show

$$\frac{1}{\hat a_j n}E\big(d\bar M_{j1}^{(\ell)}(x)\,d\bar M_{j2}^{(\ell')}(y)\big)=E\big(dM_{i1}^{(\ell)}(x)\,dM_{i2}^{(\ell')}(y)\,\big|\,g_i=j\big)=C(x\wedge y)\,A_j(dx,dy).$$

For simplicity, let $\phi(t,s)=E(G_1^{(\ell)}(t)\,G_2^{(\ell')}(s))$. From (12), (13), (14), $\hat\gamma\xrightarrow{P}\gamma$, $\hat a_j\xrightarrow{P}a_j$ (Conditions 1-2) and the dominated convergence theorem, we have the integral equation for $\phi(t,s)$ as $n_\ell,n_{\ell'}\to\infty$,

$$\phi(t,s)-1=-z_{1\ell}\,z_{2\ell'}\,\gamma_\ell\gamma_{\ell'}\int_0^t\!\!\int_0^s\phi(x,y)\sum_{j=1}^{2}a_j\,h_{j1}^{(\ell)}(x)\,h_{j2}^{(\ell')}(y)\,C(x\wedge y)\,A_j(dx,dy).\qquad(15)$$

Similarly to the bivariate survival function (Dabrowska 1988), the two-dimensional Volterra integral equation

$$\phi(t,s)=1+\int_0^t\!\!\int_0^s\phi(x,y)\,b_{12}(dx,dy)\quad\text{with}\quad\phi(t,0)=\phi(0,s)=1$$

is solved as $\phi(t,s)=\exp\!\left[\int_0^t\!\!\int_0^s\{b_{12}(dx,dy)-b_1(dx,y)\,b_2(x,dy)\}\right]$, where

$$b_1(dx,y)=\frac{\phi(dx,y)}{\phi(x,y)}\quad\text{and}\quad b_2(x,dy)=\frac{\phi(x,dy)}{\phi(x,y)}.$$

However, note that it is difficult to obtain $b_k(x,y)$, $k=1,2$, by directly differentiating (15), because (15) includes the expectation of the non-differentiable $M_{i1}^{(\ell)}(x)$ and $M_{i2}^{(\ell')}(y)$. Alternatively, we can use the formula (10) again for this purpose, so that, by a discussion similar to that leading to (15), as $n_\ell,n_{\ell'}\to\infty$ we have

$$\phi(dx,y)=E\big(dG_1^{(\ell)}(x)\,dG_2^{(\ell')}(y)\big)+E\big(dG_1^{(\ell)}(x)\,G_2^{(\ell')}(y)\big)\simeq\phi(x,y)\,E\Big(\sum_{j=1,2}\tilde H_{j1}^{a(\ell)}(x)\,d\bar M_{j1}^{(\ell)}(x)\Big)=0.$$

This yields $\iint b_1(dx,y)\,b_2(x,dy)=0$. Hence, the solution of (15) is

$$\phi(t,s)=\exp\big(-z_{1\ell}\,z_{2\ell'}\,V_{12}(t\wedge\tau_\ell,\,s\wedge\tau_{\ell'})\big).$$

Therefore, if $E(d\bar M_{j1}^{(\ell)}(x)\,d\bar M_{j2}^{(\ell')}(y))\ne0$, the correlation between the two martingales is at work, so that $E(G_1^{(\ell)}(t)\,G_2^{(\ell')}(s))$ does not converge to 1, but we conclude

$$E\big(G_1^{(\ell)}(t)\,G_2^{(\ell')}(s)\big)\,\phi(t,s)^{-1}\to1\quad\text{as }n_L\ge\cdots\ge n_1\to\infty.$$

In summary, since $e^{\mathrm{i}z_{k\ell}M_k^{(\ell)}(t)}=G_k^{(\ell)}(t)\exp\!\big(-\frac{z_{k\ell}^2}{2}\langle M_k^{(\ell)},M_k^{(\ell)}\rangle(t)\big)$ and the predictable covariations converge in probability to deterministic limits, these results provide that the characteristic function of the marginal martingale vector $(M_k^{(\ell)}(t),M_{k'}^{(\ell')}(s))^{\mathsf T}$ converges to that of a bivariate normal distribution:

$$E\big(e^{\mathrm{i}z_{k\ell}M_k^{(\ell)}(t)+\mathrm{i}z_{k'\ell'}M_{k'}^{(\ell')}(s)}\big)\to\exp\!\Big(-\tfrac12 z_{k\ell}^2\,V_{kk}(t\wedge\tau_\ell,\,t\wedge\tau_\ell)-z_{k\ell}z_{k'\ell'}\,V_{kk'}(t\wedge\tau_\ell,\,s\wedge\tau_{\ell'})-\tfrac12 z_{k'\ell'}^2\,V_{k'k'}(s\wedge\tau_{\ell'},\,s\wedge\tau_{\ell'})\Big),$$

which equals $\exp\!\big(-2z_{k\ell}^2\,V_{kk}(t\wedge\tau_\ell,\,s\wedge\tau_\ell)\big)$ if $k=k'$ and $\ell=\ell'$, and $\exp\!\big(-\tfrac12\{z_{k\ell}V_{kk}(t\wedge\tau_\ell,\,s\wedge\tau_{\ell'})^{1/2}+z_{k\ell'}V_{kk}(t\wedge\tau_\ell,\,s\wedge\tau_{\ell'})^{1/2}\}^2\big)$ if $k=k'$ and $\ell\ne\ell'$, and takes the general form above otherwise.

A replication of the similar discussion provides that $(M_1^{(1)}(t),M_1^{(2)}(t),M_2^{(1)}(t),M_2^{(2)}(t))$ converges in distribution to a multivariate normal distribution with zero mean vector and covariance matrix whose lower triangle is

$$\begin{pmatrix}V_{11}(t\wedge\tau_1,s\wedge\tau_1)&&&\\ V_{11}(t\wedge\tau_2,s\wedge\tau_1)&V_{11}(t\wedge\tau_2,s\wedge\tau_2)&&\\ V_{21}(t\wedge\tau_1,s\wedge\tau_1)&V_{21}(t\wedge\tau_1,s\wedge\tau_2)&V_{22}(t\wedge\tau_1,s\wedge\tau_1)&\\ V_{21}(t\wedge\tau_2,s\wedge\tau_1)&V_{12}(t\wedge\tau_2,s\wedge\tau_2)&V_{22}(t\wedge\tau_2,s\wedge\tau_1)&V_{22}(t\wedge\tau_2,s\wedge\tau_2)\end{pmatrix}.$$

These results lead immediately to the convergence of $\hat Z-D_n\hat\mu$ in distribution to $Z^{*}-D_n\mu$, as summarized in Theorem 1. □

B. Some additional results

Table 2 of Sect. 4 displays the results obtained under the assumption of a late time-dependent association (Clayton copula) for the joint survival distribution of the two event-time outcomes. Readers may be interested in how the results change if other types of dependency between the two outcomes are assumed. In Table B.1, we provide design-stage results calculated under the same assumptions as Table 2, except that the joint survival distribution is replaced by an early time-dependent association (Gumbel copula). The pattern of the MSS, MEN, and AEN results under the Gumbel copula is quite similar to Table 2, but, as the correlation increases, their reduction rates relative to the values at zero correlation are slightly larger than those under the Clayton copula.

Table B.1.

Sample sizes, number of events, and empirical powers in a group-sequential trial with two co-primary outcomes under an early time-dependent association (Gumbel copula).

Corr.   FSS*   Group-sequential design                   Empirical power (%)
 ρj            MSS   MEN (OC1, OC2)   AEN (OC1, OC2)     Both EP   At least one EP   OC1    OC2
0.0     830    835     168    335       141    293        80.6         99.3          95.3   84.5
0.1     824    829     166    332       139    290        80.6         99.0          95.2   84.4
0.2     818    823     165    330       138    289        80.4         98.6          95.2   83.9
0.3     812    817     164    327       137    286        80.5         98.1          94.9   83.7
0.4     805    810     163    325       136    285        80.5         97.7          94.8   83.4
0.5     799    804     161    322       134    282        80.6         97.3          94.6   83.3
0.6     792    797     160    319       133    280        80.3         96.7          94.2   82.7
0.7     786    791     159    317       132    279        80.6         96.3          94.2   82.7
0.8     780    785     158    315       132    277        80.3         96.0          94.1   82.2
0.9     776    781     157    313       131    276        80.7         95.8          94.2   82.3
0.95    775    780     157    313       131    276        80.4         95.6          94.0   82.0

* FSS: sample size required for the fixed-sample design. The "OC1" and "OC2" columns under "Empirical power" are the single-endpoint powers.

This table is created under the same settings and descriptions as those of Table 2, except for the association between the two outcomes OC1 and OC2. The joint survival distribution is modeled using the Gumbel copula, which provides an early time-dependent association.

As one referee indicated, an important concern is whether the Type I error rates are controlled, since the proposed design method is based on asymptotic results. To address this, we evaluate the behavior of the actual Type I error rates under the sample sizes calculated by the proposed methods. Using the ARDENT study, we consider three settings of (ψ1, ψ2) = (1.0, 1.0), (0.565, 1.0), and (1.0, 0.721) (both null hypotheses and the two marginal null hypotheses) under the same configurations as Sect. 4, and we examine the behavior via Monte-Carlo simulation with 1,000,000 runs. In the simulation, a trial ended at the planned follow-up duration. When the observed numbers of events were larger than the planned ones, the critical value at the final analysis was recalculated based on

$$1-P\big(Z_{k1}<c_{k1},\dots,Z_{kL}<\tilde c_{kL}\mid H_{0k}\big)=\alpha_k,$$

where $\tilde c_{kL}$ is the critical value at the final analysis, recalculated so that the above equation is satisfied in order to control the Type I error rate adequately when the planned numbers differ from the observed ones.
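For L = 2, this recalculation reduces to solving a single equation for the final critical value. The sketch below is our own (assuming scipy is available); it fixes the first-look boundary `c1`, uses the observed information fraction of the first analysis, and solves 1 − P(Z1 < c1, Z2 < c2) = α for c2. It assumes the error remaining after the first look is positive; otherwise no solution exists in the bracket.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal

def recalc_final_bound(c1, t1_obs, alpha=0.025):
    """Recalculate the final critical value so that the overall one-sided
    Type I error equals alpha, given the first-look boundary c1 and the
    observed information fraction t1_obs of the first analysis."""
    rho = np.sqrt(t1_obs)  # correlation between the two sequential statistics
    cov = [[1.0, rho], [rho, 1.0]]
    def excess_error(c2):
        joint = multivariate_normal.cdf([c1, c2], mean=[0.0, 0.0], cov=cov)
        return (1.0 - joint) - alpha
    return brentq(excess_error, 0.0, 8.0)
```

With the planned values c1 = 2.8616 and t1_obs = 0.5314, this recovers the planned final boundary; with a different observed information fraction, the final boundary shifts accordingly.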

Tables B.2 and B.3 show the actual Type I error rates corresponding to the null-hypothesis situations of Tables 2 and B.1 under the Clayton and Gumbel copulas, respectively. The columns "Both" and "ALO" give the probabilities of rejecting the two null hypotheses of OC1 and OC2 jointly (Both) and at least one of them (ALO), while "OC1" and "OC2" give the probabilities of rejecting each single hypothesis. The results for "Both" are well controlled at the nominal error rate of 2.5% in the three cases. Those for "ALO" are less than 2 × 2.5% when both null hypotheses hold, reflecting the multiplicity of testing twice. The results for "OC1" and "OC2" are also well controlled at the nominal Type I error rate in the three cases. Therefore, our method controls the nominal Type I error rate well under the calculated sample size.

Table B.2.

Simulation assessment: Probability of rejecting null hypothesis under Clayton copula.

Corr.        (ψ1, ψ2) = (0.565, 1.0)       (ψ1, ψ2) = (1.0, 0.721)      (ψ1, ψ2) = (1.0, 1.0)
 ρj    MSS   Both   ALO   OC1   OC2        Both   ALO   OC1   OC2       Both   ALO   OC1   OC2
0.0    835   2.39   95.4  95.3  2.50       2.10   85.0  2.49  84.6      0.06   4.91  2.48  2.49
0.1    833   2.39   95.4  95.3  2.50       2.15   84.8  2.50  84.5      0.08   4.94  2.50  2.52
0.2    832   2.42   95.3  95.3  2.51       2.20   84.7  2.51  84.4      0.08   4.91  2.49  2.51
0.3    831   2.44   95.3  95.2  2.51       2.24   84.6  2.50  84.4      0.10   4.92  2.51  2.52
0.4    829   2.44   95.2  95.2  2.50       2.27   84.6  2.51  84.3      0.12   4.90  2.50  2.52
0.5    827   2.43   95.2  95.1  2.48       2.30   84.3  2.50  84.2      0.15   4.87  2.53  2.49
0.6    825   2.47   95.1  95.1  2.51       2.33   84.3  2.49  84.2      0.18   4.80  2.46  2.51
0.7    821   2.48   95.0  95.0  2.51       2.38   84.0  2.49  83.9      0.23   4.80  2.51  2.52
0.8    816   2.52   94.9  94.9  2.53       2.45   83.8  2.52  83.7      0.30   4.69  2.49  2.51
0.9    806   2.51   94.7  94.4  2.52       2.48   83.3  2.51  83.3      0.45   4.50  2.48  2.47
0.95   797   2.51   94.5  94.5  2.51       2.46   82.8  2.47  82.8      0.62   4.39  2.51  2.50

Table B.3.

Simulation assessment: Probability of rejecting null hypothesis under Gumbel copula.

Corr.        (ψ1, ψ2) = (0.565, 1.0)       (ψ1, ψ2) = (1.0, 0.721)      (ψ1, ψ2) = (1.0, 1.0)
 ρj    MSS   Both   ALO   OC1   OC2        Both   ALO   OC1   OC2       Both   ALO   OC1   OC2
0.0    835   2.40   95.4  95.3  2.51       2.11   84.9  2.51  84.5      0.06   4.94  2.49  2.51
0.1    829   2.44   95.3  95.2  2.50       2.26   84.5  2.50  84.2      0.12   4.89  2.51  2.50
0.2    823   2.46   95.0  95.0  2.49       2.36   84.1  2.50  84.0      0.18   4.84  2.51  2.51
0.3    817   2.47   94.9  94.9  2.49       2.42   83.8  2.50  83.8      0.27   4.75  2.50  2.52
0.4    810   2.51   94.8  94.8  2.52       2.47   83.6  2.51  83.5      0.35   4.63  2.48  2.50
0.5    804   2.52   94.6  94.6  2.52       2.50   83.2  2.52  83.2      0.47   4.55  2.53  2.50
0.6    797   2.51   94.5  94.5  2.51       2.49   82.8  2.49  82.8      0.57   4.42  2.50  2.49
0.7    791   2.48   94.3  94.3  2.48       2.52   82.6  2.52  82.6      0.69   4.27  2.49  2.47
0.8    785   2.52   94.1  94.1  2.52       2.49   82.3  2.50  82.3      0.82   4.17  2.50  2.49
0.9    781   2.51   94.0  94.0  2.51       2.50   82.1  2.50  82.1      0.95   4.06  2.49  2.52
0.95   780   2.52   94.0  94.0  2.52       2.51   82.1  2.51  82.1      0.99   4.05  2.51  2.54

Table 1 of Sect. 4 displays the planning information for a group-sequential design at the fixed analysis time points (48 and 96 weeks) considered in the ARDENT trial. Other group-sequential designs based on selected information fractions can also be constructed. Table B.4 displays the planning information for a group-sequential design with information fractions of 0.5 and 1.0.

Table B.4.

Variance, calendar time and information fraction corresponding to the other endpoint’s information fraction

Endpoint fixing the analysis times                          1st analysis   Final analysis
Virologic failure (OC1): information fraction                   0.5             1.0
  Corresponding calendar time (week)                            45.5            96.0
  OC1: $V_{11}^{0(\ell)}(\tau_\ell)$                            0.0252          0.0499
  OC2 (regimen failure): $V_{22}^{0(\ell)}(\tau_\ell)$          0.0539          0.0998
    Corresponding information fraction                          0.5400          1.0
Regimen failure (OC2): information fraction                     0.5             1.0
  Corresponding calendar time (week)                            42.0            96.0
  OC1 (virologic failure): $V_{11}^{0(\ell)}(\tau_\ell)$        0.0233          0.0499
    Corresponding information fraction                          0.4675          1.0
  OC2: $V_{22}^{0(\ell)}(\tau_\ell)$                            0.0502          0.0998
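Consistent with the usual information-fraction definition for logrank statistics, each "corresponding information fraction" in Table B.4 equals the ratio of the variance at that analysis to the variance at the final analysis. The check below is our own sanity check against the rounded table values (tolerance 0.01 to absorb the rounding of the tabulated variances):

```python
# (interim variance, final variance, tabulated information fraction) from Table B.4
rows = [
    (0.0252, 0.0499, 0.5),     # OC1 at its own 50% look (week 45.5)
    (0.0539, 0.0998, 0.5400),  # OC2 at OC1's 50% look (week 45.5)
    (0.0233, 0.0499, 0.4675),  # OC1 at OC2's 50% look (week 42.0)
    (0.0502, 0.0998, 0.5),     # OC2 at its own 50% look (week 42.0)
]
for v_interim, v_final, tabulated in rows:
    assert abs(v_interim / v_final - tabulated) < 0.01
print("all tabulated information fractions match the variance ratios")
```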


Contributor Information

Tomoyuki Sugimoto, Department of Mathematics and Computer Science, Graduate School of Science and Technology, Kagoshima University, 1-21-35 Korimoto, Kagoshima 890-8580, Japan.

Toshimitsu Hamasaki, Department of Data Science, National Cerebral and Cardiovascular Center, 5-7-1 Fujishiro-dai, Suita, Osaka 565-8565, Japan.

Scott R. Evans, Epidemiology and Biostatistics and the Center for Biostatistics, George Washington University, 6110 Executive Boulevard, Suite 750, Rockville, MD 20852-3943, USA.

Susan Halabi, Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27705, USA.

References

  1. Andersen PK, Borgan Ø, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer-Verlag, New York. [Google Scholar]
  2. Andrei A-C, Murray S (2005) Simultaneous group sequential analysis of rank-based and weighted Kaplan-Meier tests for paired censored survival data. Biometrics 61:715–720. [DOI] [PubMed] [Google Scholar]
  3. Asakura K, Hamasaki T, Sugimoto T, Hayashi K, Evans SR, Sozu T (2014) Sample size determination in group-sequential clinical trials with two co-primary endpoints. Stat Med 33:2897–2913. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Clayton DG (1978) A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease. Biometrika 65:141–151. [Google Scholar]
  5. Collett D (2003) Modelling survival data in medical research, 2nd edn. Chapman & Hall/CRC, Boca Raton. [Google Scholar]
  6. Cook RJ, Farewell VT (1994) Guidelines for monitoring efficacy and toxicity responses in clinical trials. Biometrics 50:1146–1152. [PubMed] [Google Scholar]
  7. Dabrowska DM (1988) Kaplan-Meier estimate on the plane. Ann Stat 16:1475–1489. [Google Scholar]
  8. Dmitrienko A, Tamhane AC, Bretz F (2009) Multiple Testing Problems in Pharmaceutical Statistics. Chapman & Hall/CRC, Boca Raton. [Google Scholar]
  9. Fleming TR, Harrington DP (1991) Counting processes and survival analysis. John Wiley & Sons, New York. [Google Scholar]
  10. Glimm E, Maurer W, Bretz F (2009) Hierarchical testing of multiple endpoints in group-sequential trials. Stat Med 29:219–228. [DOI] [PubMed] [Google Scholar]
  11. Gombay E (2008) Weighted logrank statistics in sequential tests. Sequential Anal 27:97–104. [Google Scholar]
  12. Gordon LKK, Lachin JM (1990) Implementation of group sequential logrank tests in a maximum duration trial. Biometrics 46:759–770. [PubMed] [Google Scholar]
  13. Gu MG, Lai TL (1991) Weak convergence of time-sequential censored rank statistics with applications to sequential testing in clinical trials. Ann Stat 19:1403–1433. [Google Scholar]
  14. Halabi S (2012) Adjustment on the type I error rate for a clinical trial monitoring for both intermediate and primary endpoints. J Biom Biostat 7:15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Herbst RS, Redman MW, Kim ES, Semrad TJ, Bazhenova L, Masters G, Oettel K, Guaglianone P, Reynolds C, Karnad A, Arnold SM, Varella-Garcia M, Moon J, Mack PC, Blanke CD, Hirsch FR, Kelly K, Gandara DR (2018) Cetuximab plus carboplatin and paclitaxel with or without bevacizumab versus carboplatin and paclitaxel with or without bevacizumab in advanced NSCLC (SWOG S0819): a randomised, phase 3 study. Lancet Oncol 19: 101–114. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Hamasaki T, Asakura K, Evans SR, Sugimoto T, Sozu T (2015) Group-sequential strategies in clinical trials with multiple co-primary endpoints. Stat Biopharm Res 7:36–54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Hougaard P (1986) A class of multivariate failure time distributions. Biometrika 73:671–678. [Google Scholar]
  18. Hsu L, Prentice RL (1996) On assessing the strength of dependency between failure time variables. Biometrika 83:491–506. [Google Scholar]
  19. Huang X, Strawderman RL (2006) A note on the Breslow survival estimator. J Nonparametr Stat 18:45–56. [Google Scholar]
  20. Hung HMJ, Wang SJ, O’Neill RT (2007) Statistical considerations for testing multiple endpoints in group sequential or adaptive clinical trials. J Biopharm Stat 17:1201–1210. [DOI] [PubMed] [Google Scholar]
  21. Jacod J, Shiryaev AN (2003) Limit theorems for stochastic processes, 2nd edn. Springer-Verlag, Berlin-Heidelberg. [Google Scholar]
  22. Jennison C, Turnbull BW (2000) Group sequential methods with applications to clinical trials. Chapman & Hall/CRC, Boca Raton. [Google Scholar]
  23. Jung S-H (2008) Sample size calculation for the weighted rank statistics with paired survival data. Stat Med 27:3350–3365. [DOI] [PubMed] [Google Scholar]
  24. Kosorok MR, Shi Y, DeMets DL (2004) Design and analysis of group-sequential clinical trials with multiple primary endpoints. Biometrics 60:134–145. [DOI] [PubMed] [Google Scholar]
  25. Lai TL, Shih M-C (2004) Power, sample size and adaptation considerations in the design of group sequential clinical trials. Biometrika 91:507–528. [Google Scholar]
  26. Lan KKG, DeMets DL (1983) Discrete sequential boundaries for clinical trials. Biometrika 70:659–663. [Google Scholar]
  27. Lennox JL, Landovitz RJ, Ribaudo HJ, Ofotokun I, Na LH, Godfrey C, Kuritzkes DR, Sagar M, Brown TT, Cohn SE, McComsey GA, Aweeka F, Fichtenbaum CJ, Presti RM, Koletar SL, Haas DW, Patterson KB, Benson CA, Baugh BP, Leavitt RY, Rooney JF, Seekins D, Currier JS (2014) A phase III comparative study of the efficacy and tolerability of three non-nucleoside reverse transcriptase inhibitor-sparing antiretroviral regimens for Treatment-naïve HIV-1-infected volunteers: A randomized, controlled trial. Annals of Internal Medicine 161:461–471. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Lin DY, Shen L, Ying Z, Breslow NE (1996) Group sequential designs for monitoring survival probabilities. Biometrics 52:1033–1041. [PubMed] [Google Scholar]
  29. Lin DY (1991) Nonparametric sequential testing in clinical trials with incomplete multivariate observations. Biometrika 78:123–131. [Google Scholar]
  30. Murray S (2000) Nonparametric rank-based methods for group sequential monitoring of paired censored survival data. Biometrics 56:984–990. [DOI] [PubMed] [Google Scholar]
  31. Nishiyama Y (2011) Statistical analysis by the theory of martingales. Kindaikagakusha, Tokyo. (in Japanese) [Google Scholar]
  32. O’Brien PC, Fleming TR (1979) A multiple testing procedure for clinical trials. Biometrics 35:549–556. [PubMed] [Google Scholar]
  33. Pocock SJ, Geller NL, Tsiatis AA (1987) The analysis of multiple endpoints in clinical trials. Biometrics 43:487–498. [PubMed] [Google Scholar]
  34. Pocock SJ (1977) Group sequential methods in the design and analysis of clinical trials. Biometrika 64:191–199. [Google Scholar]
  35. Prentice RL, Cai J (1992) Covariance and survivor function estimation using censored multivariate failure time data. Biometrika 79:495–512. [Google Scholar]
  36. Rauch G, Schüler S, Wirths M, Stefan E, Kieser M (2016) Adaptive designs for two candidate primary time-to-event endpoints. Stat Biopharm Res 8:207–216. [Google Scholar]
  37. Slud EV, Wei LJ (1982) Two-sample repeated significance tests based on the modified Wilcoxon statistic. J Am Stat Assoc 77: 862–868. [Google Scholar]
  38. Sugimoto T, Hamasaki T, Sozu T, Evans SR (2017) Sizing clinical trials when comparing bivariate time- to-event outcomes. Stat Med 36:1363–1382. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Sugimoto T, Sozu T, Hamasaki T, Evans SR (2013) A logrank test-based method for sizing clinical trials with two co-primary time-to-event endpoints. Biostatistics 14:409–421. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Tamhane AC, Mehta CR, Liu L (2010) Testing a primary and secondary endpoint in a group sequential design. Biometrics 66:1174–1184. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Tamhane AC, Wu Y, Mehta C (2012) Adaptive extensions of a two-stage group sequential procedure for testing primary and secondary endpoints (I): unknown correlation between the endpoints. Stat Med 31:2027–2040. [DOI] [PubMed] [Google Scholar]
  42. Tang DI, Gnecco C, Geller NL (1989) Design of group sequential clinical trials with multiple endpoints. J Am Stat Assoc 84:776–779. [Google Scholar]
  43. Tsiatis AA, Boucher H, Kim K (1995) Sequential methods for parametric survival models. Biometrika 82:165–173. [Google Scholar]
  44. Tsiatis AA (1982) Group sequential methods for survival analysis with staggered entry. In: Crowley J, Johnson RA (eds) Survival analysis. IMS Lecture Notes, Hayward, CA, pp 257–268. [Google Scholar]
  45. Wei LJ, Su JQ, Lachin JM (1990) Interim analyses with repeated measurements in a sequential clinical trial. Biometrika 77:359–364. [Google Scholar]
  46. Wei LJ, Lachin JM (1984) Two-sample asymptotically distribution-free tests for incomplete multivariate observations. J Am Stat Assoc 79:653–661. [Google Scholar]
  47. Wu J, Xiong X (2017) Group-sequential survival trial design and monitoring using the log-rank test. Stat Biopharm Res 9:35–43. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Yin G (2012) Clinical trial design: Bayesian and frequentist adaptive methods. John Wiley & Sons, New York. [Google Scholar]
