Abstract
In this paper, a parametric sequential test is proposed under the Weibull model. The proposed test is asymptotically normal with an independent increments structure. The sample size for the fixed sample test is derived for the purpose of group sequential trial design. In addition, a multi-stage group sequential procedure is given under the Weibull model by applying the Brownian motion property of the test statistic and the sequential conditional probability ratio test methodology.
Keywords: Brownian motion, Group sequential trial, Randomized clinical trial, Sample size, Time-to-event, Weibull distribution
1 Introduction
For ethical reasons, clinical trials are often monitored for early stopping if a sufficiently large treatment difference is observed during an interim analysis. Various group sequential monitoring methods have been developed in the past few decades, such as the procedures of Haybittle (1971), Pocock (1977), and O'Brien and Fleming (1979); the type I error spending function approach of Lan and DeMets (1983); the triangular test of Whitehead and Stratton (1983); the sequential conditional probability ratio test (SCPRT) of Xiong (1995); and many others. Comprehensive reviews of these methods are provided by Jennison and Turnbull (2000, and references therein).
In cancer clinical trials, a time-to-event outcome, such as overall survival or event-free survival, is often the primary endpoint for the study design, where the event could be disease progression, relapse, or death. The primary interest is to compare the survival distributions between treatment groups. The non-parametric log-rank test is the most popular test statistic used to design such a study (Collett, 2003). Its Brownian motion property makes it easy to monitor such trials using a group sequential procedure (Tsiatis, 1982; Sellke and Siegmund, 1983; Slud, 1984; Kim and Tsiatis, 1990).
For survival data, the exponential and Weibull distributions are the two most frequently used parametric models. Of the two, the Weibull distribution is usually more appropriate for describing time-to-event data than the exponential distribution because, in addition to the scale parameter, it includes a shape parameter that allows a decreasing or increasing hazard. In advanced-stage cancer studies, the survival rate usually drops dramatically toward the end of the study, and such characteristics of the survival distribution can be better approximated by a Weibull distribution. In general, a cancer survival trial under the Weibull model can also be designed under the proportional hazards model using the log-rank test. However, a parametric test derived under the Weibull model has better small-sample properties than the non-parametric log-rank test because the latter has to be general, and information from continuous quantities derived from a specific parametric model cannot be included for inference (Wu, 2013). The maximum sample size is often large for a phase III group sequential trial, but the available data can be limited in the early stages of interim monitoring; therefore, a group sequential design under the Weibull model may perform better in the early stages than one under a general proportional hazards model. Tsiatis et al. (1995) derived asymptotic sequential distributions for score and Wald tests in general parametric survival models, but the method has not been applied to group sequential trial design under the Weibull model. Recently, Jiang et al. (2012) proposed a simulation method for group sequential trial design under the Weibull model, but it is computationally intensive and relies on restrictive assumptions. Heo et al. (1998) and Wu (2013) proposed sample size formulas for a fixed sample test under the Weibull model. Lu et al. (2012) derived a sample size formula for a two-stage seamless adaptive design under the Weibull model. However, a general multi-stage group sequential design under the Weibull model is not available in the literature.
The rest of this paper is organized as follows. In Section 2, a parametric sequential test statistic is proposed under the Weibull model. The sample size for a fixed sample test is given in Section 3. A general multi-stage group sequential procedure is discussed in Section 4. In Section 5, the empirical type I error and power of the proposed parametric sequential test are compared with those of the well-known non-parametric log-rank test. An example is given in Section 6 to illustrate the proposed method. The final conclusion is presented in Section 7.
2 Sequential Test Statistics
A parametric sequential test statistic is discussed in this section to provide group sequential design for randomized two-arm survival trials under the Weibull model. Assume that time-to-event variable Tj of a subject from the jth group follows the Weibull distribution with a common shape parameter κ and scale parameter ρj, j = 1,2. That is, Tj has survival distribution function
and hazard function
The shape parameter κ indicates the degree of acceleration (κ > 1) or deceleration (κ < 1) of the hazard over time. In a cancer trial, the median survival time is an intuitive endpoint for clinicians. The median survival time of the jth group for the Weibull distribution can be calculated as . Therefore, the Weibull survival distribution can be expressed as
The one-sided hypotheses of a randomized two-arm trial defined by median survival times can be expressed as
For notational convenience, we convert the scale parameter ρj to a hazard parameter . Then the above hypotheses on median survival times are equivalent to the following:
where the hazards ratio δ = λ1/λ2 = Rκ, with R = m2/m1. Then the survival distribution is Sj(t) = e−λjtκ with hazard function hj(t) = κλjtκ−1, in which κ is taken as a known constant. This indicates that the testing problem also fits into a proportional hazards model, so the log-rank test is applicable for the intended testing.
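Concretely, since Sj(mj) = 1/2 and Sj(t) = e−λjtκ, the hazard parameter is λj = log(2)/mj^κ, and hence δ = λ1/λ2 = (m2/m1)^κ = R^κ.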
Now, suppose that during the accrual phase of the trial, nj subjects of the jth group are enrolled in the study, and let Tij and Cij denote, respectively, the event time and censoring time of the ith subject of the jth group, both measured from the time of study entry, Yij. We assume that the event time Tij is independent of the censoring time Cij and entry time Yij, and that {(Yij, Tij, Cij); i = 1,…, nj} are independent and identically distributed. When the data are examined at calendar time t ≤ τ, where τ is the study duration, we observe the time-to-event Xij(t) = Tij ∧ Cij ∧ (t − Yij)+ and failure indicator Δij(t) = I(Tij ≤ Cij ∧ (t − Yij)+), i = 1,…, nj. Based on the observed data {Xij(t), Δij(t), i = 1,…, nj, j = 1, 2}, the observed likelihood function at time t is proportional to (see, e.g., Cox and Oakes, 1984, Chapter 3)
where is the total number of events observed in the jth group by time t, and is the cumulative follow-up time by time t penalized by the Weibull shape parameter κ. The maximum likelihood estimate of λj(t) can be derived as
and its variance is approximately . Therefore, under the null hypothesis, the Wald statistic of the log-hazard ratio γ = log(δ) at calendar time t is given by (see Appendix 1)
(1)
and has approximately a standard normal distribution. To derive the group sequential design, let
(2)
then under the alternative γ = log(δ) > 0, the statistic U(t) is approximately normal with mean γV(t) and variance V(t) and has an independent increments structure, where . The above results can be derived from Tsiatis et al. (1995), who proved similar results for general parametric survival models. Since
(3)
where pj(t) = P(Δ1j(t) = 1) and π = n2/n1 is the treatment allocation ratio. Thus, is approximately a Brownian motion with drift parameter and information time I = D(t)/D(τ), where D(τ) is the value of D(t) at t = τ.
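As a concrete illustration, the following R (S-Plus-style) sketch computes the test statistic from the data observed by calendar time t. Because the displayed equations above are reproduced only by number, the sketch assumes the standard forms λ̂j(t) = dj(t)/Uj(t), with Uj(t) = Σi Xij(t)^κ, and var{γ̂(t)} ≈ 1/d1(t) + 1/d2(t) (see Appendix 1 and Wu, 2013); the function name is ours.

```r
## Hedged sketch: Wald statistic Z(t) under the Weibull model with known shape kappa.
## Assumes lambda_hat_j = d_j / U_j with U_j = sum(X_ij^kappa) and
## var(gamma_hat) approximately 1/d1 + 1/d2, as in Appendix 1.
weibull_wald <- function(x1, delta1, x2, delta2, kappa) {
  # x_j: follow-up times X_ij(t); delta_j: event indicators Delta_ij(t)
  d1 <- sum(delta1); d2 <- sum(delta2)         # observed numbers of events
  U1 <- sum(x1^kappa); U2 <- sum(x2^kappa)     # kappa-penalized cumulative follow-up
  gamma_hat <- log((d1 / U1) / (d2 / U2))      # estimated log-hazard ratio
  gamma_hat / sqrt(1 / d1 + 1 / d2)            # Wald statistic Z(t)
}
```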
3 Sample Size for Fixed Sample Test
The sample size for a fixed sample test is calculated for the analysis at the end of the study. Based on the test statistic Z(t) at t = τ, under the null hypothesis,
has an approximate standard normal distribution. To calculate the power, let pj(τ) be the probability of a subject from the jth group having an event during the study. Then under the alternative δ = λ1/λ2 > 1, Z(τ) is an approximately normal distribution with mean and unit variance. Therefore, given a significance level α, the power (1 − β) of the Z(τ) test under the alternative is given by
where Φ(·) is the standard normal distribution function and z1−α = Φ−1(1 −α). Thus, the sample size of the first group based on the Z(τ) test can be calculated as
(4)
where δ = Rκ (Wu, 2013). Therefore, the total sample size for the two groups is given by
(5)
To calculate the number of subjects required for the study, we need to calculate pj(τ), the probability of a subject in the jth group having an event during the study. Typically, we assume that subjects are accrued over an accrual period of length ta with an additional follow-up period of length tf. A subject enters the study at time u, where u is uniformly distributed on [0, ta], and no subject is lost to follow-up during the study. Then the probability of a subject having an event during the study under the Weibull model can be calculated by (Collett, 2003):
(6)
Therefore, given the design parameters δ (or κ), m1, m2, α, β, π, tf and ta, the number of subjects n required for the study can be calculated using formula (5).
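For illustration, a minimal R sketch of this calculation is given below. Because formulas (4)-(6) appear above only by number, the sketch assumes the forms pj(τ) = 1 − (1/ta)∫ from tf to ta + tf of Sj(t) dt and n1 = (z1−α + z1−β)²[log(δ)]⁻²{1/p1(τ) + 1/(π p2(τ))} with n = (1 + π)n1, as in Wu (2013); the function names are ours.

```r
## Hedged sketch of formulas (4)-(6): event probability and fixed-sample size
## under the Weibull model with uniform accrual over [0, ta] and follow-up tf.
event_prob <- function(median, kappa, ta, tf) {
  lambda <- log(2) / median^kappa                            # hazard parameter
  S <- function(t) exp(-lambda * t^kappa)                    # Weibull survival function
  1 - integrate(S, lower = tf, upper = ta + tf)$value / ta   # p_j(tau), tau = ta + tf
}

fixed_sample_size <- function(m1, m2, kappa, ta, tf,
                              alpha = 0.05, beta = 0.10, pi_ratio = 1) {
  delta <- (m2 / m1)^kappa                                   # hazard ratio under H1
  p1 <- event_prob(m1, kappa, ta, tf)
  p2 <- event_prob(m2, kappa, ta, tf)
  n1 <- (qnorm(1 - alpha) + qnorm(1 - beta))^2 / log(delta)^2 *
        (1 / p1 + 1 / (pi_ratio * p2))                       # group-1 size, assumed form of (4)
  ceiling((1 + pi_ratio) * n1)                               # total size, formula (5)
}

## kappa = 1, R = 1.5, ta = 5, tf = 2: about 236 in total (118 per group), cf. Table 1
fixed_sample_size(m1 = 1, m2 = 1.5, kappa = 1, ta = 5, tf = 2)
```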
In designing an actual trial, fixing the accrual time ta and then calculating the sample size is often impractical, because we may not be able to enroll the planned number of subjects within the given accrual duration. It is more practical to design the study starting from the accrual rate r and then calculate the required accrual time ta. This can be accomplished under the Weibull model assumption. First, the integration in the probability formula (6) can be approximated using Simpson's rule,
(7)
Then, combining the sample size formula based on (5) with equation (7), we can define a root function of the accrual time ta
(8)
Now the accrual time ta can be obtained by solving the root equation root(ta) = 0 numerically in Splus using the uniroot function. The total sample size required for the study is approximately n = [rta]+, where [x]+ denotes the smallest integer greater than or equal to x.
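A hedged R sketch of this root-finding step is shown below (uniroot is the function mentioned above). It assumes the Simpson approximation pj(τ) ≈ 1 − {Sj(tf) + 4Sj(0.5ta + tf) + Sj(ta + tf)}/6 for (7) and the root function root(ta) = r·ta − n(ta) for (8), since those displays are not reproduced above.

```r
## Hedged sketch of (7)-(8): given an accrual rate r, solve r * ta = n(ta) for ta,
## where n(ta) is the total sample size from formula (5) with p_j(tau) from Simpson's rule.
accrual_time <- function(r, m1, m2, kappa, tf,
                         alpha = 0.05, beta = 0.10, pi_ratio = 1) {
  p_simpson <- function(median, ta) {                        # Simpson approximation to (6)
    lambda <- log(2) / median^kappa
    S <- function(t) exp(-lambda * t^kappa)
    1 - (S(tf) + 4 * S(0.5 * ta + tf) + S(ta + tf)) / 6
  }
  n_total <- function(ta) {
    delta <- (m2 / m1)^kappa
    p1 <- p_simpson(m1, ta); p2 <- p_simpson(m2, ta)
    (1 + pi_ratio) * (qnorm(1 - alpha) + qnorm(1 - beta))^2 / log(delta)^2 *
      (1 / p1 + 1 / (pi_ratio * p2))
  }
  uniroot(function(ta) r * ta - n_total(ta), interval = c(0.1, 100))$root
}

## Section 6 setting (r = 20/year, m1 = 0.936, m2 = 1.436, kappa = 1.37, tf = 2): about 5.3 years
accrual_time(r = 20, m1 = 0.936, m2 = 1.436, kappa = 1.37, tf = 2)
```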
The sample sizes under each scenario for the fixed sample tests recorded in Table 1 were calculated from formula (5) for the parametric test Z(τ) and from a formula given by Collett (2003) for the log-rank test L(τ),
Table 1.
Sample size and simulated empirical type I error (α) and power (1 − β) based on 100,000 simulation runs for the Weibull distribution for fixed sample tests Z(τ) and L(τ) with a nominal type I error of 0.05 and powers of 80% and 90% (one-sided test).
Design with 90% power. The three column groups give n, α, and 1 − β for R = 1.5, 1.6, and 1.7, respectively.

| κ | Test | n | α | 1 − β | n | α | 1 − β | n | α | 1 − β |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | Z(τ) | 578 | 0.049 | 0.902 | 434 | 0.051 | 0.901 | 344 | 0.050 | 0.903 |
| 0.5 | L(τ) | 577 | 0.049 | 0.900 | 433 | 0.050 | 0.901 | 342 | 0.050 | 0.899 |
| 1 | Z(τ) | 118 | 0.052 | 0.903 | 89 | 0.050 | 0.902 | 71 | 0.051 | 0.907 |
| 1 | L(τ) | 118 | 0.052 | 0.900 | 89 | 0.051 | 0.899 | 70 | 0.051 | 0.897 |
| 2 | Z(τ) | 27 | 0.051 | 0.903 | 20 | 0.052 | 0.902 | 16 | 0.053 | 0.903 |
| 2 | L(τ) | 27 | 0.054 | 0.887 | 20 | 0.055 | 0.881 | 16 | 0.057 | 0.876 |

Design with 90% power. The three column groups give n, α, and 1 − β for R = 1.8, 1.9, and 2.0, respectively.

| κ | Test | n | α | 1 − β | n | α | 1 − β | n | α | 1 − β |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | Z(τ) | 283 | 0.049 | 0.902 | 239 | 0.050 | 0.903 | 207 | 0.050 | 0.903 |
| 0.5 | L(τ) | 281 | 0.050 | 0.899 | 237 | 0.050 | 0.900 | 205 | 0.049 | 0.900 |
| 1 | Z(τ) | 58 | 0.051 | 0.904 | 50 | 0.051 | 0.909 | 43 | 0.052 | 0.910 |
| 1 | L(τ) | 58 | 0.052 | 0.898 | 49 | 0.052 | 0.898 | 43 | 0.053 | 0.902 |
| 2 | Z(τ) | 13 | 0.054 | 0.903 | 11 | 0.055 | 0.906 | 10 | 0.054 | 0.916 |
| 2 | L(τ) | 13 | 0.058 | 0.871 | 11 | 0.059 | 0.867 | 10 | 0.061 | 0.876 |

Design with 80% power. The three column groups give n, α, and 1 − β for R = 1.5, 1.6, and 1.7, respectively.

| κ | Test | n | α | 1 − β | n | α | 1 − β | n | α | 1 − β |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | Z(τ) | 418 | 0.051 | 0.802 | 314 | 0.050 | 0.801 | 248 | 0.050 | 0.802 |
| 0.5 | L(τ) | 417 | 0.049 | 0.798 | 313 | 0.049 | 0.800 | 247 | 0.049 | 0.800 |
| 1 | Z(τ) | 85 | 0.052 | 0.802 | 64 | 0.051 | 0.805 | 51 | 0.052 | 0.804 |
| 1 | L(τ) | 85 | 0.052 | 0.796 | 64 | 0.053 | 0.798 | 51 | 0.053 | 0.796 |
| 2 | Z(τ) | 20 | 0.052 | 0.816 | 15 | 0.053 | 0.815 | 12 | 0.054 | 0.819 |
| 2 | L(τ) | 20 | 0.055 | 0.789 | 15 | 0.056 | 0.781 | 12 | 0.059 | 0.778 |

Design with 80% power. The three column groups give n, α, and 1 − β for R = 1.8, 1.9, and 2.0, respectively.

| κ | Test | n | α | 1 − β | n | α | 1 − β | n | α | 1 − β |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | Z(τ) | 204 | 0.050 | 0.805 | 173 | 0.051 | 0.806 | 149 | 0.049 | 0.804 |
| 0.5 | L(τ) | 203 | 0.051 | 0.800 | 172 | 0.051 | 0.803 | 148 | 0.051 | 0.799 |
| 1 | Z(τ) | 42 | 0.053 | 0.807 | 36 | 0.052 | 0.811 | 31 | 0.052 | 0.811 |
| 1 | L(τ) | 42 | 0.053 | 0.798 | 36 | 0.052 | 0.801 | 31 | 0.054 | 0.798 |
| 2 | Z(τ) | 10 | 0.054 | 0.827 | 8 | 0.055 | 0.810 | 7 | 0.056 | 0.816 |
| 2 | L(τ) | 10 | 0.061 | 0.778 | 8 | 0.062 | 0.748 | 7 | 0.063 | 0.746 |
n is the sample size per group (with equal allocation), and the sample size for the log-rank test was calculated from formula (9) given by Collett (2003).
(9)
where π1 = 1/(1 + π) and π2 = π/(1 + π) are the proportions of subjects assigned to treatment 1 and treatment 2, respectively, and P(τ) = π1p1(τ) + π2p2(τ) is the combined probability of failure in [0, τ] for subjects from the two groups. As shown in Table 1, the sample sizes for the parametric test and the log-rank test are very similar. In an unpublished manuscript on the log-rank test, Xiong (2014) obtained a precise analytical formula for E(τ), where E(τ) = μ̄(τ)²/v̄(τ) with μ̄(τ) = lim d→∞ μ(τ)/d and v̄(τ) = lim d→∞ V(τ)/d, μ(τ) and V(τ) being the mean and variance of the log-rank score statistic and d the total number of events in [0, τ]. The precise number of failures in the calendar time interval [0, τ] for the log-rank test should be d = (z1−α + z1−β)²/E(τ), and the precise sample size should be n = d/P(τ), where P(τ) is the combined probability of failure on [0, τ] defined above. E(τ) is a function of the survival distributions, the hazards ratio, the entry time distribution, the censoring distribution, and the allocation proportions of subjects in the two groups. By numerical computation using E(τ) as a criterion, we evaluated the existing sample size formula for the log-rank test and propose a new formula for it. The computation indicates that [log(δ)]²π1π2 ≈ E(τ) when the allocation is balanced (i.e., π1 and π2 are close to 0.5) and |log(δ)| ≤ 1; this verifies the accuracy of formula (9) over this range of parameters. The computation also indicates that [log(δ)]²π1π2p1(τ)p2(τ)/[P(τ)]² ≈ E(τ) for any 0 < π1 < 1 and |log(δ)| ≤ 2, which leads to
(10)
as a formula for the sample size calculation of the log-rank test that is more accurate than the formula in (9). We will give a numerical example in Section 5 to illustrate this feature. It is straightforward to check that equations (5) and (10) are mathematically equivalent, which implies that the formula in (5) not only works for the proposed parametric test for the Weibull distribution, but also works well for the log-rank test, especially when the assignment of subjects is unbalanced for the two groups.
In practice, attrition should also be considered at the design stage of a clinical trial. Patients who are lost to follow-up for various reasons during the study are censored in the survival analysis. If losses to follow-up are random and independent of the survival distribution, they form part of the censoring distribution and can be incorporated into the trial design. Sample sizes can also be adjusted for attrition due to patients' dropout or noncompliance with the study (see, e.g., Lachin and Foulkes, 1986).
4 Group Sequential Procedure
In this section, we will apply an SCPRT procedure (Xiong, 1995) to the test statistic Z(t). The SCPRT has two unique features: (1) the maximum sample size of the sequential test is not greater than the size of the reference fixed sample test; and (2) the probability of discordance, or the probability that the conclusion of the sequential test would be reversed if the experiment were not stopped according to the stopping rule but continued to the planned end, can be controlled to an arbitrarily small level (Xiong et al., 2007). Furthermore, the power function of the SCPRT is virtually the same as that of the fixed sample test (Xiong, 1995). The SCPRT boundaries derived in this paper have analytical solutions. All these features make the SCPRT attractive and simple to use.
Let {Bt: 0 < t ≤ 1} be the Brownian motion with Bt ∼ N(θt, t), and let B1 be the value of Bt at the final stage with full information t = 1. Then (Bt, B1) has a bivariate normal distribution with mean μ = (θt, θ) and variance matrix Σ = (σij)2×2 with σ11 = σ12 = σ21 = t and σ22 = 1. Therefore, by multivariate conditional distribution theory (e.g., Anderson, 1958), the conditional density f(Bt|B1) is the normal density of N(B1t, (1 − t)t). Let s0 = z1−α be the critical value of B1 for rejecting the null hypothesis in the fixed sample test. Then the conditional maximum likelihood ratio for the stochastic process on information time t is (Xiong, 1995; Xiong et al., 2003)
Taking the logarithm, the log-likelihood ratio can be simplified as
which has a positive sign if Bt > z1−αt and a negative sign if Bt < z1−αt. This equation leads to lower and upper boundaries for Btk as
(11)
for k = 1, …, K, where K is the total number of looks, and t1, t2,…, tK(= 1) are the information times of the interim looks and the final look. The quantity a in (11) is the boundary coefficient, and it is crucial to choose an appropriate a for the design such that the probability that the conclusion of the sequential test is reversed by the test at the planned end is small but not unnecessarily small. The larger a is, the smaller the discordance probability and the farther apart the upper and lower boundaries, which makes it harder for the sample path to reach a boundary and stop early and thus results in a larger expected sample size. Thus, an appropriate a can be determined by choosing an appropriate discordance probability (Xiong, 1995; Xiong et al., 2003).
Now we apply the SCPRT to the test statistic , which is a Brownian motion in information time I = D(t)/D(τ) on [0,1] with drift parameter , where D(t) is defined by (3) in Section 2. Suppose K looks are planned at calendar times tk, k = 1, …, K. Then, based on the SCPRT procedure presented above, the lower and upper boundaries for BIk = B(Ik) at the kth look are given by
(12)
for k = 1, …, K, where Ik = D(tk)/D(τ) is the information time at the kth look at calendar time tk. The nominal critical p-values for testing H0 are
(13)
The observed p-value at the kth look is
(14)
The stopping rule for monitoring the trial is executed by stopping the trial when, for the first time, PBIk ≥ Pak (accept H0 and stop for futility) or PBIk ≤ Pbk (reject H0 and stop for efficacy). Since Z(tk) and have the same asymptotic distribution under the null hypothesis, the observed p-value at the kth stage can be calculated from the test statistic Z(tk) by applying all observations up to stage k. As an illustration, the calculations of the operating characteristics of a multi-stage group sequential design are given in Appendix 2.
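Because the displays (11)-(14) appear above only by number, the following R sketch assumes the SCPRT boundary form of Xiong et al. (2003), ak = z1−α Ik − {2a Ik(1 − Ik)}^{1/2} and bk = z1−α Ik + {2a Ik(1 − Ik)}^{1/2}, with nominal critical p-values 1 − Φ(ak/√Ik) and 1 − Φ(bk/√Ik); this assumed form reproduces the boundaries reported in the example of Section 6.

```r
## Hedged sketch of the SCPRT boundaries (12) and nominal critical p-values (13),
## assuming the boundary form of Xiong et al. (2003):
##   lower_k = z * I_k - sqrt(2 * a * I_k * (1 - I_k)),  upper_k = z * I_k + sqrt(2 * a * I_k * (1 - I_k))
scprt_boundaries <- function(info, a, alpha = 0.05) {
  z <- qnorm(1 - alpha)
  half <- sqrt(2 * a * info * (1 - info))        # half-width; zero at the final look (info = 1)
  lower <- z * info - half
  upper <- z * info + half
  data.frame(info  = info,
             lower = lower, upper = upper,
             p_futility = 1 - pnorm(lower / sqrt(info)),   # accept H0 if observed p-value >= this
             p_efficacy = 1 - pnorm(upper / sqrt(info)))   # reject H0 if observed p-value <= this
}
```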
5 Simulation Studies
In this section, we conducted simulation studies to compare the power and type I error of the proposed parametric test statistic Z(t) and the non-parametric log-rank test L(t) under various scenarios. In the simulations, the survival distribution of the jth group was taken as Sj(t) = e−log(2)(t/mj)κ, which is the Weibull distribution with shape parameter κ and median survival time mj, j = 1, 2, where the shape parameter κ was taken as 0.5, 1, and 2 to reflect cases of decreasing, constant, and increasing hazard functions.
The null hypothesis was set to H0 : m1 = m2 (= 1), and the ratio of medians R = m2/m1 under the alternative was taken as 1.5 to 2.0. Furthermore, we assumed that subjects were recruited uniformly over the accrual period ta = 5 years and followed for tf = 2 years, and that no subject was lost to follow-up during the study period τ = ta + tf = 7. Therefore, a subject was censored at calendar time t if his/her event time was longer than t − u, where u is the time at which the subject entered the study. The sample sizes of the two groups were balanced, so that π = n2/n1 = 1; this setting reflects the facts that the total sample size n = n1 + n2 is minimized when π is close to 1 and that a study with equal group sizes is easier to plan and manage.
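As an illustration of the data-generation step, the following R sketch draws one simulated sample for a single group under this setting; it uses R's rweibull parametrization, in which the scale giving median m and shape κ is m/(log 2)^{1/κ}.

```r
## Hedged sketch of one simulated group: Weibull event times with median m and shape kappa,
## uniform staggered entry on [0, ta], and administrative censoring at calendar time tau.
simulate_group <- function(n, median, kappa, ta, tau) {
  scale <- median / log(2)^(1 / kappa)             # rweibull scale giving the target median
  entry <- runif(n, 0, ta)                         # uniform accrual times
  event_time <- rweibull(n, shape = kappa, scale = scale)
  followup <- tau - entry                          # administrative censoring time
  data.frame(x = pmin(event_time, followup),                 # observed time X_ij(tau)
             delta = as.numeric(event_time <= followup))     # event indicator Delta_ij(tau)
}
```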
In each design parameter configuration, 100,000 observed samples of censored event times were generated from the Weibull distribution to calculate the test statistics under the null or alternative hypothesis. The nominal significance level was set to 0.05 and power was set to 80% and 90%. Sample sizes, empirical type I error, and power for the fixed sample test were calculated at the end of the study, τ = 7. The empirical type I error and power for the two-stage SCPRT design were calculated at calendar time t1 = 4 and t2 = 7. The simulated empirical powers and type I errors in various scenarios for the fixed sample test and two-stage SCPRT test are summarized in Table 1 and Table 2, respectively.
Table 2.
Simulated empirical type I error and power of the two-stage SCPRT designs based on 100,000 simulation runs for sequential test statistics Z(t) and L(t) with a nominal type I error of 0.05 and power of 90% (one-sided test).
Design with 90% power. Columns give the type I error (α) and power (1 − β) at the first look (k = 1), at the second look (k = 2), and in total.

| R | κ | Statistic | α, k = 1 | α, k = 2 | α, total | 1 − β, k = 1 | 1 − β, k = 2 | 1 − β, total |
|---|---|---|---|---|---|---|---|---|
| 1.5 | 0.5 | Z(t), empirical | 0.005 | 0.045 | 0.050 | 0.382 | 0.519 | 0.901 |
| 1.5 | 0.5 | L(t), empirical | 0.005 | 0.044 | 0.049 | 0.385 | 0.514 | 0.899 |
| 1.5 | 0.5 | Nominal | 0.005 | 0.046 | 0.051 | 0.382 | 0.518 | 0.899 |
| 1.5 | 1 | Z(t), empirical | 0.005 | 0.047 | 0.052 | 0.325 | 0.577 | 0.902 |
| 1.5 | 1 | L(t), empirical | 0.005 | 0.047 | 0.053 | 0.324 | 0.575 | 0.899 |
| 1.5 | 1 | Nominal | 0.005 | 0.046 | 0.051 | 0.325 | 0.574 | 0.899 |
| 1.5 | 2 | Z(t), empirical | 0.005 | 0.047 | 0.052 | 0.323 | 0.579 | 0.902 |
| 1.5 | 2 | L(t), empirical | 0.006 | 0.049 | 0.055 | 0.293 | 0.593 | 0.886 |
| 1.5 | 2 | Nominal | 0.004 | 0.046 | 0.051 | 0.326 | 0.573 | 0.899 |
| 1.8 | 0.5 | Z(t), empirical | 0.005 | 0.045 | 0.050 | 0.378 | 0.524 | 0.902 |
| 1.8 | 0.5 | L(t), empirical | 0.005 | 0.046 | 0.051 | 0.377 | 0.522 | 0.899 |
| 1.8 | 0.5 | Nominal | 0.005 | 0.046 | 0.051 | 0.382 | 0.518 | 0.899 |
| 1.8 | 1 | Z(t), empirical | 0.005 | 0.047 | 0.052 | 0.315 | 0.588 | 0.903 |
| 1.8 | 1 | L(t), empirical | 0.005 | 0.048 | 0.053 | 0.315 | 0.582 | 0.897 |
| 1.8 | 1 | Nominal | 0.005 | 0.046 | 0.051 | 0.314 | 0.585 | 0.899 |
| 1.8 | 2 | Z(t), empirical | 0.005 | 0.049 | 0.054 | 0.286 | 0.614 | 0.900 |
| 1.8 | 2 | L(t), empirical | 0.006 | 0.053 | 0.060 | 0.236 | 0.633 | 0.869 |
| 1.8 | 2 | Nominal | 0.004 | 0.046 | 0.051 | 0.285 | 0.615 | 0.899 |
| 2.0 | 0.5 | Z(t), empirical | 0.005 | 0.046 | 0.051 | 0.378 | 0.525 | 0.903 |
| 2.0 | 0.5 | L(t), empirical | 0.005 | 0.045 | 0.050 | 0.375 | 0.524 | 0.899 |
| 2.0 | 0.5 | Nominal | 0.005 | 0.046 | 0.051 | 0.377 | 0.523 | 0.899 |
| 2.0 | 1 | Z(t), empirical | 0.005 | 0.048 | 0.052 | 0.309 | 0.599 | 0.908 |
| 2.0 | 1 | L(t), empirical | 0.006 | 0.048 | 0.054 | 0.312 | 0.589 | 0.901 |
| 2.0 | 1 | Nominal | 0.005 | 0.046 | 0.051 | 0.307 | 0.593 | 0.899 |
| 2.0 | 2 | Z(t), empirical | 0.006 | 0.050 | 0.055 | 0.275 | 0.637 | 0.913 |
| 2.0 | 2 | L(t), empirical | 0.007 | 0.056 | 0.063 | 0.214 | 0.660 | 0.874 |
| 2.0 | 2 | Nominal | 0.004 | 0.046 | 0.051 | 0.285 | 0.615 | 0.899 |
Sample sizes under each scenario for the fixed sample tests recorded in Table 1 were calculated by formula (5) for the parametric test Z(τ) and by formula (9) for the log-rank test L(τ); the latter formula was given by Collett (2003). As shown in Table 1, the sample sizes for the two tests were very similar, both assuming π = 1, which is a favorable condition as discussed in the last paragraph of Section 3. For π not close to 1, the sample sizes for the two tests could be different. For example, take the conditions of Table 1 with κ = 0.5 and R = 1.5 but let π = 7/13 (i.e., π1 = 0.65 and π2 = 0.35); then the total sample size n from formula (9) is 1249, whereas that from formula (5) or (10) is 1289. The latter is close to the precise sample size, 1287, from the log-rank test using E(τ) (see the last paragraph of Section 3). The simulation results in Table 1 for the fixed sample tests showed that both the log-rank test L(τ) and the parametric test Z(τ) had adequate empirical type I error and power for moderate to large sample sizes. However, the log-rank test L(τ) was liberal and underpowered when the sample size was small. The type I error and power of the parametric test Z(τ) were satisfactorily close to the nominal type I error of 0.05 and power of 90% or 80%, respectively, even when the sample size was small. The difference in performance between the two tests with a small sample size may be explained by the fact that L(τ) is non-parametric and includes only counting data, whereas Z(τ) is parametric and includes the counting data d1(τ) and d2(τ) as well as the continuous data U1(τ) and U2(τ). The results for the two-stage SCPRT design (Table 2) showed again that the empirical type I error and power of both tests were close to the nominal level at each stage for moderate to large sample sizes. However, the log-rank test L(t) was liberal and underpowered when the sample size was small. The parametric test Z(t) performed better, with adequate empirical type I error and power at each stage.
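The comparison of formulas (9) and (10) in the unbalanced case can be reproduced with the short R sketch below, which encodes the two formulas in the forms implied by the approximations to E(τ) stated in Section 3 (the displays themselves are not reproduced above).

```r
## Hedged sketch comparing the log-rank sample size formulas (9) and (10) for the
## unbalanced example: kappa = 0.5, R = 1.5, pi1 = 0.65, ta = 5, tf = 2, 90% power.
event_prob <- function(median, kappa, ta, tf) {              # as in the Section 3 sketch
  lambda <- log(2) / median^kappa
  1 - integrate(function(t) exp(-lambda * t^kappa), tf, ta + tf)$value / ta
}
compare_formulas <- function(m1, m2, kappa, ta, tf, pi1, alpha = 0.05, beta = 0.10) {
  pi2 <- 1 - pi1
  p1 <- event_prob(m1, kappa, ta, tf); p2 <- event_prob(m2, kappa, ta, tf)
  P  <- pi1 * p1 + pi2 * p2                                  # combined failure probability
  A  <- (qnorm(1 - alpha) + qnorm(1 - beta))^2 / (kappa * log(m2 / m1))^2  # log(delta) = kappa * log(R)
  c(formula_9  = A / (pi1 * pi2 * P),
    formula_10 = A * P / (pi1 * pi2 * p1 * p2))
}
## returns approximately 1249 and 1289, in agreement with the values quoted above
compare_formulas(m1 = 1, m2 = 1.5, kappa = 0.5, ta = 5, tf = 2, pi1 = 0.65)
```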
To study the Brownian motion property of the statistic U(t) in equation (2), the empirical correlation matrix of the increments of U(t) at times t = 3, 4, 5, 6, 7 with a sample size of n = 100 per group was also computed from 100,000 simulations. All of the correlations were close to the theoretical value of zero (see Table 3).
Table 3.
Simulated empirical correlation matrix of the statistic U(t) based on 100,000 simulation runs with sample size n = 100.
Rows and columns are indexed by calendar time tk = 3, 4, 5, 6, 7; entries are the correlations among the increments of U(t) (upper triangle shown).

κ = 0.5:

| tk | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|
| 3 | 1 | 0.0092 | 0.0061 | 0.0059 | 0.0056 |
| 4 |  | 1 | 0.0036 | 0.0066 | 0.0070 |
| 5 |  |  | 1 | 0.0088 | 0.0087 |
| 6 |  |  |  | 1 | 0.0034 |
| 7 |  |  |  |  | 1 |

κ = 1:

| tk | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|
| 3 | 1 | 0.0089 | 0.0068 | 0.0060 | 0.0057 |
| 4 |  | 1 | 0.0060 | 0.0079 | 0.0075 |
| 5 |  |  | 1 | 0.0056 | 0.0062 |
| 6 |  |  |  | 1 | 0.0049 |
| 7 |  |  |  |  | 1 |

κ = 2:

| tk | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|
| 3 | 1 | 0.0024 | -0.0002 | -0.0012 | -0.0003 |
| 4 |  | 1 | -0.0004 | -0.0004 | 0.0010 |
| 5 |  |  | 1 | 0.0018 | 0.0027 |
| 6 |  |  |  | 1 | 0.0019 |
| 7 |  |  |  |  | 1 |
6 An Example
Rhabdoid tumors are aggressive pediatric malignancies with a poor prognosis. Over the past 5 years, St. Jude Children's Research Hospital accrued 14 pediatric patients with recurrent or refractory non-CNS rhabdoid tumors treated with conventional chemotherapy. The median event-free survival is only about 1 year, where the event is defined as disease relapse or death. All 14 patients had events within about 3 years. The Weibull model was fitted in Splus to the data, resulting in an estimate (standard error) of the shape parameter κ = 1.37 (0.28) and a median event-free survival time of m1 = 0.936 years; this provides a more satisfactory model than the exponential model (Wu, 2013). Now, suppose that we would like to design a multi-center randomized two-arm trial to assess the effectiveness of the small molecule inhibitor alisertib (treatment 2) versus conventional chemotherapy (treatment 1) for this group of patients. Patients will be randomized with equal allocation to each treatment group, and hence π = n2/n1 = 1. The hypotheses of the planned study are H0 : m2 ≤ m1 vs. H1 : m2 > m1. The investigators would like to detect a half-year increase in the median event-free survival of the alisertib treatment group over that of the conventional chemotherapy group, with 90% power, 5% type I error, and 2 years of follow-up (tf = 2) after the last patient is enrolled in the study. Then for the alternative hypothesis, m2 = m1 + 0.5 = 1.436, δ = λ1/λ2 = (m2/m1)κ = 1.797, and γ = log(δ) = 0.586. Assume this multi-center trial has the capacity to enroll and treat 20 patients per year. Then, under the assumption of the Weibull model with uniform entry and no loss to follow-up, the required total accrual time is ta = 5.3 years, calculated by equation (8), in which p1(τ) and p2(τ) are found by (7) with τ = ta + tf. Then the study duration is τ = ta + tf = 7.3 years, and the total sample size is 106 patients (53 per group). Now assuming interim and final looks are planned at calendar times t1 = 4, t2 = 5, and t3 = τ = 7.3 years, the corresponding information time at each planned interim look tk can be calculated by Ik = D(tk)/D(τ), where with
and π = 1. By calculation, the corresponding information times are I1 = 0.511, I2 = 0.706, and I3 = 1. Assuming a maximum conditional probability of discordance of ρ = 0.02, the boundary coefficient is a = 2.593 for K = 3 (Xiong et al., 2003), and the maximum probability of discordance is ρmax = 0.0043. That is, under the most unfavorable setting of the mean parameter, i.e., when the underlying true drift θI of the Brownian motion B(I) points to the cutoff point z1−α of the test at the final stage, on average only 4.3 of every 1,000 such sequential tests would have the conclusion reached at early stopping (efficacy or futility) reversed if the sequential test were not stopped as prescribed but continued to the planned end. The lower and upper boundaries calculated from (12) are (a1, a2, a3) = (−0.298, 0.124, 1.645) and (b1, b2, b3) = (1.979, 2.199, 1.645), respectively. The corresponding nominal critical significance levels for acceptance and rejection are (0.6615, 0.4415, 0.050) and (0.0028, 0.0044, 0.050), respectively.
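Using the boundary sketch from Section 4, these design quantities can be reproduced as follows (output shown as comments, rounded).

```r
## Reproducing the example design with the scprt_boundaries sketch from Section 4:
scprt_boundaries(info = c(0.511, 0.706, 1), a = 2.593, alpha = 0.05)
## approximately:
##    info  lower upper p_futility p_efficacy
## 1 0.511 -0.298 1.979     0.6615     0.0028
## 2 0.706  0.124 2.199     0.4415     0.0044
## 3 1.000  1.645 1.645     0.0500     0.0500
```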
We performed 100,000 simulation runs under the Weibull distribution to evaluate the operating characteristics of the proposed group sequential design. The empirical (nominal) type I error and power of the sequential test are 0.0514 (0.0505) and 0.9034 (0.8995), respectively. The empirical (nominal) probabilities of stopping under the null and alternative hypotheses are 0.3446 (0.3412) and 0.2557 (0.2554) at the first look, and 0.5902 (0.5817) and 0.4739 (0.4690) by the second look. The details of the operating characteristics for the proposed group sequential design are shown in Table 4.
Table 4.
The empirical operating characteristics of the three-stage SCPRT design for the example were estimated based on 100,000 simulation runs under the Weibull distribution.
| Quantity at the kth look | k = 1 | k = 2 | k = 3 | Total |
|---|---|---|---|---|
| Type I error, nominal | 0.0028 | 0.0031 | 0.0446 | 0.0505 |
| Type I error, empirical | 0.0031 | 0.0029 | 0.0453 | 0.0514 |
| Power, nominal | 0.2494 | 0.2066 | 0.4435 | 0.8995 |
| Power, empirical | 0.2499 | 0.2106 | 0.4429 | 0.9034 |
| Probability of stopping under the null, nominal | 0.3412 | 0.2405 | 0.4184 | 1.0000 |
| Probability of stopping under the null, empirical | 0.3446 | 0.2456 | 0.4098 | 1.0000 |
| Probability of stopping under the alternative, nominal | 0.2554 | 0.2136 | 0.5311 | 1.0000 |
| Probability of stopping under the alternative, empirical | 0.2557 | 0.2182 | 0.5261 | 1.0000 |

Expected stopping time and expected sample size:

|  | ET(0) | ET(θa)* | EN(0) | EN(θa) |
|---|---|---|---|---|
| Nominal | 0.7625 | 0.8124 | 81 | 87 |
| Empirical | 0.7593 | 0.8108 | 81 | 86 |
θa = z1−α + z1−β is the drift parameter under the alternative hypothesis.
7 Conclusion
A parametric sequential test statistic under the Weibull model is proposed. Simulation results showed that the proposed parametric sequential test Z(t) has better small-sample properties than the log-rank test L(t) under the Weibull model. A multi-stage group sequential procedure is given based on the SCPRT proposed by Xiong (1995). The maximum sample size of the sequential test is the same as the sample size of the fixed sample test, and the group sequential boundaries have analytical solutions. Therefore, the proposed group sequential procedure is attractive and simple to use. The study can be monitored by pre-planned multi-stage interim analyses to stop early for either efficacy or futility of the new treatment. Finally, if the hypothesis of a randomized phase III trial is two-sided, one can replace z1−α by z1−α/2 in equation (5) or (8) to obtain the sample size and in equation (12) to obtain the SCPRT boundaries for a group sequential trial design with a two-sided hypothesis.
Acknowledgments
This work was supported in part by National Cancer Institute (NCI) support grant CA21765 and the American Lebanese Syrian Associated Charities (ALSAC).
Appendix 1: Derivation of the Sequential Test Statistic Z(t)
First, for notational convenience, we convert (λ1, λ2) to (γ, λ), where γ = log(λ1/λ2) is the log hazard ratio and λ = λ2. Then the log-likelihood at calendar time t for (γ, λ) is given by
By solving the following score equations:
the maximum likelihood estimates of γ and λ are
The observed Fisher information matrix is given by
and then the variance of γ̂ can be estimated by , which is the (1,1) entry in the inverse of the Fisher information matrix j−1(γ̂, λ̂, t). Therefore, the Wald test statistic of γ̂(t) is given by
Under the null hypothesis H0: γ = 0,
has an approximate standard normal distribution.
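Since the displayed equations of this appendix are reproduced only in outline, a brief sketch of the calculation is as follows. With λ1 = λe^γ and λ2 = λ, the log-likelihood (up to an additive constant) is ℓ(γ, λ) = d1(t)(γ + log λ) − λe^γU1(t) + d2(t) log λ − λU2(t). Maximization gives λ̂e^γ̂ = d1(t)/U1(t) and λ̂ = d2(t)/U2(t), that is, γ̂(t) = log{d1(t)U2(t)/(d2(t)U1(t))}. The entries of the observed information evaluated at the maximum are −∂²ℓ/∂γ² = d1(t), −∂²ℓ/∂γ∂λ = d1(t)/λ̂, and −∂²ℓ/∂λ² = {d1(t) + d2(t)}/λ̂², so the (1,1) entry of its inverse is 1/d1(t) + 1/d2(t); hence Z(t) = γ̂(t)/{1/d1(t) + 1/d2(t)}^{1/2}.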
Appendix 2: Computation for the sequential test normalized with information time
Assumption
Let B(t) ∼ N(θt,t) be a Gaussian process with the time variable t on interval [0,1] and drift θ. Let 0 < t1 < … < tK = 1 be the information times of the looks for a sequential test with K looks. Let ak < bk be lower and upper boundaries for B(t) at time tk for k = 1,…, K − 1, and aK = bK.
Function
Define as a function of s on interval (ak, bk) for k = 1,…, K − 1; this series of functions can be determined recursively as follows. Let for s on (a1, b1); for k = 2,…, K, for s in (ak, bk),
(15)
where ϕ(·) in (15) is the density function of the standard normal distribution.
Function ltk(·)
Define ltk(s) as a function of s on interval (−∞, ak)∪(bk, ∞), for k = 1,…, K; this series of functions can be determined using functions defined in (15) as follows. Let lt1(s) = 1 for s on (−∞, a1) ∪ (b1, ∞); for k = 2,…, K, for s in (−∞, ak) ∪ (bk, ∞), let
(16)
where ϕ(·) in (16) is the density function of the standard normal distribution.
Power Function
For testing H0: θ ≤ 0 vs. Ha: θ > 0, with functions ltk(·) by (16), the power function P(θ) or the probability of rejecting H0 under the true mean θ is
(17)
For the sequential test design, the significance level is α = P(0) and the power is 1 − β = P(θa), where θa is the value of θ under Ha.
Probability of Stopping
The probability of stopping at time tk is a function of θ as
(18)
with which the probability of stopping at tk is Ptk (0) for the null hypothesis and Ptk (θa) for the alternative hypothesis.
Expected Stopping Time
With the probability of stopping Ptk(θ) from (18), the expected stopping time ET(θ) is a function of θ as
(19)
with which the expected stopping time is ET(0) for the null hypothesis and ET(θa) for the alternative hypothesis.
Expected Sample Size
Suppose the maximum sample size for the sequential test is n. The expected sample size for the sequential test is a function of θ and can be obtained by
(20)
with an expected sample size of EN(0) for the null hypothesis and EN(θa) for the alternative hypothesis.
For SCPRT design
To test H0: θ ≤ 0 vs. Ha: θ > 0 with significance level α and power 1 − β by an SCPRT design, the cutoff value at the final stage tK = 1 is aK = bK = z1−α, the drift under the null hypothesis is θ0 = 0, and the drift under the alternative hypothesis is θa = z1−α + z1−β; these are the same as for the fixed sample test at the final stage with information time t = 1. Substituting θ0 and θa into equations (17), (18), (19), and (20), we can compute the type I error, power, probability of stopping at a given tk, and expected sample sizes under the null and alternative hypotheses for the SCPRT design.
For details of the derivation of these computational formulas, please refer to Xiong and Tan (1999, 2001) and Xiong et al. (2002).
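As a numerical illustration, the following R sketch computes these operating characteristics by a standard recursive grid integration for a Brownian motion B(t) ∼ N(θt, t) observed at the information times t1 < … < tK with boundaries (ak, bk) and aK = bK = z1−α. It is a generic approximation and not necessarily the authors' exact formulas (15)-(20).

```r
## Hedged numerical sketch of the Appendix 2 quantities: rejection probability, stopping
## probabilities, and expected stopping (information) time for an SCPRT-type design.
scprt_oc <- function(theta, times, lower, upper, ngrid = 2000) {
  K <- length(times)
  stop_upper <- stop_lower <- numeric(K)
  # Stage 1: boundary-crossing probabilities and continuation density on (a_1, b_1)
  stop_upper[1] <- 1 - pnorm(upper[1], mean = theta * times[1], sd = sqrt(times[1]))
  stop_lower[1] <- pnorm(lower[1], mean = theta * times[1], sd = sqrt(times[1]))
  grid <- seq(lower[1], upper[1], length.out = ngrid)
  dens <- dnorm(grid, mean = theta * times[1], sd = sqrt(times[1]))
  for (k in 2:K) {
    dt <- times[k] - times[k - 1]
    h  <- grid[2] - grid[1]
    # probability of crossing at look k, integrating over the continuation region at look k - 1
    stop_upper[k] <- sum(dens * (1 - pnorm(upper[k], mean = grid + theta * dt, sd = sqrt(dt)))) * h
    stop_lower[k] <- sum(dens * pnorm(lower[k], mean = grid + theta * dt, sd = sqrt(dt))) * h
    if (k < K) {  # update the continuation density on (a_k, b_k)
      new_grid <- seq(lower[k], upper[k], length.out = ngrid)
      dens <- sapply(new_grid, function(s)
        sum(dens * dnorm(s, mean = grid + theta * dt, sd = sqrt(dt))) * h)
      grid <- new_grid
    }
  }
  p_stop <- stop_upper + stop_lower                    # probability of stopping at each look
  list(reject = sum(stop_upper),                       # power (type I error when theta = 0)
       p_stop = p_stop,
       expected_time = sum(p_stop * times))            # expected stopping (information) time
}

## Example of Section 6, with information times and boundaries as reported there:
times <- c(0.511, 0.706, 1)
lower <- c(-0.298, 0.124, 1.645); upper <- c(1.979, 2.199, 1.645)
scprt_oc(theta = 0, times, lower, upper)$reject                          # about 0.05 (type I error)
scprt_oc(theta = qnorm(0.95) + qnorm(0.90), times, lower, upper)$reject  # about 0.90 (power)
```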
References
- Anderson TW. An introduction to multivariate statistical analysis. New York: Wiley; 1958.
- Collett D. Modeling survival data in medical research. 2nd ed. London: Chapman and Hall; 2003.
- Cox DR, Oakes DV. Analysis of Survival Data. London: Chapman and Hall; 1984.
- Haybittle JL. Repeated assessment of results in clinical trials of cancer treatment. British Journal of Radiology. 1971;44:793–797. doi: 10.1259/0007-1285-44-526-793.
- Heo M, Faith MS, Allison DB. Power and sample size for survival analysis under the Weibull distribution when the whole lifespan is of interest. Mechanisms of Ageing and Development. 1998;102:45–53. doi: 10.1016/s0047-6374(98)00010-4.
- Jennison C, Turnbull BW. Group sequential methods with applications to clinical trials. New York: Chapman and Hall; 2000.
- Jiang Z, Wang L, Li C, Xia J, Jia H. A practical simulation method to calculate sample size of group sequential trials for time-to-event under exponential and Weibull distribution. PLOS ONE. 2012;7:1–12. doi: 10.1371/journal.pone.0044013.
- Lachin JM, Foulkes MA. Evaluation of sample size and power for analyses of survival with allowance for nonuniform patient entry, losses to follow-up, noncompliance, and stratification. Biometrics. 1986;42:507–519.
- Lan KKG, DeMets DL. Discrete sequential boundaries for clinical trials. Biometrika. 1983;70:659–663.
- Lu Q, Tse SK, Chow SC, Lin M. Analysis of time-to-event data with nonuniform patient entry and loss to follow-up under a two-stage seamless adaptive design with Weibull distribution. Journal of Biopharmaceutical Statistics. 2012;22:773–784. doi: 10.1080/10543406.2012.678528.
- O'Brien PC, Fleming TR. A multiple testing procedure for clinical trials. Biometrics. 1979;35:549–556.
- Pocock SJ. Group sequential methods in the design and analysis of clinical trials. Biometrika. 1977;64:191–199.
- Schoenfeld DA. Sample-size formula for the proportional-hazards regression model. Biometrics. 1983;39:499–503.
- Schoenfeld DA, Richter JR. Nomograms for calculating the number of patients needed for a clinical trial with survival as an endpoint. Biometrics. 1982;38:163–170.
- Sellke T, Siegmund D. Sequential analysis of the proportional hazards model. Biometrika. 1983;79:315–326.
- Slud EV. Sequential linear rank tests for two-sample censored survival data. Annals of Statistics. 1984;12:551–571.
- Tsiatis AA. Repeated significance testing for a general class of statistics used in censored survival analysis. Journal of the American Statistical Association. 1982;77:855–861.
- Tsiatis AA, Boucher H, Kim K. Sequential methods for parametric survival models. Biometrika. 1995;70:165–173.
- Whitehead J, Stratton I. Group sequential clinical trials with triangular continuation regions. Biometrics. 1983;39:227–236.
- Wu J. Power and sample size for randomized phase III survival trials under the Weibull model. Journal of Biopharmaceutical Statistics. 2013; in press. doi: 10.1080/10543406.2014.919940.
- Xiong X. A class of sequential conditional probability ratio tests. Journal of the American Statistical Association. 1995;90:1463–1473.
- Xiong X. A precise approach for sequential test design on comparing survival distributions by log-rank test. Unpublished manuscript; 2014.
- Xiong X, Tan M, Boyett J. Sequential conditional probability ratio tests for normalized test statistic on information time. Biometrics. 2003;59:624–631. doi: 10.1111/1541-0420.00072.