Author manuscript; available in PMC: 2022 Jan 6.
Published in final edited form as: Biom J. 2020 Jul 6;62(8):1960–1972. doi: 10.1002/bimj.201900236

Information fraction estimation based on the number of events within the standard treatment regimen

Ha M Dang 1,2,*, Todd Alonzo 1,2, Meredith Franklin 1, Wendy J Mack 1, Mark D Krailo 1,2, Sandrah P Eckel 1
PMCID: PMC7953992  NIHMSID: NIHMS1650963  PMID: 32627859

Abstract

For a Phase III randomized trial that compares survival outcomes between an experimental treatment vs. a standard therapy, interim monitoring analysis is used to potentially terminate the study early based on efficacy. To preserve the nominal Type I error rate, alpha spending methods and information fractions are used to compute appropriate rejection boundaries in studies with planned interim analyses. For a one-sided trial design applied to a scenario in which the experimental therapy is superior to the standard therapy, interim monitoring should provide the opportunity to stop the trial prior to full follow-up and conclude that the experimental therapy is superior. This paper proposes a method called Total Control Only (TCO) for estimating the information fraction based on the number of events within the standard treatment regimen. Based on theoretical derivations and simulation studies, for a maximum duration superiority design, the TCO method is not influenced by departure from the designed hazard ratio, is sensitive to detecting treatment differences, and preserves the Type I error rate compared to information fraction estimation methods that are based on total observed events. The TCO method is simple to apply, provides unbiased estimates of the information fraction, and does not rely on statistical assumptions that are impossible to verify at the design stage. For these reasons, the TCO method is a good approach when designing a maximum duration superiority trial with planned interim monitoring analyses.

Keywords: Interim monitoring analysis, Survival outcome trials, Information fraction, Group sequential analysis, Maximum duration clinical trials, Pediatric oncology

1. Introduction

When designing a Phase III randomized clinical trial with a survival outcome, investigators can follow either the maximum duration or the maximum information paradigm. In a maximum duration design, events are accumulated over a fixed follow-up time on a specified sample size, whereas in a maximum information design the trial is concluded when a pre-specified number of events has been observed. Because trials often operate under limited funding and a fixed duration, this paper focuses on maximum duration designs.

For many Phase III randomized clinical trials, it is standard to include trial efficacy monitoring and statistical approaches such as group sequential boundaries and alpha spending functions (Lan and DeMets, 1983; DeMets and Lan, 1994). This process requires an estimate of the information fraction $t_k$, which in a study with survival outcomes is defined as the ratio of the number of events observed thus far to the number expected by the end of the trial. Proschan et al. (2006) explain that using the calculation on events “forces the spending function one has chosen to be identical to one’s actual spending”. Two information fraction scales for interim analyses were considered by Kim et al. (1995), one based on the null hypothesis ($H_0$) of no treatment effect and the other based on a specified alternative hypothesis ($H_A$). Typically, analyses follow a schedule specified in the study protocol and are performed when the expected information fractions are reached.

Bias in $t_k$ estimates is often encountered with a maximum duration design, in which events are accumulated over a fixed follow-up time on a specified sample size. Several problems can arise when the experimental treatment demonstrates greater benefit while $t_k$ is estimated assuming the null hypothesis. Most notably, the total number of events observed at the end of the study is smaller than the total expected under the null hypothesis, because fewer events occur and less information is obtained in the experimental group. This results in conservative estimated $t_k$ values and underspending of the Type I error rate, as outlined by Kim et al. (1995). Reaching the expected total number of events would require additional follow-up time, which is not practical in maximum duration trials operating under fixed budgets and timelines.

Various methods have been suggested to reduce this bias when estimating $t_k$ values. Surrogate information fraction methods, such as exposure time (Lan and Lachin, 1990) or calendar time (Lan and DeMets, 1989; Lan et al., 1994), rely on several statistical assumptions that cannot be readily verified. Freidlin et al. (2016) propose an earliest information time (EIT) approach that allows for an earlier stopping time in a maximum information trial.

Others recommend using adjustment methods, such as spending all remaining alpha at the final analysis (Scharfstein et al., 1997; Proschan et al., 2006; Lan and DeMets, 2009) or modifying the boundary values when tk exceeds 1 before the trial end (Kim and Tsiatis, 1990; Kim et al., 1995; Proschan et al., 2006; Proschan and Nason, 2011). Some scholars advise that the final analysis should only occur when an estimated tk reaches 1 (Scharfstein et al., 1997; Lan and DeMets, 2009). This approach helps maintain the Type I error rate while the power of the test is increased. However, one runs the risk of not completing the trial within the funding period.

The aims of this paper are to introduce a novel method, called Total Control Only (TCO), and to examine the characteristics of the TCO method in a maximum duration design with respect to (1) improvements in the estimated information fraction, (2) maintenance of the designed statistical power, (3) preservation of the Type I error rate, and (4) reduction of the study duration by early termination due to efficacy.

2. Motivating Example

The Children’s Oncology Group (COG), a nonprofit organization conducting studies of biology and treatment of pediatric cancer funded by the NCI, has routinely incorporated the alpha spending approach in the interim monitoring of its studies. Anderson and High (2011) assert that when a true difference exists, it is almost always observed early in follow-up for many pediatric oncology trials. Therefore, an accurate estimate of the information fraction as a measure of a study progress is desirable to determine the right amount of significance level, via an alpha spending function for example, and reach the correct conclusion at each interim analysis.

For example, consider AEWS0031, a COG trial comparing standard therapy (temozolomide with irinotecan) to a new regimen incorporating standard therapy plus bevacizumab for children with recurrent medulloblastoma/primitive neuroectodermal tumor (PNET). The study was designed using a two-sided log-rank test at the 0.05 level, with 80% power to detect a hazard ratio of 0.64 favoring the new regimen. Formal interim monitoring for efficacy was to be performed per protocol using the 2nd power alpha spending function $\alpha_k(t_k) = \alpha t_k^2$ by Kim and DeMets (1987), for the first time 1.5 years after study entry and then annually until the study reached its enrollment target. The final analysis was planned for when the last enrolled patient completed the 2-year follow-up. This planned schedule translated into information fractions near 8%, 27%, 51%, 79%, and 100% of the expected total information under $H_0$ (163 events).
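As a rough check on how little alpha this spending function releases at early looks, the cumulative alpha at the planned information fractions can be worked out directly. The arithmetic below assumes the overall level $\alpha = 0.05$ from the design and uses the rounded planned fractions quoted above, so the values are approximate.

$$
\begin{aligned}
\alpha_1(0.08) &= 0.05 \times 0.08^2 \approx 0.0003 \\
\alpha_2(0.27) &= 0.05 \times 0.27^2 \approx 0.0036 \\
\alpha_3(0.51) &= 0.05 \times 0.51^2 \approx 0.0130 \\
\alpha_4(0.79) &= 0.05 \times 0.79^2 \approx 0.0312 \\
\alpha_5(1.00) &= 0.05 \times 1.00^2 = 0.0500
\end{aligned}
$$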

At the final analysis, the estimated hazard ratio was 0.69 (95% CI: 0.50–0.97; log-rank p-value: 0.03). It is apparent that the information fractions based on the expected number of events under $H_0$ were underestimated and the boundary values were unnecessarily conservative.

3. Proposed Method

3.1. Using the Information Within the Control Group

Without loss of generality, for the discussion in this section we assume survival time follows an exponential distribution with hazard λ and 1:1 randomization. We presume that a study has uniform enrollment over the interval [0,A], follow-up time F, study length L, random censoring, and no loss to follow-up or competing events. We propose to use the information, i.e., the number of events within the control group (or standard regimen), with a spending function approach to compute the amount of alpha spent and the boundary value at any kth interim analysis. Naturally, our proposed method is a measure of study progress assuming H0, and the rejection boundary based on the alpha spending function approach is computed under H0 to test H0 of no treatment effect. Specifically, the estimated information fraction is based on the information from a group for which clinical knowledge and treatment effect on survival are well established in the literature.

Let CO and TX denote the control group and experimental group, respectively. Then, the TCO information fraction is defined by

$$t_k^{CO} = \frac{E(d_{k,CO} \mid H_0)}{E(D_{K,CO} \mid H_0)}$$

where k = 1, 2, …, K indexes the planned analyses in order, $E(d_k)$ is the expected number of events at the kth interim analysis, and $E(D_K)$ is the total number of events expected by the end of the study. In practice, the information fraction is estimated by the number of observed events $d_k$ at the time of the interim analysis divided by the number of events expected by the end of the study, i.e.,

$$\hat{t}_k^{CO} = \frac{d_{k,CO}}{E(D_{K,CO} \mid H_0)}$$
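The estimator is a single division, which the following minimal sketch makes concrete. It assumes 1:1 randomization, so that the control arm is expected to contribute half of the total events under $H_0$. The function name, the `control_share` parameter, and the count of 24 control-arm events are hypothetical and chosen only for illustration; the total of 151 events is the $H_0$ design total reported in the Table 2 footnotes.

```python
def tco_information_fraction(control_events, expected_total_events_h0, control_share=0.5):
    """Estimated TCO information fraction.

    Observed control-arm events divided by the control-arm events expected
    by the end of the study under H0. control_share=0.5 encodes the assumed
    1:1 randomization (half of the H0 total is expected in the control arm).
    """
    expected_control_events = expected_total_events_h0 * control_share
    return control_events / expected_control_events


# Hypothetical interim look: 24 control-arm events, design total of 151 under H0
t_hat_co = tco_information_fraction(control_events=24, expected_total_events_h0=151)
print(round(t_hat_co, 3))  # 0.318
```

Only the control-arm count enters the calculation, so the estimate does not move if the experimental arm accrues events faster or slower than designed.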

In trials with limited funding and a fixed duration, it can be impractical to wait for the study to reach the expected number of events in scenarios where fewer events are observed on the experimental arm due to efficacy. Moreover, the actual direction of treatment outcome, efficacy or inefficacy, is unknown at the design stage. Therefore, the TCO method is useful in addressing the issue of under (or over) spending of the nominal alpha by using information within the control group only.

3.2. Connection of TCO to Standard Information Fraction Scales

In our paper, the two information fraction scales considered by Kim et al. (1995) are given by

$$t_k^{H_A} = \frac{E(d_{k,CO}) + E(d_{k,TX})}{E(D_K \mid H_A)}$$

and

$$t_k^{H_0} = \frac{E(d_{k,CO}) + E(d_{k,TX})}{E(D_K \mid H_0)}$$

The bias described in the Introduction arises when one uses an incorrect measure of the information fraction given the observed study data.

When a study follows H0, we have

$$t_k^{CO} = t_k^{H_0}$$

and

$$t_k^{H_A} > t_k^{H_0}$$

which means that the $t_k^{H_0}$ and $t_k^{CO}$ methods are equivalent in expectation, because $E(d_{k,CO}) = E(d_{k,TX})$ (albeit TCO has slightly larger variance); hence, they spend the same amount of significance level $\alpha_k$. The $t_k^{H_A}$ method, on the other hand, is an overestimate because $E(D_K \mid H_0) > E(D_K \mid H_A)$, and thus overspends the nominal significance level.
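The equality under $H_0$ can be written out in one line. The step below assumes 1:1 randomization, so that both arms have the same expected number of events under $H_0$ at every analysis, including the final one:

$$t_k^{H_0} = \frac{E(d_{k,CO} \mid H_0) + E(d_{k,TX} \mid H_0)}{E(D_K \mid H_0)} = \frac{2\,E(d_{k,CO} \mid H_0)}{2\,E(D_{K,CO} \mid H_0)} = t_k^{CO}$$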

When a study follows $H_A$, we have $t_k^{H_0} < t_k^{H_A}$, which means that the $t_k^{H_0}$ method underestimates the true measure of the information fraction, $t_k^{H_A}$. This is because both methods have the same numerator but different denominators, with $E(D_K \mid H_0)$ greater than $E(D_K \mid H_A)$.

In Appendix A.1, equation (1), we show the connection between $t_k^{H_A}$ and $t_k^{CO}$ when $H_A$ is true as

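$$t_k^{H_A} = \Omega_k \times \frac{E(d_{k,CO})}{E(D_{K,CO})} = \Omega_k \times t_k^{CO} \quad \text{for } k = 1, \ldots, K, \text{ and } \Omega_k \sim 1$$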

where $\Omega_k = \dfrac{1 + 1/\Delta_k}{1 + 1/\Delta_K}$ and $\Delta_k$ is the ratio of the expected number of events in the control group to the expected number of events in the experimental group (see equation (2) in Appendix A.2). As a study progresses and more information (events) is accumulated, we have $\Delta_k \to \Delta_K$; hence, $\Omega_k \sim 1$.
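To see how $\Omega_k$ approaches 1, the ratio can be evaluated from expected event counts. The sketch below is not the authors' code. It assumes exponential survival, hazards expressed per year, 1:1 randomization, no loss to follow-up, and uniform accrual over the whole 4-year study; the accrual period is an assumption, chosen because it gives expected totals close to those in the Table 2 footnotes. The function names are hypothetical.

```python
import math


def expected_event_fraction(hazard, s, accrual):
    """Expected fraction of one arm with an event by calendar time s,
    for exponential survival and uniform accrual over [0, accrual]."""
    m = min(s, accrual)  # latest entry time that can contribute by time s
    return (m - (math.exp(-hazard * (s - m)) - math.exp(-hazard * s)) / hazard) / accrual


def omega(hazard_co, hr, s, study_length, accrual):
    """Omega_k = (1 + 1/Delta_k) / (1 + 1/Delta_K), where Delta is the ratio of
    expected control events to expected experimental events (arm sizes cancel)."""
    def inv_delta(t):
        return (expected_event_fraction(hazard_co * hr, t, accrual)
                / expected_event_fraction(hazard_co, t, accrual))
    return (1 + inv_delta(s)) / (1 + inv_delta(study_length))


# Assumed design: lambda_CO = 0.805 per year, HR = 0.25, accrual over all 4 years
for years in (1, 2, 3, 4):
    print(years, round(omega(0.805, 0.25, years, study_length=4, accrual=4), 3))
```

These are ratios of expectations, so they need not match the simulated means reported in Table 1, particularly at early looks where few events have accrued.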

We use simulation to illustrate the values of $\Omega_k$ as a study progresses. Table 1 provides the mean values of $\Omega_k$ for several combinations of $\lambda_{CO}$ and hazard ratio (HR) values under 5000 simulated trials (each with n = 280). All simulated trials have a design similar to that described in Section 2. The values are chosen to explore how $\Omega_k$ varies in studies with small, moderate, or large $\lambda_{CO}$ and HR. In addition to the typical 3-analysis plan near 30%, 60%, and 100% information fraction (i.e., at 24, 36, and 48 months after study enrollment, depending on design), we also report early analysis times starting at about 2% information fraction (about 6 months) to study how $\Omega_k$ may vary when little information has accumulated. Both $\Omega_k$ and the true information fraction $t_k$ at each interim analysis time are calculated using the observed number of events per simulated trial, i.e., $t_k$ is the ratio of the observed number of events at the kth analysis to the total observed number of events at the end of the trial. The true information fraction $t_k$ is reported for comparison. The $\Omega_k$ values show greater departure from 1 when the true information fraction is small (e.g., less than 10%). This should not be a concern because in practice the first interim analysis is usually planned for $t_k \geq 25\%$. Stopping a study for efficacy when $t_k < 25\%$ requires an extremely large observed effect size, which is unlikely given the moderate effect sizes we typically expect in pediatric oncology trials. We observe $\Omega_k$ values near 1 as early as 16% in true $t_k$, depending on the setting. With the usual study assumptions, $t_k^{CO}$, in expectation, yields estimates close to $t_k^{H_A}$ for efficacy monitoring analyses, which are typically performed following either the (near) 50%, 75%, and 100% plan or the (near) 33%, 67%, and 100% plan. Source code to reproduce the results is available as Supporting Information on the journal’s web page (http://onlinelibrary.wiley.com/doi/10.1002/bimj.201900236/suppinfo).

Table 1.

Mean Ωk by Hazard Ratio (HR) and λCO Values, Assuming Exponential Survival Time.

λCO HR Quantity Calendar time (months): 6 12 18 24 30 36 48

0.805 0.25 True tk 0.028 0.100 0.205 0.334 0.482 0.644 1
0.805 0.25 Ωk 1.114 0.930 0.927 0.940 0.955 0.970 1
0.805 0.64 True tk 0.029 0.104 0.211 0.343 0.492 0.652 1
0.805 0.64 Ωk 1.089 0.973 0.963 0.968 0.975 0.983 1
0.805 0.80 True tk 0.030 0.107 0.217 0.349 0.498 0.659 1
0.805 0.80 Ωk 1.124 1.007 0.988 0.986 0.988 0.992 1

0.453 0.25 True tk 0.023 0.086 0.181 0.304 0.451 0.618 1
0.453 0.25 Ωk 1.288 1.015 0.966 0.966 0.974 0.982 1
0.453 0.64 True tk 0.023 0.087 0.184 0.308 0.455 0.622 1
0.453 0.64 Ωk 1.152 1.031 0.991 0.984 0.985 0.989 1
0.453 0.80 True tk 0.024 0.089 0.187 0.312 0.460 0.626 1
0.453 0.80 Ωk 1.145 1.059 1.010 0.999 0.995 0.996 1

0.255 0.25 True tk 0.020 0.076 0.164 0.282 0.427 0.597 1
0.255 0.25 Ωk 1.410 1.150 1.016 0.990 0.989 0.991 1
0.255 0.64 True tk 0.020 0.076 0.165 0.284 0.429 0.599 1
0.255 0.64 Ωk 1.194 1.113 1.027 1.006 0.999 0.997 1
0.255 0.80 True tk 0.020 0.077 0.167 0.286 0.432 0.601 1
0.255 0.80 Ωk 1.144 1.128 1.044 1.016 1.005 1.001 1

3.3. Departure from the Design Parameters

3.3.1. Departure from the Designed Hazard Ratio

For some clinical trials, the true HR is different from the one used in the trial design. When a trial is designed under the proportional hazards assumption with a given baseline survival time distribution such that $\lambda_{TX} = \lambda_{CO} \times HR$, a change in HR changes only the expected number of events in the experimental group. This change affects $t_k^{H_A}$ because $t_k^{H_A}$ takes into account events from both groups. Since $t_k^{CO}$ considers only events within the control group, and the control group has a well documented survival rate, $t_k^{CO}$ is not influenced by the change in the number of events in the experimental group. When the true hazard ratio departs from the designed one by an amount $\epsilon$, equation (1) (Appendix A.1) becomes

$$t_k^{H_A} = \frac{1 + 1/(\epsilon \Delta_k)}{1 + 1/(\epsilon \Delta_K)} \times \frac{E(d_{k,CO})}{E(D_{K,CO})} = \frac{1 + 1/(\epsilon \Delta_k)}{1 + 1/(\epsilon \Delta_K)} \times t_k^{CO}$$

which shows that TCO is not influenced by the information from the experimental group.

We use simulation to demonstrate the trend of bias in $t_k^{H_A}$ and $t_k^{H_0}$ and the stability of $t_k^{CO}$ as the designed HR departs from the true HR. Table 2 provides the mean values of the estimated information fraction for the $t_k^{H_A}$, $t_k^{H_0}$, and $t_k^{CO}$ methods under 5000 simulated trials with a design similar to that described in Section 2. Under each combination of $\lambda_{CO}$ and designed HR, simulated trials are generated for a range of increasing true HR values. The values are chosen to explore the variability and trend of all three estimation methods in studies with small, moderate, or large $\lambda_{CO}$, designed HR, and true HR. Here, we follow the typical 3-interim-analysis plan. For each design, we use the expected number of events under $H_0$ and $H_A$ to estimate the true information fraction. The true information fraction $t_k$ is reported for comparison and is computed using the actual total number of events. Source code to reproduce the results is available as Supporting Information on the journal’s web page (http://onlinelibrary.wiley.com/doi/10.1002/bimj.201900236/suppinfo).

Table 2.

Mean of the Estimated Information Fractions under Departure from the Designed Hazard Ratio

Designed HR λCO True HR Interim k True tk^a t^kH0 t^kHA t^kCO

0.25^b 0.805 0.45 1 0.34 0.29 0.40 0.36
0.25 0.805 0.45 2 0.65 0.54 0.76 0.67
0.25 0.805 0.45 3 1.00 0.84 1.17 1.00
0.25 0.805 0.64 1 0.35 0.32 0.44 0.36
0.25 0.805 0.64 2 0.65 0.60 0.83 0.67
0.25 0.805 0.64 3 1.00 0.91 1.28 1.00
0.25 0.805 0.72 1 0.35 0.33 0.46 0.36
0.25 0.805 0.72 2 0.66 0.61 0.86 0.67
0.25 0.805 0.72 3 1.00 0.94 1.31 1.00
0.25 0.805 0.80 1 0.35 0.34 0.47 0.36
0.25 0.805 0.80 2 0.66 0.63 0.88 0.67
0.25 0.805 0.80 3 1.00 0.96 1.34 1.00

0.45^c 0.112 0.25 1 0.27 0.17 0.23 0.27
0.45 0.112 0.25 2 0.58 0.37 0.49 0.58
0.45 0.112 0.25 3 1.00 0.64 0.85 1.00
0.45 0.112 0.64 1 0.27 0.22 0.30 0.27
0.45 0.112 0.64 2 0.58 0.48 0.65 0.58
0.45 0.112 0.64 3 1.00 0.84 1.11 1.00
0.45 0.112 0.72 1 0.27 0.23 0.31 0.27
0.45 0.112 0.72 2 0.58 0.51 0.68 0.58
0.45 0.112 0.72 3 1.00 0.87 1.16 1.00
0.45 0.112 0.80 1 0.27 0.24 0.32 0.27
0.45 0.112 0.80 2 0.58 0.53 0.71 0.58
0.45 0.112 0.80 3 1.00 0.91 1.21 1.00

0.64^d 0.453 0.25 1 0.30 0.21 0.23 0.32
0.64 0.453 0.25 2 0.62 0.42 0.48 0.63
0.64 0.453 0.25 3 1.00 0.68 0.77 1.00
0.64 0.453 0.45 1 0.30 0.24 0.27 0.32
0.64 0.453 0.45 2 0.62 0.49 0.56 0.63
0.64 0.453 0.45 3 1.00 0.79 0.90 1.00
0.64 0.453 0.72 1 0.31 0.28 0.32 0.32
0.64 0.453 0.72 2 0.62 0.57 0.64 0.63
0.64 0.453 0.72 3 1.00 0.91 1.03 1.00
0.64 0.453 0.80 1 0.31 0.29 0.33 0.32
0.64 0.453 0.80 2 0.63 0.59 0.67 0.63
0.64 0.453 0.80 3 1.00 0.94 1.06 1.00
^a Calculated using the true HR.
^b For a design with HR = 0.25 and a 2-year λCO = 0.805, N = 30, E(DK|HA) = 15, and E(DK|H0) = 21.
^c For a design with HR = 0.45 and a 2-year λCO = 0.112, N = 310, E(DK|HA) = 45, and E(DK|H0) = 60.
^d For a design with HR = 0.64 and a 2-year λCO = 0.453, N = 280, E(DK|HA) = 133, and E(DK|H0) = 151.

As anticipated, $t_k^{CO}$ is not influenced by departures from the designed HR, regardless of the $\lambda_{CO}$ value. Thus, it provides appropriate rejection boundaries, as opposed to $t_k^{H_A}$ and $t_k^{H_0}$. The convenience of using $t_k^{CO}$ to estimate $t_k$ in this scenario is that it removes both the problem of underestimation or overestimation and the need to adjust the last boundary value. In our simulation, the $t_k^{H_A}$ method overestimates $t_k$ when the true HR is greater than the designed HR and underestimates $t_k$ when the true HR is less than the designed HR. For all designs, the bias in $t_k^{H_A}$ grows as the true HR moves further from the designed HR and shrinks as the two approach each other. This behavior is expected because $t_k^{H_A}$ is unbiased under the designed $H_A$. In contrast, $t_k^{H_0}$ consistently underestimates the true $t_k$. The size of its bias decreases as the true HR approaches 1 (the null) because $t_k^{H_0}$ is unbiased under $H_0$. Examples in which the true HR is larger than the designed HR are presented for completeness. In this scenario, we would expect a futility analysis to stop a study early, as a well-designed clinical trial includes both efficacy and inefficacy monitoring.
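The pattern in Table 2 can be approximated without simulation, because each estimate is an event count divided by a fixed design constant. The sketch below is not the authors' code. It assumes exponential survival, hazards per year, 140 patients per arm, and uniform accrual over the full 4-year study; the accrual period is an assumption that gives totals close to the 151 and 133 events in footnote d. The function names are hypothetical.

```python
import math


def expected_events(n_arm, hazard, s, accrual):
    """Expected events in one arm by calendar time s (exponential survival,
    uniform accrual over [0, accrual], administrative censoring only)."""
    m = min(s, accrual)
    frac = (m - (math.exp(-hazard * (s - m)) - math.exp(-hazard * s)) / hazard) / accrual
    return n_arm * frac


def expected_fractions(true_hr, s, hazard_co=0.453, n_arm=140, accrual=4.0,
                       total_h0=151, total_ha=133):
    """Expected value of each estimated information fraction at calendar time s.
    total_h0 and total_ha are the design totals; only the true HR varies."""
    d_co = expected_events(n_arm, hazard_co, s, accrual)
    d_tx = expected_events(n_arm, hazard_co * true_hr, s, accrual)
    return {
        "H0": (d_co + d_tx) / total_h0,
        "HA": (d_co + d_tx) / total_ha,
        "CO": d_co / (total_h0 / 2),  # control arm expects half of the H0 total
    }


# Final analysis (4 years) for a design with HR = 0.64 and varying true HR
for true_hr in (0.25, 0.45, 0.72, 0.80):
    row = expected_fractions(true_hr, s=4.0)
    print(true_hr, {name: round(value, 2) for name, value in row.items()})
```

Under these assumptions the control-based value is the same for every true HR, while the two total-event values move with the true HR, which is the behavior reported in Table 2.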

3.3.2. Departure from the Designed Hazard Rate of the Control Group

In considering the TCO method, we assume the survival rate in the control arm to be the best estimate of the true rate. However, when a significant departure from the designed baseline rate $\lambda_{CO}$ occurs and the proportional hazards assumption holds, all three information fraction methods are affected. Intuitively, the bias in all three methods is more severe when the designed $\lambda_{CO}$ is far from the true $\lambda_{CO}$ and smaller when the departure is mild. Using a design and simulation similar to those in Section 3.3.1, Table 3 presents the mean estimated information fractions from 5000 simulated trials to explore the trend and size of the bias when the designed $\lambda_{CO}$ is approximately ±5%, ±20%, and ±50% away from the true $\lambda_{CO}$. For mild departures such as ±5%, the biases observed in $\hat{t}_k^{H_A}$ and $\hat{t}_k^{CO}$ are negligible. For departures larger than ±5%, the observed biases are significant. We note that for this particular design, it takes a departure of around 20% for $\hat{t}_k^{H_0}$ to provide an unbiased estimate of the true $t_k$. Similar trends and sizes of bias are observed in simulations of other designs and are reported in Supplemental Materials 1.1. Source code to reproduce the results is available as Supporting Information on the journal’s web page (http://onlinelibrary.wiley.com/doi/10.1002/bimj.201900236/suppinfo).

Table 3.

Mean of the Estimated Information Fractions under Departure from the Control Rate

HR Designed λCO True λCO Interim k True tk a t^kH0 t^kHA t^kCO

0.64 0.453 0.255 1 0.28 0.17 0.19 0.20
0.64 0.453 0.255 2 0.60 0.36 0.40 0.42
0.64 0.453 0.255 3 1.00 0.59 0.67 0.69

0.64 0.453 0.362 1 0.30 0.23 0.26 0.27
0.64 0.453 0.362 2 0.61 0.47 0.53 0.54
0.64 0.453 0.362 3 1.00 0.76 0.86 0.88

0.64 0.453 0.430 1 0.31 0.26 0.30 0.31
0.64 0.453 0.430 2 0.62 0.53 0.60 0.61
0.64 0.453 0.430 3 1.00 0.85 0.97 0.97

0.64 0.453 0.476 1 0.31 0.28 0.32 0.33
0.64 0.453 0.476 2 0.63 0.57 0.64 0.65
0.64 0.453 0.476 3 1.00 0.90 1.03 1.03

0.64 0.453 0.544 1 0.32 0.31 0.35 0.36
0.64 0.453 0.544 2 0.63 0.62 0.70 0.71
0.64 0.453 0.544 3 1.00 0.98 1.11 1.10

0.64 0.453 0.693 1 0.33 0.37 0.42 0.43
0.64 0.453 0.693 2 0.65 0.71 0.81 0.81
0.64 0.453 0.693 3 1.00 1.11 1.26 1.23
^a True tk calculated using the true λCO at 2 years, for a design with HR = 0.64 and λCO = 0.453, N = 280, E(DK|HA) = 133, and E(DK|H0) = 151.

A well-designed clinical trial should include a periodic evaluation of the event rate in the control group. This presents an opportunity to make recommendations to the study committee regarding study design modifications due to early evidence of significant departure from the initially designed control rate. When the study team encounters this scenario, they may want to consider information fraction adjustment methods, such as those proposed by Kim and Tsiatis (1990) and Proschan and Nason (2011), in order to maintain the overall type I error rate given the updated control rate and the amount of type I error rate that has already been spent. This issue will be investigated fully in a separate publication.

4. A Worked Example based on the Motivating Example

In Section 2 we provided a motivating example based on the Phase III trial AEWS0031. The information fractions were estimated under $H_0$; however, the observed data demonstrated efficacy. In this section, we consider what might have happened if $t_k^{CO}$ had been applied.

Table 4 displays the time of each interim analysis, the true information fraction $t_k$, the observed p-value and hazard ratio (95% CI) at each interim analysis, and the cumulative alpha values under two methods of estimating the information fraction: $\hat{t}_k^{H_0}$ under $H_0$ and $\hat{t}_k^{CO}$. We also report $\hat{t}_k^{H_A}$ and its associated cumulative alpha values for completeness. The efficacy interim monitoring analyses were performed by comparing the observed p-values to the cumulative alpha values corresponding to $\hat{t}_k^{H_0}$ under $H_0$. The study completed enrollment in August 2005. The DSMB recommended that study results be released in Fall 2007. It is evident that, compared to the true information fraction $t_k$, $\hat{t}_k^{H_0}$ was unnecessarily conservative in measuring the information fraction, whereas $\hat{t}_k^{CO}$ was not. Notably, in the final analysis, $\hat{t}_k^{CO}$ reached a 100% information fraction, whereas $\hat{t}_k^{H_0}$ was 87%; thus, $\hat{t}_k^{H_0}$ should have had an adjustment to the final critical value to maintain the nominal 0.05 level.

Table 4.

Interim monitoring of study AEWS0031 and the Comparison of the Observed P Value to Two Information Fraction Method Cumulative Alpha Values

Time true tk P value Est. HR (95% CI) t^kH0 t^kH0 Cum. α t^kCO t^kCO Cum. α t^kHA t^kHA Cum. α

Fall 2002 0.070 0.191 0.42 (0.11–1.62) 0.061 0.0002 0.086 0.0004 0.072 0.0003
Fall 2003 0.246 0.369 0.74 (0.38–1.44) 0.215 0.0023 0.245 0.0030 0.254 0.0032
Fall 2004 0.423 0.373 0.79 (0.48–1.32) 0.368 0.0068 0.405 0.0082 0.435 0.0095
Fall 2005 0.697 0.034 0.65 (0.44–0.97) 0.607 0.0184 0.724 0.0262 0.717 0.0257
Fall 2007 1.000 0.029 0.69 (0.50–0.97) 0.871 0.0379 1.006 0.0506 1.029 0.0529
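The cumulative alpha columns of Table 4 follow from the 2nd power spending function applied to each estimated information fraction. The sketch below recomputes them for the $H_0$-based and control-based estimates and compares each observed p-value with the cumulative alpha, as the text describes. It assumes an overall level of 0.05 and leaves fractions above 1 uncapped, which matches the tabulated value of 0.0506; the variable and function names are hypothetical.

```python
ALPHA = 0.05  # overall significance level from the design


def cumulative_alpha(t_hat, alpha=ALPHA):
    """2nd power spending function: cumulative alpha spent at fraction t_hat."""
    return alpha * t_hat ** 2


# (time, observed p value, estimated fraction under H0, TCO estimate), from Table 4
looks = [
    ("Fall 2002", 0.191, 0.061, 0.086),
    ("Fall 2003", 0.369, 0.215, 0.245),
    ("Fall 2004", 0.373, 0.368, 0.405),
    ("Fall 2005", 0.034, 0.607, 0.724),
    ("Fall 2007", 0.029, 0.871, 1.006),
]

for time, p_value, t_h0, t_co in looks:
    a_h0 = cumulative_alpha(t_h0)
    a_co = cumulative_alpha(t_co)
    print(f"{time}: p={p_value:.3f}  "
          f"H0 alpha={a_h0:.4f} (p below: {p_value < a_h0})  "
          f"CO alpha={a_co:.4f} (p below: {p_value < a_co})")
```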

5. Operating Characteristics by Simulation Studies

For the following discussion, we use simulations to explore the operating characteristics of group sequential log-rank tests based on the TCO method in comparison to the $t_k^{H_0}$ and $t_k^{H_A}$ methods in terms of the magnitude and direction of bias, significance level, power, average stopping time, and the probability of early stopping under $H_A$. We simulated the conduct of a typical Phase III superiority maximum duration trial in pediatric oncology with staggered entry and efficacy monitoring while accruing patients. We chose to apply each information fraction estimation method to the same set of simulation samples, so that we could best compare cases in which one method fails to stop the “trial” but another method stops it. Supplemental Materials 1.2 summarizes the sample size and expected number of events assuming exponential survival times, a one-sided significance level of α = 0.05 with at least 80% power, 4 years duration, and 1:1 randomization via the Stata package ART (Friederike M.-S. et al., 2005).

To resemble a typical administrative reporting cycle, there was a maximum of three planned analyses during the fixed 4-year duration. The first analysis was assumed to take place 2 years after enrollment started, with subsequent analyses at 3 and 4 years. Boundaries were generated according to the 2nd power spending function $\alpha_k(t_k) = \alpha t_k^2$ at each of the three looks. In this simulation, the increment between any 2 consecutive estimated information fractions was restricted to 0.10 (in value), which guaranteed that the boundary value at one interim analysis look was not larger than that at the previous interim analysis look (Proschan, 1999). Simulation details are provided in Supplemental Materials 1.3. Source code to reproduce the results is available as Supporting Information on the journal’s web page (http://onlinelibrary.wiley.com/doi/10.1002/bimj.201900236/suppinfo).

Table 5 provides the estimated information fractions using the three methods in comparison to the true information fraction $t_k$ under $H_0$ and $H_A$ in 5000 simulation samples. As expected, the average estimated information fractions from the $\hat{t}_k^{CO}$ method were near the true information fraction $t_k$, regardless of whether the data followed $H_0$ or $H_A$. The $\hat{t}_k^{H_A}$ and $\hat{t}_k^{H_0}$ methods each provided good measures of the true information fraction $t_k$ only when a particular hypothesis was true. Specifically, $\hat{t}_k^{H_A}$ performed well when $H_A$ was true and $\hat{t}_k^{H_0}$ performed well when $H_0$ was true. Consequently, the use of $\hat{t}_k^{H_0}$ when the observed data followed $H_A$, or of $\hat{t}_k^{H_A}$ when the observed data followed $H_0$, required a boundary adjustment at the final analysis due to underspending or overspending of the alpha value.

Table 5.

Descriptive Statistics of the Estimated Information Fraction using Different Methods at Three Interim Looks Following a 33%-66%-100% Schedule.

λCO HR k Method | Under H0: tk Mean(SD) Min-Max | Under HA: tk Mean(SD) Min-Max

0.255 0.64 1 t^kH0 0.29 0.29(0.04) 0.15–0.45 0.29 0.24(0.04) 0.11–0.39
0.255 0.64 1 t^kHA 0.29 0.34(0.05) 0.18–0.53 0.29 0.28(0.04) 0.13–0.46
0.255 0.64 1 t^kCO 0.29 0.29(0.06) 0.11–0.54 0.29 0.29(0.06) 0.11–0.50

0.255 0.64 2 t^kH0 0.60 0.60(0.05) 0.39–0.80 0.60 0.51(0.05) 0.34–0.70
0.255 0.64 2 t^kHA 0.60 0.70(0.06) 0.46–0.93 0.60 0.60(0.06) 0.40–0.82
0.255 0.64 2 t^kCO 0.60 0.60(0.08) 0.36–0.89 0.60 0.60(0.08) 0.36–0.88

0.255 0.64 3 t^kH0 1.00 0.99(0.06) 0.76–1.22 1.00 0.85(0.06) 0.64–1.06
0.255 0.64 3 t^kHA 1.00 1.16(0.07) 0.90–1.43 1.00 1.00(0.07) 0.75–1.25
0.255 0.64 3 t^kCO 1.00 0.99(0.09) 0.66–1.30 1.00 0.99(0.09) 0.70–1.38

0.453 0.64 1 t^kH0 0.32 0.32(0.04) 0.17–0.48 0.31 0.27(0.04) 0.13–0.43
0.453 0.64 1 t^kHA 0.32 0.36(0.05) 0.20–0.55 0.31 0.31(0.04) 0.15–0.49
0.453 0.64 1 t^kCO 0.32 0.32(0.06) 0.15–0.54 0.31 0.32(0.06) 0.12–0.56

0.453 0.64 2 t^kH0 0.64 0.64(0.05) 0.46–0.83 0.62 0.55(0.05) 0.38–0.77
0.453 0.64 2 t^kHA 0.63 0.72(0.06) 0.52–0.94 0.62 0.62(0.06) 0.44–0.88
0.453 0.64 2 t^kCO 0.63 0.64(0.07) 0.40–0.93 0.62 0.63(0.07) 0.40–0.89

0.453 0.64 3 t^kH0 1.00 1.00(0.06) 0.79–1.20 1.00 0.88(0.05) 0.70–1.08
0.453 0.64 3 t^kHA 1.00 1.14(0.06) 0.90–1.36 1.00 1.00(0.06) 0.80–1.23
0.453 0.64 3 t^kCO 1.00 1.00(0.08) 0.72–1.27 1.00 1.00(0.08) 0.73–1.26

0.805 0.64 1 t^kH0 0.36 0.36(0.04) 0.22–0.51 0.34 0.31(0.04) 0.17–0.47
0.805 0.64 1 t^kHA 0.36 0.39(0.05) 0.24–0.56 0.34 0.34(0.05) 0.19–0.52
0.805 0.64 1 t^kCO 0.36 0.36(0.06) 0.14–0.56 0.34 0.36(0.06) 0.15–0.62

0.805 0.64 2 t^kH0 0.67 0.67(0.05) 0.51–0.84 0.65 0.60(0.05) 0.43–0.77
0.805 0.64 2 t^kHA 0.67 0.73(0.05) 0.56–0.92 0.65 0.65(0.05) 0.47–0.85
0.805 0.64 2 t^kCO 0.67 0.67(0.06) 0.42–0.93 0.65 0.66(0.07) 0.44–0.92

0.805 0.64 3 t^kH0 1.00 1.00(0.04) 0.84–1.16 1.00 0.91(0.05) 0.75–1.08
0.805 0.64 3 t^kHA 1.00 1.10(0.05) 0.92–1.28 1.00 1.00(0.05) 0.82–1.18
0.805 0.64 3 t^kCO 1.00 1.00(0.06) 0.78–1.22 1.00 1.00(0.06) 0.75–1.22
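A small simulation in the spirit of Table 5 can be sketched with the standard library alone. This is not the authors' code. It assumes exponential survival with hazards per year, 140 patients per arm, uniform accrual over the full 4-year study, looks at 2, 3, and 4 years, and the design totals of 151 ($H_0$) and 133 ($H_A$) events from the Table 2 footnotes. The function name, the number of trials, and the seed are arbitrary choices.

```python
import random
import statistics


def simulate_fractions(hr, hazard_co=0.453, n_arm=140, accrual=4.0,
                       looks=(2.0, 3.0, 4.0), total_h0=151, total_ha=133,
                       n_trials=500, seed=1):
    """Mean and SD of the three estimated information fractions at each look."""
    rng = random.Random(seed)
    estimates = {name: [[] for _ in looks] for name in ("H0", "HA", "CO")}
    for _ in range(n_trials):
        # Calendar time of each patient's event = entry time + survival time
        co = [rng.uniform(0, accrual) + rng.expovariate(hazard_co)
              for _ in range(n_arm)]
        tx = [rng.uniform(0, accrual) + rng.expovariate(hazard_co * hr)
              for _ in range(n_arm)]
        for i, s in enumerate(looks):
            d_co = sum(t <= s for t in co)  # control events observed by look s
            d_tx = sum(t <= s for t in tx)  # experimental events observed by look s
            estimates["H0"][i].append((d_co + d_tx) / total_h0)
            estimates["HA"][i].append((d_co + d_tx) / total_ha)
            estimates["CO"][i].append(d_co / (total_h0 / 2))
    return {name: [(statistics.mean(v), statistics.stdev(v)) for v in per_look]
            for name, per_look in estimates.items()}


# Data generated under the alternative (true HR = 0.64)
result = simulate_fractions(hr=0.64)
for name, per_look in result.items():
    print(name, [f"{mean:.2f}({sd:.2f})" for mean, sd in per_look])
```

Because every method is applied to the same simulated trials, differences between the rows reflect only the choice of numerator and denominator, not sampling noise between runs.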

Table 6 provides the Type I error when $H_0$ was true and the power when $H_A$ was true, in 5000 simulation samples, at each of the three looks. When $H_A$ was true, the observed powers of the $\hat{t}_k^{CO}$ method were comparable to those of the $\hat{t}_k^{H_A}$ method for all looks, whereas the observed powers of the $\hat{t}_k^{H_0}$ method were less accurate for the 1st and 2nd looks. When $H_0$ was true, the observed Type I error rates at the final look were nearly identical for all three methods. The observed Type I error rate was slightly elevated for the first two looks when comparing the $\hat{t}_k^{CO}$ method to the $\hat{t}_k^{H_0}$ method. Because we encountered under- and overspending of alpha during our simulation runs, we adopted the spending all alpha method of Proschan et al. (2006) to adjust the final boundary value for each information fraction approach separately, which may have affected the observed Type I error rate and power. Specifically, we applied the adjustment only to the simulated data sets that were not stopped for efficacy at either the 1st or 2nd interim analysis. This reflects how adjustment of the final boundary value would occur in practice to maintain the overall Type I error rate in case of under- or overspending of alpha.
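The effect of spending all remaining alpha at the final look can be illustrated on the cumulative alpha scale. The sketch below is a simplified reading of that adjustment, not the procedure from Proschan et al. (2006) in full: it sets the final cumulative alpha to the nominal level and, as a further assumption, caps estimated fractions at 1 before applying the 2nd power spending function. The function name is hypothetical; the example fractions are the mean $H_0$-based estimates under $H_A$ from Table 5 (λCO = 0.255).

```python
def cumulative_alpha_schedule(t_hats, alpha=0.05, spend_all_at_final=True):
    """Cumulative alpha at each look under alpha * t^2.

    t_hats: estimated information fractions in look order.
    spend_all_at_final: if True, the last look spends whatever alpha remains,
    so the cumulative alpha at the final look equals the nominal alpha.
    """
    cumulative = [alpha * min(t, 1.0) ** 2 for t in t_hats]  # cap at 1 (assumption)
    if spend_all_at_final:
        cumulative[-1] = alpha
    return cumulative


t_hats = [0.24, 0.51, 0.85]  # mean H0-based estimates under HA, Table 5
unadjusted = cumulative_alpha_schedule(t_hats, spend_all_at_final=False)
adjusted = cumulative_alpha_schedule(t_hats)

print("unadjusted:", [round(a, 4) for a in unadjusted])
print("adjusted:  ", [round(a, 4) for a in adjusted])
print("alpha left unspent without adjustment:", round(0.05 - unadjusted[-1], 4))
```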

Table 6.

Percentage of Time that the Study was Stopped for Efficacy using Different Information Fraction Methods (5000 Simulation Samples) Following a 33%-66%-100% Schedule.

Design Interim k | Under HA: t^kH0 t^kHA t^kCO | Under H0: t^kH0 t^kHA t^kCO

λCO = 0.255, HR=0.64 1 7.6 9.7 11.4 0.4 0.5 0.7
λCO = 0.255, HR=0.64 2 39.6 44.0 44.9 2.1 2.9 2.9
λCO = 0.255, HR=0.64 3 80.9 80.5 80.7 5.5 5.6 5.8

λCO = 0.453, HR=0.64 1 9.5 11.0 13.2 0.3 0.5 0.6
λCO = 0.453, HR=0.64 2 43.0 47.2 47.9 1.8 2.3 2.4
λCO = 0.453, HR=0.64 3 81.6 81.0 81.2 4.5 4.5 4.6

λCO = 0.805, HR=0.64 1 12.9 14.4 16.3 0.3 0.5 0.7
λCO = 0.805, HR=0.64 2 45.9 49.0 49.8 2.0 2.7 2.6
λCO = 0.805, HR=0.64 3 79.8 79.4 79.6 4.9 5.0 5.0

Table 7 provides the distribution of the number of rejections among the simulation samples that were stopped due to efficacy, and the overall stopping time, when the alternative hypothesis was true. Ideally, we expect a study to be stopped as soon as there is evidence of “efficacy”. In our simulation, the $\hat{t}_k^{CO}$ method appeared more sensitive in detecting a treatment effect early on than $\hat{t}_k^{H_0}$ and $\hat{t}_k^{H_A}$. Overall, nearly the same average stopping time was observed for all three methods. Additional results are presented in Supplemental Materials 1.4, Tables 3, 4, and 5.

Table 7.

Distribution of the Number of Rejections (%) Due to Efficacy and Average Stopping Times using Different Information Fraction Methods Following a 33%-66%-100% Schedule.

Under HA

| Design | Interim k | $\hat{t}_k^{H0}$ | $\hat{t}_k^{HA}$ | $\hat{t}_k^{CO}$ |
|---|---|---|---|---|
| λCO = 0.255, HR = 0.64 | 1 | 9.4 | 12.0 | 14.1 |
| λCO = 0.255, HR = 0.64 | 2 | 39.6 | 42.6 | 41.5 |
| λCO = 0.255, HR = 0.64 | 3 | 51.1 | 45.4 | 44.4 |
| λCO = 0.255, HR = 0.64 | Avg. stopping time | 3.4 | 3.3 | 3.3 |
| λCO = 0.453, HR = 0.64 | 1 | 11.7 | 13.6 | 16.3 |
| λCO = 0.453, HR = 0.64 | 2 | 41.1 | 44.7 | 42.7 |
| λCO = 0.453, HR = 0.64 | 3 | 47.2 | 41.7 | 41.0 |
| λCO = 0.453, HR = 0.64 | Avg. stopping time | 3.4 | 3.3 | 3.2 |
| λCO = 0.805, HR = 0.64 | 1 | 16.2 | 18.1 | 20.5 |
| λCO = 0.805, HR = 0.64 | 2 | 41.3 | 43.6 | 42.1 |
| λCO = 0.805, HR = 0.64 | 3 | 42.5 | 38.3 | 37.4 |
| λCO = 0.805, HR = 0.64 | Avg. stopping time | 3.3 | 3.2 | 3.2 |
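The Table 7 percentages appear to follow from the Table 6 power columns if the Table 6 entries are read as cumulative percentages of simulations stopped by look k. That reading is an assumption of this sketch, not something stated in the text. Under it, the share of rejections occurring at look k is the increment in the cumulative percentage divided by the final cumulative percentage. The function and variable names below are illustrative only.

```python
# Cumulative % of simulations stopped for efficacy under HA,
# copied from Table 6 for the design with lambda_CO = 0.255, HR = 0.64.
cumulative_pct = {
    "H0": [7.6, 39.6, 80.9],
    "HA": [9.7, 44.0, 80.5],
    "CO": [11.4, 44.9, 80.7],
}

# Corresponding Table 7 rows (share of all rejections at each look, %).
table7_pct = {
    "H0": [9.4, 39.6, 51.1],
    "HA": [12.0, 42.6, 45.4],
    "CO": [14.1, 41.5, 44.4],
}


def rejection_distribution(cumulative):
    """Convert cumulative stopping percentages into the percentage of
    rejections that occurred at each look (assumed interpretation)."""
    total = cumulative[-1]
    previous = [0.0] + cumulative[:-1]
    return [round(100.0 * (c - p) / total, 1) for c, p in zip(cumulative, previous)]


distribution = {m: rejection_distribution(v) for m, v in cumulative_pct.items()}

for method, shares in distribution.items():
    print(method, shares, "Table 7:", table7_pct[method])
```

Because the Table 6 inputs are already rounded to one decimal, the recomputed shares can differ from the published Table 7 values in the last digit.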

As mentioned earlier, trial planning includes deciding when and how often to conduct interim analyses. For example, there may be two analyses in total with an interim analysis near 50% information fraction, or three analyses in total including two interim analyses near 33% and 66% or near 50% and 75%. For practical purposes, we also examined the characteristics of the TCO method compared to other information fraction estimation methods when the analyses are timed near 50%, 75%, and 100%. The TCO method showed similar characteristics, as seen in Supplemental Materials 1.5, Tables 6, 7, and 8. Under the proportional hazards assumption, Dang (2015) showed that these same conclusions can be drawn from simulations that considered more interim looks (K=5), other types of alpha spending function, such as the O’Brien-Fleming spending function, a nonmixture cure model with Weibull kernel survival times, and other covariance structures.

6. Summary

In this paper, we introduce the TCO method, which differs from existing methods in that the information fraction is defined entirely within the control arm. As the rejection boundary based on the alpha spending function approach is computed under H0, the TCO method is a natural measure of study progress assuming H0 to test the null hypothesis of no treatment effect. We showed that under H0, the TCO method and $t_k^{H0}$ are equivalent in expectation and hence spend the same amount of $\alpha_k$. Under HA, the TCO method yields estimates that are close to $t^{HA}$ at any kth analysis, particularly when more information becomes available as a study progresses. By simulation studies, we have illustrated that all three information fraction methods are influenced by deviation from the control rate; however, the TCO method is not sensitive to hazard ratio deviation.
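To illustrate why the estimated information fraction determines the amount of alpha spent at look k, the following is a minimal sketch of a power-family spending function of the form α t², the form named in the limitations paragraph of this paper. The overall level `alpha = 0.05` and the function name are assumptions made for illustration; computing the actual rejection boundaries requires the joint distribution of the test statistics and is not attempted here.

```python
def power_spending(t, alpha=0.05, rho=2.0):
    """Cumulative Type I error spent at information fraction t
    under a power-family spending function alpha * t**rho.
    alpha = 0.05 is an assumed overall level, not taken from the paper."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("information fraction must lie in [0, 1]")
    return alpha * t ** rho


# Planned 33%-66%-100% schedule.
fractions = [0.33, 0.66, 1.0]
cumulative = [power_spending(t) for t in fractions]
increments = [c - p for c, p in zip(cumulative, [0.0] + cumulative[:-1])]

for t, c, inc in zip(fractions, cumulative, increments):
    print(f"t = {t:.2f}  cumulative alpha = {c:.5f}  spent at this look = {inc:.5f}")

# An overestimated fraction at the first look spends more alpha early.
overspend = power_spending(0.40) - power_spending(0.33)
```

In this sketch an estimate of 0.40 instead of 0.33 at the first look spends more alpha early and leaves less for later looks, which is the kind of over- or underspending that a final-boundary adjustment is meant to correct.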

In our simulations, which implemented a maximum duration design setting, we demonstrated that the TCO method has better sensitivity in detecting a treatment effect at an early interim look and maintains the Type I error rate and power comparable to both the $t^{HA}$ and $t_k^{H0}$ methods.

We recognize that interim monitoring is a complex issue. We propose an option that is simple to apply, provides an unbiased estimate of the information fraction even in the case of deviation from the designed hazard ratio, and does not rely on assumptions that are impossible to verify at the design stage. It is important to disseminate a trial’s results to the scientific community early should the opportunity arise to stop the trial prior to full follow-up and conclude that the experimental therapy is superior. For these reasons, the TCO method is a good choice when designing a maximum duration superiority trial. We acknowledge that the overall characteristics of a trial depend on both efficacy and futility monitoring rules.

This study has a few limitations. First, the interim monitoring procedure was conducted following equally spaced calendar time intervals to resemble a typical administrative reporting cycle. We did not investigate scenarios in which information accumulates faster early or late in the course of a study. Second, the $\alpha t_k^2$ and O’Brien-Fleming spending functions were utilized because these methods are commonly used in NCI-sponsored trials. Other alpha spending functions that produce either more or less conservative rejection boundaries or have increasing critical values at subsequent analyses were not studied. Third, the simulation data were derived under a proportional hazards assumption for which an exponential distribution and a nonmixture cure model with a Weibull kernel were used. These are the typical survival functions that depict two extreme survival likelihoods seen in childhood cancer, i.e., the exponential survival function represents the eventual failures, and the cure model represents the eventual proportion of cured patients. Although we anticipate no major deviation from what we have observed, further study is required to investigate the characteristics of the TCO method with respect to the limitations identified here.

The properties of the TCO method in a reduction in therapy design trial and in comparison to the EIT method will be presented in a separate paper. An open issue is the over- or underestimation of the final information fraction that affects Type I error rate under H0 and power under HA. Future work could evaluate whether the power and Type I error rate can be maintained using the TCO method with adjustment approaches such as those by Kim and Tsiatis (1990) and Proschan and Nason (2011). It is also of interest to determine the optimal information fraction at which to perform re-estimation or adjustment of the maximum information. Proschan and Nason (2011) imply, without explanation, that re-estimation can occur at the study midpoint. In the case of over- or underestimation of the total events, a mid-way adjustment may occur either too late or too early in the study.

Supplementary Material

SUPPLEMENTAL MATERIALS
Supp_Info - code_data

Acknowledgements

The authors thank the anonymous reviewers and the editor for their helpful comments.

Appendix

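A.1. Connection between $t_k^{HA}$ and $t_k^{CO}$ under the Alternative Hypothesis

In showing the connection between $t_k^{HA}$ and $t_k^{CO}$, we have

$$
\begin{aligned}
t_k^{HA} &= \frac{E(d_{k,CO}) + E(d_{k,TX})}{E(\text{Events}_{HA})}
= \frac{E(d_{k,CO}) + E(d_{k,TX})}{E(D_{K,CO}) + E(D_{K,TX})}
= \frac{E(d_{k,CO}) + E(d_{k,CO})/\Delta_k}{E(D_{K,CO}) + E(D_{K,CO})/\Delta_K} \\
&= \frac{1 + 1/\Delta_k}{1 + 1/\Delta_K} \times \frac{E(d_{k,CO})}{E(D_{K,CO})}
= \Omega_k \times \underbrace{\frac{E(d_{k,CO})}{E(D_{K,CO})}}_{t_k^{CO}}
\end{aligned}
\tag{1}
$$

for $k = 1, \ldots, K$ and $\Omega_k \approx 1$.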
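Equation (1) can be checked numerically. The sketch below uses made-up expected event counts (all four numbers are hypothetical and chosen only for illustration) to confirm that the information fraction based on both arms equals Ω_k times the control-only fraction.

```python
def omega(delta_k, delta_K):
    """Omega_k = (1 + 1/Delta_k) / (1 + 1/Delta_K) from equation (1)."""
    return (1.0 + 1.0 / delta_k) / (1.0 + 1.0 / delta_K)


# Hypothetical expected event counts (not from the paper).
d_k_co, d_k_tx = 40.0, 28.0    # expected events by interim look k
D_K_co, D_K_tx = 120.0, 90.0   # expected events by the final look K

delta_k = d_k_co / d_k_tx      # control-to-experimental ratio at look k
delta_K = D_K_co / D_K_tx      # same ratio at the final look

t_co = d_k_co / D_K_co                          # control-only fraction
t_ha = (d_k_co + d_k_tx) / (D_K_co + D_K_tx)    # fraction using both arms
omega_k = omega(delta_k, delta_K)

print(f"t_CO = {t_co:.4f}, t_HA = {t_ha:.4f}, Omega_k = {omega_k:.4f}")
print(f"Omega_k * t_CO = {omega_k * t_co:.4f}")
```

With these illustrative counts Ω_k is about 0.97, so the two fractions differ only slightly, in line with the statement that Ω_k is close to 1.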

A.2. Ratio of the Expected Number of Events

Suppose that survival time $T$ has a density function $f(t)$, survival function $S(t)$, and hazard function $\lambda(t)$; and censoring time $C$ has a density function $g(t)$ and survival function $H(t)$. Let $(X_i, \delta_i)$, $i = 1, 2, \ldots, n$ represent time-to-event survival data with sample size $n$ and independent censoring such that $X_i = \min(T_i, C_i)$ and $\delta_i = I(T_i \le C_i)$. Furthermore, assume that patients enter the study at times $E_i \overset{\text{i.i.d.}}{\sim} Q_E(t) = P(E_i \le t)$. Supplemental Materials 1.6 provides the general form of the expected number of events in a sample of size $n$ with entry distribution $Q_E(L - t)$. Without loss of generality, we assume survival time follows an exponential distribution with hazard $\lambda$ and 1:1 randomization. We presume that a study has uniform entry over the interval $[0, A]$, follow-up time $F$, study length $L = A + F$, random censoring, and no loss to follow-up or competing events. Thus, $Q_E(L - t)$ is given by

$$
Q_E(L - t) =
\begin{cases}
1 & \text{if } t \le L - A \\
\dfrac{L - t}{A} & \text{if } L - A < t \le L \\
0 & \text{if } t > L
\end{cases}
$$

and the expected number of events at the end of a study for sample size $n$ is

$$
\text{Expected Events} = \frac{n}{2}\left(1 - \frac{e^{-\lambda L}}{\lambda A}\left[e^{\lambda A} - 1\right]\right)
$$

Given the known distribution of $S_{CO}(t)$ and $Q_E(L - t)$, the ratio of the expected number of events in the control group to the experimental group is reduced to

$$
\frac{E(D_{K,CO})}{E(D_{K,TX})}
= \frac{1 - \dfrac{e^{-\lambda_{TX} L \times HR}}{\lambda_{TX} A \times HR}\left[e^{\lambda_{TX} A \times HR} - 1\right]}{1 - \dfrac{e^{-\lambda_{TX} L}\left[e^{\lambda_{TX} A} - 1\right]}{\lambda_{TX} A}}
= \frac{\lambda_{CO} A - e^{-\lambda_{CO} F} + e^{-\lambda_{CO} L}}{\lambda_{TX} A - e^{-\lambda_{TX} F} + e^{-\lambda_{TX} L}} \times \frac{1}{HR}
= \Delta_K
\tag{2}
$$

Similarly, the relationship $E(d_{k,CO}) / E(d_{k,TX}) = \Delta_k$ holds at each interim analysis. As a study progresses and more information is accumulated, i.e., accrual time and/or follow-up time increases, $\Delta_k \to \Delta_K$.
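The quantities in A.2 can be evaluated numerically. The sketch below computes the expected number of events per arm under uniform accrual and exponential survival, then the ratios Δ_k and Δ_K and the factor Ω_k from A.1. The interim-time formula is an assumed extension of the end-of-study expression to an arbitrary calendar time (it integrates the event probability over patients who have already entered), and every design value (sample size, hazards, accrual, follow-up, look times) is hypothetical rather than taken from the paper. The two hazards are passed directly, so no convention for the hazard ratio is assumed.

```python
import math


def expected_events(n, lam, accrual, calendar_time):
    """Expected events in one arm (n/2 patients, 1:1 randomization) by a
    calendar time, with uniform entry on [0, accrual] and exponential
    survival with hazard lam. Assumed extension of the end-of-study formula."""
    entered_until = min(calendar_time, accrual)
    if entered_until <= 0:
        return 0.0
    # Integral over entry times e in [0, entered_until] of 1 - exp(-lam * (s - e)).
    integral = entered_until - (
        math.exp(-lam * (calendar_time - entered_until))
        - math.exp(-lam * calendar_time)
    ) / lam
    return (n / 2.0) * integral / accrual


def expected_events_final(n, lam, accrual, length):
    """End-of-study expression given in A.2."""
    return (n / 2.0) * (
        1.0 - math.exp(-lam * length) / (lam * accrual) * (math.exp(lam * accrual) - 1.0)
    )


# Hypothetical design values, for illustration only.
n, accrual, follow_up = 400, 3.0, 2.0
length = accrual + follow_up
lam_co, lam_tx = 0.30, 0.20
look_times = [2.0, 3.5, length]

delta_K = expected_events(n, lam_co, accrual, length) / expected_events(n, lam_tx, accrual, length)

results = []
for s in look_times:
    e_co = expected_events(n, lam_co, accrual, s)
    e_tx = expected_events(n, lam_tx, accrual, s)
    delta_k = e_co / e_tx
    omega_k = (1.0 + 1.0 / delta_k) / (1.0 + 1.0 / delta_K)
    results.append((s, delta_k, omega_k))
    print(f"time {s:.1f}: Delta_k = {delta_k:.4f}, Omega_k = {omega_k:.4f}")
```

At the final look Δ_k equals Δ_K by construction, so Ω_K is exactly 1. At the earlier looks in this hypothetical design Ω_k stays close to 1.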

Footnotes

Conflict of Interest

The authors have declared no conflicts of interest.

Supporting Information for this article is available from the author or on the WWW under 10.1002/bimj.201900236

References

  1. Anderson JR and High R (2011). Alternatives to the standard Fleming, Harrington, and O’Brien futility boundary. Clinical Trials 8, 270–276.
  2. Dang H (2015). Interim analysis methods based on elapsed information time: strategies for information time estimation. PhD thesis, University of Southern California.
  3. DeMets DL and Lan KKG (1994). Interim analysis: The alpha spending function approach. Statistics in Medicine 13, 1341–1352.
  4. Freidlin B, Othus M, and Korn EL (2016). Information time scales for interim analyses of randomized clinical trials. Clinical Trials 13, 391–399.
  5. Friederike M-S,B, Royston P, and Babiker A (2005). The Stata Journal. The Stata Journal 5, 123–129.
  6. Kim K, Boucher H, and Tsiatis AA (1995). Design and Analysis of Group Sequential Logrank Tests in Maximum Duration Versus Information Trials. Biometrics 51, 988–1000.
  7. Kim K and DeMets DL (1987). Design and Analysis of Group Sequential Tests Based on the Type I Error Spending Rate Function. Biometrika 74, 149–154.
  8. Kim K and Tsiatis AA (1990). Study Duration for Clinical Trials with Survival Response and Early Stopping Rule. Biometrics 46, 81.
  9. Lan KKG, Reboussin DM, and DeMets DL (1994). Information and information fractions for design and sequential monitoring of clinical trials. Communications in Statistics - Theory and Methods 23, 403–420.
  10. Lan KKG and DeMets DL (2009). Further Comments on the Alpha-Spending Function. Statistics in Biosciences 1, 95–111.
  11. Lan KKG and DeMets DL (1983). Discrete Sequential Boundaries for Clinical Trials. Biometrika 70, 659–663.
  12. Lan KKG and DeMets DL (1989). Group sequential procedures: Calendar versus information time. Statistics in Medicine 8, 1191–1198.
  13. Lan KKG and Lachin JM (1990). Implementation of Group Sequential Logrank Tests in a Maximum Duration Trial. Biometrics 46, 759.
  14. Proschan MA (1999). Properties of Spending Function Boundaries. Biometrika 86, 466–473.
  15. Proschan MA, Lan KKG, and Wittes JT (2006). Statistical Monitoring of Clinical Trials: A Unified Approach. Springer.
  16. Proschan MA and Nason M (2011). A Note on Correction of Information Time in a Survival Trial Using an Alpha Spending Function. Statistics in Biosciences 3, 250–259.
  17. Scharfstein DO, Tsiatis AA, and Robins JM (1997). Semiparametric Efficiency and Its Implication on the Design and Analysis of Group-Sequential Studies. Journal of the American Statistical Association 92, 1342–1350.
