Biostatistics (Oxford, England). 2015 Dec 14;17(2):390–403. doi: 10.1093/biostatistics/kxv050

Semiparametric regression for the weighted composite endpoint of recurrent and terminal events

Lu Mao 1, D Y Lin 1,*
PMCID: PMC4804115  PMID: 26668069

Abstract

Recurrent event data are commonly encountered in clinical and epidemiological studies. A major complication arises when recurrent events are terminated by death. To assess the overall effects of covariates on the two types of events, we define a weighted composite endpoint as the cumulative number of recurrent and terminal events properly weighted by the relative severity of each event. We propose a semiparametric proportional rates model which specifies that the (possibly time-varying) covariates have multiplicative effects on the rate function of the weighted composite endpoint while leaving the form of the rate function and the dependence among recurrent and terminal events completely unspecified. We construct appropriate estimators for the regression parameters and the cumulative frequency function. We show that the estimators are consistent and asymptotically normal with variances that can be consistently estimated. We also develop graphical and numerical procedures for checking the adequacy of the model. We then demonstrate the usefulness of the proposed methods in simulation studies. Finally, we provide an application to a major cardiovascular clinical trial.

Keywords: Counting process, Dependent censoring, Intensity function, Inverse probability of censoring weighting, Mean function, Survival analysis

1. Introduction

In clinical and epidemiological studies, a subject can potentially experience multiple episodes of an adverse event, such as headache and pyogenic infection (Fleming and Harrington, 1991). Traditional survival analysis methods focusing on the time to the first event do not make full use of available data or characterize the entire clinical experience of the subject. It is statistically more efficient and clinically more meaningful to consider all recurrent events.

A number of statistical models and methods have been developed to analyze recurrent event data. Specifically, Andersen and Gill (1982) proposed the multiplicative intensity model by treating recurrent events as a non-homogeneous Poisson process, under which the risk of recurrence does not depend on the prior event history. To remove the Poisson assumption, Pepe and Cai (1993), Lawless and Nadeau (1995), and Lin and others (2000), hereafter referred to as LWYY, proposed to model the marginal rate function, which is easier to interpret than the intensity function. Prentice and others (1981) considered the hazard functions of the gap times between recurrent events, while Wei and others (1989) considered the marginal hazard function of each recurrent event.

Repeated occurrences of a serious adverse event, such as heart failure (Pfeffer and others, 2003), opportunistic HIV infection (Vlahov and others, 1991; Abrams and others, 1994), and cancer (Byar, 1980), tend to cause deterioration of health so that the subject may die during the course of the study. This phenomenon poses two challenges. First, the presence of a terminal event (i.e., death) invalidates the aforementioned methods for analyzing recurrent event data. Second, assessing the effects of treatments or other covariates on the entire clinical experience of a patient would need to take into account both recurrent and terminal events.

Two major approaches have been suggested to analyze recurrent and terminal events. The first one deals with the marginal rate or mean function of recurrent events, acknowledging the fact that there is no recurrent event after the terminal event (Cook and Lawless, 1997; Ghosh and Lin, 2000; Wang and others, 2001; Ghosh and Lin, 2002; Chen and Cook, 2004; Schaubel and others, 2006; Cook and others, 2009). The second one is the joint modelling for the two types of events (Huang and Wang, 2004; Liu and others, 2004; Ye and others, 2007; Zeng and Lin, 2009; Zeng and Cai, 2010). Both approaches treat recurrent and terminal events as two separate endpoints. The marginal rate and mean functions are affected by the distribution of the terminal event. The joint modelling approach assumes that a latent variable captures the dependence among recurrent events as well as the dependence between recurrent and terminal events, which is a simplistic and unverifiable assumption. For these reasons, the two approaches have rarely been used in actual clinical trials.

The current practice is to use the time to the first composite event (i.e., the first recurrent event or the terminal event, whichever occurs first) (Pfeffer and others, 2003; Yusuf and others, 2003; Anand and others, 2009; O'Connor and others, 2009; Zannad and others, 2011). This simple strategy is in line with the ICH guidelines (Lewis, 1999) that "There should generally be only one primary variable" and that "If a single primary variable cannot be selected from multiple measurements associated with the primary objective, another useful strategy is to integrate or combine the multiple measurements into a single or composite variable, using a predefined algorithm." The first composite event, however, is statistically inefficient and clinically unsatisfactory: it disregards all the events beyond the first one, and it does not distinguish recurrent from terminal events, so that a subject whose first event is a hospital admission is treated the same as a subject whose first event is death.

Based on our recent conversations with cardiologists and regulatory statisticians, a weighted composite endpoint of all recurrent and terminal events, i.e., the cumulative number of recurrent and terminal events properly weighted by their degrees of severity, is an appealing alternative that is likely to be accepted by clinicians and regulatory agencies. This endpoint is a natural extension of the current practice of the first composite event so as to capture all the clinical events experienced by each patient. Compared with the first composite event, the weighted composite event process is not only statistically more efficient due to the use of all available data but also clinically more meaningful due to incorporation of the entire clinical experience of each patient and appropriate weighting of different types of events. This proposal reflects the recommendation of Neaton and others (2005) to optimally weight components of composite outcomes and to better use the entire event history of patients. An unweighted version of this composite endpoint is being used in a major clinical trial on the efficacy of an angiotensin receptor neprilysin inhibitor in reducing heart failures and cardiovascular death for patients with preserved ejection fraction.

The purpose of this article is to show how to properly analyze the weighted composite event process. We formulate the effects of treatments and other covariates on the weighted composite event process through a semiparametric proportional rates model and provide the corresponding inference procedures. In particular, we derive a non-parametric test statistic for assessing the treatment difference that does not involve any modelling assumption. The nonparametric nature is highly attractive for regulatory purposes. Because it is tempting to apply LWYY to the (unweighted) composite event process, we investigate the potential pitfalls of this strategy. We demonstrate the superiority of the new methods through simulated and real data.

2. Methods

Suppose that there are $K$ different types of events, including the terminal event, where $K$ is a fixed positive integer. For $k = 1, \ldots, K$, let $N_k(t)$ denote the cumulative number of the $k$th type of event the subject has experienced by time $t$. We assign the weight $w_k$ to the $k$th type of event according to its relative severity and define the weighted sum of the $K$ counting processes: $N^*(t) = \sum_{k=1}^K w_k N_k(t)$. Let $Z(\cdot)$ denote a $p$-vector of possibly time-varying external covariates, and let $D$ denote the survival time, i.e., time to the terminal event. We specify that $Z$ has multiplicative effects on the marginal rate function of $N^*(t)$, i.e.,

E\{dN^*(t) \mid Z(t)\} = e^{\beta_0^T Z(t)}\, d\mu_0(t),    (2.1)

where $\beta_0$ is a $p$-vector of unknown regression parameters, and $\mu_0(\cdot)$ is an arbitrary increasing function. Note that the dependence structure among recurrent and terminal events is completely unspecified. For time-invariant covariates, model (2.1) reduces to the proportional means model: $E\{N^*(t) \mid Z\} = e^{\beta_0^T Z}\, \mu_0(t)$, where $\mu_0(\cdot)$ is the baseline mean function.
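
To make the endpoint concrete, the following small illustration (ours, not the authors') shows how the weighted composite process accumulates: each event of type $k$ adds $w_k$ to $N^*(t)$. The event types, times, and weights below are hypothetical.

```python
import numpy as np

def weighted_composite(t, event_times, weights):
    """N*(t) = sum_k w_k N_k(t) for one subject, given per-type event times."""
    return sum(w * np.sum(np.asarray(event_times[k]) <= t)
               for k, w in weights.items())

# Hypothetical subject: two heart-failure admissions, then death at t = 3.0.
event_times = {"hf_hosp": [0.8, 2.1], "death": [3.0]}
weights = {"hf_hosp": 1.0, "death": 2.0}               # severity weights w_k
print(weighted_composite(2.5, event_times, weights))   # 2.0: two admissions
print(weighted_composite(3.5, event_times, weights))   # 4.0: plus weighted death
```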

In practice, $N^*(\cdot)$ and $D$ are subject to right censoring. Let $C$ denote the censoring time, which is assumed to be independent of $N^*(\cdot)$ and $D$ conditional on $Z(\cdot)$. Write $X = D \wedge C$, $\delta = I(D \le C)$, and $N(t) = N^*(t \wedge X)$, where $a \wedge b = \min(a, b)$, and $I(\cdot)$ is the indicator function. For a study with $n$ subjects, the data consist of $\{N_i(\cdot), X_i, \delta_i, Z_i(\cdot)\}$ $(i = 1, \ldots, n)$.

The only available approach to fitting model (2.1) is the LWYY estimating function

\sum_{i=1}^n \int_0^\tau \{Z_i(t) - \bar{Z}(t;\beta)\}\, dN_i(t),

where $\bar{Z}(t;\beta) = \sum_{j=1}^n I(X_j \ge t)\, e^{\beta^T Z_j(t)} Z_j(t) \big/ \sum_{j=1}^n I(X_j \ge t)\, e^{\beta^T Z_j(t)}$, and $\tau$ denotes the end of the study. In this approach, which is only applicable to the unweighted composite event process, death is part of the composite endpoint and also a censoring variable. This estimating function can be written as

\sum_{i=1}^n \int_0^\tau \{Z_i(t) - \bar{Z}(t;\beta)\}\, I(X_i \ge t)\, \{dN_i^*(t) - e^{\beta^T Z_i(t)}\, q(t)\, dt\},

where $q(\cdot)$ is some positive function, because the term involving $q(\cdot)$ vanishes by the definition of $\bar{Z}(t;\beta)$. For this estimating function to be unbiased, we must have $E\{dN^*(t) \mid Z(t), D \ge t, C \ge t\} = e^{\beta^T Z(t)}\, q(t)\, dt$, i.e., $E\{dN^*(t) \mid Z(t), D \ge t\} = e^{\beta^T Z(t)}\, q(t)\, dt$ by the conditional independence of $C$. Thus, the LWYY inference pertains to the conditional rate

E\{dN^*(t) \mid Z(t), D \ge t\} = e^{\beta^T Z(t)}\, dR_0(t),    (2.2)

where $R_0(\cdot)$ is an arbitrary increasing function. The integrated conditional rate does not have a clinical interpretation and is always greater than the marginal mean because $E\{dN^*(t) \mid Z(t)\} = \Pr(D \ge t \mid Z(t))\, E\{dN^*(t) \mid Z(t), D \ge t\}$.

It is possible for models (2.1) and (2.2) to hold for the same $\beta_0$, in which case LWYY would provide valid inference on $\beta_0$. For example, suppose that the process $N^*(\cdot)$ has the intensity function $I(D \ge t)\, \xi\, e^{\beta_0^T Z(t)}\, dR_0(t)$ with respect to the filtration $\sigma\{\xi, D, Z(s), N^*(s): s \le t\}$, where $\xi$ is a positive frailty with $E(\xi) = 1$. Assume also that the distribution of the survival time $D$ does not depend on $\xi$ or $Z$. Then

E\{dN^*(t) \mid Z(t)\} = \Pr(D \ge t)\, e^{\beta_0^T Z(t)}\, dR_0(t)

and

E\{dN^*(t) \mid Z(t), D \ge t\} = e^{\beta_0^T Z(t)}\, dR_0(t).

Thus, proportionality holds for both the marginal and conditional rate functions, although the baseline functions are different. If $D$ depends on $\xi$ or $Z$, or if the dependence between recurrent events and death cannot be explained by a simple frailty, then the conditional rate model does not hold, and LWYY will estimate a quantity that is different from $\beta_0$ of model (2.1).

To make valid inference for model (2.1), we need to exclude the dependent censoring by death from the "at-risk" indicators in the estimating function. Specifically, a subject should remain in the risk set until independent censoring occurs, even if the subject dies before the independent censoring time (i.e., $D < C$). In other words, the at-risk process is $I(C \ge t)$ instead of $I(X \ge t)$. If there is no early withdrawal or loss to follow-up, then the censoring is all administrative (i.e., caused by the termination of the study), and the censoring time is known to be the difference between the study end date and the subject's entry time. Replacing $I(X_j \ge t)$ in the LWYY estimating function with $I(C_j \ge t)$, we obtain the estimating function

U(\beta) = \sum_{i=1}^n \int_0^\tau \{Z_i(t) - \tilde{Z}(t;\beta)\}\, I(C_i \ge t)\, dN_i^*(t),    (2.3)

where $\tilde{Z}(t;\beta) = \sum_{j=1}^n I(C_j \ge t)\, e^{\beta^T Z_j(t)} Z_j(t) \big/ \sum_{j=1}^n I(C_j \ge t)\, e^{\beta^T Z_j(t)}$, which can be written as

\sum_{i=1}^n \int_0^\tau \{Z_i(t) - \tilde{Z}(t;\beta)\}\, I(C_i \ge t)\, \{dN_i^*(t) - e^{\beta^T Z_i(t)}\, d\mu_0(t)\}

because the term involving $\mu_0(\cdot)$ vanishes by the definition of $\tilde{Z}(t;\beta)$. This is an unbiased estimating function because $E\{dN^*(t) \mid Z(t), C \ge t\} = e^{\beta_0^T Z(t)}\, d\mu_0(t)$ under model (2.1).

In most studies, there is random loss to follow-up in addition to administrative censoring, so that $C$ is not fully observed. Thus, we use the inverse probability of censoring weighting technique (Robins and Rotnitzky, 1992). Define

w_i(t) = I(C_i \ge t \wedge D_i) / G(t \wedge D_i \mid Z_i),

where $G(t \mid Z) = \Pr(C \ge t \mid Z)$. Clearly, $E\{w_i(t) \mid N_i^*(\cdot), D_i, Z_i\} = 1$. We estimate $w_i(t)$ by

\hat{w}_i(t) = I(C_i \ge t \wedge D_i) / \hat{G}(t \wedge D_i \mid Z_i),

where $\hat{G}(\cdot \mid Z)$ is the estimator of $G(\cdot \mid Z)$ under the proportional hazards model (Cox, 1972)

\lambda_C(t \mid Z) = \lambda_{C0}(t)\, e^{\gamma^T Z(t)},    (2.4)

where $\lambda_C(\cdot \mid Z)$ is the hazard function of $C$ given $Z$, $\gamma$ is a $p$-vector of regression parameters, and $\lambda_{C0}(\cdot)$ is an unspecified baseline hazard function. If $Z$ is discrete, we may set $\hat{G}(\cdot \mid Z)$ to be the Kaplan–Meier estimator for covariate value $Z$. Replacing $I(C_i \ge t)$ in (2.3) with $\hat{w}_i(t)$,

we obtain an estimating function that allows unknown censoring times

U(\beta) = \sum_{i=1}^n \int_0^\tau \left\{ Z_i(t) - \frac{\sum_{j=1}^n \hat{w}_j(t)\, e^{\beta^T Z_j(t)} Z_j(t)}{\sum_{j=1}^n \hat{w}_j(t)\, e^{\beta^T Z_j(t)}} \right\} \hat{w}_i(t)\, dN_i^*(t).    (2.5)

Let $\hat{\beta}$ denote the root of $U(\beta) = 0$, which is obtained by the Newton–Raphson algorithm. The estimator $\hat{\beta}$ is consistent and asymptotically normal with a covariance matrix estimator given in Section S.1 of supplementary material available at Biostatistics online. We make inference about $\beta_0$ or a subset of $\beta_0$ by the Wald method. If $Z$ is the treatment indicator and $\hat{G}$ pertains to the treatment-specific Kaplan–Meier estimator, then the Wald statistic provides a nonparametric test for the equality of the two mean functions (since there is no modelling assumption under $H_0: \beta_0 = 0$).
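
For readers who want to experiment, here is a minimal sketch of how the estimating function (2.5) could be implemented for a scalar, time-invariant covariate, under the simplifying assumption that $C$ is independent of $Z$ (so that $\hat{G}$ is a single Kaplan–Meier curve). All function names and the toy data are ours, not the authors'; tie-handling and the covariance estimation of Section S.1 are omitted.

```python
import numpy as np
from scipy.optimize import brentq

def km_censoring_survival(X, delta):
    """Kaplan-Meier estimate of G(t) = P(C >= t), treating censoring as the
    event. Assumes untied follow-up times (a simplification for this sketch)."""
    order = np.argsort(X)
    Xs, cens = X[order], 1 - delta[order]
    at_risk = len(Xs) - np.arange(len(Xs))            # risk-set size at Xs[i]
    cum = np.cumprod(np.where(cens == 1, 1.0 - 1.0 / at_risk, 1.0))
    def G(t):                                         # product over censoring times < t
        idx = np.searchsorted(Xs, t, side="left") - 1
        return cum[idx] if idx >= 0 else 1.0
    return G

def w_hat(t, X, delta, G):
    """IPCW weight w_i(t) = I(C_i >= t ^ D_i) / G(t ^ D_i) in observable form:
    deaths (delta = 1) keep weight 1/G(t ^ X_i); censored subjects contribute
    1/G(t) while still under observation and 0 afterwards."""
    num = np.where(delta == 1, 1.0, (t <= X).astype(float))
    den = np.array([G(s) for s in np.where(delta == 1, np.minimum(t, X), t)])
    return np.divide(num, den, out=np.zeros_like(num), where=num > 0)

def U(beta, Z, X, delta, events, G):
    """Estimating function (2.5) for scalar beta; 'events' lists the
    composite-event records (subject i, time t, severity weight wk), t <= X[i]."""
    total = 0.0
    for i, t, wk in events:
        w = w_hat(t, X, delta, G)
        ew = w * np.exp(beta * Z)
        total += wk * w[i] * (Z[i] - np.sum(ew * Z) / np.sum(ew))
    return total

# Hypothetical toy data: follow-up time, death indicator, treatment arm, and
# composite-event records (subject, time, severity weight).
X = np.array([4.0, 2.5, 5.0, 1.5]); delta = np.array([1, 0, 0, 1])
Z = np.array([1.0, 0.0, 1.0, 0.0])
events = [(0, 1.0, 1.0), (0, 4.0, 2.0), (2, 2.0, 1.0), (3, 1.5, 2.0)]
G = km_censoring_survival(X, delta)
beta_hat = brentq(U, -4.0, 4.0, args=(Z, X, delta, events, G))
print(beta_hat)
```

For a $p$-vector of covariates, the same weights carry over and the root would be found by Newton–Raphson on the $p$-dimensional score, as in the paper.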

To estimate the baseline mean function $\mu_0(t)$, we employ a weighted version of the Breslow estimator

\hat{\mu}_0(t) = \sum_{i=1}^n \int_0^t \frac{\hat{w}_i(s)\, dN_i^*(s)}{\sum_{j=1}^n \hat{w}_j(s)\, e^{\hat{\beta}^T Z_j(s)}}.

This estimator is consistent and asymptotically normal with a covariance function given in Section S.1 of supplementary material available at Biostatistics online. Since $\mu_0(t)$ is nonnegative, we construct the confidence interval for $\mu_0(t)$ based on the log transformation. To be specific, the $100(1-\alpha)\%$ confidence interval for $\mu_0(t)$ is given by $\hat{\mu}_0(t) \exp\{\pm z_{\alpha/2}\, \hat{\sigma}(t)/\hat{\mu}_0(t)\}$, where $\hat{\sigma}^2(t)$ is the variance estimator for $\hat{\mu}_0(t)$, and $z_{\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution. Incidentally, the LWYY estimator of $\mu_0(t)$ is

\hat{R}_0(t) = \sum_{i=1}^n \int_0^t \frac{dN_i(s)}{\sum_{j=1}^n I(X_j \ge s)\, e^{\hat{\beta}^T Z_j(s)}},

which overestimates $\mu_0(t)$ because $E\{dN^*(t) \mid Z(t), D \ge t\} \ge E\{dN^*(t) \mid Z(t)\}$ for all $t$ and $Z$.
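
Continuing the sketch above, the weighted Breslow estimator and the log-transformed confidence interval follow directly from their definitions; `w_hat` and `G` are the helpers defined earlier, and `sigma_t`, the standard error of the estimate at $t$, is assumed to come from the variance formula in Section S.1.

```python
import numpy as np

def breslow_mu0(t, beta_hat, Z, X, delta, events, G):
    """Weighted Breslow estimate of mu_0(t): each composite-event record
    (i, s, wk) with s <= t contributes wk * w_i(s) / sum_j w_j(s) exp(b Z_j)."""
    total = 0.0
    for i, s, wk in events:
        if s <= t:
            w = w_hat(s, X, delta, G)     # IPCW weights from the sketch above
            total += wk * w[i] / np.sum(w * np.exp(beta_hat * Z))
    return total

def log_transformed_ci(mu_hat_t, sigma_t, z=1.96):
    """95% CI mu_hat * exp(+/- z sigma / mu_hat); stays positive by construction."""
    factor = np.exp(z * sigma_t / mu_hat_t)
    return mu_hat_t / factor, mu_hat_t * factor
```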

To assess the adequacy of model (2.1), we define the residual processes $\hat{M}_i(t) = \int_0^t \hat{w}_i(s)\, \{dN_i^*(s) - e^{\hat{\beta}^T Z_i(s)}\, d\hat{\mu}_0(s)\}$ and write $\hat{M}_i = \hat{M}_i(\tau)$. Because the $\hat{M}_i(\cdot)$ are approximately mean-zero processes under model (2.1), we plot the cumulative sum of the $\hat{M}_i$ against the model component of interest. Specifically, to check the functional form of the $j$th (time-invariant) covariate, we consider the cumulative-sum process

n^{-1/2} \sum_{i=1}^n I(Z_{ji} \le x)\, \hat{M}_i,

where $Z_{ji}$ is the $j$th component of $Z_i$. To check the exponential link function, we consider

n^{-1/2} \sum_{i=1}^n I(\hat{\beta}^T Z_i \le x)\, \hat{M}_i.

To check the proportionality assumption, we consider the standardized "score" process

n^{-1/2}\, \hat{V}^{-1/2} \sum_{i=1}^n \int_0^t \{Z_i(s) - \hat{Z}(s)\}\, d\hat{M}_i(s),

where $\hat{Z}(s)$ denotes the $\hat{w}$-weighted average of the covariates used in (2.5), and $\hat{V}$ is a covariance matrix estimator for the unstandardized process. To check the overall fit of the model, we consider

n^{-1/2} \sum_{i=1}^n I(Z_i \le z)\, \hat{M}_i(t),

where the event $\{Z_i \le z\}$ is interpreted componentwise.

We show in Section S.2 of supplementary material available at Biostatistics online that, under model (2.1), all the above processes are asymptotically zero-mean Gaussian processes whose distributions can be approximated by Monte Carlo simulation along the lines of LWYY. We can graphically compare the observed cumulative-sum process with a few realizations from its null distribution or perform a numerical test based on the maximum absolute value of the process.
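
The following schematic conveys the Monte Carlo idea for, say, the functional-form check: perturb the residuals by standard normal multipliers and compare the observed supremum with the simulated ones. It deliberately ignores the correction terms arising from the estimation of $\beta$ and $\mu_0$ that Section S.2 accounts for, so it illustrates the mechanics rather than reproduces the exact procedure.

```python
import numpy as np

def cumsum_process(Z, M, grid):
    """n^{-1/2} sum_i I(Z_i <= x) M_i evaluated over a grid of x values."""
    return np.array([np.sum(M[Z <= x]) for x in grid]) / np.sqrt(len(Z))

def sup_test_pvalue(Z, M, n_real=1000, seed=0):
    """Monte Carlo p-value for the supremum statistic, using N(0,1) multipliers.
    Z: covariate values; M: residuals M_i(tau) from the fitted model."""
    rng = np.random.default_rng(seed)
    grid = np.unique(Z)
    observed = np.max(np.abs(cumsum_process(Z, M, grid)))
    sims = np.array([np.max(np.abs(cumsum_process(
        Z, rng.standard_normal(len(M)) * M, grid))) for _ in range(n_real)])
    return np.mean(sims >= observed)
```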

3. Simulation studies

We assess the finite-sample performance of the new methods through extensive simulation studies. We consider one sequence of recurrent events, along with a terminal event. To enable comparisons with existing methods, we focus on the unweighted version of the composite endpoint. It is not trivial to generate a composite event process that satisfies the proportional means assumption. We outline below our data-generation scheme while relegating the details to Section S.3 of supplementary material available at Biostatistics online.

Let $\tilde{N}(\cdot)$ be a homogeneous Poisson process with intensity $e^{\beta_0^T Z}$, and let $T$ be a stopping time such that there is at least one event in the interval $[0, T]$. Then, by labeling the last event in $[0, T]$ as the terminal event, we have a well-defined composite event process given by $N^*(t) = \tilde{N}(t \wedge T)$. If the distribution of $T$ is independent of $Z$ and uninformative about $\tilde{N}(\cdot)$, the optional sampling theorem implies that

E\{N^*(t) \mid Z\} = E(t \wedge T)\, e^{\beta_0^T Z}.

Thus, the process satisfies the proportional means assumption, with baseline mean function $\mu_0(t) = E(t \wedge T)$. In fact, given some appropriate constant $\rho$, we can simulate $T$ that follows the exponential distribution with hazard $\rho$. Furthermore, we can introduce a frailty term $\xi$ to the $\tilde{N}(\cdot)$ and $T$ of each subject so as to induce dependence among the event times of the same subject. We let $Z$ be a binary treatment indicator and $\beta_0 = 0$, 0.25, or 0.5. We let the administrative censoring time follow a uniform distribution and the time to random loss to follow-up follow the exponential distribution with hazard 0.1. We let the frailty term $\xi$ follow the gamma distribution with mean 1 and variance $\sigma^2 = 0$ or 0.5. Under these conditions, each subject has an average of 2–3 events, and the censoring rate is approximately 30%.
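
A rough sketch of this data-generation scheme, as we read it, follows. The uniform-censoring parameters, the way the frailty enters the hazard of $T$, and the redraw device for guaranteeing an event in $[0, T]$ are our placeholders; the authors' exact recipe is in Section S.3 of the supplement.

```python
import numpy as np
rng = np.random.default_rng(2023)

def simulate_subject(beta0=0.25, sigma2=0.5, rho=0.4, tau=8.0):
    Z = rng.binomial(1, 0.5)                          # binary treatment indicator
    xi = rng.gamma(1.0 / sigma2, sigma2) if sigma2 > 0 else 1.0  # mean 1, var sigma2
    rate = xi * np.exp(beta0 * Z)                     # frailty-scaled event rate
    while True:                                       # crude device: redraw until
        T = rng.exponential(1.0 / (xi * rho))         # an event falls in [0, T]
        times = np.sort(rng.uniform(0.0, tau, rng.poisson(rate * tau)))
        kept = times[times <= min(T, tau)]
        if kept.size:
            break
    D, recurrent = kept[-1], kept[:-1]                # last event relabeled as death
    C = min(rng.uniform(tau / 2, tau),                # administrative (placeholder)
            rng.exponential(10.0))                    # dropout: hazard 0.1
    X = min(D, C)
    return Z, recurrent[recurrent <= X], X, D <= C    # observed data for one subject
```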

We conduct two sets of simulation studies to compare the new and LWYY methods for making inference on $\beta_0$. In the first set, we let the stopping time $T$ have the same distribution for all subjects, so that the two treatment groups have the same distribution of the terminal event. The results are displayed in Table 1. The new estimator $\hat{\beta}$ is virtually unbiased, and its variance estimator accurately reflects the true variation. As expected, LWYY also provides correct inference, since this simulation set-up also conforms to the conditional rate model (2.2).

Table 1.

Simulation results comparing the new and LWYY methods in estimating the treatment difference under equal distributions of death

                               New                          LWYY
$n$   $\beta_0$   $\sigma^2$   Bias   SE   SEE   CP         Bias   SE   SEE   CP
100 0 0 0.002 0.158 0.161 0.952 0.002 0.163 0.165 0.951
100 0 0.5 -0.001 0.150 0.150 0.957 -0.002 0.149 0.149 0.946
100 0.25 0 -0.001 0.200 0.198 0.954 0.003 0.205 0.204 0.955
100 0.25 0.5 0.005 0.192 0.190 0.953 0.003 0.191 0.188 0.948
100 0.5 0 -0.003 0.240 0.243 0.953 0.003 0.240 0.239 0.952
100 0.5 0.5 0.007 0.213 0.214 0.957 0.002 0.212 0.212 0.950
200 0 0 0.002 0.111 0.114 0.945 0.002 0.111 0.114 0.956
200 0 0.5 0.001 0.106 0.106 0.951 0.002 0.107 0.109 0.951
200 0.25 0 0.003 0.140 0.141 0.944 -0.002 0.143 0.144 0.945
200 0.25 0.5 0.000 0.130 0.131 0.952 0.005 0.132 0.130 0.946
200 0.5 0 -0.001 0.164 0.162 0.948 0.002 0.168 0.167 0.950
200 0.5 0.5 0.000 0.148 0.146 0.944 0.001 0.156 0.153 0.952
500 0 0 0.001 0.067 0.067 0.951 0.002 0.072 0.071 0.944
500 0 0.5 0.001 0.066 0.064 0.954 0.000 0.069 0.068 0.950
500 0.25 0 0.002 0.093 0.094 0.947 0.002 0.085 0.083 0.945
500 0.25 0.5 0.002 0.088 0.087 0.953 -0.002 0.086 0.089 0.952
500 0.5 0 -0.001 0.101 0.099 0.954 -0.002 0.106 0.108 0.955
500 0.5 0.5 0.000 0.091 0.090 0.948 -0.002 0.097 0.097 0.951

Bias is the bias of the parameter estimator $\hat{\beta}$, SE is the empirical standard error of $\hat{\beta}$, SEE is the empirical mean of the standard error estimator of $\hat{\beta}$, and CP is the coverage probability of the 95% confidence interval. Each entry is based on 10 000 replicates.

In the second set of studies, we let the distributions of the stopping time $T$ differ between the two groups, such that the hazards of death are different while the mean functions of $N^*(t)$ are the same between the two groups (i.e., $\beta_0 = 0$). Specifically, we generate a homogeneous Poisson process with the same intensity for both groups and label the last event in the time interval $[0, T]$ as death with probabilities $p_0$ and $p_1$ for groups 0 and 1, respectively. We let $p_0$ be fixed and $p_1 = 0.2$, 0.3, or 0.5. Thus, the mean functions for the two groups are both equal to $\mu_0(t)$, but their death rates are different. We let the administrative censoring time follow a uniform distribution and keep the other conditions the same as before. As shown in Table 2, the new estimator is approximately unbiased and the corresponding test has correct type I error. In contrast, the LWYY method is biased and its type I error is inflated; the problem worsens as the death rates of the two groups become more different and as the sample size increases.

Table 2.

Simulation results comparing the new and LWYY methods in estimating and testing the treatment difference under unequal distributions of death

               New             LWYY
$n$   $p_1$    Bias   Size    Bias   Size
100 0.2 0.002 0.052 0.017 0.058
100 0.3 -0.004 0.046 0.037 0.108
100 0.5 0.005 0.054 0.133 0.208
200 0.2 0.009 0.053 0.016 0.066
200 0.3 0.006 0.058 0.039 0.204
200 0.5 -0.005 0.052 0.131 0.321
500 0.2 -0.002 0.045 0.018 0.102
500 0.3 0.002 0.048 0.038 0.389
500 0.5 -0.004 0.054 0.132 0.615

Bias is the bias of the parameter estimator, and Size is the empirical type I error of the Wald statistic for testing $\beta_0 = 0$ at the nominal significance level of 0.05. Each entry is based on 10 000 replicates.

We adopt the first simulation set-up to assess the performance of the new and LWYY methods in estimating the baseline mean function $\mu_0(t)$. By treating death as censoring, the LWYY method over-estimates the mean function. Indeed, LWYY estimates the integrated conditional rate $R_0(t)$, which is strictly greater than $\mu_0(t)$. Simulation results are summarized in Table 3 and are consistent with these expectations.

Table 3.

Simulation results comparing the new and LWYY methods in estimating the baseline mean function

                                         New                        LWYY
$n$   $\sigma^2$   $t$   $\mu_0(t)$     Mean   SE   SEE   CP       Mean   SE   SEE   CP
100 0 1 0.571 0.571 0.084 0.082 0.952 0.616 0.092 0.095 0.927
2 1.088 1.085 0.134 0.137 0.953 1.246 0.155 0.158 0.796
3 1.555 1.560 0.175 0.176 0.945 1.900 0.223 0.221 0.610
0.5 1 0.558 0.556 0.106 0.104 0.944 0.590 0.119 0.120 0.923
2 1.041 1.041 0.173 0.170 0.947 1.182 0.208 0.206 0.863
3 1.463 1.461 0.230 0.232 0.950 1.753 0.299 0.301 0.784
200 0 1 0.571 0.570 0.059 0.062 0.950 0.617 0.064 0.067 0.892
2 1.088 1.091 0.099 0.098 0.951 1.250 0.115 0.117 0.681
3 1.555 1.558 0.122 0.124 0.944 1.907 0.158 0.159 0.595
0.5 1 0.558 0.558 0.071 0.070 0.954 0.593 0.079 0.077 0.915
2 1.041 1.042 0.116 0.119 0.951 1.182 0.140 0.139 0.819
3 1.463 1.459 0.167 0.169 0.948 1.750 0.205 0.205 0.641
500 0 1 0.571 0.572 0.038 0.036 0.954 0.610 0.046 0.048 0.828
2 1.088 1.083 0.061 0.062 0.953 1.251 0.070 0.068 0.616
3 1.555 1.559 0.083 0.083 0.951 1.903 0.096 0.097 0.277
0.5 1 0.558 0.554 0.045 0.047 0.945 0.590 0.047 0.049 0.736
2 1.041 1.042 0.077 0.078 0.949 1.177 0.096 0.097 0.656
3 1.463 1.459 0.102 0.103 0.945 1.750 0.131 0.129 0.309

Mean is the empirical mean of $\hat{\mu}_0(t)$, SE is the empirical standard error of $\hat{\mu}_0(t)$, SEE is the empirical mean of the standard error estimator of $\hat{\mu}_0(t)$, and CP is the coverage probability of the 95% log-transformed confidence interval. Each entry is based on 10 000 replicates.

We also compare the power of the new method using different weighting schemes with the current practice of performing the Cox regression on the time to the first composite event. The results for the first simulation set-up are shown in Table S1 of supplementary material available at Biostatistics online. The power of the new method decreases as the weight on death increases. This is not surprising since the distributions of death are identical between the two groups. For all weighting schemes, the new method yields much higher power than the Cox regression.

Next, we consider mis-specified censoring distributions. We use the first simulation set-up but generate the time to random loss to follow-up from a proportional odds model rather than a proportional hazards model. We estimate the censoring distribution by the Kaplan–Meier estimator or under the Cox model. The results are summarized in Table S2 of supplementary material available at Biostatistics online. Under the Cox model, the type I error is only slightly inflated, and the power tends to be higher than with the Kaplan–Meier estimator.

Finally, we evaluate the type I error of the supremum tests for model adequacy. The simulation set-up is the same as the first one except that we let $Z_1$ be the treatment indicator and add a continuous covariate $Z_2$ that is standard normal. We simulate 1000 datasets. For each dataset, we obtain 1000 realizations from the null distribution to perform the supremum test at the nominal significance level of 0.05. We assess the functional form of $Z_2$, the proportionality assumption, the exponential link function, and the overall goodness of fit. The empirical type I error rates of the four tests are all close to the nominal level. Thus, the goodness-of-fit tests are accurate for practical use.

4. A real example

Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Training (HF-ACTION) was a randomized controlled clinical trial to evaluate the efficacy and safety of exercise training among patients with heart failure (O'Connor and others, 2009). A total of 2331 medically stable outpatients with heart failure and reduced ejection fraction were recruited between April 2003 and February 2007 at 82 centers in the USA, Canada, and France. Patients were randomly assigned to usual care alone or usual care plus aerobic exercise training that consists of 36 supervised sessions followed by home-based training. The usual care group consisted of 1172 patients (follow-up data not available for 1 patient), and the exercise training group consisted of 1159 patients. There were a large number of hospital admissions (due to heart failure, other cardiovascular causes, or non-cardiovascular causes) and a considerable number of deaths in each treatment arm, as shown in Table S3 of supplementary material available at Biostatistics online.

The primary endpoint was a composite of all-cause mortality and all-cause hospitalization. Secondary endpoints included the composite of cardiovascular mortality and cardiovascular hospitalization, and the composite of cardiovascular mortality and heart failure hospitalization. Under the Cox models on the time to the first event adjusting for heart failure etiology (ischemic or not), the p-values for these three endpoints were found to be 0.13, 0.14, and 0.06, respectively (O'Connor and others, 2009). This analysis disregarded all the clinical events that occurred after the first one and attached the same clinical importance to hospitalization and death.

To provide a statistically more efficient and clinically more relevant evaluation of the benefits of exercise training, we use the proposed weighted composite event process for death and recurrent hospitalization. For each of the three endpoints, we first consider an unweighted version of the composite event process, in which each event receives the same weight. To reflect the unequal severity of death versus hospitalization, we also consider a weighted version which assigns the weights of 2 and 1 to death and hospitalization, respectively. Because heart failure is a life-threatening event, we consider another weighting scheme which assigns the weights of 3, 2, and 1 to cardiovascular death, heart failure hospitalization, and other cardiovascular hospitalization, respectively. These weights are in line with the cardiology literature (e.g., Califf and others, 1990; Braunwald and others, 1992; Armstrong and others, 2011).
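
These schemes amount to simple lookup tables from event type to weight; the hypothetical patient history below shows how each scheme scores the same record differently.

```python
# Weighting schemes described above, applied to one hypothetical event history.
weights_unweighted = {"cv_death": 1, "hf_hosp": 1, "other_cv_hosp": 1}
weights_death2     = {"cv_death": 2, "hf_hosp": 1, "other_cv_hosp": 1}
weights_321        = {"cv_death": 3, "hf_hosp": 2, "other_cv_hosp": 1}

history = ["hf_hosp", "other_cv_hosp", "hf_hosp", "cv_death"]  # hypothetical
for name, w in [("unweighted", weights_unweighted),
                ("death = 2", weights_death2),
                ("3/2/1", weights_321)]:
    print(name, sum(w[event] for event in history))
# unweighted 4, death = 2 -> 5, 3/2/1 -> 8
```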

We apply the proportional means model to the aforementioned weighted composite event processes. Table 4 displays the results on the ratios of the mean frequencies of the weighted composite events between exercise training and usual care, adjusted for heart failure etiology. The $p$-values for the unweighted composite endpoints of all-cause mortality and all-cause hospitalization, cardiovascular mortality and cardiovascular hospitalization, and cardiovascular mortality and heart failure hospitalization are 0.060, 0.087, and 0.022, respectively, which are substantially smaller than the corresponding $p$-values in the analysis of the first event. Because treatment differences are more profound for hospitalization than for mortality, assigning more weight to death than to hospitalization tends to reduce the level of statistical significance. Because there are a large number of recurrent hospital admissions, however, the use of the weighted composite event process (with less weight on hospitalization than on death) still tends to yield stronger evidence for the benefits of exercise training than the use of the first composite event, especially for the composite of mortality and heart failure.

Table 4.

Proportional mean regression analysis of the HF-ACTION data under different weighting schemes

Weights                              Ratio   SE      95% CI   $p$-value

All-cause mortality and all-cause hospitalization
Death = 1, hospitalization = 1       0.886   0.057   –        0.060
Death = 2, hospitalization = 1       0.895   0.061   –        0.104

Cardiovascular mortality and cardiovascular hospitalization
Death = 1, hospitalization = 1       0.892   0.059   –        0.087
Death = 2, hospitalization = 1       0.906   0.070   –        0.196

Cardiovascular mortality and heart failure hospitalization
Death = 1, heart failure = 1         0.854   0.059   –        0.022
Death = 2, heart failure = 1         0.863   0.062   –        0.040
Death = 3, heart failure = 2         0.862   0.061   –        0.037

Ratio is the estimated ratio of the mean frequencies of the weighted composite events between exercise training and usual care, SE is the (estimated) standard error of the ratio estimate, and 95% CI is the 95% confidence interval.

We compare the new and LWYY methods in the estimation of the mean functions for the unweighted composite event process of all-cause mortality and all-cause hospitalization. As shown in Figure 1, the LWYY estimates of the mean functions are considerably higher than ours. This phenomenon is consistent with the theory and simulation results.

Fig. 1.

Estimated mean functions for all-cause mortality and all-cause hospitalization by treatment group for non-ischemic patients in the HF-ACTION study: the left and right panels pertain to usual care and exercise training, respectively. The new and LWYY methods are denoted by the solid and dashed curves, respectively.

For further illustration, we apply the proportional means model to the last composite event process in Table 4 by adjusting for four additional covariates that were identified to be highly prognostic (O'Connor and others, 2009). These covariates are duration of the cardiopulmonary exercise test (CPX), left ventricular ejection fraction (LVEF), Beck Depression Inventory II score (BDI), and history of atrial fibrillation or flutter (AFF). The results on the regression effects are summarized in Table S4 of supplementary material available at Biostatistics online. With the covariate adjustment, the effect of exercise training is highly significant. Figure S1 of supplementary material available at Biostatistics online provides an example of predicting the number of events for a patient with given covariate values.

We check the model assumptions by using the diagnostic tools described in Section 2. The supremum tests for checking the functional forms of the continuous variables CPX, LVEF, and BDI have $p$-values of 0.523, 0.217, and 0.308, respectively. The supremum tests for checking the proportionality assumptions on CPX, LVEF, BDI, AFF, HF etiology, and treatment group have $p$-values of 0.138, 0.328, 0.070, 0.300, 0.105, and 0.256, respectively. The $p$-value for checking the exponential link function is 0.083. Thus, the model fits the data reasonably well. A subset of the residual plots is displayed in Figure S2 of supplementary material available at Biostatistics online.

5. Discussion

The presence of a terminal event poses serious challenges in the analysis of recurrent event data. The existing methods treating recurrent and terminal events as two separate endpoints have not been well received by clinicians or regulatory agencies. The nonparametric tests of Ghosh and Lin (2000) have been used in recent cardiovascular trials (Anand and others, 2009; Rogers and others, 2012), but only as secondary analysis; none of the other methods seem to have been used in actual clinical trials. The current practice is to use the first composite event as the primary endpoint. This endpoint disregards the information on the clinical events beyond the first one and does not distinguish the two types of events. The weighted composite event process is a natural extension of the current measure to enhance statistical power and clinical relevance. This endpoint is particularly useful when there are several types of recurrent events, some of which might have too few occurrences to be analyzed separately.

We have proposed a novel proportional rates/means model for studying the effects of treatments and other covariates on the weighted composite event process and provided the corresponding inference procedures. We have demonstrated that the proposed inference procedures have desirable asymptotic and finite-sample properties. We have shown both analytically and numerically that the LWYY approach always over-estimates the mean function of the (unweighted) composite event process (whether or not recurrent and terminal events are correlated) and generally yields biased estimation of the regression parameters.

Although the concept of proportional rates/means is simple and attractive, it is not obvious that the model can hold for the weighted composite event process. We have shown that there are realistic data generation mechanisms which satisfy this model. In addition, we have provided graphical and numerical methods to assess the adequacy of the model.

When constructing the estimating function for model (2.1), we exclude the censoring by death from the at-risk indicators. It seems counter-intuitive to regard a subject as being at risk after death. However, "at risk" is a mathematical construct to ensure unbiased estimating functions. If there is no censoring by $C$, the composite event process $N^*(\cdot)$ is fully observed; in that case, it is clear that censoring $N^*(\cdot)$ at $D$ is mathematically wrong.

Regulatory submissions require that the treatment efficacy be represented by a single parameter in the primary analysis. The rate (or mean) ratio for the weighted composite event process proposed in this article satisfies this requirement and provides a fuller and more meaningful characterization of the clinical course than the hazard ratio for the first composite event. It is sensible to combine death and life-threatening recurrent events (e.g., heart failure or stroke) with appropriate weighting in the primary analysis.

The analysis based on the composite event process provides an overall assessment of the treatment efficacy. A significant treatment effect on the composite endpoint does not imply significant treatment effects on all its components. The existing methods that treat terminal and recurrent events as two separate endpoints can be used to determine the nature of the treatment effect. If the treatment reduces the frequencies of both terminal and recurrent events, then its clinical benefits are clear. Because the occurrence of the terminal event precludes further development of recurrent events, it is possible for the treatment to reduce the risk of the terminal event and increase the incidence of recurrent events.

The choice of the weights will affect the power of the statistical analysis and the interpretation of the results. If the treatment effect on the terminal event is similar to or smaller than the treatment effect on recurrent events, then giving more weight to the terminal event than to recurrent events will reduce statistical power, as evidenced by the simulation results and the HF-ACTION study. On the other hand, a composite endpoint that is dominated by recurrent events may not be of great interest to clinicians. One may choose the weights in a data-adaptive manner such that the weight for the terminal event depends on how many patients have experienced the terminal event. The weighting scheme should be specified a priori in consultation with the appropriate drug approval agency and clinicians.

Supplementary material

Supplementary material is available online at http://biostatistics.oxfordjournals.org.

Acknowledgments

Conflict of Interest: None declared.

Funding

This research was supported by the NIH grants R01GM047845, R01AI029168, and P01CA142538.


References

1. Abrams D. I., Goldman A. I., Launer C., Korvick J. A., Neaton J. D., Crane L. R., Grodesky M., Wakefield S., Muth K., Kornegay S., Cohn D. L., Harris A., Luskin-Hawk R., Markowitz N., Sampson J. H., Thompson M., Deyton L. (1994). A comparative trial of didanosine or zalcitabine after treatment with zidovudine in patients with human immunodeficiency virus infection. New England Journal of Medicine 330, 657–662.
2. Anand I. S., Carson P., Galle E., Song R., Boehmer J., Ghali J. K., Jaski B., Lindenfeld J., O'Connor C., Steinberg J. S., Leigh J., Yong P., Kosorok M. R., Feldman A. M., DeMets D., Bristow M. R. (2009). Cardiac resynchronization therapy reduces the risk of hospitalizations in patients with advanced heart failure: results from the Comparison of Medical Therapy, Pacing and Defibrillation in Heart Failure (COMPANION) trial. Circulation 119, 969–977.
3. Andersen P. K., Gill R. D. (1982). Cox's regression model for counting processes: a large sample study. The Annals of Statistics 10, 1100–1120.
4. Armstrong P. W., Westerhout C. M., Van de Werf F., Califf R. M., Welsh R. C., Wilcox R. G., Bakal J. A. (2011). Refining clinical trial composite outcomes: an application to the Assessment of the Safety and Efficacy of a New Thrombolytic-3 (ASSENT-3) trial. American Heart Journal 161, 848–854.
5. Braunwald E., Cannon C. P., McCabe C. H. (1992). An approach to evaluating thrombolytic therapy in acute myocardial infarction. The "unsatisfactory outcome" end point. Circulation 86, 683–687.
6. Byar D. P. (1980). The Veterans Administration study of chemoprophylaxis for recurrent stage I bladder tumors: comparisons of placebo, pyridoxine, and topical thiotepa. In Pavone-Macaluso M., Smith P. H. and Edsmyr F. (editors), Bladder Tumors and Other Topics in Urological Oncology. New York: Plenum, pp. 363–370.
7. Califf R. M., Harrelson-Woodlief L., Topol E. J. (1990). Left ventricular ejection fraction may not be useful as an end point of thrombolytic therapy comparative trials. Circulation 82, 1847–1853.
8. Chen B. E., Cook R. J. (2004). Tests for multivariate recurrent events in the presence of a terminal event. Biostatistics 5, 129–143.
9. Cook R. J., Lawless J. F. (1997). Marginal analysis of recurrent events and a terminating event. Statistics in Medicine 16, 911–924.
10. Cook R. J., Lawless J. F., Lakhal-Chaieb L., Lee K. A. (2009). Robust estimation of mean functions and treatment effects for recurrent events under event-dependent censoring and termination: application to skeletal complications in cancer metastatic to bone. Journal of the American Statistical Association 104, 60–75.
11. Cox D. R. (1972). Regression models and life-tables (with discussion). Journal of the Royal Statistical Society, Series B 34, 187–220.
12. Fleming T. R., Harrington D. (1991). Counting Processes and Survival Analysis. New York: Wiley.
13. Ghosh D., Lin D. Y. (2000). Nonparametric analysis of recurrent events and death. Biometrics 56, 554–562.
14. Ghosh D., Lin D. Y. (2002). Marginal regression models for recurrent and terminal events. Statistica Sinica 12, 663–688.
15. Huang C., Wang M. (2004). Joint modeling and estimation for recurrent event processes and failure time data. Journal of the American Statistical Association 99, 1153–1165.
16. Lawless J. F., Nadeau C. (1995). Some simple robust methods for the analysis of recurrent events. Technometrics 37, 158–168.
17. Lewis J. A. (1999). Statistical principles for clinical trials (ICH E9): an introductory note on an international guideline. Statistics in Medicine 18, 1903–1942.
18. Lin D. Y., Wei L. J., Yang I., Ying Z. (2000). Semiparametric regression for the mean and rate functions of recurrent events. Journal of the Royal Statistical Society, Series B 62, 711–730.
19. Liu L., Wolfe R. A., Huang X. (2004). Shared frailty models for recurrent events and a terminal event. Biometrics 60, 747–756.
20. Neaton J. D., Gray G., Zuckerman B. D., Konstam M. A. (2005). Key issues in end point selection for heart failure trials: composite end points. Journal of Cardiac Failure 11, 567–575.
21. O'Connor C. M., Whellan D. J., Lee K. L., Keteyian S. J., Cooper L. S., Ellis S. J., Leifer E. S., Kraus W. E., Kitzman D. W., Blumenthal J. A. and others (2009). Efficacy and safety of exercise training in patients with chronic heart failure: HF-ACTION randomized controlled trial. Journal of the American Medical Association 301, 1439–1450.
22. Pepe M. S., Cai J. (1993). Some graphical displays and marginal regression analyses for recurrent failure times and time dependent covariates. Journal of the American Statistical Association 88, 811–820.
23. Pfeffer M. A., Swedberg K., Granger C. B., Held P., McMurray J. J., Michelson E. L., Olofsson B., Ostergren J., Yusuf S. (2003). Effects of candesartan on mortality and morbidity in patients with chronic heart failure: the CHARM-Overall programme. The Lancet 362, 759–766.
24. Prentice R. L., Williams B. J., Peterson A. V. (1981). On the regression analysis of multivariate failure time data. Biometrika 68, 373–379.
25. Robins J. M., Rotnitzky A. (1992). Recovery of information and adjustment for dependent censoring using surrogate markers. In Jewell N., Dietz K. and Farewell V. (editors), AIDS Epidemiology. Boston: Birkhauser, pp. 297–331.
26. Rogers J. K., McMurray J. J. V., Pocock S. J., Zannad F., Krum H., van Veldhuisen D. J., Swedberg K., Shi H., Vincent J., Pitt B. (2012). Eplerenone in patients with systolic heart failure and mild symptoms: analysis of repeat hospitalizations. Circulation 126, 2317–2323.
27. Schaubel D. E., Zeng D., Cai J. (2006). A semiparametric additive rates model for recurrent event data. Lifetime Data Analysis 12, 389–406.
28. Vlahov D., Anthony J. C., Munoz A., Margolick J., Nelson K. E., Celentano D. D., Solomon L., Polk B. F. (1991). The ALIVE study: a longitudinal study of HIV-1 infection in intravenous drug users: description of methods. Journal of Drug Issues 21, 758–776.
29. Wang M. C., Qin J., Chiang C. T. (2001). Analyzing recurrent event data with informative censoring. Journal of the American Statistical Association 96, 1057–1065.
30. Wei L. J., Lin D. Y., Weissfeld L. (1989). Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. Journal of the American Statistical Association 84, 1065–1073.
31. Ye Y., Kalbfleisch J. D., Schaubel D. E. (2007). Semiparametric analysis of correlated recurrent and terminal events. Biometrics 63, 78–87.
32. Yusuf S., Pfeffer M. A., Swedberg K., Granger C. B., Held P., McMurray J. J., Michelson E. L., Olofsson B., Östergren J. (2003). Effects of candesartan in patients with chronic heart failure and preserved left-ventricular ejection fraction: the CHARM-Preserved Trial. The Lancet 362, 777–781.
33. Zannad F., McMurray J. J., Krum H., van Veldhuisen D. J., Swedberg K., Shi H., Vincent J., Pocock S. J., Pitt B. (2011). Eplerenone in patients with systolic heart failure and mild symptoms. New England Journal of Medicine 364, 11–21.
34. Zeng D., Cai J. (2010). A semiparametric additive rate model for recurrent events with an informative terminal event. Biometrika 97, 699–712.
35. Zeng D., Lin D. Y. (2009). Semiparametric transformation models with random effects for joint analysis of recurrent and terminal events. Biometrics 65, 746–752.
