Abstract
Quantile residual lifetime analysis is conducted to compare remaining lifetimes among groups for survival data. Evaluating residual lifetimes among groups after adjustment for covariates is often of interest. The current literature is limited to comparing two groups for independent data. We propose a pseudo–value approach to compare quantile residual lifetimes given covariates between multiple groups for independent and clustered survival data. The proposed method accommodates clustered event times and clustered censoring times in addition to independent event times and censoring times. We show that the method can also be used to compare multiple groups on the cause–specific residual life distribution in the competing risks setting, for which no existing methods account for clustering. The empirical Type I errors and statistical power of the proposed method are examined in a simulation study, which shows that the proposed method controls Type I errors very well and has higher power than an existing method. The proposed method is illustrated with a bone marrow transplant data set.
Keywords and phrases: pseudo–value, residual lifetime, clustered data
1. Introduction
A blood and marrow transplant is one of the most widely used procedures to treat cancers including leukemia, lymphoma, and multiple myeloma. As patients survive longer, pre– and post–transplant exposures have the potential to compromise life expectancy and can contribute to the development of late complications (Majhail and Rizzo, 2013). Thus, clinicians in bone marrow transplantation are often interested in studying survival outcomes for patients who survived for at least some specific period after transplant. For example, Martin et al. (2010) studied residual life expectancy in patients surviving more than 5 years after allogeneic or autologous hematopoietic cell transplantation from 1970 through 2002. In practice, survival outcomes and censoring times of bone marrow transplant data are often clustered due to a transplant center effect. Thus, developing statistical models for clustered data is essential in studying residual life, that is, the remaining lifetime of a patient given that the patient has survived at least to time t.
Quantile residual lifetime analysis is often preferred when the distribution of the residual lifetime is skewed (Ma and Wei, 2012). Statistical methods for residual lifetime analysis with survival data have been recently developed as in Kim, Zhou and Jeong (2012) and Lin, Zhang and Zhou (2014). Although they incorporate covariates into the models, they are restricted to independent survival data.
To illustrate some existing methods, consider independent survival data with sample size n. Let Ti, Ci, and Zi be the event time, censoring time, and covariate vector of individual i, respectively, for i = 1, …, n. For simplicity, we assume that Zi’s are fixed over time. The observed time of individual i is defined by Xi = min(Ti, Ci). Let Δi = I(Ti ≤ Ci). Define qτ as the τth conditional quantile residual lifetime given survival to time t0 for covariate Z = z0. Then, qτ satisfies
S(t0 + qτ|z0) − (1 − τ)S(t0|z0) = 0,   (1.1)
where S(t|z0) is the survival probability at time t given the covariate Z = z0. The Cox proportional hazards model (Cox, 1972) and Breslow estimator (Breslow, 1972) can be used to consistently estimate S(t|Z), as in Lin (2007) and Zhao et al. (2015). Lin, Zhang and Zhou (2014) estimated a solution to (1.1) by solving
Ŝ(t0 + qτ|z0) − (1 − τ)Ŝ(t0|z0) = 0.   (1.2)
Let q̂τ be a solution to (1.2). Lin, Zhang and Zhou (2014) studied the asymptotics of q̂τ and proposed one–sample and two–sample tests based on q̂τ for comparing conditional residual lifetimes given Z = z0. However, they did not discuss methods for comparing more than two groups, and their method is only applicable to independent survival data. In addition, estimating the variance of q̂τ requires simulation–based methods such as resampling and the bootstrap in practice (Zeng and Lin, 2008). Thus, it is desirable to develop a non–simulation–based method for multiple group comparisons with independent and clustered data.
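As a simple illustration of (1.1), not taken from the cited papers, suppose the conditional distribution given Z = z0 is exponential, S(t|z0) = exp{−λ(z0)t}. Then S(t0 + qτ|z0) = (1 − τ)S(t0|z0) reduces to exp{−λ(z0)qτ} = 1 − τ, so qτ = −log(1 − τ)/λ(z0) for every landmark time t0: only under this memoryless model is the quantile residual lifetime free of t0. Skewed, non-exponential residual life distributions, such as those arising in transplant data, make qτ depend on t0, which is the situation of interest here.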
Jeong and Fine (2009) proposed analyzing the cause–specific residual life distribution for the competing risks setting. For failure time T and cause of failure ε, let the cumulative incidence function for cause k be Fk(t) = P(T ≤ t, ε = k). They defined the residual cumulative incidence function given survival to t0 for cause k as P(T − t0 ≤ t, ε = k | T > t0) = {Fk(t0 + t) − Fk(t0)}/S(t0), where S(t0) = P(T > t0).
Let Qτ be the τth quantile of the cause k residual cumulative incidence function given survival to t0. Then, Qτ is defined as a solution to

Fk(t0 + Qτ) − Fk(t0) − τS(t0) = 0.

Note that if there is at least one event for each cause, then Fk(∞) < 1 for all k. Although the above equation allows Fk(∞) < 1, Qτ may not exist for some extreme values of τ. In this paper, we only consider cases in which Qτ exists. Jeong and Fine (2013) developed nonparametric one–sample and two–sample tests. Like Lin, Zhang and Zhou (2014), they did not discuss methods for comparing more than two groups, and their method is only applicable to independent data. Moreover, their proposed method does not take covariates into account.
We propose a pseudo–value–based method to compare quantile residual lifetimes or residual cumulative incidence given covariates between multiple groups for clustered survival and competing risks data. The pseudo–values can be used as a response variable under the generalized estimating equations (GEE) setting. Thus, statistical inference on clustered data can be readily handled via the GEE. Thanks to this technical convenience, the GEE using pseudo–values has been widely used to make inference on survival or competing risks data for independent and clustered data (Andersen, Klein and Rosthøj, 2003; Logan, Zhang and Klein, 2011). Our proposed method using pseudo–values is flexible, and the strategy can be applied to independent or clustered data, as well as to residual lifetimes or cumulative incidence, with minimal modifications. Standard errors and inference are also straightforward, without requiring simulation–based techniques. Importantly, it allows for covariate–specific inference for comparing residual cumulative incidence functions, which has not previously been accomplished. We describe our motivating data in Section 2. Then, in Section 3 we extend the pseudo–value technique, which previously was restricted to clustered event times and independent censoring times, to also handle clustered censoring times. In Section 4, we propose a test statistic based on pseudo–values and study its asymptotic distribution. A simulation study is conducted in Section 5. A bone marrow transplant example is illustrated in Section 6. Section 7 gives a brief conclusion.
2. Data
The data for this application were collected by the Center for International Blood and Marrow Transplant Research (Shaw et al., 2010). It consists of pediatric patients (< 18 years) undergoing allogeneic T-replete, myeloablative bone marrow transplantation between 1993 and 2006. We study relapse and disease-free survival (DFS) of patients with severe disease conditions in this paper: intermediate or advanced disease status. The data contains 847 patients from 99 transplant centers. A significant center effect exists in DFS rates (p–value = 0.038) and relapse (p–value = 0.042) at a significance level of 0.05 using the random effect score test of Commenges and Andersen (1995). The censoring times of both endpoints are also correlated because their p–values from the score test are 0.050 and 0.008 for DFS and relapse, respectively.
We consider four variables: disease status, donor type, disease type, and recipient age at transplant. They were all identified as significant factors for DFS in the marginal Cox proportional hazards model (Lee, Wei and Amato, 1992). There are 619 and 228 patients with intermediate disease status and advanced disease status, respectively. Donor type has three groups: 80 patients with one-antigen mismatched related donors or phenotypically matched nonsibling related donors, 583 patients with human leukocyte antigen (HLA) identical sibling donors, and 184 patients with 8/8 allele-matched unrelated donors. Three disease types are considered: 170 patients with acute myeloid leukemia (AML), 547 patients with acute lymphoblastic leukemia (ALL), and 130 patients with myelodysplastic syndrome (MDS). Two groups are studied for recipient age as in Shaw et al. (2010): 554 patients who are younger than or equal to 10 years old and 293 who are older than 10 years old.
Many of the deadly treatment–related complications occur within 3 months after bone marrow transplant. Early complications include acute graft–versus–host disease, engraftment failure, and various early infections. The left plot of Figure 1 shows the cumulative incidence rate of treatment–related mortality (TRM). The dotted vertical line indicates 6 months after transplant. The majority of the treatment–related deaths occurred within 6 months post transplant due to early complications. Thus, it is of interest to study the residual lifetime of patients who survived disease–free to at least 6 months after transplant. There were 547 patients from 86 centers who survived disease–free to at least 6 months. The right plot of Figure 1 shows the histogram of the last–follow–up times of patients who survived disease–free to at least 6 months. The median last–follow–up time and the longest last–follow–up time are 15 months and 171 months, respectively. For patients who survived disease–free to at least 6 months, 48% of relapses and 47% of DFS events occurred between 6 months and 1 year after transplant. An additional 21% of relapses and 20% of DFS events occurred between 12 months and 18 months post transplant. This suggests that the distributions of event times for patients who survived disease–free to at least 6 months after transplant are skewed for relapse and DFS. Therefore, we will study the conditional quantile residual lifetime of patients who survived disease–free to at least 6 months for relapse and DFS based on the four variables mentioned in the previous paragraph.
Fig 1.
Cumulative incidence rate of treatment-related mortality (left plot) and histogram of residual follow–up time (right plot). The dotted vertical line of the left plot indicates 6 months after transplant.
3. Pseudo–value approach
In this section, we extend the pseudo–value approach to clustered event times and clustered censoring times for the competing risks and survival settings. We first consider the competing risks setting with two causes of failure ε ∈ {1, 2}. We assume that there are m clusters and each cluster has ℓ individuals, where n = m × ℓ is the total sample size. Note that clusters of different sizes can be accommodated by defining the censoring times to be zero when observed times are missing (Spiekerman and Lin, 1998). Let Tij, Cij, εij, and Zij be the event time, censoring time, cause of failure, and covariate vector of individual j in cluster i, respectively, for i = 1, …, m and j = 1, …, ℓ. Let Ti = {Tij, j = 1, …, ℓ}, Ci = {Cij, j = 1, …, ℓ}, εi = {εij, j = 1, …, ℓ}, and Zi = {Zij, j = 1, …, ℓ}. Suppose that (Ti, εi, Ci, Zi) are independent and identically distributed (iid). We assume that the Cij’s do not depend on the Zij’s and that the (Tij, εij)’s are independent of the Cij’s given the Zij’s for i = 1, …, m and j = 1, …, ℓ. This setting allows the event times to be correlated within the same cluster; similarly, the censoring times may be correlated within the same cluster. The Cij’s are assumed to have a common distribution G(t) = P(C ≥ t), where C is a censoring time. Let Xij = min(Tij, Cij) be the observed time and Δij = I(Tij ≤ Cij).
To define a pseudo–value, we first consider the unadjusted marginal cumulative incidence function for cause 1 without loss of generality; we adjust for covariates under the GEE framework after obtaining the pseudo–values. Let F1(t) = P(T ≤ t, ε = 1) and Nkij(t) = I(Tij ≤ t)I(εij = k)Δij for k = 1, 2. Define Nij(t) = N1ij(t) + N2ij(t). Let Yij(t) = I{t ≤ Xij} and Y(t) = Σi=1m Σj=1ℓ Yij(t). The cumulative incidence function can be estimated by F̂1(t) = ∫0t Ŝ(u−) dĤ1(u), where Ŝ(u−) is the Kaplan–Meier estimate of event–free survival just before time u and Ĥ1(t) is the estimated cause–specific cumulative hazard function given by

Ĥ1(t) = ∫0t {Σi=1m Σj=1ℓ dN1ij(u)}/Y(u).
Zhou et al. (2012) showed that F̂1(t) is a consistent estimator of F1(t) for clustered competing risks event times and clustered censoring times.
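To make the pooled, unadjusted estimator above concrete, the following is a minimal Python sketch of F̂1(t) = ∫0t Ŝ(u−) dĤ1(u). The function name, the 0/1/2 coding of the cause indicator, and the flat data layout are illustrative assumptions rather than part of the original method, and ties are handled in the simplest possible way.

```python
import numpy as np

def cuminc_cause1(t, time, status):
    """Aalen-Johansen-type estimate of F1(t) = P(T <= t, cause = 1).
    time:   observed times X_ij (event or censoring), 1-d array
    status: 0 = censored, 1 = cause-1 event, 2 = cause-2 event
    Returns the estimated cumulative incidence of cause 1 at time t."""
    time = np.asarray(time, dtype=float)
    status = np.asarray(status)
    at_risk = len(time)
    surv = 1.0      # Kaplan-Meier estimate of event-free survival, S-hat(u-)
    F1 = 0.0
    for u in np.unique(time):          # unique times in increasing order
        if u > t:
            break
        here = time == u
        d_all = np.sum(status[here] > 0)     # events of any cause at u
        d1 = np.sum(status[here] == 1)       # cause-1 events at u
        F1 += surv * d1 / at_risk            # dF1(u) = S-hat(u-) dH1-hat(u)
        surv *= 1.0 - d_all / at_risk        # update KM after time u
        at_risk -= np.sum(here)              # drop failures and censorings at u
    return F1
```

For instance, cuminc_cause1(12.0, time, status) would return the estimated probability of a cause-1 event by 12 months in the pooled sample.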
A pseudo–value at time t of the jth individual in the ith cluster for F1(t) is defined by

θ̂ij(t) = nF̂1(t) − (n − 1)F̂1−ij(t),

where F̂1−ij(t) is the cumulative incidence estimate obtained by omitting the jth individual in the ith cluster. Let Ncij(t) = I(Xij ≤ t)(1 − Δij) and Hc(t) be the cumulative hazard function obtained by treating censored observations as events, and let Mcij(t) = Ncij(t) − ∫0t Yij(u) dHc(u) be the corresponding censoring martingale. Let π(t) = P{min(T, C) ≥ t}, where T is an event time of any cause. Then, as in the Supplementary Materials, we can show

θ̂ij(t) = N1ij(t)/G(Xij) + ∫0t {F1(t) − F1(u)}/π(u) dMcij(u) + op(1),   (3.1)
which implies that θ̂ij(t) asymptotically depends only on the data (Xij, Δij, εij) of individual j in cluster i. Thus, the cluster–level vectors {θ̂i1(t), …, θ̂iℓ(t)}T are asymptotically independent across clusters, with E{θ̂ij(t)} converging to F1(t). A generalized estimating equation (GEE) setting can be used by treating pseudo–values as a response variable (Andersen, Klein and Rosthøj, 2003; Logan, Zhang and Klein, 2011; Klein and Andersen, 2005). We illustrate the use of the GEE at a fixed time point t. To model the marginal cumulative incidence function at time t, we assume the cumulative incidence is related to covariates through a link function h(·) so that h{F1(t|Zij)} = βTZij. Let νij(t) = F1(t|Zij) = h−1(βTZij) and νi(t) = {νi1(t), …, νiℓ(t)}T for i = 1, …, m and j = 1, …, ℓ. Then, the GEE is defined as follows:
Σi=1m {∂νi(t)/∂β}T Vi−1 {θ̂i(t) − νi(t)} = 0,

where θ̂i(t) = {θ̂i1(t), …, θ̂iℓ(t)}T and Vi is a working covariance matrix for θ̂i(t). Statistical inference on β can be readily handled by using the sandwich estimator in GEE (Liang and Zeger, 1986), which is consistent even if the working correlation matrix is misspecified (Zeger, Liang and Albert, 1988). This sandwich estimator may slightly overestimate the variance of β̂, as Jacobsen and Martinussen (2014) pointed out. However, they argued that this overestimation should be minor in many applications because their simulation study showed that the sandwich estimator estimated the variance of β̂ very well unless β was too large. In addition, Logan, Zhang and Klein (2011), Ahn and Mendolia (2014), and Klein and Andersen (2005) showed that the sandwich estimator worked well for the pseudo–value approach under their various simulation settings.
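The following is a minimal sketch of how the jackknife pseudo–values and a sandwich-variance GEE of this type might be computed. It assumes an identity link, an independence working correlation, and a generic estimator(t, time, status) callable (for example, the cumulative incidence sketch above); all of these are illustrative choices, not requirements of the method.

```python
import numpy as np

def pseudo_values(t, time, status, estimator):
    """Leave-one-out pseudo-values: n*F-hat(t) - (n-1)*F-hat^(-ij)(t)."""
    time, status = np.asarray(time), np.asarray(status)
    n = len(time)
    full = estimator(t, time, status)
    keep = np.ones(n, dtype=bool)
    pv = np.empty(n)
    for k in range(n):
        keep[k] = False
        pv[k] = n * full - (n - 1) * estimator(t, time[keep], status[keep])
        keep[k] = True
    return pv

def gee_sandwich(pv, Z, cluster):
    """Pseudo-value GEE with identity link and independence working correlation;
    returns beta-hat and its Liang-Zeger sandwich covariance.
    Z: design matrix (including an intercept column); cluster: cluster labels
    (e.g., transplant centers)."""
    Z = np.asarray(Z, dtype=float)
    pv = np.asarray(pv, dtype=float)
    beta = np.linalg.lstsq(Z, pv, rcond=None)[0]   # identity link: least squares
    resid = pv - Z @ beta
    bread = np.linalg.inv(Z.T @ Z)
    meat = np.zeros((Z.shape[1], Z.shape[1]))
    for c in np.unique(cluster):
        idx = np.asarray(cluster) == c
        score = Z[idx].T @ resid[idx]              # cluster-level score contribution
        meat += np.outer(score, score)
    return beta, bread @ meat @ bread
```

The clustering only enters through the cluster-level grouping of the score contributions in the sandwich; ignoring it amounts to treating each individual as its own cluster.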
For the survival setting, the pseudo–value for survival at time t is defined as Ŝij(t) = nŜ(t) − (n − 1)Ŝ−ij(t) for i = 1, …, m and j = 1, …, ℓ, where Ŝ−ij(t) is the Kaplan–Meier estimate obtained by omitting the jth individual in the ith cluster. As in the competing risks setting, we can show

Ŝij(t) = 1 − Nij(t)/G(Xij) − ∫0t {S(u) − S(t)}/π(u) dMcij(u) + op(1),   (3.2)

where Nij(t) = I(Tij ≤ t)Δij and T is the event time of interest. This shows that E{Ŝij(t)} converges to S(t) and that the Ŝij(t)’s from different clusters are asymptotically independent. The GEE setting can be similarly used as in the competing risks setting to directly model the marginal survival function at time t in the presence of clustered survival and censoring data.
4. Method
In this section, we propose a test to compare the conditional residual quantiles between multiple groups for independent and clustered data, given a fixed covariate level Z = z0. We first propose a test for the competing risks setting, where cause 1 is of interest.
We denote Qgτ, g = 1, …, ζ, as the τth conditional quantile of the cause 1 residual cumulative incidence given survival to time t0 for group g when the covariate level is Z = z0. Then, Qgτ is defined as a solution to

𝒰g(Qgτ) ≡ F1g(t0 + Qgτ|z0) − F1g(t0|z0) − τSg(t0|z0) = 0,

where F1g(t|z0) and Sg(t|z0) are the cumulative incidence and survival probability at time t given Z = z0 for group g. Assuming that F1g(t|z0) is absolutely continuous and f1g(t|z0) = dF1g(t|z0)/dt is positive on some neighborhood of t0 + Qgτ, the equation 𝒰g(Qgτ) = 0 has a unique solution. We are interested in testing H0: Q1τ = ⋯ = Qζτ ≡ Qτ. Testing H0 is equivalent to testing H0′: 𝒰1(Qτ) = ⋯ = 𝒰ζ(Qτ) = 0 due to the uniqueness of the solution of 𝒰g(Qτ) = 0 for g = 1, …, ζ. To test H0′, we follow the steps below:
(1) Obtain a consistent estimator Q̂τ of Qτ based on the pooled data;
(2) Compute the pseudo–values θ̂ij(t0 + Q̂τ), θ̂ij(t0), and Ŝij(t0) for i = 1, …, m and j = 1, …, ℓ;
(3) Fit a GEE using the pseudo–values to estimate F1g(t0 + Q̂τ|z0), F1g(t0|z0), and Sg(t0|z0) for g = 1, …, ζ;
(4) Estimate 𝒰g(Qτ) for g = 1, …, ζ and their covariance matrix using the estimates from the GEE;
(5) Obtain a quadratic form test statistic to test H0′.
For Step (1), we first study how to obtain consistent estimates of S(t|Z) and F1(t|Z). A consistent estimate of S(t|Z), even for clustered data, is obtained by the marginal Cox model and the Breslow estimator. See the Supplementary Materials for the details of the marginal Cox model and Breslow estimator for clustered data, and for the asymptotic distribution of Ŝ(t|Z). To obtain a consistent estimate of F1(t|Z), we use the marginal proportional subdistribution hazards model proposed by Zhou et al. (2012). Here the subdistribution hazard λ1(t|Z) = {dF1(t|Z)/dt}/{1 − F1(t|Z)} follows the proportional hazards representation

λ1(t|Z) = λ10(t) exp(βTZ),

where λ10(t) is an unspecified baseline subdistribution hazard. Under this model, the cumulative subdistribution hazard for cause 1 can be estimated by

Λ̂1(t|Z) = exp(β̂TZ) ∫0t {Σi Σj ŵij(u) dN1ij(u)}/{Σi Σj ŵij(u)Ỹij(u) exp(β̂TZij)},

where Ỹij(u) = 1 − N1ij(u−) is the at–risk indicator for the subdistribution hazard, ŵij(t) = I{Cij ≥ min(Xij, t)}Ĝ(t)/Ĝ{min(Xij, t)}, and Ĝ(t) is the Kaplan–Meier estimate at time t of the censoring distribution. Zhou et al. (2012) showed the consistency of F̂1(t|Z) = 1 − exp{−Λ̂1(t|Z)} for clustered event times and clustered censoring times.
Let F̂1(t|z0) and Ŝ(t|z0) be consistent estimators of F1(t|z0) and S(t|z0) based on the pooled data. Then, under the null hypothesis, Qτ can be consistently estimated by solving F̂1(t0 + Qτ |z0) − F̂1(t0|z0) − τ Ŝ (t0|z0) = 0. We estimate Q̂τ as the smallest Qτ at which F̂1(t0+Qτ |z0)−F̂1(t0|z0)−τŜ(t0|z0) crosses 0. Due to the consistency of F̂1(t|z0) and Ŝ(t|z0), Q̂τ is also consistent.
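The smallest-crossing-point rule above can be implemented with a simple grid search. In the sketch below, F1_hat and S_hat stand for the pooled, covariate-specific estimates F̂1(·|z0) and Ŝ(·|z0) wrapped as callables; this interface is assumed for illustration only.

```python
import numpy as np

def estimate_Q_tau(F1_hat, S_hat, t0, tau, grid):
    """Smallest Q on `grid` at which F1(t0+Q) - F1(t0) - tau*S(t0) crosses zero."""
    target = F1_hat(t0) + tau * S_hat(t0)
    for Q in np.sort(np.asarray(grid, dtype=float)):
        if F1_hat(t0 + Q) - target >= 0.0:   # first point where the estimating function is >= 0
            return Q
    return np.nan                            # no crossing: Q_tau does not exist for this tau
```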
For Step (2), we calculate the pseudo–values θ̂ij(t0 + Q̂τ), θ̂ij(t0), and Ŝij(t0) for individual j of cluster i as described in Section 3.
Because each 𝒰g(Q̂τ) depends on the components F1g(t0 + Q̂τ|z0), F1g(t0|z0), and Sg(t0|z0), we use a single pseudo–value regression model to obtain estimates of these components for each group in Step (3). One main advantage of using a single pseudo–value regression model is that it provides direct estimates of the covariance matrix across these components in the complex clustered data setting using the sandwich variance estimate, leading to straightforward estimation of the variance of the final test statistic. We parameterize a joint model using a link function h(·) so that

h{F1g(t0 + Q̂τ|z0)} = ϕ + θ1g,   h{F1g(t0|z0)} = ϕ + ϕ1 + θ2g,   h{Sg(t0|z0)} = ϕ + ϕ2 + θ3g,   g = 1, …, ζ,

where θ1ζ = θ2ζ = θ3ζ = 0. Here ϕ, ϕ + ϕ1, and ϕ + ϕ2 are intercept terms for F1g(t0 + Q̂τ|z0), F1g(t0|z0), and Sg(t0|z0), respectively, and θ1g, θ2g, and θ3g are parameters for an indicator function of group g = 1, …, ζ − 1 for F1g(t0 + Q̂τ|z0), F1g(t0|z0), and Sg(t0|z0), respectively. In matrix notation, consider the response variable vector Pi = (Pi1T, …, PiℓT)T, where Pij = {θ̂ij(t0 + Q̂τ), θ̂ij(t0), Ŝij(t0)}T for i = 1, …, m and j = 1, …, ℓ. Assuming N1ij(x) is continuous at x = Qτ + t0 with probability one, we can show that θ̂ij(t0 + Q̂τ) − θ̂ij(t0 + Qτ) converges in probability to zero as in the Supplementary Materials. Define the parameter vector α = (ϕ, ϕ1, ϕ2, θ1T, θ2T, θ3T)T, where θk = (θk1, …, θk,ζ−1)T for k = 1, 2, 3. Define the mean vector μi(α) = {μi1(α)T, …, μiℓ(α)T}T for i = 1, …, m, where μij(α) = E(Pij) is obtained by applying h−1 to the three linear predictors above for the group of individual j in cluster i. This joint model can be fitted using the GEE
Σi=1m {∂μi(α)/∂α}T Vi−1{Pi − μi(α)} = 0,

where Vi models the covariance matrix of Pi. Then, √m(α̂ − α) converges in distribution to N(0, Σf) given Q̂τ. The sandwich estimator is used to estimate Σf as follows:

Σ̂f = mÎ(α̂)−1 [Σi=1m D̂iT V̂i−1{Pi − μi(α̂)}{Pi − μi(α̂)}T V̂i−1 D̂i] Î(α̂)−1,

where D̂i = ∂μi(α)/∂α evaluated at α = α̂ and Î(α̂) = Σi=1m D̂iT V̂i−1 D̂i.
This GEE is used to estimate the parameter vector α and to obtain the covariance matrix of the estimate; however, what we are ultimately interested in is testing H0′ through the 𝒰g(Q̂τ)’s, which are specific to a particular covariate value z0.
Thus, in Step (4), we estimate 𝒰g(Q̂τ) under the null hypothesis by

𝒰̂g(Q̂τ) = F̂1g(t0 + Q̂τ|z0) − F̂1g(t0|z0) − τŜg(t0|z0),

where F̂1g(t0 + Q̂τ|z0), F̂1g(t0|z0), and Ŝg(t0|z0) are obtained by applying h−1 to the corresponding fitted linear predictors from the joint model.
Using the delta method, we can show that under the null hypothesis √m{𝒰̂1(Q̂τ), …, 𝒰̂ζ(Q̂τ)}T converges in distribution to N(0, ΩfΣfΩfT), where Ωf = ∂{𝒰1(Qτ), …, 𝒰ζ(Qτ)}T/∂αT is the Jacobian matrix of the 𝒰g’s with respect to α.
We can obtain Ω̂f by plugging α̂ into Ωf.
Finally, in Step (5), we construct a quadratic form test based on an estimate of the vector

{𝒰1(Qτ) − 𝒰̄(Qτ), …, 𝒰ζ(Qτ) − 𝒰̄(Qτ)}T,

which should have a mean vector of 0 under the null hypothesis. Let ng be the sample size of group g, so that n = n1 + ⋯ + nζ. Define the weighted mean of the 𝒰̂g(Q̂τ)’s as

𝒰̄(Q̂τ) = Σg=1ζ (ng/n)𝒰̂g(Q̂τ),
which converges in probability to zero under the null hypothesis. Then, under the null hypothesis, √m{𝒰̂1(Q̂τ) − 𝒰̄(Q̂τ), …, 𝒰̂ζ(Q̂τ) − 𝒰̄(Q̂τ)}T converges in distribution to N(0, BΩfΣfΩfTBT), where

B = Iζ − 1ζ(n1/n, …, nζ/n),

Iζ is a ζ × ζ identity matrix, and 1ζ is the ζ–dimensional column vector of ones. Note that BΩfΣfΩfTBT is not invertible. To test H0′, the final test statistic is

X2 = m{𝒰̂1(Q̂τ) − 𝒰̄(Q̂τ), …, 𝒰̂ζ(Q̂τ) − 𝒰̄(Q̂τ)}(BΩ̂Σ̂Ω̂TBT)−{𝒰̂1(Q̂τ) − 𝒰̄(Q̂τ), …, 𝒰̂ζ(Q̂τ) − 𝒰̄(Q̂τ)}T,
where (BΩ̂Σ̂Ω̂TBT)− is a generalized inverse of BΩ̂Σ̂Ω̂TBT. Under the null hypothesis, X2 converges in distribution to a chi–squared distribution with degrees of freedom ζ − 1.
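Given the plug-in quantities from the GEE fit, the quadratic form above amounts to a few matrix operations. The sketch below uses a Moore–Penrose pseudo-inverse as the generalized inverse and treats U_hat, Omega_hat, and Sigma_hat as already-computed inputs; these names and the interface are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def quadratic_form_test(U_hat, n_g, Omega_hat, Sigma_hat, m):
    """Chi-squared test of H0': U_1(Q_tau) = ... = U_zeta(Q_tau) = 0.
    U_hat:     estimated U_g(Q_tau-hat), length zeta
    n_g:       group sample sizes
    Omega_hat: estimated Jacobian of the U_g's with respect to alpha
    Sigma_hat: sandwich covariance of alpha-hat
    m:         number of clusters"""
    U_hat = np.asarray(U_hat, dtype=float)
    zeta = len(U_hat)
    rho = np.asarray(n_g, dtype=float) / np.sum(n_g)
    B = np.eye(zeta) - np.outer(np.ones(zeta), rho)     # centers at the weighted mean
    W = B @ U_hat                                        # U_g-hat minus U-bar
    V = B @ Omega_hat @ Sigma_hat @ Omega_hat.T @ B.T    # singular by construction
    X2 = m * W @ np.linalg.pinv(V) @ W                   # generalized inverse
    return X2, chi2.sf(X2, df=zeta - 1)
```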
The proposed method can be easily adapted to the one–sample setting, where we are interested in testing whether Qτ = Q0τ for a prespecified Q0τ. Let

ωf = ∂𝒰1(Q0τ)/∂αT.

Let ω̂f be an estimator of ωf obtained by plugging α̂ into ωf. We propose the test statistic m𝒰̂1(Q0τ)2/𝒟̂, where 𝒟̂ = ω̂fΣ̂fω̂fT. Under the null hypothesis, m𝒰̂1(Q0τ)2/𝒟̂ asymptotically follows a chi–squared distribution with one degree of freedom.
The proposed method for the survival setting can be constructed similarly; see the Supplementary Materials for the details.
5. Simulation
5.1. Survival setting
In this section, we perform a simulation study for the survival setting. We consider the one–sample test first. A positive stable frailty is used to generate correlated event times as in Logan, Zhang and Klein (2011). Independently of the event times, correlated censoring times are generated. We consider m = 50, 100, 200, and 400 with cluster size ℓ = 4. For each cluster, two independent random effects w and wc are generated from a positive stable frailty distribution with parameter ψ, where the Laplace transformation of the standard positive stable distribution is L(s) = exp(−s^ψ). Three ψ values are used: 0.25 and 0.5 for clustered data and 1 for independent data. These values of ψ correspond to values of Kendall’s τ equal to 0.75, 0.5, and 0, respectively (Logan, Zhang and Klein, 2011). For each cluster, three binary covariates are considered: i) Z1 is a cluster–level covariate: for each cluster, a random binary number is generated with probability 0.5 and is assigned to all individuals within the cluster; ii) Z2 is independently generated with probability 0.5 for each individual within a cluster; and iii) Z3 represents a matched pair design, so two individuals have 0’s and the other two have 1’s. Following Logan, Zhang and Klein (2011), we generate event and censoring times for each cluster from
S(t|w, Z) = exp{−wt exp(βTZ)},   Sc(t|wc) = exp(−wcλct),   (5.1)

where S(t|w, Z) and Sc(t|wc) are the conditional survival functions of the event and censoring times given the cluster–specific frailties w and wc, Z = (Z1, Z2, Z3)T, and ψβ = (1, −0.5, 0.5)T. We select λc to generate a 50% censoring rate. We consider the median residual lifetime given z0 = (1, 1, 1)T conditional on survival to t0, where S(t0|z0) = 0.8, which yields t0 = {−log(0.8)}^{1/ψ}/exp(1/ψ). Let q0τ be the true conditional median residual lifetime under the null hypothesis, where τ = 0.5. Then, q0τ = 0.013, 0.107, and 0.255 for ψ = 0.25, 0.5, and 1, respectively.
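A minimal sketch of one way to generate a single cluster under this positive stable frailty setup is given below. The Kanter-type representation used for the frailty and the exponential conditional hazards are standard choices; the function names and interface are illustrative assumptions.

```python
import numpy as np

def pos_stable(psi, size, rng):
    """Positive stable variates with Laplace transform exp(-s**psi), 0 < psi <= 1
    (Kanter-type representation); psi = 1 gives a degenerate frailty of 1."""
    if psi == 1.0:
        return np.ones(size)
    U = rng.uniform(0.0, np.pi, size)
    E = rng.exponential(1.0, size)
    return (np.sin((1.0 - psi) * U) / E) ** ((1.0 - psi) / psi) \
        * np.sin(psi * U) / np.sin(U) ** (1.0 / psi)

def simulate_cluster(psi, beta, lam_c, Z, rng):
    """One cluster: conditional on shared frailties w and wc, event and censoring
    times are exponential with rates w*exp(Z beta) and wc*lam_c, as in (5.1)."""
    w, wc = pos_stable(psi, 2, rng)
    T = rng.exponential(1.0 / (w * np.exp(Z @ beta)))        # event times
    C = rng.exponential(1.0 / (wc * lam_c), size=len(T))     # censoring times
    return np.minimum(T, C), (T <= C).astype(int)            # observed times, indicators

# e.g., one cluster of size 4; the text fixes psi*beta = (1, -0.5, 0.5),
# so beta = (2, -1, 1) when psi = 0.5
rng = np.random.default_rng(0)
Z = np.column_stack([np.repeat(rng.integers(0, 2), 4),   # Z1: cluster-level
                     rng.integers(0, 2, 4),              # Z2: individual-level
                     np.array([0, 0, 1, 1])])            # Z3: matched pairs
X, delta = simulate_cluster(0.5, np.array([2.0, -1.0, 1.0]), 1.0, Z, rng)
```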
To test qτ = q0τ, we test if S(t0 + q0τ|z0) − 0.5S(t0|z0) = 0 using GEE. The identity link and logit link functions with the independence working correlation matrix are considered. Table 1 shows the empirical Type I error probabilities with 5,000 iterations at a significance level of 0.05. For each link function, two GEE models are fitted: i) GEE ignoring within cluster correlation, that is, assuming independent data; and ii) GEE accounting for within cluster correlation. “PVI–Ind”, “PVI–Cluster”, “PVL–Ind”, and “PVL–Cluster” indicate the pseudo–value approach with identity link function assuming independent data, identity link function accounting for within cluster correlation, logit link function assuming independent data, and logit link function accounting for within cluster correlation, respectively. As we can see in Table 1, the pseudo–value approach ignoring within cluster correlation performs well for independent data. However, it becomes more liberal as the within cluster correlation increases. Thus, the presence of a cluster effect leads to underestimation of the variance when the cluster effect is ignored under our simulation setting. On the other hand, the pseudo–value approach accounting for within cluster correlation controls Type I error very well for clustered data. As m increases, its empirical Type I errors become closer to 0.05 in general. It appears that the pseudo–value approach with the identity link function controls Type I errors slightly better than that with the logit link function.
Table 1.
Empirical Type I error probabilities for the survival setting: “PVI–Ind”, “PVI–Cluster”, “PVL–Ind”, and “PVL–Cluster” indicate the pseudo–value approach with identity link function assuming independent data, identity link function accounting for within cluster correlation, logit link function assuming independent data, and logit link function accounting for within cluster correlation, respectively.
One–sample test

| ψ | m | PVI–Ind | PVI–Cluster | PVL–Ind | PVL–Cluster |
|---|---|---|---|---|---|
| 1 | 50 | 0.0600 | 0.0646 | 0.0658 | 0.0714 |
| 1 | 100 | 0.0506 | 0.0516 | 0.0530 | 0.0578 |
| 1 | 200 | 0.0506 | 0.0534 | 0.0508 | 0.0520 |
| 1 | 400 | 0.0502 | 0.0514 | 0.0504 | 0.0516 |
| 0.5 | 50 | 0.0918 | 0.0656 | 0.0990 | 0.0790 |
| 0.5 | 100 | 0.0872 | 0.0600 | 0.0902 | 0.0640 |
| 0.5 | 200 | 0.0868 | 0.0566 | 0.0832 | 0.0592 |
| 0.5 | 400 | 0.0862 | 0.0516 | 0.0848 | 0.0558 |
| 0.25 | 50 | 0.1172 | 0.0734 | 0.1286 | 0.0850 |
| 0.25 | 100 | 0.1148 | 0.0600 | 0.1100 | 0.0622 |
| 0.25 | 200 | 0.1096 | 0.0514 | 0.1046 | 0.0532 |
| 0.25 | 400 | 0.1122 | 0.0500 | 0.1024 | 0.0512 |

Three–sample test

| ψ | m | PVI–Ind | PVI–Cluster | PVL–Ind | PVL–Cluster |
|---|---|---|---|---|---|
| 1 | 50 | 0.0604 | 0.0732 | 0.0650 | 0.0812 |
| 1 | 100 | 0.0556 | 0.0584 | 0.0586 | 0.0626 |
| 1 | 200 | 0.0444 | 0.0504 | 0.0512 | 0.0524 |
| 1 | 400 | 0.0484 | 0.0514 | 0.0494 | 0.0520 |
| 0.5 | 50 | 0.0166 | 0.0714 | 0.0250 | 0.0810 |
| 0.5 | 100 | 0.0146 | 0.0594 | 0.0204 | 0.0614 |
| 0.5 | 200 | 0.0148 | 0.0536 | 0.0178 | 0.0514 |
| 0.5 | 400 | 0.0124 | 0.0514 | 0.0168 | 0.0506 |
| 0.25 | 50 | 0.0032 | 0.0592 | 0.0126 | 0.0770 |
| 0.25 | 100 | 0.0018 | 0.0586 | 0.0038 | 0.0610 |
| 0.25 | 200 | 0.0016 | 0.0524 | 0.0034 | 0.0546 |
| 0.25 | 400 | 0.0010 | 0.0468 | 0.0024 | 0.0514 |
Next, we consider the three–group comparison. A cluster size of ℓ = 6 is considered with two binary covariates: i) Z1 is a cluster–level covariate generated with probability 0.5 and shared by all individuals within a cluster; and ii) Z2 is independently generated with probability 0.5 for each individual within a cluster. Each group has two individuals from each cluster. We generate event and censoring times from (5.1) with Z = (Z1, Z2)T and ψβ = (1, 0.5)T. We choose λc to generate a 50% censoring rate. We consider the median residual lifetime given z0 = (1, 1)T conditional on survival to t0, where S(t0|z0) = 0.8, which yields t0 = {−log(0.8)}^{1/ψ}/exp(1.5/ψ). Table 1 shows the empirical Type I errors from the three–sample test with 5,000 iterations. As in the one–sample test setting, the pseudo–value approach accounting for within cluster correlation controls Type I errors very well for clustered data. As m increases, the empirical Type I errors become closer to 0.05. The pseudo–value approach ignoring within cluster correlation works properly only for independent data, as expected. It is somewhat conservative for clustered data. This is because our three–group–comparison setting is more like a matched pairs setting, where ignoring the cluster effect amounts to ignoring the reduction in variance due to the within cluster group comparison. Therefore, the variance is overestimated and the Type I error is conservative. A similar phenomenon was seen in Logan, Zhang and Klein (2011). We also examined various τ = 0.1, 0.25, 0.75, and 0.9. In addition, we performed a simulation assuming Z2 follows N(0, 1), to study the median residual lifetime given z0 = (0.5, 1)T conditional on survival to t0. In all scenarios, the proposed method performed as well as in Table 1, so the results are omitted.
We also compare the proposed method to Lin, Zhang and Zhou (2014). Because the method of Lin, Zhang and Zhou (2014) is restricted to the two–sample test, a two–group comparison is considered. Let qgτ be the conditional median residual lifetime of group g for g = 1 and 2. To test q1τ = q2τ, Lin, Zhang and Zhou (2014) proposed the test statistic

L = (q̂1τ − q̂2τ)2/(σ̂1τ2 + σ̂2τ2),

where q̂iτ is obtained by solving (1.2) for group i and σ̂iτ2 is the estimated variance of q̂iτ for i = 1, 2. Under the null hypothesis, L follows a chi–squared distribution with one degree of freedom. The bootstrap with 500 iterations is used to estimate σ̂iτ2 for i = 1, 2. We also conducted the bootstrap with 1,000 iterations for some of the simulation scenarios and obtained very similar results to those with 500 iterations. The setting for examining empirical Type I errors is the same as in the three–group comparison except that ℓ = 4. Thus, each cluster consists of two groups instead of three. We compare the two median residual lifetimes given z0 = (1, 1)T conditional on survival to t0, where S(t0|z0) = 0.8. We examine m = 100 and m = 200. Table 2 summarizes the results of 5,000 iterations. For independent data, all methods control Type I errors well, although the pseudo–value approaches ignoring within cluster correlation seem to work slightly better than the others. However, Lin, Zhang and Zhou (2014) and the pseudo–value approaches ignoring within cluster correlation are conservative for clustered data. To compare power, survival and censoring times are generated from
Table 2.
Two–sample setting with survival data and comparison to Lin et al. (2014): “PVI–Ind”, “PVI–Cluster”, “PVL–Ind”, and “PVL–Cluster” indicate the pseudo–value approach with identity link function assuming independent data, identity link function accounting for within cluster correlation, logit link function assuming independent data, and logit link function accounting for within cluster correlation, respectively.

Empirical Type I errors

| ψ | m | PVI–Ind | PVI–Cluster | PVL–Ind | PVL–Cluster | Lin et al. |
|---|---|---|---|---|---|---|
| 1 | 100 | 0.0552 | 0.0598 | 0.0540 | 0.0552 | 0.0594 |
| 1 | 200 | 0.0492 | 0.0486 | 0.0486 | 0.0506 | 0.0612 |
| 0.5 | 100 | 0.0204 | 0.0588 | 0.0238 | 0.0526 | 0.0382 |
| 0.5 | 200 | 0.0214 | 0.0558 | 0.0278 | 0.0558 | 0.0398 |
| 0.25 | 100 | 0.0046 | 0.0574 | 0.0074 | 0.0590 | 0.0304 |
| 0.25 | 200 | 0.0030 | 0.0560 | 0.0046 | 0.0540 | 0.0270 |

Empirical statistical power

| ψ | m | PVI–Ind | PVI–Cluster | PVL–Ind | PVL–Cluster | Lin et al. |
|---|---|---|---|---|---|---|
| 1 | 100 | 0.6714 | 0.6742 | 0.5614 | 0.5712 | 0.2002 |
| 1 | 200 | 0.9304 | 0.9298 | 0.8554 | 0.8572 | 0.5256 |
| 0.5 | 100 | 0.6960 | 0.8216 | 0.5788 | 0.6584 | 0.0536 |
| 0.5 | 200 | 0.9572 | 0.9840 | 0.8832 | 0.9230 | 0.2616 |
| 0.25 | 100 | 0.7412 | 0.9054 | 0.6154 | 0.7400 | 0.0022 |
| 0.25 | 200 | 0.9732 | 0.9986 | 0.8986 | 0.9522 | 0.0190 |
where Si(t|w) is the survival probability for group i, ψβ = (1, 0.5, 0.5)T, and Zi’s have the same definitions as in the three–group comparison setting. As can be seen from Table 2, the pseudo–value approach has higher statistical power than Lin, Zhang and Zhou (2014). It appears that the pseudo–value approach with the identity link has the highest power among the five methods.
5.2. Competing risks setting
In this section, we conduct a simulation study for the competing risks setting. Consider the one–sample test first. As in Section 5.1, a positive stable frailty is used to generate correlated event times and censoring times. We consider m = 50, 100, 200, and 400 with cluster size ℓ = 4. For each cluster, two independent random effects w and wc are generated from a positive stable frailty distribution with parameter ψ = 0.25, 0.5, and 1. Three binary covariates are considered, where the definitions of Z1, Z2, and Z3 are the same as in Section 5.1. Similarly to Logan, Zhang and Klein (2011), we generate competing risks event times and censoring times for each cluster from
| (5.2) |
where Z = (Z1, Z2, Z3)T and ψβ = (1, −0.5, 0.5)T. We select p and λc to generate 30% cause 1 events, 30% cause 2 events, and 40% censoring. We consider the 0.25th quantile of the cause 1 residual cumulative incidence for z0 = (1, 1, 1)T conditional on survival to t0, where S(t0|z0) = 0.8. Let Q0τ be the true conditional quantile under the null hypothesis, where τ = 0.25. Then, Q0τ = 0.068, 0.194, and 0.341 for ψ = 0.25, 0.5, and 1, respectively. To test if Qτ = Q0τ, we test if 𝒰1(Q0τ) = F1(Q0τ + t0|z0) − F1(t0|z0) − τS(t0|z0) = 0 using GEE. The identity link and logit link functions with the independence working correlation matrix are considered. Table 3 shows the empirical Type I error probabilities with 5,000 iterations at a significance level of 0.05. The pseudo–value approaches ignoring within cluster correlation become more liberal as the within cluster correlation increases, although they work well for independent data. On the other hand, the pseudo–value approach accounting for within cluster correlation controls Type I error very well for clustered data. As m increases, its empirical Type I errors become closer to 0.05 in general. It appears that the pseudo–value approach with the identity link function controls Type I errors slightly better than that with the logit link function, as in the survival setting.
Table 3.
Empirical Type I error probabilities for the competing risks setting: “PVI–Ind”, “PVI–Cluster”, “PVL–Ind”, and “PVL–Cluster” indicate the pseudo–value approach with identity link function assuming independent data, identity link function accounting for within cluster correlation, logit link function assuming independent data, and logit link function accounting for within cluster correlation, respectively.
One–sample test

| ψ | m | PVI–Ind | PVI–Cluster | PVL–Ind | PVL–Cluster |
|---|---|---|---|---|---|
| 1 | 50 | 0.0560 | 0.0650 | 0.0914 | 0.0970 |
| 1 | 100 | 0.0544 | 0.0580 | 0.0658 | 0.0680 |
| 1 | 200 | 0.0522 | 0.0588 | 0.0586 | 0.0590 |
| 1 | 400 | 0.0460 | 0.0476 | 0.0462 | 0.0488 |
| 0.5 | 50 | 0.0936 | 0.0724 | 0.1374 | 0.1286 |
| 0.5 | 100 | 0.0898 | 0.0640 | 0.0930 | 0.0772 |
| 0.5 | 200 | 0.0856 | 0.0568 | 0.0816 | 0.0612 |
| 0.5 | 400 | 0.0872 | 0.0522 | 0.0782 | 0.0514 |
| 0.25 | 50 | 0.1246 | 0.0836 | 0.1638 | 0.1406 |
| 0.25 | 100 | 0.1092 | 0.0622 | 0.1068 | 0.0750 |
| 0.25 | 200 | 0.1042 | 0.0556 | 0.0958 | 0.0598 |
| 0.25 | 400 | 0.1082 | 0.0540 | 0.0952 | 0.0534 |

Three–sample test

| ψ | m | PVI–Ind | PVI–Cluster | PVL–Ind | PVL–Cluster |
|---|---|---|---|---|---|
| 1 | 50 | 0.0652 | 0.0816 | 0.0684 | 0.0920 |
| 1 | 100 | 0.0566 | 0.0620 | 0.0582 | 0.0644 |
| 1 | 200 | 0.0532 | 0.0550 | 0.0520 | 0.0572 |
| 1 | 400 | 0.0514 | 0.0536 | 0.0496 | 0.0506 |
| 0.5 | 50 | 0.0226 | 0.0664 | 0.0584 | 0.0978 |
| 0.5 | 100 | 0.0222 | 0.0626 | 0.0314 | 0.0678 |
| 0.5 | 200 | 0.0220 | 0.0572 | 0.0288 | 0.0578 |
| 0.5 | 400 | 0.0190 | 0.0542 | 0.0256 | 0.0522 |
| 0.25 | 50 | 0.0007 | 0.0553 | 0.0555 | 0.0975 |
| 0.25 | 100 | 0.0064 | 0.0544 | 0.0142 | 0.0538 |
| 0.25 | 200 | 0.0046 | 0.0498 | 0.0086 | 0.0486 |
| 0.25 | 400 | 0.0060 | 0.0516 | 0.0080 | 0.0462 |
Next, we consider the three–group comparison. A cluster size of ℓ = 6 is considered with two binary covariates Z1 and Z2. Each group has two individuals from each cluster. We generate event and censoring times from (5.2) with Z = (Z1, Z2)T and ψβ = (1, 0.5)T. We choose p and λc to generate 30% cause 1 events, 30% cause 2 events, and a 40% censoring rate. We compare the 0.25th conditional quantiles of the cause 1 residual lifetimes given z0 = (1, 1)T conditional on survival to t0, where S(t0|z0) = 0.8. Table 3 shows the empirical Type I errors from the three–sample test with 5,000 iterations. As in the one–sample test setting, the pseudo–value approach accounting for within cluster correlation controls Type I errors very well for clustered data. As m increases, the empirical Type I errors become closer to 0.05 in general. The pseudo–value approach ignoring within cluster correlation works satisfactorily only for independent data. It is conservative for clustered data. As in the survival setting, we also examined various τ = 0.1, 0.25, 0.75, and 0.9. In addition, we performed a simulation assuming Z2 follows N(0, 1). In all scenarios, the proposed method performed as well as in Table 3, so the results are omitted.
6. Example
We revisit our motivating data (Shaw et al., 2010) of Section 2. As we discussed in Section 2, event times and censoring times in relapse and DFS are clustered. The independence working correlation matrix was used for GEE. We consider four variables: disease status, donor type, disease type, and recipient age at transplant. The censoring distribution of relapse and DFS did not depend on any of the four variables at the significance level 0.05.
First, we compare the conditional median residual disease–free–survival lifetimes given disease–free survival to 6 months between the two recipient age groups for patients having intermediate disease status, HLA identical sibling donors, and AML. The left plot of Figure 2 shows the two conditional residual disease–free–survival distributions given survival to 6 months for the patients of interest, that is, the conditional residual disease–free survival probability S(6 + t|z0)/S(6|z0) as a function of the residual time t (in months) for each recipient age group.
Fig. 2.
Estimated conditional residual DFS and cumulative incidence. “CI” of the right plot stands for cumulative incidence
The dotted horizontal line indicates 50% of the conditional residual disease–free survival probability (CRDFS). The estimated conditional median residual lifetimes were 73 months and 23 months for recipient age ≤ 10 and > 10, respectively. It appears that the two median residual lifetimes are different. The proposed method with the identity link function and the independence working correlation matrix found a statistically significant difference with a p–value of 0.004. The proposed method with the logit link function also found statistical significance with a p–value of 0.007. However, Lin, Zhang and Zhou (2014) did not find a statistically significant difference at a significance level of 0.05, with a p–value of 0.257. In addition, when adjustment for center effects was ignored, the proposed method with the identity link function and the logit link function did not find statistical significance, with p–values of 0.081 and 0.138, respectively. As Jacobsohn (2015) recently pointed out, there are only a few pediatric late effects studies. Nonetheless, our finding is interesting because Ferry et al. (2007) did not find a significant age effect on long–term survival for pediatric patients with hematological malignancies. Our result suggests that there may be a group of pediatric patients for whom recipient age has an effect on long–term survival.
Next, we consider disease groups and disease status for the conditional quantile of the residual cumulative incidence of relapse. Treatment–related mortality is the competing risk for relapse. We compare the conditional 0.25th quantile of the residual cumulative incidence of relapse given disease–free survival to 6 months between the three disease groups of AML, ALL, and MDS. The right plot of Figure 2 shows the conditional residual cumulative incidence given disease–free survival to 6 months for the three disease groups with intermediate disease status, that is, the conditional residual relapse incidence {F1(6 + t|z0) − F1(6|z0)}/S(6|z0) as a function of the residual time t (in months) for each disease group.
The dotted horizontal line indicates 25% of the conditional residual relapse incidence (CRRI). The estimated conditional residual relapse incidence quantiles were 4 months, 5 months, and 17 months for AML, ALL, and MDS, respectively. It appears that the conditional 0.25th quantile of the residual cumulative incidence of relapse for MDS is different from those for AML and ALL. We found a statistically significant result with a p–value of 0.037 using the proposed method. When we ignored adjustment for center effects, we did not find statistical significance, with a p–value of 0.482. This result is consistent with the findings of a recent adult study: Eapen et al. (2015) studied long–term outcomes, including survival, relapse, and TRM, for adult patients with hematological malignancies who had a bone marrow transplant, and they also found a significant disease effect on relapse.
7. Conclusion
A pseudo–value approach has been proposed to test conditional quantile residual lifetimes for survival and competing risks data. Using Zhou et al. (2012), we extended the method of Logan, Zhang and Klein (2011) to clustered event times and clustered censoring times. The proposed method uses GEE with a sandwich variance estimator to account for correlation within a cluster. The simulation studies show that the proposed method controls Type I errors well and has higher power than an existing method. A bone marrow transplant data set was analyzed as an example. Although we used the Cox proportional hazards model and the marginal Fine–Gray model to estimate S(t|Z) and F1(t|Z), respectively, other methods such as the additive hazards model (Yin and Cai, 2004) and the cause–specific hazards model (Prentice et al., 1978) can also be used. Further investigation via extensive simulation studies is needed to examine the performance of the proposed method with various techniques for estimating S(t|Z) and F1(t|Z). Note also that using those models to estimate S(t|Z) and F1(t|Z) adds assumptions that must be satisfied for the proposed method to apply.
There are several areas for future research. We accounted for covariates by comparing covariate–specific residual quantiles; a method that aggregates this inference across multiple levels of covariates would also be useful. The methods focus on testing, but estimation of confidence intervals for the residual quantiles in the clustered data setting is also important for further investigation. Selecting the most appropriate link function for the pseudo–value approach is a hard problem in practice; an extensive simulation study may be needed to compare the performance of link functions in the future. The proposed method is a marginal model, and developing a pseudo–value approach dealing with random effects is another interesting problem. Finally, the proposed pseudo–value method assumes that censoring is independent of covariates. He et al. (2016) and Binder, Gerds and Andersen (2014) have recently studied the subdistribution hazards model and pseudo–values, respectively, allowing covariate–dependent censoring. Using these methods, the proposed method may be extended to covariate–dependent censoring.
Supplementary Material
Acknowledgments
The US National Cancer Institute (U24CA076518) partially supported this work. The authors are grateful to the Editor, Dr. Beth Ann Griffin, the Associate Editor, and two anonymous referees for their helpful comments and suggestions.
Footnotes
The online Supplementary Materials are available with this paper at the Annals of Applied Statistics website (http://projecteuclid.org/all/euclid.aoas).
References
- Ahn KW, Mendolia F. Pseudo-value approach for comparing survival medians for dependent data. Statistics in Medicine. 2014;33:1531–1538. doi:10.1002/sim.6072.
- Andersen PK, Klein JP, Rosthøj S. Generalised linear models for correlated pseudo-observations, with applications to multi-state models. Biometrika. 2003;90:15–27.
- Binder N, Gerds TA, Andersen PK. Pseudo-observations for competing risks with covariate dependent censoring. Lifetime Data Analysis. 2014;20:303–315. doi:10.1007/s10985-013-9247-7.
- Breslow NE. Discussion of the paper by D. R. Cox. Journal of the Royal Statistical Society: Series B. 1972;34:216–217.
- Commenges D, Andersen PK. Score test of homogeneity for survival data. Lifetime Data Analysis. 1995;1:145–156. doi:10.1007/BF00985764.
- Cox DR. Regression models and life-tables (with discussion). Journal of the Royal Statistical Society: Series B. 1972;34:187–220.
- Eapen M, Logan BR, Appelbaum FR, Antin JH, Anasetti C, Couriel DR, Chen J, Maziarz RT, McCarthy PL, Nakamura R, Ratanatharathorn V, Vij R, Champlin RE. Long-term survival after transplantation of unrelated donor peripheral blood or bone marrow hematopoietic cells for hematologic malignancy. Biology of Blood and Marrow Transplantation. 2015;21:55–59. doi:10.1016/j.bbmt.2014.09.006.
- Ferry C, Gemayel G, Rocha V, Labopin M, Esperou H, Robin M, de Latour RP, Ribaud P, Devergie A, Leblanc T, Baruchel EGA, Socie G. Long-term outcomes after allogeneic stem cell transplantation for children with hematological malignancies. Bone Marrow Transplantation. 2007;40:219–224. doi:10.1038/sj.bmt.1705710.
- He P, Eriksson F, Scheike TH, Zhang MJ. A proportional hazards regression model for the subdistribution with covariates-adjusted censoring weight for competing risks data. Scandinavian Journal of Statistics. 2016;43:103–122. doi:10.1111/sjos.12167.
- Jacobsen M, Martinussen T. A note on the large sample properties of estimators based on generalized linear models for correlated pseudo-observations. Research Report 14/4, Department of Biostatistics, University of Copenhagen; 2014.
- Jacobsohn DA. Outcomes of pediatric bone marrow transplantation for leukemia and myelodysplasia using matched sibling, mismatched related, or matched unrelated donor. Bone Marrow Transplantation. 2015;50:749–750.
- Jeong JH, Fine J. A note on cause-specific residual life. Biometrika. 2009;96:237–242.
- Jeong JH, Fine J. Nonparametric inference on cause-specific quantile residual life. Biometrical Journal. 2013;55:68–81. doi:10.1002/bimj.201100190.
- Kim MO, Zhou M, Jeong JH. Censored quantile regression for residual lifetimes. Lifetime Data Analysis. 2012;18:177–194. doi:10.1007/s10985-011-9212-2.
- Klein JP, Andersen PK. Regression modeling of competing risks data based on pseudo-values of the cumulative incidence function. Biometrics. 2005;61:223–229. doi:10.1111/j.0006-341X.2005.031209.x.
- Lee EW, Wei LJ, Amato DA. Cox-type regression analysis for large numbers of small groups of correlated failure time observations. In: Klein JP, Goel PK, editors. Survival Analysis: State of the Art. Dordrecht: Kluwer Academic Publishers; 1992.
- Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73:13–22.
- Lin DY. On the Breslow estimator. Lifetime Data Analysis. 2007;13:471–480. doi:10.1007/s10985-007-9048-y.
- Lin C, Zhang L, Zhou Y. Conditional quantile residual lifetime models for right censored data. Lifetime Data Analysis. 2014. doi:10.1007/s10985-013-9289-x.
- Logan B, Zhang M-J, Klein JP. Marginal models for clustered time to event data with competing risks using pseudovalues. Biometrics. 2011;67:1–7. doi:10.1111/j.1541-0420.2010.01416.x.
- Ma Y, Wei Y. Analysis on censored quantile residual life model via spline smoothing. Statistica Sinica. 2012;22:47–68. doi:10.5705/ss.2010.161.
- Majhail NS, Rizzo JD. Surviving the cure: long term followup of hematopoietic cell transplant recipients. Bone Marrow Transplantation. 2013;48:1145–1151. doi:10.1038/bmt.2012.258.
- Martin PJ, Counts GW, Appelbaum FR, Lee SJ, Sanders JE, Deeg HJ, Flowers MED, Syrjala KL, Hansen JA, Storb RF, Storer BE. Life expectancy in patients surviving more than 5 years after hematopoietic cell transplantation. Journal of Clinical Oncology. 2010;28:1011–1016. doi:10.1200/JCO.2009.25.6693.
- Prentice RL, Kalbfleisch JD, Peterson AV, Flournoy N, Farewell VT, Breslow NE. The analysis of failure times in the presence of competing risks. Biometrics. 1978;34:541–554.
- Shaw PJ, Kan F, Ahn KW, Spellman SR, Aljurf M, Ayas M, Burke M, Cairo MS, Chen AR, Davies SM, Frangoul H, Gajewski J, Gale RP, Godder K, Hale GA, Heemskerk MB, Horan J, Kamani N, Kasow KA, Chan KW, Lee SJ, Leung WH, Lewis VA, Miklos D, Oudshoorn M, Petersdorf EW, Ringden O, Sanders J, Schultz KR, Seber A, Setterholm M, Wall DA, Yu L, Pulsipher MA. Outcomes of pediatric bone marrow transplantation for leukemia and myelodysplasia using matched sibling, mismatched related, or matched unrelated donors. Blood. 2010;116:4007–4015. doi:10.1182/blood-2010-01-261958.
- Spiekerman CF, Lin DY. Marginal regression models for multivariate failure time data. Journal of the American Statistical Association. 1998;93:1164–1175.
- Yin G, Cai J. Additive hazards model with multivariate failure time data. Biometrika. 2004;91:801–818.
- Zeger SL, Liang KY, Albert PS. Models for longitudinal data: a generalized estimating equation approach. Biometrics. 1988;44:1049–1060.
- Zeng D, Lin DY. Efficient resampling methods for nonsmooth estimating functions. Biostatistics. 2008;9:355–363. doi:10.1093/biostatistics/kxm034.
- Zhao YQ, Zeng D, Laber EB, Song R, Yuan M, Kosorok MR. Doubly robust learning for estimating individualized treatment with censored data. Biometrika. 2015;102:151–168. doi:10.1093/biomet/asu050.
- Zhou B, Fine J, Latouche A, Labopin M. Competing risks regression for clustered data. Biostatistics. 2012;13:371–383. doi:10.1093/biostatistics/kxr032.