Author manuscript; available in PMC: 2019 Nov 19.
Published in final edited form as: J Am Stat Assoc. 2016 Oct 18;111(515):1289–1300. doi: 10.1080/01621459.2015.1093942

Joint Inference for Competing Risks Survival Data

Gang Li a, Qing Yang b
PMCID: PMC6863485  NIHMSID: NIHMS986715  PMID: 31745375

Abstract

This article develops joint inferential methods for the cause-specific hazard function and the cumulative incidence function of a specific type of failure to assess the effects of a variable on the time to the type of failure of interest in the presence of competing risks. Joint inference for the two functions is needed in practice because (i) they describe different characteristics of a given type of failure, (ii) they do not uniquely determine each other, and (iii) the effects of a variable on the two functions can be different and one often does not know which effects are to be expected. We study both the group comparison problem and the regression problem. We also discuss joint inference for other related functions. Our simulation shows that our joint tests can be considerably more powerful than the Bonferroni method, which has important practical implications for the analysis and design of clinical studies with competing risks data. We illustrate our method using a Hodgkin disease dataset and a lymphoma dataset. Supplementary materials for this article are available online.

Keywords: Cause-specific hazard, Censoring, Cox’s model, Cumulative incidence, Log-rank test, Subdistribution hazard

1. Introduction

Competing risks failure time data arise commonly in clinical trials, reliability testing, and other fields. For instance, in a clinical trial, one may be interested in time to death due to a particular disease, but a patient can also die from other competing diseases that are potentially positively correlated with the disease of interest. Competing risks can also be negatively correlated with the event time of interest. For example, in a kidney transplantation program, patients who are ineligible for transplantation for reasons such as being overweight are put on a waiting list until they become eligible (see, e.g., Sancho et al. 2007). An important outcome variable is the waiting time to become eligible for transplantation. In this case, death before becoming eligible for transplantation is a competing risk event that is potentially negatively correlated with the waiting time. More examples of competing risks failure time data can be found in Prentice et al. (1978), Pintilie (2006), Gichangi and Vach (2005), and Putter, Fiocco, and Geskus (2007), and the references therein. There is a broad literature on statistical methods for competing risks failure time data. Group comparison of a specific type of failure has been studied using either the cause-specific hazard (Prentice et al. 1978; Lindkvist and Belyaev 1998; Kulathinal and Gasbarra 2002) or the cumulative incidence (Gray 1988; Pepe and Mori 1993; Bajorunaite and Klein 2007). Methods to compare failures across failure types have been developed with respect to either the cause-specific hazard, or the cumulative incidence, or both (Aly, Kochar, and McKeague 1994; Sun and Tiwari 1995; Lam 1998; Luo and Turnbull 1999). Tiwari, Kulasekera, and Park (2006) proposed a test to check equality of cause-specific hazards across all failure types and groups. For regression analysis of competing risks failure time data, Prentice et al. (1978), Lagakos (1978), Holt (1978), Cox and Oakes (1984, chap. 9), Larson (1984), and Lunn and McNeil (1995) studied proportional cause-specific hazards models. Fine and Gray (1999) introduced a proportional subdistribution hazards model for the cumulative incidence function. Fine (1999, 2001), Klein and Andersen (2005), and Gerds, Scheike, and Andersen (2012) used transformation models to directly model the cumulative incidence function. Klein (2006) discussed additive models for both the cause-specific hazard and the cumulative incidence function. Comprehensive surveys of statistical methods for competing risks survival data and further references can be found in Beyersmann et al. (2007), Latouche et al. (2007), and Haller, Schmidt, and Ulm (2012).

In this article, we focus on the problem of assessing the effects of a variable (treatment or covariate) on the time to a particular type of failure. For convenience, we assume hereafter that there are only two types of failure, where Type 1 represents the failure type of interest and Type 2 includes all other competing risks. As discussed earlier, there are mainly two approaches to this problem based on either the cause-specific hazard function or the cumulative incidence function. The cause-specific hazard function for Type 1 failure is defined as

$$\lambda_1(t) = \lim_{dt \to 0} P(t \le T < t + dt,\ D = 1 \mid T \ge t)/dt, \quad t > 0,$$

the instantaneous risk for Type 1 failure at time t given that the subject is at risk just prior to t, where T is the continuous failure time with multiple failure types and D is the failure type. For example, Prentice et al. (1978) showed that the standard Cox (1972, 1975) regression method can be used to study the effects of a variable on the cause-specific hazard $\lambda_1(t)$ by treating other types of failures as independent right censoring events. The cumulative incidence function is defined as $F_1(t) = P(T \le t, D = 1)$, $t > 0$, the cumulative incidence rate of Type 1 failure by time t, which can be uniquely characterized by the following subdistribution hazard:

$$\tilde\lambda_1(t) = \lim_{dt \to 0} P\{t \le T < t + dt,\ D = 1 \mid T \ge t \cup (T < t \cap D \ne 1)\}/dt = -d\log\{1 - F_1(t)\}/dt.$$

In particular, Gray (1988) developed a class of nonparametric tests to compare the cumulative incidence function of a given type of failure between different groups and Fine and Gray (1999) introduced a proportional subdistribution hazards model for the regression problem.
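The second equality in the definition of the subdistribution hazard can be checked in two steps. The sketch below assumes that T is continuous and writes $f_1(t) = dF_1(t)/dt$, a symbol introduced here only for illustration. The conditioning event is the complement of $\{T < t, D = 1\}$, so

$$P\{T \ge t \cup (T < t \cap D \ne 1)\} = 1 - P(T < t,\ D = 1) = 1 - F_1(t),$$

and therefore

$$\tilde\lambda_1(t) = \frac{f_1(t)}{1 - F_1(t)} = -\frac{d}{dt}\log\{1 - F_1(t)\}, \qquad F_1(t) = 1 - \exp\left\{-\int_0^t \tilde\lambda_1(u)\,du\right\}.$$

The last identity is the sense in which the subdistribution hazard characterizes the cumulative incidence function.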

Despite the extensive literature on this topic, practitioners remain unsure which method to use when studying the effects of a variable on Type 1 failure. We point out that joint inference for both $\lambda_1(t)$ and $F_1(t)$ should be made. First, these two quantities describe different characteristics of Type 1 failure: $\lambda_1(t)$ represents the instantaneous Type 1 failure rate at time t given survival to t, whereas $F_1(t)$ summarizes the prevalence or cumulative incidence of Type 1 failure over the time interval [0, t]. Second, $\lambda_1(t)$ and $F_1(t)$ do not uniquely determine each other except when there is a single failure type (J = 1, where J denotes the number of failure types). It can be shown that $F_1(t) = \int_0^t S(u)\lambda_1(u)\,du$, where S(u) = P(T > u) is the all-cause survival function. Thus $F_1(t)$ depends not only on $\lambda_1(t)$, but also on the other cause-specific hazards through the all-cause survival function S(t). Finally, the effects of a variable on $\lambda_1(t)$ can be different from its effects on $F_1(t)$ (Gray 1988; Fine and Gray 1999), and one often does not know which effects are to be expected. To the best of our knowledge, no formal joint inference procedure for these quantities is available in the literature. Although the Bonferroni method provides a straightforward solution, it can be severely underpowered, as demonstrated later in Sections 4 and 5.
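The dependence of the cumulative incidence on the competing hazard can be seen in a small numerical sketch. It assumes constant cause-specific hazards, so that S(u) = exp{−(λ1 + λ2)u}; the function names and the hazard values are hypothetical choices for illustration and are not part of the article's methods.

```python
import math

def cif_type1(t, lam1, lam2, steps=10_000):
    """Approximate F1(t) = int_0^t S(u) * lam1 du with the midpoint rule,
    where S(u) = exp(-(lam1 + lam2) * u) under constant hazards (assumption)."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.exp(-(lam1 + lam2) * u) * lam1 * h
    return total

def cif_type1_closed_form(t, lam1, lam2):
    """Closed form of the same integral for constant hazards."""
    lam = lam1 + lam2
    return lam1 / lam * (1.0 - math.exp(-lam * t))

# Same Type 1 cause-specific hazard, different competing (Type 2) hazard.
f_low_competing = cif_type1(10.0, 0.1, 0.01)
f_high_competing = cif_type1(10.0, 0.1, 0.04)
print(f_low_competing, f_high_competing)
```

With the Type 1 cause-specific hazard held fixed, the group with the larger competing hazard has the smaller cumulative incidence of Type 1 failure, which is why a test on one quantity alone can miss an effect on the other.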

The primary purpose of this article is to develop joint inference procedures to assess the effects of a variable on $\lambda_1(t)$ and $F_1(t)$ simultaneously. We allow independent right censoring in addition to competing risks. We first consider the two-sample comparison problem with respect to both $\lambda_1(t)$ and $F_1(t)$. By establishing the asymptotic joint distribution of the weighted log-rank test statistic for $\lambda_1(t)$ and the Gray (1988) test statistic for $F_1(t)$, we derive two-sample joint tests for $\lambda_1(t)$ and $F_1(t)$. We then extend our method to a regression setting based on Cox-type models for $\lambda_1(t)$ and $F_1(t)$ (or $\tilde\lambda_1(t)$). We also discuss joint inference for other related quantities.

In Section 2, we first review the weighted log-rank test for group comparison of $\lambda_1(t)$ and the Gray (1988) test for $F_1(t)$. Then, we develop joint test procedures for group comparisons of both $\lambda_1(t)$ and $F_1(t)$. We also discuss joint tests for other equivalent pairs, including $\lambda_1(t)$ with the all-cause hazard, and $\lambda_1(t)$ with the cause-specific hazard for other failure types. Section 3 develops joint regression analysis methods for $\lambda_1(t)$ and $F_1(t)$ (or $\tilde\lambda_1(t)$) under Cox-type regression models. Section 4 presents some simulation results to evaluate the proposed methods and compare them with the Bonferroni method. In Section 5, we illustrate our methods on a Hodgkin disease dataset and a lymphoma dataset. Section 6 gives some further remarks. The proofs for the theorems and additional simulation results are provided in the Appendix in the supplementary material.

2. Two-Sample Joint Tests for Competing Risks Data

Suppose that there are two independent groups of subjects. Let $T_{ik}$, $D_{ik}$, and $C_{ik}$ denote the continuous failure time, the type of failure, and the censoring time, respectively, for subject i in group k, $i = 1, \ldots, n_k$, k = 1, 2. Assume that the triplets $(T_{ik}, D_{ik}, C_{ik})$ for different subjects within each group are independent and identically distributed and that the censoring time $C_{ik}$ is independent of the failure time $T_{ik}$. The two groups are allowed to have different censoring distributions. For group k (k = 1, 2), one observes right-censored competing risks failure time data $\{(X_{ik}, \delta_{ik}), i = 1, \ldots, n_k\}$, where $X_{ik} = \min(T_{ik}, C_{ik})$ and $\delta_{ik} = D_{ik} I(T_{ik} \le C_{ik})$. Let $S_k(t) = P(T_{ik} > t)$ and $S_k^c(t) = P(C_{ik} > t)$. For group k (k = 1, 2), let $\lambda_{1k}(t)$, $F_{1k}(t)$, and $\tilde\lambda_{1k}(t)$ denote the cause-specific hazard function, the cumulative incidence function, and the subdistribution hazard function, respectively, for Type 1 failure. We develop nonparametric tests for the following null hypothesis,

$$H_0: \lambda_{11}(t) = \lambda_{12}(t) \ \text{ and } \ F_{11}(t) = F_{12}(t) \ \text{ for all } 0 < t < \tau, \tag{1}$$

where τ is some prespecified fixed time.

2.1. Preliminaries

We first review the two-sample weighted log-rank test for the cause-specific hazard and the Gray (1988) two-sample test for the cumulative incidence for Type 1 failure. These tests will be used as building blocks to develop joint tests for the hypothesis (1).

2.1.1. Two-Sample Tests for Cause-Specific Hazard

It is now well known that the standard (weighted) log-rank test (Peto and Peto 1972; Andersen et al. 1982) for right censored failure time data can be applied to test

$$H_0: \lambda_{11}(t) = \lambda_{12}(t) \ \text{ for all } 0 < t < \tau, \tag{2}$$

by treating all other competing risks as independent right censoring (Tsiatis 1975; Prentice et al. 1978; Lindkvist and Belyaev 1998). Specifically, let $N_{jk}(t) = \sum_{i=1}^{n_k} I(X_{ik} \le t, \delta_{ik} = j)$ be the counting process of the number of observed type j failures in group k by time t, and $Y_k(t) = \sum_{i=1}^{n_k} I\{X_{ik} \ge t\}$ be the at-risk process giving the number of subjects in group k who are at risk just prior to time t, k = 1, 2. Let $N_j(t) = \sum_{k=1}^{2} N_{jk}(t)$ and $Y(t) = \sum_{k=1}^{2} Y_k(t)$. The weighted log-rank test statistic for (2) is defined as

$$U_{1k} = \int_0^\tau W_1(t)\,Y_k(t)\left\{\frac{dN_{1k}(t)}{Y_k(t)} - \frac{dN_1(t)}{Y(t)}\right\}, \tag{3}$$

where $W_1(t)$ is a predictable weight function that converges in probability to some deterministic function $w_1(t)$ as n → ∞, and τ is the largest time at which all of the groups have at least one subject at risk. It can be shown that under the null hypothesis (2), $n^{-1/2}U_{11}/\hat\sigma$ has a standard normal limiting distribution, where

$$\hat\sigma^2 = n^{-1}\int_0^\tau W_1^2(t)\,\frac{Y_1(t)Y_2(t)}{Y(t)}\,\frac{dN_1(t)}{Y(t)}. \tag{4}$$

This leads to an asymptotic $\chi^2$ test or a Z test for (2).
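As a rough illustration of (3) and (4), the following sketch computes the unweighted ($W_1(t) = 1$) log-rank quantities for Type 1 failure, treating Type 2 failures and censored observations alike as removals from the risk set. The function name `csh_logrank` and the input coding (0 = censored, 1 = Type 1, 2 = Type 2; groups labeled 1 and 2) are assumptions made for this example only.

```python
import math

def csh_logrank(times, causes, groups):
    """Unweighted two-sample log-rank quantities for the Type 1 cause-specific hazard.

    Returns (U_11, sigma_hat^2, Z) with Z = n^{-1/2} U_11 / sigma_hat.
    """
    n = len(times)
    u11 = 0.0  # U_11 as in (3) with W_1 = 1
    v = 0.0    # n * sigma_hat^2 as in (4) with W_1 = 1
    event_times = sorted({x for x, c in zip(times, causes) if c == 1})
    for t in event_times:
        # Risk-set sizes just prior to t in each group.
        y1 = sum(1 for x, g in zip(times, groups) if g == 1 and x >= t)
        y2 = sum(1 for x, g in zip(times, groups) if g == 2 and x >= t)
        y = y1 + y2
        # Observed Type 1 failures at t: pooled and in group 1.
        d1 = sum(1 for x, c in zip(times, causes) if c == 1 and x == t)
        d11 = sum(1 for x, c, g in zip(times, causes, groups)
                  if c == 1 and g == 1 and x == t)
        u11 += d11 - y1 * d1 / y
        v += (y1 * y2 / y) * d1 / y
    sigma2_hat = v / n
    z = u11 / math.sqrt(v) if v > 0 else float("nan")
    return u11, sigma2_hat, z

# Toy data: four subjects, alternating groups, all Type 1 failures.
u, s2, z = csh_logrank([1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 1, 2])
print(u, s2, z)
```

Time points at which one group has no subject at risk contribute zero to both sums, so the integration effectively stops at the largest time at which both groups are still at risk.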

2.1.2. Two-Sample Tests for Cumulative Incidence Function

Gray (1988) developed a class of K-sample nonparametric tests to compare the cumulative incidence between different groups. Consider the following null hypothesis,

$$H_0: F_{11}(t) = F_{12}(t) \ \text{ for all } 0 < t < \tau. \tag{5}$$

The Gray (1988) nonparametric test statistic is defined as

$$\tilde U_{1k} = \int_0^{\tau_k} \tilde W(t)\,R_k(t)\left\{\frac{dN_{1k}(t)}{R_k(t)} - \frac{dN_1(t)}{R(t)}\right\}, \tag{6}$$

where $\tilde W(t)$ is a predictable weight function that converges in probability to some deterministic function $\tilde w(t)$ as n → ∞, $R_k(t) = I(\tau_k \ge t)\,Y_k(t)\,\hat G_{1k}(t-)/\hat S_k(t-)$ can be considered as an adjusted risk set size for group k at time t, $\hat G_{jk}(t-)$ is the left-hand limit of the Kaplan–Meier (1958) estimate of $G_{jk}(t) = 1 - F_{jk}(t)$, $\hat S_k(t-)$ is the left-hand limit of the Kaplan–Meier estimate of $S_k(t)$, $\tau_k$ is some fixed time point satisfying $S_k(\tau_k)S_k^c(\tau_k) > 0$, and R(t) represents the same quantity as $R_k(t)$ using the pooled sample. Gray (1988) showed that under (5), $n^{-1/2}\tilde U_{11}/\hat{\tilde\sigma}$ has a standard normal limiting distribution, where

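$$\hat{\tilde\sigma}^2 = \sum_{k=1}^{2} n^{-1}\left\{\int_0^{\tau_1} \hat a_k^2(t)\,\hat h_k^{-1}(t)\,\hat h^{-1}(t)\,dN_1(t) + \int_0^{\tau_1} \hat b_{2k}^2(t)\,\hat h_k^{-2}(t)\,dN_{2k}(t)\right\}, \tag{7}$$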

with

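$$\begin{aligned}
\hat a_k(t) &= \hat d_{1k}(t) + \hat b_{1k}(t),\\
\hat b_{jk}(t) &= [I(j=1) - \hat G_1(t)/\hat S_k(t)]\,[\hat c_k(\tau_1) - \hat c_k(t)],\\
\hat c_k(t) &= \int_0^t \hat d_{1k}(u)\,\hat G_1(u)^{-1}\,\hat h^{-1}(u)\,dN_1(u),\\
\hat d_{jk}(t) &= n^{-1} I(j=1)\,\tilde W(t)\,R_1(t) \times [I(k=1) - \hat h_k(t)/\hat h(t)]/\hat G_1(t),\\
\hat h_k(t) &= I(t \le \tau_k)\,n^{-1}\,Y_k(t)/\hat S_k(t-),\\
\hat h(t) &= I(t \le \max(\tau_1, \tau_2))\,n^{-1}\,Y(t)/\hat S(t-),\\
\hat G_1(t) &= 1 - \hat F_1(t) = 1 - n^{-1}\int_0^t \hat h^{-1}(u)\,dN_1(u).
\end{aligned} \tag{8}$$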

This gives an asymptotic $\chi^2$ test for (5) based on $n^{-1}\tilde U_{11}^2/\hat{\tilde\sigma}^2$ or a Z test based on $n^{-1/2}\tilde U_{11}/\hat{\tilde\sigma}$.

Examples of the weight functions in the above test statistics have been discussed by a number of authors (Gehan 1965; Breslow 1970; Peto and Peto 1972; Kalbfleisch 1980; Gray 1988). A nice survey of various weight functions and their applications can be found in Klein and Moeschberger (2003, chap. 7.2).

2.2. Joint Two-Sample Tests for Cause-Specific Hazard and Cumulative Incidence Function

To test the joint null hypothesis (1), we first establish the joint limiting distribution of $U_{11}$ and $\tilde U_{11}$ below.

Theorem 1.

Let $U_{11}$ and $\tilde U_{11}$ be defined by (3) and (6). Under the null hypothesis (1), $n^{-1/2}(U_{11}, \tilde U_{11})$ has an asymptotic bivariate normal distribution with mean 0 and variance-covariance matrix $\Sigma^{(1)} = (\sigma_{ij}^{(1)})$ as n → ∞, where $\Sigma^{(1)}$ is defined in (A.1) and (A.4) of Appendix A.1. Furthermore, $\sigma_{11}^{(1)}$ and $\sigma_{22}^{(1)}$ are consistently estimated by (4) and (7), and the covariance $\sigma_{12}^{(1)}$ is consistently estimated by

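$$\begin{aligned}
\hat\sigma_{12}^{(1)} ={}& n^{-1}\int_0^\tau W_1(t)\,\frac{Y_2(t)}{Y(t)}\left\{\hat V_{11}(t) + \hat c_1(\tau)\,\hat E_{11}(t)\,\hat h_1^{-1}(t)\right\} Y_1(t)\,d\hat\Lambda_{11}(t)\\
&+ n^{-1}\int_0^\tau W_1(t)\,\frac{Y_1(t)}{Y(t)}\left\{\hat V_{12}(t) + \hat c_2(\tau)\,\hat E_{12}(t)\,\hat h_2^{-1}(t)\right\} Y_2(t)\,d\hat\Lambda_{12}(t),
\end{aligned} \tag{9}$$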

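where $\hat\Lambda_{1k}(\tau) = \int_0^\tau Y_k^{-1}(t)\,dN_{1k}(t)$, $\hat V_{jk}(t) = [\hat d_{jk}(t) - \hat E_{jk}(t)\,\hat c_k(t)]\,\hat h_k^{-1}(t)$, $\hat E_{jk}(t) = I(j=1) - \hat G_{1k}(t)/\hat S_k(t)$, and other quantities are defined in (8).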

2.2.1. Chi-Square Joint Test for (1)

Define

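$$X^2 = n^{-1}\,(U_{11}, \tilde U_{11})\,\{\hat\Sigma^{(1)}\}^{-1}\begin{pmatrix} U_{11}\\ \tilde U_{11}\end{pmatrix}.$$

It follows from Theorem 1 that under (1), $X^2$ has an asymptotic chi-square distribution with 2 degrees of freedom. This leads to the following chi-square test for (1):

$$\text{Reject (1) at level } \alpha \text{ if } X^2 > \chi_2^2(\alpha),$$

where $\chi_2^2(\alpha)$ is the upper α quantile (that is, the 100(1 − α)th percentile) of the $\chi_2^2$ distribution.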

Rejection of (1) by the above chi-square test implies that there is a difference in either the cause-specific hazard or the cumulative incidence between the two groups. However, it does not indicate which individual quantity differs. The following maximum test provides an alternative joint test that allows one to draw a conclusion on each individual quantity. It also accommodates one-sided tests.
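The chi-square joint test reduces to a 2 × 2 quadratic form. The sketch below is a minimal illustration that takes the two statistics and an already-computed covariance estimate as inputs; the function name and argument layout are hypothetical. It relies on the fact that a chi-square variable with 2 degrees of freedom has survival function exp(−x/2), so no statistical library is needed.

```python
import math

def chi_square_joint_test(u11, u11_tilde, sigma_hat, n, alpha=0.05):
    """Return (X^2, p-value, reject) for the two-degree-of-freedom joint test.

    sigma_hat is the estimated 2 x 2 covariance matrix of
    n^{-1/2} (U_11, U~_11), given as ((s11, s12), (s21, s22)).
    """
    (s11, s12), (s21, s22) = sigma_hat
    det = s11 * s22 - s12 * s21
    if det <= 0:
        raise ValueError("covariance estimate must be positive definite")
    # Quadratic form u' Sigma^{-1} u via the explicit 2 x 2 inverse, scaled by 1/n.
    x2 = (s22 * u11 ** 2 - (s12 + s21) * u11 * u11_tilde
          + s11 * u11_tilde ** 2) / (det * n)
    p_value = math.exp(-x2 / 2.0)        # chi-square(2) survival function
    critical = -2.0 * math.log(alpha)    # upper-alpha quantile of chi-square(2)
    return x2, p_value, x2 > critical

# Hypothetical inputs, for illustration only.
print(chi_square_joint_test(10.0, 20.0, ((1.0, 0.5), (0.5, 1.0)), n=100))
```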

2.2.2. Maximum Joint Test for (1)

Define

$$T^* = \max(|Z_{11}|, |\tilde Z_{11}|),$$

where $Z_{11} = n^{-1/2}U_{11}/\sqrt{\hat\sigma_{11}^{(1)}}$ and $\tilde Z_{11} = n^{-1/2}\tilde U_{11}/\sqrt{\hat\sigma_{22}^{(1)}}$. We reject (1) if the observed $T^*$ is large. It follows from Theorem 1 that for large samples, the distribution of $(Z_{11}, \tilde Z_{11})$ can be approximated by the bivariate normal distribution $N((0,0)^T, (1, 1, \hat\rho))$ with zero means, unit variances, and correlation $\hat\rho = \hat\sigma_{12}^{(1)}/\sqrt{\hat\sigma_{11}^{(1)}\hat\sigma_{22}^{(1)}}$. Thus, we can approximate the distribution of $T^*$ using Monte Carlo simulation. Specifically, we generate N pairs of random variables from the bivariate normal distribution $N((0,0)^T, (1, 1, \hat\rho))$. For the lth generated pair, compute the maximum absolute value, and denote it by $T_l^*$. Let $T_\alpha$ be the 100(1 − α)th sample percentile of $T_1^*, \ldots, T_N^*$. Reject the null hypothesis (1) at level α if $T^* > T_\alpha$.
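The Monte Carlo step for the maximum joint test can be sketched as follows, using only the Python standard library. The function names, the default number of draws, and the fixed seed are assumptions for illustration; the Bonferroni critical value is included for comparison and corresponds to running each of the two two-sided tests at level α/2.

```python
import math
import random
from statistics import NormalDist

def max_test_critical_value(rho, alpha=0.05, n_draws=100_000, seed=0):
    """Monte Carlo approximation of the critical value of max(|Z1|, |Z2|)
    for a standard bivariate normal pair with correlation rho."""
    rng = random.Random(seed)
    s = math.sqrt(max(0.0, 1.0 - rho * rho))
    draws = []
    for _ in range(n_draws):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        z1 = g1
        z2 = rho * g1 + s * g2  # corr(z1, z2) = rho
        draws.append(max(abs(z1), abs(z2)))
    draws.sort()
    idx = math.ceil((1.0 - alpha) * n_draws) - 1
    return draws[idx]

def bonferroni_critical_value(alpha=0.05):
    """Two-sided critical value when each of two tests is run at level alpha / 2."""
    return NormalDist().inv_cdf(1.0 - alpha / 4.0)

for rho in (0.0, 0.5, 0.9):
    print(rho, max_test_critical_value(rho), bonferroni_critical_value())
```

The maximum test gains over Bonferroni when the two standardized statistics are strongly correlated, because its critical value then moves toward the single-test value of about 1.96 while the Bonferroni value stays fixed.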

Remark 1.

It is straightforward to modify the maximum joint test procedure to test one-sided alternative(s) based on either $T^* = \max(Z_{11}, \tilde Z_{11})$, $T^* = \max(|Z_{11}|, \tilde Z_{11})$, or $T^* = \max(Z_{11}, |\tilde Z_{11}|)$, as deemed appropriate.

Remark 2.

(K-Sample Joint Tests) The above two-sample joint tests can be easily extended to the K-sample problem (K ≥ 2) for the following null hypothesis

$$H_0: \lambda_{11}(t) = \cdots = \lambda_{1K}(t) \ \text{ and } \ F_{11}(t) = \cdots = F_{1K}(t) \ \text{ for all } 0 < t < \tau, \tag{10}$$

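where τ is some prespecified fixed time. Similar to Theorem 1, it can be shown that under the null hypothesis (10), $V_n = n^{-1/2}(U_{11}, \ldots, U_{1,K-1}, \tilde U_{11}, \ldots, \tilde U_{1,K-1})$ has an asymptotic multivariate normal distribution with mean 0 and variance-covariance matrix $\Sigma^*$, where $\Sigma^*$ is defined as the limit of the variance-covariance matrix of $V_n$ and can be consistently estimated as follows. From Kulathinal and Gasbarra (2002), we have $\widehat{\mathrm{cov}}(n^{-1/2}U_{1k}, n^{-1/2}U_{1k'}) = n^{-1}\int_0^\tau W_1^2(t)\,\frac{Y_k(t)Y_{k'}(t)}{Y(t)}\,d\hat\Lambda_1(t)$, where k, k′ = 1, …, K. The estimate $\widehat{\mathrm{cov}}(n^{-1/2}\tilde U_{1k}, n^{-1/2}\tilde U_{1k'})$ is given by Equation (2.10) on page 1146 of Gray (1988). Similar to the proof of Theorem 1,

$$\begin{aligned}
\widehat{\mathrm{cov}}(n^{-1/2}U_{1k}, n^{-1/2}\tilde U_{1k'}) ={}& n^{-1}\int_0^\tau W_1(t)\left\{\hat V_{1k'k}(t) + \hat c_{k'k}(\tau)\,\hat E_{1k}(t)\,\hat h_k^{-1}(t)\right\} Y_k(t)\,d\hat\Lambda_{1k}(t)\\
&+ n^{-1}\sum_{l=1}^{K}\int_0^\tau W_1(t)\,\frac{Y_k(t)}{Y(t)}\left\{\hat V_{1k'l}(t) + \hat c_{k'l}(\tau)\,\hat E_{1l}(t)\,\hat h_l^{-1}(t)\right\} Y_l(t)\,d\hat\Lambda_{1l}(t),
\end{aligned}$$

where $\hat\Lambda_{1k}(\tau) = \int_0^\tau Y_k^{-1}(t)\,dN_{1k}(t)$, $\hat V_{jkl}(t) = [\hat D_{jkl}(t) - \hat E_{jl}(t)\,\hat c_{kl}(t)]\,\hat h_l^{-1}(t)$, $\hat D_{jkl}(t) = n^{-1} I(j=1)\,\tilde W(t)\,R_k(t)\,[I(k=l) - \hat h_l(t)/\hat h(t)]/\hat G_1(t)$, $\hat c_{kl}(t) = n^{-1}\int_0^t \hat D_{1kl}(u)\,\hat G_1(u)^{-1}\,\hat h^{-1}(u)\,dN_1(u)$, $\hat E_{jk}(t) = I(j=1) - \hat G_{1k}(t)/\hat S_k(t)$, and all other quantities are defined in (8). These results allow one to derive a chi-square test and a maximum test similar to the two-sample case.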

2.3. Joint Two Sample Tests for Other Quantities

Joint tests can also be derived for other related quantities. For group k, let $\lambda_{2k}(t)$ and $\lambda_{\cdot k}(t)$ denote the other (Type 2) cause-specific hazard function and the all-cause hazard function, respectively.

2.3.1. Two-Sample Joint Tests for Cause-Specific Hazard and All-Cause Hazard

Consider the following null hypothesis

$$H_0: \lambda_{11}(t) = \lambda_{12}(t) \ \text{ and } \ \lambda_{\cdot 1}(t) = \lambda_{\cdot 2}(t) \ \text{ for all } 0 < t < \tau. \tag{11}$$

Let

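$$U_{\cdot k} = \int_0^\tau W_\cdot(t)\,Y_k(t)\left\{\frac{dN_{\cdot k}(t)}{Y_k(t)} - \frac{dN_\cdot(t)}{Y(t)}\right\}, \tag{12}$$

be the weighted log-rank test statistic for $H_0: \lambda_{\cdot 1}(t) = \lambda_{\cdot 2}(t)$ for all t > 0, where $N_{\cdot k}(t) = \sum_{j=1}^{2} N_{jk}(t)$, $N_\cdot(t) = \sum_{k=1}^{2}\sum_{j=1}^{2} N_{jk}(t)$, and $W_\cdot(t)$ is a predictable weight function that converges in probability to some deterministic function $w_\cdot(t)$ as n → ∞. Let $U_{11}$ and $U_{\cdot 1}$ be defined by (3) and (12). Then, $n^{-1/2}(U_{11}, U_{\cdot 1})$ has an asymptotic bivariate normal distribution with mean 0 and variance-covariance matrix $\Sigma^{(2)} = (\sigma_{ij}^{(2)})$. Furthermore, $\Sigma^{(2)}$ is consistently estimated by $\hat\Sigma^{(2)} = (\hat\sigma_{ij}^{(2)})$, where

$$\begin{aligned}
\hat\sigma_{11}^{(2)} &= n^{-1}\int_0^\tau W_1^2(t)\,\frac{Y_1(t)Y_2(t)}{Y_1(t)+Y_2(t)}\,\frac{dN_{11}(t)}{Y_1(t)},\\
\hat\sigma_{22}^{(2)} &= n^{-1}\int_0^\tau W_\cdot^2(t)\,\frac{Y_1(t)Y_2(t)}{Y_1(t)+Y_2(t)}\,\frac{dN_{\cdot 1}(t)}{Y_1(t)},\\
\hat\sigma_{12}^{(2)} &= n^{-1}\int_0^\tau W_1(t)\,W_\cdot(t)\,\frac{Y_1(t)Y_2(t)}{Y_1(t)+Y_2(t)}\,\frac{dN_{11}(t)}{Y_1(t)}.
\end{aligned}$$

These results allow one to construct a chi-square joint test and a maximum joint test for (11) similar to those for (1) in the previous section.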

2.3.2. Two-Sample Joint Tests for Both Cause-Specific Hazards

Consider

$$H_0: \lambda_{11}(t) = \lambda_{12}(t) \ \text{ and } \ \lambda_{21}(t) = \lambda_{22}(t) \ \text{ for all } 0 < t < \tau. \tag{13}$$

Let

$$U_{2k} = \int_0^\tau W_2(t)\,Y_k(t)\left\{\frac{dN_{2k}(t)}{Y_k(t)} - \frac{dN_2(t)}{Y(t)}\right\}, \tag{14}$$

be the weighted log-rank test statistic for $H_0: \lambda_{21}(t) = \lambda_{22}(t)$ for all 0 < t < τ, where $W_2(t)$ is a predictable weight function that converges in probability to some deterministic function $w_2(t)$ as n → ∞. It is well known that $U_{1k}$ and $U_{2k}$ are asymptotically independent (Prentice et al. 1978). Hence, one can construct a chi-square joint test and a maximum joint test for (13) based on the joint distribution of the two test statistics. A joint test for (13) was also studied previously by Lindkvist and Belyaev (1998) and Kulathinal and Gasbarra (2002), among others. In particular, the K-sample chi-square test of Kulathinal and Gasbarra (2002, p. 150) for the $(\lambda_{1k}, \lambda_{2k})$ pair with a special weight function $K_{kij}^{n}(t) = I(i=j)\,W_1(t)$ reduces to that based on $U_{1k}$ and $U_{2k}$. We also note that the ideas of Kulathinal and Gasbarra (2002) could be extended to derive a test for the $(\lambda_{1k}, \lambda_{\cdot k})$ pair, although it was not explicitly developed in their article.

Remark 3.

It can be shown that for group k, the three pairs of functions (λ1k(·), F1k(·)), (λ1k(·), λ·k(·)), and (λ1k(·), λ2k(·)) uniquely determine each other and that each pair uniquely determines the joint distribution of (Xik, δik). This implies that the three null hypotheses (1), (11), and (13) are equivalent. On the other hand, their alternative hypotheses are different because the three pairs of functions characterize different features of competing risks data. Furthermore, a significant effect of a variable on one pair does not necessarily imply a significant effect on another pair, as illustrated later in Section 5.1. A practical question is which pair(s) should be used, especially when planning a study. The answer depends on the specific research questions of the study. The cause-specific hazard and cumulative incidence pair, or (λ1k(·), F1k(·)), would be useful when studying the effects of a variable on a given type (Type 1) of failure, since they directly characterize two distinct and easily interpretable features of Type 1 failure. The cause-specific hazard and all-cause hazard pair, or (λ1k(·), λ·k(·)), would be useful when the all-cause hazard describes a meaningful clinical outcome such as “overall survival” (death due to any disease) in a randomized clinical trial of a new treatment versus a standard treatment for a specific disease in which the disease-specific survival and overall survival are co-primary endpoints. Note that the all-cause hazard may not always describe a meaningful clinical outcome, especially when the two types of failures are negatively correlated, as exemplified by the kidney transplantation program discussed at the beginning of Section 1. Finally, joint inference for both cause-specific hazards, or (λ1k(·), λ2k(·)), would be useful when both types of failures are of interest to the study.

3. Joint Regression Analysis for Competing Risks Data

3.1. Joint Regression Analysis of Cause-Specific Hazard and Cumulative Incidence

We now consider joint inference for the cause-specific hazard and the cumulative incidence hazard under a regression setting. Assume that one observes n independent and identically distributed triples $(X_i, \delta_i, Z_i)$, where for subject i (i = 1, …, n), $X_i = \min\{T_i, C_i\}$, $\delta_i = D_i I(T_i \le C_i)$, $T_i$ is the failure time of interest, $C_i$ is a right censoring time, $D_i$ is a discrete random variable taking values in {1, 2} with $D_i = j$ indicating that the failure is of type j, and $Z_i$ is a vector of fixed or time-varying covariates that are observed on $[0, X_i]$. Assume $C_i$ is independent of $T_i$, $D_i$, and $Z_i$, and $\mathrm{pr}(C_i \ge t) = G_c(t)$.

Let λ1(t|z) and λ˜1(t|z) be the conditional cause-specific hazard function and the conditional subdistribution hazard function for Type 1 failure for an individual with covariate z. Assume the proportional cause-specific hazards model (Prentice et al. 1978)

$$\lambda_1(t \mid Z) = \lambda_{10}(t)\exp(\beta_1^T Z^{(1)}(t)), \tag{15}$$

and the proportional subdistribution hazards model (Fine and Gray 1999)

$$\tilde\lambda_1(t \mid Z) = \tilde\lambda_{10}(t)\exp(\gamma_1^T Z^{(2)}(t)), \tag{16}$$

where λ10(t) and λ˜10(t) are unknown baseline cause-specific hazard and baseline subdistribution hazard for Type 1 failure, respectively, and Z(1)(t) and Z(2)(t) are functions of the original covariates Z and t that allow time × covariates interactions. Prentice et al. (1978) showed that inference for β1 under the proportional cause-specific hazards model (15) can be made using the standard Cox (1972, 1975) partial likelihood method by regarding other types of failure as independent censoring. The proportional subdistribution hazards model (16) was introduced by Fine and Gray (1999) who developed large sample inference for γ1.

Below we develop joint inference for β1 and γ1. Specifically, we consider the following joint null hypothesis

$$H_0: A_1^T\beta_1 = d_1 \ \text{ and } \ A_2^T\gamma_1 = d_2, \tag{17}$$

where A1 and A2 are constant matrices, and d1 and d2 are constant column vectors.

Following Prentice et al. (1978) and Fine and Gray (1999), let

$$U_1(\beta_1) = \sum_{i=1}^{n}\int_0^\infty \{Z_i^{(1)}(t) - \bar Z^{(1)}(\beta_1, t)\}\,dN_{i1}(t), \tag{18}$$

and

$$\tilde U_1(\gamma_1) = \sum_{i=1}^{n}\int_0^\infty \{Z_i^{(2)}(t) - \bar Z^{(2)}(\gamma_1, t)\}\,\omega_i(t)\,d\tilde N_{i1}(t), \tag{19}$$

be the score functions for β1 and γ1 under models (15) and (16), respectively, where

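$$\bar Z^{(1)}(\beta_1, t) = \frac{\sum_{l=1}^{n} Y_l(t)\,Z_l^{(1)}(t)\exp(\beta_1^T Z_l^{(1)}(t))}{\sum_{l=1}^{n} Y_l(t)\exp(\beta_1^T Z_l^{(1)}(t))}, \quad Y_i(t) = I\{X_i \ge t\},$$

and

$$N_{i1}(t) = I(X_i \le t, \delta_i = 1), \quad \bar Z^{(2)}(\gamma_1, t) = \frac{\sum_{l=1}^{n} \omega_l(t)\,\tilde Y_l(t)\,Z_l^{(2)}(t)\exp(\gamma_1^T Z_l^{(2)}(t))}{\sum_{l=1}^{n} \omega_l(t)\,\tilde Y_l(t)\exp(\gamma_1^T Z_l^{(2)}(t))},$$

$\tilde N_{i1}(t) = I(T_i \le t, D_i = 1)$, $\tilde Y_i(t) = 1 - \tilde N_{i1}(t-)$, $\omega_i(t) = I(C_i \ge T_i \wedge t)\,\hat G_c(t)/\hat G_c(X_i \wedge t)$, and $\hat G_c$ is the Kaplan and Meier (1958) estimate of the survival function $G_c$ of the censoring variable C. Note that $\tilde N_{i1}(t)$ is different from $N_{i1}(t)$ and may not be observed if the subject is censored, but $\omega_i(t)\tilde N_{i1}(t)$ can always be computed. Let $\hat\beta_1$ and $\hat\gamma_1$ be the solutions of the score equations $U_1(\beta_1) = 0$ and $\tilde U_1(\gamma_1) = 0$, respectively.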

Theorem 2.

Under similar regularity conditions to Andersen et al. (1982) and Fine and Gray (1999), we have

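$$n^{1/2}\begin{pmatrix}\hat\beta_1 - \beta_1\\ \hat\gamma_1 - \gamma_1\end{pmatrix} \to N(0, \Sigma^{(1)}), \quad \text{as } n \to \infty,$$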

where Σ(1) is defined by (A.11) in Appendix A.1. Furthermore, Σ(1) can be consistently estimated by

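$$\hat\Sigma^{(1)} = \begin{pmatrix}
\{\hat\Omega_{(pp)}^{(1)}\}^{-1} & \{\hat\Omega_{(pp)}^{(1)}\}^{-1}\,\hat\Omega_{(pq)}^{(1)}\,\{\hat\Omega_{(qq)}^{(1)}\}^{-1}\\
\{\hat\Omega_{(qq)}^{(1)}\}^{-1}\,\hat\Omega_{(qp)}^{(1)}\,\{\hat\Omega_{(pp)}^{(1)}\}^{-1} & \{\hat\Omega_{(qq)}^{(1)}\}^{-1}\,\hat\Omega_{(qq)}^{*(1)}\,\{\hat\Omega_{(qq)}^{(1)}\}^{-1}
\end{pmatrix}, \tag{20}$$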

where

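$$\begin{aligned}
\hat\Omega_{(pp)}^{(1)} &= \frac{1}{n}\sum_{i=1}^{n}\int_0^\infty\left[\frac{\sum_{l=1}^{n} Y_l(t)\,Z_l^{(1)}(t)^{\otimes 2}\exp(\hat\beta_1^T Z_l^{(1)}(t))}{\sum_{l=1}^{n} Y_l(t)\exp(\hat\beta_1^T Z_l^{(1)}(t))} - \bar Z^{(1)}(\hat\beta_1, t)^{\otimes 2}\right] dN_{i1}(t),\\
\hat\Omega_{(qq)}^{(1)} &= \frac{1}{n}\sum_{i=1}^{n}\int_0^\infty\left\{\frac{\sum_{l=1}^{n} \omega_l(t)\,\tilde Y_l(t)\,Z_l^{(2)}(t)^{\otimes 2}\exp(\hat\gamma_1^T Z_l^{(2)}(t))}{\sum_{l=1}^{n} \omega_l(t)\,\tilde Y_l(t)\exp(\hat\gamma_1^T Z_l^{(2)}(t))} - \bar Z^{(2)}(\hat\gamma_1, t)^{\otimes 2}\right\} d\tilde N_{i1}(t),\\
\hat\Omega_{(pq)}^{(1)} &= \frac{1}{n}\sum_{i=1}^{n}\left\{\int_0^\infty \big(Z_i^{(1)}(t) - \bar Z^{(1)}(\hat\beta_1, t)\big)\big(dN_{i1}(t) - Y_i(t)\exp(\hat\beta_1^T Z_i^{(1)}(t))\,d\hat\Lambda_{10}(t)\big)\right\}\hat\eta_i\\
&\quad + \frac{1}{n}\sum_{i=1}^{n}\left\{\int_0^\infty \big(Z_i^{(1)}(t) - \bar Z^{(1)}(\hat\beta_1, t)\big)\big(dN_{i1}(t) - Y_i(t)\exp(\hat\beta_1^T Z_i^{(1)}(t))\,d\hat\Lambda_{10}(t)\big)\right\}\hat\phi_i,\\
\hat\Omega_{(qq)}^{*(1)} &= \frac{1}{n}\sum_{i=1}^{n}(\hat\eta_i + \hat\phi_i)^{\otimes 2},
\end{aligned} \tag{21}$$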

with

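$$\begin{aligned}
\hat\eta_i &= \int_0^\infty \{Z_i^{(2)}(t) - \bar Z^{(2)}(\hat\gamma_1, t)\}\,\omega_i(t)\,d\hat{\tilde M}_{i1}(t),\\
\hat{\tilde M}_{i1}(t) &= \tilde N_{i1}(t) - \int_0^t \tilde Y_i(u)\exp(\hat\gamma_1^T Z_i^{(2)}(u))\,d\hat{\tilde\Lambda}_{10}(u),\\
\hat{\tilde\Lambda}_{10}(t) &= \frac{1}{n}\sum_{i=1}^{n}\int_0^t\left\{\sum_{l=1}^{n}\tilde Y_l(u)\exp(\hat\gamma_1^T Z_l^{(2)}(u))\right\}^{-1}\omega_i(u)\,d\tilde N_{i1}(u),\\
\hat\phi_i &= \int_0^\infty \frac{\hat q(t)}{\hat\pi(t)}\,d\hat M_i^c(t),\\
\hat M_i^c(t) &= I(X_i \le t, \delta_i = 0) - \int_0^t I(X_i \ge u)\,d\hat\Lambda^c(u),\\
\hat\Lambda^c(t) &= \int_0^t \frac{\sum_{i=1}^{n} d\{I(X_i \le u, \delta_i = 0)\}}{\sum_{i=1}^{n} I(X_i \ge u)},\\
\hat q(t) &= -n^{-1}\sum_{i=1}^{n}\int_0^\infty \{Z_i^{(2)}(s) - \bar Z^{(2)}(\hat\gamma_1, s)\}\,I(s \ge t > X_i)\,\omega_i(s)\,d\hat{\tilde M}_{i1}(s),\\
\hat\pi(t) &= n^{-1}\sum_{i=1}^{n} I(X_i \ge t).
\end{aligned}$$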

Corollary 1.

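Let $\xi_n = n^{1/2}(A_1\hat\beta_1 - d_1)$ and $\eta_n = n^{1/2}(A_2\hat\gamma_1 - d_2)$. Then, under the null hypothesis (17), we have

$$\begin{pmatrix}\xi_n\\ \eta_n\end{pmatrix} \to N(0, V), \quad \text{as } n \to \infty,$$

where

$$V = \begin{pmatrix} A_1 & 0\\ 0 & A_2\end{pmatrix}\Sigma^{(1)}\begin{pmatrix} A_1^T & 0\\ 0 & A_2^T\end{pmatrix}. \tag{22}$$

Define the following Wald-type test statistic

$$X_W^2 = (\xi_n^T, \eta_n^T)\,\hat V^{-1}\begin{pmatrix}\xi_n\\ \eta_n\end{pmatrix},$$

where $\hat V$ is a consistent estimate of V obtained by replacing $\Sigma^{(1)}$ with $\hat\Sigma^{(1)}$ in (22). It follows immediately from Corollary 1 that under (17), $X_W^2$ has an asymptotic chi-square distribution with $p_{d_1} + p_{d_2}$ degrees of freedom, where $p_{d_1}$ and $p_{d_2}$ are the dimensions of $d_1$ and $d_2$, respectively. This leads to the following chi-square joint test for (17):

$$\text{Reject (17) at level } \alpha \text{ if } X_W^2 > \chi^2_{p_{d_1}+p_{d_2}}(\alpha),$$

where $\chi^2_{p_{d_1}+p_{d_2}}(\alpha)$ is the upper α quantile (that is, the 100(1 − α)th percentile) of the $\chi^2_{p_{d_1}+p_{d_2}}$ distribution.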

3.2. Joint Regression Analysis of Other Quantities

Besides analyzing λ1(t|Z) and λ˜1(t|Z) jointly, it is sometimes also useful to consider other related quantities as discussed in Section 2.3 (Remark 3).

3.2.1. Joint Regression Analysis of Cause-Specific Hazard and All-Cause Hazard

Assume that the proportional cause-specific hazards model (15) holds. In addition, assume the proportional all-cause hazards model:

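$$\lambda_\cdot(t \mid Z) = \lambda_{\cdot 0}(t)\exp(\beta_\cdot^T Z^{(3)}(t)), \tag{23}$$

where $\lambda_\cdot(t \mid Z)$ denotes the conditional all-cause hazard function given Z, $\lambda_{\cdot 0}(t)$ is an unknown baseline hazard, and $Z^{(3)}(t)$ are functions of the original covariates Z and t that allow time × covariates interactions. Below we derive joint inference for $\beta_1$ and $\beta_\cdot$.

Let

$$U_\cdot(\beta_\cdot) = \sum_{i=1}^{n}\int_0^\infty \{Z_i^{(3)}(t) - \bar Z^{(3)}(\beta_\cdot, t)\}\,dN_{i\cdot}(t), \tag{24}$$

be the score function for $\beta_\cdot$ under model (23), where

$$\bar Z^{(3)}(\beta_\cdot, t) = \frac{\sum_{l=1}^{n} Y_l(t)\,Z_l^{(3)}(t)\exp(\beta_\cdot^T Z_l^{(3)}(t))}{\sum_{l=1}^{n} Y_l(t)\exp(\beta_\cdot^T Z_l^{(3)}(t))}$$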

and $N_{i\cdot}(t) = I(X_i \le t, \delta_i \ge 1)$ counts an observed failure of either type. Let $\hat\beta_\cdot$ be the solution of the score equation $U_\cdot(\beta_\cdot) = 0$.

Theorem 3.

Under some regularity conditions, as n → ∞,

$$n^{1/2}\begin{pmatrix}\hat\beta_1 - \beta_1\\ \hat\beta_\cdot - \beta_\cdot\end{pmatrix} \to N(0, \Sigma^{(2)}),$$

where Σ(2) is defined by (A.13) in Appendix A.1. Furthermore, Σ(2) can be consistently estimated by

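$$\hat\Sigma^{(2)} = \begin{pmatrix}
\{\hat\Omega_{(pp)}^{(2)}\}^{-1} & \{\hat\Omega_{(pp)}^{(2)}\}^{-1}\,\hat\Omega_{(pq)}^{(2)}\,\{\hat\Omega_{(qq)}^{(2)}\}^{-1}\\
\{\hat\Omega_{(qq)}^{(2)}\}^{-1}\,\hat\Omega_{(qp)}^{(2)}\,\{\hat\Omega_{(pp)}^{(2)}\}^{-1} & \{\hat\Omega_{(qq)}^{(2)}\}^{-1}
\end{pmatrix}, \tag{25}$$

where

$$\begin{aligned}
\hat\Omega_{(pp)}^{(2)} &= \frac{1}{n}\sum_{i=1}^{n}\int_0^\infty\left[\frac{\sum_{l=1}^{n} Y_l(t)\,Z_l^{(1)}(t)^{\otimes 2}\exp(\hat\beta_1^T Z_l^{(1)}(t))}{\sum_{l=1}^{n} Y_l(t)\exp(\hat\beta_1^T Z_l^{(1)}(t))} - \bar Z^{(1)}(\hat\beta_1, t)^{\otimes 2}\right] dN_{i1}(t),\\
\hat\Omega_{(pq)}^{(2)} &= \frac{1}{n}\sum_{i=1}^{n}\int_0^\infty \big(Z_i^{(3)}(t) - \bar Z^{(3)}(\hat\beta_\cdot, t)\big)\times\big(Z_i^{(1)}(t) - \bar Z^{(1)}(\hat\beta_1, t)\big)\,Y_i(t)\exp(\hat\beta_1^T Z_i^{(1)}(t))\,d\hat\Lambda_{10}(t),\\
\hat\Omega_{(qq)}^{(2)} &= \frac{1}{n}\sum_{i=1}^{n}\int_0^\infty\left[\frac{\sum_{l=1}^{n} Y_l(t)\,Z_l^{(3)}(t)^{\otimes 2}\exp(\hat\beta_\cdot^T Z_l^{(3)}(t))}{\sum_{l=1}^{n} Y_l(t)\exp(\hat\beta_\cdot^T Z_l^{(3)}(t))} - \bar Z^{(3)}(\hat\beta_\cdot, t)^{\otimes 2}\right] dN_{i\cdot}(t),
\end{aligned}$$

where $\hat\Lambda_{10}(t) = \frac{1}{n}\sum_{i=1}^{n}\int_0^t\left\{\sum_{l=1}^{n} Y_l(u)\exp(\hat\beta_1^T Z_l^{(1)}(u))\right\}^{-1} dN_{i1}(u)$ is an estimator of the baseline cumulative cause-specific hazard for Type 1 failure.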

Theorem 3 enables one to draw joint inference for $\beta_1$ and $\beta_\cdot$ along the lines of the previous section.

3.2.2. Joint Regression Analysis of Both Cause-Specific Hazards

Assume the proportional cause-specific hazards model (15) for Type 1 failure. In addition, assume the following proportional cause-specific hazards model for Type 2 failure:

$$\lambda_2(t \mid Z) = \lambda_{20}(t)\exp(\beta_2^T Z^{(4)}(t)), \tag{26}$$

where λ20(t) is an unknown baseline cause-specific hazard, and Z(4)(t) are functions of the original covariates Z and t that allow time × covariates interactions.

Let

$$U_2(\beta_2) = \sum_{i=1}^{n}\int_0^\infty \{Z_i^{(4)}(t) - \bar Z^{(4)}(\beta_2, t)\}\,dN_{i2}(t), \tag{27}$$

be the score function for $\beta_2$ under model (26), where

$$\bar Z^{(4)}(\beta_2, t) = \frac{\sum_{l=1}^{n} Y_l(t)\,Z_l^{(4)}(t)\exp(\beta_2^T Z_l^{(4)}(t))}{\sum_{l=1}^{n} Y_l(t)\exp(\beta_2^T Z_l^{(4)}(t))}.$$

Let $\hat\beta_2$ be the solution of the score equation $U_2(\beta_2) = 0$. It can be shown that $U_1$ and $U_2$ are asymptotically independent since $N_{i1}(t)$ and $N_{i2}(t)$ do not jump at the same time. Therefore, one can draw joint inference for $\beta_1$ and $\beta_2$ similar to the previous sections.

Remark 4.

In addition to being easy to interpret, the PH models for the cause-specific hazard and the all-cause hazard only require that the censoring time be conditionally independent of the survival time given the observed covariates, which is weaker than the censoring completely at random assumption needed by the proportional subdistribution hazards model.

Remark 5 (Model Checking).

Model diagnostic techniques for the standard Cox (1972) proportional hazards model can be readily applied to assess model assumptions of the individual models (15), (23), and (26) (Schoenfeld 1980, 1982; Lagakos 1981; Andersen 1982; Nagelkerke, Oosting, and Hart 1984; Moreau, O’Quigley, and Mesbah 1985; Arjas 1988; Beyersmann et al. 2007; Latouche et al. 2007; Grambauer, Schumacher, and Beyersmann 2010; Andersen et al. 2012; Haller, Schmidt, and Ulm 2012). Graphical methods for these models can also be adapted for the proportional subdistribution hazards model (16). Formal goodness-of-fit tests for (16) have been developed by Scheike and Zhang (2008). In addition to assessing goodness of fit of an individual model, it is also important to check if two individual models hold simultaneously. For example, it has been well recognized that the proportional hazards assumption for a time-independent covariate does not hold simultaneously for the cause-specific hazard and the cause-specific subdistribution hazard, and thus it is important for models (15) and (16) to allow time × covariates interactions. To check if (15) and (16) hold simultaneously, one needs to verify that for any z, $\Lambda_2(t \mid z) \equiv \tilde\Lambda_1(t \mid z) - \Lambda_1(t \mid z) + \log\lambda_1(t \mid z) - \log\tilde\lambda_1(t \mid z)$ is nondecreasing and satisfies $\Lambda_2(0 \mid z) = 0$. In other words, the above defined $\Lambda_2(t \mid z)$ is a proper conditional cumulative cause-specific hazard function for Type 2 failure. We provide an example of the joint model of (15) and (16) in Section 4 (model (28)).
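The compatibility condition in Remark 5 can be checked numerically on a grid. The sketch below does so for a toy specification with constant cause-specific hazards (hypothetical values 0.1 and 0.04, no covariates), for which the implied Type 2 cumulative hazard should equal 0.04t; the helper names are assumptions for this example and the grid check is only a heuristic, not a proof of monotonicity.

```python
import math

def implied_cum_hazard_type2(t, lam1, cum_lam1, sub_lam1, cum_sub_lam1):
    """Lambda_2(t) implied by a cause-specific hazard lam1 and a subdistribution
    hazard sub_lam1 for Type 1 failure, with their cumulative versions."""
    return (cum_sub_lam1(t) - cum_lam1(t)
            + math.log(lam1(t)) - math.log(sub_lam1(t)))

# Toy specification: constant cause-specific hazards A (Type 1) and B (Type 2).
A, B = 0.1, 0.04

def F1(t):
    return A / (A + B) * (1.0 - math.exp(-(A + B) * t))

def lam1(t):
    return A

def cum_lam1(t):
    return A * t

def sub_lam1(t):
    return A * math.exp(-(A + B) * t) / (1.0 - F1(t))

def cum_sub_lam1(t):
    return -math.log(1.0 - F1(t))

grid = [0.5 * i for i in range(41)]
values = [implied_cum_hazard_type2(t, lam1, cum_lam1, sub_lam1, cum_sub_lam1)
          for t in grid]
starts_at_zero = abs(values[0]) < 1e-12
nondecreasing = all(b >= a - 1e-12 for a, b in zip(values, values[1:]))
print(starts_at_zero, nondecreasing)
```

For a pair of models that cannot hold simultaneously, the same check would typically show the implied function decreasing somewhere on the grid.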

4. Simulations

We present some simulation results to illustrate the advantage of the proposed joint tests over the Bonferroni method. The weight function is set to be a constant 1 in all simulations.

The first simulation considers two-group comparison of Type 1 failure with respect to both cause-specific hazard (CSH) and cumulative incidence function (CIF). We assign an equal number of patients to the two groups. Competing risks data are generated using the cause-specific hazard-driven method of Beyersmann et al. (2009), which requires only specification of the cause-specific hazard for each type of failure.
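A cause-specific hazard-driven generator can be sketched as follows for the special case of constant hazards: the failure time is drawn from the all-cause hazard, and the failure type is then drawn with probability proportional to its cause-specific hazard. The function name is hypothetical, and `cens_rate` is taken here to be the hazard of an independent exponential censoring time, which is an assumption rather than a detail stated in the text.

```python
import random

def simulate_competing_risks(n, lam1, lam2, cens_rate, seed=0):
    """Generate n pairs (X, delta) with constant cause-specific hazards.

    delta = 0 (censored), 1 (Type 1 failure), or 2 (Type 2 failure).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.expovariate(lam1 + lam2)          # all-cause failure time
        d = 1 if rng.random() < lam1 / (lam1 + lam2) else 2
        c = rng.expovariate(cens_rate)            # independent censoring time
        x = min(t, c)
        delta = d if t <= c else 0
        data.append((x, delta))
    return data

# Hypothetical parameter values, for illustration only.
sample = simulate_competing_risks(20_000, lam1=0.1, lam2=0.04, cens_rate=0.02, seed=1)
n_type1 = sum(1 for _, delta in sample if delta == 1)
n_type2 = sum(1 for _, delta in sample if delta == 2)
print(n_type1, n_type2, len(sample) - n_type1 - n_type2)
```

For time-varying hazards such as the piecewise specification used in the later scenarios, the failure time would instead be obtained by inverting the all-cause cumulative hazard, with the type probability evaluated at the drawn time.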

Figure 1 depicts the simulated power of the two-sided chi-square joint test, maximum joint test, and Bonferroni joint test for hypothesis (1) for various sample sizes per group under four scenarios. Figure 1(a) corresponds to a null case under H0. Figure 1(b) corresponds to a scenario where there is a small group difference in CSH and a large group difference in CIF, whereas Figure 1(c) corresponds to the opposite situation. Figure 1(d) corresponds to a case where the group effects on CSH and CIF are similar. Specifically, in the first two scenarios, we assume constant cause-specific hazards for both causes, with $\lambda_{11} = \lambda_{12} = 0.04$, $\lambda_{21} = \lambda_{22} = 0.01$ for Figure 1(a) and $\lambda_{11} = \lambda_{12} = 0.1$, $\lambda_{21} = 0.04$, $\lambda_{22} = 0.01$ for Figure 1(b), where $\lambda_{jk}$ denotes the cause-specific hazard for type j failure in group k. In the last two scenarios, we assume $\lambda_1(t \mid Z) = \lambda_{10}(t)\exp\{\gamma Z\,I(t < 1) + \beta Z\,I(t \ge 1)\}$ and $\tilde\lambda_1(t \mid Z) = \tilde\lambda_{10}(t)\exp(\gamma Z)$, with β = 0.4, γ = 0.01 for Figure 1(c) and β = 0.5, γ = 0.5 for Figure 1(d), where $\lambda_{10}(t) = 0.05\,I(0 \le t < 1) + 0.1\,I(t \ge 1)$, $\tilde\lambda_{10}(t) = 0.05e^{-t}/\{1 - 0.05(1 - e^{-t})\}$, and Z is a binary group variable. The censoring rate is set to be 0.1 with an independent exponential censoring time in each scenario. The nominal significance level is 0.05. A graphical illustration of the CIF by groups under all four scenarios is presented in Appendix A.3 (Figure A.5).

Figure 1.

Simulated power of the two-sided chi-square joint test, maximum joint test, and Bonferroni joint test for two-group Type 1 failure comparison with respect to the CSH and CIF pair under four scenarios as described in Section 4: (a) null case under H0, (b) smaller group difference in CSH and larger group difference in CIF, (c) larger group difference in CSH and smaller group difference in CIF, and (d) similar group effects on CSH and CIF.

It is seen from Figure 1(a) that the Type I error rates for all three tests are well controlled around the 0.05 nominal level. In all three alternative cases ((b)–(d)), either the chi-square joint test, or the maximum joint test, or both are more powerful than the Bonferroni method. In the cases where the group effects on CSH and CIF are quite different (Figure 1(b) and 1(c)), the chi-square joint test is observed to be most powerful with substantially improved power. When the effect sizes for CSH and CIF are similar (Figure 1(d)), the maximum joint test outperforms the others. The improved power of the proposed joint tests has important implications for the design of clinical trials in the presence of competing risks. For example, to achieve 80% power under the second scenario (Figure 1(b)), it would require n = 80 patients for the chi-square joint test, about 200 patients for the maximum joint test, and more than 200 patients for the Bonferroni joint test.
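The pattern in Figure 1(b)–(d) can be illustrated with a toy calculation. The sketch below assumes, since Section 2 is not restated here, that the chi-square joint test refers a quadratic form in two standardized statistics to a chi-square distribution with 2 degrees of freedom, and that the Bonferroni method doubles the smaller two-sided p-value. The correlation `RHO` and the statistic values are hypothetical and are not taken from the simulations.

```python
import math
from statistics import NormalDist

def two_sided_p(z):
    """Two-sided normal p-value for one standardized statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def bonferroni_joint_p(z1, z2):
    """Bonferroni-adjusted p-value for two individual tests."""
    return min(1.0, 2.0 * min(two_sided_p(z1), two_sided_p(z2)))

def chi_square_joint_p(z1, z2, rho):
    """P-value of the quadratic form in (z1, z2) with correlation rho,
    referred to a chi-square distribution with 2 degrees of freedom."""
    q = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    return math.exp(-q / 2.0)  # survival function of chi-square(2) at q

RHO = 0.8                 # hypothetical correlation between the two statistics
different = (2.2, 0.3)    # hypothetical: clear effect on one quantity only
similar = (2.0, 2.0)      # hypothetical: similar effects on both quantities

for label, (z1, z2) in (("different", different), ("similar", similar)):
    print(label, chi_square_joint_p(z1, z2, RHO), bonferroni_joint_p(z1, z2))
```

With positively correlated statistics, the quadratic form is large when the two statistics disagree, so in this toy setting the chi-square p-value falls below the Bonferroni p-value for the "different" input and above it for the "similar" input.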

We also conducted power comparisons for one-sided joint tests under the same four scenarios as in Figure 1. The results are presented in Figure A.1 in Appendix A.2. The results are consistent with the two-sided case except that the maximum joint test has much more pronounced improvement over the chi-square joint test in the last scenario. We note that the chi-square joint test is constructed for a two-sided hypothesis, and thus can be underpowered when used as a one-sided test as shown in Figure A.1(d).

The second simulation study considers a joint regression model of CSH and CIF with respect to Type 1 failure. It is well known that the proportional hazards assumption for a time-independent covariate usually does not hold simultaneously for the CSH and the CIF hazard (or subdistribution hazard), so it is imperative to include time by covariate interactions in the joint model. As an illustration, we consider the following joint model:

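$$\lambda_1(t|Z) = \lambda_{10}(t)\exp\left(\gamma^T Z\, I(t < \tau_0) + \beta^T Z\, I(t \ge \tau_0)\right), \qquad \tilde{\lambda}_1(t|Z) = \tilde{\lambda}_{10}(t)\exp\left(\tilde{\gamma}^T Z\, I(t < \tau_0) + \tilde{\beta}^T Z\, I(t \ge \tau_0)\right), \tag{28}$$

where λ10(t) = aI(0 ≤ t < τ0) + bI(t ≥ τ0), $\tilde{\lambda}_{10}(t) = ce^{-t}/\{1 - c(1 - e^{-t})\}$, Z = (Z1, Z2) with Z1, Z2 being binary variables, γ = (γ1, γ2), β = (β1, β2), and τ0 is some prespecified constant. Note that under model (28), the conditional cumulative cause-specific hazard function for cause 2 given Z = z is $\Lambda_2(t|z) = \tilde{\Lambda}_1(t|z) - \Lambda_1(t|z) + \log\lambda_1(t|z) - \log\tilde{\lambda}_1(t|z)$. For Λ2(t|z) to be a proper conditional cumulative cause-specific hazard function, it must satisfy

$$\Lambda_2(0|z) = 0 \quad \text{and} \quad \lambda_2(t|z) = \frac{\partial \Lambda_2(t|z)}{\partial t} \ge 0 \quad \text{for all } t \ge 0,$$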

which imply some constraints on the parameters in model (28). For simplicity, we further assume γ˜=β˜ for our simulation. In this case, it can be shown that Λ2(t|z) is a proper cumulative cause-specific hazard function if the following constraints hold: (i) a = cb, (ii) eγTz<1ac, (iii) eβTz<1/a(1eτ0), and (iv) γ˜=γ. We then generated competing risks data from λ1(t|z) and λ2(t|z) using the method of Beyersmann et al. (2009).
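Whether a given parameter choice yields a proper Λ2(t|z) can also be checked numerically. The sketch below is a grid check, not a proof. It assumes a scalar covariate z, a subdistribution effect equal to γ on both time pieces, and the baseline forms λ10(t) = aI(t < τ0) + bI(t ≥ τ0) and λ̃10(t) = ce^{−t}/{1 − c(1 − e^{−t})}. The function names, grid size, and tolerance are hypothetical.

```python
import math

def cum_hazard_cause2(t, z, a, b, c, gamma, beta, tau0):
    """Lambda_2(t|z) = sub_cum - csh_cum + log(csh) - log(sub_haz)."""
    # Subdistribution hazard for Type 1 and its closed-form integral.
    denom = 1.0 - c * (1.0 - math.exp(-t))
    sub_haz = c * math.exp(-t) / denom * math.exp(gamma * z)
    sub_cum = -math.log(denom) * math.exp(gamma * z)
    # Piecewise-constant cause-specific hazard for Type 1 and its integral.
    if t < tau0:
        csh = a * math.exp(gamma * z)
        csh_cum = csh * t
    else:
        csh = b * math.exp(beta * z)
        csh_cum = a * math.exp(gamma * z) * tau0 + csh * (t - tau0)
    return sub_cum - csh_cum + math.log(csh) - math.log(sub_haz)

def is_proper(z, a, b, c, gamma, beta, tau0, t_max=20.0, steps=2000, tol=1e-9):
    """Grid check that Lambda_2(0|z) = 0 and Lambda_2 is nondecreasing on [0, t_max]."""
    values = [cum_hazard_cause2(t_max * k / steps, z, a, b, c, gamma, beta, tau0)
              for k in range(steps + 1)]
    starts_at_zero = abs(values[0]) < tol
    nondecreasing = all(v2 >= v1 - tol for v1, v2 in zip(values, values[1:]))
    return starts_at_zero and nondecreasing

# Example with a = c = 0.05, b = 0.1, tau0 = 1 and beta = gamma = 0.5.
print(is_proper(1, a=0.05, b=0.1, c=0.05, gamma=0.5, beta=0.5, tau0=1.0))
```

Because log λ1(t|z) jumps at τ0 under a piecewise-constant hazard, Λ2(t|z) also jumps there; the check requires that jump to be nonnegative.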

Figure 2 displays the simulated power curves of the three two-sided joint tests described in Section 3.1 for the following local hypothesis regarding the effects of Z1 on the CSH and the CIF hazard after time τ0:

$$H_0: \beta_1 = 0 \ \text{and} \ \gamma_1 = 0. \tag{29}$$

We consider four scenarios: (a) the null case (β1 = 0, γ1 = 0); (b) smaller Z1 effect on CSH and larger Z1 effect on CIF (β1 = −0.1, γ1 = −0.4); (c) larger Z1 effect on CSH and smaller Z1 effect on CIF (β1 = −0.6, γ1 = −0.2); and (d) similar Z1 effects on CSH and CIF (β1 = −0.5, γ1 = −0.5). In all four scenarios, we set a = 0.05, b = 0.1, β2 = −0.2, γ2 = −0.1, $\gamma = \tilde{\gamma}$, and τ0 = 1.

Figure 2.

Simulated power of the two-sided chi-square joint test, maximum joint test, and Bonferroni joint test of a local hypothesis (29) for a joint regression model (28) of CSH and CIF under four scenarios as described in Section 4: (a) null case, (b) smaller effects on CSH and larger effects on CIF, (c) larger effects on CSH and smaller effects on CIF, and (d) similar effects on CSH and CIF.

Figure 2 leads to conclusions similar to those observed for the two-group case in the first simulation study. In the supplementary material, we also present some simulations for the CSH and all-cause hazard (ACH) pair, which lead to similar conclusions.

Finally, we conducted a small-scale simulation to compare the power of the three joint tests for (1), (11), and (13). When there is little group difference in a particular quantity, a test for a pair involving that quantity was observed to have lower power than those for other pairs. This is not surprising because a joint test for a specific pair is constructed to detect a group difference in the direction of that pair. The details are omitted.

5. Real Data Example

We illustrate our methods on two real datasets. In the first example, we consider joint inference for time to second malignancy in Hodgkin disease patients. In the second example, we perform joint analysis of the cause-specific hazard (CSH) for time to progression (TTP) and the all-cause hazard (ACH) for time to progression or death (progression-free survival or PFS) for follicular-type lymphoma patients.

5.1. Hodgkin Disease

The Hodgkin disease data were described in Pintilie (2006). They consist of 865 patients who were diagnosed with Hodgkin disease and received radiotherapy at Princess Margaret Hospital between 1968 and 1986. Here we are interested in studying time to second malignancy after receiving radiotherapy, which is an important variable for evaluating the side effects of radiotherapy. Death without second malignancy is a competing risk. Among the 865 patients, 93 developed a second malignancy, 386 died without a second malignancy, and 386 were right censored, having experienced neither event by the end of the study. For illustration purposes, we investigate whether or not the risks of developing second malignancy were the same among older (≥30) and younger (<30) patients.

Figure 3(a) and 3(b) depict the cumulative cause-specific hazard functions and the cumulative incidence functions, respectively, for time to second malignancy for the older (≥30) and younger (<30) groups. There appears to be a higher cause-specific hazard for the older patients since the slope of their cumulative cause-specific hazard is noticeably bigger (Figure 3(a)). However, the cumulative incidence functions for the two age groups are barely distinguishable (Figure 3(b)). The two-sample log-rank test for the cause-specific hazard for time to second malignancy yields a p-value = 0.037. The Gray (1988) two-sample test for the cumulative incidence for time to second malignancy gives a p-value = 0.770. At the 5% overall significance level, neither of the individual tests is statistically significant at the Bonferroni-adjusted level 0.05/2 = 0.025.

Figure 3.

(a) Cumulative cause-specific hazard functions for time to second malignancy for older (≥30) and younger (<30) patients (log-rank test p-value = 0.037). (b) Cumulative incidence functions for time to second malignancy for older (≥30) and younger (<30) patients (Gray’s test p-value = 0.770).

We performed the chi-square joint test and the maximum joint test for the null hypothesis that there is no difference in the cause-specific hazard (CSH) and the cumulative incidence (CIF) for time to second malignancy between older and younger patients. The p-values are presented in the first part of Table 1, along with the results of the individual tests and the Bonferroni method. In contrast to the Bonferroni method, the two-sample chi-square joint test for the cause-specific hazard and the cumulative incidence yields a p-value of 0.020, which is significant at the 5% significance level. The maximum joint test is also significant at level 0.05 (p-value = 0.05). As illustrations, we also performed joint tests for (CSH, ACH) and for CSH with the other cause-specific hazard (OCH) (parts 2 and 3 of Table 1), which show that in addition to an elevated cause-specific hazard for time to second malignancy, the older patients also had a higher risk of dying from other life-threatening diseases without developing second malignancy. This explains why their observed cumulative incidence for time to second malignancy was not significantly different from that of the younger patients.

Table 1.

Separate and joint test results for Hodgkin disease example for three pairs of quantities.

           Separate test       Joint test
Test       CSH      CIF        Bonferroni   χ2        Max
p-value    0.037    0.770      0.074        0.020     0.050
Test       CSH      ACH        Bonferroni   χ2        Max
p-value    0.037    5.2E–8     1.0E–7       3.4E–7    3.0E–8
Test       CSH      OCH        Bonferroni   χ2        Max
p-value    0.037    4.7E–7     9.4E–7       3.5E–7    8.0E–7

NOTE: χ2 and Max are abbreviations for the chi-square joint test and the maximum joint test described in Section 2.2.
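The Bonferroni column of Table 1 can be reproduced from the two separate p-values in each row. The snippet below is a small arithmetic check, assuming the Bonferroni joint p-value is twice the smaller individual p-value, capped at 1; the function name and dictionary are hypothetical.

```python
def bonferroni_p(p_first, p_second):
    """Bonferroni-adjusted joint p-value for two individual tests."""
    return min(1.0, 2.0 * min(p_first, p_second))

# Separate-test p-values from Table 1, keyed by the pair of quantities.
separate = {
    ("CSH", "CIF"): (0.037, 0.770),
    ("CSH", "ACH"): (0.037, 5.2e-8),
    ("CSH", "OCH"): (0.037, 4.7e-7),
}

for pair, (p_a, p_b) in separate.items():
    print(pair, f"{bonferroni_p(p_a, p_b):.2g}")
```

The adjusted value 0.074 for the (CSH, CIF) pair exceeds 0.05, which is the same conclusion as comparing each individual p-value with 0.05/2 = 0.025.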

5.2. Follicular Cell Lymphoma Study

The follicular cell lymphoma study (Pintilie 2006; Scheike and Zhang 2011) consists of 541 early-stage (I or II) follicular-type lymphoma patients who were enrolled between 1967 and 1996 and treated with either radiation alone (RT) or with radiation and chemotherapy (CMT). There were 272 events due to disease (relapse or no treatment response), 76 competing risk events (death without relapse), and 193 censored individuals who had experienced neither event by the end of follow-up. As in Scheike and Zhang (2011), we test if the CMT group has a longer time to relapse or no treatment response than the RT group. Although one could study different pairs of quantities, we consider joint inference of the cause-specific hazard and the all-cause hazard based on models (15) and (23) because they correspond to two commonly used clinical endpoints, namely, time to progression (TTP) and progression-free survival (PFS), in oncology trials. Here TTP, defined as time to relapse or no treatment response, is an endpoint for the antitumor activity of a treatment, and PFS, defined as time to progression or death before progression, is an endpoint for the overall effects on a patient. In addition to a binary treatment variable (1 for RT and 0 for CMT), we adjust for patient’s baseline age, stage, and hemoglobin level (hgb) by including them as covariates in our models. The Cox–Snell residual plots for the proportional all-cause hazards model (Figure A.6(a)) and the proportional cause-specific hazards model (Figure A.6(b)), which are presented in Appendix A.3, indicate reasonable overall fit of both models. We conducted the chi-square joint test and the maximum joint test for the treatment variable and summarized the results, along with the Bonferroni adjustment method and the individual tests, in Table 2. The maximum joint test (p-value = 0.047) is significant, whereas the chi-square joint test (p-value = 0.182) and the Bonferroni method (p-value = 0.07) are not significant at the 5% significance level. The one-sided individual test statistics for CSH and ACH are 1.81 and 1.78, respectively, both exceeding 1.77, the cutoff value of the maximum test. Therefore, we conclude that at the 5% overall significance level, the CMT group has a lower risk of progression (cause-specific hazard) and a lower risk of progression or death (ACH) as compared to the RT group, adjusting for patient’s baseline age, stage, and hemoglobin level (hgb). Finally, the chi-square joint test has a relatively large p-value because it is actually a two-sided test that is not powered for a one-sided hypothesis, especially when the effect sizes for CSH and ACH are similar, which is consistent with our simulation results (Figure A.3(d)).
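The cutoff of 1.77 quoted above is specific to the fitted models. As a rough illustration of where such a cutoff can come from, the sketch below assumes (the construction is not restated in this section) that a one-sided maximum test rejects when the larger of two standardized statistics exceeds the upper 5% point of max(Z1, Z2) for a standard bivariate normal pair with correlation rho. The function name `max_test_cutoff`, the Monte Carlo settings, and the value of rho are hypothetical.

```python
import math
import random
from statistics import NormalDist

def max_test_cutoff(rho, alpha=0.05, n_draws=100_000, seed=1):
    """Monte Carlo upper-alpha quantile of max(Z1, Z2) with corr(Z1, Z2) = rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    maxima = []
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)  # correlated with z1
        maxima.append(max(z1, z2))
    maxima.sort()
    index = min(int((1.0 - alpha) * n_draws), n_draws - 1)
    return maxima[index]

# Closed-form limits of the cutoff at the two extremes of the correlation.
identical = NormalDist().inv_cdf(0.95)               # rho = 1: a single test
independent = NormalDist().inv_cdf(math.sqrt(0.95))  # rho = 0: independent tests

print(identical, max_test_cutoff(0.9), independent)
```

Under this assumption the cutoff ranges from about 1.645 (perfectly correlated statistics) to about 1.955 (independent statistics), and the reported value 1.77 lies inside that range.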

Table 2.

Separate and joint test results for follicular cell lymphoma study.

           Separate test       Joint test
Test       CSH      ACH        Bonferroni   χ2        Max
p-value    0.035    0.037      0.070        0.182     0.047

NOTE: χ2 and Max are abbreviations for the chi-square joint test and the maximum joint test.

6. Discussion

We emphasize the importance of joint inference for the cause-specific hazard and the cumulative incidence because one quantity alone does not fully characterize the time to a particular type of failure in the presence of competing risks. As illustrated in our simulations and real data examples, the proposed chi-square joint test and maximum joint test can be much more powerful than the Bonferroni method. The increased power implies substantial savings in the number of patients required in a clinical trial. In a sequel, we will develop power analysis methods to determine the required sample size to test a group difference based on the developed joint tests. We also note that the chi-square joint test tends to be more powerful than the maximum joint test when the effects on the two quantities are very different and that the maximum joint test dominates the chi-square joint test when the effects on the two quantities are similar. In practice, we recommend that both joint tests be performed together with the separate tests for the individual quantities as illustrated in our real data examples. The joint regression methods in Section 3 can be extended beyond Cox’s models. For example, the accelerated failure time models can be used to model the cause-specific hazard. Scheike and Zhang (2008) considered other regression models for the subdistribution hazard. Joint inference procedures for these models can be developed similarly. Finally, joint modeling of the cause-specific hazard and the cumulative incidence is nontrivial since the proportional cause-specific hazards model and the proportional subdistribution hazards model are unlikely to hold simultaneously, especially for a time-independent covariate. However, this issue can be resolved by including time by covariate interactions in the regression models. In particular, we presented a joint model with piecewise proportional cause-specific hazards and piecewise proportional subdistribution hazards and discussed how to check if the two models hold simultaneously in Section 4.

Acknowledgments

The authors thank the co-editor, the associate editor, and the two anonymous referees for their valuable comments that helped improve this article significantly.

Funding

Gang Li’s work was partially supported by NIH grant 5P30CA-16042 and NIH grant 8UL1TR000124.

Footnotes

Supplementary Materials

Appendix: Proofs for the theorems and additional simulation results.

Supplementary materials for this article are available online. Please go to www.tandfonline.com/r/JASA.

References

  1. Aly E, Kochar S, and McKeague I (1994), “Some Tests for Comparing Cumulative Incidence Functions and Cause-Specific Hazard Rates,” Journal of the American Statistical Association, 89, 994–999.
  2. Andersen P, Borgan Ø, Gill R, and Keiding N (1982), “Linear Nonparametric Tests for Comparison of Counting Processes, With Applications to Censored Survival Data, Correspondent Paper,” International Statistical Review/Revue Internationale de Statistique, 50, 219–244.
  3. Andersen P, Geskus R, de Witte T, and Putter H (2012), “Competing Risks in Epidemiology: Possibilities and Pitfalls,” International Journal of Epidemiology, 41, 861–870.
  4. Andersen PK (1982), “Testing Goodness of Fit of Cox’s Regression and Life Model,” Biometrics, 38, 67–77.
  5. Arjas E (1988), “A Graphical Method for Assessing Goodness of Fit in Cox’s Proportional Hazards Model,” Journal of the American Statistical Association, 83, 204–212.
  6. Bajorunaite R, and Klein J (2007), “Two-Sample Tests of the Equality of Two Cumulative Incidence Functions,” Computational Statistics & Data Analysis, 51, 4269–4281.
  7. Beyersmann J, Dettenkofer M, Bertz H, and Schumacher M (2007), “A Competing Risks Analysis of Bloodstream Infection After Stem-Cell Transplantation Using Subdistribution Hazards and Cause-Specific Hazards,” Statistics in Medicine, 26, 5360–5369.
  8. Beyersmann J, Latouche A, Buchholz A, and Schumacher M (2009), “Simulating Competing Risks Data in Survival Analysis,” Statistics in Medicine, 28, 956–971.
  9. Breslow N (1970), “A Generalized Kruskal-Wallis Test for Comparing k Samples Subject to Unequal Patterns of Censorship,” Biometrika, 57, 579–594.
  10. Cox D (1972), “Regression Models and Life-Tables,” Journal of the Royal Statistical Society, Series B, 34, 187–220.
  11. Cox D (1975), “Partial Likelihood,” Biometrika, 62, 269–276.
  12. Cox D, and Oakes D (1984), Analysis of Survival Data (Vol. 21), Boca Raton, FL: Chapman & Hall/CRC.
  13. Fine J (1999), “Analysing Competing Risks Data With Transformation Models,” Journal of the Royal Statistical Society, Series B, 61, 817–830.
  14. Fine J (2001), “Regression Modeling of Competing Crude Failure Probabilities,” Biostatistics, 2, 85–97.
  15. Fine JP, and Gray RJ (1999), “A Proportional Hazards Model for the Subdistribution of a Competing Risk,” Journal of the American Statistical Association, 94, 496–509.
  16. Gehan EA (1965), “A Generalized Wilcoxon Test for Comparing Arbitrarily Singly-Censored Samples,” Biometrika, 52, 203–223.
  17. Gerds T, Scheike T, and Andersen P (2012), “Absolute Risk Regression for Competing Risks: Interpretation, Link Functions, and Prediction,” Statistics in Medicine, 31, 3921–3930.
  18. Gichangi A, and Vach W (2005), “The Analysis of Competing Risks Data: A Guided Tour,” Statistics in Medicine.
  19. Grambauer N, Schumacher M, and Beyersmann J (2010), “Proportional Subdistribution Hazards Modeling Offers a Summary Analysis, Even if Misspecified,” Statistics in Medicine, 29, 875–884.
  20. Gray R (1988), “A Class of K-Sample Tests for Comparing the Cumulative Incidence of a Competing Risk,” The Annals of Statistics, 16, 1141–1154.
  21. Haller B, Schmidt G, and Ulm K (2012), “Applying Competing Risks Regression Models: An Overview,” Lifetime Data Analysis, 19, 1–26.
  22. Holt J (1978), “Competing Risk Analyses With Special Reference to Matched Pair Experiments,” Biometrika, 65, 159–165.
  23. Kalbfleisch J, and Prentice RL (1980), The Statistical Analysis of Failure Time Data (Vol. 360), New York: Wiley.
  24. Kaplan E, and Meier P (1958), “Nonparametric Estimation From Incomplete Observations,” Journal of the American Statistical Association, 53, 457–481.
  25. Klein J (2006), “Modelling Competing Risks in Cancer Studies,” Statistics in Medicine, 25, 1015–1034.
  26. Klein J, and Andersen P (2005), “Regression Modeling of Competing Risks Data Based on Pseudovalues of the Cumulative Incidence Function,” Biometrics, 61, 223–229.
  27. Klein JP, and Moeschberger ML (2003), Survival Analysis: Techniques for Censored and Truncated Data, New York: Springer Science & Business Media.
  28. Kulathinal S, and Gasbarra D (2002), “Testing Equality of Cause-Specific Hazard Rates Corresponding to m Competing Risks Among k Groups,” Lifetime Data Analysis, 8, 147–161.
  29. Lagakos S (1978), “A Covariate Model for Partially Censored Data Subject to Competing Causes of Failure,” Applied Statistics, 27, 235–241.
  30. Lagakos S (1981), “The Graphical Evaluation of Explanatory Variables in Proportional Hazard Regression Models,” Biometrika, 68, 93–98.
  31. Lam K (1998), “A Class of Tests for the Equality of k Cause-Specific Hazard Rates in a Competing Risks Model,” Biometrika, 85, 179–188.
  32. Larson M (1984), “Covariate Analysis of Competing-Risks Data With Log-Linear Models,” Biometrics, 40, 459–469.
  33. Latouche A, Boisson V, Chevret S, and Porcher R (2007), “Misspecified Regression Model for the Subdistribution Hazard of a Competing Risk,” Statistics in Medicine, 26, 965–974.
  34. Lindkvist H, and Belyaev Y (1998), “A Class of Non-Parametric Tests in the Competing Risks Model for Comparing Two Samples,” Scandinavian Journal of Statistics, 25, 143–150.
  35. Lunn M, and McNeil D (1995), “Applying Cox Regression to Competing Risks,” Biometrics, 51, 524–532.
  36. Luo X, and Turnbull B (1999), “Comparing Two Treatments With Multiple Competing Risks Endpoints,” Statistica Sinica, 9, 985–998.
  37. Moreau T, O’Quigley J, and Mesbah M (1985), “A Global Goodness-of-Fit Statistic for the Proportional Hazards Model,” Applied Statistics, 34, 212–218.
  38. Nagelkerke N, Oosting J, and Hart A (1984), “A Simple Test for Goodness of Fit of Cox’s Proportional Hazards Model,” Biometrics, 40, 483–486.
  39. Pepe M, and Mori M (1993), “Kaplan–Meier, Marginal or Conditional Probability Curves in Summarizing Competing Risks Failure Time Data?” Statistics in Medicine, 12, 737–751.
  40. Peto R, and Peto J (1972), “Asymptotically Efficient Rank Invariant Test Procedures,” Journal of the Royal Statistical Society, Series A, 135, 185–207.
  41. Pintilie M (2006), Competing Risks: A Practical Perspective, New York: Wiley.
  42. Prentice R, Kalbfleisch J, Peterson A Jr, Flournoy N, Farewell V, and Breslow N (1978), “The Analysis of Failure Times in the Presence of Competing Risks,” Biometrics, 34, 541–554.
  43. Putter H, Fiocco M, and Geskus R (2007), “Tutorial in Biostatistics: Competing Risks and Multi-State Models,” Statistics in Medicine, 26, 2389–2430.
  44. Sancho A, Ávila A, Gavela E, Beltrán S, Fernández-Nájera J, Molina P, Crespo J, and Pallardó L (2007), “Effect of Overweight on Kidney Transplantation Outcome,” in Transplantation Proceedings (Vol. 39), Orlando, FL: Grune & Stratton, pp. 2202–2204.
  45. Scheike T, and Zhang M (2008), “Flexible Competing Risks Regression Modeling and Goodness-of-Fit,” Lifetime Data Analysis, 14, 464–483.
  46. Scheike T, and Zhang M (2011), “Analyzing Competing Risk Data Using the R timereg Package,” Journal of Statistical Software, 38.
  47. Schoenfeld D (1980), “Chi-Squared Goodness-of-Fit Tests for the Proportional Hazards Regression Model,” Biometrika, 67, 145–153.
  48. Schoenfeld D (1982), “Partial Residuals for the Proportional Hazards Regression Model,” Biometrika, 69, 239–241.
  49. Sun Y, and Tiwari R (1995), “Comparing Cause-Specific Hazard Rates of a Competing Risks Model With Censored Data,” Lecture Notes-Monograph Series, 27, 255–270.
  50. Tiwari R, Kulasekera K, and Park C (2006), “Nonparametric Tests for Cause Specific Hazard Rates With Censored Data for Competing Risks Among Several Groups,” Journal of Statistical Planning and Inference, 136, 1718–1745.
  51. Tsiatis A (1975), “A Nonidentifiability Aspect of the Problem of Competing Risks,” Proceedings of the National Academy of Sciences, 72, 20–22.
