Bayesian Inference of the Fully Specified Subdistribution Model for Survival Data with Competing Risks

Miaomiao Ge; Ming-Hui Chen

doi:10.1007/s10985-012-9221-9

Author manuscript; available in PMC: 2013 Jul 1.

Published in final edited form as: Lifetime Data Anal. 2012 Apr 8;18(3):339–363. doi: 10.1007/s10985-012-9221-9

PMCID: PMC3374158 NIHMSID: NIHMS376754 PMID: 22484596

Abstract

Competing risks data are routinely encountered in various medical applications due to the fact that patients may die from different causes. Recently, several models have been proposed for fitting such survival data. In this paper, we develop a fully specified subdistribution model for survival data in the presence of competing risks via a subdistribution model for the primary cause of death and conditional distributions for other causes of death. Various properties of this fully specified subdistribution model have been examined. An efficient Gibbs sampling algorithm via latent variables is developed to carry out posterior computations. Deviance Information Criterion (DIC) and Logarithm of the Pseudomarginal Likelihood (LPML) are used for model comparison. An extensive simulation study is carried out to examine the performance of DIC and LPML in comparing the cause-specific hazards model, the mixture model, and the fully specified subdistribution model. The proposed methodology is applied to analyze a real dataset from a prostate cancer study in detail.

Keywords: Latent variables, Markov chain Monte Carlo, Partial likelihood, Proportional hazards

1 Introduction

Competing risks data are frequently encountered in various medical applications because patients may die from different causes. Studies on this topic have been active and productive. Gail (1975) proposed a multivariate model of failure times due to different causes. Tsiatis (1975) showed that for any joint distribution of n failure times there exists a joint distribution of n independent failure times such that the marginal cause-specific cumulative incidence functions from the two joint distributions coincide, which implies that the correlations between the failure times are not identifiable in the multivariate failure time model. Prentice et al. (1978) introduced a cause-specific hazards model. Larson and Dinse (1985) established a mixture model with hazard functions conditional on failure from a specific cause. Fine and Gray (1999) discussed the subdistribution model with the proportional hazards assumption to assess covariate effects on the cumulative incidence function of the cause of interest. Recently, Fan (2008) introduced a Bayesian nonparametric methodology based on the full likelihood for the proportional subdistribution hazards model. Elashoff et al. (2007, 2008) jointly modeled the longitudinal measurements and survival data with competing risks, where they extended respectively the cause-specific hazards model and the mixture model for survival data, and used latent random variables to link together the sub-models for longitudinal measurements and survival data.

The Bayesian literature on competing risks analysis is still sparse. Fan (2008) developed Bayesian methods by extending the subdistribution model of Fine and Gray (1999) for each cause-specific risk. More recently, Hu et al. (2009) and Huang et al. (2011) developed Bayesian methods for a joint analysis of longitudinal measurements and survival data with competing risks, in which cause-specific hazards sub-models were considered for modeling survival times. As pointed out in Fine and Gray (1999), one of the nice properties of the subdistribution model is that the effect of a covariate on the marginal probability function can be directly assessed. However, the subdistribution model proposed by Fine and Gray (1999) cannot be compared to two other established models as the competing risks for other causes are not specified in their model. For this reason, we develop a fully specified subdistribution model with a subdistribution hazard for the primary cause of death and conditional hazards for other causes of death. Under this fully specified subdistribution model, we are able to establish a theoretical connection between the partial likelihood of Fine and Gray (1999) and the one under the fully specified subdistribution model for the cause of primary interest when all failure times are observed. We note that this connection may not be established under the models discussed in Fan (2008). With this new development, formal model comparisons between the fully specified subdistribution model and two other established models, namely, the cause-specific hazards model (Prentice et al., 1978) and the mixture model (Larson and Dinse, 1985), can be carried out via the Bayesian Deviance Information Criterion (DIC) and the Logarithm of the Pseudomarginal Likelihood (LPML). Furthermore, the fully specified subdistribution model also facilitates an efficient implementation of the Gibbs sampling algorithm.

The rest of the article is organized as follows. In Section 2, we present a detailed development of the fully specified subdistribution model and examine its various properties. The prior and posterior are discussed and an efficient Gibbs sampling algorithm via a set of latent variables is developed in Section 3. In Section 4, we briefly review the cause-specific hazards model (Prentice et al., 1978) and the mixture model (Larson and Dinse, 1985), and provide necessary mathematical formulations for DIC and LPML under these two models and the fully specified subdistribution model. In Section 5, we present the design of a simulation study and the simulation algorithms for generating the data under the three competing risks models. To the best of our knowledge, these three competing risks models have never been formally compared. In Section 6, we analyze a real dataset from a prostate cancer study in detail. We conclude the paper with a brief discussion and some extensions of the proposed model in Section 7.

2 Subdistribution Based Models for Competing Risks

2.1 Preliminary

We consider two competing risks throughout the paper and the extension to more than two competing risks will be discussed in Section 7. Let T_j be the time to failure due to cause j for j = 1, 2 and δ be the index of cause of death. Also let T = min{T₁, T₂}. Assume cause 1 is the cause of primary interest. The subdistribution hazard for cause 1 defined in Gray (1988) is given as follows:

h_{1} (t) = lim_{Δ t \to 0} {\frac{Pr (t \leq T \leq t + Δ t, δ = 1 | T \geq t \cup (T \leq t \cap δ \neq 1))}{Δ t}} = \frac{\partial F_{1} (t) / \partial t}{1 - F_{1} (t)},

(2.1)

where F₁(t) = Pr(T ≤ t, δ = 1). As discussed in Fine and Gray (1999), a regression model for (2.1) under the proportional hazards assumption takes h₁(t|x) = h₁₀(t) exp(x′β₁), so that $F_{1} (t | x) = 1 - exp {- \int_{0}^{t} h_{10} (u) exp (x' β_{1}) du}$ , where x is a vector of covariates and β₁ is a vector of the corresponding regression coefficients. As pointed out in Fine and Gray (1999), the covariate effects can be directly assessed on the cumulative incidence function for the primary cause under the subdistribution model. However, the distributions for failure times due to other causes are never specified in Fine and Gray (1999).

2.2 A Fully Specified Subdistribution Model for Two Competing Risks

Let $T_{j}^{*} = T_{j} \times I (δ = j) + \infty \times I (δ \neq j)$ , j = 1, 2, where we define ∞ × 0 = 0. Write $T^{*} = min {T_{1}^{*}, T_{2}^{*}}$ as the observed time to failure. We propose the cause-specific cumulative incidence functions for both causes as follows:

\begin{matrix} F_{1} (t) = Pr (T^{*} \leq t, δ = 1) = Pr (T_{1} \leq t, δ = 1), \\ F_{2} (t) = Pr (T^{*} \leq t, δ = 2) = M_{2} (t) Pr (δ = 2), \end{matrix}

(2.2)

where M₂(t) is the cumulative incidence function conditional on cause 2. The fact that $lim_{t \to \infty} F_{j} (t) = Pr (δ = j) < 1$ implies that F_j(t) is improper. Yet M₂(t) is proper due to $lim_{t \to \infty} M_{2} (t) = lim_{t \to \infty} [F_{2} (t) / Pr (δ = 2)] = 1$ . Note that in (2.2), we do not directly model the correlation between T₁ and T₂, which is not identifiable as shown in Tsiatis (1975). Instead, F₁(t) and F₂(t) are related to each other via Pr(δ = 2) = 1 − Pr(δ = 1) = 1 − F₁(∞).

For cause 1, we adopt the subdistribution hazard of Fine and Gray (1999), $h_{1} (t) = \frac{\partial F_{1} (t)}{\partial t} \times \frac{1}{1 - F_{1} (t)}$ . The cumulative subdistribution hazard $H_{1} (t) = \int_{0}^{t} h_{1} (u) du$ is then improper because $lim_{t \to \infty} H_{1} (t) = - log [1 - lim_{t \to \infty} F_{1} (t)] < \infty$ . We specify a proportional hazards model with an improper baseline hazard function for F₁(t|x) as

F_{1} (t | x) = 1 - exp {- H_{10} (t) exp (x' β_{1})} .

(2.3)

For cause 2, we propose a proportional hazards model for M₂(t|x) as

M_{2} (t | x) = 1 - exp {- H_{20} (t) exp (x' β_{2})} .

(2.4)

The model defined by (2.3) and (2.4) is thus called the fully specified subdistribution (FS) model. Under the FS model, Pr(δ = 2|x) = 1 − Pr(δ = 1|x) = exp{−H₁₀(∞) exp(x′β₁)}.

Assume there are n observations with the vector of observed time t = (t₁, t₂, … , t_n)′, the matrix of covariates X = (x₁, x₂, … , x_n)′, and the vector of cause indicator δ = (δ₁, δ₂, … , δ_n)′, where δ_i takes possible values of 0, 1, and 2, corresponding to “censored”, “died due to cause 1”, and “died due to cause 2” for the i^th subject, respectively. Under the model defined in (2.3) and (2.4), the likelihood function is given by

L (β_{1}, β_{2}, h_{10}, h_{20} | t, X, δ) = \prod_{i = 1}^{n} {[h_{10} (t_{i}) exp (x_{i}^{'} β_{1}) exp {- H_{10} (t_{i}) exp (x_{i}^{'} β_{1})}]}^{I (δ_{i} = 1)} \times {[h_{20} (t_{i}) exp (x_{i}^{'} β_{2}) exp {- H_{20} (t_{i}) exp (x_{i}^{'} β_{2}) - H_{10} (\infty) exp (x_{i}^{'} β_{1})}]}^{I (δ_{i} = 2)} \times {[exp {- H_{10} (t_{i}) exp (x_{i}^{'} β_{1})} - (1 - exp {- H_{20} (t_{i}) exp (x_{i}^{'} β_{2})}) \times exp {- H_{10} (\infty) exp (x_{i}^{'} β_{1})}]}^{I (δ_{i} = 0)} .

(2.5)
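The three factors of (2.5) can be evaluated directly once the baseline functions are available. The following is a minimal sketch, not the authors' implementation: the function name `fs_loglik`, the argument layout, and the toy baselines h₁₀(t) = A·exp(−t) and h₂₀(t) = B are assumptions made purely for illustration. They are chosen so that H₁₀(∞) = A is finite (improper) while H₂₀(∞) = ∞ (proper).

```python
import math

def fs_loglik(times, causes, lin1, lin2, h10, H10, H10_inf, h20, H20):
    """Log of the FS likelihood (2.5).

    lin1[i] = x_i' beta_1 and lin2[i] = x_i' beta_2;
    causes[i] is 0 (censored), 1 (cause 1), or 2 (cause 2).
    """
    total = 0.0
    for t, d, e1, e2 in zip(times, causes, lin1, lin2):
        r1, r2 = math.exp(e1), math.exp(e2)
        if d == 1:    # failure from cause 1
            total += math.log(h10(t)) + e1 - H10(t) * r1
        elif d == 2:  # failure from cause 2
            total += math.log(h20(t)) + e2 - H20(t) * r2 - H10_inf * r1
        else:         # censored: 1 - F_1(t|x) - F_2(t|x)
            surv = (math.exp(-H10(t) * r1)
                    - (1.0 - math.exp(-H20(t) * r2)) * math.exp(-H10_inf * r1))
            total += math.log(surv)
    return total

# Hypothetical toy baselines (assumptions for illustration only).
A, B = 0.5, 0.2
h10 = lambda t: A * math.exp(-t)          # improper: H10(inf) = A
H10 = lambda t: A * (1.0 - math.exp(-t))
h20 = lambda t: B                         # proper: H20(inf) = inf
H20 = lambda t: B * t

example = fs_loglik([1.0, 2.0, 3.0], [1, 2, 0],
                    [0.1, -0.2, 0.0], [0.3, 0.0, -0.1],
                    h10, H10, A, h20, H20)
```

Passing the baseline functions as arguments keeps the sketch independent of any particular parametrization of h₁₀ and h₂₀.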

2.3 Justification of Fine and Gray’s Partial Likelihood

The FS model is not only a natural expansion of the subdistribution model of Fine and Gray (1999) but also provides novel justifications of Fine and Gray’s partial likelihood under certain conditions. Assume there are n complete observations with the vector of observed time t = (t₁, t₂, … , t_n)′, the matrix of covariates X = (x₁, x₂, … , x_n)′, and the vector of cause indicator δ = (δ₁, δ₂, … , δ_n)′. The partial likelihood of β₁ for cause 1 given in Fine and Gray (1999) is of the form

L_{p} (β_{1} | t, X, δ) = \prod_{i = 1}^{n} {[\frac{exp (x_{i}^{'} β_{1})}{\sum_{j \in R_{i}^{*}} exp {x_{j}^{'} β_{1}}}]}^{I (δ_{i} = 1)},

(2.6)

where $R_{i}^{*}$ is defined as a special risk set at failure time t_i given by

R_{i}^{*} = {j : (t_{j} \geq t_{i}) \cup (t_{j} \leq t_{i} \cap δ_{j} \neq 1)} .

(2.7)

Note that the risk set $R_{i}^{*}$ is quite different from the risk set in Cox’s partial likelihood (Cox, 1972, 1975) as the patients who died from cause 2 before t_i are also included in $R_{i}^{*}$ .
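To make the risk set (2.7) concrete, the sketch below builds $R_{i}^{*}$ for complete data and evaluates the logarithm of (2.6). It is an illustrative sketch only; the function names `fg_risk_set` and `fg_log_partial_lik` and the small dataset are hypothetical, and ties and censoring are not handled.

```python
import math

def fg_risk_set(i, times, causes):
    """Indices j in R_i^* of (2.7): still at risk at t_i, or already
    failed from a cause other than cause 1."""
    t_i = times[i]
    return [j for j in range(len(times))
            if times[j] >= t_i or (times[j] <= t_i and causes[j] != 1)]

def fg_log_partial_lik(times, causes, lin1):
    """Log of the partial likelihood (2.6); lin1[i] = x_i' beta_1."""
    ll = 0.0
    for i, d in enumerate(causes):
        if d == 1:
            denom = sum(math.exp(lin1[j]) for j in fg_risk_set(i, times, causes))
            ll += lin1[i] - math.log(denom)
    return ll

# Hypothetical complete data: subject 1 died from cause 2 at time 1.0
# and therefore stays in every later cause-1 risk set.
times = [2.0, 1.0, 3.0, 0.5]
causes = [1, 2, 1, 1]
risk_at_t0 = fg_risk_set(0, times, causes)
```

In this example the subject who died from cause 2 at time 1.0 remains in the risk set at time 2.0, whereas a standard Cox risk set would have dropped that subject.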

Three theorems are established below to show that the partial likelihood (2.6) can be obtained under the FS regression model via three different approaches with detailed proofs given in Appendix B. Denote D₁ as the number of deaths due to cause 1. Let y_i = t_i when δ_i = 1, and y_i = ∞ when δ_i ≠ 1. Write y = (y₍₁₎, y₍₂₎, … , y_(n))′, where 0 = y₍₀₎ < y₍₁₎ < ⋯ < y_(D₁) < y_(D₁+1) = ⋯ = y_(n) = ∞. Since all observations are failure times, the likelihood function of β₁ for cause 1 given the n complete observations is the part of the likelihood function in (2.5) involving β₁:

L (β_{1}, h_{10} | y, X, δ) = \prod_{i = 1}^{n} [h_{10} (y_{(i)}) exp (x_{i}^{'} β_{1})] \times exp {- H_{10} (y_{(i)}) exp (x_{i}^{'} β_{1})} .

(2.8)

Theorem 1 With n complete observations, assume that in the FS model the baseline hazard rate h₁₀ is zero after the last failure time due to cause 1. The partial likelihood function (2.6) can be attained by the profile likelihood approach, which is to plug in the profile maximum likelihood estimator of h₁₀ in the likelihood function L(β₁, h₁₀|y, X, δ).

Theorem 2 With n complete observations, assume that in the FS model the baseline hazard rate h₁₀(t) is zero after the last failure time due to cause 1 and the prior of h₁₀(t) is degenerate at 0 everywhere except at y_i’s when δ_i = 1. Let h₁₀(y_i) = λ_i when δ_i = 1 and λ = (λ₁, … , λ_D₁)′. We further assume independent Jeffreys-type priors for the λ_i’s, i.e., $π (λ) \propto \prod_{i = 1}^{D_{1}} 1 / λ_{i}$ . Then, the partial likelihood function (2.6) is obtained by

L_{p} (β_{1} | y, X, δ) \propto \int L (β_{1}, h_{10} | y, X, δ) π (λ) d λ_{1} \dots d λ_{D_{1}},

where L(β₁, h₁₀|y, X, δ) is defined in (2.8).

Theorem 3 With n complete observations, assume that in the FS model the baseline hazard rate h₁₀(t) is zero after the last failure time due to cause 1 and that H₁₀(t) has a Gamma process prior, i.e., h_1i ~ Gamma(c₀h_0i, c₀), where c₀ > 0, h_1i = H₁₀(y_(i)) − H₁₀(y_(i−1)) for i = 1, 2, … , D₁, h_{1, D₁+1} = ⋯ = h_1n = 0, h_0i = H₀(y_(i)) − H₀(y_(i−1)), H₀(y) is increasing and differentiable at y₁, … , y_D₁ with H₀(0) = 0, and the h_1i are independent of each other. Then the partial likelihood function (2.6) can be approximated by

L_{p} (β_{1} | y, X, δ) \approx lim_{c_{0} ↓ 0} g (c_{0}) E^{h_{10}} [L (β_{1}, h_{10} | y, X, δ)],

where g(c₀) is a function of c₀, which is free from β₁.

Fine and Gray (1999) also showed that the partial likelihood arises from complete data using a certain reduced data structure, without any assumptions on the models for the subdistribution for other causes. The results established in the above theorems give insight into Fine and Gray’s partial likelihood.

3 Prior, Posterior, and Computational Development

3.1 Prior and Posterior

For the sake of simpler calculation, a special case of the gamma process prior for the cumulative baseline hazard function assumed in Theorem 3 is considered here for the FS model. Assume the baseline hazard functions respectively have piecewise constant forms, which are, with K_j + 1 partitions of the time axis, 0 = s_j0 < s_j1 < s_j2 < ⋯ < s_{jK_j} < ∞,

h_{10} (t) = \begin{cases} λ_{1 k}, & s_{1, k - 1} < t \leq s_{1 k}, k = 1, 2, \dots, K_{1}, \\ λ_{1, K_{1} + 1} exp {- (t - s_{1 K_{1}})}, & t > s_{1 K_{1}}; \end{cases} \qquad h_{20} (t) = \begin{cases} λ_{2 k}, & s_{2, k - 1} < t \leq s_{2 k}, k = 1, 2, \dots, K_{2}, \\ λ_{2 K_{2}}, & t > s_{2 K_{2}} . \end{cases}

(3.1)
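Under (3.1) the cumulative baseline hazards have closed forms: H₁₀ accumulates λ_1k over the finite intervals and then adds λ_1,K₁+1 [1 − exp{−(t − s_1K₁)}], so H₁₀(∞) is finite, whereas H₂₀ keeps growing linearly at rate λ_2K₂ beyond s_2K₂. A minimal sketch follows; the function names and the example cut points are assumptions introduced here for illustration.

```python
import math

def _H_const(t, cuts, lam):
    """Integral over (0, min(t, s_K)] of a piecewise-constant hazard with
    cut points cuts = [s_1, ..., s_K] and rates lam = [lam_1, ..., lam_K]."""
    total, prev = 0.0, 0.0
    for s, rate in zip(cuts, lam):
        if t <= prev:
            break
        total += rate * (min(t, s) - prev)
        prev = s
    return total

def H10(t, cuts1, lam1):
    """Cause-1 cumulative baseline hazard; lam1 has K_1 + 1 entries,
    the last being the rate of the exponentially decaying tail."""
    K = len(cuts1)
    H = _H_const(t, cuts1, lam1[:K])
    if t > cuts1[-1]:
        H += lam1[K] * (1.0 - math.exp(-(t - cuts1[-1])))
    return H

def H10_inf(cuts1, lam1):
    """Finite limit of H10(t) as t -> infinity."""
    return _H_const(cuts1[-1], cuts1, lam1[:-1]) + lam1[-1]

def H20(t, cuts2, lam2):
    """Cause-2 cumulative baseline hazard; the last rate continues forever."""
    H = _H_const(t, cuts2, lam2)
    if t > cuts2[-1]:
        H += lam2[-1] * (t - cuts2[-1])
    return H

def prob_cause2(lin1, cuts1, lam1):
    """Pr(delta = 2 | x) = exp{-H10(inf) exp(x' beta_1)} under the FS model."""
    return math.exp(-H10_inf(cuts1, lam1) * math.exp(lin1))

# Hypothetical partition and rates (illustrative values only).
cuts1, lam1 = [1.0, 2.0], [0.1, 0.2, 0.3]
cuts2, lam2 = [1.0, 2.0], [0.1, 0.2]
```

The exponentially decaying tail is what makes the baseline for cause 1 improper while keeping every quantity in closed form.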

To construct posterior distributions for the unknown parameters, we assume β_j follows an improper uniform prior, λ_jk follows a Jeffreys-type prior, and λ_1,K₁+1 follows a gamma prior. We further assume that β_j, λ_jk, and λ_1,K₁+1 are independent for k = 1, … , K_j and j = 1, 2. Let λ₁ = (λ₁₁, λ₁₂, … , λ_1,K₁+1)′ and λ₂ = (λ₂₁, λ₂₂, … , λ_2K₂)′, then the joint prior of (β₁, β₂, λ₁, λ₂) is specified as follows

π (β_{1}, β_{2}, λ_{1}, λ_{2}) = π (β_{1}) π (β_{2}) π (λ_{1}) π (λ_{2}) \propto [\prod_{j = 1}^{2} \prod_{k = 1}^{K_{j}} \frac{1}{λ_{jk}}] π (λ_{1, K_{1} + 1}),

(3.2)

where $π (λ_{1, K_{1} + 1}) \propto λ_{1, K_{1} + 1}^{a - 1} exp (- b λ_{1, K_{1} + 1})$ with a > 0 and b > 0, which are prespecified hyperparameters. The joint posterior distribution is given by

π (β_{1}, β_{2}, λ_{1}, λ_{2} | t, X, δ) \propto π^{*} (β_{1}, β_{2}, λ_{1}, λ_{2} | t, X, δ) \equiv L (β_{1}, β_{2}, λ_{1}, λ_{2} | t, X, δ) π (β_{1}, β_{2}, λ_{1}, λ_{2}),

(3.3)

where L(β₁, β₂, λ₁, λ₂|t, X, δ) is given by (2.5) with h₁₀(t) and h₂₀(t) defined in (3.1). Recently, Wang et al. (2012) established a theoretical connection between the gamma process prior specified for the cumulative baseline function and the independent gamma priors for the λ_jk for the interval-censored survival data. Also, the independent gamma priors assumed for the baseline hazard function approximate the gamma process prior when c₀ → 0+. Thus, the priors in (3.2) for the λ_jk can be considered as a special case of the gamma process priors specified for the cumulative baseline function in this sense.

Let ν_jik = 1 if the i^th subject failed or was censored in the k^th interval (s_j,k−1, s_jk], and 0 otherwise for k = 1, 2, … , K_j +1, and i = 1, 2, … , n, where s_{j,K_j+1} = ∞, for j = 1, 2. Also let X_j be a matrix with its i^th row equal to $I (δ_{i} = j) (ν_{ji 1}, \dots, ν_{ji K_{j}}, x_{i}^{'})$ for j = 1, 2. Then, we are led to the following theorem regarding the propriety of the posterior distribution of (β₁, β₂, λ₁, λ₂) with an improper prior given by (3.2).

Theorem 4 Assume that (i) when δ_i > 0, t_i > 0 and $\sum_{i = 1}^{n} I (δ_{i} = j) ν_{jik} > 0$ for k = 1, 2, … , K_j and j = 1, 2, and (ii) X₁ and X₂ are of full rank. Then, the posterior distribution π(β₁, β₂, λ₁, λ₂|t, X, δ) in (3.3) with the prior specified in (3.2) is proper.

A proof of Theorem 4 is given in Appendix B. Theorem 4 gives very mild conditions for ensuring propriety of the joint posterior distribution of (β₁, β₂, λ₁, λ₂) under the fully specified subdistribution model. The conditions (i) and (ii) essentially require that all event times are strictly positive, at least one event occurs in each chosen interval (s_j,k−1, s_jk], and the corresponding covariate matrix is of full rank. Notice that we do not require any events for the last interval (s_1K₁, ∞) for the primary cause as we specify a proper prior for λ_1,K₁+1. These conditions are easily satisfied in most applications and are easy to check. Under certain additional conditions, we can also show that when π(λ_1,K₁+1) ∝ 1 in (3.2), the resulting posterior of (β₁, β₂, λ₁, λ₂) is still proper. We also note that, following Chen et al. (2006), posterior propriety under full gamma process priors on H₁₀ and H₂₀ with c₀ > 0 can be established; however, stronger propriety conditions are required in this case.

3.2 Computational Development

Due to the complexity of the likelihood structure of the FS model, an analytical evaluation of the posterior distribution does not appear to be possible. In order to carry out posterior inference, we adopt Markov chain Monte Carlo (MCMC) methods and develop a computationally efficient Gibbs sampling algorithm to sample from the posterior distribution in (3.3).

In order to avoid the complicated form in the censored part of the likelihood function, we introduce a latent variable η_i to indicate whether or not subject i would eventually fail from cause 1 or 2 and another latent variable u_i to be the failure time such that u_i ≥ t_i when subject i was censored at t_i and η_i = 1. Then the complete likelihood is constructed as

L (β_{1}, β_{2}, h_{10}, h_{20} | t, X, δ, η, u) = \prod_{i = 1}^{n} {[h_{10} (t_{i}) exp (x_{i}^{'} β_{1}) exp {- H_{10} (t_{i}) exp (x_{i}^{'} β_{1})}]}^{I (δ_{i} = 1)} \times {[h_{20} (t_{i}) exp (x_{i}^{'} β_{2}) exp {- H_{20} (t_{i}) exp (x_{i}^{'} β_{2}) - H_{10} (\infty) exp (x_{i}^{'} β_{1})}]}^{I (δ_{i} = 2)} \times {[h_{10} (u_{i}) exp (x_{i}^{'} β_{1}) exp {- H_{10} (u_{i}) exp (x_{i}^{'} β_{1})}]}^{I (δ_{i} = 0, η_{i} = 1)} \times {[exp {- H_{20} (t_{i}) exp (x_{i}^{'} β_{2}) - H_{10} (\infty) exp (x_{i}^{'} β_{1})}]}^{I (δ_{i} = 0, η_{i} = 2)},

(3.4)

where η = (η_i : δ_i = 0, 1 ≤ i ≤ n) and u = (u_i : δ_i = 0, η_i = 1, 1 ≤ i ≤ n). Based on the complete data likelihood, the augmented posterior of (β₁, β₂, λ₁, λ₂, η, u) is given by

π (β_{1}, β_{2}, λ_{1}, λ_{2}, η, u | t, X, δ) \propto L (β_{1}, β_{2}, h_{10}, h_{20} | t, X, δ, η, u) π (β_{1}, β_{2}, λ_{1}, λ_{2}),

(3.5)

where h₁₀(t) and h₂₀(t) are defined in (3.1). It is easy to show that

\sum_{η} \int π (β_{1}, β_{2}, λ_{1}, λ_{2}, η, u | t, X, δ) d u = π (β_{1}, β_{2}, λ_{1}, λ_{2} | t, X, δ),

where π(β₁, β₂, λ₁, λ₂|t, X, δ) is the posterior given in (3.3). This result ensures that whenever (β₁, β₂, λ₁, λ₂, η, u) ~ π(β₁, β₂, λ₁, λ₂, η, u|t, X, δ), then (β₁, β₂, λ₁, λ₂) ~ π(β₁, β₂, λ₁, λ₂|t, X, δ).

The introduction of the latent variables η and u greatly facilitates a convenient implementation of the Gibbs sampling algorithm. To develop an efficient Gibbs sampling algorithm, we use the collapsed Gibbs method of Liu (1994). First, we group (β₂, λ₂, η, u) together. Then, the Gibbs sampling algorithm requires sampling from the following conditional posterior distributions in turn: (i) [β₁|λ₁, β₂, λ₂, η, u, t, X, δ]; (ii) [λ₁|β₁, β₂, λ₂, η, u, t, X, δ]; and (iii) [β₂, λ₂, η, u|β₁, λ₁, t, X, δ]. For (i), the conditional posterior density of β₁ given (λ₁, β₂, λ₂, η, u, t, X, δ) is log-concave in each component of β₁. Thus, we can use the adaptive rejection algorithm of Gilks and Wild (1992) to sample β₁. For (ii), it can be shown that given (β₁, β₂, λ₂, η, u, t, X, δ), the λ_1k’s are conditionally independent and each of them follows a gamma distribution. Thus, sampling λ₁ is straightforward. For (iii), it is easy to see that

[β_{2}, λ_{2}, η, u | β_{1}, λ_{1}, t, X, δ] = [β_{2}, λ_{2}, η | β_{1}, λ_{1}, t, X, δ] [u | β_{1}, λ_{1}, λ_{2}, η, t, X, δ] .

(3.6)

In (3.6), we collapse out u in the conditional distribution [β₂, λ₂, η|β₁, λ₁, t, X, δ]. However, jointly sampling (β₂, λ₂, η) from their conditional distribution is not possible. Thus, we need to run a sub-Gibbs sampling algorithm to sample (β₂, λ₂, η) from this conditional posterior distribution. The approach is called the modified collapsed Gibbs sampling algorithm. As shown in Chen et al. (2000), the modified collapsed Gibbs sampling algorithm yields the target posterior as its stationary distribution. The sub-Gibbs sampling algorithm requires sampling from the following three additional conditional posterior distributions in turn: (iiia) [β₂|β₁, λ₁, λ₂, η, t, X, δ]; (iiib) [λ₂|β₁, λ₁, β₂, η, t, X, δ]; and (iiic) [η|β₁, λ₁, β₂, λ₂, t, X, δ]. For (iiia), the conditional posterior density of β₂ is log-concave in each component of β₂ and we again use the adaptive rejection algorithm of Gilks and Wild (1992) to sample β₂. For (iiib), the λ_2k’s are conditionally independent and each of them follows a gamma distribution. Finally, for (iiic), the η_i’s are conditionally independent and each η_i follows a Bernoulli distribution. The technical details of sampling η from [η|β₁, λ₁, β₂, λ₂, t, X, δ] and sampling u from [u|β₁, λ₁, λ₂, η, t, X, δ] are given in Appendix C.
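The Bernoulli step (iiic) can be sketched from (3.4) alone. Integrating u_i over (t_i, ∞) in the factor for δ_i = 0, η_i = 1 gives exp{−H₁₀(t_i) exp(x_i′β₁)} − exp{−H₁₀(∞) exp(x_i′β₁)}, and the factor for η_i = 2 is exp{−H₂₀(t_i) exp(x_i′β₂) − H₁₀(∞) exp(x_i′β₁)}. Normalizing these two weights yields the probability that a censored subject would eventually fail from cause 1. The code below is a sketch under that derivation, not the procedure of Appendix C (which is not reproduced here); the function names are hypothetical.

```python
import math
import random

def eta_weights(H10_t, H10_inf, H20_t, lin1, lin2):
    """Unnormalized weights of eta_i = 1 and eta_i = 2 for a subject
    censored at t_i, obtained from (3.4) with u_i integrated out."""
    r1, r2 = math.exp(lin1), math.exp(lin2)
    w1 = math.exp(-H10_t * r1) - math.exp(-H10_inf * r1)
    w2 = math.exp(-H20_t * r2 - H10_inf * r1)
    return w1, w2

def prob_eta_cause1(H10_t, H10_inf, H20_t, lin1, lin2):
    """Conditional probability that eta_i = 1 given the parameters."""
    w1, w2 = eta_weights(H10_t, H10_inf, H20_t, lin1, lin2)
    return w1 / (w1 + w2)

def draw_eta(p1, rng):
    """Draw eta_i in {1, 2} with Pr(eta_i = 1) = p1."""
    return 1 if rng.random() < p1 else 2

rng = random.Random(1)
p1 = prob_eta_cause1(H10_t=0.2, H10_inf=0.6, H20_t=0.5, lin1=0.1, lin2=-0.3)
eta = draw_eta(p1, rng)
```

The two weights sum to the censored-data factor of (2.5), which is consistent with the marginalization identity stated after (3.5).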

4 Model Comparison

The cause-specific hazards model and the mixture model are two well-established models for competing risks survival data. We discuss details of these two models and compare them with the FS model theoretically, in a simulation study, and in an analysis of a real dataset.

4.1 Other Models for Comparison

Cause-specific Hazards Model

As discussed in Gaynor et al. (1993), the cause-specific hazard function is denoted by

h_{Cj} (t) = lim_{Δ t \to 0} {\frac{Pr (t \leq T < t + Δ t, δ = j | T \geq t)}{Δ t}}, j = 1, 2 .

Under the proportional hazards assumption, h_Cj(t|x) = h_Cj0(t) exp(x′β_j) and the cumulative incidence function of cause j is given by

F_{j} (t | x) = Pr (T \leq t, δ = j | x) = \int_{0}^{t} h_{Cj 0} (u) exp (x' β_{j}) exp {- \sum_{l = 1}^{2} H_{Cl 0} (u) exp (x' β_{l})} du .

The covariate effects can be directly assessed on the cause-specific hazard functions. However, the covariate effects on the cumulative incidence function of cause j cannot be assessed through β_j alone; for example, the cumulative incidence function of cause 1 also depends on the regression coefficients β₂ for cause 2.

Mixture Model

Larson and Dinse (1985) discussed the mixture model. Assume the types of cause-specific failures follow a multinomial distribution. Define the probability of failing from cause j as p_j = Pr(δ = j) for j = 1, 2, where p₁ + p₂ = 1. Define h_Mj(t) as the hazard function conditional on failure from cause j,

h_{Mj} (t) = lim_{Δ t \to 0} {\frac{Pr (t \leq T < t + Δ t | δ = j, T \geq t)}{Δ t}} .

Under the proportional hazards assumption, h_Mj(t|x) = h_Mj0(t) exp(x′β_j) and the cause-specific cumulative incidence function is given by

F_{j} (t | x) = Pr (T \leq t, δ = j | x) = p_{j} (1 - exp {- H_{Mj 0} (t) exp (x' β_{j})}) .

It is observed that both cause-specific hazard functions and cause-specific cumulative incidence functions depend on regression coefficients of the corresponding cause as well as the probability of failing from that cause.

Notice that the definition of the subdistribution hazard is different from the definition of the cause-specific hazard function and from the definition of the conditional hazard function in the mixture model. Thus, if the proportional structure on the hazard function of one model is true, the hazard functions of the other two models cannot satisfy the Cox proportional hazards assumption.

4.2 Model Comparison Measures

Deviance Information Criterion (DIC) (Spiegelhalter et al., 2002) and Logarithm of the Pseudomarginal Likelihood (LPML) (Ibrahim et al., 2001) are used here to compare the cause-specific hazards model, the mixture model, and the FS model. Let θ denote a collection of model parameters. DIC is defined as DIC = D(θ̂) + 2p_D, where D(θ) is a deviance function, p_D = D̄ − D(θ̂), and D̄ and θ̂ are the posterior means of D(θ) and θ, respectively. LPML is given by $LPML = \sum_{i = 1}^{n} log ({CPO}_{i})$ , where the Conditional Predictive Ordinate (CPO) for the i^th observation is CPO_i = f(t_i|x_i, D⁽ⁱ⁾) = ∫ f(t_i|θ, x_i)π(θ|D⁽ⁱ⁾)dθ, D⁽ⁱ⁾ is the data with the i^th observation deleted, and π(θ|D⁽ⁱ⁾) is the posterior distribution based on the data D⁽ⁱ⁾. According to Gelfand and Dey (1994), LPML asymptotically includes an implicit dimensional penalty similar to that of AIC.
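Both criteria can be approximated from MCMC output. The sketch below is illustrative rather than the authors' code: DIC follows the definition above, while CPO_i is approximated by the harmonic mean of the likelihood contributions over posterior draws, a standard Monte Carlo identity that is not stated in the text and is used here as an assumption. The function names and the input layout are hypothetical.

```python
import math

def dic(dev_draws, dev_at_post_mean):
    """Return (DIC, p_D) from deviance values D(theta^(m)) evaluated at
    posterior draws and from D(theta_hat) at the posterior mean."""
    d_bar = sum(dev_draws) / len(dev_draws)
    p_d = d_bar - dev_at_post_mean
    return dev_at_post_mean + 2.0 * p_d, p_d

def lpml(lik_draws):
    """LPML from lik_draws[m][i] = f(t_i | theta^(m), x_i).

    Assumes the harmonic-mean approximation
    CPO_i ~ 1 / mean_m(1 / f(t_i | theta^(m), x_i)).
    """
    n_draws = len(lik_draws)
    n_obs = len(lik_draws[0])
    total = 0.0
    for i in range(n_obs):
        inv_mean = sum(1.0 / lik_draws[m][i] for m in range(n_draws)) / n_draws
        total += -math.log(inv_mean)
    return total

# Hypothetical inputs for illustration.
dic_value, p_d = dic([10.0, 12.0, 14.0], 11.0)
lpml_value = lpml([[0.5, 0.25], [0.5, 0.25]])
```

For censored observations, f(t_i|θ, x_i) would be the censored-data factor of the corresponding likelihood, so the same routine applies to all three models.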

For the proposed FS model, θ = (β₁, λ₁, β₂, λ₂), and the deviance function D(θ) is given by D(θ) = −2 log L(β₁, β₂, h₁₀, h₂₀|t, X, δ), where L(β₁, β₂, h₁₀, h₂₀|t, X, δ) is given by (2.5) and h₁₀(t) and h₂₀(t) are defined in (3.1).

For the cause-specific hazards model, we assume piecewise exponential models for h_Cj0(t), j = 1, 2. The deviance function is defined by

D (θ) = - 2 log L_{C} (β_{1}, β_{2}, h_{C 10}, h_{C 20} | t, X, δ),

where

L_{C} (β_{1}, β_{2}, h_{C 10}, h_{C 20} | t, X, δ) = \prod_{i = 1}^{n} {h_{C 10} (t_{i}) exp (x_{i}^{'} β_{1})}^{I (δ_{i} = 1)} \times {h_{C 20} (t_{i}) exp (x_{i}^{'} β_{2})}^{I (δ_{i} = 2)} \times exp {- H_{C 10} (t_{i}) exp (x_{i}^{'} β_{1}) - H_{C 20} (t_{i}) exp (x_{i}^{'} β_{2})} .

From the above likelihood function, it is easy to see that under the cause-specific hazards model, DIC=DIC₁+DIC₂, where DIC_j is the DIC of the survival model with single cause j by treating other causes of death as censored.

For the mixture model, let p_1i denote the probability of death due to cause 1 for the i^th subject, p₁ = (p₁₁, p₁₂, … , p_1n)′. The likelihood function is

L_{M} (β_{1}, β_{2}, h_{M 10}, h_{M 20}, p_{1} | t, X, δ) = \prod_{i = 1}^{n} {[p_{1 i} h_{M 10} (t_{i}) exp (x_{i}^{'} β_{1}) exp {- H_{M 10} (t_{i}) exp (x_{i}^{'} β_{1})}]}^{I (δ_{i} = 1)} \times {[(1 - p_{1 i}) h_{M 20} (t_{i}) exp (x_{i}^{'} β_{2}) exp {- H_{M 20} (t_{i}) exp (x_{i}^{'} β_{2})}]}^{I (δ_{i} = 2)} \times {[p_{1 i} exp {- H_{M 10} (t_{i}) exp (x_{i}^{'} β_{1})} + (1 - p_{1 i}) exp {- H_{M 20} (t_{i}) exp (x_{i}^{'} β_{2})}]}^{I (δ_{i} = 0)} .

Assume $p_{1 i} = \frac{exp {z_{i}^{'} ϕ}}{1 + exp {z_{i}^{'} ϕ}}$ , where z_i is a vector of covariates, which may be a subset of x_i.

For the cause-specific hazards model and mixture model, the form of the baseline hazard functions h_Cj0 and h_Mj0 and the prior of (β₁, β₂, λ₁, λ₂) are assumed in the same way as those in the FS model excluding the last piece h₁₀(t) when t ≥ s_1,K₁. Similar to the FS model, the propriety of the joint posterior distribution of (β₁, β₂, λ₁, λ₂) can also be established under an improper joint prior π(β₁, β₂, λ₁, λ₂), which is similar to (3.2).

5 A Simulation Study

In this section, we carry out a simulation study to compare the cause-specific hazards model, mixture model, and fully specified subdistribution model via DIC and LPML. To generate data from the FS model, we assume there are two causes with cause 1 being the cause of interest, and there are two covariates with true parameters β₁ = (β₁₁, β₁₂)′ and β₂ = (β₂₁, β₂₂)′, which are chosen such that Pr(δ = 1) is around 1/3. Covariates x_i1 are generated from N(0, 1) and x_i2 given x_i1 are generated from Bernoulli(p(x_i1)), where $p (x_{i 1}) = \frac{exp {0.5 + 0.3 x_{i 1}}}{1 + exp {0.5 + 0.3 x_{i 1}}}$ . Assume the failure times of two causes follow distinct piecewise exponential distributions, where for cause 1 the time is partitioned as s₁₀ = 0, s₁₁ = 8, s₁₂ = 12, s₁₃ = 15, s₁₄ = 16, and s₁₅ = 17 with corresponding λ₁ = (0.001, 0.01, 0.03, 0.02, 0.3)′, and for cause 2 the time is partitioned as s₂₀ = 0, s₂₁ = 3, s₂₂ = 5, s₂₃ = 8, s₂₄ = 10, s₂₅ = 11, s₂₆ = 12, s₂₇ = 13, s₂₈ = 15, s₂₉ = 17, and s_{2,10} = 18 with corresponding λ₂ = (0.001, 0.005, 0.01, 0.02, 0.04, 0.07, 0.1, 0.15, 0.2, 1.0)′. Generate r_i from U(0, 1). If r_i < Pr(δ_i = 1|x_i), then generate $t_{i}^{*}$ of cause 1 from a piecewise exponential distribution

f_{1} (t_{i}^{*} | x_{i}) = \frac{h_{10} (t_{i}^{*}) exp (x_{i}^{'} β_{1}) exp {- H_{10} (t_{i}^{*}) exp (x_{i}^{'} β_{1})}}{1 - exp {- H_{10} (\infty) exp (x_{i}^{'} β_{1})}} .

If r_i ≥ Pr(δ_i = 1|x_i), then generate $t_{i}^{*}$ of cause 2 from a piecewise exponential distribution

f_{2} (t_{i}^{*} | x_{i}) = \frac{h_{20} (t_{i}^{*}) exp (x_{i}^{'} β_{2}) exp {- H_{20} (t_{i}^{*}) exp (x_{i}^{'} β_{2}) - H_{10} (\infty) exp (x_{i}^{'} β_{1})}}{exp {- H_{10} (\infty) exp (x_{i}^{'} β_{1})} [1 - exp {- H_{20} (\infty) exp (x_{i}^{'} β_{2})}]} .

The censoring time c_i is generated from a uniform distribution, U(a_c, b_c), where 0 < a_c < b_c are chosen so that the proportion of deaths is around 2/5, and then t_i is taken to be $t_{i} = min {t_{i}^{*}, c_{i}}$ . Note that under the FS model, $Pr (δ_{i} = 1 | x_{i}) = F_{1} (\infty | x_{i}) = 1 - exp {- H_{10} (\infty) exp (x_{i}^{'} β_{1})}$ .
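The generation scheme for the FS model can be sketched by inverse-transform sampling. To keep the example short, the sketch below replaces the piecewise baselines of the simulation design with the simplest special case, h₁₀(t) = a·exp(−t) (so that H₁₀(∞) = a) and h₂₀(t) = b; these baselines, the parameter values, and the function name `simulate_fs` are assumptions for illustration and do not reproduce the settings used in the study.

```python
import math
import random

def simulate_fs(n, beta1, beta2, a, b, c_lo, c_hi, seed=0):
    """Simulate (t_i, delta_i, x_i1, x_i2) from an FS model with toy
    baselines h10(t) = a * exp(-t) and h20(t) = b (assumed, see lead-in)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        p_x2 = 1.0 / (1.0 + math.exp(-(0.5 + 0.3 * x1)))
        x2 = 1 if rng.random() < p_x2 else 0
        r1 = math.exp(beta1[0] * x1 + beta1[1] * x2)
        r2 = math.exp(beta2[0] * x1 + beta2[1] * x2)
        p1 = 1.0 - math.exp(-a * r1)            # Pr(delta = 1 | x) = F_1(inf | x)
        if rng.random() < p1:
            cause = 1
            v = rng.random()
            # Solve F_1(t | x) / p1 = v for H10(t), then invert H10.
            target = -math.log(1.0 - v * p1) / r1   # always below a
            t_star = -math.log(1.0 - target / a)
        else:
            cause = 2
            v = rng.random()
            # Solve M_2(t | x) = v with H20(t) = b * t.
            t_star = -math.log(1.0 - v) / (b * r2)
        c = rng.uniform(c_lo, c_hi)             # censoring time
        if c < t_star:
            data.append((c, 0, x1, x2))
        else:
            data.append((t_star, cause, x1, x2))
    return data

sample = simulate_fs(200, (0.5, -0.5), (0.2, 0.1), a=0.4, b=0.1, c_lo=1.0, c_hi=30.0)
```

With the piecewise baselines of the simulation design, the same inversion applies interval by interval because H₁₀ and H₂₀ are piecewise invertible.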

For the mixture model, the settings of model parameters and covariates are similar to those for the FS model, while Pr(δ = 1) is calculated by $p_{1} = \frac{exp (z' ϕ)}{1 + exp (z' ϕ)}$ , where ϕ = (ϕ₁, ϕ₂)′ is chosen such that p₁ is around 1/3. If r_i from U(0, 1) falls in (0, Pr(δ = 1)), then $t_{i}^{*}$ of cause 1 is generated from a piecewise exponential distribution

f_{1} (t_{i}^{*} | x_{i}) = h_{10} (t_{i}^{*}) exp (x_{i}^{'} β_{1}) exp {- H_{10} (t_{i}^{*}) exp (x_{i}^{'} β_{1})} .

Otherwise, $t_{i}^{*}$ of cause 2 is generated from a piecewise exponential distribution

f_{2} (t_{i}^{*} | x_{i}) = h_{20} (t_{i}^{*}) exp (x_{i}^{'} β_{2}) exp {- H_{20} (t_{i}^{*}) exp (x_{i}^{'} β_{2})} .

For the cause-specific hazards model, the parameter and covariate settings are also similar to those above. According to Lu and Tsiatis (2001), t_i is generated by t_i = min{t_1i, t_2i, c_i}, where t_1i, t_2i, and c_i are generated independently, respectively, from a piecewise exponential distribution

f_{1} (t_{1 i} | x_{i}) = h_{10} (t_{1 i}) exp (x_{i}^{'} β_{1}) exp {- H_{10} (t_{1 i}) exp (x_{i}^{'} β_{1})},

a piecewise exponential distribution

f_{2} (t_{2 i} | x_{i}) = h_{20} (t_{2 i}) exp (x_{i}^{'} β_{2}) exp {- H_{20} (t_{2 i}) exp (x_{i}^{'} β_{2})},

and a uniform distribution, U(a_c, b_c), such that the proportion of death is around 2/5.

A total of 500 datasets with n = 500 observations in each dataset were generated from each of the three models, respectively, as described above. Each simulated dataset was fitted by all three models and the corresponding DICs and LPMLs were calculated. From Table 1, we see that, in every scenario, the model most frequently chosen as best by DIC and LPML is the true model from which the data were simulated. The mean DIC and mean LPML under each model and scenario are shown in Table 4 of Appendix A. The true value, estimate, standard deviation, mean square error, and coverage probability of each covariate coefficient when estimating from the true model for all of the three models are given in Table 5 of Appendix A. It is observed that the standard deviations and mean square errors are moderate and stable, and that the coverage probabilities are always around 0.95 under all three scenarios of (K₁, K₂).

Table 1.

Frequency of Ranking Each Model as Best under Different Scenarios

	DIC			LPML
(K₁, K₂)	FS Best	C Best	M Best	FS Best	C Best	M Best
Data Simulated from FS Model

(5, 10)	0.626	0.002	0.372	0.608	0.000	0.392
(10, 20)	0.764	0.004	0.232	0.778	0.008	0.214
(15, 30)	0.892	0.006	0.102	0.918	0.006	0.076

Data Simulated from C Model

(5, 10)	0.194	0.524	0.282	0.190	0.482	0.328
(10, 20)	0.130	0.722	0.148	0.132	0.716	0.152
(15, 30)	0.156	0.732	0.112	0.158	0.726	0.116

Data Simulated from M Model

(5, 10)	0.096	0.044	0.860	0.100	0.072	0.828
(10, 20)	0.104	0.094	0.802	0.110	0.142	0.748
(15, 30)	0.098	0.154	0.748	0.120	0.192	0.688


To further compare the models, the median and interquartile range (IQR) of the pairwise differences of DIC and LPML for two models under each scenario are calculated, and the corresponding boxplots are shown in Figure 1. When the data were simulated from another model, the mixture model and fully specified subdistribution model fit better than the cause-specific hazards model.

Fig. 1 — Box Plots of DIC and LPML Differences between Models

6 Analysis of the Prostate Cancer Data

A subset of data from the prostate cancer studies published in Choueiri et al. (2010) is analyzed using the three models. The response variable was the time from prostate-specific antigen (PSA) failure to death or to the last follow-up, whichever came first. The median follow-up time after PSA failure was 11.2 years with IQR = (5.8, 16.0). The sample size was 546, with 54 prostate cancer deaths and 151 deaths from other causes. Seven covariates were considered in the analysis: the patient’s age at the date of PSA failure, the natural logarithm of PSA (logpsa), prostatectomy Gleason score (7 (GS7 = 1) or otherwise (GS7 = 0)), prostatectomy Gleason score (8 to 10 (GS8H = 1) or otherwise (GS8H = 0)), prostatectomy T classification (T3 and higher (T3 = 1) or otherwise (T3 = 0)), surgical margin status (positive (margin = 1) or otherwise (margin = 0)), and PSA doubling time (DT) (less than 6 months (DT6 = 1) or otherwise (DT6 = 0)). In this analysis, the prostate cancer cause of death is the primary cause and the other cause of death is any cause of death other than prostate cancer. We let β₁ = (β₁₁, β₁₂, … , β₁₇) denote the vector of the corresponding regression coefficients for the prostate cancer cause of death.

Different values of K₁ and K₂ were tried out for optimizing the model fitting. In the mixture model, z is the vector of all covariates without preselecting for a fair comparison between the models. The values of DIC, p_D, and LPML for the 3 × 3 combinations of K₁ and K₂ under all three models are reported in Table 2. We see that (K₁, K₂) = (15, 20) is the optimum combination of (K₁, K₂) for almost all models, and that the fully specified subdistribution model always outperforms the other two models by achieving the smallest DIC and the largest LPML.

Table 2.

DIC, Dimension Penalty, and LPML of Three Models

		C Model			M Model			FS Model
K₁	K₂	DIC	p_D	LPML	DIC	p_D	LPML	DIC	p_D	LPML
10	10	1586.5	34.3	−795.3	1580.0	42.5	−791.9	1565.3	34.4	−784.2
	20	1578.6	44.6	−792.5	1574.9	52.7	−791.2	1560.1	44.7	−782.9
	30	1600.8	55.0	−805.6	1596.0	63.0	−803.1	1581.5	55.1	−795.0

15	10	1584.2	39.9	−795.7	1576.4	48.3	−791.9	1564.4	39.8	−785.0
	20	1575.8	49.9	−792.7	1571.1	58.4	−790.3	1558.9	49.9	−783.5
	30	1598.0	60.3	−805.2	1592.1	68.8	−802.6	1580.1	60.5	−795.4

20	10	1599.8	45.1	−805.4	1592.6	53.6	−801.4	1579.4	45.2	−794.6
	20	1592.0	55.5	−802.6	1587.9	64.1	−800.3	1574.7	55.8	−793.4
	30	1614.1	65.9	−815.2	1609.6	74.7	−813.0	1595.4	65.9	−805.2


The subdistribution model of Fine and Gray (1999) was also fit to the data. For the prostate cancer death, the estimates, standard errors (SEs), and 95% confidence intervals (CIs) of β₁ under the subdistribution model of Fine and Gray (1999) and the posterior means (estimates), posterior standard deviations (SDs), and 95% highest posterior density (HPD) intervals of β₁ under the FS model for the scenario of (K₁, K₂) = (15, 20) are shown in Table 3. We see, from Table 3, that all estimates were quite close and that the two models gave consistent conclusions in terms of the significance of covariates at a significance level of 0.05. Note that the estimates of β₁ under the subdistribution model of Fine and Gray (1999) were computed using the R package cmprsk.

Table 3.

Estimates of β₁ under the Subdistribution Model and the Fully Specified Subdistribution Model

	Subdistribution Model			Fully Specified Subdistribution Model
Variable	Estimate	SE	95% CI	Estimate	SD	95% HPD Interval
age	0.017	0.022	(−0.026, 0.059)	0.020	0.020	(−0.017, 0.061)
logpsa	−0.057	0.143	(−0.337, 0.223)	0.036	0.139	(−0.225, 0.316)
GS7	−0.173	0.431	(−1.018, 0.672)	−0.123	0.411	(−0.935, 0.676)
GS8H	0.298	0.400	(−0.486, 1.082)	0.153	0.402	(−0.615, 0.951)
T3	0.453	0.409	(−0.348, 1.255)	0.598	0.417	(−0.195, 1.414)
margin	0.483	0.305	(−0.115, 1.080)	0.417	0.295	(−0.124, 1.035)
DT6	0.990	0.278	( 0.445, 1.535)	0.919	0.271	( 0.407, 1.467)


Let the prostate cancer specific mortality (PCSM) be the cumulative incidence function corresponding to the primary cause of death due to prostate cancer. The covariate effect was further investigated by comparing the posterior means of the PCSM at different times stratified by PSA doubling time (DT6 = 1 versus DT6 = 0) under each of the three models, where the other covariates were fixed at age = mean age (66.5), logpsa = mean logpsa (2.5), GS7 = 1, GS8H = 0, T3 = 1, and margin = 1. The PCSM plots are shown in Figure 2. The shapes of the PCSM curves under the three models were similar except in the tail, and the difference between the two curves was slightly smaller under the cause-specific hazards model. For example, at the 10th and 15th years after PSA failure, the posterior means of PCSM under the cause-specific hazards model were 0.055 and 0.166 for patients with PSA doubling time less than 6 months and 0.021 and 0.071 for patients with PSA doubling time greater than or equal to 6 months; under the mixture model the posterior means of PCSM were 0.052 and 0.17 for patients with PSA doubling time less than 6 months and 0.022 and 0.073 for patients with PSA doubling time greater than or equal to 6 months; under the FS model the posterior means of PCSM were 0.061 and 0.192 for patients with PSA doubling time less than 6 months and 0.025 and 0.082 for patients with PSA doubling time greater than or equal to 6 months. These PCSM plots indicate that the patients with PSA doubling time less than 6 months had worse PCSMs than those with PSA doubling time greater than or equal to 6 months. This covariate effect can be seen directly from Table 3 under the FS model, as DT6 was significant at a significance level of 0.05. In addition, the proportional hazards structure of the FS model also allows us to compute the adjusted hazard ratio (AHR), which is defined as exp(β₁₇), of DT6 for the PCSM.
Specifically, the posterior mean and 95% HPD interval of the AHR of DT6 were 2.601 and (1.369, 4.061), respectively. However, this covariate effect could not be directly assessed under the other two models. For example, under the mixture model with K₁ = 15, K₂ = 20, for the hazard regression sub-model corresponding to the prostate cancer death, the posterior mean, SD, and 95% HPD interval of β₁₇ for DT6 were 0.122, 0.384, and (−0.627, 0.865), while the posterior mean, SD, and 95% HPD interval of ϕ₇ for DT6 were 0.519, 0.168, and (0.187, 0.839) in the logistic regression sub-model for p₁, indicating that DT6 was significant.
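The reported posterior mean of the AHR (2.601) is larger than the exponential of the posterior mean of β₁₇ in Table 3, because the mean of exp(β₁₇) is not exp of the mean. The short check below illustrates this under the assumption, made here only for illustration, that the posterior of β₁₇ is approximately normal with the mean and SD from Table 3; the paper itself averages exp(β₁₇) over the Gibbs draws.

```python
import math

post_mean, post_sd = 0.919, 0.271    # beta_17 (DT6) under the FS model, Table 3

# Exponential of the posterior mean of beta_17.
ahr_at_mean = math.exp(post_mean)

# Mean of exp(beta_17) if beta_17 were exactly normal (lognormal mean formula).
ahr_mean_normal = math.exp(post_mean + post_sd ** 2 / 2)

print(round(ahr_at_mean, 2), round(ahr_mean_normal, 2))
```

Under the normal approximation the second quantity comes out at about 2.60, in line with the reported posterior mean of the AHR, while the first is about 2.51.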

Fig. 2 — Plots of the PCSMs under the Three Models

7 Discussion

In this paper, we have developed a fully specified subdistribution model of Fine and Gray (1999) and provided a justification of Fine and Gray’s partial likelihood via the profile likelihood approach and the Bayesian approach. Our Bayesian justification is the first such development in the context of competing risk models after the Bayesian justification of Cox’s partial likelihood (Kalbfleisch, 1978; Sinha et al., 2003), as the risk set at time t in Fine and Gray’s partial likelihood includes all patients who are still alive prior to t as well as the patients who died from other causes of death up to t, which is quite different from the usual risk set in Cox’s partial likelihood (Cox, 1972, 1975). To fit the proposed FS model, a piecewise exponential model with Jeffreys-type priors, which is a special case of the gamma process prior when c₀ → 0+, is assumed for the baseline hazard function. Compared to the full gamma process priors, the gamma priors based on the piecewise exponential model relax the conditions for the posterior propriety and facilitate the development of an efficient Gibbs sampling algorithm for carrying out the posterior computation.

In Section 5, we conducted an extensive simulation study to examine the performance of DIC and LPML in identifying the model from which the data were generated. Our simulation results empirically showed that when the data are generated from one of the three models (the cause-specific hazards model, the mixture model, or the fully specified subdistribution model), it is unlikely that the other two models would have smaller DICs and larger LPMLs than the true model. This may be due to the different proportional hazard functions assumed under these three models. For the prostate cancer data, the FS model had a much smaller DIC and a larger LPML than the cause-specific hazards model and the mixture model, implying that the FS model was much more appropriate for fitting this dataset than the other two models.

The fully specified subdistribution model can be further extended to the cases with more than 2 competing risks. Assume there are J competing risks with cause 1 as the cause of interest. Denote $T_{j}^{*} = T_{j} \times I (δ = j) + \infty \times I (δ \neq j)$ , j = 1, 2, … , J, and $T^{*} = min {T_{1}^{*}, T_{2}^{*}, \dots, T_{J}^{*}}$ . The cause-specific cumulative incidence functions can be constructed as follows: F₁(t) = Pr(T* ≤ t, δ = 1) = Pr(T₁ ≤ t, δ = 1), and F_j(t) = Pr(T* ≤ t, δ = j) = M_j(t)Pr(δ ≠ j − 1|δ ≠ 1, … , δ ≠ j − 2) … Pr(δ ≠ 2|δ ≠ 1)Pr(δ ≠ 1), j = 2, … , J, where M_j(t) is the probability of failure from cause j by time t conditional on not failing from causes 1, 2, … , j − 1.

In this paper, we only considered fixed covariates. Including time-dependent covariates in the FS model, as well as jointly modeling longitudinal measurements (e.g., a series of PSA measures over time) and survival endpoints of cause-specific death times, are important future research topics, which are currently under investigation. Models with frailty terms (Clayton, 1978; Vaupel et al., 1979) are commonly used for correlated survival data with multivariate risk factors. Dixon et al. (2011) introduced a multivariate subdistribution hazard model including frailty to induce correlations among clustered survival times. The FS model can be extended to correlated survival data in the presence of competing risks by including frailties. This is another interesting topic for future research.

In all the Bayesian computations, we used 10,000 Gibbs samples after a burn-in of 1,000 iterations for each model to compute all the posterior estimates, including posterior means, posterior standard deviations, 95% HPD intervals, DICs, and LPMLs. We also generated 50,000 Gibbs samples after a burn-in of 1,000 iterations to re-compute those posterior quantities, and the results were very similar. The HPD intervals were computed via the Monte Carlo method developed by Chen and Shao (1999). The code was written in Fortran 95, and we used IMSL subroutines with double precision accuracy. The Fortran code for the FS model is available upon request.
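A Monte Carlo HPD interval of the kind attributed to Chen and Shao (1999) can be sketched as follows: sort the draws and, among all intervals spanned by order statistics that cover roughly 100(1 − α)% of the draws, keep the shortest. The sketch assumes a unimodal posterior; the function name and the exponential test draws are hypothetical and are not taken from the paper.

```python
import numpy as np

def hpd_interval(samples, alpha=0.05):
    """Shortest interval between order statistics covering about
    100 * (1 - alpha) percent of the draws (assumes a unimodal posterior)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    # Number of order-statistic steps each candidate interval spans;
    # the small offset guards against floating-point round-off in the product.
    m = int(np.floor((1.0 - alpha) * n + 1e-9))
    if m < 1 or m >= n:
        raise ValueError("need more draws for this alpha")
    widths = x[m:] - x[: n - m]      # widths of all candidate intervals
    j = int(np.argmin(widths))
    return x[j], x[j + m]

rng = np.random.default_rng(0)
draws = rng.exponential(1.0, size=20000)     # skewed stand-in for posterior draws
lo, hi = hpd_interval(draws)
```

For a skewed posterior such as this stand-in, the HPD interval sits closer to the mode and is shorter than the equal-tail interval with the same coverage.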

Acknowledgements

The authors wish to thank the Editor-in-Chief, the Associate Editor, and the two referees for their helpful comments and suggestions, which have led to an improved version of this article. This research was partially supported by NIH grants #GM 70335 and #CA 74015.

Appendix A: Additional Tables

Table 4.

Mean of DIC and LPML of the Three Models

	DIC			LPML
(K₁, K₂)	FS Model	C Model	M Model	FS Model	C Model	M Model
Data Simulated from FS Model

(5, 10)	2142.0	2167.1	2144.2	−1071.1	−1083.7	−1072.0
(10, 20)	2119.2	2145.6	2123.2	−1060.6	−1073.9	−1062.6
(15, 30)	2126.5	2153.9	2131.8	−1065.8	−1079.6	−1068.4

Data Simulated from C Model

(5, 10)	2064.8	2055.8	2056.6	−1032.5	−1028.0	−1028.2
(10, 20)	2039.0	2030.7	2033.4	−1020.5	−1016.4	−1017.8
(15, 30)	2044.6	2037.5	2040.8	−1024.9	−1021.4	−1023.0

Data Simulated from M Model

(5, 10)	2286.8	2283.7	2274.7	−1144.0	−1142.3	−1137.7
(10, 20)	2285.9	2282.3	2276.1	−1144.4	−1142.4	−1139.5
(15, 30)	2295.9	2291.8	2287.0	−1150.9	−1148.7	−1146.5


Table 5.

Posterior Estimates of β under the Three Models in Simulation Studies

		FS Model			C Model			M Model
(K₁, K₂) =		(5, 10)	(10, 20)	(15, 30)	(5, 10)	(10, 20)	(15, 30)	(5, 10)	(10, 20)	(15, 30)
β₁₁	True	0.2			0.2			0.2
	Est	0.2	0.2	0.2	0.18	0.19	0.19	0.19	0.20	0.20
	SD	0.09	0.09	0.09	0.11	0.11	0.11	0.13	0.13	0.13
	MSE	0.01	0.01	0.01	0.01	0.01	0.01	0.02	0.02	0.02
	CP	0.94	0.94	0.94	0.95	0.95	0.95	0.94	0.94	0.94

β₁₂	True	0.8			1.0			1.5
	Est	0.78	0.78	0.78	1.00	1.02	1.02	1.47	1.48	1.48
	SD	0.23	0.23	0.23	0.24	0.24	0.24	0.35	0.36	0.37
	MSE	0.05	0.05	0.06	0.06	0.06	0.06	0.14	0.15	0.16
	CP	0.95	0.95	0.95	0.95	0.95	0.95	0.95	0.94	0.94

β₂₁	True	0.3			0.3			0.3
	Est	0.29	0.30	0.30	0.30	0.30	0.30	0.30	0.30	0.30
	SD	0.08	0.08	0.08	0.08	0.08	0.08	0.08	0.08	0.08
	MSE	0.01	0.01	0.01	0.01	0.01	0.01	0.01	0.01	0.01
	CP	0.97	0.96	0.96	0.95	0.95	0.95	0.95	0.95	0.95

β₂₂	True	1.0			0.2			0.5
	Est	0.98	1.00	1.00	0.19	0.20	0.20	0.53	0.53	0.54
	SD	0.17	0.17	0.17	0.15	0.15	0.15	0.18	0.18	0.18
	MSE	0.03	0.03	0.03	0.02	0.02	0.02	0.03	0.03	0.03
	CP	0.94	0.95	0.95	0.96	0.95	0.96	0.96	0.96	0.96


Note that Est, SD, MSE, and CP denote, respectively, the average of the posterior means, the average of the posterior standard deviations, the mean square error, and the coverage probability of the 95% HPD intervals over 500 simulations.

Appendix B: Proofs of Theorems

Proof of Theorem 1 With the assumption that h₁₀ is zero after the last observation of failure due to cause 1, the likelihood function is now

L (β_{1}, h_{10} | y, X, δ) = \prod_{i = 1}^{D_{1}} [h_{10} (y_{(i)}) exp (x_{i}^{'} β_{1})] \times exp {- \sum_{i = 1}^{n} H_{10} (y_{(i)}) exp (x_{i}^{'} β_{1})} .

The profile likelihood approach assumes that h₁₀ is zero except for the failure times due to cause 1. Then

L (β_{1}, h_{10} | y, X, δ) = [\prod_{i = 1}^{D_{1}} h_{10} (y_{(i)}) exp (x_{i}^{'} β_{1})] \times exp {- \sum_{i = 1}^{D_{1}} \sum_{k = 1}^{i} h_{10} (y_{(k)}) exp (x_{i}^{'} β_{1}) - \sum_{i = D_{1} + 1}^{n} \sum_{l = 1}^{D_{1}} h_{10} (y_{(l)}) exp (x_{i}^{'} β_{1})} = [\prod_{i = 1}^{D_{1}} h_{10} (y_{(i)}) exp (x_{i}^{'} β_{1})] \times exp {- \sum_{i = 1}^{D_{1}} h_{10} (y_{(i)}) \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})} .

Therefore, the profile maximum likelihood estimator of h₁₀ is given by

ĥ_{10} (y_{(i)}) = \frac{1}{\sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})} .

Plugging ĥ₁₀(y_(i)) in L(β₁, h₁₀|y, X, δ) results in the profile likelihood function given by

L_{p}^{*} (β_{1} | y, X, δ) \propto \prod_{i = 1}^{D_{1}} \frac{exp (x_{i}^{'} β_{1})}{\sum_{j \in R_{i}^{*}} exp {x_{j}^{'} β_{1}}},

which is (2.6).
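The profile likelihood in (2.6) can be evaluated numerically with a few lines of code. The sketch below assumes complete data (every subject fails from cause 1 or cause 2, with no censoring), so that the risk set R_i* contains every subject who has not failed before y_(i) together with every subject who has failed from another cause; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def fg_log_partial_likelihood(beta, y, delta, X):
    """Log of the partial likelihood built on the risk sets R_i* for complete
    data (delta is 1 for the primary cause and 2 for any other cause)."""
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta)
    eta = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    total = 0.0
    for i in np.flatnonzero(delta == 1):
        # R_i*: not yet failed at y_i, or failed from another cause.
        in_risk_set = (y >= y[i]) | (delta == 2)
        total += eta[i] - np.log(np.sum(np.exp(eta[in_risk_set])))
    return total

# Toy check with beta = 0: each factor reduces to 1 / |R_i*|.
value = fg_log_partial_likelihood([0.0], [1.0, 2.0, 3.0], [1, 2, 1], np.zeros((3, 1)))
```

In the toy check, the first cause-1 failure has all three subjects at risk and the second has two (itself and the subject who failed earlier from cause 2), so the partial likelihood equals 1/6.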

Proof of Theorem 2 Assume that, under the prior, h₁₀ takes the value λ_i only at the times y_(i) for which δ_i = 1. Then λ_i = H₁₀(y_(i)) − H₁₀(y_(i−1)), i = 1, … , D₁, and λ_{D₁+1} = ⋯ = λ_n = 0. The survival function at time t is

Pr (T_{1}^{*} > t) = \prod_{i = 1}^{n} [1 - F_{1} (y_{(i)})] = exp {- \sum_{i = 1}^{D_{1}} H_{10} (y_{(i)}) exp (x_{i}^{'} β_{1}) - \sum_{i = D_{1} + 1}^{n} H_{10} (\infty) exp (x_{i}^{'} β_{1})} = exp {- \sum_{i = 1}^{D_{1}} \sum_{k = 1}^{i} λ_{k} exp (x_{i}^{'} β_{1}) - \sum_{i = D_{1} + 1}^{n} \sum_{l = 1}^{D_{1}} λ_{l} exp (x_{i}^{'} β_{1})} = exp {- \sum_{i = 1}^{D_{1}} λ_{i} \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})} .

(B.1)

Then the likelihood function in (2.8) reduces to

L (β_{1}, λ | y, X, δ) = \prod_{i = 1}^{D_{1}} [λ_{i} exp (x_{i}^{'} β_{1})] \times exp {- λ_{i} \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})} = \prod_{i = 1}^{D_{1}} [\frac{exp (x_{i}^{'} β_{1})}{\sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})}] \times [λ_{i} \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})] \times exp {- λ_{i} \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})} .

Since $π (λ) \propto \prod_{i = 1}^{D_{1}} 1 / λ_{i}$ , we have

\int L (β_{1}, λ | y, X, δ) π (λ) d λ = \int \prod_{i = 1}^{D_{1}} \frac{1}{λ_{i}} \times [\frac{exp (x_{i}^{'} β_{1})}{\sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})}] \times [λ_{i} \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})] \times exp {- λ_{i} \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})} d λ = \prod_{i = 1}^{D_{1}} \frac{exp (x_{i}^{'} β_{1})}{\sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})} .
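The last equality holds because the integral factorizes over i and each factor integrates to one. Writing A_i for the risk-set sum (a shorthand introduced here for brevity, not notation from the paper), the one-dimensional integral for each i is:

```latex
\int_{0}^{\infty} \frac{1}{\lambda_{i}} \, (\lambda_{i} A_{i}) \, e^{-\lambda_{i} A_{i}} \, d\lambda_{i}
  = \int_{0}^{\infty} A_{i} \, e^{-\lambda_{i} A_{i}} \, d\lambda_{i}
  = \left[ - e^{-\lambda_{i} A_{i}} \right]_{0}^{\infty}
  = 1,
\qquad A_{i} = \sum_{j \in R_{i}^{*}} \exp(x_{j}' \beta_{1}) > 0 .
```

Only the ratio exp(x_i′β₁)/A_i therefore survives for each i, which gives the partial likelihood.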

Proof of Theorem 3 Assume H₁₀ follows a gamma process prior. Let h_1i = H₁₀(y_(i)) − H₁₀(y_(i−1)), with h_1i ~ G(c₀h₀i, c₀), i = 1, … , D₁, where the h_1i’s are independent of each other, and h_1,D₁+1 = ⋯ = h_1n = 0. Similar to (B.1), we can show that the survival function at time t is given by

Pr (T_{1}^{*} > t) = \prod_{i = 1}^{n} [1 - F_{1} (y_{(i)})] = exp {- \sum_{i = 1}^{D_{1}} h_{1 i} \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})} .

Taking expectation with respect to the gamma process prior gives

E^{h_{10}} [Pr (T_{1}^{*} > t)] = \prod_{i = 1}^{D_{1}} {[\frac{c_{0}}{c_{0} + \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})}]}^{c_{0} h_{0 i}} = \prod_{i = 1}^{D_{1}} exp {c_{0} \sum_{k = 1}^{i} h_{0 k} log (1 - \frac{exp (x_{i}^{'} β_{1})}{c_{0} + \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})})} .

Now the expectation of the likelihood function in (2.8) with respect to the gamma process prior reduces to

E^{h_{10}} [L (β_{1}, h_{10} | y, X, δ)] = \prod_{i = 1}^{D_{1}} exp {c_{0} \sum_{k = 1}^{i} h_{0 k} log (1 - \frac{exp (x_{i}^{'} β_{1})}{c_{0} + \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})})} \times [- c_{0} \sum_{k = 1}^{i} \frac{{dh}_{0 k}}{{dy}_{(i)}} log (1 - \frac{exp (x_{i}^{'} β_{1})}{c_{0} + \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})})] .

Since ${lim}_{c_{0} ↓ 0} exp {c_{0} \sum_{k = 1}^{i} h_{0 k} log (1 - \frac{exp (x_{i}^{'} β_{1})}{c_{0} + \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})})} = 1$ and

lim_{c_{0} ↓ 0} log (1 - \frac{exp (x_{i}^{'} β_{1})}{c_{0} + \sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})}) \approx - \frac{exp (x_{i}^{'} β_{1})}{\sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})},

we have

lim_{c_{0} ↓ 0} \frac{E^{h_{10}} [L (β_{1}, h_{10} | y, X, δ)]}{\prod_{i = 1}^{D_{1}} [c_{0} + \sum_{k = 0}^{i} \frac{{dh}_{0 k}}{{dy}_{(i)}}]} \approx \prod_{i = 1}^{D_{1}} \frac{exp (x_{i}^{'} β_{1})}{\sum_{j \in R_{i}^{*}} exp (x_{j}^{'} β_{1})} .

Proof of Theorem 4 To show that (3.3) is proper, we need to show that

\int π^{*} (β_{1}, β_{2}, λ_{1}, λ_{2} | t, X, δ) d β_{1} d β_{2} d λ_{1} d λ_{2} < \infty,

where π*(β₁, β₂, λ₁, λ₂|t, X, δ) is the unnormalized joint posterior density defined in (3.3). After some algebra, we can show that

π^{*} (β_{1}, β_{2}, λ_{1}, λ_{2} | t, X, δ) \leq \prod_{i = 1}^{n} \prod_{k = 1}^{K_{1}} {[λ_{1 k} exp (x_{i}^{'} β_{1}) exp {- exp (x_{i}^{'} β_{1}) λ_{1 k} (t_{i} - s_{1, k - 1})}]}^{ν_{1 ik} I (δ_{i} = 1)} \times λ_{1 k}^{- 1} λ_{1, K_{1} + 1}^{a - 1} exp (- b λ_{1, K_{1} + 1}) \times \prod_{i = 1}^{n} \prod_{k = 1}^{K_{2}} {[λ_{2 k} exp (x_{i}^{'} β_{2}) exp {- exp (x_{i}^{'} β_{2}) λ_{2 k} (t_{i} - s_{2, k - 1})}]}^{ν_{2 ik} I (δ_{i} = 2)} \times λ_{2 k}^{- 1} \equiv π^{* *} (β_{1}, λ_{1} | t, X, δ) \times π^{* *} (β_{2}, λ_{2} | t, X, δ) .

It suffices to show that

\int π^{* *} (β_{1}, λ_{1} | t, X, δ) d β_{1} d λ_{1} < \infty,

since ∫ π**(β₂, λ₂|t, X, δ)dβ₂dλ₂ < ∞ can be proved in a similar way.

Consider the transformation u_j = log(λ_1j), j = 1, 2, … , K₁, and let u = (u₁, u₂, … , u_K₁)′. Then, we have

\int π^{* *} (β_{1}, λ_{1} | t, X, δ) d β_{1} d λ_{1} = \int π^{* *} (β_{1}, u | t, X, δ) d β_{1} d u \times \int λ_{1, K_{1} + 1}^{a - 1} exp (- b λ_{1, K_{1} + 1}) d λ_{1, K_{1} + 1} \propto \int \prod_{i = 1}^{n} \prod_{k = 1}^{K_{1}} {[exp (u_{k} + x_{i}^{'} β_{1}) \times exp {- exp (u_{k} + x_{i}^{'} β_{1}) (t_{i} - s_{1, k - 1})}]}^{ν_{1 ik} I (δ_{i} = 1)} d β_{1} d u .

(B.2)

It is easy to show that $exp (u_{k} + x_{i}^{'} β_{1}) \times exp {- exp (u_{k} + x_{i}^{'} β_{1}) (t_{i} - s_{1, k - 1})} \leq M_{1}$ , where M₁ > 0 is a constant. Under condition (ii), X₁ is of full rank. Thus, there exist distinct i₁, i₂, … , i_K₁+p₁, where p₁ = dim(β₁), such that the (K₁+p₁) × (K₁+p₁) matrix $X_{1}^{*}$ , which has rows $(x_{1 ℓ}^{*})' = (ν_{{ji}_{ℓ} 1}, \dots, ν_{{ji}_{ℓ} K_{j}}, x_{i ℓ}^{'})$ for ℓ = 1, … , K₁+p₁, is of full rank. Let k_{i_ℓ} be an integer such that t_{i_ℓ} ∈ (s_{1,k_iℓ−1}, s_{1k_iℓ}] for ℓ = 1, … , K₁+p₁. We take a one-to-one transformation $ξ_{1} = X_{1}^{*} (u', β_{1}^{'})'$ . Using (B.2), we have

\int π^{* *} (β_{1}, λ_{1} | t, X, δ) d β_{1} d λ_{1} \leq M_{2} \int \prod_{ℓ = 1}^{K_{1} + p_{1}} (exp {(x_{1 ℓ}^{*})' (u', β_{1}^{'})'} exp [- exp {(x_{1 ℓ}^{*})' (u', β_{1}^{'})'} (t_{i_{ℓ}} - s_{1, k_{i_{ℓ}} - 1})]) d u d β_{1} = M_{3} \prod_{ℓ = 1}^{K_{1} + p_{1}} \int_{- \infty}^{\infty} exp (ξ_{1 ℓ}) exp {- exp (ξ_{1 ℓ}) (t_{i_{ℓ}} - s_{1, k_{i_{ℓ}} - 1})} d ξ_{1 ℓ} = M_{3} \prod_{ℓ = 1}^{K_{1} + p_{1}} {(t_{i_{ℓ}} - s_{1, k_{i_{ℓ}} - 1})}^{- 1} < \infty,

where M₂ and M₃ are two positive constants. This completes the proof.
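The one-dimensional integrals in the final step can be evaluated by a change of variable. Writing c for the positive difference between the observed time and the left end of its interval, and ξ for a generic component of ξ₁ (both shorthands introduced here, not notation from the paper), we have:

```latex
\int_{-\infty}^{\infty} e^{\xi} \exp\{ - c \, e^{\xi} \} \, d\xi
  = \int_{0}^{\infty} e^{- c w} \, dw
  = \frac{1}{c},
\qquad w = e^{\xi}, \quad dw = e^{\xi} \, d\xi, \quad c > 0 .
```

Each factor is finite because every selected observed time lies strictly to the right of the left end of its interval, so c > 0.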

Appendix C: Generating η from [η|β₁, λ₁, β₂, λ₂, t,X, δ] and u from [u|β₁, λ₁, λ₂, η, t,X, δ]

Generating η from [η|β₁, λ₁, β₂, λ₂, t, X, δ] When δ_i = 0, we generate η_i by I(η_i = 1) ~ Bin(1, p_i1) and I(η_i = 2) = 1 − I(η_i = 1), where $p_{i 1} = \frac{a_{i}}{a_{i} + b_{i}}$, $a_{i} = exp {- H_{10} (t_{i}) exp (x_{i}^{'} β_{1})} - exp {- H_{10} (\infty) exp (x_{i}^{'} β_{1})}$, and $b_{i} = exp {- H_{20} (t_{i}) exp (x_{i}^{'} β_{2}) - H_{10} (\infty) exp (x_{i}^{'} β_{1})}$.
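The draw of η_i for a censored subject can be sketched as below. The function and argument names are hypothetical; the cumulative baseline hazards H₁₀(t_i), H₁₀(∞), and H₂₀(t_i) and the linear predictors x_i′β₁ and x_i′β₂ are assumed to have been computed already from the current Gibbs state.

```python
import numpy as np

def prob_eta_is_1(H10_t, H10_inf, H20_t, lin_pred1, lin_pred2):
    """p_i1 = a_i / (a_i + b_i) for a subject censored at t_i."""
    e1, e2 = np.exp(lin_pred1), np.exp(lin_pred2)
    a = np.exp(-H10_t * e1) - np.exp(-H10_inf * e1)   # a_i
    b = np.exp(-H20_t * e2 - H10_inf * e1)            # b_i
    return a / (a + b)

def draw_eta(rng, p1):
    """Return 1 with probability p1 and 2 otherwise."""
    return 1 if rng.uniform() < p1 else 2

rng = np.random.default_rng(3)
p = prob_eta_is_1(H10_t=0.2, H10_inf=1.0, H20_t=0.3, lin_pred1=0.1, lin_pred2=-0.2)
eta = draw_eta(rng, p)
```

When H₁₀(t_i) has already reached H₁₀(∞), a_i is zero and the subject is assigned η_i = 2 with probability one.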

Generating u from [u|β₁, λ₁, λ₂, η, t, X, δ] When δ_i = 0 and η_i = 1, we generate u_i from a truncated piecewise exponential distribution f(u_i), where

f (u_{i}) \propto h_{10} (u_{i}) exp (x_{i}^{'} β_{1}) exp {- H_{10} (u_{i}) exp (x_{i}^{'} β_{1})}, u_{i} \geq t_{i} .

Denote δ_ji as the index such that s_{j, δ_ji−1} ≤ t_i < s_{jδ_ji}. Let

\begin{matrix} a_{1 i} & = exp {- H_{10} (t_{i}) exp (x_{i}^{'} β_{1})} - exp {- H_{10} (s_{1 δ_{1 i}}) exp (x_{i}^{'} β_{1})}, \\ a_{ki} & = exp {- H_{10} (s_{1, δ_{1 i} + k - 2}) exp (x_{i}^{'} β_{1})} - exp {- H_{10} (s_{1, δ_{1 i} + k - 1}) exp (x_{i}^{'} β_{1})}, k = 2, \dots, K_{1} - δ_{1 i} + 1 . \\ a_{K_{1} - δ_{1 i} + 2, i} & = exp {- H_{10} (s_{1 K_{1}}) exp (x_{i}^{'} β_{1})} - exp {- H_{10} (\infty) exp (x_{i}^{'} β_{1})} . \end{matrix}

Generate υ_i from a U(0, 1) distribution. If υ_i falls into the $k_{υ}^{th}$ interval such that

\frac{a_{1, i} + \dots + a_{k_{υ} - 1, i}}{a_{1, i} + \dots + a_{K_{1} - δ_{1 i} + 2, i}} < υ_{i} \leq \frac{a_{1, i} + \dots + a_{k_{υ}, i}}{a_{1, i} + \dots + a_{K_{1} - δ_{1 i} + 2, i}},

the inverse distribution function method is used to calculate u_i as

υ_{i} = \frac{a_{1, i} + \dots + a_{k_{υ} - 1, i} + exp {- H_{10} (s_{1, k_{υ} - 1}) exp (x_{i}^{'} β_{1})} - exp {- H_{10} (u_{i}) exp (x_{i}^{'} β_{1})}}{a_{1, i} + \dots + a_{K_{1} - δ_{1 i} + 2, i}} .
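The inverse distribution function step amounts to solving υ_i = [S(t_i) − S(u_i)] / [S(t_i) − S(∞)] for u_i, where S(u) = exp{−H₁₀(u) exp(x_i′β₁)}. The sketch below does this by interpolating a piecewise linear H₁₀. It makes a simplifying assumption that differs from the setup above: H₁₀ is taken to be flat after the last knot, so the tail interval beyond s_{1K₁} carries no mass. The function name, knots, and cumulative hazard values are hypothetical.

```python
import numpy as np

def sample_truncated_time(rng, knots, H_knots, lin_pred, t):
    """Draw u >= t by inverse CDF when the baseline cumulative hazard is
    piecewise linear through (knots, H_knots) and flat after the last knot
    (simplifying assumption). H_knots must be strictly increasing."""
    e = np.exp(lin_pred)
    H_t = np.interp(t, knots, H_knots)        # H_10(t)
    S_t = np.exp(-H_t * e)                    # S(t)
    S_end = np.exp(-H_knots[-1] * e)          # S(infinity) under the assumption
    v = rng.uniform()
    S_u = S_t - v * (S_t - S_end)             # survival value at the drawn time
    H_u = -np.log(S_u) / e                    # cumulative hazard to invert
    return float(np.interp(H_u, H_knots, knots))

rng = np.random.default_rng(7)
knots = np.array([0.0, 1.0, 2.0, 4.0])        # hypothetical grid s_{1k}
H_knots = np.array([0.0, 0.1, 0.4, 1.0])      # hypothetical H_10 at the knots
u_draws = np.array([sample_truncated_time(rng, knots, H_knots, 0.3, t=1.5)
                    for _ in range(1000)])
```

Locating the interval k_υ and then solving within it, as in the text, gives the same draw; interpolation simply performs both steps at once.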

Contributor Information

Miaomiao Ge, Clinical Bio Statistics, Boehringer Ingelheim Pharmaceuticals, Inc., 900 Ridgefield Road, Ridgefield, CT, 06877.

Ming-Hui Chen, Department of Statistics, University of Connecticut, 215 Glenbrook Road, U-4120, Storrs, CT 06269, ming-hui.chen@uconn.edu.

References

Chen MH, Ibrahim JG, Shao QM. Posterior propriety and computation for the Cox regression model with applications to missing covariates. Biometrika. 2006;93:791–807.
Chen MH, Shao QM. Monte Carlo estimation of Bayesian credible and HPD intervals. J Comput Graph Stat. 1999;8:69–92.
Chen MH, Shao QM, Ibrahim JG. Monte Carlo methods in Bayesian computation. New York: Springer-Verlag; 2000.
Choueiri TK, Chen MH, D’Amico AV, Sun L, Nguyen PL, Hayes JH, Robertson CN, Walther PJ, Polascik TJ, Albala DM, Moul JW. Impact of postoperative prostate-specific antigen disease recurrence and the use of salvage therapy on the risk of death. Cancer. 2010;116:1887–1892. doi: 10.1002/cncr.25013.
Clayton DG. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika. 1978;65:141–151.
Cox DR. Regression models and life tables (with discussion). J Roy Stat Society B. 1972;34:187–220.
Cox DR. Partial likelihood. Biometrika. 1975;62:269–276.
Dixon SN, Darlington GA, Desmond AF. A competing risks model for correlated data based on the subdistribution hazard. Lifetime Data Anal. 2011;17:473–495. doi: 10.1007/s10985-011-9198-9.
Elashoff RM, Li G, Li N. An approach to joint analysis of longitudinal measurements and competing risks failure time data. Stat Med. 2007;26:2813–2835. doi: 10.1002/sim.2749.
Elashoff RM, Li G, Li N. A joint model for longitudinal measurements and survival data in the presence of multiple failure types. Biometrics. 2008;64:762–771. doi: 10.1111/j.1541-0420.2007.00952.x.
Fan X. Bayesian nonparametric inference for competing risks data. Unpublished Ph.D. Dissertation, Division of Biostatistics, Medical College of Wisconsin; 2008.
Fine JP, Gray RJ. A proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc. 1999;94:496–509.
Gail M. A review and critique on some models used in competing risk analysis. Biometrics. 1975;31:209–222.
Gaynor JJ, Feuer EJ, Tan CC, Wu DH, Little CR, Strauss DJ, Clarkson BD, Brennan MF. On the use of cause-specific failure and conditional failure probabilities: examples from clinical oncology data. J Am Stat Assoc. 1993;88:400–409.
Gelfand AE, Dey DK. Bayesian model choice: asymptotics and exact calculations. J Roy Stat Society B. 1994;56:501–514.
Gilks WR, Wild P. Adaptive rejection sampling for Gibbs sampling. Appl Stat. 1992;41:337–348.
Gray RJ. A class of K-sample tests for comparing the cumulative incidence of a competing risk. Ann Stat. 1988;16:1141–1154.
Hu W, Li G, Li N. A Bayesian approach to joint analysis of longitudinal measurements and competing risks failure time data. Stat Med. 2009;28:1601–1619. doi: 10.1002/sim.3562.
Huang X, Li G, Elashoff RM, Pan J. A general joint model for longitudinal measurements and competing risks survival data with heterogeneous random effects. Lifetime Data Anal. 2011;17:80–100. doi: 10.1007/s10985-010-9169-6.
Ibrahim JG, Chen MH, Sinha D. Bayesian survival analysis. New York: Springer-Verlag; 2001.
Kalbfleisch JD. Non-parametric Bayesian analysis of survival time data. J Roy Stat Society B. 1978;40:214–221.
Larson MG, Dinse GE. A mixture model for the regression analysis of competing risks data. Appl Stat. 1985;34:201–211.
Liu JS. The collapsed Gibbs sampler in Bayesian computations with applications to a gene regulation problem. J Am Stat Assoc. 1994;89:958–966.
Lu K, Tsiatis AA. Multiple imputation methods for estimating regression coefficients in the competing risks model with missing cause of failure. Biometrics. 2001;57:1191–1197. doi: 10.1111/j.0006-341x.2001.01191.x.
Prentice RL, Kalbfleisch JD, Peterson A, Flournoy N, Farewell V, Breslow N. The analysis of failure times in the presence of competing risks. Biometrics. 1978;34:541–554.
Sinha D, Ibrahim JG, Chen MH. A Bayesian justification of Cox’s partial likelihood. Biometrika. 2003;90:629–641.
Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A. Bayesian measures of model complexity and fit (with discussion). J Roy Stat Society B. 2002;64:583–639.
Tsiatis A. A nonidentifiability aspect of the problem of competing risks. Proc Nat Acad Sci USA. 1975;72:20–22. doi: 10.1073/pnas.72.1.20.
Vaupel JW, Manton KG, Stallard E. The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography. 1979;16:439–454.
Wang X, Sinha A, Yang J, Chen MH. Bayesian inference of interval-censored survival data. In: Chen DG, Sun J, Peace KE, editors. Interval-censored time-to-event data: methods and applications. Boca Raton, FL: Chapman & Hall; 2012. in press.


[R29] Vaupel JW, Manton KG, Stallard E. The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography. 1979;16:439–454. [PubMed] [Google Scholar]

[R30] Wang X, Sinha A, Yang J, Chen MH. Bayesian inference of interval-censored survival data. In: Chen DG, Sun J, Peace KE, editors. Interval-censored time-toevent data: methods and applications. Boca Raton, FL: Chapman & Hall; 2012. in press. [Google Scholar]
