Proportional hazards model for competing risks data with missing cause of failure

Seunggeun Hyun; Jimin Lee; Yanqing Sun

doi:10.1016/j.jspi.2012.02.037

Author manuscript; available in PMC: 2013 Jul 1.

Published in final edited form as: J Stat Plan Inference. 2012 Feb 21;142(7):1767–1779. doi: 10.1016/j.jspi.2012.02.037

PMCID: PMC3314432 NIHMSID: NIHMS360186 PMID: 22468017

Abstract

We consider the semiparametric proportional hazards model for the cause-specific hazard function in analysis of competing risks data with missing cause of failure. The inverse probability weighted equation and augmented inverse probability weighted equation are proposed for estimating the regression parameters in the model, and their theoretical properties are established for inference. Simulation studies demonstrate that the augmented inverse probability weighted estimator is doubly robust and the proposed method is appropriate for practical use. The simulations also compare the proposed estimators with the multiple imputation estimator of Lu and Tsiatis (2001). The application of the proposed method is illustrated using data from a bone marrow transplant study.

Keywords: Asymptotic property, Augmented inverse probability weighted estimator, Cause-specific hazard function, Double robust property, Inverse probability weighted estimator, Missing cause of failure, Multiple imputation

1. Introduction

Competing risks data are commonly encountered in medical studies. Typically the responses to a treatment can be classified in terms of failure from disease of interest or from non-disease-related causes. Hence, in the competing risks framework, each individual is exposed to K distinct types of risks and the eventual failure can be attributed to precisely one of the risks. Let T^* denote the time to failure, Δ^* the cause of failure, and Z a p-dimensional vector of possibly time-dependent covariates. Then a principal estimable quantity in competing risks data is the cause-specific hazard function of cause k, defined, in the absence of censoring, by

λ_{k}^{*} (t ∣ Z) = lim_{h \to 0} h^{- 1} P (t \leq T^{*} < t + h, Δ^{*} = k ∣ T^{*} \geq t, Z),

which is the instantaneous rate of experiencing the event of type k at time t, having not experienced any of the K competing events until time t. Without loss of generality in our study, we consider only two causes of failure, the cause of interest as cause 1 and the other as cause 2 (i.e. Δ^* = 1 or 2). In many applications involving follow-up studies, however, individuals may be subject to censoring. Let C be a censoring time and $T^{*} = min (T_{1}^{*}, T_{2}^{*})$ , where $T_{1}^{*}$ and $T_{2}^{*}$ denote the latent failure times from causes 1 and 2, respectively. Then the observed data consist of observations of (T, Δ, Z), where T = min(T^*, C) and Δ = Δ^*I(T^* ≤ C). If the failure time T^* is observed, Δ is the cause of failure and Δ = 0 otherwise. The observable cause-specific hazard function of cause k in the presence of censoring is given by

λ_{k} (t ∣ Z) = lim_{h \to 0} h^{- 1} P (t \leq T < t + h, Δ = k ∣ T \geq t, Z), k = 1, 2.

Throughout the paper, we assume that Z is an external covariate process (Kalbfleisch and Prentice, 2002) and the censoring time C is conditionally independent of (T^*, Δ^*) given Z (Lu and Tsiatis, 2001; Gao and Tsiatis, 2005; Lu and Liang, 2008). Under this assumption, it can be shown that $λ_{k}^{*} (t ∣ Z) = λ_{k} (t ∣ Z)$ if the distribution of C is continuous at t. A number of statistical models for the relationship between the cause-specific hazard function of interest and regression covariates have been studied, among others, by Benichou and Gail (1990); Prentice et al. (1978); Cheng et al. (1998); Shen and Cheng (1999); Scheike and Zhang (2003). In this article we study the proportional hazards model for describing the relationship,

λ_{1}^{*} (t ∣ Z) = λ_{0} (t) e^{β_{0}^{⊤} Z (t)},

(1)

where λ₀(·) is a nonnegative, but otherwise unspecified baseline hazard function and β₀ is a p-dimensional vector of regression parameters. The parameter β₀ can be consistently estimated by treating all the failure times with Δ ≠ 1 as censored observations and using the partial likelihood score equation proposed by Cox (1972, 1975). The estimator will be called the full-case estimator, denoted by β̂_F in the paper.

In practice, however, the information needed to determine the cause of failure may be lost, or it may be difficult to determine the cause of disease or death for some individuals (Andersen et al., 1996). When some causes are missing, a naive method for estimating the regression parameter β₀ is to simply ignore the observations with missing causes and apply the partial likelihood score equation to the complete cases only. The resulting complete-case estimator, denoted by β̂_C, is generally inefficient and can be seriously biased. For this reason, the analysis of competing risks data with missing cause of failure has received considerable attention and a number of methods have been proposed. Dinse (1982, 1986) considered nonparametric estimation for incomplete cause of death data with no covariates. Goetghebeur and Ryan (1990) proposed a modified log-rank test to compare survival in two groups, Dewanji (1992) suggested a modification of that approach, and Goetghebeur and Ryan (1995) extended the results of Goetghebeur and Ryan (1990) to the proportional hazards regression model. More recently, Lu and Tsiatis (2001) proposed a parametric model for the probability that a missing cause is the cause of interest, allowing the inclusion of additional auxiliary covariates, and then estimated the regression parameters by a multiple imputation method (Rubin, 1987, 1996). Gao and Tsiatis (2005) considered linear transformation models and Lu and Liang (2008) considered the additive hazards model for analysis of competing risks data with missing cause of failure.

For right-censored survival data in which the censoring indicator is missing, Lo (1991) considered the problem of nonparametric maximum likelihood estimation of a survival function in the absence of covariates, and McKeague and Subramanian (1998) developed a survival function estimator assuming that the censoring indicators are missing completely at random. Subramanian (2000) further developed efficient estimation of the regression parameters under proportionality assumptions of the conditional hazards, and Gijbels et al. (2007) proposed a class of estimating functions for the regression parameters of the Cox proportional hazards model, among others.

In this study of analysis of competing risks data with missing cause of failure, we derive two different estimators for the regression parameters in model (1), namely the inverse probability weighted estimator and augmented inverse probability weighted estimator, and establish their theoretical properties. The first approach, following the idea of Horvitz and Thompson (1952), uses the inverse probability weighted complete-case technique to estimate the regression parameter. This approach uses only the complete cases and relies on correct modeling for the probability of missing causes. It has been shown that the inverse probability weighted estimator is inconsistent when the respective parametric model is misspecified, and is inefficient (Gao and Tsiatis, 2005; Lu and Liang, 2008; Scharfstein et al., 1999). It would be desirable, therefore, to obtain improved efficiency over the inverse probability weighted estimator. The second approach, adapting the idea of Robins et al. (1994), augments the inverse probability weighted complete-case estimating equation with a consistent estimator of the conditional distribution of the cause of interest that incorporates information available for individuals whose cause of failure is missing. See Subramanian and Bandyopadhyay (2010) for homogeneous right censored data with missing censoring indicators.

The paper is structured as follows. In Section 2, the inverse probability weighted estimator and augmented inverse probability weighted estimator are developed. The asymptotic properties of the corresponding estimators are established in Section 3. In Section 4, we investigate the finite sample properties of the proposed estimators through simulations, including comparisons with the multiple imputation estimator proposed by Lu and Tsiatis (2001). A bone marrow transplant data set is also analyzed in Section 4. Some conclusions and discussion are given in Section 5. Technical derivations are detailed in the Appendix.

2. Estimating equations

Since the cause of failure may not be observed for some individuals, we define the missingness indicator R as follows. If an individual’s death is observed, then R = 1 when the cause of failure information Δ^* is observed and R = 0 otherwise. If an individual is censored, we always define R = 1. We also introduce auxiliary covariates A which are not of interest for modelling the cause-specific hazard function but may be used to describe the missingness mechanism. The utilization of auxiliary information has been considered by Lu and Tsiatis (2001), Gao and Tsiatis (2005), Lu and Liang (2008), Gilbert, McKeague, and Sun (2008), among others. Then the observed data will consist of

O_{i} = {T_{i}, Z_{i}, A_{i}, R_{i}, I (Δ_{i} = 0), R_{i} I (Δ_{i} = 1), R_{i} I (Δ_{i} = 2)}

for i = 1, …, n. We assume that {O_i, i = 1, …, n} are independent and identically distributed. The possible values are {T_i, Z_i, A_i, 1, 0, 1, 0} for an individual who died from cause 1, {T_i, Z_i, A_i, 1, 0, 0, 1} for an individual who died from cause 2, {T_i, Z_i, A_i, 0, 0, 0, 0} for an individual who died with missing cause, and {T_i, Z_i, A_i, 1, 1, 0, 0} for a censored individual.
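The coding of the observed data can be made concrete with a minimal sketch. The function name `encode_observation` and the flag `cause_recorded` are hypothetical and introduced only for illustration; the covariates Z_i and A_i are omitted because they pass through unchanged.

```python
from typing import Tuple


def encode_observation(t_star: float, cause: int, c: float,
                       cause_recorded: bool = True) -> Tuple[float, int, int, int, int]:
    """Return (T, R, I(Delta=0), R*I(Delta=1), R*I(Delta=2)) for one subject.

    t_star is the latent failure time T*, cause is the true cause Delta* (1 or 2),
    c is the censoring time C, and cause_recorded (a hypothetical flag) says
    whether the cause would be recorded if the failure is observed.
    """
    if cause not in (1, 2):
        raise ValueError("cause must be 1 or 2")
    t = min(t_star, c)                        # T = min(T*, C)
    delta = cause if t_star <= c else 0       # Delta = Delta* I(T* <= C)
    # Censored subjects always have R = 1; failures have R = 1 only if the cause is recorded.
    r = 1 if (delta == 0 or cause_recorded) else 0
    return (t, r, int(delta == 0), r * int(delta == 1), r * int(delta == 2))
```

Under these assumptions, the four return patterns correspond to the four possible values of O_i listed above (apart from Z_i and A_i).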

We also assume that the cause of failure is missing at random (MAR) (Rubin, 1976); that is, the probability that the cause of failure is missing given Δ > 0 and W = (T, Z, A) depends only on the observed W, but not on the unobserved Δ,

P (R = 1 ∣ Δ, Δ > 0, W) = P (R = 1 ∣ Δ > 0, W) .

(2)

The assumption implies that

\begin{array}{l} P (R = 1 ∣ Δ > 0, W) = P (R = 1, Δ = 1 ∣ Δ > 0, W) + P (R = 1, Δ = 2 ∣ Δ > 0, W) \\ = P (R = 1 ∣ Δ = 1, Δ > 0, W) P (Δ = 1 ∣ Δ > 0, W) + P (R = 1 ∣ Δ = 2, Δ > 0, W) P (Δ = 2 ∣ Δ > 0, W) \\ = P (R = 1 ∣ Δ = 1, Δ > 0, W) [P (Δ = 1 ∣ Δ > 0, W) + P (Δ = 2 ∣ Δ > 0, W)] \\ = P (R = 1 ∣ Δ = 1, Δ > 0, W), \end{array}

and likewise for the case involving Δ = 2. See also Lu and Tsiatis (2001), Gao and Tsiatis (2005), and Lu and Liang (2008).

2.1. Inverse probability weighted estimator

Following the inverse selection probability idea of Horvitz and Thompson (1952), the method of inversely weighting the probability of complete-case has been commonly used in missing data problems. To do that, we need to estimate the probability of a complete case, π(Q) ≡ P (R = 1|Q), where Q = (W, Δ). By the MAR assumption and R = 1 when Δ = 0, we have

\begin{array}{l} π (Q) = P (R = 1 ∣ W, Δ) \\ = P (R = 1 ∣ W, Δ, Δ > 0) \cdot I (Δ > 0) + P (R = 1 ∣ W, Δ, Δ = 0) \cdot I (Δ = 0) \\ = P (R = 1 ∣ W, Δ, Δ > 0) \cdot I (Δ > 0) + 1 \cdot I (Δ = 0) \\ = P (R = 1 ∣ W, Δ > 0) \cdot I (Δ > 0) + I (Δ = 0) \\ = r (W) \cdot I (Δ > 0) + I (Δ = 0), \end{array}

(3)

where r(W) = P (R = 1|W, Δ > 0). We assume that the complete-case probability r(W_i) can be specified through a parametric model r(W_i, ψ₀) with a few unknown parameters ψ₀. Accordingly, let π(Q_i, ψ₀) = r(W_i, ψ₀)I(Δ_i > 0) + I(Δ_i = 0). Since R is binary, one can posit the logistic model $logit {r (W_{i}, ψ)} = W_{i}^{⊤} ψ$ , though other parametric models can also be used. By (2) and (3), the likelihood L with respect to π(Q, ψ₀) is

\begin{array}{l} L (π) = \prod π {(Q_{i})}^{I (R_{i} = 1)} {(1 - π (Q_{i}))}^{I (R_{i} = 0)} \\ = \prod π {(Q_{i})}^{I (R_{i} = 1) I (Δ_{i} > 0)} \cdot π {(Q_{i})}^{I (R_{i} = 1) I (Δ_{i} = 0)} \cdot {(1 - π (Q_{i}))}^{I (R_{i} = 0) I (Δ_{i} > 0)} \\ = \prod r {(W_{i})}^{I (R_{i} = 1) I (Δ_{i} > 0)} \cdot 1 \cdot {(1 - r (W_{i}))}^{I (R_{i} = 0) I (Δ_{i} > 0)}, \\ because π (Q_{i}) = r (W_{i}) when Δ_{i} > 0, and π (Q_{i}) = 1 when Δ_{i} = 0 \\ = \prod r {(W_{i})}^{I (R_{i} = 1) I (Δ_{i} > 0)} \cdot {(1 - r (W_{i}))}^{(1 - R_{i}) I (Δ_{i} > 0)} . \end{array}

This implies that the maximum likelihood estimator ψ̂ of ψ can be obtained by maximizing the likelihood based on uncensored data

\prod_{i = 1}^{n} \{r (W_{i}, ψ)\}^{R_{i} I (Δ_{i} > 0)} \{1 - r (W_{i}, ψ)\}^{(1 - R_{i}) I (Δ_{i} > 0)} .

It is known that for a correctly specified model r(W_i, ψ), ψ̂ consistently estimates ψ₀, the true value of the parametric component of r(W_i, ψ) (Haberman, 1974, 1977; Gourieroux and Monfort, 1981).
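The maximization above is an ordinary logistic regression restricted to observed failures. The following is a minimal sketch using Newton-Raphson in plain Python; the function names, the `failed` argument (holding I(Δ_i > 0)) and the fixed iteration count are assumptions made for illustration, not part of the paper.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small systems)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[piv] = m[piv], m[col]
        for k in range(col + 1, n):
            f = m[k][col] / m[col][col]
            for j in range(col, n + 1):
                m[k][j] -= f * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_missingness_model(w, r, failed, iters=25):
    """Newton-Raphson MLE of psi in logit r(W, psi) = W^T psi.

    w: list of covariate vectors W_i (include a leading 1.0 for an intercept);
    r: complete-case indicators R_i; failed: indicators I(Delta_i > 0).
    Only failures contribute, as in the likelihood displayed above.
    """
    rows = [(wi, ri) for wi, ri, fi in zip(w, r, failed) if fi]
    p = len(rows[0][0])
    psi = [0.0] * p
    for _ in range(iters):
        score = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for wi, ri in rows:
            pr = expit(sum(a * b for a, b in zip(wi, psi)))
            for j in range(p):
                score[j] += (ri - pr) * wi[j]
                for k in range(p):
                    info[j][k] += pr * (1.0 - pr) * wi[j] * wi[k]
        step = solve_linear(info, score)
        psi = [a + b for a, b in zip(psi, step)]
    return psi
```

The same routine could be reused for γ in Section 2.2 by passing I(Δ_i = 1) as the response and restricting to complete cases with Δ_i > 0. The sketch does no step-halving and will not converge under perfect separation.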

We define the counting process N_i(t) = I(Δ_i = 1)I(T_i ≤ t) and at-risk process Y_i(t) = I(T_i ≥ t). Let a^⊗0 = 1, a^⊗1 = a, and a^⊗2 = aa^⊤ for a vector a. Let

\begin{array}{c} {\tilde{S}}^{(m)} (t, β, ψ) = \frac{1}{n} \sum_{i = 1}^{n} \frac{R_{i}}{π (Q_{i}, ψ)} Y_{i} (t) e^{β^{⊤} Z_{i} (t)} Z_{i} {(t)}^{\otimes m}, \\ \tilde{Z} (t, β, ψ) = {\tilde{S}}^{(1)} (t, β, ψ) / {\tilde{S}}^{(0)} (t, β, ψ), \\ \tilde{V} (t, β, ψ) = \frac{{\tilde{S}}^{(2)} (t, β, ψ)}{{\tilde{S}}^{(0)} (t, β, ψ)} - \tilde{Z} {(t, β, ψ)}^{\otimes 2} \end{array}

for m = 0, 1, 2. Then we consider the following inverse probability weighted estimating equation for β₀:

U_{I} (β, \hat{ψ}) = \sum_{i = 1}^{n} \int_{0}^{τ} \frac{R_{i}}{π (Q_{i}, \hat{ψ})} (Z_{i} (t) - \tilde{Z} (t, β, \hat{ψ})) {d N}_{i} (t),

(4)

where τ > 0 is the end of follow-up time. The inverse probability weighted (IPW) estimator of β solves the above equation and is denoted by β̂_I. When there is no missing cause, equation (4) reduces to the partial likelihood score equation proposed by Cox (1972, 1975). The cumulative baseline hazard function $Λ_{0} (t) = \int_{0}^{t} λ_{0} (u) d u$ can be estimated by

{\hat{Λ}}_{I 0} (t) = \sum_{i = 1}^{n} \int_{0}^{t} \frac{R_{i}}{π (Q_{i}, \hat{ψ})} \frac{{d N}_{i} (u)}{n {\tilde{S}}^{(0)} (u, {\hat{β}}_{I}, \hat{ψ})} .
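For a scalar, time-fixed covariate, estimating equation (4) can be evaluated and solved directly. The sketch below is illustrative only: the function names and the toy data are hypothetical, the weights π(Q_i, ψ̂) are taken as given, and bisection is used because the weighted score is non-increasing in β when all weights are nonnegative.

```python
import math


def ipw_score(beta, t, z, cause1, r, pi):
    """U_I(beta) for a scalar, time-fixed covariate.

    t: observed times; z: covariate values; cause1[i] = 1 if subject i is known
    to have failed from cause 1; r: complete-case indicators R_i;
    pi: fitted complete-case probabilities (equal to 1 for censored subjects).
    """
    n = len(t)
    w = [r[i] / pi[i] for i in range(n)]            # inverse probability weights
    total = 0.0
    for i in range(n):
        if w[i] == 0.0 or not cause1[i]:
            continue                                 # N_i jumps only at cause-1 failures
        s0 = s1 = 0.0
        for j in range(n):
            if t[j] >= t[i]:                         # at-risk indicator Y_j(T_i)
                e = w[j] * math.exp(beta * z[j])
                s0 += e
                s1 += e * z[j]
        total += w[i] * (z[i] - s1 / s0)
    return total


def solve_ipw(t, z, cause1, r, pi, lo=-20.0, hi=20.0, iters=200):
    """Bisection root of U_I on [lo, hi]."""
    f_lo = ipw_score(lo, t, z, cause1, r, pi)
    f_hi = ipw_score(hi, t, z, cause1, r, pi)
    if not (f_lo > 0.0 > f_hi):
        raise ValueError("score does not change sign on [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ipw_score(mid, t, z, cause1, r, pi) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical toy data: subject 4 failed with a missing cause, subject 6 is censored.
T = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Z = [0.2, 0.9, 0.5, 0.1, 0.8, 0.4]
CAUSE1 = [1, 1, 0, 0, 1, 0]
R = [1, 1, 1, 0, 1, 1]
PI = [0.8, 0.6, 0.7, 0.7, 0.9, 1.0]
```

Calling `solve_ipw(T, Z, CAUSE1, R, PI)` returns the root of the weighted score for the toy data. Setting every weight to 1 gives the usual Cox partial likelihood score, in line with the remark above about the case with no missing causes.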

2.2. Augmented inverse probability weighted estimator

The inverse probability weighted estimator β̂_I uses only complete cases. Thus, it is inefficient. In addition, its consistency relies on correct modelling of the probability r(W_i, ψ₀). To improve the robustness and efficiency over β̂_I, we adapt the idea of Robins et al. (1994) and propose to augment the inverse probability weighted estimating equation with a consistent estimator of the conditional distribution of the cause of interest that utilizes available information for individuals with missing cause of failure.

Consequently, we estimate the probability that the cause of failure is the cause of interest ρ(W) ≡ P (Δ = 1| Δ > 0, W). Let f(t, Δ = 1| Δ > 0, z, a) be the conditional joint density of T and Δ = 1 given (Δ > 0, Z = z, A = a). Then for w = (t, z, a),

\begin{array}{l} ρ (w) = \frac{f (t, Δ = 1 ∣ Δ > 0, z, a)}{f (t ∣ Δ > 0, z, a)} \\ = \frac{f (t, Δ = 1 ∣ z, a)}{f (t, Δ = 1 ∣ z, a) + f (t, Δ = 2 ∣ z, a)} \\ = \frac{{\tilde{λ}}_{1} (t ∣ z, a)}{{\tilde{λ}}_{1} (t ∣ z, a) + {\tilde{λ}}_{2} (t ∣ z, a)}, \end{array}

where λ̃_k(t|z, a) is the conditional cause-specific hazard function of T at t due to cause Δ = k given (Z, A) = (z, a) for k = 1, 2. Here, instead of directly estimating ρ(W_i) which requires the estimation of two unknown cause-specific hazard functions, we posit a parametric model ρ(W_i, γ₀) for ρ(W_i) in terms of a few unknown parameters γ₀. It is natural to use a logistic regression model $logit {ρ (W_{i}, γ)} = W_{i}^{⊤} γ$ , but other parametric models can also be accommodated.
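The link between the hazard ratio form of ρ(w) and a logistic model can be checked numerically. In the sketch below the two hazards are hypothetical log-linear functions of w chosen only for illustration; when log λ̃₁ − log λ̃₂ is linear in w, the logistic model with the matching coefficients reproduces ρ(w) exactly.

```python
import math


def rho_from_hazards(lam1, lam2):
    """rho(w) = lam1 / (lam1 + lam2), the last line of the display above."""
    return lam1 / (lam1 + lam2)


def rho_logistic(w, gamma):
    """rho(w, gamma) under logit{rho(w, gamma)} = w^T gamma."""
    eta = sum(a * b for a, b in zip(w, gamma))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical hazards: log lam1 = 0.3 - 0.5 z and log lam2 = -0.2 + 0.1 t.
t, z = 1.4, 0.6
lam1 = math.exp(0.3 - 0.5 * z)
lam2 = math.exp(-0.2 + 0.1 * t)
# Matching logistic coefficients for w = (1, t, z): the difference of the log hazards.
gamma = [0.3 - (-0.2), -0.1, -0.5]
```

Here `rho_from_hazards(lam1, lam2)` and `rho_logistic([1.0, t, z], gamma)` agree up to rounding. When the log hazard ratio is not linear in w, a linear logistic model is only a working approximation.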

There is, however, an issue with obtaining estimates for γ in the presence of missingness. The MAR assumption implies that given Δ > 0 and W, R is independent of Δ; that is,

P (Δ = 1 ∣ Δ > 0, W) = P (Δ = 1 ∣ R, Δ > 0, W) .

(5)

By (5), ρ(W_i) can be deduced from the complete cases with R_i = 1 and Δ_i > 0. This suggests that the maximum likelihood estimator γ̂ of γ can be obtained by maximizing the likelihood based on complete-case data

\prod_{i = 1}^{n} \{ρ (W_{i}, γ)\}^{R_{i} I (Δ_{i} = 1)} \{1 - ρ (W_{i}, γ)\}^{R_{i} I (Δ_{i} = 2)} .

Since γ̂ is the maximum likelihood estimator, then for a correctly specified model ρ(W_i, γ), γ̂ consistently estimates γ₀, the true value of the parametric component model ρ(W_i, γ) (Haberman, 1974, 1977; Gourieroux and Monfort, 1981).

Now, we define the counting process $N_{i}^{*} (t) = I (Δ_{i} > 0) I (T_{i} \leq t)$ . Let

\begin{matrix} S^{(m)} (t, β) = \frac{1}{n} \sum_{i = 1}^{n} Y_{i} (t) e^{β^{⊤} Z_{i} (t)} Z_{i} {(t)}^{\otimes m}, \\ \bar{Z} (t, β) = S^{(1)} (t, β) / S^{(0)} (t, β), \\ V (t, β) = \frac{S^{(2)} (t, β)}{S^{(0)} (t, β)} - \bar{Z} {(t, β)}^{\otimes 2} . \end{matrix}

Then we propose the following augmented inverse probability weighted estimating equation

U_{A} (β, \hat{ψ}, \hat{γ}) = \sum_{i = 1}^{n} \int_{0}^{τ} (Z_{i} (t) - \bar{Z} (t, β)) [\frac{R_{i}}{r (W_{i}, \hat{ψ})} {d N}_{i} (t) - \frac{R_{i} - r (W_{i}, \hat{ψ})}{r (W_{i}, \hat{ψ})} ρ (W_{i}, \hat{γ}) {d N}_{i}^{*} (t)],

(6)

where ψ̂ and γ̂ are the maximum likelihood estimators defined earlier. The augmented inverse probability weighted estimator (AIPW) of β solves the above equation (6) and is denoted by β̂_A. The cumulative baseline hazard function can be estimated by

{\hat{Λ}}_{A 0} (t) = \sum_{i = 1}^{n} \int_{0}^{t} \frac{1}{{n S}^{(0)} (u, {\hat{β}}_{A})} [\frac{R_{i}}{r (W_{i}, \hat{ψ})} {d N}_{i} (u) - \frac{R_{i} - r (W_{i}, \hat{ψ})}{r (W_{i}, \hat{ψ})} ρ (W_{i}, \hat{γ}) {d N}_{i}^{*} (u)] .
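The bracketed term in (6) is a single jump per observed failure, which makes the score easy to evaluate for a scalar, time-fixed covariate. The sketch below is a hedged illustration: the function names are hypothetical, the fitted values r(W_i, ψ̂) and ρ(W_i, γ̂) are taken as given, and a missing cause is coded as `None`.

```python
import math


def aipw_increment(r_i, cause, r_hat, rho_hat):
    """Jump of the bracketed term in (6) at an observed failure.

    cause is 1 or 2 when recorded and None when missing (r_i = 0).
    """
    known1 = 1.0 if (r_i == 1 and cause == 1) else 0.0   # R_i dN_i
    return r_i / r_hat * known1 - (r_i - r_hat) / r_hat * rho_hat


def aipw_score(beta, t, z, failed, r, cause, r_hat, rho_hat):
    """U_A(beta) for a scalar, time-fixed covariate.

    failed[i] = I(Delta_i > 0); the risk-set averages are unweighted,
    as in S^(m)(t, beta).
    """
    n = len(t)
    total = 0.0
    for i in range(n):
        if not failed[i]:
            continue                     # censored: neither N_i nor N_i^* jumps
        s0 = s1 = 0.0
        for j in range(n):
            if t[j] >= t[i]:             # at-risk indicator Y_j(T_i)
                e = math.exp(beta * z[j])
                s0 += e
                s1 += e * z[j]
        total += (z[i] - s1 / s0) * aipw_increment(r[i], cause[i], r_hat[i], rho_hat[i])
    return total
```

A failure with a missing cause contributes ρ(W_i, γ̂), while a complete case contributes its inverse weighted indicator minus a correction. Because these jumps can be negative, the score is not guaranteed to be monotone in β, so a root finder should check that a sign change is bracketed.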

3. Asymptotic results

When the model for r(W_i) is correctly specified, we let ψ₀ be the true value of ψ such that r(W_i) = r(W_i, ψ₀). Under Condition (A.4) stated in the Appendix, $\hat{ψ} \overset{P}{\to} ψ_{0}$ . When the model for ρ(W_i) is correctly specified, we let γ₀ be the true value of γ such that ρ(W_i) = ρ(W_i, γ₀). In this case, $\hat{γ} \overset{P}{\to} γ_{0}$ . In general, under Condition (A.4), there exist ψ^* and γ^* such that $\hat{ψ} \overset{P}{\to} ψ^{*}$ and $\hat{γ} \overset{P}{\to} γ^{*}$ (White, 1982). We have ψ^* = ψ₀ if r(W_i) is correctly specified, and γ^* = γ₀ if ρ(W_i) is correctly specified.

Let $s^{(m)} (t, β) = E [Y_{1} (t) e^{β^{⊤} Z_{1} (t)} Z_{1} {(t)}^{\otimes m}]$ for m = 0, 1, 2, $\bar{z} (t, β) = s^{(1)} (t, β) / s^{(0)} (t, β)$ , and $v (t, β) = s^{(2)} (t, β) / s^{(0)} (t, β) - \bar{z} {(t, β)}^{\otimes 2}$ .

Theorem 1

Assume Condition A given in the Appendix. If r(W_i, ψ₀) is correctly specified for r(W_i), then ${\hat{β}}_{I} \overset{P}{\to} β_{0}$ and $\sqrt{n} ({\hat{β}}_{I} - β_{0})$ converges in distribution to a zero-mean Gaussian random vector with covariance matrix $Σ^{- 1} E [ω_{i} ω_{i}^{⊤}] Σ^{- 1}$ , where

\begin{matrix} Σ = \int_{0}^{τ} v (t, β_{0}) λ_{0} (t) s^{(0)} (t, β_{0}) d t, \\ ω_{i} = \int_{0}^{τ} \{Z_{i} (t) - \bar{z} (t, β_{0})\} \frac{R_{i}}{π (Q_{i}, ψ_{0})} d M_{i} (t) - V_{ψ} I_{ψ}^{- 1} S_{ψ i}, \end{matrix}

$M_{i} (t) = N_{i} (t) - \int_{0}^{t} Y_{i} (u) e^{β_{0}^{⊤} Z_{i} (u)} d Λ_{0} (u)$ , V_ψ is given in (8), I_ψ and S_ψi are given in (11) in the Appendix.

The asymptotic covariance matrix $Σ^{- 1} E (ω_{i} ω_{i}^{⊤}) Σ^{- 1}$ can be consistently estimated by

{\hat{Σ}}_{I}^{- 1} (n^{- 1} \sum_{i = 1}^{n} {\hat{ω}}_{i} {\hat{ω}}_{i}^{⊤}) {\hat{Σ}}_{I}^{- 1},

where

\begin{matrix} {\hat{Σ}}_{I} = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{τ} \frac{R_{i}}{π (Q_{i}, \hat{ψ})} \tilde{V} (t, {\hat{β}}_{I}, \hat{ψ}) d N_{i} (t), \\ {\hat{ω}}_{i} = \int_{0}^{τ} \frac{R_{i}}{π (Q_{i}, \hat{ψ})} \{Z_{i} (t) - \tilde{Z} (t, {\hat{β}}_{I}, \hat{ψ})\} d {\hat{M}}_{i} (t) - {\hat{V}}_{ψ} {\hat{I}}_{ψ}^{- 1} {\hat{S}}_{ψ i}, \end{matrix}

and ${\hat{M}}_{i} (t) = N_{i} (t) - \int_{0}^{t} Y_{i} (u) e^{{\hat{β}}_{I}^{⊤} Z_{i} (u)} d {\hat{Λ}}_{I 0} (u)$ . Here V̂_ψ, Î_ψ and Ŝ_ψi are obtained by replacing with their respective sample estimators and substituting (β̂_I, ψ̂) for (β₀, ψ₀) in V_ψ, I_ψ, and S_ψi.
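For a single covariate (p = 1) the sandwich formula reduces to scalar arithmetic. The sketch below assumes the estimated information and the estimated influence terms ω̂_i have already been computed; the function name and arguments are hypothetical, and the components V̂_ψ, Î_ψ and Ŝ_ψi from the Appendix are not reproduced here.

```python
import math


def sandwich_se(sigma_hat, omega_hat):
    """Standard error of a scalar estimator from the sandwich formula.

    sigma_hat: the (already normalised by n) estimated information;
    omega_hat: list of estimated influence terms, one per subject.
    """
    n = len(omega_hat)
    meat = sum(o * o for o in omega_hat) / n     # n^{-1} sum of squared influence terms
    avar = meat / (sigma_hat ** 2)               # asymptotic variance of sqrt(n)(beta_hat - beta_0)
    return math.sqrt(avar / n)                   # standard error of beta_hat itself
```

For example, `sandwich_se(2.0, [1.0, -1.0, 1.0, -1.0])` gives 0.25: the middle term is 1, the asymptotic variance is 1/4, and dividing by n = 4 before taking the square root yields 0.25. For p > 1 the same steps apply with matrix inverses and outer products.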

The following establishes the asymptotic properties of β̂_A.

Theorem 2

Assume Condition A given in the Appendix. If at least one of r(W_i, ψ₀) and ρ(W_i, γ₀) is correctly specified for r(W_i) and ρ(W_i), then ${\hat{β}}_{A} \overset{P}{\to} β_{0}$ and $\sqrt{n} ({\hat{β}}_{A} - β_{0})$ converges in distribution to a zero-mean Gaussian random vector with covariance matrix $Σ^{- 1} E [φ_{i} φ_{i}^{⊤}] Σ^{- 1}$ , where

Σ = \int_{0}^{τ} v (t, β_{0}) λ_{0} (t) s^{(0)} (t, β_{0}) d t,

and

φ_{i} = \int_{0}^{τ} \{Z_{i} (t) - \bar{z} (t, β_{0})\} d M_{i}^{*} (t) - P_{ψ} {(I_{ψ}^{*})}^{- 1} S_{ψ i}^{*} - P_{γ} {(I_{γ}^{*})}^{- 1} S_{γ i}^{*} .

Here $M_{i}^{*} (t)$ is defined in (17), and P_ψ, $I_{ψ}^{*}, S_{ψ i}^{*}$ , P_γ, $I_{γ}^{*}$ and $S_{γ i}^{*}$ are given in (15) and (19) in the Appendix.

It is interesting to note that if r(W_i, ψ₀) = r(W_i), then P_γ = 0, and if ρ(W_i, γ₀) = ρ(W_i), then P_ψ = 0. When both the models for r(W_i) and ρ(W_i) are correctly specified, we have P_ψ = 0 and P_γ = 0 and hence, $φ_{i} = \int_{0}^{τ} \{Z_{i} (t) - \bar{z} (t, β_{0})\} d M_{i}^{*} (t)$ .
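The double robustness in Theorem 2 can be seen from the conditional mean of the jump in (6). Under MAR, R and Δ are independent given (Δ > 0, W), so E[R I(Δ = 1)] = rρ and E[R] = r, and the mean of the jump equals ρ + (r − r̃)(ρ − ρ̃)/r̃, where r̃ and ρ̃ denote the limits of the working models. The short check below evaluates this mean numerically; the function name and the numerical values are hypothetical.

```python
def expected_increment(r_true, rho_true, r_work, rho_work):
    """Conditional mean of R/r_work * I(Delta=1) - (R - r_work)/r_work * rho_work.

    The expectation is taken given (Delta > 0, W) under MAR, using
    E[R I(Delta=1)] = r_true * rho_true and E[R] = r_true.
    """
    return r_true * rho_true / r_work - (r_true - r_work) / r_work * rho_work


# The target is rho_true = 0.3, the mean of the full-data jump I(Delta = 1).
r_correct = expected_increment(0.8, 0.3, 0.8, 0.9)    # r correct, rho wrong
rho_correct = expected_increment(0.8, 0.3, 0.5, 0.3)  # r wrong, rho correct
both_wrong = expected_increment(0.8, 0.3, 0.5, 0.9)   # both working models wrong
```

With either working model correct the mean equals the target 0.3, while with both wrong it is −0.06 in this example. This only illustrates the first-order unbiasedness of the estimating function; the formal result rests on the conditions in the Appendix.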

The asymptotic covariance matrix $Σ^{- 1} E (φ_{i} φ_{i}^{⊤}) Σ^{- 1}$ can be consistently estimated by

{\hat{Σ}}_{A}^{- 1} (n^{- 1} \sum_{i = 1}^{n} {\hat{φ}}_{i} {\hat{φ}}_{i}^{⊤}) {\hat{Σ}}_{A}^{- 1},

where

\begin{matrix} {\hat{Σ}}_{A} = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{τ} V (t, {\hat{β}}_{A}) [\frac{R_{i}}{r (W_{i}, \hat{ψ})} d N_{i} (t) - \frac{R_{i} - r (W_{i}, \hat{ψ})}{r (W_{i}, \hat{ψ})} ρ (W_{i}, \hat{γ}) d N_{i}^{*} (t)], \\ {\hat{φ}}_{i} = \int_{0}^{τ} \{Z_{i} (t) - \bar{Z} (t, {\hat{β}}_{A})\} d {\hat{M}}_{i}^{*} (t) - {\hat{P}}_{ψ} {({\hat{I}}_{ψ}^{*})}^{- 1} {\hat{S}}_{ψ i}^{*} - {\hat{P}}_{γ} {({\hat{I}}_{γ}^{*})}^{- 1} {\hat{S}}_{γ i}^{*}, \end{matrix}

and

{\hat{M}}_{i}^{*} (t) = \frac{R_{i}}{r (W_{i}, \hat{ψ})} N_{i} (t) - \frac{R_{i} - r (W_{i}, \hat{ψ})}{r (W_{i}, \hat{ψ})} ρ (W_{i}, \hat{γ}) N_{i}^{*} (t) - \int_{0}^{t} Y_{i} (u) e^{{\hat{β}}_{A}^{⊤} Z_{i} (u)} d {\hat{Λ}}_{A 0} (u) .

Here P̂_ψ, ${\hat{I}}_{ψ}^{*}, {\hat{S}}_{ψ i}^{*}$ , P̂_γ, ${\hat{I}}_{γ}^{*}$ and ${\hat{S}}_{γ i}^{*}$ are the empirical counterparts of P_ψ, $I_{ψ}^{*}, S_{ψ i}^{*}$ , P_γ, $I_{γ}^{*}$ and $S_{γ i}^{*}$ given in (15) and (19) in the Appendix, obtained by replacing population quantities with their sample counterparts and substituting (β̂_A, ψ̂, γ̂) for (β₀, ψ^*, γ^*).

4. Numerical results

4.1. Simulation studies

We present simulation studies conducted to evaluate the performance of our proposed methods. We set τ = 2.0 and consider a univariate covariate Z, where Z follows a uniform distribution on [0, 1]. Given Z, the latent failure time $T_{1}^{*}$ of interest is generated from the proportional hazards model $λ_{1}^{*} (t ∣ Z) = λ e^{β Z}$ , where λ = 1 and β = −0.5. The other latent failure time $T_{2}^{*}$ is generated from a Gompertz distribution with a hazard function $λ_{2}^{*} (t ∣ Z) = e^{θ + ν t}$ , where θ = −0.5 and ν = 0.2. The censoring time C is generated from an exponential distribution which yields about 20% censoring level. We consider a single auxiliary covariate A which follows a Bernoulli distribution with success probability of 0.5. We also consider a logistic regression model logit{r(W, ψ)} = ψ₁ + ψ₂T + ψ₃Z + ψ₄A for missing cause of failure. We have about 20% missingness with ψ = (0.7, 1, −1, 1) and about 45% missingness with ψ = (−0.8, 1, −1, 1). In the settings we consider here, the true model ρ(W) is given by a logistic regression model logit{ρ(W)} = −θ − νT +βZ. To study the performance of the estimators when r(W) is misspecified, we posit two different parametric models of r(W, ψ), where one is a correctly specified logistic model and the other is a misspecified constant model r₀ ∈ (0, 1) independent of W. To study the behavior when ρ(W) is misspecified, we consider various model specifications. We posit a correctly specified logistic model logit{ρ(W, γ)} = γ₁ + γ₂T + γ₃Z (Model 1), a misspecified logistic model logit{ρ(W, γ)} = γ₁ + γ₂Z (Model 2), a misspecified logistic model logit{ρ(W, γ)} = γ₁ + γ₂T (Model 3) and a misspecified constant model ρ₀ ∈ (0, 1) independent of W (Model 4). The simulation studies consist of 1000 runs with the sample size n = 200 and n = 400. We also conduct comparison with the multiple imputation estimators, studied by Lu and Tsiatis (2001), with the number of imputation m = 1 and m = 5.
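One replicate of the data-generating mechanism described above can be sketched as follows. The parameter values β, θ, ν and ψ come from the text; the censoring rate `cens_rate`, the seed and the function names are hypothetical (the paper only states that censoring is exponential with about 20% censoring), and the sketch does not truncate follow-up at τ.

```python
import math
import random


def gompertz_time(u, theta, nu):
    """Inverse-transform draw from the hazard exp(theta + nu * t); u lies in (0, 1]."""
    # Cumulative hazard H(t) = exp(theta) * (exp(nu * t) - 1) / nu; solve H(T) = -log(u).
    return math.log(1.0 - nu * math.log(u) / math.exp(theta)) / nu


def simulate(n, psi, beta=-0.5, theta=-0.5, nu=0.2, cens_rate=0.3, seed=1):
    """Return a list of (T, Z, A, Delta, R); Delta is kept even when R = 0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.random()                                   # Z ~ Uniform[0, 1]
        a = 1 if rng.random() < 0.5 else 0                 # A ~ Bernoulli(0.5)
        t1 = rng.expovariate(math.exp(beta * z))           # hazard exp(beta * Z), lambda = 1
        t2 = gompertz_time(1.0 - rng.random(), theta, nu)  # Gompertz latent time
        c = rng.expovariate(cens_rate)                     # hypothetical censoring rate
        t_star = min(t1, t2)
        t = min(t_star, c)
        delta = 0 if c < t_star else (1 if t1 <= t2 else 2)
        if delta == 0:
            r = 1                                          # censored subjects have R = 1
        else:
            eta = psi[0] + psi[1] * t + psi[2] * z + psi[3] * a
            p_obs = 1.0 / (1.0 + math.exp(-eta))           # r(W, psi)
            r = 1 if rng.random() < p_obs else 0
        data.append((t, z, a, delta, r))
    return data
```

For instance, `simulate(200, (0.7, 1.0, -1.0, 1.0))` generates one data set under the first missingness setting. In an analysis the stored Δ would be masked whenever R = 0, and `cens_rate` would need tuning to reach the stated censoring level.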

Table 1 and Table 2 show that the complete-case estimator β̂_C has large biases in all the settings. When the parametric model for r(W) is correctly specified, both the IPW estimator β̂_I and the AIPW estimator β̂_A show small biases, but the AIPW estimator has smaller standard errors than the corresponding IPW estimator. When the parametric model for ρ(W) is correctly specified, the multiple imputation estimator has small biases, but it tends to have larger biases when ρ(W) is misspecified. As expected, the AIPW estimator is not sensitive to misspecification when only one of the parametric models for r(W) and ρ(W) is misspecified. In fact, the AIPW estimator performs quite well even when both the parametric models r(W) and ρ(W) are misspecified. The standard errors of the multiple imputation estimator decrease as the number of imputations increases. The standard errors of the multiple imputation estimator and the AIPW estimator are comparable when ρ(W) is correctly specified. However, the multiple imputation estimator seems to have smaller standard errors than the AIPW estimator under misspecified ρ(W). The estimated standard errors of the AIPW estimator are close to the sample standard errors, and the 95% confidence intervals have reasonable coverage probabilities.

Table 1.

Summary statistics of simulation results with 20% missingness under various model specifications of ρ(·).

	n = 200				n = 400
Estimator	Bias	SSE	MSE	CP	Bias	SSE	MSE	CP
β̂_C	−0.136	0.442	0.447	96.0	−0.129	0.322	0.313	92.6
β̂_Ic	−0.007	0.422	0.426	96.4	−0.002	0.313	0.299	93.9
β̂_Im	−0.107	0.430	0.430	96.1	−0.182	0.392	0.385	92.2
	(Model 1)				(Model 1)
β̂_Ac	−0.005	0.406	0.416	96.2	0.002	0.301	0.292	94.4
β̂_Am	−0.003	0.403	0.411	96.5	0.003	0.351	0.345	95.1
β̂_M¹	−0.003	0.427	0.437	96.3	0.003	0.313	0.306	94.3
β̂_M⁵	−0.005	0.406	0.419	96.6	0.005	0.302	0.294	94.4
	(Model 2)				(Model 2)
β̂_Ac	−0.005	0.406	0.416	96.1	0.002	0.301	0.292	94.4
β̂_Am	−0.007	0.403	0.412	96.4	0.001	0.297	0.290	94.3
β̂_M¹	−0.012	0.428	0.438	96.2	−0.003	0.314	0.307	94.3
β̂_M⁵	−0.011	0.408	0.420	96.6	−0.001	0.303	0.295	94.0
	(Model 3)				(Model 3)
β̂_Ac	−0.004	0.406	0.416	96.2	0.002	0.300	0.292	94.5
β̂_Am	0.005	0.395	0.405	96.7	0.010	0.292	0.286	94.5
β̂_M¹	0.061	0.370	0.383	96.3	0.066	0.273	0.269	93.8
β̂_M⁵	0.057	0.347	0.364	96.2	0.067	0.260	0.255	93.8
	(Model 4)				(Model 4)
β̂_Ac	−0.005	0.406	0.416	96.2	0.002	0.301	0.292	94.5
β̂_Am	−0.001	0.397	0.406	96.7	0.009	0.341	0.336	95.2
β̂_M¹	0.056	0.367	0.383	96.5	0.062	0.273	0.269	94.1
β̂_M⁵	0.054	0.348	0.364	96.5	0.064	0.261	0.256	93.5

Bias, the mean of the estimates of β minus the true value; SSE, the sample standard error of the estimates of β; MSE, the mean of the standard error estimates; CP, the empirical coverage probability of the corresponding 95% confidence intervals; β̂_C, the complete-case estimator; β̂_Ic and β̂_Im, the IPW estimators; β̂_Ac and β̂_Am, the AIPW estimators; β̂_M¹ and β̂_M⁵, the multiple imputation estimators with m = 1 and m = 5 imputations. Here c denotes the correctly specified model and m the misspecified model for r(·), respectively.
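The four summary columns can be computed from the per-replicate estimates and standard error estimates. The sketch below rests on stated assumptions: Bias is taken as the average estimate minus the true β, SSE as the sample standard deviation of the estimates with divisor m − 1, and the confidence intervals as Wald intervals with critical value 1.96. The function name is hypothetical.

```python
import math


def summarize(estimates, std_errors, true_beta, z_crit=1.96):
    """Return Bias, SSE, MSE (mean of standard error estimates) and CP (in percent)."""
    m = len(estimates)
    mean_est = sum(estimates) / m
    bias = mean_est - true_beta
    sse = math.sqrt(sum((b - mean_est) ** 2 for b in estimates) / (m - 1))
    mean_se = sum(std_errors) / m
    covered = sum(1 for b, s in zip(estimates, std_errors)
                  if b - z_crit * s <= true_beta <= b + z_crit * s)
    return {"Bias": bias, "SSE": sse, "MSE": mean_se, "CP": 100.0 * covered / m}
```

A well-calibrated standard error estimator should give an MSE column close to the SSE column and a CP close to 95.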

Table 2.

Summary statistics of simulation results with 45% missingness under various model specifications of ρ(·).

	n = 200				n = 400
Estimator	Bias	SSE	MSE	CP	Bias	SSE	MSE	CP
β̂_C	−0.322	0.595	0.599	94.0	−0.305	0.420	0.416	88.1
β̂_Ic	−0.014	0.583	0.554	93.7	0.002	0.411	0.394	94.7
β̂_Im	−0.194	0.561	0.552	94.6	−0.182	0.392	0.385	92.2
	(Model 1)				(Model 1)
β̂_Ac	−0.022	0.540	0.523	94.3	0.001	0.380	0.365	93.4
β̂_Am	−0.011	0.493	0.494	95.8	0.003	0.351	0.345	95.1
β̂_M¹	−0.007	0.534	0.539	96.2	−0.006	0.377	0.376	95.2
β̂_M⁵	−0.012	0.502	0.509	96.6	0.001	0.356	0.354	95.1
	(Model 2)				(Model 2)
β̂_Ac	−0.021	0.539	0.523	94.0	0.001	0.379	0.365	93.6
β̂_Am	−0.017	0.495	0.496	95.9	−0.003	0.352	0.347	94.8
β̂_M¹	−0.022	0.538	0.541	96.5	−0.017	0.377	0.377	95.2
β̂_M⁵	−0.025	0.506	0.510	96.7	−0.010	0.356	0.355	94.8
	(Model 3)				(Model 3)
β̂_Ac	−0.020	0.538	0.524	94.9	0.002	0.379	0.365	94.0
β̂_Am	0.011	0.463	0.471	96.0	0.017	0.338	0.333	95.1
β̂_M¹	0.137	0.368	0.383	94.6	0.133	0.266	0.268	92.3
β̂_M⁵	0.130	0.325	0.340	94.6	0.139	0.237	0.238	91.1
	(Model 4)				(Model 4)
β̂_Ac	−0.020	0.536	0.525	94.6	0.002	0.379	0.365	94.2
β̂_Am	0.004	0.469	0.476	96.2	0.009	0.341	0.336	95.2
β̂_M¹	0.131	0.371	0.384	95.2	0.127	0.268	0.269	92.5
β̂_M⁵	0.126	0.327	0.340	94.7	0.134	0.239	0.239	91.5

To further study robustness of parameter estimates against misspecification of the parametric models for r(W) and ρ(W), we consider the same model for $T_{1}^{*}$ as described above, but here instead we generate $T_{2}^{*}$ from a Weibull distribution, log-logistic distribution, exponential power distribution and gamma distribution. None of these distributions will induce a simple linear logistic regression model for ρ(W). For example, when $T_{2}^{*}$ is generated from a log-logistic distribution with a hazard function $λ_{2}^{*} (t ∣ Z) = α λ t^{α - 1} / (1 + λ t^{α})$ , the true logistic model for ρ(W) is $logit {ρ (W)} = - \log (α) - \log (λ) + β Z + \log ((1 + λ T^{α}) / T^{α - 1})$ . In all cases we misspecify ρ(W) by Model 1 to Model 4 described above. Although not presented here, the findings from these simulations are similar to those from Table 1 and Table 2.
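The log-logistic expression for logit{ρ(W)} can be verified numerically from the two hazards. In the sketch below the values of α, λ, T and Z are hypothetical, and the baseline hazard of $T_{1}^{*}$ is taken to be 1 as in the simulation design.

```python
import math


def logit_rho_direct(t, z, beta, alpha, lam):
    """logit of lam1 / (lam1 + lam2), computed directly from the two hazards."""
    lam1 = math.exp(beta * z)                                     # cause-1 hazard, baseline 1
    lam2 = alpha * lam * t ** (alpha - 1.0) / (1.0 + lam * t ** alpha)
    rho = lam1 / (lam1 + lam2)
    return math.log(rho / (1.0 - rho))


def logit_rho_formula(t, z, beta, alpha, lam):
    """Closed form stated in the text for the log-logistic case."""
    return (-math.log(alpha) - math.log(lam) + beta * z
            + math.log((1.0 + lam * t ** alpha) / t ** (alpha - 1.0)))
```

The two functions agree up to rounding for any t > 0, which confirms that the true logit is nonlinear in T and so cannot be matched by Models 1 to 4.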

In conclusion, the multiple imputation estimator of Lu and Tsiatis (2001) and the AIPW estimator have similar performance when the parametric models for r(W) and ρ(W) are correctly specified. The AIPW estimator has the advantage of double robustness such that the biases of the AIPW estimator remain small when the parametric model for r(W) or ρ(W) is misspecified. The bias of the multiple imputation estimator can be larger under misspecified ρ(W).

4.2. Bone marrow transplant data

Sierra et al. (2002) described the characteristics and outcomes of 452 patients with primary myelodysplasia (MDS) who received transplants from HLA-identical siblings and were registered with the International Bone Marrow Transplant Registry (IBMTR). The study has two competing risks: treatment-related death, defined as death in complete remission, and relapse, defined as recurrence of myelodysplasia. In this example, we consider 408 patients with complete covariate information obtained from the timereg package for R. Among these 408 patients, 161 patients died in complete remission, 87 patients relapsed, and 160 patients were censored. The covariates considered in our study are age of patient standardized at mean of 35 years old and platelet count before transplantation (1 for more than 100 × 10⁹ per L, or 0 for less). In the data set, the causes of failure are all known. For illustration purposes, we delete some failure causes according to the following three missingness mechanisms: missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR).

For the MCAR mechanism, the causes of failure are randomly selected to be missing with probability 23%. For the MAR mechanism, the logistic model is chosen as logit{r(W)} = 0.5 + 1.0 * T − 1.0 * age, which yields about 23% missing causes, where T is the failure time. For the NMAR mechanism, the logistic model is chosen as logit{r(W)} = 0.5 + 1.0 * T − 1.0 * age − 0.5 * I(Δ = 1), which yields about 26% missing causes, where Δ = 1 corresponds to death in complete remission and Δ = 2 to relapse. We posit the logistic models for both r(W) and ρ(W) with logit{r(W, ψ)} = ψ₁ + ψ₂ * T + ψ₃ * age + ψ₄ * platelet and logit{ρ(W, γ)} = γ₁ + γ₂ * log T + γ₃ * age + γ₄ * platelet.
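The three deletion mechanisms differ only in what the probability of observing the cause depends on. The sketch below encodes the probabilities stated above; the function name is hypothetical, and `t` and `age` are assumed to be on the same scales as in the analysis data.

```python
import math


def prob_cause_observed(mechanism, t, age, delta):
    """P(R = 1) for a failure under the MCAR, MAR or NMAR deletion mechanism."""
    if mechanism == "MCAR":
        return 0.77                       # cause deleted with probability 23%
    if mechanism not in ("MAR", "NMAR"):
        raise ValueError("unknown mechanism")
    eta = 0.5 + 1.0 * t - 1.0 * age
    if mechanism == "NMAR":
        eta -= 0.5 * (1 if delta == 1 else 0)   # depends on the cause itself
    return 1.0 / (1.0 + math.exp(-eta))
```

Only the NMAR probability depends on Δ, which is what violates assumption (2): under NMAR a death in complete remission is less likely to have its cause recorded than a relapse with the same T and age.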

The results of the estimation of β based on the AIPW estimator, the complete-case estimator and the multiple imputation estimator with m = 5 imputations are summarized in Table 3. For comparison, Table 3 also includes the estimate of β based on the original data without artificially deleted causes, namely, the full-case estimator. The results from the AIPW estimator and the multiple imputation estimator are very close under all the missingness mechanisms, and they are closer to the full-case estimator than the complete-case estimator is. The analyses using the AIPW estimator and the multiple imputation estimator are consistent with the findings from the earlier study; that is, patients with high platelet counts have a lower risk of treatment-related mortality than those with low platelet counts, and a higher risk is seen among the older patients.

Table 3.

Estimation of the effects of platelet and age for the bone marrow transplant data.

		Platelet			Age
Missing	Estimator	Est.	SEE	p-value	Est.	SEE	p-value
None	β̂_F	−0.586	0.186	0.002	0.367	0.087	<0.001
MCAR	β̂_C	−0.491	0.236	0.037	0.397	0.112	<0.001
	β̂_A	−0.520	0.203	0.010	0.387	0.091	<0.001
	β̂_M⁵	−0.530	0.205	0.010	0.380	0.096	<0.001
MAR	β̂_C	−0.504	0.238	0.034	0.241	0.107	0.024
	β̂_A	−0.566	0.204	0.006	0.380	0.090	<0.001
	β̂_M⁵	−0.614	0.205	0.003	0.380	0.093	<0.001
NMAR	β̂_C	−0.390	0.253	0.123	0.156	0.114	0.171
	β̂_A	−0.529	0.220	0.016	0.350	0.098	<0.001
	β̂_M⁵	−0.567	0.219	0.010	0.344	0.099	<0.001

Est., the estimate; SEE, the standard error estimate; p-value pertaining to testing no covariate effect; β̂_F, the full-case estimator with no missing causes; β̂_C, the complete-case estimator; β̂_A, the AIPW estimator; β̂_M⁵, the multiple imputation estimator with m = 5 imputations.
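The p-values in Table 3 appear to be consistent with a two-sided Wald test based on Est./SEE, although the text does not state the test explicitly, so this is an assumption. A minimal sketch that recomputes two of the entries from the rounded table values (the function name is illustrative):

```python
import math

def wald_p_value(estimate, std_err):
    """Two-sided p-value for H0: coefficient = 0, using a standard normal reference."""
    z = estimate / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Platelet effect, full-case estimator and complete-case estimator under NMAR.
p_full_platelet = wald_p_value(-0.586, 0.186)
p_cc_nmar_platelet = wald_p_value(-0.390, 0.253)
```

Because the table entries are rounded to three decimals, the recomputed values can only be expected to agree with the reported ones up to rounding.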

5. Conclusion

We propose the inverse probability weighted estimator and the augmented inverse probability weighted estimator for the analysis of competing risks data with missing cause of failure, where the Cox proportional hazards model is used to examine the covariate effects on the cause-specific hazard function for the failure type of interest. The augmented inverse probability weighted estimator possesses the double robustness property: it is unbiased as long as one of the parametric models for r(W) and ρ(W) is correctly specified. The inverse probability weighted estimator is unbiased only when the parametric model for r(W) is correctly specified. Under correctly specified models for r(W) and ρ(W), the augmented inverse probability weighted estimator is more efficient than the inverse probability weighted estimator.
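The double robustness property can be checked numerically on the weight that the augmented estimating equation attaches to each observed failure, R/r · I(Δ = 1) − (R − r)/r · ρ. The sketch below is an illustration at a single fixed value of W, assuming that R and the cause are independent given W among failures (the missing at random assumption); the function names and the numerical values are hypothetical.

```python
import numpy as np

def aipw_weight(R, cause1, r_model, rho_model):
    """AIPW replacement for the cause-1 indicator of a subject who failed."""
    # The cause indicator is only used when it is observed (R = 1).
    observed = np.where(R == 1, cause1, 0.0)
    return R * observed / r_model - (R - r_model) / r_model * rho_model

def expected_weight(r_true, rho_true, r_model, rho_model):
    """Exact mean of the weight over R and the cause, which are independent under MAR."""
    total = 0.0
    for R, p_R in ((1, r_true), (0, 1.0 - r_true)):
        for cause1, p_c in ((1, rho_true), (0, 1.0 - rho_true)):
            total += p_R * p_c * aipw_weight(R, cause1, r_model, rho_model)
    return float(total)

# True values r = 0.8 and rho = 0.6; the target is rho = 0.6.
only_r_correct = expected_weight(0.8, 0.6, r_model=0.8, rho_model=0.3)
only_rho_correct = expected_weight(0.8, 0.6, r_model=0.5, rho_model=0.6)
both_wrong = expected_weight(0.8, 0.6, r_model=0.5, rho_model=0.3)
```

In this toy calculation the mean of the weight recovers the true cause-1 probability when either working model is correct, and it is off only when both are misspecified.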

The proposed estimators are compared with the multiple imputation estimator of Lu and Tsiatis (2001) through simulations. The multiple imputation estimator and augmented inverse probability weighted estimator have similar performance when the parametric models for r(W) and ρ(W) are correctly specified. The augmented inverse probability weighted estimator has the advantage of the double robustness over the multiple imputation estimator.

In the competing risks problem, another useful quantity is the cumulative incidence function, which is the probability of occurrence by time t of a particular type of failure in the presence of other risks. It is known that the covariate effect on the cause-specific hazard for a particular type of failure can be quite different from its effect on the cumulative incidence function of that type of failure (Gray, 1988; Gaynor et al., 1993). Fine and Gray (1999) developed a direct Cox regression approach for the cumulative incidence curve based on earlier work by Gray (1988) and Pepe (1991). Recently, Andersen et al. (2003) and Klein and Andersen (2005) suggested a pseudo-observation approach for direct modeling of cumulative incidence probabilities. It would be interesting to study models that relate the covariates directly to the cumulative incidence function for the failure type of interest in competing risks data with missing cause of failure.
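To make the cumulative incidence function concrete, the following sketch computes a standard nonparametric (Aalen-Johansen-type) estimate of P(T ≤ t, cause = k). It is not part of the proposed method: it assumes no covariates and fully observed causes, and the function name and toy data are illustrative.

```python
import numpy as np

def cumulative_incidence(time, cause, k):
    """Nonparametric cumulative incidence of cause k; cause 0 denotes censoring."""
    time = np.asarray(time, dtype=float)
    cause = np.asarray(cause)
    event_times = np.unique(time[cause > 0])
    surv = 1.0   # all-cause survival just before the current event time
    cif = 0.0
    values = []
    for t in event_times:
        at_risk = np.sum(time >= t)
        d_all = np.sum((time == t) & (cause > 0))
        d_k = np.sum((time == t) & (cause == k))
        cif += surv * d_k / at_risk       # probability of failing from cause k at t
        surv *= 1.0 - d_all / at_risk     # update all-cause survival
        values.append(cif)
    return event_times, np.array(values)

# Toy data: four subjects, the last one censored.
times, cif1 = cumulative_incidence([1, 2, 3, 4], [1, 2, 1, 0], k=1)
```

Without censoring, the final value of the estimate reduces to the observed fraction of failures from cause k.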

Acknowledgments

The research of Yanqing Sun was partially supported by NSF DMS-0905777 and NIH grant R37 AI054165-09. The authors thank the reviewers for their constructive comments that have improved the paper.

Appendix

Condition A

(A.1)
λ₀(t) is continuous on [0, τ]. The distribution of C is continuous on [0, τ] and P(C > τ) > 0. The covariate processes Z_i(t) have paths that are left continuous and of bounded variation, and satisfy the moment condition E[||Z_i(t)||⁴ exp(2M||Z_i(t)||)] < ∞, where M is a constant such that β ∈ [−M, M]^p and ||A|| = max_{k,l} |a_kl| for a matrix A = (a_kl).
(A.2)
Each component of s^{(j)}(t, β) is continuous on [0, τ] × [−M, M]^p for M > 0, j = 0, 1, 2, and s^{(0)}(t, β) > 0 on [0, τ] × [−M, M]^p. Moreover, sup_{t∈[0,τ], β∈[−M,M]^p} ||S^{(j)}(t, β) − s^{(j)}(t, β)|| = O_p(n^−1/2) and sup_{ψ∈[−L,L]^q} sup_{t∈[0,τ], β∈[−M,M]^p} ||S̃^{(j)}(t, β, ψ) − s^{(j)}(t, β)|| = O_p(n^−1/2) for j = 0, 1, 2.
(A.3)
The matrix $Σ = \int_{0}^{τ} v (t, β_{0}) λ_{0} (t) s^{(0)} (t, β_{0}) d t$ is positive definite.
(A.4)
There is a σ > 0 such that r(W_i) ≥ σ for all i with Δ_i > 0. Both r(W_i, ψ) and ρ(W_i, γ) are twice continuously differentiable with respect to ψ and γ, respectively. There exist ψ^* and γ^* satisfying the equations $E S_{ψ i}^{*} = 0$ and $E S_{γ i}^{*} = 0$, respectively, where $S_{ψ i}^{*}$ and $S_{γ i}^{*}$ are the corresponding score functions for r(W_i, ψ) and ρ(W_i, γ) given in (19). The information matrices $I_{ψ}^{*}$ and $I_{γ}^{*}$, also given in (19), are positive definite.

Proof of Theorem 1

Consistency of β̂_I

Let ψ₀ be the true value of ψ such that r(W_i) = r(W_i, ψ₀) under the correctly specified model for r(W_i). Then $\hat{ψ} \overset{P}{\to} ψ_{0}$ . Let

ξ_{n} (β, ψ_{0}) = n^{- 1} \sum_{i = 1}^{n} \int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β)) \frac{R_{i}}{π (Q_{i}, ψ_{0})} N_{i} (d t) .

When r(W_i, ψ₀) is the correct model for r(W_i), z̃(t, β, ψ₀) = z̄(t, β), where z̃(t, β, ψ₀) is the limit of Z̃(t, β, ψ₀). Under the conditions of Theorem 1, n⁻¹ U_I (β, ψ̂) = ξ_n(β, ψ₀) + O_p(n^−1/2) uniformly in β ∈ [−M, M]^p for M > 0.

By application of the Glivenko–Cantelli and Donsker theorems, ξ_n(β, ψ₀) = ξ(β, ψ₀) + O_p(n^−1/2) uniformly in β ∈ [−M, M]^p, where

ξ (β, ψ_{0}) = E [\int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β)) \frac{R_{i}}{π (Q_{i}, ψ_{0})} N_{i} (d t)] .

When r(W_i, ψ₀) is the correct model for r(W_i), $ξ (β, ψ_{0}) = E [\int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β)) N_{i} (d t)]$ by the double expectation formula E[·] = E{E[·|W_i, Δ_i, Δ_i > 0]} and the missing at random assumption (2). Hence

n^{- 1} U_{I} (β, \hat{ψ}) \overset{P}{\to} E [\int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β)) N_{i} (d t)]

uniformly in β ∈ [−M, M]^p for M > 0.

Let ℱ_t = σ{N_i(s), Y_i(s), Z_i(s), 0 ≤ s ≤ t, i = 1, …, n} be the σ-field generated by the observed information on the failure/censoring times and covariates up to time t. Then the intensity of the counting process N_i(t) is given by E{N_i(dt) | ℱ_{t−}} = λ₁(t|Z_i(t))Y_i(t)dt. Under the independent censoring assumption stated in the introduction, $λ_{1} (t ∣ Z_{i} (t)) = λ_{1}^{*} (t ∣ Z_{i} (t))$. Hence $M_{i} (t) = N_{i} (t) - \int_{0}^{t} Y_{i} (u) λ_{1} (u ∣ Z_{i} (u)) d u = N_{i} (t) - \int_{0}^{t} Y_{i} (u) e^{β_{0}^{⊤} Z_{i} (u)} λ_{0} (u) d u$ is a martingale with respect to ℱ_t. Since $E [\int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β_{0})) Y_{i} (t) e^{β_{0}^{⊤} Z_{i} (t)} λ_{0} (t) d t] = 0$, it follows that $ξ (β_{0}, ψ_{0}) = E [\int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β_{0})) M_{i} (d t)] = 0$. By Condition (A.3), β₀ is the unique solution to ξ(β, ψ₀) = 0. We have ${\hat{β}}_{I} \overset{P}{\to} β_{0}$ by Theorem 5.9 of van der Vaart (1998).

Asymptotic Normality of β̂_I

Since U_I (β̂_I, ψ̂) = 0, $n^{- 1} U_{I} (β_{0}, \hat{ψ}) = {- \frac{1}{n} \frac{\partial U_{I} (β, \hat{ψ})}{\partial β} |}_{β = \tilde{β}} ({\hat{β}}_{I} - β_{0})$, where β̃ is on the line segment between β̂_I and β₀. We have

\sqrt{n} ({\hat{β}}_{I} - β_{0}) = {({- \frac{1}{n} \frac{\partial U_{I} (β, \hat{ψ})}{\partial β} |}_{β = \tilde{β}})}^{- 1} \frac{1}{\sqrt{n}} U_{I} (β_{0}, \hat{ψ}) .

By (A.4), $\hat{ψ} \overset{P}{\to} ψ_{0}$ under correctly specified model for r(W_i). By Condition (A.2) and the consistency of β̂_I, $\tilde{V} (t, \tilde{β}, \hat{ψ}) \overset{P}{\to} v (t, β_{0})$ uniformly in t ∈ (0, τ]. We have

\begin{array}{l} {- \frac{1}{n} \frac{\partial U_{I} (β, \hat{ψ})}{\partial β} |}_{β = \tilde{β}} = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{τ} \tilde{V} (t, \tilde{β}, \hat{ψ}) \frac{R_{i}}{r (W_{i}, \hat{ψ})} d N_{i} (t) \overset{P}{\to} E [\int_{0}^{τ} v (t, β_{0}) \frac{R_{1}}{r (W_{1}, ψ_{0})} d N_{1} (t)] \\ = Σ . \end{array}

Thus

\sqrt{n} ({\hat{β}}_{I} - β_{0}) = {(Σ + o_{p} (1))}^{- 1} n^{- 1 / 2} U_{I} (β_{0}, \hat{ψ}) .

(7)

Let π̇(·, ψ) = ∂π(·, ψ)/∂ψ, ṙ(·, ψ) = ∂r(·,ψ)/∂ψ, r̈(·,ψ) = ∂ṙ(·, ψ)/∂ψ^⊤, and

\begin{matrix} H^{(1)} (t, β, ψ) = \frac{1}{n} \sum_{i = 1}^{n} Y_{i} (t) e^{β^{⊤} Z_{i} (t)} \frac{R_{i} {\dot{π}}^{⊤} (Q_{i}, ψ)}{π^{2} (Q_{i}, ψ)}, \\ H^{(2)} (t, β, ψ) = \frac{1}{n} \sum_{i = 1}^{n} Y_{i} (t) e^{β^{⊤} Z_{i} (t)} Z_{i} (t) \frac{R_{i} {\dot{π}}^{⊤} (Q_{i}, ψ)}{π^{2} (Q_{i}, ψ)}, \\ h^{(1)} (t, β, ψ) = E [Y_{1} (t) e^{β^{⊤} Z_{1} (t)} \frac{R_{1} {\dot{π}}^{⊤} (Q_{1}, ψ)}{π^{2} (Q_{1}, ψ)}], \\ h^{(2)} (t, β, ψ) = E [Y_{1} (t) e^{β^{⊤} Z_{1} (t)} Z_{1} (t) \frac{R_{1} {\dot{π}}^{⊤} (Q_{1}, ψ)}{π^{2} (Q_{1}, ψ)}] . \end{matrix}

By the Taylor expansion of U_I (β₀, ψ̂) around ψ₀,

n^{- 1 / 2} U_{I} (β_{0}, \hat{ψ}) = n^{- 1 / 2} U_{I} (β_{0}, ψ_{0}) + {n^{- 1} \frac{\partial U_{I} (β_{0}, ψ)}{\partial ψ} |}_{ψ = ψ_{0}} \sqrt{n} (\hat{ψ} - ψ_{0}) + o_{p} (1) .

Note that

{n^{- 1} \frac{\partial U_{I} (β_{0}, ψ)}{\partial ψ} |}_{ψ = ψ_{0}} = V_{ψ} + o_{p} (1),

where

V_{ψ} = E [\begin{array}{l} \int_{0}^{τ} (Z_{1} (t) - \bar{z} (t, β_{0})) \frac{R_{1} {\dot{π}}^{⊤} (Q_{1}, ψ_{0})}{π^{2} (Q_{1}, ψ_{0})} {d N}_{1} (t) - (\frac{h^{(2)} (t, β_{0}, ψ_{0})}{{\tilde{S}}^{(0)} (t, β_{0}, ψ_{0})} - \frac{{\tilde{S}}^{(1)} (t, β_{0}, ψ_{0}) h^{(1)} (t, β_{0}, ψ_{0})}{{({\tilde{S}}^{(0)} (t, β_{0}, ψ_{0}))}^{2}}) \frac{R_{1}}{π (Q_{1}, ψ_{0})} {d N}_{1} (t) \end{array}] .

(8)

It follows that

n^{- 1 / 2} U_{I} (β_{0}, \hat{ψ}) = n^{- 1 / 2} U_{I} (β_{0}, ψ_{0}) + V_{ψ} \sqrt{n} (\hat{ψ} - ψ_{0}) + o_{p} (1) .

(9)

Since

\sum_{i = 1}^{n} \int_{0}^{τ} \frac{R_{i}}{π (Q_{i}, ψ_{0})} (Z_{i} (t) - \tilde{Z} (t, β_{0}, ψ_{0})) Y_{i} (t) e^{β_{0}^{T} Z_{i} (t)} d Λ_{0} (t) = 0,

we have

n^{- 1 / 2} U_{I} (β_{0}, ψ_{0}) = n^{- 1 / 2} \sum_{i = 1}^{n} \int_{0}^{τ} \frac{R_{i}}{π (Q_{i}, ψ_{0})} (Z_{i} (t) - \tilde{Z} (t, β_{0}, ψ_{0})) {d M}_{i} (t) .

Under correctly specified r(W_i), E[R_i{π(Q_i, ψ₀)}⁻¹M_i(t)] = 0. By Glivenko–Cantelli and Donsker theorems, $n^{- 1 / 2} \sum_{i = 1}^{n} R_{i} {π (Q_{i}, ψ_{0})}^{- 1} M_{i} (t)$ converges weakly to a mean zero Gaussian process and $\tilde{Z} (t, β_{0}, ψ_{0}) \overset{P}{\to} \bar{z} (t, β_{0})$ uniformly in t under the correctly specified model for r(W_i). Applying Lemma 2 of Gilbert et al. (2008), we have

\int_{0}^{τ} \{\tilde{Z} (t, β_{0}, ψ_{0}) - \bar{z} (t, β_{0})\} d \{n^{- 1 / 2} \sum_{i = 1}^{n} \frac{R_{i}}{π (Q_{i}, ψ_{0})} M_{i} (t)\} \overset{P}{\to} 0.

Hence

n^{- 1 / 2} U_{I} (β_{0}, ψ_{0}) = n^{- 1 / 2} \sum_{i = 1}^{n} \int_{0}^{τ} \frac{R_{i}}{π (Q_{i}, ψ_{0})} (Z_{i} (t) - \bar{z} (t, β_{0})) d M_{i} (t) + o_{p} (1) .

(10)

Let S_ψi and I_ψ be the score vector and the information matrix under the parametric model r(W_i, ψ), respectively. Specifically,

\begin{array}{l} S_{ψ i} = \frac{I (Δ_{i} > 0) \{R_{i} - r (W_{i}, ψ_{0})\} \dot{r} (W_{i}, ψ_{0})}{r (W_{i}, ψ_{0}) \{1 - r (W_{i}, ψ_{0})\}}, \\ I_{ψ} = E [S_{ψ 1} S_{ψ 1}^{⊤} - \frac{I (Δ_{1} > 0) \{R_{1} - r (W_{1}, ψ_{0})\} \ddot{r} (W_{1}, ψ_{0})}{r (W_{1}, ψ_{0}) \{1 - r (W_{1}, ψ_{0})\}}] . \end{array}

(11)

Under (A.4),

\hat{ψ} - ψ_{0} = n^{- 1} \sum_{i = 1}^{n} I_{ψ}^{- 1} S_{ψ i} + o_{p} (n^{- 1 / 2}) .

(12)
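For the logistic working model used in Section 4.2, r(W, ψ) = expit(ψ^⊤ x) gives ṙ = r(1 − r)x, so the score in (11) reduces to I(Δ > 0){R − r(W, ψ)}x. The sketch below solves the resulting score equation by Fisher scoring. It is an illustration only: the design matrix `X` is assumed to contain the subjects with Δ > 0, and the function name and toy data are hypothetical.

```python
import numpy as np

def fit_logistic(X, R, n_iter=25):
    """Maximum likelihood for logit r = X @ psi via Fisher scoring."""
    X = np.asarray(X, dtype=float)
    R = np.asarray(R, dtype=float)
    psi = np.zeros(X.shape[1])
    for _ in range(n_iter):
        r = 1.0 / (1.0 + np.exp(-X @ psi))
        score = X.T @ (R - r)                       # sum of (R_i - r_i) x_i
        info = (X * (r * (1.0 - r))[:, None]).T @ X  # sum of r_i (1 - r_i) x_i x_i^T
        psi = psi + np.linalg.solve(info, score)
    return psi

# Toy data: intercept plus one binary covariate, three failures per group.
X = np.array([[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]])
R = np.array([1, 0, 0, 1, 1, 0])
psi_hat = fit_logistic(X, R)
```

In this saturated toy example the fitted probabilities equal the group proportions of observed causes, which gives a simple check on the solver.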

By (9), (10) and (12), we have

n^{- 1 / 2} U_{I} (β_{0}, \hat{ψ}) = n^{- 1 / 2} \sum_{i = 1}^{n} ω_{i} + o_{p} (1),

(13)

where

ω_{i} = \int_{0}^{τ} \{Z_{i} (t) - \bar{z} (t, β_{0})\} \frac{R_{i}}{π (Q_{i}, ψ_{0})} d M_{i} (t) - V_{ψ} I_{ψ}^{- 1} S_{ψ i} .

By the central limit theorem, n^−1/2U_I (β₀, ψ̂) converges in distribution to a normal random vector with zero-mean and covariance matrix $E [ω_{1} ω_{1}^{T}]$ .

It follows by (7) and (13) that $\sqrt{n} ({\hat{β}}_{I} - β_{0})$ converges in distribution to a zero-mean Gaussian random vector with covariance matrix $Σ^{- 1} E [ω_{1} ω_{1}^{T}] Σ^{- 1}$.
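The sandwich form of the limiting covariance suggests a plug-in variance estimate. The sketch below assumes that an estimate `Sigma_hat` of Σ and an n × p array `omega_hat` of estimated influence terms ω_i are already available; how these are computed is not shown, and the names are hypothetical.

```python
import numpy as np

def sandwich_covariance(Sigma_hat, omega_hat):
    """Plug-in estimate of the covariance matrix of beta_hat (not of sqrt(n) * beta_hat)."""
    omega_hat = np.asarray(omega_hat, dtype=float)
    n = omega_hat.shape[0]
    meat = omega_hat.T @ omega_hat / n          # empirical E[omega omega^T]
    bread = np.linalg.inv(np.asarray(Sigma_hat, dtype=float))
    return bread @ meat @ bread.T / n

# Toy inputs with p = 2 and n = 4.
Sigma_hat = 2.0 * np.eye(2)
omega_hat = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]])
cov_beta = sandwich_covariance(Sigma_hat, omega_hat)
se_beta = np.sqrt(np.diag(cov_beta))
```

The square roots of the diagonal entries play the role of the standard error estimates (SEE) reported in the tables, under the assumption that a plug-in estimator of this kind is used.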

Proof of Theorem 2

Consistency of β̂_A

Under (A.4), there exist ψ^* and γ^* such that $\hat{ψ} \overset{P}{\to} ψ^{*}$ and $\hat{γ} \overset{P}{\to} γ^{*}$ . Let

η_{n} (β, ψ^{*}, γ^{*}) = n^{- 1} \sum_{i = 1}^{n} \int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β)) [\frac{R_{i}}{r (W_{i}, ψ^{*})} {d N}_{i} (t) - \frac{R_{i} - r (W_{i}, ψ^{*})}{r (W_{i}, ψ^{*})} ρ (W_{i}, γ^{*}) {d N}_{i}^{*} (t)] .

Under Condition A, n⁻¹U_A(β, ψ̂, γ̂) = η_n(β, ψ^*, γ^*) + O_p(n^−1/2) uniformly in β ∈ [−M, M]^p for M > 0.

By application of the Glivenko–Cantelli and Donsker theorems, η_n(β, ψ^*, γ^*) = η(β, ψ^*, γ^*) + O_p(n^−1/2) uniformly in β ∈ [−M, M]^p, where

η (β, ψ^{*}, γ^{*}) = E \{\int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β)) [\frac{R_{i}}{r (W_{i}, ψ^{*})} d N_{i} (t) - \frac{R_{i} - r (W_{i}, ψ^{*})}{r (W_{i}, ψ^{*})} ρ (W_{i}, γ^{*}) d N_{i}^{*} (t)]\} .

If at least one of r(W_i, ψ₀) and ρ(W_i, γ₀) is correctly specified for r(W_i) and ρ(W_i), then $η (β, ψ^{*}, γ^{*}) = E [\int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β)) N_{i} (d t)]$ by the missing at random assumption (2) and by using the double expectation formula E[·] = E{E[·|W_i, Δ_i, Δ_i > 0]} if r(W_i) is correctly specified and using E[·] = E{E[·|W_i, R_i, Δ_i > 0]} if ρ(W_i) is correctly specified. Hence

n^{- 1} U_{A} (β, \hat{ψ}, \hat{γ}) \overset{P}{\to} E [\int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β)) N_{i} (d t)]

uniformly in β ∈ [−M, M]^p if at least one of r(W_i, ψ₀) and ρ(W_i, γ₀) is correctly specified for r(W_i) and ρ(W_i). Since η(β₀, ψ^*, γ^*) = 0 and β₀ is the unique solution to η(β, ψ^*, γ^*) = 0 by Condition (A.3), we have ${\hat{β}}_{A} \overset{P}{\to} β_{0}$ by Theorem 5.9 of van der Vaart (1998).
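The double expectation step used in the consistency argument above can be spelled out as follows. This is a sketch under the assumption that N_i^*(t) counts failure from any cause, so that dN_i(t) = I(Δ_i = 1) dN_i^*(t); that definition comes from the earlier sections and is not restated in this appendix.

```latex
\begin{align*}
\frac{R_i}{r(W_i, \psi^*)}\, dN_i(t)
  - \frac{R_i - r(W_i, \psi^*)}{r(W_i, \psi^*)}\, \rho(W_i, \gamma^*)\, dN_i^*(t)
  &= dN_i(t)
  + \frac{R_i - r(W_i, \psi^*)}{r(W_i, \psi^*)}
    \{ I(\Delta_i = 1) - \rho(W_i, \gamma^*) \}\, dN_i^*(t). \\
\text{If } r(W_i, \psi^*) = r(W_i):\quad
  & E[ R_i - r(W_i) \mid W_i, \Delta_i, \Delta_i > 0 ] = 0. \\
\text{If } \rho(W_i, \gamma^*) = \rho(W_i):\quad
  & E[ I(\Delta_i = 1) - \rho(W_i) \mid W_i, R_i, \Delta_i > 0 ] = 0.
\end{align*}
```

In either case the second term on the right-hand side of the first line has mean zero under the missing at random assumption (2), which leaves the expectation of the dN_i(t) term only.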

Asymptotic Normality of β̂_A

Since U_A(β̂_A, ψ̂, γ̂) = 0, $n^{- 1} U_{A} (β_{0}, \hat{ψ}, \hat{γ}) = {- \frac{1}{n} \frac{\partial U_{A} (β, \hat{ψ}, \hat{γ})}{\partial β} |}_{β = \tilde{β}} ({\hat{β}}_{A} - β_{0})$, where β̃ is on the line segment between β̂_A and β₀. We have

\sqrt{n} ({\hat{β}}_{A} - β_{0}) = {({- \frac{1}{n} \frac{\partial U_{A} (β, \hat{ψ}, \hat{γ})}{\partial β} |}_{β = \tilde{β}})}^{- 1} \frac{1}{\sqrt{n}} U_{A} (β_{0}, \hat{ψ}, \hat{γ}) .

By Condition (A.2) and the consistency of β̂_A, $V (t, \tilde{β}) \overset{P}{\to} v (t, β_{0})$ uniformly in t. Furthermore, by (A.4), $\hat{ψ} \overset{P}{\to} ψ^{*}$ and $\hat{γ} \overset{P}{\to} γ^{*}$ , we have

{- \frac{1}{n} \frac{\partial U_{A} (β, \hat{ψ}, \hat{γ})}{\partial β} |}_{β = \tilde{β}} = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{τ} V (t, \tilde{β}) [\frac{R_{i}}{r (W_{i}, \hat{ψ})} {d N}_{i} (t) - \frac{R_{i} - r (W_{i}, \hat{ψ})}{r (W_{i}, \hat{ψ})} ρ (W_{i}, \hat{γ}) {d N}_{i}^{*} (t)]

converges in probability to

Σ_{A} = E [\int_{0}^{τ} V (t, β_{0}) (\frac{R_{1}}{r (W_{1}, ψ^{*})} d N_{1} (t) - \frac{R_{1} - r (W_{1}, ψ^{*})}{r (W_{1}, ψ^{*})} ρ (W_{1}, γ^{*}) d N_{1}^{*} (t))],

which equals $Σ = \int_{0}^{τ} v (t, β_{0}) λ_{0} (t) s^{(0)} (t, β_{0}) d t$ if at least one of r(W) and ρ(W) is correctly specified. Thus

\sqrt{n} ({\hat{β}}_{A} - β_{0}) = {(Σ + o_{p} (1))}^{- 1} n^{- 1 / 2} U_{A} (β_{0}, \hat{ψ}, \hat{γ}) .

(14)

Let ṙ(·, ψ) = ∂r(·,ψ)/∂ψ, r̈(·, ψ) = ∂ṙ(·, ψ)/∂ψ^⊤, ρ̇(·, γ) = ∂ρ(·, γ)/∂γ, and ρ̈(·, γ) = ∂ρ̇(·, γ)/∂γ^⊤. By the Taylor expansion of U_A(β₀, ψ̂, γ̂) around ψ^* and γ^*,

\begin{array}{l} n^{- 1 / 2} U_{A} (β_{0}, \hat{ψ}, \hat{γ}) \\ = n^{- 1 / 2} U_{A} (β_{0}, ψ^{*}, γ^{*}) + {n^{- 1} \frac{\partial U_{A} (β_{0}, ψ, γ)}{\partial ψ} |}_{(ψ, γ) = (ψ^{*}, γ^{*})} \sqrt{n} (\hat{ψ} - ψ^{*}) \\ + {n^{- 1} \frac{\partial U_{A} (β_{0}, ψ, γ)}{\partial γ} |}_{(ψ, γ) = (ψ^{*}, γ^{*})} \sqrt{n} (\hat{γ} - γ^{*}) + o_{p} (1) . \end{array}

Note that

\begin{array}{l} {n^{- 1} \frac{\partial U_{A} (β_{0}, ψ, γ)}{\partial ψ} |}_{(ψ, γ) = (ψ^{*}, γ^{*})} = P_{ψ} + o_{p} (1), \\ {n^{- 1} \frac{\partial U_{A} (β_{0}, ψ, γ)}{\partial γ} |}_{(ψ, γ) = (ψ^{*}, γ^{*})} = P_{γ} + o_{p} (1), \end{array}

where

\begin{array}{l} P_{ψ} = E [\int_{0}^{τ} \{Z_{1} (t) - \bar{z} (t, β_{0})\} \frac{R_{1} {\dot{r}}^{⊤} (W_{1}, ψ^{*})}{r^{2} (W_{1}, ψ^{*})} \{d N_{1} (t) - ρ (W_{1}, γ^{*}) d N_{1}^{*} (t)\}], \\ P_{γ} = E [\int_{0}^{τ} \{Z_{1} (t) - \bar{z} (t, β_{0})\} \frac{R_{1} - r (W_{1}, ψ^{*})}{r (W_{1}, ψ^{*})} {\dot{ρ}}^{⊤} (W_{1}, γ^{*}) d N_{1}^{*} (t)] . \end{array}

(15)

It follows that

n^{- 1 / 2} U_{A} (β_{0}, \hat{ψ}, \hat{γ}) = n^{- 1 / 2} U_{A} (β_{0}, ψ^{*}, γ^{*}) + P_{ψ} \sqrt{n} (\hat{ψ} - ψ^{*}) + P_{γ} \sqrt{n} (\hat{γ} - γ^{*}) + o_{p} (1) .

(16)

Let $M_{i} (t) = N_{i} (t) - \int_{0}^{t} Y_{i} (u) e^{β_{0}^{⊤} Z_{i} (u)} d Λ_{0} (u)$ , and

M_{i}^{*} (t) = \frac{R_{i}}{r (W_{i}, ψ^{*})} N_{i} (t) - \frac{R_{i} - r (W_{i}, ψ^{*})}{r (W_{i}, ψ^{*})} ρ (W_{i}, γ^{*}) N_{i}^{*} (t) - \int_{0}^{t} Y_{i} (u) e^{β_{0}^{⊤} Z_{i} (u)} d Λ_{0} (u) .

(17)

Since $\sum_{i = 1}^{n} \int_{0}^{τ} (Z_{i} (t) - \bar{Z} (t, β_{0})) Y_{i} (t) e^{β_{0}^{T} Z_{i} (t)} d Λ_{0} (t) = 0$ , it follows that

n^{- 1 / 2} U_{A} (β_{0}, ψ^{*}, γ^{*}) = n^{- 1 / 2} \sum_{i = 1}^{n} \int_{0}^{τ} (Z_{i} (t) - \bar{Z} (t, β_{0})) {d M}_{i}^{*} (t) .

Similar to the arguments given above, $E [M_{i}^{*} (t)] = 0$ if at least one of r(W_i) and ρ(W_i) is correctly specified. By application of the Glivenko–Cantelli and Donsker theorems, $n^{- 1 / 2} \sum_{i = 1}^{n} M_{i}^{*} (t)$ converges weakly to a zero-mean Gaussian process and $\bar{Z} (t, β_{0}) \overset{P}{\to} \bar{z} (t, β_{0})$ uniformly in t. Applying Lemma 2 of Gilbert et al. (2008), we have

\int_{0}^{τ} \{\bar{Z} (t, β_{0}) - \bar{z} (t, β_{0})\} d \{n^{- 1 / 2} \sum_{i = 1}^{n} M_{i}^{*} (t)\} \overset{P}{\to} 0 .

Hence

n^{- 1 / 2} U_{A} (β_{0}, ψ^{*}, γ^{*}) = n^{- 1 / 2} \sum_{i = 1}^{n} \int_{0}^{τ} (Z_{i} (t) - \bar{z} (t, β_{0})) d M_{i}^{*} (t) + o_{p} (1) .

(18)

Let $S_{ψ i}^{*}$ and $I_{ψ}^{*}$ be the score vector and the information matrix under the parametric model r(W_i, ψ), respectively, and $S_{γ i}^{*}$ and $I_{γ}^{*}$ the score vector and the information matrix under the parametric model ρ(W_i, γ). Specifically,

\begin{array}{l} S_{ψ i}^{*} = \frac{I (Δ_{i} > 0) \{R_{i} - r (W_{i}, ψ^{*})\} \dot{r} (W_{i}, ψ^{*})}{r (W_{i}, ψ^{*}) \{1 - r (W_{i}, ψ^{*})\}}, \\ S_{γ i}^{*} = \frac{R_{i} I (Δ_{i} > 0) \{I (Δ_{i} = 1) - ρ (W_{i}, γ^{*})\} \dot{ρ} (W_{i}, γ^{*})}{ρ (W_{i}, γ^{*}) \{1 - ρ (W_{i}, γ^{*})\}}, \\ I_{ψ}^{*} = E [S_{ψ 1}^{*} S_{ψ 1}^{* ⊤} - \frac{I (Δ_{1} > 0) \{R_{1} - r (W_{1}, ψ^{*})\} \ddot{r} (W_{1}, ψ^{*})}{r (W_{1}, ψ^{*}) \{1 - r (W_{1}, ψ^{*})\}}], \\ I_{γ}^{*} = E [S_{γ 1}^{*} S_{γ 1}^{* ⊤} - \frac{R_{1} I (Δ_{1} > 0) \{I (Δ_{1} = 1) - ρ (W_{1}, γ^{*})\} \ddot{ρ} (W_{1}, γ^{*})}{ρ (W_{1}, γ^{*}) \{1 - ρ (W_{1}, γ^{*})\}}] . \end{array}

(19)

Under (A.4),

\hat{ψ} - ψ^{*} = n^{- 1} \sum_{i = 1}^{n} {(I_{ψ}^{*})}^{- 1} S_{ψ i}^{*} + o_{p} (n^{- 1 / 2})

(20)

and

\hat{γ} - γ^{*} = n^{- 1} \sum_{i = 1}^{n} {(I_{γ}^{*})}^{- 1} S_{γ i}^{*} + o_{p} (n^{- 1 / 2}) .

(21)

By (16), (18), (20), and (21), we have

n^{- 1 / 2} U_{A} (β_{0}, \hat{ψ}, \hat{γ}) = n^{- 1 / 2} \sum_{i = 1}^{n} φ_{i} + o_{p} (1),

(22)

where

φ_{i} = \int_{0}^{τ} \{Z_{i} (t) - \bar{z} (t, β_{0})\} d M_{i}^{*} (t) - P_{ψ} {(I_{ψ}^{*})}^{- 1} S_{ψ i}^{*} - P_{γ} {(I_{γ}^{*})}^{- 1} S_{γ i}^{*} .

By the central limit theorem, n^−1/2U_A(β₀, ψ̂, γ̂) converges in distribution to a normal random vector with zero-mean and covariance matrix $E [φ_{1} φ_{1}^{T}]$ .

It follows by (14) and (22) that $\sqrt{n} ({\hat{β}}_{A} - β_{0})$ converges in distribution to a zero-mean Gaussian random vector with covariance matrix $Σ^{- 1} E [φ_{1} φ_{1}^{T}] Σ^{- 1}$.

Contributor Information

Seunggeun Hyun, Email: shyun@uscupstate.edu.

Jimin Lee, Email: jlee@unca.edu.

Yanqing Sun, Email: yasun@uncc.edu.

References

Andersen J, Goetghebeur E, Ryan L. Missing cause of death information in the analysis of survival data. Statistics in Medicine. 1996;15:2191–2201. doi: 10.1002/(SICI)1097-0258(19961030)15:20<2191::AID-SIM358>3.0.CO;2-D.
Andersen P, Klein J, Rosthøj S. Generalised linear models for correlated pseudo-observations, with applications to multi-state models. Biometrika. 2003;90:15–27.
Benichou J, Gail M. Estimates of absolute cause-specific risk in cohort studies. Biometrics. 1990;46:813–826.
Cheng S, Fine J, Wei L. Prediction of cumulative incidence function under the proportional hazards model. Biometrics. 1998;54:219–228.
Cox D. Regression models and life tables (with discussion). Journal of the Royal Statistical Society Series B. 1972;34:187–220.
Cox D. Partial likelihood. Biometrika. 1975;62:269–276.
Dewanji A. A note on a test for competing risks with missing failure type. Biometrika. 1992;79:855–857.
Dinse G. Nonparametric estimation for partially-complete time and type of failure data. Biometrics. 1982;38:417–431.
Dinse G. Nonparametric prevalence and mortality estimators for animal experiments with incomplete cause-of-death data. Journal of the American Statistical Association. 1986;81:328–336.
Fine J, Gray R. A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association. 1999;94:496–509.
Gao G, Tsiatis A. Semiparametric estimators for the regression coefficients in the linear transformation competing risks model with missing cause of failure. Biometrika. 2005;92:875–891.
Gaynor J, Feuer E, Tan C, Wu D, Little C, Straus D, Clarkson B, Brennan M. On the use of cause-specific failure and conditional failure probabilities: Examples from clinical oncology data. Journal of the American Statistical Association. 1993;88:400–409.
Gijbels I, Lin D, Ying Z. Non- and semi-parametric analysis of failure time data with missing failure indicators. IMS Lecture Notes Monograph Series. 2007;54:203–223.
Gilbert P, McKeague I, Sun Y. The two-sample problem for failure rates depending on a continuous mark: An application to vaccine efficacy. Biostatistics. 2008;9:263–276. doi: 10.1093/biostatistics/kxm028.
Goetghebeur E, Ryan L. A modified logrank test for competing risks with missing failure type. Biometrika. 1990;77:207–211.
Goetghebeur E, Ryan L. Analysis of competing risks survival data when some failure types are missing. Biometrika. 1995;82:821–834.
Gourieroux C, Monfort A. Asymptotic properties of the maximum likelihood estimator in dichotomous logit models. Journal of Econometrics. 1981;17:83–97.
Gray R. A class of k-sample tests for comparing the cumulative incidence of a competing risk. Annals of Statistics. 1988;16:1141–1154.
Haberman S. The Analysis of Frequency Data. University of Chicago Press; Chicago: 1974.
Haberman S. Maximum likelihood estimates in exponential response models. Annals of Statistics. 1977;5:815–841.
Horvitz D, Thompson D. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association. 1952;47:663–685.
Kalbfleisch J, Prentice R. The Statistical Analysis of Failure Time Data. 2nd ed. Wiley; New York: 2002.
Klein J, Andersen P. Regression modeling of competing risks data based on pseudovalues of the cumulative incidence function. Biometrics. 2005;61:223–229. doi: 10.1111/j.0006-341X.2005.031209.x.
Lo SH. Estimating a survival function with incomplete cause-of-death data. Journal of Multivariate Analysis. 1991;39:217–235.
Lu K, Tsiatis A. Multiple imputation methods for estimating regression coefficients in the competing risks model with missing cause of failure. Biometrics. 2001;57:1191–1197. doi: 10.1111/j.0006-341x.2001.01191.x.
Lu W, Liang Y. Analysis of competing risks data with missing cause of failure under additive hazards model. Statistica Sinica. 2008;18:219–234.
McKeague I, Subramanian S. Product-limit estimators and Cox regression with missing censoring information. Scandinavian Journal of Statistics. 1998;25:589–601.
Pepe M. Inference for events with dependent risks in multiple endpoint studies. Journal of the American Statistical Association. 1991;86:770–778.
Prentice R, Kalbfleisch J, Peterson A, Flournoy N, Farewell V, Breslow N. The analysis of failure times in the presence of competing risks. Biometrics. 1978;34:541–554.
Robins J, Rotnitzky A, Zhao L. Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association. 1994;89:846–866.
Rubin D. Inference and missing data. Biometrika. 1976;63:581–592.
Rubin D. Multiple Imputation for Nonresponse in Surveys. Wiley; New York: 1987.
Rubin D. Multiple imputation after 18+ years. Journal of the American Statistical Association. 1996;91:473–489.
Scharfstein D, Rotnitzky A, Robins J. Adjusting for nonignorable drop-out using semiparametric nonresponse models: rejoinder. Journal of the American Statistical Association. 1999;94:1135–1146.
Scheike T, Zhang M. Extensions and applications of the Cox-Aalen survival model. Biometrics. 2003;59:1036–1045. doi: 10.1111/j.0006-341x.2003.00119.x.
Shen Y, Cheng S. Confidence bands for cumulative incidence curves under the additive risk model. Biometrics. 1999;55:1093–1100. doi: 10.1111/j.0006-341x.1999.01093.x.
Sierra J, Perez W, Rozman W, Carreras C, Klein J, Rizzo J, Davies J, Lazarus S, Bredeson C, Marks D, Canals C, Boogaerts M, Goldman J, Champlin R, Keating A, Weisdorf D, deWitte T, Horowitz M. Bone marrow transplantation from HLA-identical siblings as treatment for myelodysplasia. Blood. 2002;100:1997–2004.
Subramanian S. Efficient estimation of regression coefficients and baseline hazard under proportionality of conditional hazards. Journal of Statistical Planning and Inference. 2000;84:81–94.
Subramanian S, Bandyopadhyay D. Doubly robust semiparametric estimation for the missing censoring indicator model. Statistics and Probability Letters. 2010;80:621–630.
van der Vaart A. Asymptotic Statistics. Cambridge University Press; Cambridge: 1998.
White H. Maximum likelihood estimation of misspecified models. Econometrica. 1982;50:1–25.

[R1] Andersen J, Goetghebeur E, Ryan L. Missing cause of death information in the analysis of survival data. Statistics in Medicine. 1996;15:2191–2201. doi: 10.1002/(SICI)1097-0258(19961030)15:20<2191::AID-SIM358>3.0.CO;2-D. [DOI] [PubMed] [Google Scholar]

[R2] Andersen P, Klein J, Rosthøj S. Generalised linear models for correlated pseudo-observations, with applications to multi-state models. Biometrika. 2003;90:15–27. [Google Scholar]

[R3] Benichou J, Gail M. Estimates of absolute cause-specific risk in cohort studies. Biometrics. 1990;46:813–826. [PubMed] [Google Scholar]

[R4] Cheng S, Fine J, Wei L. Prediction of cumulative incidence function under the proportional hazards model. Biometrics. 1998;54:219–228. [PubMed] [Google Scholar]

[R5] Cox D. Regression models and life tables (with discussion) Journal of the Royal Statistical Society Series B. 1972;34:187–220. [Google Scholar]

[R6] Cox D. Partial likelihood. Biometrika. 1975;62:269–276. [Google Scholar]

[R7] Dewanji A. A note on a test for competing risks with missing failure type. Biometrika. 1992;79:855–857. [Google Scholar]

[R8] Dinse G. Nonparametric estimation for partially-complete time and of failure data. Biometrics. 1982;38:417–431. [PubMed] [Google Scholar]

[R9] Dinse G. Nonparametric prevalence and mortality estimators for animal experiments with incomplete cause-of-death data. Journal of the American Statistical Association. 1986;81:328–336. [Google Scholar]

[R10] Fine J, Gray R. A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association. 1999;94:496–509. [Google Scholar]

[R11] Gao G, Tsiatis A. Semiparametric estimators for the regression coefficients in the linear transformation competing risks model with missing cause of failure. Biometrika. 2005;92:875–891. [Google Scholar]

[R12] Gaynor J, Feuer E, Tan C, Wu D, Little C, Straus D, Clarkson B, Brennan M. On the use of cause-specific failure and conditional failure probabilities: Examples from clinical oncology data. Journal of the American Statistical Association. 1993;88:400–409. [Google Scholar]

[R13] Gijbels I, Lin D, Ying Z. Non- and semi-parametric analysis of failure time data with missing failure indicators. IMS Lecture Notes Monograph Series. 2007;54:203–223. [Google Scholar]

[R14] Gilbert P, McKeague I, Sun Y. The two-sample problem for failure rates depending on a continuous mark: An application to vaccine efficacy. Biostatistics. 2008;9:263–276. doi: 10.1093/biostatistics/kxm028. [DOI] [PubMed] [Google Scholar]

[R15] Goetghebeur E, Ryan L. A modified logrank test for competing risks with missing failure type. Biometrika. 1990;77:207–211. [Google Scholar]

[R16] Goetghebeur E, Ryan L. Analysis of competing risks survival data when some failure types are missing. Biometrika. 1995;82:821–834. [Google Scholar]

[R17] Gourieroux C, Monfort A. Asymptotic properties of the maximum likelihood estimator in dichotomous logit models. Journal of Econometrics. 1981;17:83–97. [Google Scholar]

[R18] Gray R. A class of k-sample tests for comparing the cumulative incidence of a competing risk. Annals of Statistics. 1988;16:1040–1154. [Google Scholar]

[R19] Haberman S. The Analysis of Frequency Data. University of Chicago Press; Chicago: 1974. [Google Scholar]

[R20] Haberman S. Maximum likelihood estimates in exponential response models. Annals of Statistics. 1977;5:815–841. [Google Scholar]

[R21] Horvitz D, Thompson D. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association. 1952;47:663–685. [Google Scholar]

[R22] Kalbfleish J, Prentice R. The Statistical Analysis of Failure Time Data. 2. Wiley; New York: 2002. [Google Scholar]

[R23] Klein J, Andersen P. Regression modeling of competing risks data based on pseudovalues of the cumulative incidence function. Biometrics. 2005;61:223–229. doi: 10.1111/j.0006-341X.2005.031209.x. [DOI] [PubMed] [Google Scholar]

[R24] Lo SH. Estimating a survival function with incomplete cause-of-death data. Journal of Multivariate Analysis. 1991;39:217–235. [Google Scholar]

[R25] Lu K, Tsiatis A. Multiple imputation methods for estimating regression coefficients in the competing risks model with missing cause of failure. Biometrics. 2001;57:1191–1197. doi: 10.1111/j.0006-341x.2001.01191.x. [DOI] [PubMed] [Google Scholar]

[R26] Lu W, Liang Y. Analysis of competing risks data with missing cause of failure under additive hazards model. Statistica Sinica. 2008;18:219–234.

[R27] McKeague I, Subramanian S. Product-limit estimators and Cox regression with missing censoring information. Scandinavian Journal of Statistics. 1998;25:589–601.

[R28] Pepe M. Inference for events with dependent risks in multiple endpoint studies. Journal of the American Statistical Association. 1991;86:770–778.

[R29] Prentice R, Kalbfleisch J, Peterson A, Flournoy N, Farewell V, Breslow N. The analysis of failure times in the presence of competing risks. Biometrics. 1978;34:541–554.

[R30] Robins J, Rotnitzky A, Zhao L. Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association. 1994;89:846–866.

[R31] Rubin D. Inference and missing data. Biometrika. 1976;63:581–592.

[R32] Rubin D. Multiple Imputation for Nonresponse in Surveys. Wiley; New York: 1987.

[R33] Rubin D. Multiple imputation after 18+ years. Journal of the American Statistical Association. 1996;91:473–489.

[R34] Scharfstein D, Rotnitzky A, Robins J. Adjusting for nonignorable drop-out using semiparametric nonresponse models: rejoinder. Journal of the American Statistical Association. 1999;94:1135–1146.

[R35] Scheike T, Zhang M. Extensions and applications of the Cox-Aalen survival model. Biometrics. 2003;59:1036–1045. doi: 10.1111/j.0006-341x.2003.00119.x.

[R36] Shen Y, Cheng S. Confidence bands for cumulative incidence curves under the additive risk model. Biometrics. 1999;55:1093–1100. doi: 10.1111/j.0006-341x.1999.01093.x.

[R37] Sierra J, Perez W, Rozman W, Carreras C, Klein J, Rizzo J, Davies J, Lazarus S, Bredeson C, Marks D, Canals C, Boogaerts M, Goldman J, Champlin R, Keating A, Weisdorf D, deWitte T, Horowitz M. Bone marrow transplantation from HLA-identical siblings as treatment for myelodysplasia. Blood. 2002;100:1997–2004.

[R38] Subramanian S. Efficient estimation of regression coefficients and baseline hazard under proportionality of conditional hazards. Journal of Statistical Planning and Inference. 2000;84:81–94.

[R39] Subramanian S, Bandyopadhyay D. Doubly robust semiparametric estimation for the missing censoring indicator model. Statistics and Probability Letters. 2010;80:621–630.

[R40] van der Vaart A. Asymptotic Statistics. Cambridge University Press; Cambridge: 1998.

[R41] White H. Maximum likelihood estimation of misspecified models. Econometrica. 1982;50:1–25.
