Semiparametric regression based on quadratic inference function for multivariate failure time data with auxiliary information

Feifei Yan; Lin Zhu; Yanyan Liu; Jianwen Cai; Haibo Zhou

doi:10.1007/s10985-020-09513-1

2021 Jan 8;27(2):269–299. doi: 10.1007/s10985-020-09513-1

PMCID: PMC7943434 PMID: 33420545

Abstract

This paper develops a statistical inference procedure for multivariate failure time data when the primary covariate can be measured only on a subset of the full cohort but auxiliary information is available for all subjects. To improve the efficiency of inference, we use the quadratic inference function approach to incorporate the intra-cluster correlation and a kernel smoothing technique to further exploit the auxiliary information. The proposed method is shown to be more efficient than methods that ignore the intra-cluster correlation and the auxiliary information, and it is easy to implement. In addition, we develop a chi-squared test for hypothesis testing of hazard ratio parameters. We evaluate the finite-sample performance of the proposed procedure via extensive simulation studies. The proposed approach is illustrated by the analysis of a real data set from the study of left ventricular dysfunction.

Keywords: Multivariate failure time data, Validation sample, Quadratic inference function, Chi-squared test

Introduction

This paper aims to develop an improved inference procedure for multivariate failure time data with auxiliary information. Large cohort studies often involve thousands of subjects or more, and such studies, especially those with failure time outcomes, can last for many years. Often the primary covariate can be measured only for a random subset of the study cohort because of technical difficulties or financial limitations. On the other hand, auxiliary information that is less precise but highly correlated with the primary exposure can often be collected cheaply for all cohort members. The auxiliary information could be a mismeasured surrogate for the true covariate, or any covariate that is informative about the true covariate. An example is the left ventricular dysfunction (SOLVD 1991) prevention study, which aims to assess the effects of risk factors on the (possibly censored) time to heart failure and to the first myocardial infarction. One of the most important risk factors is the patient’s ejection fraction (EF), which can be measured precisely with a standardized radionuclide technique, but at very high cost. Therefore, EF was measured precisely only on a randomly chosen subset of the cohort, while a less precise but cheaper measurement of EF was ascertained for all patients using a nonstandardized technique. Because each patient could experience both heart failure and a first myocardial infarction, statistical methods for multivariate failure time data with covariate measurement error are required.

Proper use of auxiliary information has been shown to improve the efficiency of estimates for multivariate failure time data. For example, Hu and Lin (2004) proposed a corrected estimation function under the assumption that the error is symmetrically distributed. Liu et al. (2009) and Liu et al. (2010) developed an estimated pseudo-partial likelihood method for multivariate failure time data with a discrete and a continuous auxiliary variable, respectively. Liu et al. (2012) and Fan and Wang (2009) studied this problem under the assumption that subjects within a cluster share a common baseline hazard. These studies are all based on the marginal hazards model. The intra-cluster correlation, however, is ignored in the estimation procedure and is accounted for only at the inference step through a robust sandwich variance estimate. Ignoring the intra-cluster correlation in this way results in some loss of efficiency.

Some authors have proposed incorporating the correlation explicitly into the estimating equations to improve efficiency when the covariates of multivariate failure time data are fully observed. For example, Cai and Prentice (1995, 1997) added a weight matrix, based on the inverse of the correlation matrix of the marginal martingales, to the partial likelihood score equation. Simulation studies have shown that their approach is more efficient than one using an independence structure when the cluster size is small. However, their method is computationally intensive when the cluster size is large, because it requires estimating a very high-dimensional weighting correlation matrix. To overcome this shortcoming, Xue et al. (2010) developed a different approach based on the quadratic inference function (QIF). Their method avoids explicit estimation of the correlation parameters and is easy to implement, especially when the cluster size is large. Both methods, however, assume that the covariates are completely observed and therefore cannot be applied directly to the SOLVD data.

Motivated by the advantages of the QIF method, we extend it to the analysis of multivariate failure time data with auxiliary information. Here, we assume that the auxiliary covariate is continuous. We propose an estimated QIF method and study the asymptotic properties of the proposed estimator. The proposed method inherits the merits of the QIF method: it avoids estimating nuisance correlation parameters and is computationally easy to implement. Under certain regularity conditions, we establish the asymptotic normality of the resulting estimator. Simulation studies show that the proposed method improves estimation efficiency compared with methods that ignore the dependence structure, such as that of Liu et al. (2010). In addition, we study hypothesis testing and propose a test statistic that has a chi-squared limiting distribution under the null hypothesis.

The rest of the article is organized as follows. In Sect. 2, we introduce the model and describe the proposed estimation procedure. In Sect. 3, the large-sample properties of the proposed estimator are presented. In Sect. 4, a chi-squared test is developed for hypothesis testing. In Sect. 5, the finite-sample performance of the proposed procedures is assessed through extensive simulation studies. We illustrate the proposed method through the analysis of a real data set from the SOLVD study in Sect. 6. Some concluding remarks are given in Sect. 7, and the technical proofs are provided in the “Appendix”.

Model and estimation

Preliminaries

Suppose that the whole cohort consists of n independent clusters, and each cluster contains K correlated failure types. Let (i, k) denote the kth $(k = 1, \dots, K)$ subject in the ith $(i = 1, \dots, n)$ cluster. Let ${\tilde{T}}_{ik}$ and $C_{ik}$ be the potential failure time and the censoring time for subject (i, k). With censoring, one observes $T_{ik} = \min ({\tilde{T}}_{ik}, C_{ik})$ and $Δ_{ik} = I ({\tilde{T}}_{ik} \leq C_{ik})$ , where $I (\cdot)$ is the indicator function. Let ${\tilde{Z}}_{ik} (t)$ be a p-vector of possibly time-dependent covariates.

For subject (i, k), the hazard function $λ_{ik} (t ; {\tilde{Z}}_{ik} (t))$ takes the following form:

\begin{matrix} λ_{ik} (t ; {\tilde{Z}}_{ik} (t)) = λ_{0 k} (t) \exp \{β^{T} {\tilde{Z}}_{ik} (t)\}, \end{matrix}

2.1

where $β$ is a p-vector of unknown regression parameters and $λ_{0 k} (t)$ is an unspecified marginal baseline hazard function pertaining to the kth failure type.

Note that model (2.1) includes as a special case the failure-type-specific model (Wei et al. 1989; Greene and Cai 2004) $λ_{ik} (t ; {\tilde{Z}}_{ik} (t)) = λ_{0 k} (t) \exp \{β_{k}^{T} {\tilde{Z}}_{ik} (t)\}$ , which allows a different covariate effect for each k. This can be seen by defining $β = {(β_{1}^{T}, \dots, β_{k}^{T}, \dots, β_{K}^{T})}^{T}$ and ${\tilde{Z}}_{ik}^{*} (t) = {(0^{T}, \dots, {\tilde{Z}}_{ik}^{T} (t), \dots, 0^{T})}^{T}$ in the model $λ_{ik} (t ; {\tilde{Z}}_{ik}^{*} (t)) = λ_{0 k} (t) \exp \{β^{T} {\tilde{Z}}_{ik}^{*} (t)\}$ . For simplicity, we write $λ_{ik} (t ; {\tilde{Z}}_{ik} (t))$ as $λ_{ik} (t)$ in the following.

Let $Λ_{0 k} (t) = \int_{0}^{t} λ_{0 k} (u) d u$ be the marginal cumulative baseline hazard function for the kth failure type. Let $N_{ik} (t) = Δ_{ik} I (T_{ik} \leq t)$ and $Y_{ik} (t) = I (T_{ik} \geq t)$ be the observed counting process and the at-risk indicator process. For convenience, write the relative risk function as $r_{ik} (t ; β) = \exp \{β^{T} {\tilde{Z}}_{ik} (t)\}$ . Let $M_{ik} (t) = N_{ik} (t) - \int_{0}^{t} Y_{ik} (u) r_{ik} (u ; β_{0}) λ_{0 k} (u) d u$ be the marginal martingale process, where $β_{0}$ is the true parameter. Given $β$ , $Λ_{0 k} (t)$ can be estimated consistently by the following Breslow type estimator (Breslow 1972):

\begin{matrix} {\tilde{Λ}}_{0 k} (t ; β) = \int_{0}^{t} \frac{\sum_{i = 1}^{n} d N_{ik} (u)}{\sum_{i = 1}^{n} Y_{ik} (u) r_{ik} (u ; β)} . \end{matrix}

2.2
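As a rough illustration of (2.2), the following Python sketch accumulates the Breslow-type increments for a single failure type. It assumes time-independent relative risks, and the function name `breslow_cumhaz` and its argument layout are hypothetical choices made for this example, not part of the paper.

```python
def breslow_cumhaz(times, events, risks):
    """Breslow-type estimate of the cumulative baseline hazard for one failure type.

    times[i]  : observed time T_ik
    events[i] : 1 if a failure was observed (Delta_ik), else 0
    risks[i]  : relative risk r_ik(beta); assumed time-independent in this sketch
    Returns a list of (t, Lambda_0k(t)) pairs at the distinct failure times.
    """
    failure_times = sorted({t for t, d in zip(times, events) if d == 1})
    cumhaz, out = 0.0, []
    for t in failure_times:
        # numerator of (2.2): number of failures observed at time t
        n_fail = sum(1 for s, d in zip(times, events) if d == 1 and s == t)
        # denominator of (2.2): sum of relative risks over subjects still at risk
        at_risk = sum(r for s, r in zip(times, risks) if s >= t)
        cumhaz += n_fail / at_risk
        out.append((t, cumhaz))
    return out
```

With all relative risks equal to one, the estimate reduces to the Nelson-Aalen estimator, which gives a quick sanity check.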

Given $β_{0}$ , $M_{ik} (t)$ can therefore be estimated by

\begin{matrix} {\tilde{M}}_{ik} (t ; β_{0}) = N_{ik} (t) - \int_{0}^{t} Y_{ik} (u) r_{ik} (u ; β_{0}) {\tilde{Λ}}_{0 k} (d u ; β_{0}) . \end{matrix}

Let $τ$ be the end time of the study. Write ${\tilde{M}}_{i} (t ; β) = {({\tilde{M}}_{i 1} (t ; β), \dots, {\tilde{M}}_{iK} (t ; β))}^{T}$ . To improve estimation efficiency, Cai and Prentice (1995) added a suitable weight matrix to the pseudo-partial likelihood equation and proposed to estimate $β$ by solving the following equation:

\begin{matrix} \sum_{i = 1}^{n} \int_{0}^{τ} R_{i}^{T} (t ; β) W_{i} (t ; β) {\tilde{M}}_{i} (d t ; β) = 0, \end{matrix}

2.3

where $W_{i} (t ; β) = {\widehat{\mathrm{corr}}}^{- 1} (M_{i} (T_{i} ; β))$ is the weight matrix with

\begin{matrix} M_{i} (T_{i} ; β) = {(M_{i 1} (T_{i 1} ; β), \dots, M_{iK} (T_{iK} ; β))}^{T}, \end{matrix}

and

\begin{matrix} R_{i}^{T} (t ; β) = (\frac{r_{i 1}^{(1)} (t ; β)}{r_{i 1} (t ; β)}, \dots, \frac{r_{iK}^{(1)} (t ; β)}{r_{iK} (t ; β)}), \end{matrix}

where $r^{(j)} (t ; β)$ denotes the jth derivative of $r (t ; β)$ with respect to $β$ . The weight matrix measures the intra-cluster correlation and is important for improving estimation efficiency. However, when the cluster size K is large, estimating the weight matrix is computationally expensive. To overcome this shortcoming, Xue et al. (2010) proposed a QIF method based on the following generalized estimating equation:

\begin{matrix} \sum_{i = 1}^{n} \int_{0}^{τ} R_{i}^{T} (t ; β) Ξ_{i}^{1 / 2} (t ; β) Σ_{i}^{- 1} (α) Ξ_{i}^{- 1 / 2} (t ; β) {\tilde{M}}_{i} (d t ; β) = 0, \end{matrix}

2.4

where $Ξ_{i} (t ; β) = \mathrm{diag} \{λ_{i 1} (t), \dots, λ_{iK} (t)\}$ and $Σ_{i} (α)$ is the working correlation matrix whose common structure is specified by a vector of nuisance correlation parameters $α$ . The inverse of the working correlation is approximated by a linear combination of several pre-specified symmetric basis matrices, namely,

\begin{matrix} Σ^{- 1} \approx a_{1} B_{1} + \dots + a_{L} B_{L}, \end{matrix}

2.5

where $B_{1}, \dots, B_{L}$ are known basis matrices and $a_{1}, \dots, a_{L}$ are unknown coefficients.
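To make (2.5) concrete, the sketch below builds the two basis matrices commonly used for an exchangeable working correlation (the identity, and a matrix with zeros on the diagonal and ones elsewhere) and checks numerically that the inverse of an exchangeable correlation matrix is an exact linear combination of them. The closed-form coefficients come from the standard inverse of $(1 - ρ) I + ρ J$; the helper names are hypothetical and the code is an illustration only.

```python
def exchangeable_basis(K):
    """B1 = identity; B2 = 0 on the diagonal and 1 off the diagonal."""
    B1 = [[1.0 if r == c else 0.0 for c in range(K)] for r in range(K)]
    B2 = [[0.0 if r == c else 1.0 for c in range(K)] for r in range(K)]
    return B1, B2

def matmul(A, B):
    """Plain-Python matrix product, enough for a small check."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def exchangeable_check(K, rho):
    """Return Sigma @ (a1*B1 + a2*B2), which should equal the identity matrix."""
    B1, B2 = exchangeable_basis(K)
    sigma = [[1.0 if r == c else rho for c in range(K)] for r in range(K)]
    # coefficients of the exact inverse of an exchangeable correlation matrix
    a2 = -rho / ((1.0 - rho) * (1.0 + (K - 1) * rho))
    a1 = 1.0 / (1.0 - rho) + a2
    sigma_inv = [[a1 * B1[r][c] + a2 * B2[r][c] for c in range(K)] for r in range(K)]
    return matmul(sigma, sigma_inv)
```

In the QIF approach the coefficients $a_{1}, \dots, a_{L}$ are never computed; the check only illustrates why two basis matrices suffice for the exchangeable structure.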

Substituting (2.5) in (2.4) leads to a linear combination of the elements of the following vector

\begin{matrix} G_{n} (β) & = \frac{1}{n} \sum_{i = 1}^{n} g_{i} (β) \\ = \frac{1}{n} \sum_{i = 1}^{n} (\begin{matrix} \int_{0}^{τ} R_{i}^{T} (t ; β) Ξ_{i}^{1 / 2} (t ; β) B_{1} Ξ_{i}^{- 1 / 2} (t ; β) {\tilde{M}}_{i} (d t ; β) \\ ⋮ \\ \int_{0}^{τ} R_{i}^{T} (t ; β) Ξ_{i}^{1 / 2} (t ; β) B_{L} Ξ_{i}^{- 1 / 2} (t ; β) {\tilde{M}}_{i} (d t ; β) \end{matrix}) . \end{matrix}

As there are more equations than unknown parameters, Xue et al. (2010) proposed to estimate $β$ by minimizing the following QIF:

\begin{matrix} Q_{n} (β) = G_{n}^{T} (β) W_{n}^{- 1} (β) G_{n} (β), \end{matrix}

2.6

where $W_{n} (β) = \frac{1}{n^{2}} \sum_{i = 1}^{n} g_{i} (β) g_{i}^{T} (β)$ . We denote the solution as ${\tilde{β}}_{Q}$ in the following.
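The quadratic form in (2.6) can be sketched in a few lines once the extended score vectors $g_{i} (β)$ are available. The code below assumes they are supplied as a list of equal-length numeric vectors; the function names `solve` and `qif_value` are hypothetical, and a small Gaussian elimination routine stands in for a library linear solver.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(v)] for row, v in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def qif_value(g):
    """Q_n = G_n^T W_n^{-1} G_n, with G_n the mean of g_i and W_n = n^{-2} sum g_i g_i^T."""
    n, m = len(g), len(g[0])
    G = [sum(gi[j] for gi in g) / n for j in range(m)]
    W = [[sum(gi[a] * gi[b] for gi in g) / n ** 2 for b in range(m)] for a in range(m)]
    return sum(Gj * xj for Gj, xj in zip(G, solve(W, G)))
```

Solving $W_{n} x = G_{n}$ instead of forming $W_{n}^{- 1}$ explicitly is a design choice of this sketch; it gives the same value of $Q_{n}$.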

In implementing the QIF, an additional issue is that the diagonal matrix $Ξ_{i} (t ; β)$ involves the unknown baseline hazard function $λ_{0 k} (t)$ . Xue et al. (2010) suggested the kernel-smoothed estimator ${\tilde{λ}}_{0 k} (t ; β)$ given by

\begin{matrix} {\tilde{λ}}_{0 k} (t ; β) = \frac{1}{ν_{k}} \sum_{i = 1}^{n} κ (\frac{t - T_{ik}}{ν_{k}}) Δ {\tilde{Λ}}_{0 k} (T_{ik} ; β), \end{matrix}

2.7

where $κ (\cdot)$ is the Epanechnikov kernel function with $ν_{k}$ being the rule-of-thumb bandwidth, and $Δ {\tilde{Λ}}_{0 k} (t ; β) = {\tilde{Λ}}_{0 k} (t ; β) - {\tilde{Λ}}_{0 k} (t - ; β)$ with ${\tilde{Λ}}_{0 k} (t ; β)$ being the Breslow estimator given in (2.2).
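A minimal Python sketch of the kernel-smoothed estimator (2.7) follows. It assumes the Breslow increments have already been computed at distinct failure times (censored times contribute zero and can be omitted); the function names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def smoothed_baseline_hazard(t, jump_times, jumps, bandwidth):
    """Kernel-smoothed baseline hazard at time t, following (2.7).

    jump_times[i] : a failure time T_ik
    jumps[i]      : the Breslow increment of the cumulative hazard at that time
    bandwidth     : nu_k
    """
    total = sum(epanechnikov((t - s) / bandwidth) * dL
                for s, dL in zip(jump_times, jumps))
    return total / bandwidth
```

Because the kernel has bounded support, the estimate is exactly zero whenever no failure time lies within one bandwidth of `t`.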

Estimated QIF for marginal hazards model

Consider the situation in which the primary covariate can be ascertained only in a validation set. Let ${\tilde{Z}}_{ik} (t)$ consist of two parts, $X_{ik} (t)$ and $Z_{ik} (t)$ , where $X_{ik} (t)$ is the primary covariate, observed only in the validation set, and $Z_{ik} (t)$ is the vector of remaining covariates, measured precisely for the full cohort. Accordingly, write the true parameter as $β_{0} = (β_{10}, β_{20})$ , with $β_{10}$ and $β_{20}$ pertaining to $X_{ik} (t)$ and $Z_{ik} (t)$ , respectively. Let A(t) denote a time-dependent auxiliary variable for the primary covariate X(t); $A (\cdot)$ can be measured for all cohort members. Suppose that A provides no additional information to the model given X, i.e.,

\begin{matrix} λ (t ; X (t), Z (t), A (t)) \equiv λ (t ; X (t), Z (t)) . \end{matrix}

Use $η_{ik} = 1$ or 0 to indicate whether the subject (i, k) is in the validation set or not. Let $V_{k} = \{i : η_{ik} = 1\}$ and ${\bar{V}}_{k} = \{i : η_{ik} = 0\}$ denote the kth marginal validation set and non-validation set, respectively. Then the observed data are:

\begin{matrix} \begin{matrix} \{T_{ik}, Δ_{ik}, η_{ik}, Y_{ik} (t), X_{ik} (t), Z_{ik} (t), A_{ik} (t)\} & \text{if } i \in V_{k}, \\ \{T_{ik}, Δ_{ik}, η_{ik}, Y_{ik} (t), Z_{ik} (t), A_{ik} (t)\} & \text{if } i \in {\bar{V}}_{k} . \end{matrix} \end{matrix}

According to Liu et al. (2009), when subject (i, k) is in the non-validation set, the hazard function given the observed data can be written as:

\begin{matrix} λ_{ik} (t ; Z_{ik} (t), A_{ik} (t)) & = λ_{0 k} (t) e^{β_{2}^{T} Z_{ik} (t)} E \{e^{β_{1}^{T} X_{ik} (t)} | Y_{ik} (t) = 1, A_{ik} (t), Z_{ik} (t)\} \\ & = λ_{0 k} (t) e^{β_{2}^{T} Z_{ik} (t)} E \{e^{β_{1}^{T} X_{ik} (t)} | Y_{ik} (t) = 1, A_{ik}^{*} (t)\}, \end{matrix}

where $A_{ik}^{*} (t)$ denotes all the possible auxiliary information, which may include the auxiliary covariate $A_{ik} (t)$ and the part from $Z_{ik} (t)$ . Therefore, the induced relative risk function is

\begin{matrix} r_{ik} (t ; β) = φ_{ik} (t ; β) η_{ik} + ψ_{ik} (t ; β) (1 - η_{ik}), \end{matrix}

where $φ_{ik} (t ; β) = \exp \{β_{1}^{T} X_{ik} (t) + β_{2}^{T} Z_{ik} (t)\}$ , and

\begin{matrix} ψ_{ik} (t ; β) = \exp \{β_{2}^{T} Z_{ik} (t)\} E \{e^{β_{1}^{T} X_{ik} (t)} | Y_{ik} (t) = 1, A_{ik}^{*} (t)\} . \end{matrix}

If the conditional density of $X_{ik}$ , written as $f (x | T_{ik} \geq t, A_{ik}^{*})$ , is known up to a parameter $θ$ , then $(β, θ)$ can be estimated by replacing the risk function in equations (2.3) or (2.4) with the induced risk function. However, misspecifying this parametric form may lead to biased estimates. We therefore estimate $ψ_{ik} (t ; β)$ empirically and substitute the estimate for it.

In this paper, we consider the often encountered case in which both the primary covariate $X_{ik} (t)$ and the auxiliary variable $A_{ik}^{*} (t)$ are one-dimensional. The unknown part of the induced relative risk function in the non-validation set is estimated by the kernel smoothing method

\begin{matrix} {\hat{ψ}}_{ik} (t ; β) = \frac{\sum_{j \in V_{k}} Y_{jk} (t) Ψ \{μ_{k}^{- 1} (A_{jk}^{*} (t) - A_{ik}^{*} (t))\} e^{β_{1} X_{jk} (t)}}{\sum_{j \in V_{k}} Y_{jk} (t) Ψ \{μ_{k}^{- 1} (A_{jk}^{*} (t) - A_{ik}^{*} (t))\}} e^{β_{2}^{T} Z_{ik} (t)}, \end{matrix}

2.8

where $Ψ (\cdot)$ is a kernel function and $μ_{k}$ is the bandwidth. When the denominator is 0, the relative risk is imputed by interpolation. Therefore, the estimate of the relative risk is

\begin{matrix} {\hat{r}}_{ik} (t ; β) = φ_{ik} (t ; β) η_{ik} + {\hat{ψ}}_{ik} (t ; β) (1 - η_{ik}) . \end{matrix}
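The kernel estimate (2.8) at a single time point can be sketched as follows. The sketch assumes an Epanechnikov kernel, a scalar $Z$, and covariates evaluated at the time of interest; the function name `psi_hat` and its arguments are hypothetical. Returning `None` when the denominator is zero is a placeholder for the interpolation step mentioned in the text.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def psi_hat(a_i, z_i, beta1, beta2, val_A, val_X, val_at_risk, bandwidth):
    """Kernel estimate of the induced relative risk for a non-validation subject.

    a_i, z_i     : auxiliary value A*_ik and (scalar) covariate Z_ik of the subject
    val_A, val_X : auxiliary and primary covariates of the validation subjects
    val_at_risk  : at-risk indicators Y_jk(t) of the validation subjects
    bandwidth    : mu_k
    """
    num = den = 0.0
    for a_j, x_j, y_j in zip(val_A, val_X, val_at_risk):
        w = y_j * epanechnikov((a_j - a_i) / bandwidth)
        num += w * math.exp(beta1 * x_j)
        den += w
    if den == 0.0:
        return None  # the text imputes by interpolation in this case
    return num / den * math.exp(beta2 * z_i)
```

For a validation subject the exact relative risk $φ_{ik} (t ; β)$ is used instead, so this function would only be called when $η_{ik} = 0$.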

Replacing $r_{ik} (t ; β)$ by ${\hat{r}}_{ik} (t ; β)$ in the notation of Sect. 2.1, we obtain estimated versions of $R_{i}, {\tilde{Λ}}_{0 k}, λ_{0 k}, Ξ_{i}, {\tilde{M}}_{i}, g_{i}$ and $G_{n}$ . To distinguish them, we write these as ${\hat{R}}_{i}, {\hat{Λ}}_{0 k}, {\hat{λ}}_{0 k}, {\hat{Ξ}}_{i}$ , ${\hat{M}}_{i}, {\hat{g}}_{i}$ and ${\hat{G}}_{n}$ . This yields the estimated QIF

\begin{matrix} {\hat{Q}}_{n} (β) = {\hat{G}}_{n}^{T} (β) {\hat{W}}_{n}^{- 1} (β) {\hat{G}}_{n} (β), \end{matrix}

where ${\hat{W}}_{n} (β) = \frac{1}{n^{2}} \sum_{i = 1}^{n} {\hat{g}}_{i} (β) {\hat{g}}_{i}^{T} (β)$ . $β_{0}$ can be estimated by minimizing ${\hat{Q}}_{n} (β)$ , i.e.,

\begin{matrix} {\hat{β}}_{Q} = \underset{β}{\arg\min} \, {\hat{Q}}_{n} (β) . \end{matrix}

2.9

To reduce the computational burden, we approximate the first- and second-order derivatives of ${\hat{Q}}_{n} (β)$ as in Qu et al. (2000):

\begin{matrix} {\hat{Q}}_{n}^{(1)} (β) & = 2 \{{\hat{G}}_{n}^{(1)} (β)\}^{T} {\hat{W}}_{n}^{- 1} (β) {\hat{G}}_{n} (β) + o_{p} (1), \\ {\hat{Q}}_{n}^{(2)} (β) & = 2 \{{\hat{G}}_{n}^{(1)} (β)\}^{T} {\hat{W}}_{n}^{- 1} (β) {\hat{G}}_{n}^{(1)} (β) + o_{p} (1) . \end{matrix}

The Newton-Raphson algorithm can then be applied using these approximations.
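The iteration can be illustrated on a deliberately simple toy problem. The sketch below is not the marginal hazards model: it assumes clusters of three responses with a common scalar mean $β$ and uses two basis matrices (the identity and the first off-diagonal matrix of an AR(1)-type structure), so that each $g_{i}$ has two components and its derivative with respect to $β$ is constant. All names are hypothetical; only the update rule, based on the approximate derivatives above, is the point.

```python
def g_vectors(y, beta):
    """Toy extended scores for clusters of size 3 with common mean beta.

    With residuals r_ik = y_ik - beta, the identity basis gives r1 + r2 + r3 and
    the first off-diagonal basis gives r1 + 2*r2 + r3.
    """
    out = []
    for y1, y2, y3 in y:
        r1, r2, r3 = y1 - beta, y2 - beta, y3 - beta
        out.append((r1 + r2 + r3, r1 + 2.0 * r2 + r3))
    return out

def qif_newton(y, beta=0.0, tol=1e-8, max_iter=50):
    """Newton-Raphson on the QIF using the approximate first and second derivatives."""
    n = len(y)
    dG = (-3.0, -4.0)  # derivative of G_n with respect to beta (constant in this toy model)
    for _ in range(max_iter):
        g = g_vectors(y, beta)
        G = [sum(gi[j] for gi in g) / n for j in range(2)]
        w11 = sum(gi[0] * gi[0] for gi in g) / n ** 2
        w12 = sum(gi[0] * gi[1] for gi in g) / n ** 2
        w22 = sum(gi[1] * gi[1] for gi in g) / n ** 2
        det = w11 * w22 - w12 * w12

        def winv(v):
            # W^{-1} v for the symmetric 2x2 matrix W
            return ((w22 * v[0] - w12 * v[1]) / det, (w11 * v[1] - w12 * v[0]) / det)

        q1 = 2.0 * sum(d * x for d, x in zip(dG, winv(G)))   # approximate Q^(1)
        q2 = 2.0 * sum(d * x for d, x in zip(dG, winv(dG)))  # approximate Q^(2)
        step = q1 / q2
        beta -= step
        if abs(step) < tol:
            break
    return beta
```

In the estimated QIF for the hazards model, `g_vectors` would be replaced by the much more involved ${\hat{g}}_{i} (β)$ and the constant `dG` by ${\hat{G}}_{n}^{(1)} (β)$, but the update $β \leftarrow β - {\hat{Q}}_{n}^{(1)} / {\hat{Q}}_{n}^{(2)}$ has the same form.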

Asymptotic properties

In this section, we present the asymptotic properties of the proposed estimated QIF estimator ${\hat{β}}_{Q}$ and provide a standard error formula for it.

Let $n_{k}$ denote the number of subjects in $V_{k}$ and assume $n_{k} / n \to ρ_{k}$ as $n \to \infty$ , where $ρ_{k}$ represents the probability of subject (i, k) being sampled into the kth marginal validation set. Under the conditions listed in “Appendix”, we demonstrate the asymptotic behavior of ${\hat{β}}_{Q}$ in the following theorems.

Theorem 1

Under conditions (C1)–(C9) in “Appendix”, the following results hold:

(I) The proposed estimator ${\hat{β}}_{Q}$ is a consistent estimator of $β_{0}$ .
(II) $\sqrt{n} ({\hat{β}}_{Q} - β_{0})$ is asymptotically normally distributed with mean zero and variance matrix $Σ_{Q} (β_{0}) = {(J_{0}^{T} W_{0}^{- 1} J_{0})}^{- 1}$ , where $J_{0} = J (β_{0})$ , with

\begin{matrix} J (β_{0}) = (\begin{matrix} Γ (β_{0}, B_{1}) \\ ⋮ \\ Γ (β_{0}, B_{L}) \end{matrix}), \end{matrix}

and $Γ (β_{0}, B)$ and $W_{0}$ are defined in (A.1) and (A.2) in “Appendix”, respectively.

The asymptotic covariance $Σ_{Q} (β_{0})$ can be consistently estimated by

\begin{matrix} {\hat{Σ}}_{Q} ({\hat{β}}_{Q}) = \{{\hat{J}}^{T} ({\hat{β}}_{Q}) {\hat{W}}^{- 1} \hat{J} ({\hat{β}}_{Q})\}^{- 1}, \end{matrix}

where

\begin{matrix} \hat{J} (β) = (\begin{matrix} \hat{Γ} (β, B_{1}) \\ ⋮ \\ \hat{Γ} (β, B_{L}) \end{matrix}), \end{matrix}

with

\begin{matrix} \hat{Γ} (β, B) & = \frac{1}{n} \sum_{i = 1}^{n} \sum_{k = 1}^{K} Δ_{ik} (\sum_{m = 1}^{K} \frac{\partial {\hat{R}}_{im} (T_{ik} ; β) {\hat{ϕ}}_{imk} (T_{ik} ; β, B)}{\partial β^{T}} - \frac{{\hat{S}}_{k}^{3} (T_{ik} ; β, B)}{{\hat{S}}_{k}^{0} (T_{ik} ; β)}) \\ (- \frac{{\hat{S}}_{k}^{4} (T_{ik} ; β, B) + {\hat{S}}_{k}^{5} (T_{ik} ; β, B)}{{\hat{S}}_{k}^{0} (T_{ik} ; β)} + \frac{{\hat{S}}_{k}^{2} (T_{ik} ; β, B) {({\hat{S}}_{k}^{1} (T_{ik} ; β))}^{T}}{{({\hat{S}}_{k}^{0} (T_{ik} ; β))}^{2}}), \end{matrix}

and

\begin{matrix} \hat{W} = \{\hat{ω} ({\hat{β}}_{Q}, B_{j}, B_{j'})\}_{j, j' = 1}^{L}, \end{matrix}

where

\begin{matrix} \hat{ω} (β, B_{j}, B_{j'}) = n^{- 1} \sum_{i = 1}^{n} \sum_{k = 1}^{K} \sum_{m = 1}^{K} {\hat{U}}_{ik} (β, B_{j}) {\hat{U}}_{im}^{T} (β, B_{j'}), \end{matrix}

with

\begin{matrix} {\hat{U}}_{ik} (β, B) & = Δ_{ik} {\hat{ζ}}_{ik} (T_{ik} ; β, B) - \sum_{l = 1}^{n} Δ_{lk} {\hat{ζ}}_{ik} (T_{lk} ; β, B) \frac{Y_{ik} (T_{lk}) {\hat{r}}_{ik} (T_{lk} ; β)}{n {\hat{S}}_{k}^{0} (T_{lk} ; β)} \\ - \frac{n - n_{k}}{n_{k}} \sum_{l = 1}^{n} Δ_{lk} {\hat{ζ}}_{ik} (T_{lk} ; β, B) \frac{Y_{ik} (T_{lk}) ({\hat{r}}_{ik} (T_{lk} ; β) - {\hat{ψ}}_{ik} (T_{lk} ; β))}{n {\hat{S}}_{k}^{0} (T_{lk} ; β)}, \end{matrix}

where ${\hat{S}}_{k}^{d} (d = 0, \dots, 5)$ and ${\hat{ζ}}_{ik} (t ; β, B)$ are defined in “Appendix”.

Remark 1

As a special case of the estimated QIF estimator, the estimator that uses the independent working correlation is denoted by ${\hat{β}}_{I}$ . It coincides with the estimated pseudo-partial likelihood (EPPL) estimator of Liu et al. (2010), but here we provide different expressions for the variance matrix $Σ_{I} (β_{0})$ of the asymptotic distribution of $\sqrt{n} ({\hat{β}}_{I} - β_{0})$ and for its estimator ${\hat{Σ}}_{I} ({\hat{β}}_{I})$ , under conditions (C1)–(C8) in “Appendix”. From the corresponding expressions for $Σ_{Q} (β_{0})$ and ${\hat{Σ}}_{Q} ({\hat{β}}_{Q})$ , we obtain

\begin{matrix} Σ_{I} (β_{0}) = Γ^{- 1} (β_{0}, I) ω (β_{0}, I, I) Γ^{- 1} (β_{0}, I), \end{matrix}

and

\begin{matrix} {\hat{Σ}}_{I} ({\hat{β}}_{I}) = {\hat{Γ}}^{- 1} ({\hat{β}}_{I}, I) \hat{ω} ({\hat{β}}_{I}, I, I) {\hat{Γ}}^{- 1} ({\hat{β}}_{I}, I) . \end{matrix}

Inference on hazard ratio parameters

The QIF is built on an objective function, which provides a natural way to make inference about the hazard ratio parameter $β$ . Suppose that $β$ is partitioned into $γ$ and $δ$ , where $γ$ is a vector of hazard ratio parameters of interest with dimension $p_{0}$ , and $δ$ is a vector of nuisance parameters with dimension $p - p_{0}$ . As a special case, we also allow $p_{0} = p$ , with $β = γ$ and $δ$ being absent.

To test

\begin{matrix} H_{0} : γ = γ_{0} \quad \text{versus} \quad H_{1} : γ \neq γ_{0}, \end{matrix}

we propose a test statistic

\begin{matrix} T = {\hat{Q}}_{n} (γ_{0}, \tilde{δ}) - {\hat{Q}}_{n} (\hat{γ}, \hat{δ}), \end{matrix}

4.1

where

\begin{matrix} \tilde{δ} = \underset{δ}{\arg\min} \, {\hat{Q}}_{n} (γ_{0}, δ), \quad (\hat{γ}, \hat{δ}) = \underset{(γ, δ)}{\arg\min} \, {\hat{Q}}_{n} (γ, δ) . \end{matrix}

The values of ${\hat{Q}}_{n} (γ_{0}, \tilde{δ})$ and ${\hat{Q}}_{n} (\hat{γ}, \hat{δ})$ measure how well the model fits the data under $H_{0}$ and $H_{1}$ , respectively. Under $H_{0}$ , the difference between ${\hat{Q}}_{n} (γ_{0}, \tilde{δ})$ and ${\hat{Q}}_{n} (\hat{γ}, \hat{δ})$ should be very small. However, under $H_{1}$ , ${\hat{Q}}_{n} (γ_{0}, \tilde{δ})$ should be systematically larger than ${\hat{Q}}_{n} (\hat{γ}, \hat{δ})$ .

Theorem 2

Suppose conditions (C1)–(C9) in “Appendix” are satisfied. Then, under $H_{0}$ , the test statistic T asymptotically follows a chi-squared distribution with $p_{0}$ degrees of freedom.
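In practice the test amounts to comparing T with a chi-squared reference distribution. The following sketch computes the p-value from the two minimized QIF values; it uses only the Python standard library, with the chi-squared tail probability obtained from the recurrence for the regularized upper incomplete gamma function at integer and half-integer arguments. The function names `chi2_sf` and `qif_test` are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi2_df > x) for integer df >= 1.

    Uses Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1) with z = x / 2,
    starting from Q(1/2, z) = erfc(sqrt(z)) or Q(1, z) = exp(-z).
    """
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def qif_test(q_null, q_full, p0, level=0.05):
    """Return (T, p-value, reject) for T = Q_n(null fit) - Q_n(unrestricted fit)."""
    t_stat = q_null - q_full
    # a slightly negative T can arise numerically; treat it as zero for the p-value
    p_value = chi2_sf(max(t_stat, 0.0), p0)
    return t_stat, p_value, p_value < level
```

Here `q_null` stands for ${\hat{Q}}_{n} (γ_{0}, \tilde{δ})$ and `q_full` for ${\hat{Q}}_{n} (\hat{γ}, \hat{δ})$; both are assumed to come from separate minimizations of the estimated QIF.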

Comment

To prove Theorem 2, we write

\begin{matrix} {\hat{Q}}_{n} (γ_{0}, \tilde{δ}) - {\hat{Q}}_{n} (\hat{γ}, \hat{δ}) & = ({\hat{Q}}_{n} (γ_{0}, \tilde{δ}) - Q_{n} (γ_{0}, \tilde{δ})) - ({\hat{Q}}_{n} (\hat{γ}, \hat{δ}) - Q_{n} (\hat{γ}, \hat{δ})) \\ + (Q_{n} (γ_{0}, \tilde{δ}) - Q_{n} (γ_{0}, {\tilde{δ}}^{*})) - (Q_{n} (\hat{γ}, \hat{δ}) - Q_{n} ({\hat{γ}}^{*}, {\hat{δ}}^{*})) \\ + (Q_{n} (γ_{0}, {\tilde{δ}}^{*}) - Q_{n} ({\hat{γ}}^{*}, {\hat{δ}}^{*})), \end{matrix}

where $Q_{n}$ is defined as in (2.6), ${\tilde{δ}}^{*} = \underset{δ}{\arg\min} \, Q_{n} (γ_{0}, δ)$ , and $({\hat{γ}}^{*}, {\hat{δ}}^{*}) = \underset{(γ, δ)}{\arg\min} \, Q_{n} (γ, δ)$ . From the proof of Theorem 1, the first two brackets are $o_{p} (1)$ . In addition, by Theorem 1 above and Theorem 1 of Xue et al. (2010), both ${\hat{β}}_{Q}$ and ${\tilde{β}}_{Q}$ are consistent estimators of $β_{0}$ , so the third and fourth brackets are also $o_{p} (1)$ . Finally, the last bracket converges in distribution to a chi-squared random variable with $p_{0}$ degrees of freedom; the proof is similar to that of Theorem 1 in Qu et al. (2000).

Simulation studies

In this section, we conduct simulation studies to evaluate the finite-sample behavior of the proposed method. We first evaluate the performance of proposed estimator in Sect. 5.1 and then the performance of inference method in Sect. 5.2.

Performance of estimated QIF estimator

We compare the proposed estimator with the QIF estimator proposed by Xue et al. (2010) based only on the validation set and the EPPL estimator ( ${\hat{β}}_{I}$ ) of Liu et al. (2010), which utilizes the auxiliary information but does not consider the intra-cluster correlation in the estimate of $β$ . The proposed estimator takes both the intra-cluster correlation and the auxiliary information into account.

The covariates $(X_{i 1}, X_{i 2}, \dots, X_{iK})$ , which in real studies are observed only in the validation sets, are generated independently from the uniform distribution U(0, 1). The covariates $(Z_{i 1}, Z_{i 2}, \dots, Z_{iK})$ are independent binary covariates taking the value one with probability 0.5. The multivariate failure times ${\tilde{T}}_{i 1}, {\tilde{T}}_{i 2}, \dots, {\tilde{T}}_{iK}$ are generated from the multivariate model of Clayton and Cuzick (1985) with the joint survival function

\begin{matrix} S (t_{1}, \dots, t_{K} ; D_{1}, \dots, D_{K}) = {(\sum_{k = 1}^{K} \exp (θ^{- 1} λ_{0 k} t_{k} e^{β_{k}^{T} D_{k}}) - (K - 1))}^{- θ}, \end{matrix}

where $D_{k} = {(X_{k}^{T}, Z_{k}^{T})}^{T}$ , $β_{k}$ , which may vary with the failure type, is the parameter corresponding to $D_{k}$ , and $θ$ is the dependence parameter, with larger values representing weaker dependence between the failure times. We set $θ = 0.25$ , 0.5 or 2, representing varying degrees of correlation between the generated failure times, and take the baseline hazard function to be $λ_{0 k} = 1$ . The failure times $(t_{1}, \dots, t_{K})$ are generated using the algorithm described in Cai and Shen (2000):

\begin{matrix} t_{1} & = - e^{- β_{1}^{T} D_{1}} \ln (1 - u_{1}), \\ t_{k} & = θ e^{- β_{k}^{T} D_{k}} \ln [(k - 1) - \sum_{i = 1}^{k - 1} a_{i} + (\sum_{i = 1}^{k - 1} a_{i} - (k - 2)) {(1 - u_{k})}^{- {(θ + k - 1)}^{- 1}}], \end{matrix}

for $k = 2, \dots, K$ , where $a_{i} = \exp (θ^{- 1} t_{i} e^{β_{i}^{T} D_{i}})$ for $i = 1, \dots, k - 1$ , and $(u_{1}, \dots, u_{K})$ are generated from the uniform distribution on the interval (0, 1). Censoring times are generated from U(0, c), where c is a constant selected to achieve a specified censoring rate.
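The sequential algorithm above translates almost line by line into code. The following Python sketch draws one cluster of failure times with unit baseline hazards; it takes the linear predictors $β_{k}^{T} D_{k}$ as input and assumes the uniform variates are drawn independently. The function name `simulate_cluster` is hypothetical.

```python
import math
import random

def simulate_cluster(linpred, theta, rng):
    """Draw (t_1, ..., t_K) from the Clayton-type model with baseline hazards equal to 1.

    linpred[k-1] : linear predictor beta_k^T D_k for failure type k
    theta        : dependence parameter (larger values mean weaker dependence)
    rng          : a random.Random instance supplying the uniform variates
    """
    times, a = [], []
    for k, lp in enumerate(linpred, start=1):
        u = rng.random()
        if k == 1:
            t = -math.exp(-lp) * math.log(1.0 - u)
        else:
            s = sum(a)  # sum of a_1, ..., a_{k-1}
            inner = (k - 1) - s + (s - (k - 2)) * (1.0 - u) ** (-1.0 / (theta + k - 1))
            t = theta * math.exp(-lp) * math.log(inner)
        times.append(t)
        a.append(math.exp(t * math.exp(lp) / theta))
    return times
```

Each marginal distribution is exponential with rate $e^{β_{k}^{T} D_{k}}$, so with zero linear predictors the sample mean of any component should be close to one; censoring would be applied afterwards by comparing each time with an independent U(0, c) draw.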

Because the true correlation structure of the Clayton model is exchangeable, the working correlation in (2.9) is taken to be exchangeable, and the corresponding estimated QIF estimator is denoted by ${\hat{β}}_{QE}$ . We also compute another estimated QIF estimator, ${\hat{β}}_{QA}$ , using the misspecified AR(1) working correlation. The corresponding estimators from the QIF method based only on the validation set are denoted by ${\tilde{β}}_{QEV}$ and ${\tilde{β}}_{QAV}$ , respectively.

To estimate the induced relative risk function and the baseline hazard function, we apply the Epanechnikov kernel function in (2.8) and (2.7) with bandwidths $μ_{k} = 2 \hat{σ} (A_{V_{k}}) n_{k}^{- 1 / 3}$ and $ν_{k} = 1.06 \hat{σ} (T_{k}) n^{- 1 / 5}$ , respectively, where $\hat{σ} (\cdot)$ denotes the sample standard deviation and $A_{V_{k}}$ is the part of the auxiliary covariate $A_{k}$ in the kth marginal validation set. We use nearest-neighbor interpolation to estimate the induced relative risk function when the denominator in (2.8) is 0. It is also worth noting that ${\hat{λ}}_{0 k} (t)$ may be 0 at some locations because the Epanechnikov kernel has bounded support, which can make the diagonal matrix ${\hat{Ξ}}_{i} (t ; β)$ non-invertible. If this happens, we replace ${\hat{λ}}_{0 k} (t)$ with the average of its values at the non-zero locations.
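The two bandwidth rules are simple enough to state in code. The sketch below assumes that `aux_validation` holds the auxiliary values of the $n_{k}$ validation subjects for failure type k and that `obs_times` holds the n observed times for that failure type; the function names are hypothetical.

```python
import statistics

def bandwidth_mu(aux_validation):
    """mu_k = 2 * sd(A in the k-th validation set) * n_k^(-1/3)."""
    n_k = len(aux_validation)
    return 2.0 * statistics.stdev(aux_validation) * n_k ** (-1.0 / 3.0)

def bandwidth_nu(obs_times):
    """nu_k = 1.06 * sd(T_k) * n^(-1/5), with n the number of clusters."""
    n = len(obs_times)
    return 1.06 * statistics.stdev(obs_times) * n ** (-1.0 / 5.0)
```

`statistics.stdev` uses the n - 1 denominator, which matches the usual meaning of the sample standard deviation.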

We consider two types of simulations:

(1) The $β_{k}$ are the same for all failure types, i.e. $β_{1} = \dots = β_{K} = β$ .
(2) $β_{k}$ varies across failure types.

The auxiliary covariate $A_{k}$ is generated from $X_{k}$ via

\begin{matrix} A_{k} = X_{k} + ϵ_{k}, \end{matrix}

where $ϵ_{k}$ follows a normal distribution $N (0, σ^{2})$ and the positive parameter $σ$ controls the strength of association between $A_{k}$ and $X_{k}$ . Each simulation is repeated 1000 times.

Simulation study (1)

In the first simulation, we set the true parameter to $β = {(β^{(1)}, β^{(2)})}^{T} = {(0.693, - 0.2)}^{T}$ , the validation proportion to $ρ_{k} = 0.5$ , and the association parameter to $σ = 0.1$ . The number of independent clusters is $n = 200$ , with $K = 4$ or 8 failure types in each cluster.

Tables 1, 2 and 3 report the simulation results for the estimates of $β$ from each method under censoring rates of 10%, 40% and 80%. The sample mean and sample standard deviation of the 1000 estimates, the average of the estimated standard errors, and the coverage rate of the 95% confidence intervals for the true parameter are listed in the Est, SD, SE and CR columns, respectively. RE, the ratio of the empirical variance of ${\hat{β}}_{I}$ to that of ${\hat{β}}_{QE}$ or ${\hat{β}}_{QA}$ , is the estimated relative efficiency of the estimated QIF estimators relative to ${\hat{β}}_{I}$ . We summarize the results as follows.

(i) All methods yield approximately unbiased estimates. Moreover, the estimated asymptotic standard errors are approximately equal to the empirical standard deviations, and the corresponding 95% confidence intervals based on the estimated standard errors provide reasonable coverage rates. This suggests that the asymptotic standard error estimates work well for all methods.
(ii) In each scenario considered, the estimator ${\hat{β}}_{I}$ , which uses the auxiliary information, is more efficient than the estimators ${\tilde{β}}_{QEV}$ and ${\tilde{β}}_{QAV}$ , which use the validation set only. However, ${\hat{β}}_{I}$ loses efficiency as the correlation within a cluster becomes stronger.
(iii) As K increases, the empirical standard deviations (SD) of all the estimators decrease, which is a natural consequence of the increase in the total amount of data.
(iv) As $θ$ decreases, the efficiency gain of the estimated QIF estimators relative to ${\hat{β}}_{I}$ increases. In Table 1, the estimated QIF estimators are more efficient than the other estimators for all combinations of $θ$ and K, and the same trend is observed in Table 2. In Table 3, however, the estimated QIF estimators are less efficient than ${\hat{β}}_{I}$ in several cases, although the REs are very close to 1; this is due to the reduction in correlation when the censoring rate is 80%. Furthermore, as expected, ${\hat{β}}_{QE}$ with the correct working correlation is always more efficient than ${\hat{β}}_{QA}$ with the misspecified working correlation.
(v) The validation proportion of the incomplete covariate affects the values of RE, especially for the first parameter. For example, when $K = 4$ and the censoring rate is 10%, the REs of ${\hat{β}}_{QE}$ relative to ${\hat{β}}_{I}$ for ${\hat{β}}^{(1)}$ decrease from (3.34, 2.26, 1.21) to (2.85, 2.03, 1.18) when the validation proportion decreases from 0.5 to 0.3 (results not shown). However, when we increased n, both the CRs and the REs increased.
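The RE column can be approximately reproduced from the reported standard deviations, since RE is defined as a ratio of empirical variances. The small sketch below uses the rounded SD values from Table 1 ($K = 4$, $θ = 0.25$, first coefficient); the function name is hypothetical.

```python
def relative_efficiency(sd_reference, sd_candidate):
    """Ratio of empirical variances: Var(reference) / Var(candidate)."""
    return (sd_reference / sd_candidate) ** 2

# Table 1, K = 4, theta = 0.25, first coefficient:
# SD = 0.143 for the independence estimator, SD = 0.078 for the exchangeable QIF estimator
re_qe = relative_efficiency(0.143, 0.078)
```

The result is about 3.36, close to the tabulated 3.34; the small gap is what one would expect if the tabulated value was computed from unrounded standard deviations.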

Table 1.

Simulation results for common effect size across failure type: $β = {(β^{(1)}, β^{(2)})}^{T} = {(0.693, - 0.2)}^{T}$ under the censoring rate 10%

K	$θ$	Method	${\hat{β}}^{(1)}$					${\hat{β}}^{(2)}$
			Est	SD	SE	CR	RE	Est	SD	SE	CR	RE
4	0.25	${\tilde{β}}_{QEV}$	0.686	0.164	0.163	0.952	–	-0.202	0.098	0.091	0.926	–
		${\tilde{β}}_{QAV}$	0.682	0.171	0.169	0.948	–	-0.205	0.104	0.095	0.917	–
		${\hat{β}}_{I}$	0.675	0.143	0.141	0.945	–	-0.203	0.077	0.075	0.938	–
		${\hat{β}}_{QE}$	0.684	0.078	0.078	0.942	3.34	-0.200	0.039	0.038	0.936	3.93
		${\hat{β}}_{QA}$	0.680	0.089	0.088	0.933	2.62	-0.201	0.047	0.045	0.941	2.65
	0.5	${\tilde{β}}_{QEV}$	0.683	0.166	0.167	0.955	–	-0.202	0.103	0.094	0.925	–
		${\tilde{β}}_{QAV}$	0.679	0.174	0.173	0.946	–	-0.205	0.107	0.097	0.926	–
		${\hat{β}}_{I}$	0.673	0.142	0.139	0.943	–	-0.202	0.077	0.075	0.941	–
		${\hat{β}}_{QE}$	0.681	0.094	0.091	0.935	2.26	-0.199	0.048	0.047	0.940	2.55
		${\hat{β}}_{QA}$	0.676	0.103	0.101	0.930	1.91	-0.201	0.055	0.053	0.945	1.97
	2	${\tilde{β}}_{QEV}$	0.678	0.184	0.179	0.939	–	-0.203	0.111	0.102	0.932	–
		${\tilde{β}}_{QAV}$	0.676	0.188	0.181	0.924	–	-0.204	0.112	0.103	0.927	–
		${\hat{β}}_{I}$	0.673	0.139	0.137	0.946	–	-0.201	0.078	0.075	0.938	–
		${\hat{β}}_{QE}$	0.675	0.127	0.123	0.938	1.21	-0.200	0.069	0.067	0.941	1.24
		${\hat{β}}_{QA}$	0.672	0.133	0.127	0.928	1.11	-0.201	0.072	0.069	0.935	1.15
8	0.25	${\tilde{β}}_{QEV}$	0.699	0.122	0.113	0.939	–	-0.202	0.066	0.062	0.933	–
		${\tilde{β}}_{QAV}$	0.697	0.130	0.121	0.929	–	-0.201	0.070	0.067	0.936	–
		${\hat{β}}_{I}$	0.683	0.106	0.103	0.941	–	-0.201	0.054	0.054	0.949	–
		${\hat{β}}_{QE}$	0.686	0.061	0.059	0.938	3.05	-0.200	0.027	0.026	0.942	4.12
		${\hat{β}}_{QA}$	0.687	0.068	0.066	0.940	2.38	-0.200	0.031	0.031	0.949	2.96
	0.5	${\tilde{β}}_{QEV}$	0.698	0.126	0.116	0.927	–	-0.200	0.069	0.064	0.923	–
		${\tilde{β}}_{QAV}$	0.695	0.133	0.123	0.935	–	-0.200	0.072	0.068	0.931	–
		${\hat{β}}_{I}$	0.684	0.103	0.101	0.945	–	-0.201	0.054	0.054	0.950	–
		${\hat{β}}_{QE}$	0.688	0.068	0.065	0.931	2.31	-0.200	0.032	0.032	0.946	2.89
		${\hat{β}}_{QA}$	0.687	0.077	0.073	0.929	1.80	-0.200	0.037	0.037	0.939	2.17
	2	${\tilde{β}}_{QEV}$	0.698	0.135	0.125	0.922	–	-0.199	0.074	0.071	0.941	–
		${\tilde{β}}_{QAV}$	0.694	0.139	0.128	0.918	–	-0.200	0.075	0.073	0.945	–
		${\hat{β}}_{I}$	0.687	0.099	0.098	0.935	–	-0.200	0.051	0.053	0.949	–
		${\hat{β}}_{QE}$	0.690	0.087	0.084	0.938	1.29	-0.199	0.044	0.045	0.948	1.37
		${\hat{β}}_{QA}$	0.687	0.094	0.090	0.931	1.13	-0.200	0.048	0.048	0.944	1.16


${\tilde{β}}_{QEV}$ is the estimator of the QIF method with exchangeable working correlation based only on the validation set, while ${\tilde{β}}_{QAV}$ is the one with AR(1) working correlation. ${\hat{β}}_{I}$ is the EPPL estimator using the independent structure. ${\hat{β}}_{QE}$ and ${\hat{β}}_{QA}$ are the estimators of the proposed estimated QIF method with exchangeable and AR(1) working correlation, respectively

Table 2.

Simulation results for common effect size across failure type: $β = {(β^{(1)}, β^{(2)})}^{T} = {(0.693, - 0.2)}^{T}$ under the censoring rate 40%

K	$θ$	Method	${\hat{β}}^{(1)}$					${\hat{β}}^{(2)}$
K	$θ$	Method	Est	SD	SE	CR	RE	Est	SD	SE	CR	RE
4	0.25	${\tilde{β}}_{QEV}$	0.679	0.213	0.206	0.936	–	- 0.205	0.121	0.117	0.946	–
		${\tilde{β}}_{QAV}$	0.676	0.222	0.213	0.932	–	- 0.206	0.124	0.121	0.950	–
		${\hat{β}}_{I}$	0.681	0.177	0.169	0.936	–	- 0.205	0.093	0.092	0.950	–
		${\hat{β}}_{QE}$	0.687	0.120	0.116	0.941	2.18	- 0.201	0.062	0.061	0.946	2.20
		${\hat{β}}_{QA}$	0.682	0.131	0.127	0.934	1.82	- 0.202	0.069	0.068	0.949	1.79
	0.5	${\tilde{β}}_{QEV}$	0.673	0.218	0.212	0.935	–	- 0.206	0.127	0.120	0.941	–
		${\tilde{β}}_{QAV}$	0.670	0.224	0.217	0.934	–	- 0.207	0.129	0.123	0.940	–
		${\hat{β}}_{I}$	0.680	0.174	0.168	0.939	–	- 0.204	0.091	0.092	0.954	–
		${\hat{β}}_{QE}$	0.685	0.132	0.131	0.944	1.73	- 0.202	0.070	0.070	0.952	1.66
		${\hat{β}}_{QA}$	0.679	0.143	0.140	0.935	1.48	- 0.203	0.076	0.076	0.954	1.45
	2	${\tilde{β}}_{QEV}$	0.668	0.225	0.221	0.939	–	- 0.204	0.134	0.127	0.944	–
		${\tilde{β}}_{QAV}$	0.666	0.226	0.222	0.934	–	- 0.205	0.134	0.127	0.937	–
		${\hat{β}}_{I}$	0.679	0.166	0.166	0.942	–	- 0.202	0.092	0.092	0.954	–
		${\hat{β}}_{QE}$	0.678	0.156	0.157	0.940	1.14	- 0.202	0.088	0.086	0.950	1.11
		${\hat{β}}_{QA}$	0.675	0.161	0.160	0.940	1.07	- 0.203	0.090	0.088	0.947	1.05
8	0.25	${\tilde{β}}_{QEV}$	0.702	0.152	0.143	0.929	–	- 0.201	0.084	0.080	0.944	–
		${\tilde{β}}_{QAV}$	0.701	0.159	0.152	0.940	–	- 0.201	0.088	0.085	0.944	–
		${\hat{β}}_{I}$	0.695	0.122	0.123	0.948	–	- 0.203	0.066	0.066	0.947	–
		${\hat{β}}_{QE}$	0.696	0.085	0.082	0.941	2.08	- 0.201	0.042	0.041	0.944	2.39
		${\hat{β}}_{QA}$	0.697	0.094	0.092	0.942	1.68	- 0.201	0.048	0.048	0.944	1.85
	0.5	${\tilde{β}}_{QEV}$	0.706	0.157	0.147	0.934	–	- 0.200	0.087	0.083	0.945	–
		${\tilde{β}}_{QAV}$	0.702	0.163	0.154	0.948	–	- 0.200	0.090	0.087	0.950	–
		${\hat{β}}_{I}$	0.697	0.120	0.121	0.946	–	- 0.202	0.066	0.065	0.951	–
		${\hat{β}}_{QE}$	0.700	0.092	0.090	0.945	1.69	- 0.200	0.048	0.047	0.955	1.85
		${\hat{β}}_{QA}$	0.699	0.102	0.099	0.938	1.39	- 0.200	0.054	0.053	0.950	1.48
	2	${\tilde{β}}_{QEV}$	0.700	0.165	0.155	0.939	–	- 0.199	0.089	0.088	0.947	–
		${\tilde{β}}_{QAV}$	0.697	0.168	0.158	0.934	–	- 0.200	0.091	0.090	0.945	–
		${\hat{β}}_{I}$	0.695	0.120	0.118	0.949	–	- 0.200	0.063	0.065	0.950	–
		${\hat{β}}_{QE}$	0.698	0.111	0.108	0.943	1.16	- 0.199	0.058	0.059	0.946	1.19
		${\hat{β}}_{QA}$	0.695	0.116	0.113	0.950	1.07	- 0.201	0.062	0.062	0.950	1.04


See the footnote of Table 1 for the definitions of the estimators

Table 3.

Simulation results for common effect size across failure type: $β = {(β^{(1)}, β^{(2)})}^{T} = {(0.693, - 0.2)}^{T}$ under the censoring rate 80%

K	$θ$	Method	${\hat{β}}^{(1)}$					${\hat{β}}^{(2)}$
K	$θ$	Method	Est	SD	SE	CR	RE	Est	SD	SE	CR	RE
4	0.25	${\tilde{β}}_{QEV}$	0.655	0.412	0.374	0.924	–	- 0.216	0.250	0.215	0.921	–
		${\tilde{β}}_{QAV}$	0.649	0.414	0.379	0.928	–	- 0.216	0.244	0.219	0.923	–
		${\hat{β}}_{I}$	0.684	0.308	0.288	0.939	–	- 0.198	0.164	0.159	0.941	–
		${\hat{β}}_{QE}$	0.695	0.264	0.248	0.936	1.36	- 0.199	0.147	0.136	0.935	1.24
		${\hat{β}}_{QA}$	0.685	0.277	0.259	0.943	1.23	- 0.200	0.153	0.143	0.936	1.15
	0.5	${\tilde{β}}_{QEV}$	0.652	0.435	0.381	0.929	–	- 0.211	0.241	0.221	0.938	–
		${\tilde{β}}_{QAV}$	0.651	0.412	0.384	0.933	–	- 0.209	0.238	0.222	0.939	–
		${\hat{β}}_{I}$	0.689	0.304	0.287	0.939	–	- 0.196	0.160	0.159	0.946	–
		${\hat{β}}_{QE}$	0.691	0.282	0.266	0.938	1.17	- 0.199	0.155	0.147	0.940	1.07
		${\hat{β}}_{QA}$	0.682	0.290	0.273	0.938	1.10	- 0.200	0.156	0.151	0.945	1.06
	2	${\tilde{β}}_{QEV}$	0.658	0.432	0.385	0.918	–	- 0.210	0.248	0.224	0.932	–
		${\tilde{β}}_{QAV}$	0.660	0.417	0.386	0.915	–	- 0.210	0.250	0.224	0.936	–
		${\hat{β}}_{I}$	0.698	0.296	0.287	0.942	–	- 0.195	0.165	0.159	0.945	–
		${\hat{β}}_{QE}$	0.685	0.298	0.282	0.938	0.99	- 0.202	0.167	0.157	0.936	0.98
		${\hat{β}}_{QA}$	0.683	0.299	0.283	0.934	0.98	- 0.202	0.167	0.157	0.931	0.97
8	0.25	${\tilde{β}}_{QEV}$	0.705	0.269	0.260	0.937	–	- 0.203	0.147	0.148	0.953	–
		${\tilde{β}}_{QAV}$	0.696	0.284	0.269	0.949	–	- 0.205	0.149	0.154	0.961	–
		${\hat{β}}_{I}$	0.706	0.214	0.206	0.937	–	- 0.205	0.111	0.113	0.953	–
		${\hat{β}}_{QE}$	0.710	0.178	0.169	0.939	1.44	- 0.204	0.093	0.092	0.942	1.43
		${\hat{β}}_{QA}$	0.706	0.195	0.183	0.928	1.20	- 0.206	0.100	0.100	0.956	1.23
	0.5	${\tilde{β}}_{QEV}$	0.699	0.281	0.266	0.934	–	- 0.202	0.151	0.153	0.949	–
		${\tilde{β}}_{QAV}$	0.692	0.288	0.271	0.930	–	- 0.204	0.153	0.156	0.947	–
		${\hat{β}}_{I}$	0.707	0.204	0.204	0.937	–	- 0.202	0.111	0.112	0.954	–
		${\hat{β}}_{QE}$	0.709	0.188	0.183	0.938	1.19	- 0.202	0.100	0.101	0.954	1.22
		${\hat{β}}_{QA}$	0.705	0.199	0.193	0.929	1.06	- 0.204	0.106	0.106	0.949	1.10
	2	${\tilde{β}}_{QEV}$	0.684	0.291	0.273	0.932	–	- 0.205	0.164	0.157	0.937	–
		${\tilde{β}}_{QAV}$	0.681	0.290	0.273	0.929	–	- 0.204	0.161	0.157	0.942	–
		${\hat{β}}_{I}$	0.705	0.207	0.203	0.943	–	- 0.203	0.112	0.112	0.954	–
		${\hat{β}}_{QE}$	0.702	0.207	0.198	0.942	1.00	- 0.205	0.111	0.109	0.947	1.02
		${\hat{β}}_{QA}$	0.700	0.209	0.200	0.944	0.98	- 0.206	0.112	0.110	0.947	0.99


See the footnote of Table 1 for the definitions of the estimators

Simulation study (2)

In practical studies, one may be interested in the failure-type-specific model

\begin{matrix} λ_{ik} (t ; {\tilde{Z}}_{ik} (t)) = λ_{0 k} (t) \exp \{β_{k}^{T} {\tilde{Z}}_{ik} (t)\}, \end{matrix}

which allows the regression parameters to vary with the failure type. We simulate $K = 2$ failure types in each cluster, with true parameters $β_{1} = {(β_{1}^{(1)}, β_{1}^{(2)})}^{T} = {(0.693, - 0.2)}^{T}$ and $β_{2} = {(β_{2}^{(1)}, β_{2}^{(2)})}^{T} = {(0.5, - 0.262)}^{T}$ . Since the cluster size is 2, we only need to consider the exchangeable working correlation structure. We consider three settings of n and censoring rate (CE): (n, CE) $= (300, 10 %), (300, 40 %), (700, 80 %)$ . The simulation results are shown in Table 4 and are similar to those of Simulation Study (1).

Table 4.

Simulation results for varying effect size across failure type: $β_{1} = {(β_{1}^{(1)}, β_{1}^{(2)})}^{T} = {(0.693, - 0.2)}^{T}, β_{2} = {(β_{2}^{(1)}, β_{2}^{(2)})}^{T} = {(0.5, - 0.262)}^{T}$

$θ$	Method	${\hat{β}}_{1}^{(1)}$					${\hat{β}}_{1}^{(2)}$					${\hat{β}}_{2}^{(1)}$					${\hat{β}}_{2}^{(2)}$
$θ$	Method	Est	SD	SE	CR	RE	Est	SD	SE	CR	RE	Est	SD	SE	CR	RE	Est	SD	SE	CR	RE
		$n = 300$ , censoring rate 10%
0.25	${\tilde{β}}_{QEV}$	0.696	0.294	0.272	0.941	–	- 0.191	0.170	0.155	0.931	–	0.477	0.296	0.270	0.929	–	- 0.263	0.172	0.156	0.928	–
	${\hat{β}}_{I}$	0.693	0.229	0.222	0.940	–	- 0.194	0.121	0.123	0.948	–	0.495	0.226	0.222	0.943	–	- 0.257	0.127	0.124	0.946	–
	${\hat{β}}_{QE}$	0.681	0.139	0.130	0.934	2.69	- 0.200	0.073	0.069	0.930	2.75	0.493	0.139	0.128	0.915	2.62	- 0.263	0.070	0.069	0.949	3.31
0.5	${\tilde{β}}_{QEV}$	0.694	0.301	0.278	0.934	–	- 0.190	0.172	0.158	0.926	–	0.484	0.302	0.277	0.929	–	- 0.263	0.176	0.160	0.925	–
	${\hat{β}}_{I}$	0.693	0.229	0.222	0.941	–	- 0.194	0.121	0.123	0.948	–	0.498	0.226	0.222	0.936	–	- 0.258	0.127	0.124	0.946	–
	${\hat{β}}_{QE}$	0.682	0.170	0.158	0.937	1.81	- 0.198	0.090	0.085	0.932	1.82	0.495	0.170	0.157	0.930	1.76	- 0.263	0.087	0.085	0.947	2.13
2	${\tilde{β}}_{QEV}$	0.688	0.315	0.291	0.938	–	- 0.189	0.179	0.166	0.932	–	0.482	0.315	0.291	0.921	–	- 0.267	0.184	0.168	0.926	–
	${\hat{β}}_{I}$	0.693	0.229	0.222	0.941	–	- 0.194	0.121	0.123	0.948	–	0.502	0.225	0.222	0.950	–	- 0.262	0.123	0.124	0.961	–
	${\hat{β}}_{QE}$	0.677	0.223	0.206	0.932	1.05	- 0.195	0.117	0.113	0.937	1.08	0.490	0.220	0.206	0.938	1.05	- 0.266	0.117	0.114	0.943	1.10
		$n = 300$ , censoring rate 40%
0.25	${\tilde{β}}_{QEV}$	0.687	0.368	0.342	0.933	–	- 0.202	0.203	0.195	0.946	–	0.449	0.375	0.349	0.919	–	- 0.270	0.220	0.202	0.934	–
	${\hat{β}}_{I}$	0.702	0.273	0.269	0.953	–	- 0.195	0.142	0.148	0.956	–	0.490	0.265	0.274	0.953	–	- 0.260	0.155	0.153	0.950	–
	${\hat{β}}_{QE}$	0.684	0.213	0.203	0.945	1.64	- 0.204	0.113	0.110	0.937	1.59	0.477	0.210	0.206	0.936	1.58	- 0.266	0.119	0.114	0.931	1.71
0.5	${\tilde{β}}_{QEV}$	0.684	0.378	0.348	0.931	–	- 0.199	0.209	0.199	0.940	–	0.461	0.383	0.356	0.927	–	- 0.272	0.219	0.206	0.939	–
	${\hat{β}}_{I}$	0.702	0.273	0.269	0.953	–	- 0.195	0.142	0.148	0.956	–	0.498	0.271	0.274	0.948	–	- 0.260	0.155	0.153	0.948	–
	${\hat{β}}_{QE}$	0.684	0.237	0.226	0.941	1.32	- 0.199	0.126	0.124	0.940	1.28	0.483	0.238	0.231	0.947	1.30	- 0.266	0.130	0.128	0.947	1.44
2	${\tilde{β}}_{QEV}$	0.680	0.387	0.357	0.933	–	- 0.200	0.214	0.205	0.935	–	0.481	0.411	0.366	0.913	–	- 0.276	0.225	0.212	0.932	–
	${\hat{β}}_{I}$	0.702	0.273	0.269	0.953	–	- 0.195	0.142	0.148	0.956	–	0.508	0.282	0.275	0.939	–	- 0.260	0.152	0.153	0.950	–
	${\hat{β}}_{QE}$	0.683	0.271	0.259	0.942	1.01	- 0.200	0.142	0.142	0.950	1.01	0.491	0.282	0.264	0.932	1.00	- 0.268	0.148	0.147	0.946	1.05
		$n = 700$ , censoring rate 80%
0.25	${\tilde{β}}_{QEV}$	0.672	0.404	0.397	0.948	–	- 0.213	0.241	0.229	0.947	–	0.477	0.471	0.418	0.912	–	- 0.283	0.259	0.243	0.938	–
	${\hat{β}}_{I}$	0.701	0.311	0.299	0.940	–	- 0.200	0.164	0.166	0.955	–	0.506	0.323	0.314	0.942	–	- 0.262	0.169	0.176	0.958	–
	${\hat{β}}_{QE}$	0.686	0.283	0.272	0.936	1.20	- 0.206	0.150	0.151	0.948	1.18	0.494	0.305	0.286	0.938	1.12	- 0.270	0.162	0.160	0.948	1.09
0.5	${\tilde{β}}_{QEV}$	0.670	0.412	0.401	0.938	–	- 0.213	0.245	0.232	0.952	–	0.473	0.476	0.422	0.922	–	- 0.283	0.267	0.246	0.934	–
	${\hat{β}}_{I}$	0.701	0.311	0.299	0.940	–	- 0.200	0.164	0.166	0.955	–	0.510	0.328	0.314	0.943	–	- 0.267	0.172	0.176	0.957	–
	${\hat{β}}_{QE}$	0.683	0.299	0.286	0.942	1.08	- 0.207	0.160	0.159	0.950	1.04	0.494	0.327	0.301	0.931	1.01	- 0.276	0.172	0.168	0.947	1.00
2	${\tilde{β}}_{QEV}$	0.672	0.420	0.405	0.933	–	- 0.215	0.247	0.234	0.947	–	0.474	0.458	0.425	0.935	–	- 0.278	0.274	0.248	0.934	–
	${\hat{β}}_{I}$	0.701	0.311	0.299	0.940	–	- 0.200	0.164	0.166	0.955	–	0.521	0.316	0.314	0.948	–	- 0.263	0.183	0.176	0.948	–
	${\hat{β}}_{QE}$	0.687	0.313	0.295	0.930	0.99	- 0.207	0.165	0.165	0.947	0.99	0.499	0.324	0.310	0.941	0.96	- 0.271	0.187	0.174	0.940	0.96


The cluster size $K = 2$ , validation proportion $ρ_{k} = 0.5$ , association parameter $σ = 0.1$

Performance of inference method

We also conduct simulation studies to assess the performance of the proposed chi-squared test. The data are generated from the same model as in Simulation Study (1) with $β = {(β^{(1)}, β^{(2)})}^{T} = {(0.693, - 0.2)}^{T}, n = 200, K = 4, θ = 0.25, ρ_{k} = 0.5, σ = 0.1$ , and a censoring rate of 10%. First, we test $H_{0} : β^{(1)} = 0.693$ versus $H_{1} : β^{(1)} \neq 0.693$ . Since the dimension of $β^{(1)}$ is 1, the test statistic T in (4.1) asymptotically follows $χ_{1}^{2}$ , where ${\tilde{β}}^{(2)}$ is calculated by minimizing ${\hat{Q}}_{n} (0.693, β^{(2)})$ with exchangeable or AR(1) working correlation. Figure 1 shows the Q–Q plots based on 1000 replications; the plots indicate proximity to the $χ_{1}^{2}$ distribution for both exchangeable and AR(1) working correlation. We also examine the power of the proposed test under $H_{1} : β^{(1)} = β_{*}^{(1)}$ . The powers at significance level $α = 0.05$ are calculated for different values of $β_{*}^{(1)}$ in [0.3, 0.693]. When $β_{*}^{(1)} = 0.693$ , i.e., the alternative hypothesis collapses into the null hypothesis, the rejection rates are 0.051 and 0.061 for exchangeable and AR(1) working correlation, respectively, which shows that the proposed chi-squared test approximately attains the nominal level. Figure 2 plots the power functions of the chi-squared test for the estimated QIF method and the QIF method with two different working correlations, and for the EPPL method. We observe that the power functions decrease rapidly as $β_{*}^{(1)}$ gets closer to the true value (0.693), and that the power function for exchangeable working correlation is always larger than that for AR(1) working correlation; thus the test with the correct working correlation is more powerful than the one with the misspecified working correlation. Nonetheless, the powers of the chi-squared test for either of the estimated QIF methods are larger than those of the other two methods. In addition, the power for the EPPL method, which utilizes the auxiliary information, is larger than those for the QIF method based on the validation set only. Similarly, we also consider testing $H_{0} : β^{(2)} = - 0.2$ versus $H_{1} : β^{(2)} \neq - 0.2$ , and compute the powers under $H_{1} : β^{(2)} = β_{*}^{(2)}$ when $β_{*}^{(2)}$ varies in $[- 0.2, 0]$ . Similar results are obtained but are not presented here because of space limitations.
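The mechanics of such a size and power study can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that the replicated test statistics are already available as a list of non-negative numbers, and all function names are hypothetical. It relies on the fact that a $χ_{1}^{2}$ variable is the square of a standard normal variable, so only the standard library is needed.

```python
import math
from statistics import NormalDist


def chi2_1_sf(t: float) -> float:
    """Upper-tail probability P(X > t) for X ~ chi-squared with 1 df."""
    return math.erfc(math.sqrt(t / 2.0))


def chi2_1_quantile(p: float) -> float:
    """p-th quantile of chi-squared(1): square of a standard normal quantile."""
    return NormalDist().inv_cdf((1.0 + p) / 2.0) ** 2


def rejection_rate(stats, alpha: float = 0.05) -> float:
    """Fraction of replications whose p-value falls below alpha."""
    stats = list(stats)
    return sum(chi2_1_sf(t) < alpha for t in stats) / len(stats)


def qq_points(stats):
    """Pairs (theoretical quantile, ordered statistic) for a Q-Q plot."""
    ordered = sorted(stats)
    n = len(ordered)
    return [(chi2_1_quantile((i + 0.5) / n), t) for i, t in enumerate(ordered)]
```

Applied to statistics simulated under $H_{0}$ , `rejection_rate` estimates the size of the test; applied to statistics simulated under a fixed alternative, it estimates the power at that alternative.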

Fig. 1 — Q–Q plot for the test statistic versus $χ_{1}^{2}$ under $H_{0}$ for 1000 replications

Fig. 2 — Power functions of the chi-squared test for the proposed method, the EPPL method and the QIF method under $H_{0} : β^{(1)} = 0.693, H_{1} : β^{(1)} = β_{*}^{(1)}$

Analysis of SOLVD data

We apply the proposed method to data from the Studies of Left Ventricular Dysfunction (SOLVD 1991) in this section. The SOLVD study was a randomized, double-masked, placebo-controlled trial conducted between 1986 and 1991. The trial had a three-year recruitment period and a two-year follow-up. The basic inclusion criteria for the prevention trial were: age between 21 and 80 years, inclusive, no overt symptoms of congestive heart failure, and left ventricular EF less than 35%. EF is a number between 0 and 100 that measures the efficiency of the heart in ejecting blood. A total of 4228 patients with asymptomatic left ventricular dysfunction were randomly assigned to receive either enalapril or placebo at one of the 83 hospitals linked to 23 centers in the United States, Canada, and Belgium. Liu et al. (2009) and Liu et al. (2010) analyzed these data without considering the intra-cluster correlation.

The primary clinical issues of interest are the effects of covariates on the risk of heart failure and on the first nonfatal myocardial infarction (MI) after adjusting for the confounding variables. The covariates of interest are ejection fraction, patient’s gender (SEX), which is coded 1 for male and 0 for female, treatment (TRT), which is coded 1 for enalapril and 0 for placebo, and patient’s age (AGE), which is measured in years. In the SOLVD study, the covariates SEX, TRT and AGE were recorded for almost all of the patients, but only 108 among the total of 4228 patients have their ejection fraction accurately measured using a standardized radionucleotide technique (LVEF). A related nonstandardized measure (EF) was ascertained for all the patients. Therefore, the nonstandardized measure (EF) is a surrogate measure for the standardized measure (LVEF) in this case.

In terms of the notation in the previous sections, we set $X_{ik} = \mathrm{LVEF}_{ik}, A_{ik} = \mathrm{EF}_{ik}, Z_{ik} = {(\mathrm{TRT}_{ik}, \mathrm{SEX}_{ik}, \mathrm{AGE}_{ik})}^{T}$ , where k denotes the failure type, with $k = 1$ for heart failure and $k = 2$ for nonfatal MI, and i denotes the patient, with $i = 1, \dots, 4228$ . Letting $β_{k} = {(β_{1 k}, β_{2 k}^{T})}^{T}$ be the unknown regression coefficients, we fit the following marginal hazards model to the SOLVD data:

\begin{matrix} λ_{ik} (t ; X_{ik}, Z_{ik}) = λ_{0 k} (t) exp (β_{1 k} X_{ik} + β_{2 k}^{T} Z_{ik}) . \end{matrix}

Since the primary covariate LVEF is continuous and severely incomplete, we need to estimate the induced relative risk function using the validation set. Furthermore, Liu et al. (2009) found that, given EF, the LVEF is conditionally independent of the rest of the covariates, thus $ψ_{ik} (t ; β)$ can be estimated through (2.8) with $A_{ik}^{*} = A_{ik}$ .

Table 5 presents the data analysis results for two methods: the proposed estimated QIF method and the EPPL method, which ignores the intra-cluster correlation between the failure times. The parameter estimates from the two methods are close, but the proposed estimated QIF method has smaller standard errors. To test whether the covariates have significant effects on the times to heart failure and nonfatal MI, we calculate p-values from both the two-sided Z-test and the chi-squared test for the two methods. The results indicate that, at the 0.05 significance level, all of the covariates are statistically significant for heart failure under the proposed method, while SEX is not significant under the EPPL method. Under both methods, only TRT is significant for the risk of nonfatal MI. By the proposed method, the risk of heart failure decreases by 3.92% (95% CI [1.83%, 5.97%]) with each 1% increase in LVEF, the risk increases by 2.63% (95% CI [1.43%, 3.85%]) per one-year increase in age, males have about 29.46% (95% CI [7.19%, 46.39%]) lower risk of heart failure than females, and enalapril reduces the risk by 33.77% (95% CI [20.52%, 44.80%]).
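The percentage changes quoted above can be traced back to the coefficients and standard errors in Table 5. The sketch below is our own illustration, not the authors' code: it assumes that the percentage change in hazard is $100 (e^{β} - 1)$ and that the intervals are Wald intervals $β \pm 1.96 \times SE$ mapped through the same transformation. Under these assumptions the output appears to agree with the quoted figures up to rounding. The function name is hypothetical.

```python
import math


def percent_change_in_hazard(coef: float, se: float, z: float = 1.96):
    """Return (estimate, lower, upper) of 100 * (exp(coef) - 1), in percent."""
    point = 100.0 * (math.exp(coef) - 1.0)
    lower = 100.0 * (math.exp(coef - z * se) - 1.0)
    upper = 100.0 * (math.exp(coef + z * se) - 1.0)
    return point, lower, upper


# (Coef, SE) for heart failure under the proposed method, copied from Table 5.
heart_failure = {
    "LVEF": (-0.040, 0.011),
    "TRT": (-0.412, 0.093),
    "SEX": (-0.349, 0.140),
    "AGE": (0.026, 0.006),
}
for name, (coef, se) in heart_failure.items():
    point, lower, upper = percent_change_in_hazard(coef, se)
    print(f"{name}: {point:+.2f}% (95% CI [{lower:+.2f}%, {upper:+.2f}%])")
```

A negative value is a reduction in risk, so the text reports its magnitude with the interval endpoints in increasing order.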

Table 5.

SOLVD data analysis results

Covariate	Proposed method				EPPL method
Covariate	Coef	SE	P value (Z-test)	P value ( $χ^{2}$ -test)	Coef	SE	P value (Z-test)	P value ( $χ^{2}$ -test)
For heart failure
LVEF	- 0.040	0.011	< 0.001	< 0.001	- 0.045	0.012	< 0.001	< 0.001
TRT	- 0.412	0.093	< 0.001	< 0.001	- 0.454	0.106	< 0.001	< 0.001
SEX	- 0.349	0.140	0.013	< 0.001	- 0.318	0.165	0.054	0.072
AGE	0.026	0.006	< 0.001	< 0.001	0.023	0.012	0.045	0.009
For nonfatal MI
LVEF	0.006	0.011	0.566	0.534	0.023	0.015	0.111	0.102
TRT	- 0.433	0.116	< 0.001	< 0.001	- 0.391	0.131	0.003	0.002
SEX	0.084	0.196	0.669	0.294	0.048	0.214	0.822	0.815
AGE	0.006	0.006	0.275	0.715	0.004	0.008	0.651	0.597


Proposed method refers to the estimated QIF method with exchangeable working correlation. EPPL method refers to the method that uses the independent structure

To illustrate the prediction of the survival probability for a subject, Fig. 3 shows the estimated survival curves of heart failure and nonfatal MI for a 69-year-old male patient with LVEF of 28% (the median of LVEF), receiving enalapril. The survival curves by the two methods are very close, but the pointwise confidence intervals from the proposed method are narrower than those from the EPPL method.
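For readers who want to see how such a prediction is assembled, the sketch below evaluates the linear predictor for the patient profile described above and converts it into a survival probability through the usual proportional hazards relation $S (t \mid x) = \exp \{- Λ_{0 k} (t) e^{β_{k}^{T} x}\}$ . It rests on several assumptions of ours: the covariates enter the model uncentered, the rounded heart-failure coefficients from Table 5 are used, and the cumulative baseline hazard value is made up for illustration because no baseline estimate is reported here. It therefore does not reproduce Fig. 3.

```python
import math

# Heart-failure coefficients of the proposed method, copied from Table 5.
coef = {"LVEF": -0.040, "TRT": -0.412, "SEX": -0.349, "AGE": 0.026}
# Covariate profile from the text: 69-year-old male, LVEF 28%, on enalapril.
patient = {"LVEF": 28.0, "TRT": 1.0, "SEX": 1.0, "AGE": 69.0}


def linear_predictor(coef, covariates):
    """Inner product of the coefficients and the covariate values."""
    return sum(coef[name] * covariates[name] for name in coef)


def survival_probability(cum_baseline_hazard, lp):
    """S(t | x) = exp(-Lambda_0(t) * exp(lp)) under proportional hazards."""
    return math.exp(-cum_baseline_hazard * math.exp(lp))


lp = linear_predictor(coef, patient)
hypothetical_Lambda0 = 0.2  # made-up value, NOT an estimate from the SOLVD data
print(lp, survival_probability(hypothetical_Lambda0, lp))
```

Evaluating `survival_probability` over a grid of estimated cumulative baseline hazards, one value per time point, would trace out a survival curve.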

Concluding remarks

We proposed an estimated QIF approach for multivariate failure time data when the primary covariate is ascertained on a subset of the full cohort but auxiliary information is available on the full cohort. In our proposed approach, we allowed the censoring times for different failures of a subject to be different. It is worth noting that in practice the censoring times for different failures of a subject are usually the same. This can be treated as a special case of our general setup, and the proposed method remains applicable.

In this article, we consider the situation in which the auxiliary variable is continuous. The method is based on the kernel smoothing technique and is therefore nonparametric with respect to the association between the missing covariate and the corresponding auxiliary variable. The QIF method has the advantage of incorporating the intra-cluster correlation in the estimation procedure. Compared with other existing methods (e.g., Liu et al. 2010) in which the intra-cluster correlation is not considered, the proposed procedure can improve the estimation efficiency without requiring the specification of the correlation formula. Another advantage of the proposed method is that it is easy to implement. In this work, we consider the situation in which the dimension of the continuous auxiliary covariates is low; further research is needed when the dimension is high.

Acknowledgements

We are grateful to the associate editor and the reviewers for their detailed and constructive comments, which led to improvements in the paper. This research is supported in part by the National Key Research and Development Project of China No. 2018YFC1314603 (Liu), the National Science Foundation of China (NSFC) Grant 11771366 (Yan, Liu), and the U.S. National Institutes of Health Grants P01 CA142538 (Cai, Zhou), P42ES031007 (Zhou) and P30 ES010126 (Zhou).

Appendix

For $i = 1, \dots, n$ , let $Φ_{i} (t ; β, B) = Ξ_{i}^{1 / 2} (t ; β) B Ξ_{i}^{- 1 / 2} (t ; β)$ , and $Φ_{i} (t ; β, B) = {\{ϕ_{imk}\}}_{m, k = 1}^{K}$ . For $k = 1, \dots, K$ , some quantities are defined as follows:

\begin{matrix} S_{k}^{d} (t ; β) = n^{- 1} \sum_{l = 1}^{n} Y_{lk} (t) r_{lk}^{(d)} (t ; β), (d = 0, 1), \\ S_{k}^{d} (t ; β, B) = n^{- 1} \sum_{l = 1}^{n} \sum_{m = 1}^{K} Y_{lk} (t) R_{lm} (t ; β) ϕ_{lmk} (t ; β, B) {(r_{lk}^{(d - 2)} (t ; β))}^{T}, (d = 2, 3), \\ S_{k}^{4} (t ; β, B) = n^{- 1} \sum_{l = 1}^{n} \sum_{m = 1}^{K} Y_{lk} (t) R_{lm} (t ; β) {\partial ϕ_{lmk} (t ; β, B) / \partial β^{T}} r_{lk} (t ; β), \\ S_{k}^{5} (t ; β, B) = n^{- 1} \sum_{l = 1}^{n} \sum_{m = 1}^{K} Y_{lk} (t) {\partial R_{lm} (t ; β) / \partial β^{T}} ϕ_{lmk} (t ; β, B) r_{lk} (t ; β) . \end{matrix}

For $d = 0, \dots, 5$ , define ${\hat{S}}_{k}^{d}$ by substituting ${\hat{r}}_{lk} (t ; β), {\hat{R}}_{lm} (t ; β)$ and ${\hat{ϕ}}_{lmk} (t ; β, B) = {\hat{λ}}_{lm}^{1 / 2} (t ; β) B_{mk} {\hat{λ}}_{lk}^{- 1 / 2} (t ; β)$ for $r_{lk} (t ; β)$ , $R_{lm} (t ; β)$ and $ϕ_{lmk} (t ; β, B)$ in $S_{k}^{d}$ , respectively. Furthermore, we also define

\begin{matrix} E_{k} (t ; β, B) = \frac{S_{k}^{2} (t ; β, B)}{S_{k}^{0} (t ; β)}, V_{k} (t ; β, B) = \frac{S_{k}^{3} (t ; β, B)}{S_{k}^{0} (t ; β)} - \frac{S_{k}^{2} (t ; β, B) {(S_{k}^{1} (t ; β))}^{T}}{{(S_{k}^{0} (t ; β))}^{2}} . \end{matrix}

We introduce some conditions to ensure the consistency and asymptotic normality of ${\hat{β}}_{Q}$ and ${\hat{β}}_{I}$ as follows:

(C1) $Λ_{0 k} (τ) = \int_{0}^{τ} λ_{0 k} (t) d t < \infty$ , for $k = 1, \dots, K$ .
(C2) $P \{Y_{ik} (t) = 1\} > 0$ , for all $t \in [0, τ], i = 1, \dots, n$ .
(C3) There exists a compact set $B$ , containing $β_{0}$ as its interior point.
(C4) Multivariate kernel function $Ψ (\cdot)$ is non-negative and uniformly bounded with finite support satisfying that $\int Ψ (u) d u = 1$ and $\int Ψ^{2} (u) d u < \infty$ . Furthermore, $Ψ (\cdot)$ has order $α_{0}$ in the sense that $α_{0} \equiv \inf \{| α | > b ; \int_{R^{b}} u^{α} Ψ (u) d u \neq 0\}$ , where b is the dimension of $u, u = (u_{1}, \dots, u_{b})$ , $α = (α_{1}, \dots, α_{b})$ with $α_{i}$ s being non-negative integers, $u^{α} = u_{1}^{α_{1}} \dots u_{b}^{α_{b}}$ .

For $k = 1, \dots, K$ , the bandwidth matrix $μ_{k}$ satisfies that $\sqrt{n} {‖ μ_{k} ‖}^{α_{0}} \to 0$ and $log n / (\sqrt{n} ‖ μ_{k} ‖^{b}) \to 0$ .
(C5) Let $α_{0}$ be as in (C4), $h_{k} (v, s) = \frac{\partial^{2} H_{k} (1, v, s)}{\partial v \partial s}$ has the $α_{0}$ th continuous derivative with respect to $v$ , where $H_{k} (u, v, s)$ is the joint distribution function of $(η_{k} Y_{k} (t), A_{k}^{*} (t), X_{k} (t))$ for given t.
(C6) There exist scalar, vector and matrix functions $s_{k}^{d} (t ; β) (d = 0, 1), s_{k}^{d} (t ; β, B)$ $(d = 2, \dots, 5)$ , such that for all $k = 1, \dots, K$ and all constant matrix B,
$\begin{matrix} \sup_{t \in [0, τ], β \in B} ‖ S_{k}^{d} - s_{k}^{d} ‖ \to 0, \end{matrix}$
in probability.
(C7) Let $s_{k}^{d} (t ; β) (d = 0, 1), s_{k}^{d} (t ; β, B) (d = 2, \dots, 5)$ be as in Condition (C6), and set $e_{k} (t ; β, B) = s_{k}^{2} (t ; β, B) / s_{k}^{0} (t ; β), v_{k} (t ; β, B) = s_{k}^{3} (t ; β, B) / s_{k}^{0} (t ; β) - s_{k}^{2} (t ; β, B) {(s_{k}^{1} (t ; β))}^{T} / {(s_{k}^{0} (t ; β))}^{2}$ . Then for all $β \in B$ , $t \in [0, τ]$ , and $k = 1, \dots, K$ ,
$\begin{matrix} s_{k}^{1} (t ; β) = & \partial s_{k}^{0} (t ; β) / \partial β, \\ s_{k}^{3} (t ; β, B) = & \partial s_{k}^{2} (t ; β, B) / \partial β - s_{k}^{4} (t ; β, B) - s_{k}^{5} (t ; β, B), \end{matrix}$
and $s_{k}^{0} (t ; β)$ is bounded away from 0 on $[0, τ] \times B$ . For all basis matrix B, the matrix
$\begin{matrix} Γ (β_{0}, B) = - \sum_{k = 1}^{K} \int_{0}^{τ} v_{k} (t ; β_{0}, B) s_{k}^{0} (t ; β_{0}) λ_{0 k} (t) d t \end{matrix}$ (A.1)
is negative definite.
(C8) There exists a matrix function $ω (\cdot, \cdot, \cdot)$ , such that for any $K \times K$ constant matrices $B_{1}, B_{2}$ ,
$\begin{matrix} n^{- 1} \sum_{i = 1}^{n} \sum_{k = 1}^{K} \int_{0}^{τ} ζ_{ik} (t ; β, B_{1}) ζ_{ik}^{T} (t ; β, B_{2}) s_{k}^{0} (t ; β) λ_{0 k} (t) d t \to ω (β, B_{1}, B_{2}), \end{matrix}$
in probability uniformly for $β \in B$ , where
$\begin{matrix} ζ_{ik} (t ; β, B) = \sum_{m = 1}^{K} R_{im} (t ; β) ϕ_{imk} (t ; β, B) - e_{k} (t ; β, B) . \end{matrix}$
Furthermore, for any set of basis matrices $\{B_{j}, j = 1, \dots, L\}$ , the matrix
$\begin{matrix} W_{0} = W (β_{0}) = {\{ω (β_{0}, B_{j}, B_{j^{\prime}})\}}_{j, j^{\prime} = 1}^{L} \end{matrix}$ (A.2)
is positive definite.
(C9) The baseline hazard rates $\{λ_{0 k} (\cdot), k = 1, \dots, K\}$ are twice continuously differentiable on $[0, τ]$ .
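To make the bandwidth requirement concrete, consider the simplified case, assumed here purely for illustration, of a scalar bandwidth $μ_{k} = n^{- a}$ applied in each of the b coordinates, so that $‖ μ_{k} ‖ = n^{- a}$ . Then $\sqrt{n} {‖ μ_{k} ‖}^{α_{0}} \to 0$ requires $a > 1 / (2 α_{0})$ and $log n / (\sqrt{n} ‖ μ_{k} ‖^{b}) \to 0$ requires $a < 1 / (2 b)$ , so admissible exponents exist only when the kernel order $α_{0}$ exceeds the dimension b. The Python sketch below, with hypothetical function names, encodes this arithmetic.

```python
import math


def bandwidth_exponent_range(alpha0: int, b: int):
    """Open interval of a for which mu = n**(-a) meets both rate conditions.

    sqrt(n) * mu**alpha0 -> 0        requires a > 1 / (2 * alpha0)
    log(n) / (sqrt(n) * mu**b) -> 0  requires a < 1 / (2 * b)
    Returns None when the interval is empty (alpha0 <= b).
    """
    lower, upper = 1.0 / (2 * alpha0), 1.0 / (2 * b)
    return (lower, upper) if lower < upper else None


def rate_terms(n: int, a: float, alpha0: int, b: int):
    """Evaluate the two quantities that must vanish as n grows."""
    mu = n ** (-a)
    return math.sqrt(n) * mu ** alpha0, math.log(n) / (math.sqrt(n) * mu ** b)
```

For example, with a second-order kernel and a one-dimensional auxiliary variable ( $α_{0} = 2$ , $b = 1$ ), any exponent strictly between 1/4 and 1/2 is admissible under this simplification, such as $a = 1 / 3$ .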

Proof of Theorem 1

Consistency

Following the arguments of Xue et al. (2010), one can show that ${\hat{β}}_{Q}$ is consistent for $β_{0}$ provided that:

(i) $\partial {\hat{G}}_{n} (β) / \partial β^{T}$ exists and is continuous, and it converges in probability to a fixed function, say $J (β)$ , uniformly for $β \in B$ ;
(ii) ${\hat{G}}_{n} (β_{0}) \to 0$ in probability;
(iii) $n {\hat{W}}_{n} (β) = \frac{1}{n} \sum_{i = 1}^{n} {\hat{g}}_{i} (β) {\hat{g}}_{i}^{T} (β)$ converges in probability to a constant matrix $W (β)$ uniformly for $β \in B$ ;
(iv) $n {\hat{W}}_{n} (β_{0})$ is positive definite with probability going to 1 as $n \to \infty$ .

According to the first conclusion of Lemma 1 in Liu et al. (2010), under conditions (C2), (C4) and (C5), for $a = 0, 1, 2$ , we can prove that in probability

\begin{matrix} \sup_{t \in [0, τ], β \in B} ‖ {\hat{ψ}}_{ik}^{(a)} (t ; β) - ψ_{ik}^{(a)} (t ; β) ‖ \to 0 . \end{matrix}

Since the kernel smoothed estimator ${\hat{λ}}_{0 k} (t)$ in (2.7) is consistent for $λ_{0 k} (t)$ , it follows from the definitions of ${\hat{S}}_{k}^{d}$ and $S_{k}^{d}$ and the remaining conditions that

\begin{matrix} \sup_{t \in [0, τ], β \in B} ‖ {\hat{S}}_{k}^{d} - S_{k}^{d} ‖ \to 0, \end{matrix}

in probability for $d = 0, 1, \dots, 5$ .

Denote

\begin{matrix} \hat{U} (β, B) = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{τ} {\hat{R}}_{i}^{T} (t ; β) {\hat{Ξ}}_{i}^{1 / 2} (t ; β) B {\hat{Ξ}}_{i}^{- 1 / 2} (t ; β) {\hat{M}}_{i} (d t ; β), \end{matrix}

then ${\hat{G}}_{n} (β) = {({\hat{U}}^{T} (β, B_{1}), \dots, {\hat{U}}^{T} (β, B_{L}))}^{T}$ .

After simple algebraic manipulations, we obtain that

\begin{matrix} \hat{U} (β, B) & = \frac{1}{n} \sum_{i = 1}^{n} \sum_{k = 1}^{K} \int_{0}^{τ} {\hat{ζ}}_{ik} (t ; β, B) d N_{ik} (t) \\ = \frac{1}{n} \sum_{i = 1}^{n} \sum_{k = 1}^{K} \int_{0}^{τ} {\hat{ζ}}_{ik} (t ; β, B) d M_{ik} (t) \\ + \frac{1}{n} \sum_{i = 1}^{n} \sum_{k = 1}^{K} \int_{0}^{τ} {\hat{ζ}}_{ik} (t ; β, B) Y_{ik} (t) r_{ik} (t ; β) λ_{0 k} (t) d t, \end{matrix}

where ${\hat{ζ}}_{ik} (t ; β, B) = \sum_{m = 1}^{K} {\hat{R}}_{im} (t ; β) {\hat{ϕ}}_{imk} (t ; β, B) - {\hat{S}}_{k}^{2} (t ; β, B) / {\hat{S}}_{k}^{0} (t ; β)$ , and

\begin{matrix} \frac{\partial \hat{U} (β, B)}{\partial β^{T}} & = \frac{1}{n} \sum_{i = 1}^{n} \sum_{k = 1}^{K} \int_{0}^{τ} \frac{\partial {\hat{ζ}}_{ik} (t ; β, B)}{\partial β^{T}} d M_{ik} (t) \\ + \frac{1}{n} \sum_{i = 1}^{n} \sum_{k = 1}^{K} \int_{0}^{τ} \frac{\partial {\hat{ζ}}_{ik} (t ; β, B)}{\partial β^{T}} Y_{ik} (t) r_{ik} (t ; β) λ_{0 k} (t) d t, \end{matrix} (7.3)

with

\begin{matrix} \frac{\partial {\hat{ζ}}_{ik} (t ; β, B)}{\partial β^{T}} & = \sum_{m = 1}^{K} \frac{\partial {\hat{R}}_{im} (t ; β) {\hat{ϕ}}_{imk} (t ; β, B)}{\partial β^{T}} + \frac{{\hat{S}}_{k}^{2} (t ; β, B) {({\hat{S}}_{k}^{1} (t ; β))}^{T}}{{({\hat{S}}_{k}^{0} (t ; β))}^{2}} \\ - \frac{{\hat{S}}_{k}^{3} (t ; β, B) + {\hat{S}}_{k}^{4} (t ; β, B) + {\hat{S}}_{k}^{5} (t ; β, B)}{{\hat{S}}_{k}^{0} (t ; β)} . \end{matrix}
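The last two terms of this display come from differentiating the ratio ${\hat{S}}_{k}^{2} / {\hat{S}}_{k}^{0}$ in the definition of ${\hat{ζ}}_{ik}$ . As a sketch, assume the sample-level identities $\partial {\hat{S}}_{k}^{0} / \partial β^{T} = {({\hat{S}}_{k}^{1})}^{T}$ and $\partial {\hat{S}}_{k}^{2} / \partial β^{T} = {\hat{S}}_{k}^{3} + {\hat{S}}_{k}^{4} + {\hat{S}}_{k}^{5}$ , which mirror the identities imposed on $s_{k}^{1}$ and $s_{k}^{3}$ in the regularity conditions and which follow from the product rule if ${\hat{r}}_{lk}^{(1)}$ denotes the derivative of ${\hat{r}}_{lk}$ with respect to $β$ (our reading of the notation). With the arguments $(t ; β, B)$ suppressed, the quotient rule then gives

\begin{matrix} \frac{\partial}{\partial β^{T}} \{\frac{{\hat{S}}_{k}^{2}}{{\hat{S}}_{k}^{0}}\} = \frac{\partial {\hat{S}}_{k}^{2} / \partial β^{T}}{{\hat{S}}_{k}^{0}} - \frac{{\hat{S}}_{k}^{2} {({\hat{S}}_{k}^{1})}^{T}}{{({\hat{S}}_{k}^{0})}^{2}} = \frac{{\hat{S}}_{k}^{3} + {\hat{S}}_{k}^{4} + {\hat{S}}_{k}^{5}}{{\hat{S}}_{k}^{0}} - \frac{{\hat{S}}_{k}^{2} {({\hat{S}}_{k}^{1})}^{T}}{{({\hat{S}}_{k}^{0})}^{2}}, \end{matrix}

and subtracting this quantity from the derivative of $\sum_{m = 1}^{K} {\hat{R}}_{im} {\hat{ϕ}}_{imk}$ yields the expression above.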

Clearly, $\partial \hat{U} (β, B) / \partial β^{T}$ is continuous. For any constant matrix B, the first term on the right-hand side of (7.3) is a local square integrable martingale; hence, by the Lenglart inequality, it converges to zero in probability, uniformly for $β \in B$ . Letting $Γ (β, B)$ denote the uniform limit of the second term, we can show that

\begin{matrix} \frac{\partial \hat{U} (β, B)}{\partial β^{T}} \to Γ (β, B) = - \sum_{k = 1}^{K} \int_{0}^{τ} v_{k} (t ; β, B) s_{k}^{0} (t ; β) λ_{0 k} (t) d t \end{matrix}

in probability, uniformly in $β \in B$ . Thus, $\partial {\hat{G}}_{n} (β) / \partial β^{T}$ converges to

\begin{matrix} J (β) = (\begin{matrix} Γ (β, B_{1}) \\ ⋮ \\ Γ (β, B_{L}) \end{matrix}) \end{matrix}

in probability, uniformly for $β \in B$ . Therefore, (i) is satisfied.

Denote

\begin{matrix} {\hat{U}}_{1} (β, B) = \frac{1}{n} \sum_{k = 1}^{K} \sum_{i = 1}^{n} \int_{0}^{τ} {\hat{ζ}}_{ik} (t ; β, B) d M_{ik} (t), \end{matrix}

and

\begin{matrix} {\hat{U}}_{2} (β, B) = \frac{1}{n} \sum_{k = 1}^{K} \sum_{i = 1}^{n} \int_{0}^{τ} {\hat{ζ}}_{ik} (t ; β, B) Y_{ik} (t) r_{ik} (t ; β) λ_{0 k} (t) d t . \end{matrix}

When $β = β_{0}$ , by the Lenglart inequality, we can show that

\begin{matrix} \sqrt{n} {\hat{U}}_{1} (β_{0}, B) = \frac{1}{\sqrt{n}} \sum_{k = 1}^{K} \sum_{i = 1}^{n} \int_{0}^{τ} ζ_{ik} (t ; β_{0}, B) d M_{ik} (t) + o_{p} (1), \end{matrix}

where $ζ_{ik} (t ; β, B) = \sum_{m = 1}^{K} R_{im} (t ; β) ϕ_{imk} (t ; β, B) - e_{k} (t ; β, B)$ . From the arguments in Zhou and Wang (2000), it can be shown that

\begin{matrix} \sqrt{n} {\hat{U}}_{2} (β_{0}, B) = - \frac{1}{\sqrt{n}} \sum_{k = 1}^{K} \sum_{i = 1}^{n} \frac{n - n_{k}}{n_{k}} F_{ik} (β_{0}) + o_{p} (1), \end{matrix}

where

\begin{matrix} F_{ik} (β) = \int_{0}^{τ} ζ_{ik} (t ; β, B) (r_{ik} (t ; β) - ψ_{ik} (t ; β)) Y_{ik} (t) λ_{0 k} (t) d t . \end{matrix}

Note that $F_{ik} (β) = 0$ if $i \in {\bar{V}}_{k}$ . Then we have

\begin{matrix} \sqrt{n} \hat{U} (β_{0}, B) = & \frac{1}{\sqrt{n}} \sum_{k = 1}^{K} \sum_{i = 1}^{n} \int_{0}^{τ} ζ_{ik} (t ; β_{0}, B) d M_{ik} (t) \\ & - \frac{1}{\sqrt{n}} \sum_{k = 1}^{K} \sum_{i = 1}^{n} \frac{n - n_{k}}{n_{k}} F_{ik} (β_{0}) + o_{p} (1), \end{matrix}

where $M_{ik} (t)$ is a mean-zero martingale and $E (F_{ik} (β_{0})) = 0$ . Then, by the strong law of large numbers, $\hat{U} (β_{0}, B)$ converges to zero with probability 1; thus (ii) is satisfied.

Denote

\begin{matrix} \hat{U} (β, B) = \sum_{i = 1}^{n} \sum_{k = 1}^{K} {\hat{U}}_{ik} (β, B), \end{matrix}

then we can obtain

\begin{matrix} {\hat{g}}_{i} (β) = {(n \sum_{k = 1}^{K} {\hat{U}}_{ik}^{T} (β, B_{1}), n \sum_{k = 1}^{K} {\hat{U}}_{ik}^{T} (β, B_{2}), \dots, n \sum_{k = 1}^{K} {\hat{U}}_{ik}^{T} (β, B_{L}))}^{T}, \end{matrix}

and

\begin{matrix} n {\hat{W}}_{n} (β) & = \frac{1}{n} \sum_{i = 1}^{n} {\hat{g}}_{i} (β) {\hat{g}}_{i}^{T} (β) \\ & = (\begin{matrix} O (β, B_{1}, B_{1}) & \dots & O (β, B_{1}, B_{L}) \\ ⋮ & ⋱ & ⋮ \\ O (β, B_{L}, B_{1}) & \dots & O (β, B_{L}, B_{L}) \end{matrix}), \end{matrix}

where

\begin{matrix} O (β, B_{j}, B_{j^{\prime}}) & = n \sum_{i = 1}^{n} \sum_{k = 1}^{K} \sum_{l = 1}^{K} {\hat{U}}_{ik} (β, B_{j}) {\hat{U}}_{il}^{T} (β, B_{j^{\prime}}) \\ & = \frac{1}{n} \sum_{i = 1}^{n} \sum_{k = 1}^{K} \sum_{l = 1}^{K} \{\int_{0}^{τ} ζ_{ik} (t ; β, B_{j}) d M_{ik} (t)\} {\{\int_{0}^{τ} ζ_{il} (t ; β, B_{j^{\prime}}) d M_{il} (t)\}}^{T} + o_{p} (1) . \end{matrix}

Similarly to Xue et al. (2010), by condition (C8), we can show that $O (β, B_{j}, B_{j^{\prime}})$ converges in probability to $ω (β, B_{j}, B_{j^{\prime}})$ , and that $n {\hat{W}}_{n} (β)$ converges in probability to $W (β) = {[ω (β, B_{j}, B_{j^{\prime}})]}_{j, j^{\prime} = 1}^{L}$ uniformly for $β \in B$ ; hence (iii) and (iv) are satisfied.
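The block structure of $n {\hat{W}}_{n} (β)$ can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the per-cluster vectors ${\hat{g}}_{i}$ are replaced by synthetic Gaussian draws, and the helper names `second_moment_matrix` and `block` are hypothetical. It only checks that the $(j, j^{\prime})$ block of $n^{- 1} \sum_{i} {\hat{g}}_{i} {\hat{g}}_{i}^{T}$ equals the average of the outer products of the corresponding sub-vectors.

```python
import numpy as np


def second_moment_matrix(g):
    """Return (1/n) * sum_i g_i g_i^T for an (n, L*p) array whose rows are g_i."""
    n = g.shape[0]
    return g.T @ g / n


def block(mat, j, jp, p):
    """Extract the p x p block in block-row j and block-column jp (0-indexed)."""
    return mat[j * p:(j + 1) * p, jp * p:(jp + 1) * p]


# Synthetic stand-in for the stacked scores g_i (assumption: real g_i come
# from the estimating functions U_ik evaluated at the L basis matrices).
rng = np.random.default_rng(0)
n, p, L = 200, 2, 3
g = rng.normal(size=(n, L * p))

nW = second_moment_matrix(g)

# Direct computation of one off-diagonal block from the sub-vectors of g_i.
j, jp = 0, 2
direct = sum(
    np.outer(g[i, j * p:(j + 1) * p], g[i, jp * p:(jp + 1) * p])
    for i in range(n)
) / n

print(np.allclose(block(nW, j, jp, p), direct))
```

Because the matrix is an average of outer products, it is symmetric and positive semidefinite by construction, which is what allows it to serve as a weight matrix once it is inverted.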

Asymptotic normality

Define $D_{i} (B) = \sum_{k = 1}^{K} \int_{0}^{τ} ζ_{ik} (t ; β_{0}, B) d M_{ik} (t)$ . From the preceding arguments, for any constant matrix B, $\sqrt{n} \hat{U} (β_{0}, B)$ is asymptotically equivalent to $n^{- 1 / 2} \sum_{i = 1}^{n} D_{i} (B)$ , which is a sum of independent p-vector random variables with mean zero and variance $var (D_{i} (B))$ . By condition (C8) and the multivariate central limit theorem, $\sqrt{n} \hat{U} (β_{0}, B)$ converges in distribution to a normal random vector with mean zero and covariance matrix $ω (β_{0}, B, B)$ , and $\sqrt{n} {\hat{G}}_{n} (β_{0})$ converges in distribution to a normal random vector, denoted by $X$ , with mean zero and covariance matrix $W_{0}$ as defined in (A.2).

By Taylor expansion of ${\hat{Q}}_{n}^{(1)} (β)$ around the true parameter $β_{0}$ , we have

\begin{matrix} \sqrt{n} ({\hat{β}}_{Q} - β_{0}) = - {[n^{- 1} {\hat{Q}}_{n}^{(2)} (β^{*})]}^{- 1} [n^{- 1 / 2} {\hat{Q}}_{n}^{(1)} (β_{0})], \end{matrix}

where $β^{*}$ is between $β_{0}$ and ${\hat{β}}_{Q}$ . From the argument in the proof of Theorem 1 and the consistency of ${\hat{β}}_{Q}$ , $n^{- 1} {\hat{Q}}_{n}^{(2)} (β^{*})$ converges in probability to $2 J_{0}^{T} W_{0}^{- 1} J_{0}$ and $n^{- 1 / 2} {\hat{Q}}_{n}^{(1)} (β_{0})$ converges in distribution to $2 J_{0}^{T} W_{0}^{- 1} X$ . Hence, we can obtain that in distribution

\begin{matrix} \sqrt{n} ({\hat{β}}_{Q} - β_{0}) \to N (0, {(J_{0}^{T} W_{0}^{- 1} J_{0})}^{- 1}) . \end{matrix}
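The passage from the two limits to the stated covariance can be spelled out as follows. This is a sketch under assumptions not restated in this section: ${\hat{β}}_{Q}$ is an interior minimizer, so that ${\hat{Q}}_{n}^{(1)} ({\hat{β}}_{Q}) = 0$ ; $W_{0}$ is symmetric and invertible; and $J_{0}$ has full column rank. The second line uses Slutsky's theorem.

```latex
\begin{aligned}
0 = n^{-1/2} \hat{Q}_{n}^{(1)} (\hat{β}_{Q})
  &= n^{-1/2} \hat{Q}_{n}^{(1)} (β_{0})
     + [n^{-1} \hat{Q}_{n}^{(2)} (β^{*})] \sqrt{n} (\hat{β}_{Q} - β_{0}), \\
\sqrt{n} (\hat{β}_{Q} - β_{0})
  &\to - (2 J_{0}^{T} W_{0}^{-1} J_{0})^{-1} (2 J_{0}^{T} W_{0}^{-1} X)
   = - (J_{0}^{T} W_{0}^{-1} J_{0})^{-1} J_{0}^{T} W_{0}^{-1} X, \\
\operatorname{var} \{ (J_{0}^{T} W_{0}^{-1} J_{0})^{-1} J_{0}^{T} W_{0}^{-1} X \}
  &= (J_{0}^{T} W_{0}^{-1} J_{0})^{-1} J_{0}^{T} W_{0}^{-1} W_{0} W_{0}^{-1} J_{0}
     (J_{0}^{T} W_{0}^{-1} J_{0})^{-1}
   = (J_{0}^{T} W_{0}^{-1} J_{0})^{-1} .
\end{aligned}
```

The factors of 2 cancel between the two limits, and the middle product $W_{0}^{- 1} W_{0} W_{0}^{- 1}$ collapses to $W_{0}^{- 1}$ because $var (X) = W_{0}$ , which is why no sandwich form remains in the limiting covariance.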

$□$


References

Breslow NE. Discussion of the paper by D. R. Cox. J R Stat Soc Ser B. 1972;34:216–217.
Cai JW, Prentice RL. Estimating equations for hazard ratio parameters based on correlated failure time data. Biometrika. 1995;82:151–164. doi: 10.1093/biomet/82.1.151.
Cai JW, Prentice RL. Regression estimation using multivariate failure time data and a common baseline hazard function model. Lifetime Data Anal. 1997;3:197–213. doi: 10.1023/A:1009613313677.
Cai JW, Shen Y. Permutation tests for comparing marginal survival functions with clustered failure time data. Stat Med. 2000;19:2963–2973. doi: 10.1002/1097-0258(20001115)19:21<2963::AID-SIM593>3.0.CO;2-H.
Clayton D, Cuzick J. Multivariate generalizations of the proportional hazard model. J R Stat Soc Ser A. 1985;148:82–117. doi: 10.2307/2981943.
Fan Z, Wang X. Marginal hazards model for multivariate failure time data with auxiliary covariates. J Nonparametr Stat. 2009;21:771–786. doi: 10.1080/10485250902915903.
Greene WF, Cai JW. Measurement error in covariates in the marginal hazards model for multivariate failure time data. Biometrics. 2004;60:987–996. doi: 10.1111/j.0006-341X.2004.00254.x.
Hu C, Lin DY. Semiparametric failure time regression with replicates of mismeasured covariates. J Am Stat Assoc. 2004;99:105–118. doi: 10.1198/016214504000000197.
Liu Y, Zhou H, Cai JW. Estimated pseudopartial-likelihood method for correlated failure time data with auxiliary covariates. Biometrics. 2009;65:1184–1193. doi: 10.1111/j.1541-0420.2009.01198.x.
Liu Y, Wu Y, Zhou H. Multivariate failure times regression with a continuous auxiliary covariate. J Multivar Anal. 2010;101:679–691. doi: 10.1016/j.jmva.2009.09.008.
Liu Y, Yuan Z, Cai JW, Zhou H. Marginal hazard regression for correlated failure time data with auxiliary covariates. Lifetime Data Anal. 2012;18:116–138. doi: 10.1007/s10985-011-9209-x.
Qu A, Lindsay BG, Li B. Improving generalised estimating equations using quadratic inference functions. Biometrika. 2000;87:823–836. doi: 10.1093/biomet/87.4.823.
SOLVD Investigators. Effect of enalapril on survival in patients with reduced left ventricular ejection fractions and congestive heart failure. N Engl J Med. 1991;325:293–302.
Wei LJ, Lin DY, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc. 1989;84:1065–1073. doi: 10.1080/01621459.1989.10478873.
Xue L, Wang L, Qu A. Incorporating correlation for multivariate failure time data when cluster size is large. Biometrics. 2010;66:393–404. doi: 10.1111/j.1541-0420.2009.01307.x.
Zhou H, Wang C-Y. Failure time regression with continuous covariates measured with error. J R Stat Soc Ser B. 2000;62:657–665. doi: 10.1111/1467-9868.00255.

