Abstract
This paper concerns statistical inference for multivariate failure time data when the primary covariate can be measured only on a subset of the full cohort but auxiliary information is available. To improve the efficiency of inference, we use the quadratic inference function approach to incorporate the intra-cluster correlation and a kernel smoothing technique to further utilize the auxiliary information. The proposed method is shown to be more efficient than methods that ignore the intra-cluster correlation and the auxiliary information, and it is easy to implement. In addition, we develop a chi-squared test for hypotheses about hazard ratio parameters. We evaluate the finite-sample performance of the proposed procedure via extensive simulation studies. The proposed approach is illustrated by the analysis of a real data set from the study of left ventricular dysfunction.
Keywords: Multivariate failure time data, Validation sample, Quadratic inference function, Chi-squared test
Introduction
This paper aims to develop an improved inference procedure for multivariate failure time data with auxiliary information. Large cohort studies often involve thousands of subjects or more, and such studies, especially those involving failure time outcomes, can last for many years. Often the primary covariate can be measured only for a random subset of the study cohort because of technical difficulties or financial limitations. On the other hand, auxiliary information that is less precise but highly correlated with the primary exposure can be collected cheaply for all cohort members. The auxiliary information could be a mismeasured surrogate for the true covariate, or any covariate that is informative about the true covariate. An example is the left ventricular dysfunction prevention study (SOLVD 1991), which aims to assess the effects of risk factors on the time (possibly censored) to heart failure and to the first myocardial infarction. One of the most important risk factors is the patient’s ejection fraction (EF), which can be measured precisely using a standardized radionucleotide technique, but at very high cost. Therefore, EF was measured precisely only on a randomly chosen subset of the cohort, while a less precise but cheaper measurement of EF was ascertained for all patients using a nonstandardized technique. Because each patient could experience both heart failure and a first myocardial infarction, statistical methods for handling multivariate failure time data with covariate measurement error are required.
Proper use of auxiliary information has been shown to improve the efficiency of estimation for multivariate failure time data. For example, Hu and Lin (2004) proposed a corrected estimation function under the assumption that the error is symmetrically distributed. Liu et al. (2009) and Liu et al. (2010) developed estimated pseudo-partial likelihood (EPPL) methods for multivariate failure time data with discrete and continuous auxiliary variables, respectively. Liu et al. (2012) and Fan and Wang (2009) studied this problem under the assumption that the intra-cluster subjects have a common baseline hazard. The above studies are based on the marginal hazards model. The intra-cluster correlation, however, is ignored in the estimation procedures and is adjusted for only in the inference step by applying a robust sandwich variance estimate. Ignoring the intra-cluster correlation results in some loss of efficiency.
Some authors have proposed incorporating the correlation explicitly into the estimating equations to improve efficiency when analyzing multivariate failure time data with fully observed covariates. For example, Cai and Prentice (1995, 1997) added a weight matrix, based on the inverse of the correlation matrix of the marginal martingales, to the partial likelihood score equation. Simulation studies have shown that their approach is more efficient than the approach using an independence structure when the cluster size is small. However, their method is computationally intensive when the cluster size is large because it requires estimating a very high-dimensional weight matrix. To overcome this shortcoming, Xue et al. (2010) developed a different approach based on the quadratic inference function (QIF). Their method avoids explicitly estimating the correlation parameters and is easy to implement, especially when the cluster size is large. We note that both methods assume that the covariates are completely observed and therefore cannot be applied directly to the SOLVD data.
Motivated by the advantages of the QIF method, we extend it to the analysis of multivariate failure time data with auxiliary information. Here, we assume that the auxiliary covariate is continuous. We propose an estimated QIF method and study the asymptotic properties of the proposed estimator. The proposed method inherits the merits of the QIF method: it avoids estimating nuisance correlation parameters and is computationally easy to implement. Under certain regularity conditions, we establish the asymptotic normality of the resulting estimator. Simulation studies show that the proposed method improves estimation efficiency compared with methods that ignore the dependence structure, such as the method of Liu et al. (2010). In addition, we study the problem of hypothesis testing and propose a test statistic that has a chi-squared limiting distribution under the null hypothesis.
The rest of the article is organized as follows. In Sect. 2, we introduce the model and describe the proposed estimation procedure. In Sect. 3, the large-sample properties of the proposed estimator are presented. In Sect. 4, a chi-squared test is developed for hypothesis testing. In Sect. 5, the finite-sample performance of the proposed procedures is assessed through extensive simulation studies. We illustrate the proposed method through the analysis of a real data set from the SOLVD study in Sect. 6. Some concluding remarks are given in Sect. 7, and the technical proofs are provided in the “Appendix”.
Model and estimation
Preliminaries
Suppose that the whole cohort consists of n independent clusters, and each cluster contains K correlated failure types. Let (i, k) denote the kth subject in the ith cluster. Let and be potential failure time and censoring time for subject (i, k). With censoring, one observes and , where is the indicator function. Let be a p-vector of possibly time-dependent covariates.
For subject (i, k), the hazard function takes the following form:
(2.1)
where is a p-vector of unknown regression parameters and is an unspecified marginal baseline hazard function pertaining to the kth failure type.
Note that model (2.1) includes as a special case the failure-type-specific model (Wei et al. 1989; Greene and Cai 2004) , which allows for different covariate effects for different k. This can be seen by defining and in the model . For simplicity, we write as in the following.
Let be the marginal cumulative baseline hazard function for the kth failure type. Let and be the observed counting process and the at-risk indicator process. For convenience, write the relative risk function as . Let be the marginal martingale process, where is the true parameter. Given , can be estimated consistently by the following Breslow type estimator (Breslow 1972):
(2.2)
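As an illustration of the Breslow-type estimator, the following minimal sketch computes the cumulative baseline hazard for a single failure type. It assumes time-independent covariates and takes the relative risks at the given parameter value as input; the function name, argument names and toy data are hypothetical and are not taken from the paper.

```python
import math


def breslow_cumhaz(times, events, risks, t):
    """Breslow-type estimate of the cumulative baseline hazard at time t.

    times  : observed times, min(failure time, censoring time)
    events : 1 if the failure was observed, 0 if censored
    risks  : relative risk of each subject at the given parameter value
    """
    total = 0.0
    for t_i, d_i in zip(times, events):
        if d_i == 1 and t_i <= t:
            # Sum of relative risks over subjects still at risk at t_i.
            at_risk = sum(r for t_j, r in zip(times, risks) if t_j >= t_i)
            total += 1.0 / at_risk
    return total


# Toy data for one failure type with relative risk exp(beta * z).
beta = 0.5
z = [0.0, 1.0, 0.0, 1.0]
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]
risks = [math.exp(beta * z_i) for z_i in z]
print(breslow_cumhaz(times, events, risks, 3.5))
```

Each observed failure contributes the reciprocal of the total relative risk of the subjects still at risk, so the estimate is a step function that jumps only at observed failure times.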
Given , it follows that could be estimated as follows:
Let be the end time of the study. Write . To improve the estimation efficiency, Cai and Prentice (1995) added a proper weight matrix to the pseudo-partial likelihood equation and proposed to obtain the estimate of by solving the following equation:
(2.3)
where is the weight matrix with
and
with denoting the jth derivative of with respect to . The weight matrix measures the intra-cluster correlation and is important for improving estimation efficiency. However, when the cluster size K is large, estimating the weight matrix is computationally expensive. To overcome this shortcoming, Xue et al. (2010) proposed a QIF method based on the following generalized estimating equation:
(2.4)
where and is the working correlation matrix whose common structure is specified by a vector of nuisance correlation parameters . The inverse of the working correlation is approximated by a linear combination of several pre-specified symmetric basis matrices, namely,
(2.5)
where are known basis matrices and are unknown coefficients.
Substituting (2.5) in (2.4) leads to a linear combination of the elements of the following vector
As there are more equations than unknown parameters, Xue et al. (2010) proposed to estimate by minimizing the following QIF:
(2.6)
where . We denote the solution as in the following.
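The construction above can be sketched numerically. The code below builds the usual pair of basis matrices for the exchangeable and AR(1) working correlations (the identity plus one structure-specific matrix, as in Qu et al. 2000) and evaluates a quadratic inference function from per-cluster extended score vectors. It assumes the objective takes the standard form Q = n ḡᵀ C⁻¹ ḡ with C the sample second-moment matrix of the extended scores; the helper names and the score vectors are hypothetical inputs, and the hazard-model scores themselves are not computed here.

```python
def basis_matrices(K, structure="exchangeable"):
    """Identity plus one structure-specific basis matrix of size K x K."""
    identity = [[1.0 if a == b else 0.0 for b in range(K)] for a in range(K)]
    if structure == "exchangeable":
        # Zeros on the diagonal, ones elsewhere.
        second = [[0.0 if a == b else 1.0 for b in range(K)] for a in range(K)]
    elif structure == "ar1":
        # Ones on the two off-diagonals next to the main diagonal.
        second = [[1.0 if abs(a - b) == 1 else 0.0 for b in range(K)] for a in range(K)]
    else:
        raise ValueError("unknown structure")
    return [identity, second]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    m = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - s) / aug[r][r]
    return x


def qif_value(scores):
    """Q = n * gbar' C^{-1} gbar, with C = (1/n) * sum_i g_i g_i'."""
    n, m = len(scores), len(scores[0])
    gbar = [sum(g[j] for g in scores) / n for j in range(m)]
    C = [[sum(g[a] * g[b] for g in scores) / n for b in range(m)] for a in range(m)]
    x = solve(C, gbar)
    return n * sum(gb * xi for gb, xi in zip(gbar, x))


# Hypothetical extended scores for four clusters (two components each).
print(basis_matrices(3, "ar1")[1])
print(qif_value([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]))
```

Because the extended score has more components than there are parameters, the quadratic form combines them with data-driven weights instead of requiring estimates of the coefficients in (2.5).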
In the implementation of QIF, there is an additional issue: the diagonal matrix involves the unknown baseline hazard function . Xue et al. (2010) suggested a kernel smoothed estimator as follows,
(2.7)
where is the Epanechnikov kernel function with being the rule-of-thumb bandwidth, and with being the Breslow estimator given in (2.2).
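A minimal sketch of kernel smoothing of a step-function cumulative hazard is given below. It assumes the smoothed hazard is the kernel-weighted sum of the jumps of the Breslow estimator divided by the bandwidth, which is the usual form of such estimators; the bandwidth is passed in directly because the rule-of-thumb formula is not reproduced here, and all names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def smoothed_hazard(t, jump_times, jump_sizes, h):
    """Kernel-smoothed baseline hazard at time t.

    jump_times : times at which the cumulative hazard estimate jumps
    jump_sizes : sizes of those jumps
    h          : bandwidth (assumed given)
    """
    weighted = sum(epanechnikov((t - s) / h) * d for s, d in zip(jump_times, jump_sizes))
    return weighted / h


# One jump of size 0.5 at time 1.0, bandwidth 2.0.
print(smoothed_hazard(1.0, [1.0], [0.5], 2.0))
print(smoothed_hazard(4.0, [1.0], [0.5], 2.0))
```

Since the kernel has bounded support, the smoothed hazard is exactly zero at any time that is more than one bandwidth away from every jump.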
Estimated QIF for marginal hazards model
Consider the situation in which the primary covariate can be ascertained only in a validation set. Let consist of two parts, and , where is the primary variable, which can be observed only in the validation set, and is the vector of the remaining covariates, which are measured precisely for the full cohort. Accordingly, write the true parameter as with and pertaining to and , respectively. Let A(t) denote a time-dependent auxiliary variable for the primary covariate X(t). The auxiliary variable can be measured for all cohort members. Suppose that A provides no additional information to the model given X, i.e.,
Use or 0 to indicate whether the subject (i, k) is in the validation set or not. Let and denote the kth marginal validation set and non-validation set, respectively. Then the observed data are:
According to Liu et al. (2009), when subject (i, k) is in non-validation set, the hazard function given observed data can be written as:
where denotes all the possible auxiliary information, which may include the auxiliary covariate and the part from . Therefore, the induced relative risk function is
where , and
If the conditional density of , written as , is a known function up to a parameter , then can be estimated by using the induced risk function to replace the risk function in equations (2.3) or (2.4). However, misspecification of such a parameterization may lead to biased estimates. We therefore use an empirical method to estimate and then replace it with the corresponding estimate.
In this paper, we consider the frequently encountered case in which both the primary covariate and the auxiliary variable are one-dimensional. The unknown part of the induced relative risk function in the non-validation set is estimated by the kernel smoothing method
(2.8)
where is a kernel function and is the bandwidth. When the denominator is 0, the relative risk is imputed by interpolation. Therefore, the estimate of the relative risk is
Replacing by in the notations in Sect. 2.1, we obtain an estimated version of and . To differentiate, write as , and . It yields an estimated QIF as
where . can be estimated by minimizing , i.e.,
(2.9)
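To illustrate the kernel step in (2.8), the sketch below assumes that the estimate takes the usual ratio (Nadaraya–Watson) form: a kernel-weighted average, over validation subjects, of their relative risks, with weights determined by the distance between auxiliary values. When the denominator is zero it falls back on the validation subject with the nearest auxiliary value, which is one possible reading of the interpolation rule. At-risk indicators are omitted for brevity, and all names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def induced_risk(a, aux_val, risk_val, h):
    """Kernel estimate of the induced relative risk at auxiliary value a.

    aux_val  : auxiliary values of the validation subjects
    risk_val : relative risks of the validation subjects (primary covariate observed)
    h        : bandwidth
    """
    weights = [epanechnikov((a - a_j) / h) for a_j in aux_val]
    denom = sum(weights)
    if denom > 0.0:
        return sum(w * r for w, r in zip(weights, risk_val)) / denom
    # Denominator is zero: use the nearest validation subject instead.
    nearest = min(range(len(aux_val)), key=lambda j: abs(a - aux_val[j]))
    return risk_val[nearest]


aux_val = [0.0, 1.0]
risk_val = [1.0, 3.0]
print(induced_risk(0.5, aux_val, risk_val, 1.0))  # both neighbours weighted equally
print(induced_risk(5.0, aux_val, risk_val, 1.0))  # no neighbour within the bandwidth
```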
To reduce the computational burden, we approximate the first- and second-order derivatives of as in Qu et al. (2000) as follows.
The Newton–Raphson algorithm can then be applied using these approximations.
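The iteration can be sketched on a toy problem. The code below is not the hazard-model estimator: it applies the first-order approximations of Qu et al. (2000), in which the gradient is taken proportional to ġᵀC⁻¹ḡ and the Hessian proportional to ġᵀC⁻¹ġ, to a clustered linear model with a single slope and the exchangeable basis matrices. The model, data-generating values and function names are all assumptions made for illustration.

```python
import random


def extended_score(beta, x, y):
    """Extended score and its derivative for one cluster (toy linear model)."""
    r = [y_k - beta * x_k for x_k, y_k in zip(x, y)]
    xr = sum(a * b for a, b in zip(x, r))
    xx = sum(a * a for a in x)
    g = [xr, sum(x) * sum(r) - xr]       # x' I r and x' (J - I) r
    dg = [-xx, -(sum(x) ** 2 - xx)]      # derivatives with respect to beta
    return g, dg


def newton_step(beta, clusters):
    """One Newton-Raphson update using the approximate gradient and Hessian."""
    n = len(clusters)
    pairs = [extended_score(beta, x, y) for x, y in clusters]
    gbar = [sum(g[j] for g, _ in pairs) / n for j in range(2)]
    dbar = [sum(d[j] for _, d in pairs) / n for j in range(2)]
    c11 = sum(g[0] * g[0] for g, _ in pairs) / n
    c12 = sum(g[0] * g[1] for g, _ in pairs) / n
    c22 = sum(g[1] * g[1] for g, _ in pairs) / n
    det = c11 * c22 - c12 * c12
    inv = [[c22 / det, -c12 / det], [-c12 / det, c11 / det]]
    cinv_g = [inv[a][0] * gbar[0] + inv[a][1] * gbar[1] for a in range(2)]
    cinv_d = [inv[a][0] * dbar[0] + inv[a][1] * dbar[1] for a in range(2)]
    grad = sum(d * v for d, v in zip(dbar, cinv_g))   # common factor 2n cancels
    hess = sum(d * v for d, v in zip(dbar, cinv_d))
    return beta - grad / hess


def fit(clusters, beta=0.0, tol=1e-8, max_iter=50):
    """Iterate Newton-Raphson steps until the update is below tol."""
    for _ in range(max_iter):
        new = newton_step(beta, clusters)
        if abs(new - beta) < tol:
            return new
        beta = new
    return beta


random.seed(1)
clusters = []
for _ in range(200):
    shared = random.gauss(0.0, 0.5)      # shared term induces exchangeable correlation
    x = [random.uniform(0.0, 1.0) for _ in range(4)]
    y = [0.7 * x_k + shared + random.gauss(0.0, 0.5) for x_k in x]
    clusters.append((x, y))
print(fit(clusters))                      # estimate of the slope used to simulate (0.7)
```

The approximations avoid differentiating the weighting matrix, so each step needs only the averaged extended score, its averaged derivative and the second-moment matrix.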
Asymptotic properties
In this section, we present the asymptotic properties of the proposed estimated QIF estimator , and provide a formula for its standard error.
Let denote the number of subjects in and assume as , where represents the probability of subject (i, k) being sampled into the kth marginal validation set. Under the conditions listed in “Appendix”, we demonstrate the asymptotic behavior of in the following theorems.
Theorem 1
Under conditions (C1)–(C9) in “Appendix”, the following results hold:
(I) The proposed estimator is a consistent estimator of .
(II)
The asymptotic covariance can be consistently estimated by
where
with
and
where
with
where and are defined in “Appendix”.
Remark 1
As a special case of the estimated QIF estimator, the estimator using the independent working correlation is denoted as , which is the same as the EPPL estimator of Liu et al. (2010), but different expressions of the variance matrix of the asymptotic distribution of and its estimator are provided under the conditions (C1)–(C8) in “Appendix”. From the corresponding expressions of and , we can obtain that
and
Inference on hazard ratio parameters
The QIF is built on an objective function, which provides a natural way to make inference about the hazard ratio parameter . Suppose that is partitioned into and , where is a vector of hazard ratio parameters of interest with dimension , and is a vector of nuisance parameters with dimension . As a special case, we also allow , with and being absent.
To test
we propose a test statistic
(4.1)
where
The values of and measure how well the model fits the data under and , respectively. Under , the difference between and should be very small. However, under , should be systematically larger than .
Theorem 2
Suppose that conditions (C1)–(C9) in “Appendix” are satisfied. Then, under , the test statistic T asymptotically follows a chi-squared distribution with degrees of freedom.
Comment
To prove Theorem 2, we rewrite that
where is defined as in (2.6), , and . From the proof of Theorem 1, the first two brackets are equal to . In addition, by Theorem 1 above and Theorem 1 in Xue et al. (2010), both and are consistent estimators of , so the third and fourth brackets are also equal to . Furthermore, the last bracket converges in distribution to a random variable that follows a chi-squared distribution with degrees of freedom; the proof is similar to that of Theorem 1 in Qu et al. (2000).
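In practice the test reduces to comparing the difference between the two minimized objective values with a chi-squared reference distribution. The sketch below assumes the statistic is the constrained minimum minus the unconstrained minimum, and it computes the upper-tail probability from the series expansion of the incomplete gamma function so that only the standard library is needed. The function names, the example values and the default level of 0.05 are illustrative assumptions.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with df degrees of freedom."""
    if x <= 0.0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def qif_test(q_null, q_full, df, level=0.05):
    """Return the statistic, its p-value and the rejection decision."""
    stat = q_null - q_full
    p_value = chi2_sf(stat, df)
    return stat, p_value, p_value < level


# Hypothetical minimized objective values under the null and full models.
print(qif_test(q_null=5.2, q_full=1.1, df=1))
```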
Simulation studies
In this section, we conduct simulation studies to evaluate the finite-sample behavior of the proposed method. We first evaluate the performance of the proposed estimator in Sect. 5.1 and then the performance of the inference method in Sect. 5.2.
Performance of estimated QIF estimator
We compare the proposed estimator with the QIF estimator proposed by Xue et al. (2010) based only on the validation set and the EPPL estimator () of Liu et al. (2010), which utilizes the auxiliary information but does not consider the intra-cluster correlation in the estimate of . The proposed estimator takes both the intra-cluster correlation and the auxiliary information into account.
The covariates , which are only observed in the validation sets in the real studies, are generated independently from the uniform distribution U(0, 1). The covariates are independent binary covariates taking the value one with probability 0.5. The multivariate failure times are generated from the multivariate model of Clayton and Cuzick (1985) with the joint survival function
where , and , which may vary with the failure type, is the corresponding parameter of , and is the dependence parameter, a larger value of which represents a weaker dependence between the failure times. We set , 0.5 or 2, which presents a varying degree of correlation between the generated failure times, and the baseline hazard function . The simulated failure times are generated by using the algorithm described in Cai and Shen (2000) through
for , where for , and are generated from uniform distribution over interval (0, 1). Censoring times are generated from U(0, c), where c is a selected constant to achieve a specified censoring rate.
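For readers who wish to reproduce data of this type, the sketch below uses a different but standard construction in place of the sequential algorithm of Cai and Shen (2000): the shared gamma frailty (Marshall–Olkin) representation of the Clayton copula, applied to the marginal survival probabilities. It assumes a constant baseline hazard and exponential relative risk; the baseline hazard value, the regression coefficients, the censoring bound c and all function names are illustrative assumptions. The parameter `theta` follows the convention above, so larger values give weaker dependence.

```python
import math
import random


def clayton_failure_times(linear_predictors, theta, base_hazard, rng):
    """Draw one cluster of failure times whose survival copula is Clayton."""
    frailty = rng.gammavariate(theta, 1.0)   # shared by all members of the cluster
    times = []
    for eta in linear_predictors:
        e = rng.expovariate(1.0)
        # -log(U) for U = (1 + e / frailty) ** (-theta), a uniform survival probability.
        neg_log_u = theta * math.log1p(e / frailty)
        # Invert the marginal survival function exp(-base_hazard * exp(eta) * t).
        times.append(neg_log_u / (base_hazard * math.exp(eta)))
    return times


def censor(times, c, rng):
    """Apply independent U(0, c) censoring; return (observed time, event) pairs."""
    observed = []
    for t in times:
        cens = rng.uniform(0.0, c)
        observed.append((min(t, cens), 1 if t <= cens else 0))
    return observed


rng = random.Random(2024)
# Illustrative linear predictors: one uniform and one binary covariate per member.
eta = [0.693 * rng.uniform(0.0, 1.0) - 0.2 * (rng.random() < 0.5) for _ in range(4)]
failure_times = clayton_failure_times(eta, theta=0.25, base_hazard=1.0, rng=rng)
print(censor(failure_times, c=5.0, rng=rng))
```

In a simulation, c would be tuned by trial runs until the empirical censoring proportion matches the target rate.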
Because the true correlation structure of the Clayton model is exchangeable, the working correlation in (2.9) is taken to be exchangeable, and the corresponding estimated QIF estimator is denoted as . We also calculate another estimated QIF estimator using the misspecified AR(1) working correlation. The corresponding estimators of the QIF method based only on the validation set are denoted as and , respectively.
To estimate the induced relative risk function and the baseline hazard function, we apply the Epanechnikov kernel function in (2.8) and (2.7) with bandwidths and , respectively, where is the sample standard deviation function and is the part of the auxiliary covariate in the kth marginal validation set. We use nearest neighbor interpolation to estimate the induced relative risk function when the denominator in (2.8) is 0. In addition, it is worth noting that may be 0 at some locations because the Epanechnikov kernel function has bounded support, which could make the diagonal matrix non-invertible. If this happens, we replace with the average of the values at the non-zero locations.
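The zero-replacement rule for the smoothed baseline hazard can be written in a few lines. The sketch below assumes the smoothed hazard has been evaluated at a list of locations; the function name is hypothetical.

```python
def fill_zero_hazards(values):
    """Replace zero smoothed-hazard values by the mean of the non-zero values."""
    nonzero = [v for v in values if v > 0.0]
    if not nonzero:
        raise ValueError("all smoothed hazard values are zero")
    fallback = sum(nonzero) / len(nonzero)
    return [v if v > 0.0 else fallback for v in values]


print(fill_zero_hazards([0.0, 0.2, 0.4]))
```

After the replacement every diagonal entry is strictly positive, so the diagonal matrix can be inverted.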
We consider two types of simulations:
(1) The regression parameters are the same for different failure types, i.e. .
(2) The regression parameters vary across failure types.
The auxiliary covariate is generated from via
where follows a normal distribution , and the positive parameter controls the strength of association between and . Each simulation is repeated 1000 times.
Simulation study (1)
In the first simulation, we set the true parameter , validation proportion , association parameter . The number of independent clusters is , with or 8 failure types in each cluster.
Tables 1, 2 and 3 present the simulation results for the estimates of the parameter for each method under censoring rates of 10%, 40% and 80%. The sample mean and sample standard deviation of the 1000 estimates, the average of the estimated standard errors and the coverage rate of the 95% confidence intervals for the true parameter are listed in the Est, SD, SE and CR columns, respectively. RE, the ratio of the empirical variance of to that of or , is the estimated relative efficiency of the estimated QIF estimators relative to . We summarize the results as follows: (i) The estimates from all methods are approximately unbiased. Moreover, the estimated asymptotic standard errors are approximately equal to the empirical standard deviations, and the corresponding 95% confidence intervals provide reasonable coverage rates. This suggests that the asymptotic standard error estimates work well for all methods. (ii) For each scenario considered, the estimator using auxiliary information is more efficient than the estimators and using the validation set only. However, loses efficiency as the correlation within a cluster becomes stronger. (iii) As K increases, the empirical standard deviations (SD) of all the estimators decrease, as expected from the increase in the total amount of data. (iv) As decreases, the efficiency gain of the estimated QIF estimators relative to increases. From Table 1, the estimated QIF estimators are more efficient than the other estimators for all combinations of and K. We observe the same trend in Table 2. In Table 3, however, where the censoring rate is 80%, the estimated QIF estimators are slightly less efficient than in several cases, with REs very close to 1, because heavy censoring reduces the correlation. Furthermore, as expected, with the correct working correlation is always more efficient than with the misspecified working correlation.
(v) The validation proportion of the incomplete covariate affects the values of RE, especially for the first parameter. For example, when and the censoring rate is 10%, the REs of relative to for decrease from (3.34, 2.26, 1.21) to (2.85, 2.03, 1.18) when the validation proportion decreases from 0.5 to 0.3 (results not shown). However, when we increased n, both the CRs and the REs increased.
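The summary measures reported in the tables can be computed from the replicated estimates as sketched below. The sketch assumes Wald-type 95% confidence intervals (estimate plus or minus 1.96 standard errors) and defines RE as the variance of the reference estimator divided by the variance of the estimator under comparison; the function names and toy inputs are hypothetical.

```python
import statistics


def summarize(estimates, std_errors, truth, z=1.959964):
    """Est, SD, SE and CR for one parameter over the simulation replications."""
    covered = sum(
        1 for b, s in zip(estimates, std_errors) if abs(b - truth) <= z * s
    )
    return {
        "Est": statistics.mean(estimates),      # mean of the estimates
        "SD": statistics.stdev(estimates),      # empirical standard deviation
        "SE": statistics.mean(std_errors),      # average estimated standard error
        "CR": covered / len(estimates),         # coverage of the 95% intervals
    }


def relative_efficiency(reference_estimates, estimates):
    """Empirical variance of the reference estimator over that of the other one."""
    return statistics.variance(reference_estimates) / statistics.variance(estimates)


print(summarize([0.6, 0.7, 0.8], [0.1, 0.1, 0.1], truth=0.7))
print(relative_efficiency([0.0, 2.0, 4.0], [0.0, 1.0, 2.0]))
```

As a rough check against the reported values, the SD entries 0.143 and 0.078 in the third and fourth rows of Table 1 give (0.143 / 0.078)² ≈ 3.36, which agrees with the reported RE of 3.34 up to rounding of the SDs.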
Table 1.
Simulation results for common effect size across failure type: under the censoring rate 10%
| K | Dependence parameter | Method | First parameter | | | | | Second parameter | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | Est | SD | SE | CR | RE | Est | SD | SE | CR | RE |
| 4 | 0.25 | Validation-only QIF, exchangeable | 0.686 | 0.164 | 0.163 | 0.952 | – | -0.202 | 0.098 | 0.091 | 0.926 | – |
| | | Validation-only QIF, AR(1) | 0.682 | 0.171 | 0.169 | 0.948 | – | -0.205 | 0.104 | 0.095 | 0.917 | – |
| | | EPPL | 0.675 | 0.143 | 0.141 | 0.945 | – | -0.203 | 0.077 | 0.075 | 0.938 | – |
| | | Estimated QIF, exchangeable | 0.684 | 0.078 | 0.078 | 0.942 | 3.34 | -0.200 | 0.039 | 0.038 | 0.936 | 3.93 |
| | | Estimated QIF, AR(1) | 0.680 | 0.089 | 0.088 | 0.933 | 2.62 | -0.201 | 0.047 | 0.045 | 0.941 | 2.65 |
| | 0.5 | Validation-only QIF, exchangeable | 0.683 | 0.166 | 0.167 | 0.955 | – | -0.202 | 0.103 | 0.094 | 0.925 | – |
| | | Validation-only QIF, AR(1) | 0.679 | 0.174 | 0.173 | 0.946 | – | -0.205 | 0.107 | 0.097 | 0.926 | – |
| | | EPPL | 0.673 | 0.142 | 0.139 | 0.943 | – | -0.202 | 0.077 | 0.075 | 0.941 | – |
| | | Estimated QIF, exchangeable | 0.681 | 0.094 | 0.091 | 0.935 | 2.26 | -0.199 | 0.048 | 0.047 | 0.940 | 2.55 |
| | | Estimated QIF, AR(1) | 0.676 | 0.103 | 0.101 | 0.930 | 1.91 | -0.201 | 0.055 | 0.053 | 0.945 | 1.97 |
| | 2 | Validation-only QIF, exchangeable | 0.678 | 0.184 | 0.179 | 0.939 | – | -0.203 | 0.111 | 0.102 | 0.932 | – |
| | | Validation-only QIF, AR(1) | 0.676 | 0.188 | 0.181 | 0.924 | – | -0.204 | 0.112 | 0.103 | 0.927 | – |
| | | EPPL | 0.673 | 0.139 | 0.137 | 0.946 | – | -0.201 | 0.078 | 0.075 | 0.938 | – |
| | | Estimated QIF, exchangeable | 0.675 | 0.127 | 0.123 | 0.938 | 1.21 | -0.200 | 0.069 | 0.067 | 0.941 | 1.24 |
| | | Estimated QIF, AR(1) | 0.672 | 0.133 | 0.127 | 0.928 | 1.11 | -0.201 | 0.072 | 0.069 | 0.935 | 1.15 |
| 8 | 0.25 | Validation-only QIF, exchangeable | 0.699 | 0.122 | 0.113 | 0.939 | – | -0.202 | 0.066 | 0.062 | 0.933 | – |
| | | Validation-only QIF, AR(1) | 0.697 | 0.130 | 0.121 | 0.929 | – | -0.201 | 0.070 | 0.067 | 0.936 | – |
| | | EPPL | 0.683 | 0.106 | 0.103 | 0.941 | – | -0.201 | 0.054 | 0.054 | 0.949 | – |
| | | Estimated QIF, exchangeable | 0.686 | 0.061 | 0.059 | 0.938 | 3.05 | -0.200 | 0.027 | 0.026 | 0.942 | 4.12 |
| | | Estimated QIF, AR(1) | 0.687 | 0.068 | 0.066 | 0.940 | 2.38 | -0.200 | 0.031 | 0.031 | 0.949 | 2.96 |
| | 0.5 | Validation-only QIF, exchangeable | 0.698 | 0.126 | 0.116 | 0.927 | – | -0.200 | 0.069 | 0.064 | 0.923 | – |
| | | Validation-only QIF, AR(1) | 0.695 | 0.133 | 0.123 | 0.935 | – | -0.200 | 0.072 | 0.068 | 0.931 | – |
| | | EPPL | 0.684 | 0.103 | 0.101 | 0.945 | – | -0.201 | 0.054 | 0.054 | 0.950 | – |
| | | Estimated QIF, exchangeable | 0.688 | 0.068 | 0.065 | 0.931 | 2.31 | -0.200 | 0.032 | 0.032 | 0.946 | 2.89 |
| | | Estimated QIF, AR(1) | 0.687 | 0.077 | 0.073 | 0.929 | 1.80 | -0.200 | 0.037 | 0.037 | 0.939 | 2.17 |
| | 2 | Validation-only QIF, exchangeable | 0.698 | 0.135 | 0.125 | 0.922 | – | -0.199 | 0.074 | 0.071 | 0.941 | – |
| | | Validation-only QIF, AR(1) | 0.694 | 0.139 | 0.128 | 0.918 | – | -0.200 | 0.075 | 0.073 | 0.945 | – |
| | | EPPL | 0.687 | 0.099 | 0.098 | 0.935 | – | -0.200 | 0.051 | 0.053 | 0.949 | – |
| | | Estimated QIF, exchangeable | 0.690 | 0.087 | 0.084 | 0.938 | 1.29 | -0.199 | 0.044 | 0.045 | 0.948 | 1.37 |
| | | Estimated QIF, AR(1) | 0.687 | 0.094 | 0.090 | 0.931 | 1.13 | -0.200 | 0.048 | 0.048 | 0.944 | 1.16 |
Validation-only QIF: the estimator of the QIF method based only on the validation set, with exchangeable or AR(1) working correlation. EPPL: the EPPL estimator using the independent structure. Estimated QIF: the estimator of the proposed estimated QIF method with exchangeable or AR(1) working correlation
Table 2.
Simulation results for common effect size across failure type: under the censoring rate 40%
| K | Dependence parameter | Method | First parameter | | | | | Second parameter | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | Est | SD | SE | CR | RE | Est | SD | SE | CR | RE |
| 4 | 0.25 | Validation-only QIF, exchangeable | 0.679 | 0.213 | 0.206 | 0.936 | – | -0.205 | 0.121 | 0.117 | 0.946 | – |
| | | Validation-only QIF, AR(1) | 0.676 | 0.222 | 0.213 | 0.932 | – | -0.206 | 0.124 | 0.121 | 0.950 | – |
| | | EPPL | 0.681 | 0.177 | 0.169 | 0.936 | – | -0.205 | 0.093 | 0.092 | 0.950 | – |
| | | Estimated QIF, exchangeable | 0.687 | 0.120 | 0.116 | 0.941 | 2.18 | -0.201 | 0.062 | 0.061 | 0.946 | 2.20 |
| | | Estimated QIF, AR(1) | 0.682 | 0.131 | 0.127 | 0.934 | 1.82 | -0.202 | 0.069 | 0.068 | 0.949 | 1.79 |
| | 0.5 | Validation-only QIF, exchangeable | 0.673 | 0.218 | 0.212 | 0.935 | – | -0.206 | 0.127 | 0.120 | 0.941 | – |
| | | Validation-only QIF, AR(1) | 0.670 | 0.224 | 0.217 | 0.934 | – | -0.207 | 0.129 | 0.123 | 0.940 | – |
| | | EPPL | 0.680 | 0.174 | 0.168 | 0.939 | – | -0.204 | 0.091 | 0.092 | 0.954 | – |
| | | Estimated QIF, exchangeable | 0.685 | 0.132 | 0.131 | 0.944 | 1.73 | -0.202 | 0.070 | 0.070 | 0.952 | 1.66 |
| | | Estimated QIF, AR(1) | 0.679 | 0.143 | 0.140 | 0.935 | 1.48 | -0.203 | 0.076 | 0.076 | 0.954 | 1.45 |
| | 2 | Validation-only QIF, exchangeable | 0.668 | 0.225 | 0.221 | 0.939 | – | -0.204 | 0.134 | 0.127 | 0.944 | – |
| | | Validation-only QIF, AR(1) | 0.666 | 0.226 | 0.222 | 0.934 | – | -0.205 | 0.134 | 0.127 | 0.937 | – |
| | | EPPL | 0.679 | 0.166 | 0.166 | 0.942 | – | -0.202 | 0.092 | 0.092 | 0.954 | – |
| | | Estimated QIF, exchangeable | 0.678 | 0.156 | 0.157 | 0.940 | 1.14 | -0.202 | 0.088 | 0.086 | 0.950 | 1.11 |
| | | Estimated QIF, AR(1) | 0.675 | 0.161 | 0.160 | 0.940 | 1.07 | -0.203 | 0.090 | 0.088 | 0.947 | 1.05 |
| 8 | 0.25 | Validation-only QIF, exchangeable | 0.702 | 0.152 | 0.143 | 0.929 | – | -0.201 | 0.084 | 0.080 | 0.944 | – |
| | | Validation-only QIF, AR(1) | 0.701 | 0.159 | 0.152 | 0.940 | – | -0.201 | 0.088 | 0.085 | 0.944 | – |
| | | EPPL | 0.695 | 0.122 | 0.123 | 0.948 | – | -0.203 | 0.066 | 0.066 | 0.947 | – |
| | | Estimated QIF, exchangeable | 0.696 | 0.085 | 0.082 | 0.941 | 2.08 | -0.201 | 0.042 | 0.041 | 0.944 | 2.39 |
| | | Estimated QIF, AR(1) | 0.697 | 0.094 | 0.092 | 0.942 | 1.68 | -0.201 | 0.048 | 0.048 | 0.944 | 1.85 |
| | 0.5 | Validation-only QIF, exchangeable | 0.706 | 0.157 | 0.147 | 0.934 | – | -0.200 | 0.087 | 0.083 | 0.945 | – |
| | | Validation-only QIF, AR(1) | 0.702 | 0.163 | 0.154 | 0.948 | – | -0.200 | 0.090 | 0.087 | 0.950 | – |
| | | EPPL | 0.697 | 0.120 | 0.121 | 0.946 | – | -0.202 | 0.066 | 0.065 | 0.951 | – |
| | | Estimated QIF, exchangeable | 0.700 | 0.092 | 0.090 | 0.945 | 1.69 | -0.200 | 0.048 | 0.047 | 0.955 | 1.85 |
| | | Estimated QIF, AR(1) | 0.699 | 0.102 | 0.099 | 0.938 | 1.39 | -0.200 | 0.054 | 0.053 | 0.950 | 1.48 |
| | 2 | Validation-only QIF, exchangeable | 0.700 | 0.165 | 0.155 | 0.939 | – | -0.199 | 0.089 | 0.088 | 0.947 | – |
| | | Validation-only QIF, AR(1) | 0.697 | 0.168 | 0.158 | 0.934 | – | -0.200 | 0.091 | 0.090 | 0.945 | – |
| | | EPPL | 0.695 | 0.120 | 0.118 | 0.949 | – | -0.200 | 0.063 | 0.065 | 0.950 | – |
| | | Estimated QIF, exchangeable | 0.698 | 0.111 | 0.108 | 0.943 | 1.16 | -0.199 | 0.058 | 0.059 | 0.946 | 1.19 |
| | | Estimated QIF, AR(1) | 0.695 | 0.116 | 0.113 | 0.950 | 1.07 | -0.201 | 0.062 | 0.062 | 0.950 | 1.04 |
See Table 1
Table 3.
Simulation results for common effect size across failure type: under the censoring rate 80%
| K | Dependence parameter | Method | First parameter | | | | | Second parameter | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | Est | SD | SE | CR | RE | Est | SD | SE | CR | RE |
| 4 | 0.25 | Validation-only QIF, exchangeable | 0.655 | 0.412 | 0.374 | 0.924 | – | -0.216 | 0.250 | 0.215 | 0.921 | – |
| | | Validation-only QIF, AR(1) | 0.649 | 0.414 | 0.379 | 0.928 | – | -0.216 | 0.244 | 0.219 | 0.923 | – |
| | | EPPL | 0.684 | 0.308 | 0.288 | 0.939 | – | -0.198 | 0.164 | 0.159 | 0.941 | – |
| | | Estimated QIF, exchangeable | 0.695 | 0.264 | 0.248 | 0.936 | 1.36 | -0.199 | 0.147 | 0.136 | 0.935 | 1.24 |
| | | Estimated QIF, AR(1) | 0.685 | 0.277 | 0.259 | 0.943 | 1.23 | -0.200 | 0.153 | 0.143 | 0.936 | 1.15 |
| | 0.5 | Validation-only QIF, exchangeable | 0.652 | 0.435 | 0.381 | 0.929 | – | -0.211 | 0.241 | 0.221 | 0.938 | – |
| | | Validation-only QIF, AR(1) | 0.651 | 0.412 | 0.384 | 0.933 | – | -0.209 | 0.238 | 0.222 | 0.939 | – |
| | | EPPL | 0.689 | 0.304 | 0.287 | 0.939 | – | -0.196 | 0.160 | 0.159 | 0.946 | – |
| | | Estimated QIF, exchangeable | 0.691 | 0.282 | 0.266 | 0.938 | 1.17 | -0.199 | 0.155 | 0.147 | 0.940 | 1.07 |
| | | Estimated QIF, AR(1) | 0.682 | 0.290 | 0.273 | 0.938 | 1.10 | -0.200 | 0.156 | 0.151 | 0.945 | 1.06 |
| | 2 | Validation-only QIF, exchangeable | 0.658 | 0.432 | 0.385 | 0.918 | – | -0.210 | 0.248 | 0.224 | 0.932 | – |
| | | Validation-only QIF, AR(1) | 0.660 | 0.417 | 0.386 | 0.915 | – | -0.210 | 0.250 | 0.224 | 0.936 | – |
| | | EPPL | 0.698 | 0.296 | 0.287 | 0.942 | – | -0.195 | 0.165 | 0.159 | 0.945 | – |
| | | Estimated QIF, exchangeable | 0.685 | 0.298 | 0.282 | 0.938 | 0.99 | -0.202 | 0.167 | 0.157 | 0.936 | 0.98 |
| | | Estimated QIF, AR(1) | 0.683 | 0.299 | 0.283 | 0.934 | 0.98 | -0.202 | 0.167 | 0.157 | 0.931 | 0.97 |
| 8 | 0.25 | Validation-only QIF, exchangeable | 0.705 | 0.269 | 0.260 | 0.937 | – | -0.203 | 0.147 | 0.148 | 0.953 | – |
| | | Validation-only QIF, AR(1) | 0.696 | 0.284 | 0.269 | 0.949 | – | -0.205 | 0.149 | 0.154 | 0.961 | – |
| | | EPPL | 0.706 | 0.214 | 0.206 | 0.937 | – | -0.205 | 0.111 | 0.113 | 0.953 | – |
| | | Estimated QIF, exchangeable | 0.710 | 0.178 | 0.169 | 0.939 | 1.44 | -0.204 | 0.093 | 0.092 | 0.942 | 1.43 |
| | | Estimated QIF, AR(1) | 0.706 | 0.195 | 0.183 | 0.928 | 1.20 | -0.206 | 0.100 | 0.100 | 0.956 | 1.23 |
| | 0.5 | Validation-only QIF, exchangeable | 0.699 | 0.281 | 0.266 | 0.934 | – | -0.202 | 0.151 | 0.153 | 0.949 | – |
| | | Validation-only QIF, AR(1) | 0.692 | 0.288 | 0.271 | 0.930 | – | -0.204 | 0.153 | 0.156 | 0.947 | – |
| | | EPPL | 0.707 | 0.204 | 0.204 | 0.937 | – | -0.202 | 0.111 | 0.112 | 0.954 | – |
| | | Estimated QIF, exchangeable | 0.709 | 0.188 | 0.183 | 0.938 | 1.19 | -0.202 | 0.100 | 0.101 | 0.954 | 1.22 |
| | | Estimated QIF, AR(1) | 0.705 | 0.199 | 0.193 | 0.929 | 1.06 | -0.204 | 0.106 | 0.106 | 0.949 | 1.10 |
| | 2 | Validation-only QIF, exchangeable | 0.684 | 0.291 | 0.273 | 0.932 | – | -0.205 | 0.164 | 0.157 | 0.937 | – |
| | | Validation-only QIF, AR(1) | 0.681 | 0.290 | 0.273 | 0.929 | – | -0.204 | 0.161 | 0.157 | 0.942 | – |
| | | EPPL | 0.705 | 0.207 | 0.203 | 0.943 | – | -0.203 | 0.112 | 0.112 | 0.954 | – |
| | | Estimated QIF, exchangeable | 0.702 | 0.207 | 0.198 | 0.942 | 1.00 | -0.205 | 0.111 | 0.109 | 0.947 | 1.02 |
| | | Estimated QIF, AR(1) | 0.700 | 0.209 | 0.200 | 0.944 | 0.98 | -0.206 | 0.112 | 0.110 | 0.947 | 0.99 |
See Table 1
Simulation study (2)
In practical studies, one may be interested in the failure-type-specific model
which allows the regression parameters to vary with the failure type. We simulate failure types in each cluster, with the true parameter and . Since the cluster size is 2, we need to consider only the exchangeable working correlation structure. We consider three settings of n and censoring rate (CE), (n, CE). The simulation results are shown in Table 4, from which we observe results similar to those in Simulation Study (1).
Table 4.
Simulation results for varying effect size across failure type:
| Dependence parameter | Method | First parameter | | | | | Second parameter | | | | | Third parameter | | | | | Fourth parameter | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Est | SD | SE | CR | RE | Est | SD | SE | CR | RE | Est | SD | SE | CR | RE | Est | SD | SE | CR | RE |
| , censoring rate 10% | | | | | | | | | | | | | | | | | | | | | |
| 0.25 | Validation-only QIF | 0.696 | 0.294 | 0.272 | 0.941 | – | -0.191 | 0.170 | 0.155 | 0.931 | – | 0.477 | 0.296 | 0.270 | 0.929 | – | -0.263 | 0.172 | 0.156 | 0.928 | – |
| | EPPL | 0.693 | 0.229 | 0.222 | 0.940 | – | -0.194 | 0.121 | 0.123 | 0.948 | – | 0.495 | 0.226 | 0.222 | 0.943 | – | -0.257 | 0.127 | 0.124 | 0.946 | – |
| | Estimated QIF | 0.681 | 0.139 | 0.130 | 0.934 | 2.69 | -0.200 | 0.073 | 0.069 | 0.930 | 2.75 | 0.493 | 0.139 | 0.128 | 0.915 | 2.62 | -0.263 | 0.070 | 0.069 | 0.949 | 3.31 |
| 0.5 | Validation-only QIF | 0.694 | 0.301 | 0.278 | 0.934 | – | -0.190 | 0.172 | 0.158 | 0.926 | – | 0.484 | 0.302 | 0.277 | 0.929 | – | -0.263 | 0.176 | 0.160 | 0.925 | – |
| | EPPL | 0.693 | 0.229 | 0.222 | 0.941 | – | -0.194 | 0.121 | 0.123 | 0.948 | – | 0.498 | 0.226 | 0.222 | 0.936 | – | -0.258 | 0.127 | 0.124 | 0.946 | – |
| | Estimated QIF | 0.682 | 0.170 | 0.158 | 0.937 | 1.81 | -0.198 | 0.090 | 0.085 | 0.932 | 1.82 | 0.495 | 0.170 | 0.157 | 0.930 | 1.76 | -0.263 | 0.087 | 0.085 | 0.947 | 2.13 |
| 2 | Validation-only QIF | 0.688 | 0.315 | 0.291 | 0.938 | – | -0.189 | 0.179 | 0.166 | 0.932 | – | 0.482 | 0.315 | 0.291 | 0.921 | – | -0.267 | 0.184 | 0.168 | 0.926 | – |
| | EPPL | 0.693 | 0.229 | 0.222 | 0.941 | – | -0.194 | 0.121 | 0.123 | 0.948 | – | 0.502 | 0.225 | 0.222 | 0.950 | – | -0.262 | 0.123 | 0.124 | 0.961 | – |
| | Estimated QIF | 0.677 | 0.223 | 0.206 | 0.932 | 1.05 | -0.195 | 0.117 | 0.113 | 0.937 | 1.08 | 0.490 | 0.220 | 0.206 | 0.938 | 1.05 | -0.266 | 0.117 | 0.114 | 0.943 | 1.10 |
| , censoring rate 40% | | | | | | | | | | | | | | | | | | | | | |
| 0.25 | Validation-only QIF | 0.687 | 0.368 | 0.342 | 0.933 | – | -0.202 | 0.203 | 0.195 | 0.946 | – | 0.449 | 0.375 | 0.349 | 0.919 | – | -0.270 | 0.220 | 0.202 | 0.934 | – |
| | EPPL | 0.702 | 0.273 | 0.269 | 0.953 | – | -0.195 | 0.142 | 0.148 | 0.956 | – | 0.490 | 0.265 | 0.274 | 0.953 | – | -0.260 | 0.155 | 0.153 | 0.950 | – |
| | Estimated QIF | 0.684 | 0.213 | 0.203 | 0.945 | 1.64 | -0.204 | 0.113 | 0.110 | 0.937 | 1.59 | 0.477 | 0.210 | 0.206 | 0.936 | 1.58 | -0.266 | 0.119 | 0.114 | 0.931 | 1.71 |
| 0.5 | Validation-only QIF | 0.684 | 0.378 | 0.348 | 0.931 | – | -0.199 | 0.209 | 0.199 | 0.940 | – | 0.461 | 0.383 | 0.356 | 0.927 | – | -0.272 | 0.219 | 0.206 | 0.939 | – |
| | EPPL | 0.702 | 0.273 | 0.269 | 0.953 | – | -0.195 | 0.142 | 0.148 | 0.956 | – | 0.498 | 0.271 | 0.274 | 0.948 | – | -0.260 | 0.155 | 0.153 | 0.948 | – |
| | Estimated QIF | 0.684 | 0.237 | 0.226 | 0.941 | 1.32 | -0.199 | 0.126 | 0.124 | 0.940 | 1.28 | 0.483 | 0.238 | 0.231 | 0.947 | 1.30 | -0.266 | 0.130 | 0.128 | 0.947 | 1.44 |
| 2 | Validation-only QIF | 0.680 | 0.387 | 0.357 | 0.933 | – | -0.200 | 0.214 | 0.205 | 0.935 | – | 0.481 | 0.411 | 0.366 | 0.913 | – | -0.276 | 0.225 | 0.212 | 0.932 | – |
| | EPPL | 0.702 | 0.273 | 0.269 | 0.953 | – | -0.195 | 0.142 | 0.148 | 0.956 | – | 0.508 | 0.282 | 0.275 | 0.939 | – | -0.260 | 0.152 | 0.153 | 0.950 | – |
| | Estimated QIF | 0.683 | 0.271 | 0.259 | 0.942 | 1.01 | -0.200 | 0.142 | 0.142 | 0.950 | 1.01 | 0.491 | 0.282 | 0.264 | 0.932 | 1.00 | -0.268 | 0.148 | 0.147 | 0.946 | 1.05 |
| , censoring rate 80% | | | | | | | | | | | | | | | | | | | | | |
| 0.25 | Validation-only QIF | 0.672 | 0.404 | 0.397 | 0.948 | – | -0.213 | 0.241 | 0.229 | 0.947 | – | 0.477 | 0.471 | 0.418 | 0.912 | – | -0.283 | 0.259 | 0.243 | 0.938 | – |
| | EPPL | 0.701 | 0.311 | 0.299 | 0.940 | – | -0.200 | 0.164 | 0.166 | 0.955 | – | 0.506 | 0.323 | 0.314 | 0.942 | – | -0.262 | 0.169 | 0.176 | 0.958 | – |
| | Estimated QIF | 0.686 | 0.283 | 0.272 | 0.936 | 1.20 | -0.206 | 0.150 | 0.151 | 0.948 | 1.18 | 0.494 | 0.305 | 0.286 | 0.938 | 1.12 | -0.270 | 0.162 | 0.160 | 0.948 | 1.09 |
| 0.5 | Validation-only QIF | 0.670 | 0.412 | 0.401 | 0.938 | – | -0.213 | 0.245 | 0.232 | 0.952 | – | 0.473 | 0.476 | 0.422 | 0.922 | – | -0.283 | 0.267 | 0.246 | 0.934 | – |
| | EPPL | 0.701 | 0.311 | 0.299 | 0.940 | – | -0.200 | 0.164 | 0.166 | 0.955 | – | 0.510 | 0.328 | 0.314 | 0.943 | – | -0.267 | 0.172 | 0.176 | 0.957 | – |
| | Estimated QIF | 0.683 | 0.299 | 0.286 | 0.942 | 1.08 | -0.207 | 0.160 | 0.159 | 0.950 | 1.04 | 0.494 | 0.327 | 0.301 | 0.931 | 1.01 | -0.276 | 0.172 | 0.168 | 0.947 | 1.00 |
| 2 | Validation-only QIF | 0.672 | 0.420 | 0.405 | 0.933 | – | -0.215 | 0.247 | 0.234 | 0.947 | – | 0.474 | 0.458 | 0.425 | 0.935 | – | -0.278 | 0.274 | 0.248 | 0.934 | – |
| | EPPL | 0.701 | 0.311 | 0.299 | 0.940 | – | -0.200 | 0.164 | 0.166 | 0.955 | – | 0.521 | 0.316 | 0.314 | 0.948 | – | -0.263 | 0.183 | 0.176 | 0.948 | – |
| | Estimated QIF | 0.687 | 0.313 | 0.295 | 0.930 | 0.99 | -0.207 | 0.165 | 0.165 | 0.947 | 0.99 | 0.499 | 0.324 | 0.310 | 0.941 | 0.96 | -0.271 | 0.187 | 0.174 | 0.940 | 0.96 |
The cluster size , validation proportion , association parameter
Performance of inference method
We also conduct simulation studies to assess the performance of the proposed chi-squared test. The data are generated from the same model as in Simulation Study (1) with , and the censoring rate is 10%. First, we are interested in testing versus . Since the dimension of is 1, the test statistic T in (4.1) asymptotically follows a chi-squared distribution with one degree of freedom, where is calculated by minimizing with exchangeable or AR(1) working correlation. Figure 1 shows the Q–Q plots based on 1000 replications. The plots indicate close agreement with the chi-squared distribution with one degree of freedom for both the exchangeable and the AR(1) working correlation. We also examine the power of the proposed test under . The powers at significance level are calculated when takes different values in [0.3, 0.693]. According to the simulation results, when , i.e., when the alternative hypothesis collapses into the null hypothesis, the empirical rejection rates are 0.051 and 0.061 for the exchangeable and AR(1) working correlations, respectively. This indicates that the proposed chi-squared test approximately attains the nominal level. Figure 2 plots the power functions of the chi-squared test for the estimated QIF method and the QIF method with two different working correlations, and for the EPPL method. The power functions decrease rapidly as gets closer to the true value (0.693). The power function for the exchangeable working correlation is always larger than that for the AR(1) working correlation, so the test with the correct working correlation is more powerful than the one with a misspecified working correlation. Nonetheless, the powers of the chi-squared test for either of the estimated QIF methods are larger than those of the other two methods. In addition, the power of the EPPL method, which utilizes the auxiliary information, is larger than that of the QIF method based on the validation set only. Similarly, we also consider the hypothesis test that , and , and compute the powers under when varies in . Similar results are obtained but are not presented in this paper owing to space limitations.
Fig. 1.
Q–Q plot for the test statistic versus under for 1000 replications
Fig. 2.
Power functions of the chi-squared test for the proposed method, the EPPL method and the QIF method under
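The level and power summaries reported for the chi-squared test can be computed from a vector of replicated test statistics along the following lines. This is a minimal sketch, not the authors' code: it assumes a one-dimensional hypothesis (reference distribution chi-squared with one degree of freedom) and a 0.05 significance level, and the array `null_stats` is a stand-in drawn from that reference distribution, since computing T itself requires minimizing the QIF, which is not reproduced here. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def chi2_1_pvalue(t):
    """Upper-tail probability P(chi2_1 > t), using chi2_1 = Z^2 with Z standard normal."""
    return math.erfc(math.sqrt(t / 2.0))


def chi2_1_quantile(p):
    """Quantile of the chi2_1 distribution at probability p in (0, 1)."""
    return NormalDist().inv_cdf((1.0 + p) / 2.0) ** 2


def rejection_rate(stats, alpha=0.05):
    """Share of replications rejected at level alpha (empirical size or power)."""
    return sum(chi2_1_pvalue(t) < alpha for t in stats) / len(stats)


def qq_points(stats):
    """(theoretical quantile, ordered statistic) pairs for a Q-Q plot."""
    n = len(stats)
    return [(chi2_1_quantile((i + 0.5) / n), t) for i, t in enumerate(sorted(stats))]


def monte_carlo_se(rate, n_rep):
    """Binomial standard error of an empirical rejection rate."""
    return math.sqrt(rate * (1.0 - rate) / n_rep)


# Stand-in for 1000 replications of T under the null hypothesis (assumption).
rng = random.Random(2024)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(1000)]

print("empirical size:", rejection_rate(null_stats))
print("Monte Carlo SE at the nominal level:", monte_carlo_se(0.05, 1000))
print("first Q-Q point:", qq_points(null_stats)[0])
```

With 1000 replications the Monte Carlo standard error at a nominal level of 0.05 is about 0.007, so empirical rejection rates such as 0.051 and 0.061 both lie within roughly two standard errors of 0.05.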
Analysis of SOLVD data
In this section, we apply the proposed method to data from the Studies of Left Ventricular Dysfunction (SOLVD 1991) prevention trial. The SOLVD study was a randomized, double-masked, placebo-controlled trial conducted between 1986 and 1991, with a three-year recruitment period and a two-year follow-up. The basic inclusion criteria for the prevention trial were: age between 21 and 80 years, inclusive; no overt symptoms of congestive heart failure; and left ventricular EF less than 35%. EF is a number between 0 and 100 that measures the efficiency of the heart in ejecting blood. A total of 4228 patients with asymptomatic left ventricular dysfunction were randomly assigned to receive either enalapril or placebo at one of the 83 hospitals linked to 23 centers in the United States, Canada, and Belgium. Liu et al. (2009) and Liu et al. (2010) analyzed these data without considering the intra-cluster correlation.
The primary clinical issues of interest are the effects of covariates on the risk of heart failure and on the first nonfatal myocardial infarction (MI) after adjusting for the confounding variables. The covariates of interest are ejection fraction, patient’s gender (SEX), which is coded 1 for male and 0 for female, treatment (TRT), which is coded 1 for enalapril and 0 for placebo, and patient’s age (AGE), which is measured in years. In the SOLVD study, the covariates SEX, TRT and AGE were recorded for almost all of the patients, but only 108 among the total of 4228 patients have their ejection fraction accurately measured using a standardized radionucleotide technique (LVEF). A related nonstandardized measure (EF) was ascertained for all the patients. Therefore, the nonstandardized measure (EF) is a surrogate measure for the standardized measure (LVEF) in this case.
In terms of the notation in the previous sections, we set , where k denotes failure type with for heart failure and for nonfatal MI and i denotes the patient with . Let be the unknown regression coefficients, we fit the following marginal hazards model to the SOLVD data:
Since the primary covariate LVEF is continuous and severely incomplete, we need to estimate the induced relative risk function using the validation set. Furthermore, Liu et al. (2009) found that, given EF, the LVEF is conditionally independent of the rest of the covariates, thus can be estimated through (2.8) with .
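To make the idea of estimating an induced relative risk from the validation set concrete, the sketch below computes a kernel-weighted (Nadaraya–Watson type) average of exp(beta * x) over validation subjects who are still at risk, with weights centred at a given value of the auxiliary covariate. It is an illustration under stated assumptions rather than a reproduction of (2.8): a single continuous auxiliary covariate, a scalar coefficient, an Epanechnikov kernel (non-negative, bounded, finite support), and a hypothetical data layout of `(x, w, obs_time)` tuples.

```python
import math


def epanechnikov(u):
    """Epanechnikov kernel: non-negative, bounded, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def induced_relative_risk(beta, w, t, validation, bandwidth):
    """Kernel-smoothed estimate of E[exp(beta * X) | W = w, still at risk at t].

    validation: list of (x, w_i, obs_time) tuples for subjects whose true
    covariate x was measured (hypothetical layout).
    Returns None when no at-risk validation subject lies inside the window.
    """
    num = 0.0
    den = 0.0
    for x, w_i, obs_time in validation:
        if obs_time < t:
            continue  # subject has already failed or been censored before t
        k = epanechnikov((w - w_i) / bandwidth)
        num += k * math.exp(beta * x)
        den += k
    return num / den if den > 0.0 else None


# Toy validation sample (made-up numbers, for illustration only).
toy_validation = [(0.30, 0.28, 5.0), (0.25, 0.27, 2.0), (0.35, 0.40, 6.0)]
print(induced_relative_risk(2.0, 0.29, 1.0, toy_validation, 0.05))
```

For subjects outside the validation set, an estimate of this kind would replace the unobserved exp(beta * X) term in the risk-set sums; the choice of bandwidth governs the usual bias and variance trade-off.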
Table 5 presents the data analysis results for two methods: the proposed estimated QIF method, and the EPPL method, which ignores the intra-cluster correlation between the failure times. The parameter estimates from the two methods are close, but the proposed estimated QIF method has smaller standard errors. To test whether the covariates have significant effects on the times to heart failure and nonfatal MI, we calculate p-values from both the two-sided Z-test and the chi-squared test for the two methods. At the 0.05 significance level, all of the covariates are statistically significant for heart failure under the proposed method, whereas SEX is not significant under the EPPL method. Under both methods, only TRT is significant for the risk of nonfatal MI. Interpreting each coefficient through the hazard ratio exp(coefficient), the proposed method gives the following for heart failure: the risk decreases by 3.92% (95% CI [1.83%, 5.97%]) with each 1% increase in LVEF, the risk increases by 2.63% (95% CI [1.43%, 3.85%]) per one-year increase in age, males have about 29.46% (95% CI [7.19%, 46.39%]) lower risk than females, and enalapril reduces the risk by 33.77% (95% CI [20.52%, 44.80%]).
Table 5.
SOLVD data analysis results
| Covariate | Proposed: Coef | Proposed: SE | Proposed: P value (Z-test) | Proposed: P value (χ²-test) | EPPL: Coef | EPPL: SE | EPPL: P value (Z-test) | EPPL: P value (χ²-test) |
|---|---|---|---|---|---|---|---|---|
| For heart failure | | | | | | | | |
| LVEF | -0.040 | 0.011 | < 0.001 | < 0.001 | -0.045 | 0.012 | < 0.001 | < 0.001 |
| TRT | -0.412 | 0.093 | < 0.001 | < 0.001 | -0.454 | 0.106 | < 0.001 | < 0.001 |
| SEX | -0.349 | 0.140 | 0.013 | < 0.001 | -0.318 | 0.165 | 0.054 | 0.072 |
| AGE | 0.026 | 0.006 | < 0.001 | < 0.001 | 0.023 | 0.012 | 0.045 | 0.009 |
| For nonfatal MI | | | | | | | | |
| LVEF | 0.006 | 0.011 | 0.566 | 0.534 | 0.023 | 0.015 | 0.111 | 0.102 |
| TRT | -0.433 | 0.116 | < 0.001 | < 0.001 | -0.391 | 0.131 | 0.003 | 0.002 |
| SEX | 0.084 | 0.196 | 0.669 | 0.294 | 0.048 | 0.214 | 0.822 | 0.815 |
| AGE | 0.006 | 0.006 | 0.275 | 0.715 | 0.004 | 0.008 | 0.651 | 0.597 |
"Proposed method" refers to the estimated QIF method with exchangeable working correlation. "EPPL method" refers to the method that uses the independence working structure.
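The percentage changes in risk and the Z-test p-values quoted for heart failure follow from the coefficients and standard errors in Table 5 by standard Wald calculations. The sketch below shows that arithmetic; it assumes normal-based 95% intervals on the log-hazard scale and uses the rounded table entries as inputs, so results agree with the reported figures only up to rounding. The function names are illustrative.

```python
import math
from statistics import NormalDist


def percent_change(coef, se, level=0.95):
    """Percent change in hazard per unit covariate increase, with a Wald CI.

    Returns (estimate, lower, upper) in percent; negative values are reductions.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)

    def to_pct(b):
        return 100.0 * (math.exp(b) - 1.0)

    return to_pct(coef), to_pct(coef - z * se), to_pct(coef + z * se)


def wald_pvalue(coef, se):
    """Two-sided Z-test p-value for H0: coefficient = 0."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(coef / se)))


# Heart-failure rows of Table 5, proposed method: (Coef, SE).
heart_failure = {
    "LVEF": (-0.040, 0.011),
    "TRT": (-0.412, 0.093),
    "SEX": (-0.349, 0.140),
    "AGE": (0.026, 0.006),
}

for name, (coef, se) in heart_failure.items():
    est, low, high = percent_change(coef, se)
    print(f"{name}: {est:.2f}% [{low:.2f}%, {high:.2f}%], p = {wald_pvalue(coef, se):.3f}")
```

For example, the LVEF coefficient of -0.040 corresponds to a hazard ratio of exp(-0.040), i.e. about a 3.92% reduction in risk per 1% increase in LVEF.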
To illustrate the prediction of the survival probability for a subject, Fig. 3 shows the estimated survival curves of heart failure and nonfatal MI for a 69-year-old male patient with LVEF of 28% (the median of LVEF), receiving enalapril. The survival curves by the two methods are very close, but the pointwise confidence intervals from the proposed method are narrower than those from the EPPL method.
Fig. 3.
Survival curves by the proposed method (bold curve) and the EPPL method (thin curve) for heart failure and nonfatal myocardial infarction of a subject with covariates and , along with the corresponding 95% pointwise confidence intervals (dotted curves)
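The predicted survival curve for a given subject can be illustrated with the usual proportional hazards relation S(t | z) = exp{-Λ0(t) exp(β'z)}. The sketch below evaluates the linear predictor for the subject described above using the heart-failure coefficients from Table 5. It is a hedged illustration: the cumulative baseline hazard values are hypothetical placeholders (the fitted baseline is not reported in the text), covariates are assumed to enter uncentred, and the pointwise confidence intervals are not reproduced.

```python
import math

# Heart-failure coefficients from Table 5 (proposed method).
HF_COEF = {"LVEF": -0.040, "TRT": -0.412, "SEX": -0.349, "AGE": 0.026}


def linear_predictor(coef, covariates):
    """Inner product of coefficients and covariate values."""
    return sum(coef[name] * covariates[name] for name in coef)


def survival_probability(cum_baseline_hazard, coef, covariates):
    """S(t | z) = exp(-Lambda0(t) * exp(beta' z)) under proportional hazards."""
    risk = math.exp(linear_predictor(coef, covariates))
    return math.exp(-cum_baseline_hazard * risk)


# 69-year-old male receiving enalapril with LVEF of 28%.
subject = {"LVEF": 28.0, "TRT": 1, "SEX": 1, "AGE": 69.0}

# Hypothetical cumulative baseline hazards at a few follow-up times (assumption).
hypothetical_baseline = {1.0: 0.05, 2.0: 0.11, 3.0: 0.18}

print("linear predictor:", linear_predictor(HF_COEF, subject))
for time, cum_hazard in hypothetical_baseline.items():
    print(time, survival_probability(cum_hazard, HF_COEF, subject))
```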
Concluding remarks
We proposed an estimated QIF approach for multivariate failure time data when the primary covariate is ascertained only on a subset of the full cohort but auxiliary information is available for the full cohort. Our approach allows the censoring times for different failure types of a subject to differ. In practice, the censoring times for different failure types of a subject are usually the same; this is a special case of our general setup, so the proposed method remains applicable.
In this article, we consider the situation in which the auxiliary variable is continuous. The method is based on the kernel smoothing technique and is therefore nonparametric with respect to the association between the missing covariate and the corresponding auxiliary variable. The QIF method has the advantage of incorporating intra-cluster correlation into the estimation procedure. Compared with existing methods that do not consider intra-cluster correlation (e.g., Liu et al. 2010), the proposed procedure can improve estimation efficiency without requiring the correlation structure to be specified. Another advantage of the proposed method is that it is easy to implement. In this work, we consider the situation in which the dimension of the continuous auxiliary covariates is low; further research is needed for the high-dimensional case.
Acknowledgements
We are grateful to the associate editor and the reviewers for their detailed and constructive comments, which led to improvements in the paper. This research is supported in part by the National Key Research and Development Project of China No. 2018YFC1314603 (Liu), the National Science Foundation of China (NSFC) Grant 11771366 (Yan, Liu), and the U.S. National Institutes of Health Grants P01 CA142538 (Cai, Zhou), P42ES031007 (Zhou) and P30 ES010126 (Zhou).
Appendix
For , let , and . For , some quantities are defined as follows:
For , define by substituting and for , and in , respectively. Furthermore, we also define
We introduce some conditions to ensure the consistency and asymptotic normality of and as follows:
, for .
, for all .
There exists a compact set , containing as its interior point.
-
Multivariate kernel function is non-negative and uniformly bounded with finite support satisfying that and . Furthermore, has order in the sense that , where b is the dimension of , with s being non-negative integers, .
For , the bandwidth matrix satisfies that and .
Let be as in (C4), has the th continuous derivative with respect to , where is the joint distribution function of for given t.
- There exist scalar, vector and matrix functions , such that for all and all constant matrix B,
in probability.
- Let be as in Condition (C6), and set . Then for all , , and ,
and is bounded away from 0 on . For all basis matrix B, the matrix
(A.1)
is negative definite.
- There exists a matrix function , such that for any constant matrices ,
in probability uniformly for , where
Furthermore, for any set of basis matrices , the matrix
(A.2)
is positive definite.
- The baseline hazard rates are twice continuously differentiable on .
Proof of Theorem 1
Consistency
Following the arguments of Xue et al. (2010), one can show that is consistent for provided that:
- (i) exists and is continuous, and it converges in probability to a fixed function, say , uniformly for ;
- (ii) in probability;
- (iii) converges in probability to a constant matrix uniformly for ;
- (iv) is positive definite with probability going to 1 as .
According to the first conclusion of Lemma 1 in Liu et al. (2010), under conditions (C2), (C4) and (C5), for , we can prove that in probability
Since the kernel smoothed estimator in (2.7) is a consistent estimator for , then, by the definitions of and and the remaining conditions, it follows that
in probability for .
Denote
then .
After simple algebraic manipulations, we obtain that
where , and
(7.3)
with
Clearly, is continuous. For any constant matrix B, the first term on the right-hand side of (7.3) is a local square-integrable martingale; hence, by the Lenglart inequality, it converges to zero in probability, uniformly for . Letting denote the uniform limit of the second term, we can show that
in probability, uniformly in . Thus, converges to
in probability, uniformly for . Therefore, (i) is satisfied.
Denote
and
When , by Lenglart inequality, we can show that
where . From the arguments in Zhou and Wang (2000), it can be shown that
where
Note that if . Then we have
where is a mean-zero martingale and . Then, by the strong law of large numbers, converges to zero with probability 1; thus (ii) is satisfied.
Denote
then we can obtain
and
where
As in Xue et al. (2010), by condition (C8), we can show that converges in probability to , and converges in probability to uniformly for ; hence (iii) and (iv) are satisfied.
Asymptotic normality
Define , from the previous statements, for any constant matrix B, we can obtain that is asymptotically equivalent to , which is a sum of independent p-vector random variables with mean zero and variance . By condition (C8) and the multivariate central limit theorem, we can show that converges in distribution to a normal random vector with mean zero and covariance matrix , and converges in distribution to a normal random vector, denoted as , with mean zero and covariance matrix as defined in (A.2).
By Taylor expansion of around the true parameter , we have
where is between and . From the argument in the proof of Theorem 1 and the consistency of and converge to and in probability, respectively. Hence, we can obtain that in distribution
References
- Breslow NE. Discussion of the paper by D. R. Cox. J R Stat Soc Ser B. 1972;34:216–217.
- Cai JW, Prentice RL. Estimating equations for hazard ratio parameters based on correlated failure time data. Biometrika. 1995;82:151–164. doi: 10.1093/biomet/82.1.151.
- Cai JW, Prentice RL. Regression estimation using multivariate failure time data and a common baseline hazard function model. Lifetime Data Anal. 1997;3:197–213. doi: 10.1023/A:1009613313677.
- Cai JW, Shen Y. Permutation tests for comparing marginal survival functions with clustered failure time data. Stat Med. 2000;19:2963–2973. doi: 10.1002/1097-0258(20001115)19:21<2963::AID-SIM593>3.0.CO;2-H.
- Clayton D, Cuzick J. Multivariate generalizations of the proportional hazard model. J R Stat Soc Ser A. 1985;148:82–117. doi: 10.2307/2981943.
- Fan Z, Wang X. Marginal hazards model for multivariate failure time data with auxiliary covariates. J Nonparametr Stat. 2009;21:771–786. doi: 10.1080/10485250902915903.
- Greene WF, Cai JW. Measurement error in covariates in the marginal hazards model for multivariate failure time data. Biometrics. 2004;60:987–996. doi: 10.1111/j.0006-341X.2004.00254.x.
- Hu C, Lin DY. Semiparametric failure time regression with replicates of mismeasured covariates. J Am Stat Assoc. 2004;99:105–118. doi: 10.1198/016214504000000197.
- Liu Y, Zhou H, Cai JW. Estimated pseudopartial-likelihood method for correlated failure time data with auxiliary covariates. Biometrics. 2009;65:1184–1193. doi: 10.1111/j.1541-0420.2009.01198.x.
- Liu Y, Wu Y, Zhou H. Multivariate failure times regression with a continuous auxiliary covariate. J Multivar Anal. 2010;101:679–691. doi: 10.1016/j.jmva.2009.09.008.
- Liu Y, Yuan Z, Cai JW, Zhou H. Marginal hazard regression for correlated failure time data with auxiliary covariates. Lifetime Data Anal. 2012;18:116–138. doi: 10.1007/s10985-011-9209-x.
- Qu A, Lindsay BG, Li B. Improving generalised estimating equations using quadratic inference functions. Biometrika. 2000;87:823–836. doi: 10.1093/biomet/87.4.823.
- SOLVD Investigators. Effect of enalapril on survival in patients with reduced left ventricular ejection fractions and congestive heart failure. N Engl J Med. 1991;325:293–302.
- Wei LJ, Lin DY, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc. 1989;84:1065–1073. doi: 10.1080/01621459.1989.10478873.
- Xue L, Wang L, Qu A. Incorporating correlation for multivariate failure time data when cluster size is large. Biometrics. 2010;66:393–404. doi: 10.1111/j.1541-0420.2009.01307.x.
- Zhou H, Wang C-Y. Failure time regression with continuous covariates measured with error. J R Stat Soc Ser B. 2000;62:657–665. doi: 10.1111/1467-9868.00255.



