Author manuscript; available in PMC: 2014 Jan 15.
Published in final edited form as: Biometrika. 2013 May 14;100(3):741–755. doi: 10.1093/biomet/ast012

Kernel Smoothed Profile Likelihood Estimation in the Accelerated Failure Time Frailty Model for Clustered Survival Data

Bo Liu 1, Wenbin Lu 2, Jiajia Zhang 3
PMCID: PMC3893096  NIHMSID: NIHMS480514  PMID: 24443587

Summary

Clustered survival data frequently arise in biomedical applications, where event times of interest are clustered into groups such as families. In this article we consider an accelerated failure time frailty model for clustered survival data and develop nonparametric maximum likelihood estimation for it via a kernel smoother aided EM algorithm. We show that the proposed estimator for the regression coefficients is consistent, asymptotically normal and semiparametric efficient when the kernel bandwidth is properly chosen. An EM-aided numerical differentiation method is derived for estimating its variance. Simulation studies evaluate the finite sample performance of the estimator, and it is applied to the Diabetic Retinopathy data set.

Keywords: Accelerated failure time model, Clustered survival data, EM algorithm, Kernel smoothing, Profile likelihood estimation

1. Introduction

Clustered survival data are a common type of multivariate survival data, often encountered in fields such as medicine, economics and epidemiology. Because multivariate survival models are important tools for analyzing clustered survival data, they have attracted considerable attention. There are two main approaches: marginal modelling and joint modelling via random effects. The first approach models the marginal distribution of correlated failure times without specifying the correlation structure. For example, Wei et al. (1989) proposed marginal regression analysis based on the proportional hazards model (Cox, 1972) for multivariate failure time data. A review of marginal approaches based on the proportional hazards model can be found in Lin (1994). In some applications, such as family studies, the within-cluster association is also important to investigators. Joint modelling uses random effects to describe the association among failure times within clusters. In addition, by appropriately taking into account the correlation structure, joint modelling can have better estimation efficiency than the marginal approach. Clayton & Cuzick (1985) introduced cluster-specific random effects or frailties to the proportional hazards model, which assumes that subjects within the same cluster can be considered independent conditional on the frailty. The multiplicative proportional hazards frailty model has been widely studied (Hougaard, 1987; Oakes, 1989; Nielsen et al., 1992) and various frailty distributions have been used to describe the within cluster correlation. The large sample properties of the associated nonparametric maximum likelihood estimators have been investigated by Murphy (1994, 1995) and Parner (1998).

The accelerated failure time model (Kalbfleisch & Prentice, 2002) is a useful alternative to the proportional hazards model. Many methods have been developed for parameter estimation in the accelerated failure time model for univariate survival data (Buckley & James, 1979; Tsiatis, 1990; Ying, 1993; Jin et al., 2003; Zeng & Lin, 2007). Recently, this model has been extended to clustered survival data. For example, Jin et al. (2006a,b) considered the marginal accelerated failure time model for clustered survival data: the former extended the Buckley-James estimation method; the latter extended the weighted log-rank estimation method. To improve the efficiency of the marginal approach, Li & Yin (2009) proposed a generalized moments estimation method, incorporating a posited correlation matrix into the rank-based estimating equations and minimizing a quadratic inference function. In addition, Johnson & Strawderman (2009) applied the induced smoothing technique to the weighted log-rank estimators for clustered survival data, which facilitates the resulting estimation and inference procedures. To characterize the correlation structure of failure times within clusters, Pan (2001) proposed to use frailties in the accelerated failure time model and developed an EM-like algorithm to estimate the coefficients in the accelerated failure time frailty model. Based on Pan’s method, Zhang & Peng (2007) and Xu & Zhang (2010) developed more stable estimation procedures using M-estimation and rank-based estimation, respectively. More recently, Johnson & Strawderman (2012) introduced smoothing into the EM-like algorithm to facilitate parameter estimation. However, none of the above estimators are semiparametric efficient because the considered EM-like algorithms do not maximize the likelihood function. Moreover, the asymptotic properties of these estimators have not been studied. 
In this article, we develop a nonparametric maximum likelihood estimation method for the accelerated failure time frailty model.

2. The accelerated failure time frailty model

Let Tij be the failure time, Cij be the censoring time and Xij be the p-dimensional vector of baseline covariates for the jth individual in the ith cluster, for i = 1, …, n and j = 1, …, mi. Here n is the total number of clusters and mi is the size of the ith cluster. The observed data are $O = \{(\tilde T_{ij}, \delta_{ij}, X_{ij}) : i = 1, \ldots, n;\ j = 1, \ldots, m_i\}$, where $\tilde T_{ij} = \min(T_{ij}, C_{ij})$ and $\delta_{ij} = I(T_{ij} \le C_{ij})$.

The marginal accelerated failure time model is

$$\log T_{ij} = -\beta' X_{ij} + \varepsilon_{ij}, \tag{1}$$

where β is the p-dimensional vector of regression coefficients, and the error terms, (εi1, …, εimi), are independent across clusters and independent of (Xi1, …, Ximi). It is assumed that all εij have a common unknown marginal distribution, and εij and εik may be correlated for j ≠ k. We assume that Tij and Cij are independent conditional on Xij, and mi is small compared to n and is noninformative, i.e., independent of Tij, Cij and Xij.

To describe the dependence between clustered survival times, Pan (2001) proposed to consider the accelerated failure time frailty model. Specifically, given a positive latent variable $\alpha_i$ of mean 1 and variance $\sigma^2$, it is assumed that the hazard function of $e^{\varepsilon_{ij}}$ is

$$\lambda_{ij}(t) = \alpha_i \lambda(t) \quad (i = 1, \ldots, n;\ j = 1, \ldots, m_i), \tag{2}$$

where λ(·) is an unspecified baseline hazard function. In addition, εi1, …, εimi are assumed independent conditional on αi and the magnitude of dependence among the εij is characterized by the value of $\sigma^2$. There are many choices for the frailty distribution, e.g., the gamma distribution (Clayton, 1978), the positive stable distribution (Hougaard, 1986), the compound Poisson distribution (Aalen, 1992) and the log-normal distribution (McGilchrist & Aisbett, 1991).

3. Nonparametric maximum likelihood estimator

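Let $f_\alpha(\cdot\,; \theta)$ denote the density of the latent variable $\alpha_i$, where θ is an unknown finite dimensional vector of parameters. The log-likelihood function for the complete data, $\{(\tilde T_{ij}, \delta_{ij}, X_{ij}, \alpha_i) : i = 1, \ldots, n;\ j = 1, \ldots, m_i\}$, can be written as

$$l_n^c(\beta, \Lambda, \theta) = l_{n,1}^c(\theta) + l_{n,2}^c(\beta, \Lambda),$$

where $\Lambda(t) = \int_0^t \lambda(s)\,ds$, $R_{ij}(\beta) = \log(\tilde T_{ij}) + \beta' X_{ij}$,

$$l_{n,1}^c(\theta) = \frac{1}{n}\sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} \log \alpha_i + \frac{1}{n}\sum_{i=1}^n \log f_\alpha(\alpha_i; \theta), \tag{3}$$
$$l_{n,2}^c(\beta, \Lambda) = \frac{1}{n}\sum_{i=1}^n \sum_{j=1}^{m_i} \left[\delta_{ij}\{\beta' X_{ij} + \log \lambda(e^{R_{ij}(\beta)})\} - \alpha_i \Lambda(e^{R_{ij}(\beta)})\right]. \tag{4}$$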

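We use an EM algorithm to obtain the nonparametric maximum likelihood estimator. Let $\hat\Omega^{[k]} = (\hat\beta^{[k]}, \hat\Lambda^{[k]}, \hat\theta^{[k]})$ denote the parameter estimates at step k. In the expectation step, we obtain the conditional density of $\alpha_i$ given the observed data O and current parameter estimates $\hat\Omega^{[k]}$,

$$f_\alpha(\alpha_i \mid O, \hat\Omega^{[k]}) = \frac{f_\alpha(\alpha_i; \hat\theta^{[k]})\, \alpha_i^{\sum_{j=1}^{m_i} \delta_{ij}} \exp\{-\alpha_i \sum_{j=1}^{m_i} \hat\Lambda^{[k]}(e^{R_{ij}(\hat\beta^{[k]})})\}}{\int_0^{+\infty} f_\alpha(\alpha_i; \hat\theta^{[k]})\, \alpha_i^{\sum_{j=1}^{m_i} \delta_{ij}} \exp\{-\alpha_i \sum_{j=1}^{m_i} \hat\Lambda^{[k]}(e^{R_{ij}(\hat\beta^{[k]})})\}\, d\alpha_i}. \tag{5}$$

The conditional expectations $E(\alpha_i \mid O, \hat\Omega^{[k]})$, $E(\log \alpha_i \mid O, \hat\Omega^{[k]})$ and $E\{\log f_\alpha(\alpha_i; \theta) \mid O, \hat\Omega^{[k]}\}$ can be calculated as the integrals of the corresponding terms with respect to the conditional density $f_\alpha(\alpha_i \mid O, \hat\Omega^{[k]})$. For example, when the frailty has a gamma density $f_\alpha(x; \theta) = x^{\theta - 1} e^{-\theta x} \theta^{\theta} / \Gamma(\theta)$, where x > 0, θ > 0 and $\Gamma(\theta) = \int_0^{+\infty} t^{\theta - 1} e^{-t}\, dt$, we have

$$\begin{aligned}
\hat\alpha_i^{[k]} &\equiv E(\alpha_i \mid O, \hat\Omega^{[k]}) = (D_i + \hat\theta^{[k]}) \Big/ \Big\{\hat\theta^{[k]} + \sum_{j=1}^{m_i} \hat\Lambda^{[k]}(e^{R_{ij}(\hat\beta^{[k]})})\Big\}, \\
E_{2,i}^{[k]} &\equiv E(\log \alpha_i \mid O, \hat\Omega^{[k]}) = \psi(D_i + \hat\theta^{[k]}) - \log\Big\{\hat\theta^{[k]} + \sum_{j=1}^{m_i} \hat\Lambda^{[k]}(e^{R_{ij}(\hat\beta^{[k]})})\Big\}, \\
E_{3,i}^{[k]} &\equiv E\{\log f_\alpha(\alpha_i; \theta) \mid O, \hat\Omega^{[k]}\} = (\theta - 1) E_{2,i}^{[k]} - \theta \hat\alpha_i^{[k]} + \theta \log \theta - \log \Gamma(\theta),
\end{aligned}$$

where ψ(x) = Γ′(x)/Γ(x) is the digamma function. For general frailty distributions, such as the log-normal distribution, these conditional expectations may not have closed analytical forms. In such cases we use Gaussian quadrature. Therefore, the conditional expectations of (3) and (4) given O and $\hat\Omega^{[k]}$ are

$$E\{l_{n,1}^c(\theta) \mid O, \hat\Omega^{[k]}\} = \frac{1}{n}\sum_{i=1}^n E_{2,i}^{[k]} D_i + \frac{1}{n}\sum_{i=1}^n E_{3,i}^{[k]}, \tag{6}$$
$$E\{l_{n,2}^c(\beta, \Lambda) \mid O, \hat\Omega^{[k]}\} = \frac{1}{n}\sum_{i=1}^n \sum_{j=1}^{m_i} \left[\delta_{ij}\{\beta' X_{ij} + \log \lambda(e^{R_{ij}(\beta)})\} - \hat\alpha_i^{[k]} \Lambda(e^{R_{ij}(\beta)})\right], \tag{7}$$

where $D_i = \sum_{j=1}^{m_i} \delta_{ij}$ is the total number of observed events in cluster i.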
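The gamma-frailty expectations above are simple enough to evaluate directly. The following Python sketch is an illustration only, not the authors' implementation: the function names `digamma` and `gamma_e_step`, and the flat per-subject array layout with integer cluster labels, are assumptions made for this example. The digamma function is approximated by the standard recurrence and asymptotic series so that only NumPy is required.

```python
import math
import numpy as np

def digamma(x):
    """Digamma function for x > 0 via recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:                      # psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ log x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def gamma_e_step(delta, cum_haz, cluster, theta):
    """Posterior means of alpha_i and log(alpha_i) under a gamma frailty.

    delta   : 0/1 event indicators, one per subject
    cum_haz : current cumulative hazard at exp(R_ij(beta)), one per subject
    cluster : integer cluster labels 0..n-1, one per subject (assumed layout)
    theta   : current gamma parameter (mean 1, variance 1/theta)
    """
    n = int(cluster.max()) + 1
    D = np.bincount(cluster, weights=delta, minlength=n)      # events per cluster
    H = np.bincount(cluster, weights=cum_haz, minlength=n)    # summed cumulative hazard
    alpha_hat = (D + theta) / (theta + H)
    e_log_alpha = np.array([digamma(d + theta) for d in D]) - np.log(theta + H)
    return alpha_hat, e_log_alpha
```

For a non-gamma frailty such as the log-normal, these two quantities would instead be obtained by numerical integration against the conditional density in (5).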

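In the maximization step, equation (6) can be easily maximized using standard gradient-based optimization algorithms. Let $\hat\theta^{[k+1]}$ denote the maximizer of (6). The conditional log-likelihood given in (7) cannot be directly maximized over β and Λ. We adopt a kernel smoothing technique similar to that used by Zeng & Lin (2007). Specifically, consider the piecewise constant hazard function $\tilde\lambda(t) = \sum_{l=1}^{J_n} c_l I(t_{l-1} \le t < t_l)$ on [0, M], where $0 = t_0 < t_1 < \cdots < t_{J_n} = M$ are equally spaced, and M is an upper bound for all $e^{R_{ij}(\beta)}$. Accordingly, the cumulative hazard function is $\tilde\Lambda(t) = \sum_{l=1}^{J_n} c_l (t - t_{l-1}) I(t_{l-1} \le t < t_l) + (M/J_n) \sum_{l=1}^{J_n} c_l I(t \ge t_l)$. Using these expressions for $\tilde\lambda(\cdot)$ and $\tilde\Lambda(\cdot)$ in (7) and maximizing (7) with respect to $c_l$ ($l = 1, \ldots, J_n$), for fixed β, the following maximizers are obtained:

$$\hat c_l^{[k]} = \frac{\sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} I(t_{l-1} \le e^{R_{ij}(\beta)} < t_l)}{\sum_{i=1}^n \hat\alpha_i^{[k]} \sum_{j=1}^{m_i} \{(e^{R_{ij}(\beta)} - t_{l-1}) I(t_{l-1} \le e^{R_{ij}(\beta)} < t_l) + (M/J_n) I(e^{R_{ij}(\beta)} \ge t_l)\}}.$$

The profile likelihood function for β constructed using the above expression for $\hat c_l^{[k]}$ is

$$\begin{aligned}
l_{n,2}^{p,k}(\beta) ={}& \frac{1}{n}\sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} \beta' X_{ij} \\
&+ \sum_{l=1}^{J_n} \Big[\Big\{\frac{1}{n}\sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} I(t_{l-1} \le e^{R_{ij}(\beta)} < t_l)\Big\} \times \log\Big\{\frac{J_n}{nM}\sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} I(t_{l-1} \le e^{R_{ij}(\beta)} < t_l)\Big\}\Big] \\
&- \sum_{l=1}^{J_n} \Big(\Big\{\frac{1}{n}\sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} I(t_{l-1} \le e^{R_{ij}(\beta)} < t_l)\Big\} \\
&\qquad \times \log\Big[\frac{J_n}{nM}\sum_{i=1}^n \hat\alpha_i^{[k]} \sum_{j=1}^{m_i} \Big\{(e^{R_{ij}(\beta)} - t_{l-1}) I(t_{l-1} \le e^{R_{ij}(\beta)} < t_l) + \frac{M}{J_n} I(e^{R_{ij}(\beta)} \ge t_l)\Big\}\Big]\Big).
\end{aligned}$$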

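Following derivations similar to those given in Zeng & Lin (2007), $l_{n,2}^{p,k}(\beta)$ converges uniformly in β to a limiting function as $n \to \infty$, $J_n \to \infty$ and $J_n/n \to 0$, and the limiting function can be approximated by the smooth function

$$l_{n,2}^{s,k}(\beta) = \frac{1}{n}\sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij}\left[\beta' X_{ij} + \log\{\hat\lambda^{[k+1]}(e^{R_{ij}(\beta)}; \beta)\}\right], \tag{8}$$

where

$$\hat\lambda^{[k+1]}(t; \beta) = \frac{(t h_n)^{-1} \sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} K[\{R_{ij}(\beta) - \log(t)\}/h_n]}{\sum_{i=1}^n \hat\alpha_i^{[k]} \sum_{j=1}^{m_i} \int_{-\infty}^{\{R_{ij}(\beta) - \log(t)\}/h_n} K(u)\, du}, \quad t > 0. \tag{9}$$

Let $\hat\beta^{[k+1]}$ denote the maximizer of $l_{n,2}^{s,k}(\beta)$. Given $\hat\beta^{[k+1]}$, a smooth estimator of λ(t) can be obtained as $\hat\lambda^{[k+1]}(t; \hat\beta^{[k+1]})$ and $\hat\Lambda^{[k+1]}(t) \equiv \int_0^t \hat\lambda^{[k+1]}(s; \hat\beta^{[k+1]})\, ds$.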
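To make the kernel-smoothed hazard in (9) concrete, the following Python sketch evaluates it at a single time point. It is a minimal illustration under stated assumptions: a standard Gaussian kernel is used for K (this section does not fix a particular kernel), the frailty estimates are supplied per subject, and the function names are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def normal_pdf(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return np.exp(-0.5 * u ** 2) / math.sqrt(2.0 * math.pi)

def normal_cdf(u):
    """Integral of the Gaussian kernel from -infinity to u."""
    return 0.5 * (1.0 + _erf(u / math.sqrt(2.0)))

def smoothed_hazard(t, R, delta, alpha_subject, h):
    """Kernel-smoothed hazard of equation (9) at a scalar t > 0.

    R             : R_ij(beta) = log(observed time) + beta'X_ij, one per subject
    delta         : 0/1 event indicators, one per subject
    alpha_subject : frailty estimate of each subject's cluster, one per subject
    h             : bandwidth h_n
    """
    u = (R - math.log(t)) / h
    numerator = np.sum(delta * normal_pdf(u)) / (t * h)      # smoothed event intensity
    denominator = np.sum(alpha_subject * normal_cdf(u))      # smoothed, frailty-weighted risk set
    return numerator / denominator
```

With all frailty estimates set to 1, uncensored unit-rate exponential data and a small bandwidth, the value returned at t = 1 should be close to the true hazard of 1, which gives a quick sanity check on an implementation.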

From an initial estimator $\hat\Omega^{[0]}$, the E-step and M-step are repeated until convergence. The estimators of β, Λ(·) and θ at convergence are denoted by $\hat\beta_n$, $\hat\Lambda_n(\cdot)$ and $\hat\theta_n$, respectively. In our implementation, the starting value of β was taken to be the maximum smoothed profile likelihood estimator of Zeng & Lin (2007) assuming working independence among correlated failure times. An initial estimator for λ was obtained by setting k = −1 and $\hat\alpha_i^{[-1]} \equiv 1$ in (9). Based on our numerical experience, the convergence of the proposed EM algorithm is not sensitive to the choice of the starting value for θ. For convenience, we chose $\hat\theta^{[0]} = 1$ for all scenarios considered in our numerical studies. Define $\hat\gamma_n = (\hat\beta_n', \hat\theta_n')'$ and $\gamma_0 = (\beta_0', \theta_0')'$, the true value of γ. Let Λ0 and λ0 denote the true values of Λ and λ, respectively. Next, we establish the asymptotic properties for estimators $\hat\gamma_n$ and $\hat\Lambda_n$.

Theorem 1

Assume that the regularity conditions (C1)–(C8) given in the Appendix hold. As $n \to \infty$, $n h_n^2 \to \infty$ and $n h_n^4 \to 0$: (i) $\sup_{t \in [0, \tau]} |\hat\Lambda_n(t) - \Lambda_0(t)| \to 0$ and $\hat\gamma_n \to \gamma_0$ almost surely; and (ii) $n^{1/2}(\hat\gamma_n - \gamma_0)$ converges in distribution to a mean-zero normal random vector with a covariance matrix that achieves the semiparametric efficiency bound $I^{-1}$.

The proof of Theorem 1 is given in the Appendix. To estimate the variance of $\hat\beta_n$, we adopt the EM-aided numerical differentiation method proposed by Chen & Little (1999), which numerically computes the empirical Fisher information matrix of the profile likelihood. A similar method was used by Lu (2010) for variance estimation in the accelerated failure time model with a cure fraction. Specifically, write

$$E\{l_{n,1}^c(\theta) + l_{n,2}^c(\beta, \Lambda) \mid O, \Omega\} = \frac{1}{n}\sum_{i=1}^n l_i(\beta, \Lambda, \theta).$$

The jth component of $\hat\beta_n$ is perturbed by a small value, d. The pair of perturbed estimates is denoted by $\hat\beta_{n,j}^{-} = (\hat\beta_{n,1}, \ldots, \hat\beta_{n,j} - d, \ldots, \hat\beta_{n,p})'$ and $\hat\beta_{n,j}^{+} = (\hat\beta_{n,1}, \ldots, \hat\beta_{n,j} + d, \ldots, \hat\beta_{n,p})'$ for j = 1, …, p. With β fixed at $\hat\beta_{n,j}^{-}$, the above EM algorithm is run until convergence. The estimates of Λ and θ at convergence are denoted by $\hat\Lambda_{n,j}^{-}$ and $\hat\theta_{n,j}^{-}$, respectively. The estimates $\hat\Lambda_{n,j}^{+}$ and $\hat\theta_{n,j}^{+}$ can be similarly obtained. For i = 1, …, n and j = 1, …, p, define

$$S_{ij} = \{l_i(\hat\beta_{n,j}^{+}, \hat\Lambda_{n,j}^{+}, \hat\theta_{n,j}^{+}) - l_i(\hat\beta_{n,j}^{-}, \hat\Lambda_{n,j}^{-}, \hat\theta_{n,j}^{-})\}/(2d).$$

Let $S_i = (S_{i1}, \ldots, S_{ip})'$ and $I_n = \sum_{i=1}^n S_i S_i'$. Then, the covariance matrix of $\hat\beta_n$ can be estimated by $I_n^{-1}$.
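The numerical differentiation step can be sketched as follows. This Python fragment is illustrative rather than a reproduction of the authors' code: `cluster_loglik_at` is a hypothetical user-supplied callable that, for a fixed β, reruns the EM algorithm for Λ and θ and returns the n per-cluster contributions $l_i$ evaluated at the resulting estimates; the default step size `d` is likewise an assumption.

```python
import numpy as np

def em_aided_covariance(beta_hat, cluster_loglik_at, d=1e-3):
    """Estimate cov(beta_hat) by the inverse outer product of central-difference scores.

    beta_hat          : length-p array of estimated regression coefficients
    cluster_loglik_at : hypothetical callable, beta -> length-n array of l_i values
                        computed with Lambda and theta re-estimated at that beta
    d                 : perturbation size (illustrative default)
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    p = beta_hat.size
    columns = []
    for j in range(p):
        step = np.zeros(p)
        step[j] = d
        plus = cluster_loglik_at(beta_hat + step)
        minus = cluster_loglik_at(beta_hat - step)
        columns.append((plus - minus) / (2.0 * d))   # S_ij for all clusters i
    S = np.column_stack(columns)                     # n x p matrix of scores
    info = S.T @ S                                   # I_n = sum_i S_i S_i'
    return np.linalg.inv(info)
```

Each call to `cluster_loglik_at` hides a full EM run, so the cost is 2p additional EM fits; in practice d would be chosen small enough for the central difference to be accurate but large enough to dominate the EM convergence tolerance.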

4. Numerical Examples

4·1. Simulation studies

We generated clustered failure times from the following model

$$\log T_{ij} = X_{ij1} - X_{ij2} + \varepsilon_{ij} \quad (i = 1, \ldots, 100;\ j = 1, \ldots, 5),$$

where Xij1 follows a Bernoulli distribution with a success probability of 0.5, Xij2 follows a uniform distribution on [−1, 1] and εij follows the frailty model (2). We considered two frailty distributions: gamma frailty with mean 1 and variance $\sigma^2 = 1/\theta$; and log-normal frailty with mean 1 and variance $\sigma^2 = e^{\theta} - 1$. Further, we considered three choices for λ0(t): Weibull-type, $\lambda_0(t) = a t^{b}$; log-normal-type, $\lambda_0(t) = t^{-1} \phi\{\log(t)\}/[1 - \Phi\{\log(t)\}]$, where ϕ(·) and Φ(·) are the density and cumulative distribution functions of the standard normal random variable; and reciprocal-type, $\lambda_0(t) = c/(1 + t)$. Here, a, b and c are positive constants. Censoring times were generated from a uniform distribution on [0, τc], where τc was chosen to yield censoring proportions of 15% and 40%. For each setting, we conducted 2000 simulation runs.
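A data-generating step of this kind can be sketched in Python as below, for the gamma frailty with a Weibull-type baseline hazard. This is a hedged illustration: the function name, the default coefficient vector `beta`, the constants `a`, `b` and `tau_c`, and the seed are placeholders chosen for the example and are not the values used in the paper. The error term is defined through $R_{ij}(\beta) = \log T_{ij} + \beta' X_{ij}$, so that $e^{\varepsilon_{ij}}$ has conditional hazard $\alpha_i \lambda_0(t)$.

```python
import numpy as np

def simulate_clusters(n=100, m=5, beta=(1.0, -1.0), theta=1.0,
                      a=1.0, b=1.0, tau_c=5.0, seed=0):
    """Simulate clustered, right-censored data from a gamma-frailty AFT model.

    All default values are illustrative placeholders, not the paper's settings.
    Returns cluster labels, covariates, observed times and event indicators.
    """
    rng = np.random.default_rng(seed)
    cluster = np.repeat(np.arange(n), m)
    x1 = rng.binomial(1, 0.5, size=n * m)             # Bernoulli(0.5) covariate
    x2 = rng.uniform(-1.0, 1.0, size=n * m)           # Uniform[-1, 1] covariate
    X = np.column_stack([x1, x2])
    alpha = rng.gamma(shape=theta, scale=1.0 / theta, size=n)   # mean 1, variance 1/theta
    e = rng.exponential(1.0, size=n * m)
    # Solve alpha * Lambda0(t) = e for t, with Lambda0(t) = a * t**(b + 1) / (b + 1)
    exp_eps = ((b + 1.0) * e / (a * alpha[cluster])) ** (1.0 / (b + 1.0))
    log_T = np.log(exp_eps) - X @ np.asarray(beta)    # eps = log T + beta'X
    T = np.exp(log_T)
    C = rng.uniform(0.0, tau_c, size=n * m)           # uniform censoring on [0, tau_c]
    T_obs = np.minimum(T, C)
    delta = (T <= C).astype(int)
    return cluster, X, T_obs, delta
```

In a study like the one described, `tau_c` would be tuned, for example by trial runs, until `1 - delta.mean()` matches the target censoring proportion.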

For the bandwidth parameter, $h_n$, of the kernel smoother, we adapted the optimal bandwidths proposed by Jones (1990) and Jones & Sheather (1991) for density estimation. Such bandwidths were also used by Zeng & Lin (2007) for smoothing the profile likelihood in the standard accelerated failure time model. Specifically, we set $h_n = \zeta \hat\sigma_e n^{-1/3}$, where ζ is a positive constant, and $\hat\sigma_e$ is the sample standard deviation of the fitted residuals, $\log \tilde T_{ij} + \hat\beta_{LS}' X_{ij}$. Here $\hat\beta_{LS}$ is the least squares estimate based on all of the data, including censored data. In our simulations, we tried a range of values for ζ and found that 0.8 ≤ ζ ≤ 1.8 works well in all of the scenarios. For comparison, we also included the Gehan rank estimator (Jin et al., 2006b), the induced smoothing estimator (Johnson & Strawderman, 2009) and the smoothed EM-like estimator (Johnson & Strawderman, 2012). The former two estimators are based on the marginal accelerated failure time model.
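The bandwidth rule can be computed as in the following Python sketch. It is an illustration under assumptions: the least squares fit includes an intercept (which does not change the residual standard deviation's role here but is not stated in the text), n is taken to be the number of clusters as defined in Section 2, and the function name is hypothetical.

```python
import numpy as np

def bandwidth(log_T_obs, X, n_clusters, zeta=1.0):
    """Compute h_n = zeta * sigma_e * n^(-1/3) from least squares residuals.

    log_T_obs  : log observed times (censored observations included as they are)
    X          : covariate matrix, one row per subject
    n_clusters : n, the number of clusters
    zeta       : positive tuning constant
    """
    X = np.asarray(X, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])    # intercept added (assumption)
    coef, *_ = np.linalg.lstsq(design, log_T_obs, rcond=None)
    residuals = log_T_obs - design @ coef
    sigma_e = residuals.std(ddof=1)                   # sample standard deviation
    return zeta * sigma_e * n_clusters ** (-1.0 / 3.0)
```

A rate of $n^{-1/3}$ satisfies both bandwidth requirements of Theorem 1, since then $n h_n^2$ grows like $n^{1/3}$ while $n h_n^4$ decays like $n^{-1/3}$.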

The results for the gamma and log-normal frailties are summarized in Tables 1 and 2, respectively. Because the results for the various hazard functions are similar, we present only the results for the Weibull-type and reciprocal-type. In addition, as reported in Johnson & Strawderman (2009), the Gehan rank estimator and induced smoothing estimator have very similar performances. Therefore, we exclude the results for the Gehan rank estimator. All three estimators for the regression parameters are essentially unbiased under all settings and the averages of the estimated standard errors obtained using the proposed EM-aided numerical differentiation method for the nonparametric maximum likelihood estimator are close to their standard deviations with the empirical coverage probabilities close to the nominal level. In most cases, the nonparametric maximum likelihood estimator is more efficient than the Gehan rank estimator and the induced smoothing estimator. The efficiency gain is more substantial when the variance of the frailty is large, but it decreases as the variance decreases. This result agrees with our expectation since when the variance of the frailty is large, the survival times within the same cluster are strongly correlated. Thus, the nonparametric maximum likelihood estimator is expected to be more efficient since it effectively accounts for the within-cluster correlation. The nonparametric maximum likelihood estimator is generally more efficient than the smoothed EM-like estimator in terms of the mean square error for the Weibull-type hazard function, especially when the correlation between clustered survival times is strong and the censoring proportion is low. However, for the reciprocal-type hazard function, the smoothed EM-like estimator may have better efficiency than the nonparametric maximum likelihood estimator when the correlation is weak or the censoring proportion is high. 
This result is attributed to the smaller biases of the smoothed EM-like estimators. Finally, the proposed nonparametric maximum likelihood estimator for the variance of the frailty is nearly unbiased. The mean estimated survival curves for S0(t) ≡ exp{−Λ0(t)} are given in Figures 1 – 6 of the Supplementary Material. For all the scenarios, the mean estimated survival curves are close to the true survival curves.

Table 1.

Simulation results for gamma frailty.

CR σ2 σ̂2 SD β NPMLE-Bias(%) NPMLE-SD NPMLE-SE NPMLE-CP(%) J-S1-Bias(%) J-S1-SD J-S2-Bias(%) J-S2-SD
λ0(t): Weibull type
15% 2 1.98 .30 β̂1 −0.5 .068 .08 97 0.3 .078 0.0 .104
β̂2 0.5 .061 .07 97 −0.4 .069 0.1 .090
40% 2 1.96 .31 β̂1 −1.6 .085 .09 95 1.2 .090 −1.0 .109
β̂2 1.5 .076 .08 94 −0.9 .081 1.3 .096
15% 1 0.98 .17 β̂1 −0.5 .069 .07 95 0.3 .075 −0.4 .082
β̂2 0.4 .060 .06 94 −0.3 .064 0.3 .071
40% 1 0.97 .19 β̂1 −2.2 .082 .10 97 0.4 .089 −1.1 .093
β̂2 2.3 .071 .08 96 −0.3 .076 1.1 .081
15% 0.5 0.48 .12 β̂1 −0.9 .062 .08 97 −0.1 .069 −0.5 .069
β̂2 1.1 .055 .07 97 0.1 .061 0.7 .062
40% 0.5 0.47 .14 β̂1 −2.4 .077 .09 96 −0.2 .083 −1.7 .082
β̂2 2.9 .070 .08 96 0.2 .075 1.9 .074
λ0 (t): reciprocal type
15% 2 2.00 .30 β̂1 −0.3 .154 .17 96 0.6 .161 0.3 .234
β̂2 1.4 .138 .14 95 0.5 .143 0.8 .208
40% 2 2.00 .32 β̂1 −2.5 .173 .18 95 1.0 .179 0.3 .219
β̂2 4.0 .154 .16 94 0.8 .157 1.0 .201
15% 1 0.99 .17 β̂1 −1.5 .143 .15 96 −0.5 .144 −0.9 .166
β̂2 1.4 .125 .13 96 0.8 .128 0.5 .148
40% 1 0.98 .22 β̂1 −4.2 .163 .18 96 −0.8 .160 −1.6 .172
β̂2 5.2 .146 .16 96 1.0 .145 1.0 .152
15% 0.5 0.48 .11 β̂1 −1.0 .136 .13 94 0.6 .136 0.2 .139
β̂2 1.1 .124 .12 93 −1.0 .122 0.2 .122
40% 0.5 0.47 .12 β̂1 −5.1 .156 .17 95 0.8 .157 −0.3 .156
β̂2 5.2 .140 .15 94 −0.9 .139 1.0 .138

CR, censoring rate; σ2, true variance of frailty; σ̂2, estimate for σ2; SD, sample standard deviation; SE, mean of estimated standard errors; CP, empirical coverage probability of 95% Wald-type confidence interval; NPMLE, proposed nonparametric maximum likelihood estimator; J-S1, smoothed EM-like estimator; J-S2, induced smoothing estimator.

Table 2.

Simulation results for log-normal frailty.

CR σ2 σ̂2 SD β NPMLE-Bias(%) NPMLE-SD NPMLE-SE NPMLE-CP(%) J-S1-Bias(%) J-S1-SD J-S2-Bias(%) J-S2-SD
λ0(t): Weibull type
15% 3.48 3.43 1.26 β̂1 −1.0 .066 .08 97 0.3 .077 −0.6 .083
β̂2 1.0 .059 .07 97 −0.2 .066 0.6 .073
40% 3.48 3.10 1.31 β̂1 −2.3 .085 .10 96 0.5 .093 −1.5 .099
β̂2 2.5 .074 .09 97 −0.4 .080 1.7 .085
15% 1.72 1.70 0.55 β̂1 −0.8 .066 .07 96 −0.2 .073 −0.5 .076
β̂2 0.9 .058 .06 96 0.1 .063 0.5 .066
40% 1.72 1.59 0.59 β̂1 −2.8 .083 .09 95 0.5 .089 −1.6 .092
β̂2 3.0 .073 .08 96 −0.3 .077 1.7 .078
15% 0.65 0.64 0.21 β̂1 −0.7 .063 .07 96 −0.1 .070 −0.5 .067
β̂2 0.8 .056 .06 96 0.2 .060 0.5 .059
40% 0.65 0.63 0.23 β̂1 −3.1 .080 .09 95 −0.2 .086 −1.6 .083
β̂2 3.0 .069 .08 95 0.5 .075 1.6 .072
λ0 (t): reciprocal type
15% 3.48 3.41 1.24 β̂1 −0.8 .143 .17 98 0.8 .151 0.4 .171
β̂2 1.1 .128 .15 97 −0.2 .133 −0.3 .157
40% 3.48 2.94 1.05 β̂1 −3.8 .171 .19 96 1.0 .173 −0.2 .187
β̂2 4.7 .151 .17 96 −0.5 .152 0.5 .169
15% 1.72 1.69 0.53 β̂1 −1.0 .140 .16 97 0.7 .146 0.2 .153
β̂2 1.5 .126 .14 96 −0.3 .132 −0.1 .140
40% 1.72 1.53 0.51 β̂1 −4.5 .168 .18 96 1.2 .167 −0.3 .172
β̂2 5.1 .149 .16 95 0.5 .148 0.7 .156
15% 0.65 0.65 0.19 β̂1 −1.0 .134 .14 95 0.8 .136 0.1 .135
β̂2 1.5 .123 .13 94 −0.3 .124 0.1 .124
40% 0.65 0.62 0.21 β̂1 −5.1 .160 .18 95 0.5 .159 −0.4 .155
β̂2 6.1 .142 .16 95 0.3 .141 0.6 .142

CR, censoring rate; σ2, true variance of frailty; σ̂2, estimate for σ2; SD, sample standard deviation; SE, mean of estimated standard errors; CP, empirical coverage probability of 95% Wald-type confidence interval; NPMLE, proposed nonparametric maximum likelihood estimator; J-S1, smoothed EM-like estimator; J-S2, induced smoothing estimator.

We also conducted a sensitivity analysis to study the performance of the nonparametric maximum likelihood estimator when the frailty distribution is misspecified. Specifically, clustered survival data were generated from the accelerated failure time frailty model with the log-normal frailty as considered previously. However, the nonparametric maximum likelihood estimator was computed based on the gamma frailty. The simulation results are given in Table 3. The nonparametric maximum likelihood estimator for the regression parameters shows very small biases that are comparable to those reported in Table 2 when the log-normal frailty distribution was correctly specified. The means of the estimated standard errors are close to the standard deviations with proper coverage probabilities. Based on the limited simulations we have conducted, the performance of the nonparametric maximum likelihood estimator for the regression parameters is relatively robust to the misspecification of the frailty distribution. However, the estimate for the variance of the frailty shows large biases when the frailty distribution is misspecified. The mean estimated survival curves for S0(t) are given in Figures 7–9 of the Supplementary Material. When the log-normal frailty is misspecified as the gamma frailty, the mean estimated survival curves slightly overestimate the true survival curves for the cases with large frailty variance, i.e., σ2 = 3.48, while they are nearly unbiased for cases with smaller variances.

Table 3.

Sensitivity analysis results for misspecified frailty distribution

CR σ2 σ̂2 SD β Bias(%) SD SE CP(%)
λ0(t): Weibull type
15% 3.48 1.01 0.17 β̂1 −1.1 .067 .08 97
β̂2 1.0 .060 .07 97
40% 3.48 1.04 0.20 β̂1 −3.4 .086 .10 96
β̂2 3.3 .075 .09 97
15% 1.72 0.70 0.12 β̂1 −0.9 .067 .07 96
β̂2 1.0 .059 .07 97
40% 1.72 0.72 0.15 β̂1 −3.3 .084 .09 95
β̂2 3.2 .072 .08 96
15% 0.65 0.39 0.10 β̂1 −0.8 .063 .07 96
β̂2 0.8 .056 .06 97
40% 0.65 0.39 0.12 β̂1 −3.2 .080 .09 96
β̂2 3.0 .069 .08 96

λ0 (t): reciprocal type
15% 3.48 1.04 0.16 β̂1 −1.2 .145 .17 98
β̂2 1.4 .130 .15 97
40% 3.48 1.14 0.22 β̂1 −5.3 .174 .19 96
β̂2 5.4 .153 .17 96
15% 1.72 0.74 0.13 β̂1 −1.6 .141 .16 97
β̂2 1.7 .127 .14 96
40% 1.72 0.79 0.16 β̂1 −5.4 .169 .18 95
β̂2 5.6 .151 .16 95
15% 0.65 0.38 0.09 β̂1 −1.6 .134 .14 95
β̂2 1.8 .123 .12 94
40% 0.65 0.41 0.11 β̂1 −6.6 .160 .18 95
β̂2 6.8 .146 .16 95

CR, censoring rate; σ2, true variance of frailty; σ̂2, estimate for σ2; SD, sample standard deviation; SE, mean of estimated standard errors; CP, empirical coverage probability of 95% Wald-type confidence interval.

We conducted additional simulations with cluster size of 2 and n = 200. The simulation results are given in the Supplementary Material. The findings are similar to those reported here.

4·2. Analysis of diabetic retinopathy data

We applied our estimation methods to clustered survival data from the diabetic retinopathy study conducted by the National Eye Institute (Huster et al., 1989). The study enrolled 197 patients with proliferative diabetic retinopathy, representing a 50% simple random sample of high-risk patients. For each patient, the photocoagulation treatment was randomly assigned to one eye, while the other eye was an untreated control. The endpoint of interest is the time to severe visual loss after treatment. In addition to the treatment indicator (1 for treated with photocoagulation and 0 for untreated), there are three prognostic factors: age at diagnosis of diabetes; type of diabetes (1 for adult and 0 for juvenile); and risk group, ranging from 6 to 12. A primary goal is to study the effects of treatment and risk factors on the time to severe visual loss. This data set has been previously studied. For example, Lu (2007) studied the data using a marginal bivariate accelerated failure time model based on the weighted log-rank estimation method of Jin et al. (2006b). Lu also developed a statistical test for the association between pairs of failure times after adjusting for covariates. The null hypothesis of independence was rejected with a small p-value, which implies that there is a strong correlation between the pair of error terms in the bivariate accelerated failure time model. Here, we fit the accelerated failure time frailty model with all four covariates using the proposed nonparametric maximum likelihood estimation method and the Gehan rank estimation method. For our method, both the gamma and log-normal frailties were considered. The bandwidth parameters for the nonparametric maximum likelihood estimators were selected as in the simulation studies. The results are given in Table 4. 
The nonparametric maximum likelihood estimators have much smaller standard errors than the Gehan rank estimator, which indicates the efficiency gained by taking the correlation between error terms into account in the nonparametric maximum likelihood estimators. In addition, the nonparametric maximum likelihood estimators with the gamma and log-normal frailty distributions show very similar performances, which may imply that the analysis results are not sensitive to the choice of the frailty distribution. The estimated variance of the frailty is 0.88 under the gamma frailty and 1.16 under the log-normal frailty. Finally, all of the methods found that treatment and risk group are significantly associated with time to severe visual loss, whereas age at diagnosis of diabetes and type of diabetes are not.

Table 4.

Analysis results for diabetes data.

Covariate NPMLEg-Est. NPMLEg-SE NPMLEg-pv NPMLEl-Est. NPMLEl-SE NPMLEl-pv GehanR-Est. GehanR-SE GehanR-pv
treatment −0.929 .104 .000 −0.949 .105 .000 −0.981 .201 .000
age 0.011 .006 .078 0.008 .006 .162 0.010 .016 .551
type −0.029 .102 .779 0.070 .096 .467 −0.301 .453 .506
risk 1.660 .353 .000 1.491 .346 .000 2.568 1.036 .013

NPMLEg, nonparametric maximum likelihood estimator with the gamma frailty; NPMLEl, nonparametric maximum likelihood estimator with the log-normal frailty; GehanR, Gehan rank estimator; Est., estimated coefficients; pv, p-value; age, age at diagnosis of diabetes; type, type of diabetes; risk, risk group.

5. Discussion

The proposed kernel-smoothing based nonparametric maximum likelihood estimation method can be extended to other types of multivariate survival data, such as recurrent event data. Specifically, let Ni(t) denote the number of events observed on subject i by time t. We assume that Ni(t) is a nonhomogeneous Poisson process and model its conditional intensity function given covariates Zi and frailty αi by $\alpha_i e^{\beta' Z_i} \lambda(e^{\beta' Z_i} t)$. This model is an extension of the accelerated failure time model for counting processes considered by Lin et al. (1998) and was studied by Strawderman (2006) using an EM-like algorithm. The nonparametric maximum likelihood estimation and its associated inference for the above model require further investigation.

Acknowledgments

We thank the editor, an associate editor and two referees for very insightful comments. Lu and Zhang’s research was supported by the National Cancer Institute.

Appendix

Throughout the proofs, we assume the following regularity conditions:

  • (C1)

    The hazard function λ0(·) is positive and thrice-continuously differentiable with λ̇0(0) > 0, where λ̇0(0) is the right derivative of λ0(t) at t = 0. In addition, Λ0(τ) < ∞.

  • (C2)

    There exists some positive constant c0, such that $\mathrm{pr}(C_{ij} e^{\beta_0' X_{ij}} \ge \tau \mid X_{ij}) \ge c_0$.

  • (C3)

    The covariates Xij are bounded. If there exists a constant vector a, such that a′Xij = 0 almost surely, then a = 0.

  • (C4)

    The true regression parameters β0 belong to the interior of a known compact set B, and 0 < θ0 < ∞.

  • (C5)

    The kernel function K(·) is thrice-continuously differentiable. In addition, $K^{(r)}(\cdot)$ (r = 0, 1, 2, 3) have bounded variations in ℝ, where $K^{(r)}(\cdot)$ is the rth derivative of K(·).

  • (C6)

    The cluster size mi is completely random. In addition, there exists a positive integer m0, such that $1 \le m_i \le m_0$ and $\mathrm{pr}(m_i \ge 2) > 0$.

  • (C7)

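    For any $1 \le k \le m_0$, $\sup_{\theta} \int_0^{\infty} u^{k} |f_\alpha^{(m)}(u; \theta)|\, du < \infty$ for m = 0, 1, 2, where $f_\alpha^{(m)}(u; \theta)$ is the mth derivative of $f_\alpha(u; \theta)$ with respect to θ. In addition, $\int_0^{\infty} u^{k} f_\alpha^{(1)}(u; \theta_0)\, du \ne 0$ for some k.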

  • (C8)

    The information matrix I is finite and positive definite.

Conditions (C1)–(C5) are similar to those used in Zeng & Lin (2007). Condition (C7) is assumed to establish the consistency of the proposed estimators, which is satisfied by many commonly used frailty distributions, e.g., the gamma and log-normal distributions. Condition (C8) is assumed to establish the asymptotic normality of the estimators.

Proof of Theorem 1

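To establish the consistency of the estimators, we introduce the following quantity

$$\tilde\Lambda_n(t) \equiv \int_0^t \frac{(n h_n s)^{-1} \sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} K[\{R_{ij}(\beta_0) - \log s\}/h_n]}{n^{-1} \sum_{i=1}^n \tilde\alpha_i \sum_{j=1}^{m_i} \int_{-\infty}^{\{R_{ij}(\beta_0) - \log s\}/h_n} K(u)\, du}\, ds,$$

where $\tilde\alpha_i = E(\alpha_i \mid \delta_{ij}, T_{ij}, X_{ij})$. Following Lemma 2.4 of Schuster (1969), we can show that as n → ∞,

$$\begin{aligned}
&\sup_{s \in [0, \tau]} \left| \frac{1}{n h_n s} \sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} K\left\{\frac{R_{ij}(\beta_0) - \log s}{h_n}\right\} - \frac{dE\{\sum_{j=1}^{m_i} I(\delta_{ij} = 1, e^{R_{ij}(\beta_0)} \le s)\}}{ds} \right| \to 0, \\
&\sup_{s \in [0, \tau]} \left| \frac{1}{n} \sum_{i=1}^n \tilde\alpha_i \sum_{j=1}^{m_i} \int_{-\infty}^{\{R_{ij}(\beta_0) - \log s\}/h_n} K(u)\, du - E\Big\{\tilde\alpha_i \sum_{j=1}^{m_i} I(e^{R_{ij}(\beta_0)} \ge s)\Big\} \right| \to 0
\end{aligned}$$

almost surely. Moreover,

$$\begin{aligned}
\frac{dE\{\sum_{j=1}^{m_i} I(\delta_{ij} = 1, e^{R_{ij}(\beta_0)} \le s)\}}{ds} &= \lambda_0(s) E\Big\{\alpha_i e^{-\alpha_i \Lambda_0(s)} \sum_{j=1}^{m_i} S_c(s e^{-\beta_0' X_{ij}} \mid X_{ij})\Big\}, \\
E\Big\{\tilde\alpha_i \sum_{j=1}^{m_i} I(e^{R_{ij}(\beta_0)} \ge s)\Big\} &= E\Big\{\alpha_i \sum_{j=1}^{m_i} I(e^{R_{ij}(\beta_0)} \ge s)\Big\} = E\Big\{\alpha_i e^{-\alpha_i \Lambda_0(s)} \sum_{j=1}^{m_i} S_c(s e^{-\beta_0' X_{ij}} \mid X_{ij})\Big\},
\end{aligned}$$

where $S_c(t \mid x) = \mathrm{pr}(C_{ij} \ge t \mid X_{ij} = x)$. Therefore,

$$\sup_{s \in [0, \tau]} \left| \frac{(n h_n s)^{-1} \sum_{i=1}^n \sum_{j=1}^{m_i} \delta_{ij} K[\{R_{ij}(\beta_0) - \log s\}/h_n]}{n^{-1} \sum_{i=1}^n \tilde\alpha_i \sum_{j=1}^{m_i} \int_{-\infty}^{\{R_{ij}(\beta_0) - \log s\}/h_n} K(u)\, du} - \lambda_0(s) \right| \to 0,$$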

almost surely, which implies Λ̃n(t) → Λ0(t) almost surely for any t ∈ [0, τ]. This pointwise consistency can be strengthened to uniform consistency on [0, τ] due to the monotonicity and boundedness of Λ̃n(t) and Λ0(t).

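By Helly’s theorem, there exists a convergent subsequence $(\hat\beta_{n_k}, \hat\theta_{n_k}, \hat\Lambda_{n_k})$ such that $(\hat\beta_{n_k}, \hat\theta_{n_k}, \hat\Lambda_{n_k}) \to (\beta^*, \theta^*, \Lambda^*)$ almost surely, where Λ* is a monotonically increasing function. Define the observed log-likelihood function

$$l_n^o(\beta, \theta, \Lambda) = \sum_{i=1}^n \log \int_0^{\infty} \Big[\prod_{j=1}^{m_i} \{\alpha_i \lambda(e^{R_{ij}(\beta)}) e^{\beta' X_{ij}}\}^{\delta_{ij}} e^{-\alpha_i \Lambda(e^{R_{ij}(\beta)})}\Big] f_\alpha(\alpha_i; \theta)\, d\alpha_i.$$

We have

$$0 \le n_k^{-1} l_{n_k}^o(\hat\beta_{n_k}, \hat\theta_{n_k}, \hat\Lambda_{n_k}) - n_k^{-1} l_{n_k}^o(\beta_0, \theta_0, \tilde\Lambda_{n_k}).$$

Letting k → ∞ leads to

$$0 \le E\left(\log \frac{\int_0^{\infty} \big[\prod_{j=1}^{m_i} \{\alpha_i \lambda^*(e^{R_{ij}(\beta^*)}) e^{\beta^{*\prime} X_{ij}}\}^{\delta_{ij}} e^{-\alpha_i \Lambda^*(e^{R_{ij}(\beta^*)})}\big] f_\alpha(\alpha_i; \theta^*)\, d\alpha_i}{\int_0^{\infty} \big[\prod_{j=1}^{m_i} \{\alpha_i \lambda_0(e^{R_{ij}(\beta_0)}) e^{\beta_0' X_{ij}}\}^{\delta_{ij}} e^{-\alpha_i \Lambda_0(e^{R_{ij}(\beta_0)})}\big] f_\alpha(\alpha_i; \theta_0)\, d\alpha_i}\right),$$

where λ* is the derivative of Λ*. Due to the nonnegativity of the Kullback–Leibler information,

$$\int_0^{\infty} \Big[\prod_{j=1}^{m_i} \{\alpha_i \lambda^*(e^{R_{ij}(\beta^*)}) e^{\beta^{*\prime} X_{ij}}\}^{\delta_{ij}} e^{-\alpha_i \Lambda^*(e^{R_{ij}(\beta^*)})}\Big] f_\alpha(\alpha_i; \theta^*)\, d\alpha_i = \int_0^{\infty} \Big[\prod_{j=1}^{m_i} \{\alpha_i \lambda_0(e^{R_{ij}(\beta_0)}) e^{\beta_0' X_{ij}}\}^{\delta_{ij}} e^{-\alpha_i \Lambda_0(e^{R_{ij}(\beta_0)})}\Big] f_\alpha(\alpha_i; \theta_0)\, d\alpha_i.$$

Set $\delta_{i1} = 1$ and $\tilde T_{i1} = 0$. For j = 2, …, mi, if $\delta_{ij} = 0$, set $\tilde T_{ij} = \tau$; if $\delta_{ij} = 1$, integrate $\tilde T_{ij}$ from 0 to τ. We have

$$\begin{aligned}
&\int_0^{\infty} \alpha_i \lambda^*(0) e^{\beta^{*\prime} X_{ij}} \prod_{j=2}^{m_i} \{1 - e^{-\alpha_i \Lambda^*(\tau e^{\beta^{*\prime} X_{ij}})}\}^{\delta_{ij}} \{e^{-\alpha_i \Lambda^*(\tau e^{\beta^{*\prime} X_{ij}})}\}^{1 - \delta_{ij}} f_\alpha(\alpha_i; \theta^*)\, d\alpha_i \\
&\quad = \int_0^{\infty} \alpha_i \lambda_0(0) e^{\beta_0' X_{ij}} \prod_{j=2}^{m_i} \{1 - e^{-\alpha_i \Lambda_0(\tau e^{\beta_0' X_{ij}})}\}^{\delta_{ij}} \{e^{-\alpha_i \Lambda_0(\tau e^{\beta_0' X_{ij}})}\}^{1 - \delta_{ij}} f_\alpha(\alpha_i; \theta_0)\, d\alpha_i.
\end{aligned}$$

The two sides of the above equation are summed over all possible combinations of $\delta_{ij}$ (j = 2, …, mi) to obtain

$$\lambda^*(0) e^{\beta^{*\prime} X_{ij}} = \lambda_0(0) e^{\beta_0' X_{ij}}$$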

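since E(αi) = 1. Therefore, (β* − β0)′ Xij = log{λ0(0)/λ*(0)}. By (C3), β* = β0. It follows that λ0(0) = λ*(0). In addition, following similar steps, we can obtain

$$\int_0^{\infty} \alpha_i^{k} f_\alpha(\alpha_i; \theta_0)\, d\alpha_i = \int_0^{\infty} \alpha_i^{k} f_\alpha(\alpha_i; \theta^*)\, d\alpha_i,$$

for any $1 \le k \le m_i$, which leads to θ0 = θ*. Finally, set $\delta_{i1} = 1$ and integrate $\tilde T_{i1}$ from 0 to t. For j = 2, …, mi, if $\delta_{ij} = 0$, set $\tilde T_{ij} = \tau$; if $\delta_{ij} = 1$, integrate $\tilde T_{ij}$ from 0 to τ. The two sides of the equation are summed over all possible combinations of $\delta_{ij}$ (j = 2, …, mi) to obtain

$$\int_0^{\infty} \{1 - e^{-\alpha_i \Lambda^*(t e^{\beta^{*\prime} X_{i1}})}\} f_\alpha(\alpha_i; \theta^*)\, d\alpha_i = \int_0^{\infty} \{1 - e^{-\alpha_i \Lambda_0(t e^{\beta_0' X_{i1}})}\} f_\alpha(\alpha_i; \theta_0)\, d\alpha_i.$$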

It follows that Λ*(t) = Λ0(t) for t ∈ [0, τ]. Therefore, (β̂n, θ̂n, Λ̂n(t)) → (β0, θ0, Λ0 (t)) almost surely by Helly’s theorem, which can be strengthened to uniform convergence on [0, τ].

Next, we show that β̂n is asymptotically normal and its variance achieves the semiparametric efficiency bound. Let BV[0, τ] denote the space of bounded variation functions on [0, τ] and define class H = {h = (h11, h12, h2) : h11 ∈ ℝp with ∥h111 < ∞, |h12| < ∞, h2 ∈ BV[0, τ]}. For hH, define the norm ∥h∥ = ∥h111 + |h12| + ∥h2υ, where ∥h111 is the L1 norm of h11, and ∥h2υ is the absolute value of h2(0) plus the total variation of h2 on the interval [0, τ]. Consider submodels βd = β + dh11, θd = θ + dh12 and Λd(t)=0t{1+dh2(u)}dΛ(u). Further, define Un(β,θ,Λ)(h11,h12,h2)=n1{lno(βd,θd,Λd)/d}|d=0, where (h11, h12, h2) ∈ H. For simplicity of notation, we denote

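$$
A_i = A_i(\beta,\Lambda) = \prod_{j=1}^{m_i} \left\{ \alpha_i \lambda(e^{R_{ij}(\beta)}) e^{\beta' X_{ij}} \right\}^{\delta_{ij}} e^{-\alpha_i \Lambda(e^{R_{ij}(\beta)})} .
$$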

Then we can write $U_n(\beta,\theta,\Lambda)(h_{11},h_{12},h_2) = U_{n1}(h_{11}) + U_{n2}(h_2) + U_{n3}(h_{12})$, where

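$$
U_{n1}(h_{11}) = \frac{1}{n}\frac{\partial}{\partial d}\Big|_{d=0} l_n^o(\beta_d,\Lambda,\theta)
= \frac{1}{n}\sum_{i=1}^n \frac{\int_0^\infty A_i \left[ \sum_{j=1}^{m_i} h_{11}' X_{ij} \left\{ \delta_{ij}\dot\lambda(e^{R_{ij}(\beta)}) e^{R_{ij}(\beta)} / \lambda(e^{R_{ij}(\beta)}) + \delta_{ij} - \alpha_i \lambda(e^{R_{ij}(\beta)}) e^{R_{ij}(\beta)} \right\} \right] f_\alpha(\alpha_i;\theta)\,d\alpha_i}{\int_0^\infty A_i f_\alpha(\alpha_i;\theta)\,d\alpha_i}, \tag{A1}
$$

$$
U_{n2}(h_2) = \frac{1}{n}\frac{\partial}{\partial d}\Big|_{d=0} l_n^o(\beta,\Lambda_d,\theta)
= \frac{1}{n}\sum_{i=1}^n \frac{\int_0^\infty A_i \left\{ \sum_{j=1}^{m_i} \delta_{ij} h_2(e^{R_{ij}(\beta)}) - \alpha_i \sum_{j=1}^{m_i} \int_0^{e^{R_{ij}(\beta)}} h_2(u)\,d\Lambda(u) \right\} f_\alpha(\alpha_i;\theta)\,d\alpha_i}{\int_0^\infty A_i f_\alpha(\alpha_i;\theta)\,d\alpha_i}, \tag{A2}
$$

$$
U_{n3}(h_{12}) = \frac{1}{n}\frac{\partial}{\partial d}\Big|_{d=0} l_n^o(\beta,\Lambda,\theta_d)
= \frac{1}{n}\sum_{i=1}^n \frac{\int_0^\infty h_{12} A_i \{\partial f_\alpha(\alpha_i;\theta)/\partial\theta\}\,d\alpha_i}{\int_0^\infty A_i f_\alpha(\alpha_i;\theta)\,d\alpha_i}. \tag{A3}
$$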
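For a single cluster, the score in (A3) is a ratio of two integrals over the frailty. The sketch below illustrates that ratio numerically under assumptions that are not taken from the text: a gamma frailty with mean 1 and variance θ, a cluster summarized by hypothetical values `D` (number of events), `logC` (log of the product of the hazard factors) and `S` (sum of the cumulative hazards), and a plain Simpson rule for the integrals. All function names are illustrative.

```python
import math

def gamma_pdf(a, theta):
    """Gamma frailty density with mean 1 and variance theta (assumed form of f_alpha)."""
    k = 1.0 / theta
    return math.exp((k - 1.0) * math.log(a) - k * a - math.lgamma(k) - k * math.log(theta))

def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def A(a, D, logC, S):
    """A_i viewed as a function of alpha: alpha^D * C * exp(-alpha * S)."""
    return math.exp(D * math.log(a) + logC - a * S)

def marginal_quad(theta, D, logC, S, lo=1e-10, hi=60.0):
    """Denominator of (A3): integral of A_i * f_alpha, by quadrature."""
    return simpson(lambda a: A(a, D, logC, S) * gamma_pdf(a, theta), lo, hi)

def marginal_closed(theta, D, logC, S):
    """Same integral in closed form, available because the frailty is gamma."""
    k = 1.0 / theta
    return math.exp(logC + math.lgamma(D + k) - math.lgamma(k)
                    - k * math.log(theta) - (D + k) * math.log(S + k))

def theta_score_quad(theta, D, logC, S, eps=1e-5, lo=1e-10, hi=60.0):
    """Ratio in (A3) with h12 = 1; d f_alpha / d theta by central difference."""
    def dfdtheta(a):
        return (gamma_pdf(a, theta + eps) - gamma_pdf(a, theta - eps)) / (2.0 * eps)
    num = simpson(lambda a: A(a, D, logC, S) * dfdtheta(a), lo, hi)
    return num / marginal_quad(theta, D, logC, S, lo, hi)

# Hypothetical cluster summary; these numbers are made up for illustration.
D, logC, S, theta = 2, -0.7, 1.3, 0.5
m_q = marginal_quad(theta, D, logC, S)
m_c = marginal_closed(theta, D, logC, S)
score_q = theta_score_quad(theta, D, logC, S)
eps = 1e-5
score_fd = (math.log(marginal_closed(theta + eps, D, logC, S))
            - math.log(marginal_closed(theta - eps, D, logC, S))) / (2.0 * eps)
print(m_q, m_c, score_q, score_fd)
```

The two marginal values should agree closely, and the quadrature score should match the finite-difference derivative of the closed-form log marginal. For a frailty density without a closed-form marginal, only the quadrature route would be available.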

Define $u(\beta,\theta,\Lambda)(h_{11},h_{12},h_2) = \lim_{n\to\infty} U_n(\beta,\theta,\Lambda)(h_{11},h_{12},h_2) \equiv u_1(h_{11}) + u_2(h_2) + u_3(h_{12})$. It can be shown that $u(\beta_0,\theta_0,\Lambda_0)(h_{11},h_{12},h_2) = 0$. In addition, $u(\beta,\theta,\Lambda)$ is Fréchet differentiable since it is a smooth function of $\beta$, $\theta$ and $\Lambda$. Let $\dot u(\beta_0,\theta_0,\Lambda_0)(\beta-\beta_0,\theta-\theta_0,\Lambda-\Lambda_0)(h)$ denote the corresponding Fréchet derivative of $u(\beta,\theta,\Lambda)$ at $(\beta_0,\theta_0,\Lambda_0)$. After some algebra, we have

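$$
\dot u(\beta_0,\theta_0,\Lambda_0)(\beta-\beta_0,\theta-\theta_0,\Lambda-\Lambda_0)(h)
= (\beta-\beta_0)' Q_1(h) + \int_0^\infty Q_2(t,h)\,d\{\Lambda(t)-\Lambda_0(t)\} + (\theta-\theta_0)\,Q_3(h),
$$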

where h = (h11, h12, h2),

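$$
\begin{aligned}
Q_1(h) &= B_1 \begin{pmatrix} h_{11} \\ h_{12} \end{pmatrix} + \int_0^\infty D_1(t)\,h_2(t)\,dt, \\
Q_2(t,h) &= B_2(t)' \begin{pmatrix} h_{11} \\ h_{12} \end{pmatrix} + c_2(t)\,h_2(t) + \int_0^\infty D_2(t,u)\,h_2(u)\,du, \\
Q_3(h) &= B_3' \begin{pmatrix} h_{11} \\ h_{12} \end{pmatrix} + \int_0^\infty D_3(t)\,h_2(t)\,dt,
\end{aligned}
$$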

where $B_1$ is a $p \times (p + 1)$ matrix, $B_2(t)$ and $B_3$ are $(p + 1)$-dimensional vectors, $D_1(t)$ is a $p$-dimensional function, and $c_2(t)$, $D_2(t, u)$ and $D_3(t)$ are scalar functions. Therefore, $Q(h) \equiv (Q_1(h), Q_2(h), Q_3(h))$ is a continuous linear operator from the linear span of $H$ to itself.

Consider two classes of functions:

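$$
\begin{aligned}
\mathcal{A}_1(\beta_0,\theta_0,\Lambda_0) &= \left\{ h_{11}' U_1(\beta_0,\theta_0,\Lambda_0) + h_{12} U_3(\beta_0,\theta_0,\Lambda_0) : \|h_{11}\|_1 < \infty,\ |h_{12}| < \infty \right\}, \\
\mathcal{A}_2(\beta_0,\theta_0,\Lambda_0) &= \left\{ U_2(\beta_0,\theta_0,\Lambda_0)(h_2) : h_2 \in BV[0,\tau] \right\},
\end{aligned}
$$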

where

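$$
\begin{aligned}
U_1(\beta_0,\theta_0,\Lambda_0) &= \frac{\int_0^\infty A_{i0} \left[ \sum_{j=1}^{m_i} X_{ij} \left\{ \delta_{ij}\dot\lambda_0(e^{R_{ij}(\beta_0)}) e^{R_{ij}(\beta_0)} / \lambda_0(e^{R_{ij}(\beta_0)}) + \delta_{ij} - \alpha_i \lambda_0(e^{R_{ij}(\beta_0)}) e^{R_{ij}(\beta_0)} \right\} \right] f_\alpha(\alpha_i;\theta_0)\,d\alpha_i}{\int_0^\infty A_{i0} f_\alpha(\alpha_i;\theta_0)\,d\alpha_i}, \\
U_2(\beta_0,\theta_0,\Lambda_0)(h_2) &= \frac{\int_0^\infty A_{i0} \left\{ \sum_{j=1}^{m_i} \delta_{ij} h_2(e^{R_{ij}(\beta_0)}) - \alpha_i \sum_{j=1}^{m_i} \int_0^{e^{R_{ij}(\beta_0)}} h_2(u)\,d\Lambda_0(u) \right\} f_\alpha(\alpha_i;\theta_0)\,d\alpha_i}{\int_0^\infty A_{i0} f_\alpha(\alpha_i;\theta_0)\,d\alpha_i}, \\
U_3(\beta_0,\theta_0,\Lambda_0) &= \frac{\int_0^\infty A_{i0} \{\partial f_\alpha(\alpha_i;\theta_0)/\partial\theta\}\,d\alpha_i}{\int_0^\infty A_{i0} f_\alpha(\alpha_i;\theta_0)\,d\alpha_i}
\end{aligned}
$$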

and $A_{i0} = A_i(\beta_0, \Lambda_0)$. Since $U_1(\beta_0,\theta_0,\Lambda_0)$ and $U_3(\beta_0,\theta_0,\Lambda_0)$ are bounded functions under assumptions (C1)–(C5), $\mathcal{A}_1$ is a Donsker class. In addition, since $h_2 \in BV[0, \tau]$, $\mathcal{A}_2$ can be written as a sum of bounded Donsker classes, which is also a Donsker class. Therefore, $n^{1/2}\{U_n(\beta_0,\theta_0,\Lambda_0)(h) - u(\beta_0,\theta_0,\Lambda_0)(h)\}$ converges weakly to a Gaussian process $G^*$ on $l^\infty(H)$.

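In addition, since $\|\beta-\beta_0\|_1 + |\theta-\theta_0| = o_p(1)$ and $\sup_{t\in[0,\tau]} |\Lambda(t)-\Lambda_0(t)| = o_p(1)$, we can show that $\mathcal{A}_1(\beta,\theta,\Lambda)$ and $\mathcal{A}_2(\beta,\theta,\Lambda)$ are Donsker classes. This implies that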

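$$
\sup_{h \in H} \left| (U_n - u)(\hat\beta_n,\hat\theta_n,\hat\Lambda_n)(h) - (U_n - u)(\beta_0,\theta_0,\Lambda_0)(h) \right|
= o_p\left\{ \max\left( n^{-1/2},\ \|\hat\beta_n-\beta_0\|_1 + |\hat\theta_n-\theta_0| + \sup_{t\in[0,\tau]} |\hat\Lambda_n(t)-\Lambda_0(t)| \right) \right\}. \tag{A4}
$$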

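Finally, we show that $\dot u(\beta_0,\theta_0,\Lambda_0)$ is continuously invertible. This is equivalent to showing that $Q(h)$ is a one-to-one map, i.e., $Q(h) = 0$ implies $h = 0$. If $Q(h) = 0$, then $\dot u(\beta_0,\theta_0,\Lambda_0)(\beta-\beta_0,\theta-\theta_0,\Lambda-\Lambda_0)(h) = 0$ for $(\beta,\theta,\Lambda)$ in a neighbourhood of $(\beta_0,\theta_0,\Lambda_0)$. We choose $\beta = \beta_0 + d h_{11}$, $\theta = \theta_0 + d h_{12}$ and $\Lambda(t) = \Lambda_0(t) + d\int_0^t h_2(u)\,d\Lambda_0(u)$ for a small constant $d$. By the definition of $\dot u(\beta_0,\theta_0,\Lambda_0)$, we have $\dot u(\beta_0,\theta_0,\Lambda_0)(\beta-\beta_0,\theta-\theta_0,\Lambda-\Lambda_0)(h) = -d\,E[\{h_{11}' U_1(\beta_0,\theta_0,\Lambda_0) + U_2(\beta_0,\theta_0,\Lambda_0)(h_2) + h_{12} U_3(\beta_0,\theta_0,\Lambda_0)\}^2] = 0$. This implies that $h_{11}' U_1(\beta_0,\theta_0,\Lambda_0) + U_2(\beta_0,\theta_0,\Lambda_0)(h_2) + h_{12} U_3(\beta_0,\theta_0,\Lambda_0) = 0$ almost surely. Following the techniques used to derive the consistency of the estimators, we can show that $h = 0$. The details are given in the Supplementary Material.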
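The expected squared score in the argument above is an instance of the usual information identity. A heuristic sketch follows, assuming differentiation under the integral sign is permitted; the symbols $p_d$ (density of one cluster under the submodel indexed by $d$), $\mu$ (a dominating measure) and $s_d = \partial \log p_d / \partial d$ are introduced here for illustration only.

```latex
0 = \frac{\partial}{\partial d} \int s_d \, p_d \, d\mu
  = \int \frac{\partial s_d}{\partial d} \, p_d \, d\mu + \int s_d^2 \, p_d \, d\mu
\quad\Longrightarrow\quad
E_0\!\left( \frac{\partial s_d}{\partial d} \Big|_{d=0} \right) = - E_0\!\left( s_0^2 \right).
```

The first equality holds because $\int s_d\,p_d\,d\mu = \int (\partial p_d/\partial d)\,d\mu = 0$ for every $d$. With $s_0 = h_{11}' U_1 + U_2(h_2) + h_{12} U_3$ evaluated at the true parameters, the derivative of the expected score along the chosen path can vanish only if $s_0 = 0$ almost surely.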

Since $\dot u(\beta_0,\theta_0,\Lambda_0)$ is continuously invertible on its range, Theorem 3.3.1 of van der Vaart & Wellner (1996) implies that $n^{1/2}[\{\hat\gamma_n, \hat\Lambda_n(t)\} - \{\gamma_0, \Lambda_0(t)\}]$ converges weakly to a tight Gaussian process $G = \{\dot u(\beta_0,\theta_0,\Lambda_0)\}^{-1} G^*$. In addition, the variance of $G$ is

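$$
\operatorname{var}\{G(h)\} = \int_0^\infty h_2(t)\,Q_2^{-1}(t,h)\,d\Lambda_0(t) + (h_{11}', h_{12}) \begin{pmatrix} Q_1^{-1}(h) \\ Q_3^{-1}(h) \end{pmatrix},
$$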

where $Q^{-1}(h) \equiv \{Q_1^{-1}(h), Q_2^{-1}(h), Q_3^{-1}(h)\}$ is the inverse of $Q(h)$. We derive the semiparametric efficiency bound $I^{-1}$ (Bickel et al., 1993) and show that the asymptotic variance of $n^{1/2}(\hat\gamma_n - \gamma_0)$ achieves the semiparametric efficiency bound. The details are given in the Supplementary Material.

Footnotes

Supplementary Material

Supplementary material available at Biometrika online includes additional simulation study results and technical derivations.

Contributor Information

Bo Liu, Email: bliu4@ncsu.edu, Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, U.S.A.

Wenbin Lu, Email: lu@stat.ncsu.edu, Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, U.S.A.

Jiajia Zhang, Email: jzhang@mailbox.sc.edu, Department of Epidemiology and Biostatistics, University of South Carolina, 800 Sumter Street, Columbia, South Carolina 29208, U.S.A.

References

  1. Aalen OO. Modelling heterogeneity in survival analysis by the compound Poisson distribution. The Annals of Applied Probability. 1992;2:951–972.
  2. Bickel PJ, Klaassen CAJ, Ritov Y, Wellner JA. Efficient and Adaptive Estimation for Semiparametric Models. Baltimore: Johns Hopkins University Press; 1993.
  3. Buckley J, James I. Linear regression with censored data. Biometrika. 1979;66:429–436.
  4. Chen HY, Little RJA. Proportional hazards regression with missing covariates. Journal of the American Statistical Association. 1999;94:896–908.
  5. Clayton DG. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika. 1978;65:141.
  6. Clayton DG, Cuzick J. Multivariate generalizations of the proportional hazards model (with discussion). Journal of the Royal Statistical Society A. 1985;48:82–117.
  7. Cox DR. Regression models and life tables (with discussion). Journal of the Royal Statistical Society Series B. 1972;34:187–220.
  8. Hougaard P. Survival models for heterogeneous populations derived from stable distributions. Biometrika. 1986;73:387–396.
  9. Hougaard P. Modelling multivariate survival. Scandinavian Journal of Statistics. 1987;14:291–30.
  10. Huster WJ, Brookmeyer R, Self SG. Modelling paired survival data with covariates. Biometrics. 1989;45:145–156.
  11. Jin Z, Lin DY, Wei LJ, Ying Z. Rank-based inference for the accelerated failure time model. Biometrika. 2003;90:341–353.
  12. Jin Z, Lin DY, Ying Z. On least-squares regression with censored data. Biometrika. 2006a;93:147–161.
  13. Jin Z, Lin DY, Ying Z. Rank regression analysis of multivariate failure time data based on marginal linear models. Scandinavian Journal of Statistics. 2006b;33:1–23.
  14. Johnson L, Strawderman R. Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika. 2009;96:577–590. doi: 10.1093/biomet/asp025.
  15. Johnson L, Strawderman R. A smoothing expectation and substitution algorithm for the semiparametric accelerated failure time frailty model. Statistics in Medicine. 2012;31:2335–2358. doi: 10.1002/sim.5349.
  16. Jones MC. The performance of kernel density functions in kernel distribution function estimation. Statistics and Probability Letters. 1990;9:129–132.
  17. Jones MC, Sheather SJ. Using non-stochastic terms to advantage in kernel-based estimation of integrated squared density derivatives. Statistics and Probability Letters. 1991;11:511–514.
  18. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. 2nd ed. New York: Wiley; 2002.
  19. Li H, Yin GS. Generalized method of moments estimation for linear regression with clustered failure time data. Biometrika. 2009;96:293–306.
  20. Lin DY. Cox regression analysis of multivariate failure time data: the marginal approach. Statistics in Medicine. 1994;13:2233–2247. doi: 10.1002/sim.4780132105.
  21. Lin DY, Wei LJ, Ying Z. Accelerated failure time models for counting processes. Biometrika. 1998;85:605–618.
  22. Lu W. Tests of independence for censored bivariate failure time data. Lifetime Data Analysis. 2007;13:75–90. doi: 10.1007/s10985-006-9031-z.
  23. Lu W. Efficient estimation for an accelerated failure time model with a cure fraction. Statistica Sinica. 2010;20:661–674.
  24. McGilchrist CA, Aisbett CW. Regression with frailty in survival analysis. Biometrics. 1991;47:461–466.
  25. Murphy SA. Consistency in a proportional hazard model incorporating a random effect. Annals of Statistics. 1994;22:712–731.
  26. Murphy SA. Asymptotic theory for the frailty model. Annals of Statistics. 1995;23:182–198.
  27. Nielsen GG, Gill RD, Andersen PK, Sorensen TIA. A counting process approach to maximum likelihood estimation in frailty models. Scandinavian Journal of Statistics. 1992;19:25–44.
  28. Oakes D. Bivariate survival models induced by frailties. Journal of the American Statistical Association. 1989;84:487–493.
  29. Pan W. Using frailties in the accelerated failure time model. Lifetime Data Analysis. 2001;7:55–64. doi: 10.1023/a:1009625210191.
  30. Parner E. Asymptotic theory for the correlated gamma-frailty model. Annals of Statistics. 1998;26:183–214.
  31. Schuster EF. Estimation of a probability density function and its derivatives. The Annals of Mathematical Statistics. 1969;40:1187–1195.
  32. Strawderman R. A regression model for dependent gap times. International Journal of Biostatistics. 2006;2:1–33.
  33. Tsiatis AA. Estimating regression parameters using linear rank tests for censored data. The Annals of Statistics. 1990;18:354–372.
  34. van der Vaart AW, Wellner JA. Weak Convergence and Empirical Processes. New York: Springer-Verlag; 1996.
  35. Wei LJ, Lin DY, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modelling marginal distributions. Journal of the American Statistical Association. 1989;84:1065–1073.
  36. Xu L, Zhang J. An EM-like algorithm for the semiparametric accelerated failure time gamma frailty model. Computational Statistics and Data Analysis. 2010;54:1467–1474.
  37. Ying Z. A large sample study of rank estimation for censored regression data. The Annals of Statistics. 1993;21:76–99.
  38. Zeng D, Lin DY. Efficient estimation for the accelerated failure time model. Journal of the American Statistical Association. 2007;102:1387–1396.
  39. Zhang J, Peng Y. An alternative estimation method for the accelerated failure time frailty model. Computational Statistics and Data Analysis. 2007;51:4413–4423.
