Kernel Smoothed Profile Likelihood Estimation in the Accelerated Failure Time Frailty Model for Clustered Survival Data

Bo Liu; Wenbin Lu; Jiajia Zhang

doi:10.1093/biomet/ast012

Author manuscript; available in PMC: 2014 Jan 15.

Published in final edited form as: Biometrika. 2013 May 14;100(3):741–755. doi: 10.1093/biomet/ast012

PMCID: PMC3893096 NIHMSID: NIHMS480514 PMID: 24443587

Summary

Clustered survival data frequently arise in biomedical applications, where event times of interest are clustered into groups such as families. In this article we consider an accelerated failure time frailty model for clustered survival data and develop nonparametric maximum likelihood estimation for it via a kernel smoother aided EM algorithm. We show that the proposed estimator for the regression coefficients is consistent, asymptotically normal and semiparametric efficient when the kernel bandwidth is properly chosen. An EM-aided numerical differentiation method is derived for estimating its variance. Simulation studies evaluate the finite sample performance of the estimator, and it is applied to the Diabetic Retinopathy data set.

Keywords: Accelerated failure time model, Clustered survival data, EM algorithm, Kernel smoothing, Profile likelihood estimation

1. Introduction

Clustered survival data are a common type of multivariate survival data, often encountered in fields such as medicine, economics and epidemiology. Because multivariate survival models are important tools for analyzing clustered survival data, they have attracted considerable attention. There are two main approaches: marginal modelling and joint modelling via random effects. The first approach models the marginal distribution of correlated failure times without specifying the correlation structure. For example, Wei et al. (1989) proposed marginal regression analysis based on the proportional hazards model (Cox, 1972) for multivariate failure time data. A review of marginal approaches based on the proportional hazards model can be found in Lin (1994). In some applications, such as family studies, the within-cluster association is also important to investigators. Joint modelling uses random effects to describe the association among failure times within clusters. In addition, by appropriately taking into account the correlation structure, joint modelling can have better estimation efficiency than the marginal approach. Clayton & Cuzick (1985) introduced cluster-specific random effects or frailties to the proportional hazards model, which assumes that subjects within the same cluster can be considered independent conditional on the frailty. The multiplicative proportional hazards frailty model has been widely studied (Hougaard, 1987; Oakes, 1989; Nielsen et al., 1992) and various frailty distributions have been used to describe the within cluster correlation. The large sample properties of the associated nonparametric maximum likelihood estimators have been investigated by Murphy (1994, 1995) and Parner (1998).

The accelerated failure time model (Kalbfleisch & Prentice, 2002) is a useful alternative to the proportional hazards model. Many methods have been developed for parameter estimation in the accelerated failure time model for univariate survival data (Buckley & James, 1979; Tsiatis, 1990; Ying, 1993; Jin et al., 2003; Zeng & Lin, 2007). Recently, this model has been extended to clustered survival data. For example, Jin et al. (2006a,b) considered the marginal accelerated failure time model for clustered survival data: the former extended the Buckley-James estimation method; the latter extended the weighted log-rank estimation method. To improve the efficiency of the marginal approach, Li & Yin (2009) proposed a generalized moments estimation method, incorporating a posited correlation matrix into the rank-based estimating equations and minimizing a quadratic inference function. In addition, Johnson & Strawderman (2009) applied the induced smoothing technique to the weighted log-rank estimators for clustered survival data, which facilitates the resulting estimation and inference procedures. To characterize the correlation structure of failure times within clusters, Pan (2001) proposed to use frailties in the accelerated failure time model and developed an EM-like algorithm to estimate the coefficients in the accelerated failure time frailty model. Based on Pan’s method, Zhang & Peng (2007) and Xu & Zhang (2010) developed more stable estimation procedures using M-estimation and rank-based estimation, respectively. More recently, Johnson & Strawderman (2012) introduced smoothing into the EM-like algorithm to facilitate parameter estimation. However, none of the above estimators are semiparametric efficient because the considered EM-like algorithms do not maximize the likelihood function. Moreover, the asymptotic properties of these estimators have not been studied. 
In this article, we develop a nonparametric maximum likelihood estimation method for the accelerated failure time frailty model.

2. The accelerated failure time frailty model

Let T_ij be the failure time, C_ij be the censoring time and X_ij be the p-dimensional vector of baseline covariates for the jth individual in the ith cluster, for i = 1, …, n and j = 1, …, m_i. Here n is the total number of clusters and m_i is the size of the ith cluster. The observed data are O = {(T̃_ij, δ_ij, X_ij) : i = 1, …, n; j = 1, …, m_i}, where T̃_ij = min(T_ij, C_ij) and δ_ij = I(T_ij ≤ C_ij).

The marginal accelerated failure time model is

\log T_{ij} = - β^{'} X_{ij} + ε_{ij},

(1)

where β is the p-dimensional vector of regression coefficients, and the error terms, (ε_i1, …, ε_{im_i}), are independent across clusters and independent of (X_i1, …, X_{im_i}). It is assumed that all ε_ij have a common unknown marginal distribution, and ε_ij and ε_ik may be correlated for j ≠ k. We assume that T_ij and C_ij are independent conditional on X_ij, and m_i is small compared to n and is noninformative, i.e., independent of T_ij, C_ij and X_ij.

To describe the dependence between clustered survival times, Pan (2001) proposed to consider the accelerated failure time frailty model. Specifically, given a positive latent variable α_i of mean 1 and variance σ², it is assumed that the hazard function of e^{ε_ij} is

λ_{ij} (t) = α_{i} λ (t) (i = 1, \dots, n; j = 1, \dots, m_{i}),

(2)

where λ(·) is an unspecified baseline hazard function. In addition, ε_i1, …, ε_{im_i} are assumed independent conditional on α_i and the magnitude of dependence among the ε_ij is characterized by the value of σ². There are many choices for the frailty distribution, e.g., the gamma distribution (Clayton, 1978), the positive stable distribution (Hougaard, 1986), the compound Poisson distribution (Aalen, 1992) and the log-normal distribution (McGilchrist & Aisbett, 1991).

3. Nonparametric maximum likelihood estimator

Let f_α(·; θ) denote the density of the latent variable α_i, where θ is an unknown finite dimensional vector of parameters. The log-likelihood function for the complete data, {(T̃_ij, δ_ij, X_ij, α_i) : i = 1, …, n; j = 1, …, m_i}, can be written as

l_{n}^{c} (β, Λ, θ) = l_{n, 1}^{c} (θ) + l_{n, 2}^{c} (β, Λ),

where $Λ (t) = \int_{0}^{t} λ (s) ds$ , R_ij (β) = log(T̃_ij) + β′ X_ij,

l_{n, 1}^{c} (θ) = \frac{1}{n} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} \log α_{i} + \frac{1}{n} \sum_{i = 1}^{n} \log f_{α} (α_{i}; θ),

(3)

l_{n, 2}^{c} (β, Λ) = \frac{1}{n} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} [δ_{ij} {β^{'} X_{ij} + \log λ (e^{R_{ij} (β)})} - α_{i} Λ (e^{R_{ij} (β)})] .

(4)

We use an EM algorithm to obtain the nonparametric maximum likelihood estimator. Let Ω̂^[k] = (β̂^[k], Λ̂^[k], θ̂^[k]) denote the parameter estimates at step k. In the expectation step, we obtain the conditional density of α_i given the observed data O and current parameter estimates Ω̂^[k],

f_{α} (α_{i} ∣ O, {\hat{Ω}}^{[k]}) = \frac{f_{α} (α_{i}; {\hat{θ}}^{[k]}) α_{i}^{\sum_{j = 1}^{m_{i}} δ_{ij}} \exp {- α_{i} \sum_{j = 1}^{m_{i}} {\hat{Λ}}^{[k]} (e^{R_{ij} ({\hat{β}}^{[k]})})}}{\int_{0}^{+ \infty} f_{α} (α_{i}; {\hat{θ}}^{[k]}) α_{i}^{\sum_{j = 1}^{m_{i}} δ_{ij}} \exp {- α_{i} \sum_{j = 1}^{m_{i}} {\hat{Λ}}^{[k]} (e^{R_{ij} ({\hat{β}}^{[k]})})} d α_{i}} .

(5)

The conditional expectations E(α_i∣O, Ω̂^[k]), E(log α_i∣O, Ω̂^[k]) and E{log f_α(α_i; θ)∣O, Ω̂^[k]} can be calculated as the integrals of the corresponding terms with respect to the conditional density f_α(α_i∣O, Ω̂^[k]). For example, when the frailty has a gamma density f_α(x; θ) = x^{θ−1} e^{−θx} θ^θ/Γ(θ), where x > 0, θ > 0 and $Γ (θ) = \int_{0}^{+ \infty} t^{θ - 1} e^{- t} dt$ , we have

\begin{array}{l} {\hat{α}}_{i}^{[k]} \equiv E (α_{i} ∣ O, {\hat{Ω}}^{[k]}) = (D_{i} + {\hat{θ}}^{[k]}) / {{\hat{θ}}^{[k]} + \sum_{j = 1}^{m_{i}} {\hat{Λ}}^{[k]} (e^{R_{ij} ({\hat{β}}^{[k]})})}, \\ E_{2, i}^{[k]} \equiv E (\log α_{i} ∣ O, {\hat{Ω}}^{[k]}) = Ψ (D_{i} + {\hat{θ}}^{[k]}) - \log {{\hat{θ}}^{[k]} + \sum_{j = 1}^{m_{i}} {\hat{Λ}}^{[k]} (e^{R_{ij} ({\hat{β}}^{[k]})})}, \\ E_{3, i}^{[k]} \equiv E {\log f_{α} (α_{i}; θ) ∣ O, {\hat{Ω}}^{[k]}} = (θ - 1) E_{2, i}^{[k]} - θ {\hat{α}}_{i}^{[k]} + θ \log θ - \log Γ (θ), \end{array}

where Ψ(x) = Γ′(x)/Γ(x) is the digamma function. For general frailty distributions, such as the log-normal distribution, these conditional expectations may not have closed analytical forms. In such cases we use Gaussian quadrature. Therefore, the conditional expectations of (3) and (4) given O and Ω̂^[k] are

E {l_{n, 1}^{c} (θ) ∣ O, {\hat{Ω}}^{[k]}} = \frac{1}{n} \sum_{i = 1}^{n} E_{2, i}^{[k]} D_{i} + \frac{1}{n} \sum_{i = 1}^{n} E_{3, i}^{[k]},

(6)

E {l_{n, 2}^{c} (β, Λ) ∣ O, {\hat{Ω}}^{[k]}} = \frac{1}{n} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} [δ_{ij} {β^{'} X_{ij} + \log λ (e^{R_{ij} (β)})} - {\hat{α}}_{i}^{[k]} Λ (e^{R_{ij} (β)})],

(7)

where $D_{i} = \sum_{j = 1}^{m_{i}} δ_{ij}$ is the total number of observed events in cluster i.
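As an illustration of the expectation step under the gamma frailty, the sketch below evaluates the closed-form expressions for E(α_i∣O, Ω̂^[k]) and E(log α_i∣O, Ω̂^[k]) and checks them against a direct numerical integration of the conditional density (5). The function names, the hand-written digamma approximation, the trapezoidal rule (used here in place of Gaussian quadrature) and the integration limits are all assumptions made for this example, not part of the original method.

```python
import math

def digamma(x):
    """Digamma function via upward recurrence plus an asymptotic series (x > 0)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def gamma_e_step(d_i, cum_haz_sum, theta):
    """Closed-form E(alpha_i | O) and E(log alpha_i | O) under the gamma frailty.

    d_i          -- number of observed events in the cluster (D_i)
    cum_haz_sum  -- sum over j of Lambda(e^{R_ij(beta)}) at the current estimates
    theta        -- current value of the gamma frailty parameter
    """
    shape = d_i + theta
    rate = theta + cum_haz_sum
    return shape / rate, digamma(shape) - math.log(rate)

def numeric_e_step(d_i, cum_haz_sum, theta, lo=-40.0, hi=8.0, steps=4800):
    """The same expectations by trapezoidal integration of (5) over z = log(alpha)."""
    h = (hi - lo) / steps
    norm = mean = mean_log = 0.0
    for k in range(steps + 1):
        z = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        # Unnormalised density of z: alpha^(D + theta) * exp(-(theta + A) * alpha).
        log_dens = (d_i + theta) * z - (theta + cum_haz_sum) * math.exp(z)
        dens = weight * math.exp(log_dens)
        norm += dens
        mean += dens * math.exp(z)
        mean_log += dens * z
    # The step size h cancels in the ratios, so it is not applied.
    return mean / norm, mean_log / norm
```

For instance, `gamma_e_step(3, 1.5, 2.0)` and `numeric_e_step(3, 1.5, 2.0)` should agree closely; the fixed integration limits are only adequate when D_i + θ is not very small.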

In the maximization step, equation (6) can be easily maximized using standard gradient-based optimization algorithms. Let θ̂^[k+1] denote the maximizer of (6). The conditional log-likelihood given in (7) cannot be directly maximized over β and Λ. We adopt a kernel smoothing technique similar to that used by Zeng & Lin (2007). Specifically, consider the piecewise constant hazard function $\tilde{λ} (t) = \sum_{l = 1}^{J_{n}} c_{l} I (t_{l - 1} \leq t < t_{l})$ on [0, M], where 0 = t₀ < t₁ < ⋯ < t_{J_n} = M are equally spaced, and M is an upper bound for all e^{R_ij (β)}. Accordingly, the cumulative hazard function is $\tilde{Λ} (t) = \sum_{l = 1}^{J_{n}} c_{l} (t - t_{l - 1}) I (t_{l - 1} \leq t < t_{l}) + (M / J_{n}) \sum_{l = 1}^{J_{n}} c_{l} I (t \geq t_{l})$ . Using these expressions for λ̃(·) and Λ̃(·) in (7) and maximizing (7) with respect to c_l (l = 1, …, J_n), for fixed β, the following maximizers are obtained:

{\hat{c}}_{l}^{[k]} = \frac{\sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} I (t_{l - 1} \leq e^{R_{ij} (β)} < t_{l})}{\sum_{i = 1}^{n} {\hat{α}}_{i}^{[k]} \sum_{j = 1}^{m_{i}} {(e^{R_{ij} (β)} - t_{l - 1}) I (t_{l - 1} \leq e^{R_{ij} (β)} < t_{l}) + (M / J_{n}) I (e^{R_{ij} (β)} \geq t_{l})}} .
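The maximizers ĉ_l^[k] are ratios of an event count to a frailty-weighted exposure in each interval. The following sketch computes them for fixed β; the function name, argument layout and the convention of returning 0 for an interval with no exposure are assumptions made for illustration.

```python
def piecewise_hazard(times, events, alpha_hat, upper, n_bins):
    """Maximisers c_l of (7) over piecewise-constant hazards, for fixed beta.

    times      -- list of clusters, each a list of e^{R_ij(beta)} (all below upper)
    events     -- matching lists of event indicators delta_ij
    alpha_hat  -- one conditional frailty mean per cluster
    upper      -- M, an upper bound for every e^{R_ij(beta)}
    n_bins     -- J_n, the number of equally spaced intervals on [0, M]
    """
    width = upper / n_bins
    counts = [0.0] * n_bins
    exposure = [0.0] * n_bins
    for cluster_t, cluster_d, a in zip(times, events, alpha_hat):
        for t, d in zip(cluster_t, cluster_d):
            last = min(int(t / width), n_bins - 1)  # interval containing t
            counts[last] += d
            for l in range(last):
                exposure[l] += a * width  # t lies beyond interval l: full width M / J_n
            exposure[last] += a * (t - last * width)  # partial exposure t - t_{l-1}
    # An interval with no exposure also has no events; 0 is returned by convention.
    return [c / e if e > 0 else 0.0 for c, e in zip(counts, exposure)]
```

With three single-member clusters at e^{R} = 0.5 (event), 1.5 (event) and 1.5 (censored), unit frailty means, M = 2 and J_n = 2, the exposures are 2.5 and 1.0, so the function returns hazards 0.4 and 1.0.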

The profile likelihood function for β constructed using the above expression for ${\hat{c}}_{l}^{[k]}$ is

\begin{array}{l} l_{n, 2}^{p, k} (β) = & \frac{1}{n} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} β^{'} X_{ij} + \sum_{l = 1}^{J_{n}} [{\frac{1}{n} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} I (t_{l - 1} \leq e^{R_{ij} (β)} < t_{l})} \\ \times \log {\frac{J_{n}}{nM} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} I (t_{l - 1} \leq e^{R_{ij} (β)} < t_{l})}] - \sum_{l = 1}^{J_{n}} ({\frac{1}{n} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} I (t_{l - 1} \leq e^{R_{ij} (β)} < t_{l})} \\ \times \log [\frac{J_{n}}{nM} \sum_{i = 1}^{n} {\hat{α}}_{i}^{[k]} \sum_{j = 1}^{m_{i}} {(e^{R_{ij} (β)} - t_{l - 1}) I (t_{l - 1} \leq e^{R_{ij} (β)} < t_{l}) + \frac{M}{J_{n}} I (e^{R_{ij} (β)} \geq t_{l})}]) . \end{array}

Following derivations similar to those given in Zeng & Lin (2007), $l_{n, 2}^{p, k} (β)$ converges uniformly in β to a limiting function as n → ∞, J_n → ∞ and J_n/n → 0, and the limiting function can be approximated by the smooth function

l_{n, 2}^{s, k} (β) = \frac{1}{n} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} [β^{'} X_{ij} + \log {{\hat{λ}}^{[k + 1]} (e^{R_{ij} (β)}; β)}],

(8)

where

{\hat{λ}}^{[k + 1]} (t; β) = \frac{1}{t h_{n}} \frac{\sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} K [{R_{ij} (β) - \log (t)} / h_{n}]}{\sum_{i = 1}^{n} {\hat{α}}_{i}^{[k]} \sum_{j = 1}^{m_{i}} \int_{- \infty}^{{R_{ij} (β) - \log (t)} / h_{n}} K (u) du}, t > 0 .

(9)

Let β̂^[k+1] denote the maximizer of $l_{n, 2}^{s, k} (β)$ . Given β̂^[k+1], a smooth estimator of λ(t) can be obtained as λ̂^[k+1](t; β̂^[k+1]) and ${\hat{Λ}}^{[k + 1]} (t) \equiv \int_{0}^{t} {\hat{λ}}^{[k + 1]} (s; {\hat{β}}^{[k + 1]}) ds$ .
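To make (8) and (9) concrete, the sketch below evaluates the kernel-smoothed hazard and the smoothed profile log-likelihood for a candidate β. It assumes a standard Gaussian kernel, which this section does not prescribe; the function names and the nested-list data layout are likewise hypothetical.

```python
import math

def gauss_pdf(u):
    """Standard normal density, used as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def gauss_cdf(u):
    """Integral of K from minus infinity to u."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def smoothed_hazard(t, resid, events, alpha_hat, h):
    """Kernel-smoothed hazard (9) at t > 0.

    resid      -- list of clusters, each a list of R_ij(beta)
    events     -- matching lists of event indicators delta_ij
    alpha_hat  -- conditional frailty means, one per cluster
    h          -- bandwidth h_n
    """
    log_t = math.log(t)
    num = den = 0.0
    for r_i, d_i, a in zip(resid, events, alpha_hat):
        for r, d in zip(r_i, d_i):
            u = (r - log_t) / h
            num += d * gauss_pdf(u)
            den += a * gauss_cdf(u)
    return num / (t * h * den)

def smoothed_objective(beta, log_times, covariates, events, alpha_hat, h):
    """Smoothed profile log-likelihood (8) for a candidate beta."""
    def lin(xs):
        return sum(b * x for b, x in zip(beta, xs))
    # R_ij(beta) = log(observed time) + beta' X_ij
    resid = [[lt + lin(xs) for lt, xs in zip(lt_i, x_i)]
             for lt_i, x_i in zip(log_times, covariates)]
    total = 0.0
    for x_i, r_i, d_i in zip(covariates, resid, events):
        for xs, r, d in zip(x_i, r_i, d_i):
            if d:  # only observed events contribute to (8)
                lam = smoothed_hazard(math.exp(r), resid, events, alpha_hat, h)
                total += lin(xs) + math.log(lam)
    return total / len(log_times)
```

Any general-purpose optimizer could then be applied to `smoothed_objective` over β. The double loop costs on the order of the squared number of observations per evaluation, which is acceptable for a sketch but would need vectorizing for larger data sets.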

From an initial estimator Ω̂^[0], the E-step and M-step are repeated until convergence. The estimators of β, Λ(·) and θ at convergence are denoted by β̂_n, Λ̂_n(·) and θ̂_n, respectively. In our implementation, the starting value of β was taken to be the maximum smoothed profile likelihood estimator of Zeng & Lin (2007) assuming working independence among correlated failure times. An initial estimator for λ was obtained by setting k = −1 and ${\hat{α}}_{i}^{[- 1]} \equiv 1$ in (9). Based on our numerical experience, the convergence of the proposed EM algorithm is not sensitive to the choice of the starting value for θ. For convenience, we chose θ̂^[0] = 1 for all scenarios considered in our numerical studies. Define ${\hat{γ}}_{n} = {({\hat{β}}_{n}^{'}, {\hat{θ}}_{n}^{'})}^{'}$ and $γ_{0} = {(β_{0}^{'}, θ_{0}^{'})}^{'}$ , the true value of γ. Let Λ₀ and λ₀ denote the true values of Λ and λ, respectively. Next, we establish the asymptotic properties for estimators γ̂_n and Λ̂_n.

Theorem 1

Assume that the regularity conditions (C1)–(C8) given in the Appendix hold. As n → ∞, $n h_{n}^{2} \to \infty$ and $n h_{n}^{4} \to 0$ : (i) sup_{t∈[0,τ]} |Λ̂_n(t) − Λ₀(t)| → 0 and γ̂_n → γ₀ almost surely; and (ii) n^{1/2}(γ̂_n − γ₀) converges in distribution to a mean-zero normal random vector with a covariance matrix that achieves the semiparametric efficiency bound I⁻¹.

The proof of Theorem 1 is given in the Appendix. To estimate the variance of β̂_n, we adopt the EM-aided numerical differentiation method proposed by Chen & Little (1999), which numerically computes the empirical Fisher information matrix of the profile likelihood. A similar method was used by Lu (2010) for variance estimation in the accelerated failure time model with a cure fraction. Specifically,

E {l_{n, 1}^{c} (θ) + l_{n, 2}^{c} (β, Λ) ∣ O, Ω} = \frac{1}{n} \sum_{i = 1}^{n} \tilde{l_{i}} (β, Λ, θ) .

The jth component of β̂_n is perturbed by a small value, d. The pair of perturbed estimates is denoted by β̂_n,j− = (β̂_n,1, …, β̂_n,j − d, …, β̂_n,p)′ and β̂_n,j+ = (β̂_n,1, …, β̂_n,j + d, …, β̂_n,p)′ for j = 1, …, p. The β is fixed at β̂_n,j−, and the above EM algorithm is implemented until convergence. The estimates of Λ and θ at convergence are denoted by Λ̂_n,j− and θ̂_n,j−, respectively. The estimates Λ̂_n,j+ and θ̂_n,j+ can be similarly obtained. For i = 1, …, n and j = 1, …, p, define

{\tilde{S}}_{ij} = {\tilde{l_{i}} ({\hat{β}}_{n, j +}, {\hat{Λ}}_{n, j +}, {\hat{θ}}_{n, j +}) - \tilde{l_{i}} ({\hat{β}}_{n, j -}, {\hat{Λ}}_{n, j -}, {\hat{θ}}_{n, j -})} / (2 d) .

Let S̃_i = (S̃_i1, …, S̃_ip)′ and ${\tilde{I}}_{n} = \sum_{i = 1}^{n} {\tilde{S}}_{i} {\tilde{S}}_{i}^{'}$ . Then, the covariance matrix of β̂_n can be estimated by ${\tilde{I}}_{n}^{- 1}$ .
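The numerical differentiation steps above can be sketched as follows. The callable `profile_contributions` is hypothetical: it is assumed to hold β fixed, rerun the EM algorithm for Λ and θ until convergence, and return the per-cluster terms l̃_i at the converged values. The default perturbation size and the small Gauss-Jordan inverse are also illustrative choices.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    p = len(matrix)
    aug = [list(row) + [float(i == k) for k in range(p)]
           for i, row in enumerate(matrix)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]

def em_aided_covariance(beta_hat, profile_contributions, d=1e-3):
    """Estimate the covariance of beta_hat from central differences of l~_i.

    profile_contributions(beta) -- hypothetical callable returning the list of
    per-cluster terms l~_i with Lambda and theta re-estimated at the given beta.
    """
    p = len(beta_hat)
    scores = []  # scores[j][i] approximates S~_ij
    for j in range(p):
        up = list(beta_hat)
        up[j] += d
        down = list(beta_hat)
        down[j] -= d
        l_up = profile_contributions(up)
        l_down = profile_contributions(down)
        scores.append([(a - b) / (2.0 * d) for a, b in zip(l_up, l_down)])
    n = len(scores[0])
    # Empirical information: sum over clusters of S~_i S~_i'.
    info = [[sum(scores[r][i] * scores[c][i] for i in range(n)) for c in range(p)]
            for r in range(p)]
    return invert(info)
```

Each call to `profile_contributions` hides a full EM run, so the procedure requires 2p additional EM fits in total.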

4. Numerical Examples

4·1. Simulation studies

We generated clustered failure times from the following model

\log T_{ij} = X_{ij 1} - X_{ij 2} + ε_{ij} (i = 1, \dots, 100; j = 1, \dots, 5),

where X_ij1 follows a Bernoulli distribution with a success probability of 0.5, X_ij2 follows a uniform distribution on [-1,1] and ε_ij follows the frailty model (2). We considered two frailty distributions: gamma frailty with mean 1 and variance σ² = 1/θ; and log-normal frailty with mean 1 and variance σ² = e^θ − 1. Further, we considered three choices for λ₀(t): Weibull-type, λ₀(t) = at^b; log-normal-type, λ₀(t) = t⁻¹ ϕ{log(t)}/[1 − Φ{log(t)}], where ϕ(·) and Φ(·) are the density and cumulative distribution functions of the standard normal random variable; and reciprocal-type, λ₀(t) = c/(1 + t). Here, a, b and c are positive constants. Censoring times were generated from a uniform distribution on [0, τ_c], where τ_c was chosen to yield censoring proportions of 15% and 40%. For each setting, we conducted 2000 simulation runs.
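One replicate of this design can be generated along the following lines, here for the gamma frailty and the Weibull-type hazard. The constants a, b and τ_c are not reported in this section, so the defaults below are placeholders, and the function name and output layout are assumptions. The error e^{ε_ij} is drawn by inverting its conditional survival function exp{−α_i Λ₀(t)}, with Λ₀(t) = a t^{b+1}/(b + 1).

```python
import math
import random

def simulate_clusters(n=100, m=5, theta=1.0, a=1.0, b=1.0, tau_c=10.0, seed=0):
    """Draw clusters of (observed time, event indicator, covariates).

    Gamma frailty with mean 1 and variance 1/theta; Weibull-type baseline
    hazard a * t**b for e^eps. The values of a, b and tau_c are placeholders.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        alpha = rng.gammavariate(theta, 1.0 / theta)  # shape theta, scale 1/theta
        cluster = []
        for _ in range(m):
            x1 = 1.0 if rng.random() < 0.5 else 0.0
            x2 = rng.uniform(-1.0, 1.0)
            # Solve alpha * Lambda_0(w) = -log(U) for w = e^eps.
            target = -math.log(1.0 - rng.random()) / alpha
            w = ((b + 1.0) * target / a) ** (1.0 / (b + 1.0))
            t = math.exp(x1 - x2) * w  # log T = X1 - X2 + eps
            c = rng.uniform(0.0, tau_c)
            cluster.append((min(t, c), int(t <= c), (x1, x2)))
        data.append(cluster)
    return data
```

In practice τ_c would be tuned, for example by trial and error on a large simulated sample, until the observed censoring proportion is close to the 15% or 40% target.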

For the bandwidth parameter, h_n, of the kernel smoother, we adapted the optimal bandwidths proposed by Jones (1990) and Jones & Sheather (1991) for density estimation. Such bandwidths were also used by Zeng & Lin (2007) for smoothing the profile likelihood in the standard accelerated failure time model. Specifically, we set h_n = ζ σ̂_e n^{−1/3}, where ζ is a positive constant, and σ̂_e is the sample standard deviation of the fitted residuals, $\log {\tilde{T}}_{ij} + {\hat{β}}_{LS}^{'} X_{ij}$ . Here β̂_LS is the least squares estimate based on all of the data, including censored data. In our simulations, we tried a range of values for ζ and found that 0.8 ≤ ζ ≤ 1.8 works well in all of the scenarios. For comparison, we also included the Gehan rank estimator (Jin et al., 2006b), the induced smoothing estimator (Johnson & Strawderman, 2009) and the smoothed EM-like estimator (Johnson & Strawderman, 2012). The former two estimators are based on the marginal accelerated failure time model.
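The bandwidth rule is a one-line computation once the least squares residuals are available. In the sketch below the function name is hypothetical, the residuals are assumed to be supplied by the caller, and n is taken to be the number of clusters, as defined in Section 2.

```python
import statistics

def kernel_bandwidth(residuals, n_clusters, zeta=1.0):
    """Bandwidth h_n = zeta * sd(residuals) * n^(-1/3).

    residuals   -- fitted least squares residuals, censored observations included
    n_clusters  -- n, assumed here to be the number of clusters
    zeta        -- positive tuning constant
    """
    sigma_e = statistics.stdev(residuals)  # sample standard deviation
    return zeta * sigma_e * n_clusters ** (-1.0 / 3.0)
```

A simple sensitivity check is to recompute the estimates over a few values of ζ in the range 0.8 to 1.8 reported above and confirm that they are stable.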

The results for the gamma and log-normal frailties are summarized in Tables 1 and 2, respectively. Because the results for the various hazard functions are similar, we present only the results for the Weibull-type and reciprocal-type. In addition, as reported in Johnson & Strawderman (2009), the Gehan rank estimator and induced smoothing estimator have very similar performances. Therefore, we exclude the results for the Gehan rank estimator. All three estimators for the regression parameters are essentially unbiased under all settings and the averages of the estimated standard errors obtained using the proposed EM-aided numerical differentiation method for the nonparametric maximum likelihood estimator are close to their standard deviations with the empirical coverage probabilities close to the nominal level. In most cases, the nonparametric maximum likelihood estimator is more efficient than the Gehan rank estimator and the induced smoothing estimator. The efficiency gain is more substantial when the variance of the frailty is large, but it decreases as the variance decreases. This result agrees with our expectation since when the variance of the frailty is large, the survival times within the same cluster are strongly correlated. Thus, the nonparametric maximum likelihood estimator is expected to be more efficient since it effectively accounts for the within-cluster correlation. The nonparametric maximum likelihood estimator is generally more efficient than the smoothed EM-like estimator in terms of the mean square error for the Weibull-type hazard function, especially when the correlation between clustered survival times is strong and the censoring proportion is low. However, for the reciprocal-type hazard function, the smoothed EM-like estimator may have better efficiency than the nonparametric maximum likelihood estimator when the correlation is weak or the censoring proportion is high. 
This result is attributed to the smaller biases of the smoothed EM-like estimators. Finally, the proposed nonparametric maximum likelihood estimator for the variance of the frailty is nearly unbiased. The mean estimated survival curves for S₀(t) ≡ exp{−Λ₀(t)} are given in Figures 1 – 6 of the Supplementary Material. For all the scenarios, the mean estimated survival curves are close to the true survival curves.

Table 1.

Simulation results for gamma frailty.

CR	σ²	σ̂²	SD	β	NPMLE				J-S¹		J-S²
CR	σ²	σ̂²	SD	β	Bias(%)	SD	SE	CP(%)	Bias(%)	SD	Bias(%)	SD
λ₀(t): Weibull type
15%	2	1.98	.30	β̂₁	−0.5	.068	.08	97	0.3	.078	0.0	.104
15%	2	1.98	.30	β̂₂	0.5	.061	.07	97	−0.4	.069	0.1	.090
40%	2	1.96	.31	β̂₁	−1.6	.085	.09	95	1.2	.090	−1.0	.109
40%	2	1.96	.31	β̂₂	1.5	.076	.08	94	−0.9	.081	1.3	.096
15%	1	0.98	.17	β̂₁	−0.5	.069	.07	95	0.3	.075	−0.4	.082
15%	1	0.98	.17	β̂₂	0.4	.060	.06	94	−0.3	.064	0.3	.071
40%	1	0.97	.19	β̂₁	−2.2	.082	.10	97	0.4	.089	−1.1	.093
40%	1	0.97	.19	β̂₂	2.3	.071	.08	96	−0.3	.076	1.1	.081
15%	0.5	0.48	.12	β̂₁	−0.9	.062	.08	97	−0.1	.069	−0.5	.069
15%	0.5	0.48	.12	β̂₂	1.1	.055	.07	97	0.1	.061	0.7	.062
40%	0.5	0.47	.14	β̂₁	−2.4	.077	.09	96	−0.2	.083	−1.7	.082
40%	0.5	0.47	.14	β̂₂	2.9	.070	.08	96	0.2	.075	1.9	.074
λ₀ (t): reciprocal type
15%	2	2.00	.30	β̂₁	−0.3	.154	.17	96	0.6	.161	0.3	.234
15%	2	2.00	.30	β̂₂	1.4	.138	.14	95	0.5	.143	0.8	.208
40%	2	2.00	.32	β̂₁	−2.5	.173	.18	95	1.0	.179	0.3	.219
40%	2	2.00	.32	β̂₂	4.0	.154	.16	94	0.8	.157	1.0	.201
15%	1	0.99	.17	β̂₁	−1.5	.143	.15	96	−0.5	.144	−0.9	.166
15%	1	0.99	.17	β̂₂	1.4	.125	.13	96	0.8	.128	0.5	.148
40%	1	0.98	.22	β̂₁	−4.2	.163	.18	96	−0.8	.160	−1.6	.172
40%	1	0.98	.22	β̂₂	5.2	.146	.16	96	1.0	.145	1.0	.152
15%	0.5	0.48	.11	β̂₁	−1.0	.136	.13	94	0.6	.136	0.2	.139
15%	0.5	0.48	.11	β̂₂	1.1	.124	.12	93	−1.0	.122	0.2	.122
40%	0.5	0.47	.12	β̂₁	−5.1	.156	.17	95	0.8	.157	−0.3	.156
40%	0.5	0.47	.12	β̂₂	5.2	.140	.15	94	−0.9	.139	1.0	.138

CR, censoring rate; σ², true variance of frailty; σ̂², estimate for σ²; SD, sample standard deviation; SE, mean of estimated standard errors; CP, empirical coverage probability of 95% Wald-type confidence interval; NPMLE, proposed nonparametric maximum likelihood estimator; J-S¹, smoothed EM-like estimator; J-S², induced smoothing estimator.

Table 2.

Simulation results for log-normal frailty.

CR	σ²	σ̂²	SD	β	NPMLE				J-S¹		J-S²
CR	σ²	σ̂²	SD	β	Bias(%)	SD	SE	CP(%)	Bias(%)	SD	Bias(%)	SD
λ₀(t): Weibull type
15%	3.48	3.43	1.26	β̂₁	−1.0	.066	.08	97	0.3	.077	−0.6	.083
15%	3.48	3.43	1.26	β̂₂	1.0	.059	.07	97	−0.2	.066	0.6	.073
40%	3.48	3.10	1.31	β̂₁	−2.3	.085	.10	96	0.5	.093	−1.5	.099
40%	3.48	3.10	1.31	β̂₂	2.5	.074	.09	97	−0.4	.080	1.7	.085
15%	1.72	1.70	0.55	β̂₁	−0.8	.066	.07	96	−0.2	.073	−0.5	.076
15%	1.72	1.70	0.55	β̂₂	0.9	.058	.06	96	0.1	.063	0.5	.066
40%	1.72	1.59	0.59	β̂₁	−2.8	.083	.09	95	0.5	.089	−1.6	.092
40%	1.72	1.59	0.59	β̂₂	3.0	.073	.08	96	−0.3	.077	1.7	.078
15%	0.65	0.64	0.21	β̂₁	−0.7	.063	.07	96	−0.1	.070	−0.5	.067
15%	0.65	0.64	0.21	β̂₂	0.8	.056	.06	96	0.2	.060	0.5	.059
40%	0.65	0.63	0.23	β̂₁	−3.1	.080	.09	95	−0.2	.086	−1.6	.083
40%	0.65	0.63	0.23	β̂₂	3.0	.069	.08	95	0.5	.075	1.6	.072
λ₀ (t): reciprocal type
15%	3.48	3.41	1.24	β̂₁	−0.8	.143	.17	98	0.8	.151	0.4	.171
15%	3.48	3.41	1.24	β̂₂	1.1	.128	.15	97	−0.2	.133	−0.3	.157
40%	3.48	2.94	1.05	β̂₁	−3.8	.171	.19	96	1.0	.173	−0.2	.187
40%	3.48	2.94	1.05	β̂₂	4.7	.151	.17	96	−0.5	.152	0.5	.169
15%	1.72	1.69	0.53	β̂₁	−1.0	.140	.16	97	0.7	.146	0.2	.153
15%	1.72	1.69	0.53	β̂₂	1.5	.126	.14	96	−0.3	.132	−0.1	.140
40%	1.72	1.53	0.51	β̂₁	−4.5	.168	.18	96	1.2	.167	−0.3	.172
40%	1.72	1.53	0.51	β̂₂	5.1	.149	.16	95	0.5	.148	0.7	.156
15%	0.65	0.65	0.19	β̂₁	−1.0	.134	.14	95	0.8	.136	0.1	.135
15%	0.65	0.65	0.19	β̂₂	1.5	.123	.13	94	−0.3	.124	0.1	.124
40%	0.65	0.62	0.21	β̂₁	−5.1	.160	.18	95	0.5	.159	−0.4	.155
40%	0.65	0.62	0.21	β̂₂	6.1	.142	.16	95	0.3	.141	0.6	.142

CR, censoring rate; σ², true variance of frailty; σ̂², estimate for σ²; SD, sample standard deviation; SE, mean of estimated standard errors; CP, empirical coverage probability of 95% Wald-type confidence interval; NPMLE, proposed nonparametric maximum likelihood estimator; J-S¹, smoothed EM-like estimator; J-S², induced smoothing estimator.

We also conducted a sensitivity analysis to study the performance of the nonparametric maximum likelihood estimator when the frailty distribution is misspecified. Specifically, clustered survival data were generated from the accelerated failure time frailty model with the log-normal frailty as considered previously. However, the nonparametric maximum likelihood estimator was computed based on the gamma frailty. The simulation results are given in Table 3. The nonparametric maximum likelihood estimator for the regression parameters shows very small biases that are comparable to those reported in Table 2 when the log-normal frailty distribution was correctly specified. The means of the estimated standard errors are close to the standard deviations with proper coverage probabilities. Based on the limited simulations we have conducted, the performance of the nonparametric maximum likelihood estimator for the regression parameters is relatively robust to the misspecification of the frailty distribution. However, the estimate for the variance of the frailty shows large biases when the frailty distribution is misspecified. The mean estimated survival curves for S₀(t) are given in Figures 7–9 of the Supplementary Material. When the log-normal frailty is misspecified as the gamma frailty, the mean estimated survival curves slightly overestimate the true survival curves for the cases with large frailty variance, i.e., σ² = 3.48, while they are nearly unbiased for cases with smaller variances.

Table 3.

Sensitivity analysis results for misspecified frailty distribution

CR	σ²	σ̂²	SD	β	Bias(%)	SD	SE	CP(%)
λ₀(t): Weibull type
15%	3.48	1.01	0.17	β̂₁	−1.1	.067	.08	97
15%	3.48	1.01	0.17	β̂₂	1.0	.060	.07	97
40%	3.48	1.04	0.20	β̂₁	−3.4	.086	.10	96
40%	3.48	1.04	0.20	β̂₂	3.3	.075	.09	97
15%	1.72	0.70	0.12	β̂₁	−0.9	.067	.07	96
15%	1.72	0.70	0.12	β̂₂	1.0	.059	.07	97
40%	1.72	0.72	0.15	β̂₁	−3.3	.084	.09	95
40%	1.72	0.72	0.15	β̂₂	3.2	.072	.08	96
15%	0.65	0.39	0.10	β̂₁	−0.8	.063	.07	96
15%	0.65	0.39	0.10	β̂₂	0.8	.056	.06	97
40%	0.65	0.39	0.12	β̂₁	−3.2	.080	.09	96
40%	0.65	0.39	0.12	β̂₂	3.0	.069	.08	96

λ₀ (t): reciprocal type
15%	3.48	1.04	0.16	β̂₁	−1.2	.145	.17	98
15%	3.48	1.04	0.16	β̂₂	1.4	.130	.15	97
40%	3.48	1.14	0.22	β̂₁	−5.3	.174	.19	96
40%	3.48	1.14	0.22	β̂₂	5.4	.153	.17	96
15%	1.72	0.74	0.13	β̂₁	−1.6	.141	.16	97
15%	1.72	0.74	0.13	β̂₂	1.7	.127	.14	96
40%	1.72	0.79	0.16	β̂₁	−5.4	.169	.18	95
40%	1.72	0.79	0.16	β̂₂	5.6	.151	.16	95
15%	0.65	0.38	0.09	β̂₁	−1.6	.134	.14	95
15%	0.65	0.38	0.09	β̂₂	1.8	.123	.12	94
40%	0.65	0.41	0.11	β̂₁	−6.6	.160	.18	95
40%	0.65	0.41	0.11	β̂₂	6.8	.146	.16	95

We conducted additional simulations with cluster size of 2 and n = 200. The simulation results are given in the Supplementary Material. The findings are similar to those reported here.

4·2. Analysis of diabetic retinopathy data

We applied our estimation methods to clustered survival data from the diabetic retinopathy study conducted by the National Eye Institute (Huster et al., 1989). The study enrolled 197 patients with proliferative diabetic retinopathy, representing a 50% simple random sample of the high-risk patients. For each patient, the photocoagulation treatment was randomly assigned to one eye, while the other eye was an untreated control. The endpoint of interest is the time to severe visual loss after treatment. In addition to the treatment indicator (1 for treated with photocoagulation, 0 for untreated), there are three prognostic factors: age at diagnosis of diabetes; type of diabetes (1 for adult, 0 for juvenile); and risk group, ranging from 6 to 12. A primary goal is to study the effects of treatment and risk factors on the time to severe visual loss. This data set has been previously studied. For example, Lu (2007) studied the data using a marginal bivariate accelerated failure time model based on the weighted log-rank estimation method of Jin et al. (2006b). Lu also developed a statistical test for the association between pairs of failure times after adjusting for covariates. It was found that the null hypothesis of independence was rejected with a small p-value, which implies that there is a strong correlation between the pair of error terms in the bivariate accelerated failure time model. Here, we fit the accelerated failure time frailty model with all four covariates using the proposed nonparametric maximum likelihood estimation method and the Gehan rank estimation method. For our method, both the gamma and log-normal frailties were considered. The bandwidth parameters for the nonparametric maximum likelihood estimators were selected as in the simulation studies. The results are given in Table 4.
The nonparametric maximum likelihood estimators have much smaller standard errors than the Gehan rank estimator, which indicates the efficiency gained by taking the correlation between error terms into account in the nonparametric maximum likelihood estimators. In addition, the nonparametric maximum likelihood estimators with the gamma and log-normal frailty distributions show very similar performances, which may imply that the analysis results are not sensitive to the choice of the frailty distribution. The estimated variance of the frailty is 0.88 under the gamma frailty and 1.16 under the log-normal frailty. Finally, all of the methods found that treatment and risk group are significantly associated with time to severe visual loss, whereas age at diagnosis of diabetes and type of diabetes are not.

Table 4.

Analysis results for diabetes data.

	NPMLE_g			NPMLE_l			GehanR
	Est.	SE	pv	Est.	SE	pv	Est.	SE	pv
treatment	−0.929	.104	.000	−0.949	.105	.000	−0.981	.201	.000
age	0.011	.006	.078	0.008	.006	.162	0.010	.016	.551
type	−0.029	.102	.779	0.070	.096	.467	−0.301	.453	.506
risk	1.660	.353	.000	1.491	.346	.000	2.568	1.036	.013

NPMLE_g, nonparametric maximum likelihood estimator with the gamma frailty; NPMLE_l, nonparametric maximum likelihood estimator with the log-normal frailty; GehanR, Gehan rank estimator; Est., estimated coefficients; pv, p-value; age, age at diagnosis of diabetes; type, type of diabetes; risk, risk group.

5. Discussion

The proposed kernel-smoothing based nonparametric maximum likelihood estimation method can be extended to other types of multivariate survival data, such as recurrent event data. Specifically, let $N_{i}^{*} (t)$ denote the number of events observed on subject i by time t. We assume that $N_{i}^{*} (t)$ is a nonhomogeneous Poisson process and model its conditional intensity function given covariates Z_i and frailty α_i by α_i e^{β′ Z_i} λ(e^{β′ Z_i} t). This model is an extension of the accelerated failure time model for counting processes considered by Lin et al. (1998) and was studied by Strawderman (2006) using an EM-like algorithm. The nonparametric maximum likelihood estimation and its associated inference for the above model require further investigation.

Acknowledgments

We thank the editor, an associate editor and two referees for very insightful comments. Lu and Zhang’s research was supported by the National Cancer Institute.

Appendix

Throughout the proofs, we assume the following regularity conditions:

(C1)
The hazard function λ₀(·) is positive and thrice-continuously differentiable with λ̇₀(0) > 0, where λ̇₀(0) is the right derivative of λ₀(t) at t = 0. In addition, Λ₀(τ) < ∞.
(C2)
There exists some positive constant c₀, such that $pr (C_{ij} e^{β_{0}^{'} X_{ij}} \geq τ ∣ X_{ij}) \geq c_{0}$ .
(C3)
The covariates X_ij are bounded. If there exists a constant vector a, such that a′X_ij = 0 almost surely, then a = 0.
(C4)
The true regression parameters β₀ belong to the interior of a known compact set B, and 0 < θ₀ < ∞.
(C5)
The kernel function K(·) is thrice-continuously differentiable. In addition, K^(r) (·) (r = 0, 1, 2, 3), have bounded variations in R, where K^(r) (·) is the rth derivative of K(·).
(C6)
The cluster size m_i is completely random. In addition, there exists a positive integer m₀, such that 1 ≤ m_i ≤ m₀ and pr(m_i ≥ 2) > 0.
(C7)
For any 1 ≤ k ≤ m₀, $\sup_{θ} \int_{0}^{\infty} u^{k} | f_{α}^{(m)} (u; θ) | du < \infty$ for m = 0, 1, 2, where $f_{α}^{(m)} (u; θ)$ is the mth derivative of f_α(u; θ) with respect to θ. In addition, $\int_{0}^{\infty} u^{k} f_{α}^{(1)} (u; θ_{0}) du \neq 0$ for some k.
(C8)
The information matrix I is finite and positive definite.

Conditions (C1)–(C5) are similar to those used in Zeng & Lin (2007). Condition (C7), which is satisfied by many commonly used frailty distributions such as the gamma and log-normal distributions, is assumed to establish the consistency of the proposed estimators. Condition (C8) is assumed to establish the asymptotic normality of the estimators.
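As a worked check of the second part of (C7), suppose, purely as an assumption for this illustration, that the frailty is gamma distributed with mean 1 and variance θ (shape 1/θ, scale θ), and that differentiation with respect to θ can be interchanged with integration. The kth moment is then a polynomial in θ, and its derivative is nonzero for k = 2, which is an admissible choice whenever m₀ ≥ 2:

```latex
\int_{0}^{\infty} u^{k} f_{\alpha}(u; \theta)\, du
  = \theta^{k}\, \frac{\Gamma(1/\theta + k)}{\Gamma(1/\theta)}
  = \prod_{l=0}^{k-1} (1 + l\theta),
\qquad
\int_{0}^{\infty} u^{2} f_{\alpha}^{(1)}(u; \theta_{0})\, du
  = \frac{\partial}{\partial \theta} (1 + \theta) \Big|_{\theta = \theta_{0}}
  = 1 \neq 0 .
```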

Proof of Theorem 1

To establish the consistency of the estimators, we introduce the following quantity

{\tilde{Λ}}_{n} (t) \equiv \int_{0}^{t} \frac{{(n h_{n} s)}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} K [{R_{ij} (β_{0}) - \log s} / h_{n}]}{n^{- 1} \sum_{i = 1}^{n} α_{i}^{*} \sum_{j = 1}^{m_{i}} \int_{- \infty}^{{R_{ij} (β_{0}) - \log s} / h_{n}} K (u) du} ds,

where $α_{i}^{*} = E (α_{i} ∣ δ_{ij}, {\tilde{T}}_{ij}, X_{ij})$ . Following Lemma 2.4 of Schuster (1969), we can show that as n → ∞,

\begin{array}{l} \sup_{s \in [0, τ]} | \frac{1}{n h_{n} s} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} K {\frac{R_{ij} (β_{0}) - \log s}{h_{n}}} - \frac{dE {\sum_{j = 1}^{m_{i}} I (δ_{ij} = 1, e^{R_{ij} (β_{0})} \leq s)}}{ds} | \to 0, \\ \sup_{s \in [0, τ]} | \frac{1}{n} \sum_{i = 1}^{n} α_{i}^{*} \sum_{j = 1}^{m_{i}} \int_{- \infty}^{{R_{ij} (β_{0}) - \log s} / h_{n}} K (u) du - E {α_{i}^{*} \sum_{j = 1}^{m_{i}} I (e^{R_{ij} (β_{0})} \geq s)} | \to 0 \end{array}

almost surely. Moreover,

\begin{matrix} \frac{dE {\sum_{j = 1}^{m_{i}} I (δ_{ij} = 1, e^{R_{ij} (β_{0})} \leq s)}}{ds} = λ_{0} (s) E {α_{i} e^{- α_{i} Λ_{0} (s)} \sum_{j = 1}^{m_{i}} S_{c} (s e^{- β_{0}^{'} X_{ij}} ∣ X_{ij})}, \\ E {α_{i}^{*} \sum_{j = 1}^{m_{i}} I (e^{R_{ij} (β_{0})} \geq s)} = E {α_{i} \sum_{j = 1}^{m_{i}} I (e^{R_{ij} (β_{0})} \geq s)} = E {α_{i} e^{- α_{i} Λ_{0} (s)} \sum_{j = 1}^{m_{i}} S_{c} (s e^{- β_{0}^{'} X_{ij}} ∣ X_{ij})}, \end{matrix}

where S_c(t ∣ x) = pr(C_ij ≥ t ∣ X_ij = x). Therefore,

\sup_{s \in [0, τ]} | \frac{{(n h_{n} s)}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{m_{i}} δ_{ij} K [{R_{ij} (β_{0}) - \log s} / h_{n}]}{n^{- 1} \sum_{i = 1}^{n} α_{i}^{*} \sum_{j = 1}^{m_{i}} \int_{- \infty}^{{R_{ij} (β_{0}) - \log s} / h_{n}} K (u) du} - λ_{0} (s) | \to 0,

almost surely, which implies Λ̃_n(t) → Λ₀(t) almost surely for any t ∈ [0, τ]. This pointwise consistency can be strengthened to uniform consistency on [0, τ] due to the monotonicity and boundedness of Λ̃_n(t) and Λ₀(t).
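The ratio inside the supremum above can be sketched numerically. The following Python example is only an illustration under stated assumptions, not the authors' implementation: it uses a Gaussian kernel, treats the values R_ij(β₀) as given, sets every α_i* to 1 (no frailty), and flattens all observations into a single list. The function name `smoothed_hazard` and the simulated data are hypothetical. Because the numerator and denominator are divided by the same count, using the number of observations instead of the number of clusters does not change the ratio.

```python
import math
import random


def normal_pdf(x):
    """Gaussian kernel K(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Integral of the Gaussian kernel from -infinity to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def smoothed_hazard(s, r, delta, alpha_star, h):
    """Kernel-smoothed hazard estimate at s > 0.

    r          : values of R_ij(beta_0), i.e. log transformed observed times
    delta      : event indicators (1 = event, 0 = censored)
    alpha_star : conditional frailty means (all 1 in this sketch)
    h          : bandwidth on the log scale
    """
    n = len(r)
    log_s = math.log(s)
    num = sum(d * normal_pdf((ri - log_s) / h) for ri, d in zip(r, delta)) / (n * h * s)
    den = sum(a * normal_cdf((ri - log_s) / h) for ri, a in zip(r, alpha_star)) / n
    return num / den


# Hypothetical data: unit-exponential event times (hazard 1), uniform censoring.
rng = random.Random(0)
r, delta = [], []
for _ in range(10000):
    t = rng.expovariate(1.0)
    c = 3.0 * (1.0 - rng.random())  # censoring time in (0, 3]
    r.append(math.log(max(min(t, c), 1e-12)))  # guard against log(0)
    delta.append(1 if t <= c else 0)
alpha_star = [1.0] * len(r)
lam_hat = smoothed_hazard(1.0, r, delta, alpha_star, h=0.2)
```

With these simulated data the true hazard is 1, so `lam_hat` should be close to 1 up to smoothing bias and sampling noise.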

By Helly’s theorem, there exists a convergent subsequence (β̂_{n_k}, θ̂_{n_k}, Λ̂_{n_k}) such that (β̂_{n_k}, θ̂_{n_k}, Λ̂_{n_k}) → (β*, θ*, Λ*) almost surely, where Λ* is a monotonically increasing function. Define the observed log-likelihood function

l_{n}^{o} (β, θ, Λ) = \sum_{i = 1}^{n} \log \int_{0}^{\infty} [\prod_{j = 1}^{m_{i}} {α_{i} λ (e^{R_{ij} (β)}) e^{β^{'} X_{ij}}}^{δ_{ij}} e^{- α_{i} Λ (e^{R_{ij} (β)})}] f_{α} (α_{i}; θ) d α_{i} .
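To make the frailty integral in this log-likelihood concrete, the sketch below evaluates one cluster's contribution under the assumption of a gamma frailty with mean 1 and variance θ. Writing d = Σ_j δ_ij, s = Σ_j Λ(e^{R_ij(β)}) and c = Π_j {λ(e^{R_ij(β)}) e^{β′X_ij}}^{δ_ij}, the integrand becomes c α_i^d e^{−α_i s} f_α(α_i; θ), which has a closed form for the gamma density. The function names and the numerical inputs are hypothetical, and the Simpson's rule check is included only to confirm the closed form.

```python
import math


def cluster_loglik_gamma(d, s, log_c, theta):
    """Closed-form log of c * int alpha^d exp(-alpha*s) f(alpha; theta) d alpha
    for a gamma frailty with shape k = 1/theta and rate k (mean 1)."""
    k = 1.0 / theta
    return (log_c + k * math.log(k) + math.lgamma(k + d) - math.lgamma(k)
            - (k + d) * math.log(k + s))


def cluster_lik_numeric(d, s, log_c, theta, upper=40.0, n=40000):
    """The same integral by composite Simpson's rule on [0, upper]; n must be even."""
    k = 1.0 / theta

    def integrand(a):
        if a == 0.0:
            return 0.0  # correct limit when k + d > 1
        log_f = k * math.log(k) + (k - 1.0) * math.log(a) - k * a - math.lgamma(k)
        return math.exp(log_c + d * math.log(a) - a * s + log_f)

    step = upper / n
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * step)
    return total * step / 3.0


# Hypothetical cluster summary: one event, cumulative hazard sum 1.3, c = 0.8.
closed = math.exp(cluster_loglik_gamma(1, 1.3, math.log(0.8), 0.5))
numeric = cluster_lik_numeric(1, 1.3, math.log(0.8), 0.5)
```

For other frailty distributions, such as the log-normal, no closed form is available in general and the integral would have to be approximated numerically.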

We have

0 \leq n_{k}^{- 1} l_{n_{k}}^{o} ({\hat{β}}_{n_{k}}, {\hat{θ}}_{n_{k}}, {\hat{Λ}}_{n_{k}}) - n_{k}^{- 1} l_{n_{k}}^{o} (β_{0}, θ_{0}, {\tilde{Λ}}_{n_{k}}) .

Letting k → ∞ leads to

0 \leq E (\log \frac{\int_{0}^{\infty} [\prod_{j = 1}^{m_{i}} {α_{i} λ^{*} (e^{R_{ij} (β^{*})}) e^{β^{*'} X_{ij}}}^{δ_{ij}} e^{- α_{i} Λ^{*} (e^{R_{ij} (β^{*})})}] f_{α} (α_{i}; θ^{*}) d α_{i}}{\int_{0}^{\infty} [\prod_{j = 1}^{m_{i}} {α_{i} λ_{0} (e^{R_{ij} (β_{0})}) e^{β_{0}^{'} X_{ij}}}^{δ_{ij}} e^{- α_{i} Λ_{0} (e^{R_{ij} (β_{0})})}] f_{α} (α_{i}; θ_{0}) d α_{i}}),

where λ* is the derivative of Λ*. Due to the nonnegativity of the Kullback–Leibler information,

\begin{array}{l} \int_{0}^{\infty} [\prod_{j = 1}^{m_{i}} {α_{i} λ^{*} (e^{R_{ij} (β^{*})}) e^{β^{*'} X_{ij}}}^{δ_{ij}} e^{- α_{i} Λ^{*} (e^{R_{ij} (β^{*})})}] f_{α} (α_{i}; θ^{*}) d α_{i} \\ = \int_{0}^{\infty} [\prod_{j = 1}^{m_{i}} {α_{i} λ_{0} (e^{R_{ij} (β_{0})}) e^{β_{0}^{'} X_{ij}}}^{δ_{ij}} e^{- α_{i} Λ_{0} (e^{R_{ij} (β_{0})})}] f_{α} (α_{i}; θ_{0}) d α_{i} . \end{array}

Set δ_i1 = 1 and T̃_i1 = 0. For j = 2, …, m_i, if δ_ij = 0, set T̃_ij = τ; if δ_ij = 1, integrate T̃_ij from 0 to τ. We have

\begin{array}{l} \int_{0}^{\infty} α_{i} λ^{*} (0) e^{β^{*'} X_{i 1}} \prod_{j = 2}^{m_{i}} {1 - e^{- α_{i} Λ^{*} (τ e^{β^{*'} X_{ij}})}}^{δ_{ij}} {e^{- α_{i} Λ^{*} (τ e^{β^{*'} X_{ij}})}}^{1 - δ_{ij}} f_{α} (α_{i}; θ^{*}) d α_{i} \\ = \int_{0}^{\infty} α_{i} λ_{0} (0) e^{β_{0}^{'} X_{i 1}} \prod_{j = 2}^{m_{i}} {1 - e^{- α_{i} Λ_{0} (τ e^{β_{0}^{'} X_{ij}})}}^{δ_{ij}} {e^{- α_{i} Λ_{0} (τ e^{β_{0}^{'} X_{ij}})}}^{1 - δ_{ij}} f_{α} (α_{i}; θ_{0}) d α_{i} . \end{array}

Summing both sides of the above equation over all possible combinations of δ_ij (j = 2, …, m_i) gives

λ^{*} (0) e^{β^{*'} X_{i 1}} = λ_{0} (0) e^{β_{0}^{'} X_{i 1}}

since E(α_i) = 1. Therefore, (β* − β₀)′ X_i1 = log{λ₀(0)/λ*(0)}. By (C3), β* = β₀. It follows that λ₀(0) = λ*(0). In addition, following similar steps, we can obtain

\int_{0}^{\infty} α_{i}^{k} f_{α} (α_{i}; θ_{0}) d α_{i} = \int_{0}^{\infty} α_{i}^{k} f_{α} (α_{i}; θ^{*}) d α_{i},

for any 1 ≤ k ≤ m_i, which leads to θ₀ = θ*. Finally, set δ_i1 = 1 and integrate T̃_i1 from 0 to t. For j = 2, …, m_i, if δ_ij = 0, set T̃_ij = τ; if δ_ij = 1, integrate T̃_ij from 0 to τ. The two sides of the equation are summed over all possible combinations of δ_ij (j = 2, …, m_i) to obtain

\int_{0}^{\infty} {1 - e^{- α_{i} Λ^{*} (t e^{β^{*'} X_{i 1}})}} f_{α} (α_{i}; θ^{*}) d α_{i} = \int_{0}^{\infty} {1 - e^{- α_{i} Λ_{0} (t e^{β_{0}^{'} X_{i 1}})}} f_{α} (α_{i}; θ_{0}) d α_{i} .

It follows that Λ*(t) = Λ₀(t) for t ∈ [0, τ]. Therefore, (β̂_n, θ̂_n, Λ̂_n(t)) → (β₀, θ₀, Λ₀ (t)) almost surely by Helly’s theorem, which can be strengthened to uniform convergence on [0, τ].

Next, we show that β̂_n is asymptotically normal and its variance achieves the semiparametric efficiency bound. Let BV[0, τ] denote the space of bounded variation functions on [0, τ] and define class H = {h = (h₁₁, h₁₂, h₂) : h₁₁ ∈ ℝ^p with ∥h₁₁∥₁ < ∞, |h₁₂| < ∞, h₂ ∈ BV[0, τ]}. For h ∈ H, define the norm ∥h∥ = ∥h₁₁∥₁ + |h₁₂| + ∥h₂∥_υ, where ∥h₁₁∥₁ is the L₁ norm of h₁₁, and ∥h₂∥_υ is the absolute value of h₂(0) plus the total variation of h₂ on the interval [0, τ]. Consider submodels β_d = β + dh₁₁, θ_d = θ + dh₁₂ and $Λ_{d} (t) = \int_{0}^{t} {1 + d h_{2} (u)} d Λ (u)$ . Further, define $U_{n} (β, θ, Λ) (h_{11}, h_{12}, h_{2}) = n^{- 1} {\partial l_{n}^{o} (β_{d}, θ_{d}, Λ_{d}) / \partial d} |_{d = 0}$ , where (h₁₁, h₁₂, h₂) ∈ H. For simplicity of notation, we denote

A_{i} = A_{i} (β, Λ) = \prod_{j = 1}^{m_{i}} {α_{i} λ (e^{R_{ij} (β)}) e^{β^{'} X_{ij}}}^{δ_{ij}} e^{- α_{i} Λ (e^{R_{ij} (β)})} .

Then we can write U_n(β, θ, Λ) (h₁₁, h₁₂, h₂) = U_n1(h₁₁) + U_n2(h₂) + U_n3(h₁₂), where

U_{n 1} (h_{11}) = \frac{1}{n} \frac{\partial}{\partial d} |_{d = 0} l_{n}^{o} (β_{d}, θ, Λ) = \frac{1}{n} \sum_{i = 1}^{n} \frac{\int_{0}^{\infty} A_{i} [\sum_{j = 1}^{m_{i}} h_{11}^{'} X_{ij} {δ_{ij} \dot{λ} (e^{R_{ij} (β)}) e^{R_{ij} (β)} / λ (e^{R_{ij} (β)}) + δ_{ij} - α_{i} λ (e^{R_{ij} (β)}) e^{R_{ij} (β)}}] f_{α} (α_{i}; θ) d α_{i}}{\int_{0}^{\infty} A_{i} f_{α} (α_{i}; θ) d α_{i}},

(A1)

U_{n 2} (h_{2}) = \frac{1}{n} \frac{\partial}{\partial d} |_{d = 0} l_{n}^{o} (β, θ, Λ_{d}) = \frac{1}{n} \sum_{i = 1}^{n} \frac{\int_{0}^{\infty} A_{i} {\sum_{j = 1}^{m_{i}} δ_{ij} h_{2} (e^{R_{ij} (β)}) - α_{i} \sum_{j = 1}^{m_{i}} \int_{0}^{e^{R_{ij} (β)}} h_{2} (u) d Λ (u)} f_{α} (α_{i}; θ) d α_{i}}{\int_{0}^{\infty} A_{i} f_{α} (α_{i}; θ) d α_{i}},

(A2)

U_{n 3} (h_{12}) = \frac{1}{n} \frac{\partial}{\partial d} |_{d = 0} l_{n}^{o} (β, θ_{d}, Λ) = \frac{1}{n} \sum_{i = 1}^{n} \frac{\int_{0}^{\infty} h_{12} A_{i} {\partial f_{α} (α_{i}; θ) / \partial θ} d α_{i}}{\int_{0}^{\infty} A_{i} f_{α} (α_{i}; θ) d α_{i}} .

(A3)

Define u(β, θ, Λ) (h₁₁, h₁₂, h₂) = lim_n→∞ U_n(β, θ, Λ) (h₁₁, h₁₂, h₂) ≡ u₁(h₁₁) + u₂(h₂) + u₃(h₁₂). It can be easily shown that u(β₀, θ₀, Λ₀) (h₁₁, h₁₂, h₂) = 0. In addition, it is easy to show that u(β, θ, Λ) is Fréchet differentiable since u(β, θ, Λ) is a smooth function of β, θ and Λ. Let u̇(β₀, θ₀, Λ₀) (β − β₀, θ − θ₀, Λ − Λ₀) (h) denote the corresponding Fréchet derivative of u(β, θ, Λ) at (β₀, θ₀, Λ₀). After some algebra, we have

\dot{u} (β_{0}, θ_{0}, Λ_{0}) (β - β_{0}, θ - θ_{0}, Λ - Λ_{0}) (h) = {(β - β_{0})}^{'} Q_{1} (h) + \int_{0}^{\infty} Q_{2} (t, h) d {Λ (t) - Λ_{0} (t)} + (θ - θ_{0}) Q_{3} (h),

where h = (h₁₁, h₁₂, h₂),

\begin{matrix} Q_{1} (h) & = & B_{1} (\begin{matrix} h_{11} \\ h_{12} \end{matrix}) + \int_{0}^{\infty} D_{1} (t) h_{2} (t) dt, \\ Q_{2} (t, h) & = & B_{2} {(t)}^{'} (\begin{matrix} h_{11} \\ h_{12} \end{matrix}) + c_{2} (t) h_{2} (t) + \int_{0}^{\infty} D_{2} (t, u) h_{2} (u) du, \\ Q_{3} (h) & = & B_{3}^{'} (\begin{matrix} h_{11} \\ h_{12} \end{matrix}) + \int_{0}^{\infty} D_{3} (t) h_{2} (t) dt, \end{matrix}

where B₁ is a p × (p + 1) matrix, B₂(t) and B₃ are (p + 1)-dimensional vectors, D₁ (t) is a p-dimensional function, and c₂(t), D₂(t, u) and D₃(t) are scalar functions. Therefore, Q(h) ≡ (Q₁(h), Q₂(h), Q₃(h)) is a continuous linear operator from the linear span of H to itself.

Consider two classes of functions:

\begin{array}{l} A_{1} (β_{0}, θ_{0}, Λ_{0}) = {h_{11}^{'} U_{1}^{*} (β_{0}, θ_{0}, Λ_{0}) + h_{12} U_{3}^{*} (β_{0}, θ_{0}, Λ_{0}) : {‖ h_{11} ‖}_{1} < \infty, | h_{12} | < \infty}, \\ A_{2} (β_{0}, θ_{0}, Λ_{0}) = {U_{2}^{*} (β_{0}, θ_{0}, Λ_{0}) (h_{2}) : h_{2} \in BV [0, τ]}, \end{array}

where

\begin{array}{l} U_{1}^{*} (β_{0}, θ_{0}, Λ_{0}) = \frac{\int_{0}^{\infty} A_{i 0} [\sum_{j = 1}^{m_{i}} X_{ij} {δ_{ij} {\dot{λ}}_{0} (e^{R_{ij} (β_{0})}) e^{R_{ij} (β_{0})} / λ_{0} (e^{R_{ij} (β_{0})}) + δ_{ij} - α_{i} λ_{0} (e^{R_{ij} (β_{0})}) e^{R_{ij} (β_{0})}}] f_{α} (α_{i}; θ_{0}) d α_{i}}{\int_{0}^{\infty} A_{i 0} f_{α} (α_{i}; θ_{0}) d α_{i}}, \\ U_{2}^{*} (β_{0}, θ_{0}, Λ_{0}) (h_{2}) = \frac{\int_{0}^{\infty} A_{i 0} {\sum_{j = 1}^{m_{i}} δ_{ij} h_{2} (e^{R_{ij} (β_{0})}) - α_{i} \sum_{j = 1}^{m_{i}} \int_{0}^{e^{R_{ij} (β_{0})}} h_{2} (u) d Λ_{0} (u)} f_{α} (α_{i}; θ_{0}) d α_{i}}{\int_{0}^{\infty} A_{i 0} f_{α} (α_{i}; θ_{0}) d α_{i}}, \\ U_{3}^{*} (β_{0}, θ_{0}, Λ_{0}) = \frac{\int_{0}^{\infty} A_{i 0} {\partial f_{α} (α_{i}; θ_{0}) / \partial θ} d α_{i}}{\int_{0}^{\infty} A_{i 0} f_{α} (α_{i}; θ_{0}) d α_{i}} \end{array}

and A_i0 = A_i(β₀, Λ₀). Since $U_{1}^{*} (β_{0}, θ_{0}, Λ_{0})$ and $U_{3}^{*} (β_{0}, θ_{0}, Λ_{0})$ are bounded functions under assumptions (C1)–(C5), A₁ is a Donsker class. In addition, since h₂ ∈ BV[0, τ], A₂ can be written as the summation of bounded Donsker classes, which is also a Donsker class. Therefore, n^1/2{U_n(β₀, θ₀, Λ₀)(h) − u(β₀, θ₀, Λ₀)(h)} converges weakly to a Gaussian process G* on l^∞(H).

In addition, since ∥β − β₀∥₁ + |θ − θ₀| = o_p(1) and sup_{t∈[0, τ]} |Λ(t) − Λ₀(t)| = o_p(1), we can show that A₁(β, θ, Λ) and A₂(β, θ, Λ) are Donsker classes. This implies that

\begin{matrix} \sup_{h \in H} | (U_{n} - u) ({\hat{β}}_{n}, {\hat{θ}}_{n}, {\hat{Λ}}_{n}) (h) - (U_{n} - u) (β_{0}, θ_{0}, Λ_{0}) (h) | \\ = o_{p} {\max (n^{- 1 / 2}, {‖ {\hat{β}}_{n} - β_{0} ‖}_{1} + | {\hat{θ}}_{n} - θ_{0} | + \sup_{t \in [0, τ]} | {\hat{Λ}}_{n} (t) - Λ_{0} (t) |)} . \end{matrix}

(A4)

Finally, we show that u̇(β₀, θ₀, Λ₀) is continuously invertible. It is equivalent to showing that Q(h) is a one-to-one map, i.e., Q(h) = 0 implies h = 0. If Q(h) = 0, then u̇(β₀, θ₀, Λ₀) (β − β₀, θ − θ₀, Λ − Λ₀) (h) = 0 for (β, θ, Λ) in a neighbourhood of (β₀, θ₀, Λ₀). We choose β = β₀ + dh₁₁, θ = θ₀ + dh₁₂ and $Λ (t) = Λ_{0} (t) + d \int_{0}^{t} h_{2} (u) d Λ_{0} (u)$ for a small constant d. By the definition of u̇(β₀, θ₀, Λ₀), we have $\dot{u} (β_{0}, θ_{0}, Λ_{0}) (β - β_{0}, θ - θ_{0}, Λ - Λ_{0}) (h) = dE [{h_{11}^{'} U_{1}^{*} (β_{0}, θ_{0}, Λ_{0}) + U_{2}^{*} (β_{0}, θ_{0}, Λ_{0}) (h_{2}) + h_{12} U_{3}^{*} (β_{0}, θ_{0}, Λ_{0})}^{2}] = 0$ . This implies that $h_{11}^{'} U_{1}^{*} (β_{0}, θ_{0}, Λ_{0}) + U_{2}^{*} (β_{0}, θ_{0}, Λ_{0}) (h_{2}) + h_{12} U_{3}^{*} (β_{0}, θ_{0}, Λ_{0}) = 0$ almost surely. Following the techniques used to derive the consistency of the estimators, we can show that h = 0. The details are given in the Supplementary Material.

Since u̇(β₀, θ₀, Λ₀) is continuously invertible on its range, based on Theorem 3.3.1 of van der Vaart & Wellner (1996), we have that n^1/2[{γ̂_n, Λ̂_n(t)} − {γ₀, Λ₀(t)}] converges weakly to a tight Gaussian process G = {u̇(β₀, θ₀, Λ₀)}⁻¹G*. In addition, the variance of G is

var {G (h)} = \int_{0}^{\infty} h_{2} (t) Q_{2}^{- 1} (t, h) d Λ_{0} (t) + (h_{11}^{'}, h_{12}) (\begin{matrix} Q_{1}^{- 1} (h) \\ Q_{3}^{- 1} (h) \end{matrix}),

where $Q^{- 1} (h) \equiv {Q_{1}^{- 1} (h), Q_{2}^{- 1} (h), Q_{3}^{- 1} (h)}$ is the inverse of Q(h). We derive the semiparametric efficiency bound I⁻¹ (Bickel et al., 1993) and show that the asymptotic variance of n^1/2(γ̂_n − γ₀) achieves the semiparametric efficiency bound. The details are given in the Supplementary Material.

Footnotes

Supplementary Material

Supplementary material available at Biometrika online includes additional simulation study results and technical derivations.

Contributor Information

Bo Liu, Email: bliu4@ncsu.edu, Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, U.S.A.

Wenbin Lu, Email: lu@stat.ncsu.edu, Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, U.S.A.

Jiajia Zhang, Email: jzhang@mailbox.sc.edu, Department of Epidemiology and Biostatistics, University of South Carolina, 800 Sumter Street, Columbia, South Carolina 29208, U.S.A.

References

Aalen OO. Modelling heterogeneity in survival analysis by the compound Poisson distribution. The Annals of Applied Probability. 1992;2:951–972.
Bickel PJ, Klaassen CAJ, Ritov Y, Wellner JA. Efficient and Adaptive Estimation for Semiparametric Models. Baltimore: Johns Hopkins University Press; 1993.
Buckley J, James I. Linear regression with censored data. Biometrika. 1979;66:429–436.
Chen HY, Little RJA. Proportional hazards regression with missing covariates. Journal of the American Statistical Association. 1999;94:896–908.
Clayton DG. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika. 1978;65:141.
Clayton DG, Cuzick J. Multivariate generalizations of the proportional hazards model (with discussion). Journal of the Royal Statistical Society A. 1985;48:82–117.
Cox DR. Regression models and life tables (with discussion). Journal of the Royal Statistical Society Series B. 1972;34:187–220.
Hougaard P. Survival models for heterogeneous populations derived from stable distributions. Biometrika. 1986;73:387–396.
Hougaard P. Modelling multivariate survival. Scandinavian Journal of Statistics. 1987;14:291–30.
Huster WJ, Brookmeyer R, Self SG. Modelling paired survival data with covariates. Biometrics. 1989;45:145–156.
Jin Z, Lin DY, Wei LJ, Ying Z. Rank-based inference for the accelerated failure time model. Biometrika. 2003;90:341–353.
Jin Z, Lin DY, Ying Z. On least-squares regression with censored data. Biometrika. 2006a;93:147–161.
Jin Z, Lin DY, Ying Z. Rank regression analysis of multivariate failure time data based on marginal linear models. Scandinavian Journal of Statistics. 2006b;33:1–23.
Johnson L, Strawderman R. Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika. 2009;96:577–590. doi: 10.1093/biomet/asp025.
Johnson L, Strawderman R. A smoothing expectation and substitution algorithm for the semiparametric accelerated failure time frailty model. Statistics in Medicine. 2012;31:2335–2358. doi: 10.1002/sim.5349.
Jones MC. The performance of kernel density functions in kernel distribution function estimation. Statistics and Probability Letters. 1990;9:129–132.
Jones MC, Sheather SJ. Using non-stochastic terms to advantage in kernel-based estimation of integrated squared density derivatives. Statistics and Probability Letters. 1991;11:511–514.
Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. 2nd ed. New York: Wiley; 2002.
Li H, Yin GS. Generalized method of moments estimation for linear regression with clustered failure time data. Biometrika. 2009;96:293–306.
Lin DY. Cox regression analysis of multivariate failure time data: the marginal approach. Statistics in Medicine. 1994;13:2233–2247. doi: 10.1002/sim.4780132105.
Lin DY, Wei LJ, Ying Z. Accelerated failure time models for counting processes. Biometrika. 1998;85:605–618.
Lu W. Tests of independence for censored bivariate failure time data. Lifetime Data Analysis. 2007;13:75–90. doi: 10.1007/s10985-006-9031-z.
Lu W. Efficient estimation for an accelerated failure time model with a cure fraction. Statistica Sinica. 2010;20:661–674.
McGilchrist CA, Aisbett CW. Regression with frailty in survival analysis. Biometrics. 1991;47:461–466.
Murphy SA. Consistency in a proportional hazard model incorporating a random effect. Annals of Statistics. 1994;22:712–731.
Murphy SA. Asymptotic theory for the frailty model. Annals of Statistics. 1995;23:182–198.
Nielsen GG, Gill RD, Andersen PK, Sorensen TIA. A counting process approach to maximum likelihood estimation in frailty models. Scandinavian Journal of Statistics. 1992;19:25–44.
Oakes D. Bivariate survival models induced by frailties. Journal of the American Statistical Association. 1989;84:487–493.
Pan W. Using frailties in the accelerated failure time model. Lifetime Data Analysis. 2001;7:55–64. doi: 10.1023/a:1009625210191.
Parner E. Asymptotic theory for the correlated gamma-frailty model. Annals of Statistics. 1998;26:183–214.
Schuster EF. Estimation of a probability density function and its derivatives. The Annals of Mathematical Statistics. 1969;40:1187–1195.
Strawderman R. A regression model for dependent gap times. International Journal of Biostatistics. 2006;2:1–33.
Tsiatis AA. Estimating regression parameters using linear rank tests for censored data. The Annals of Statistics. 1990;18:354–372.
van der Vaart AW, Wellner JA. Weak Convergence and Empirical Processes. New York: Springer-Verlag; 1996.
Wei LJ, Lin DY, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modelling marginal distributions. Journal of the American Statistical Association. 1989;84:1065–1073.
Xu L, Zhang J. An EM-like algorithm for the semiparametric accelerated failure time gamma frailty model. Computational Statistics and Data Analysis. 2010;54:1467–1474.
Ying Z. A large sample study of rank estimation for censored regression data. The Annals of Statistics. 1993;21:76–99.
Zeng D, Lin DY. Efficient estimation for the accelerated failure time model. Journal of the American Statistical Association. 2007;102:1387–1396.
Zhang J, Peng Y. An alternative estimation method for the accelerated failure time frailty model. Computational Statistics and Data Analysis. 2007;51:4413–4423.

