Author manuscript; available in PMC: 2014 Dec 19.
Published in final edited form as: Stat Methods Med Res. 2013 Jan 29;25(2):936–953. doi: 10.1177/0962280212474059

Interval Estimation of Random Effects in Proportional Hazards Models with Frailties

Il Do Ha 1, Florin Vaida 2, Youngjo Lee 3
PMCID: PMC4270953  NIHMSID: NIHMS555378  PMID: 23361438

Abstract

Semi-parametric frailty models are widely used to analyze clustered survival data. In this paper, we propose the use of the hierarchical likelihood interval for individual frailties of the clusters, not the parameters of the frailty distribution. We study the relationship between hierarchical likelihood, empirical Bayesian, and fully Bayesian intervals for frailties. We show that our proposed interval can be interpreted as a frequentist confidence interval and Bayesian credible interval under a uniform prior. We also propose an adjustment of the proposed interval to avoid null intervals. Simulation studies show that the proposed interval preserves the nominal confidence level. The procedure is illustrated using data from a multicenter lung cancer clinical trial.

Keywords: Empirical Bayes, Hierarchical likelihood, Interval estimator, Random effects, Survival analysis

1 Introduction

Frailty models, or proportional hazards models with random effects, are widely used to analyze clustered survival data.1 It is important to investigate the potential heterogeneity in survival among clusters in order to understand and interpret the variability in the data.2 Multivariate semi-parametric frailty models offer a flexible framework for modeling this heterogeneity. Such heterogeneity can be accounted for by random cluster effects on the baseline hazard. Furthermore, the effects of a treatment can vary substantially across participating centers in a multicenter clinical trial3, suggesting the inclusion of random treatment effects.

In addition to the estimation of random effects, a measure of uncertainty for these point estimates is useful and necessary. In this paper we focus on the interval estimation of the individual random effects, not the parameters of the random effects distribution. The standard methods in use are empirical Bayes (EB) confidence intervals, based on the conditional posterior distribution of random effects given the observed data and the estimated parameter values.2 However, EB interval estimators have been criticized for not maintaining the nominal level.4 Gray3, Legrand et al.5 and Komarek et al.6 developed fully Bayesian methods. Lee and Ha7 proposed hierarchical likelihood (HL) methods to estimate random effects and their confidence intervals for hierarchical generalized linear models (HGLMs8). Recently, HL inferences of random effects are of special interest9,10. In this paper, we extend these methods to semi-parametric frailty models.

One particular difficulty is that in random-effect models such as frailty models, likelihood methods may lead to zero estimates for strictly positive variance components.12 In turn, this leads to null confidence intervals in the EB and HL approaches, clearly underestimating the variability of the random effects. These intervals will have a coverage probability below the nominal level, especially for small dispersion parameters or for small sample sizes. In the context of Fay-Herriot’s13 small-area linear mixed models, Morris14 addressed this problem and proposed an adjustment to the restricted maximum likelihood (REML) estimation. Li and Lahiri15 advocated the use of this approach for confidence intervals of random effects. Here, we extend the adjustment proposed by Morris14 to general random-effect models, including HGLMs and frailty models. Li and Lahiri15 used the bootstrap method16; however, here we show that for frailty models, the likelihood method is sufficient and bootstrapping is not required. The proposed HL interval can be interpreted as a frequentist confidence interval and fully Bayesian credible interval under a uniform prior.9 Through numerical studies, we show that the proposed interval improves the EB interval by maintaining the stated nominal level.

The remainder of this paper is organized as follows. In Section 2, we discuss three interval estimators based on HL, EB, and fully Bayesian approaches. In Section 3, we review frailty models. In Section 4, we study how these intervals can be extended to frailty models, and investigate the relationships among them. In addition, we propose an adjustment to avoid null intervals for frailty models. Simulation studies are presented in Section 5. In Section 6, we conduct a case study using a multicenter lung cancer trial2,3 conducted by the Eastern Cooperative Oncology Group (ECOG). Finally, the discussion is presented in Section 7. The technical details are described in Appendix.

2 Interval estimators for random effects

In this section, we review the HL interval and study its relationship with EB and fully Bayesian credible intervals.

2.1 Hierarchical-likelihood confidence intervals

Consider a statistical model, described by the joint density

$$\log f_\theta(y, \upsilon) = \log f_\theta(y \mid \upsilon) + \log f_\theta(\upsilon), \tag{1}$$

where fθ(υ) and fθ(y|υ) are respectively density functions for υ and y|υ, so that they have different functional forms even though we use the same notation fθ(·). Here three objects (y, υ, θ) denote respectively the response variables, unobservable random variables, and fixed unknown parameters. Fixed unknown parameters θ consist of regression coefficients β and dispersion parameters. When covariates x are available fθ(y|υ) = fθ(y|x, υ). This model allows a two-stage interpretation: first, the unobservable random variables are realized as υ* from the distribution fθ(υ); second, the observed data yo is sampled from fθ(y|υ*). Once the data yo are observed, yo are known; however, υ* remain unknown. For the likelihood inference of unknowns (θ, υ*), Lee and Nelder8 proposed the use of the HL defined by

$$h = h(\theta, \upsilon^*) = \log f_\theta(y_o, \upsilon^*) = m + \log f_\theta(\upsilon^* \mid y_o), \tag{2}$$

where m = log fθ(yo) is the marginal log-likelihood, with fθ(y) = ∫ fθ(y|υ)fθ(υ)dυ. In this paper, we omit the star in υ* unless it is necessary to highlight the fact that υ* is a fixed unknown.

The maximization of the HL (1) yields the EB-mode estimator for υ, without computing fθ(υ|yo) in (2). Given θ, let υ̂(θ) be a random-effect estimator solving ∂h/∂υ = 0. In HGLMs, (υ, β) and dispersion parameters are asymptotically orthogonal8; therefore, in estimating (υ, β), we need not consider the information loss caused by estimating the dispersion parameters. The negative Hessian matrix of β and υ based on h is given by

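$$H(h; \beta, \upsilon) \equiv -\frac{\partial^2 h}{\partial(\beta, \upsilon)^2} = \begin{pmatrix} -\partial^2 h/\partial\beta\,\partial\beta^T & -\partial^2 h/\partial\beta\,\partial\upsilon^T \\ -\partial^2 h/\partial\upsilon\,\partial\beta^T & -\partial^2 h/\partial\upsilon\,\partial\upsilon^T \end{pmatrix}. \tag{3}$$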

For linear mixed models, Henderson17 showed that the lower-right-hand corner of H(h; β, υ)−1 gives an estimate of the unconditional mean squared error (UMSE):

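$$\mathrm{UMSE} \equiv E_\theta[\{\hat\upsilon(\hat\theta) - \upsilon\}\{\hat\upsilon(\hat\theta) - \upsilon\}^T] = E_\theta[\{\hat\upsilon(\theta) - \upsilon\}\{\hat\upsilon(\theta) - \upsilon\}^T] + E_\theta[\{\hat\upsilon(\hat\theta) - \hat\upsilon(\theta)\}\{\hat\upsilon(\hat\theta) - \hat\upsilon(\theta)\}^T], \tag{4}$$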

where υ̂(θ̂) ≡ υ̂(θ)|θ=θ̂ and θ̂ is either the maximum likelihood (ML) or REML estimator of θ. The second term above is the inflation of the UMSE caused by estimating θ by θ̂. This holds generally in HGLMs.8,11 The UMSE above can be used to construct HL confidence intervals for υ (= υ*) using an asymptotic normal approximation. For example, let A be the lower-right-hand corner of H(h; β̂, υ̂)⁻¹ corresponding to υ, with kth diagonal element akk. Then a (1 − α) HL interval for υk is υ̂k ± zα/2·√akk, where zα/2 is the normal quantile with probability α/2 in the right tail. Lee and Nelder9 argue that this HL interval can be interpreted as a frequentist confidence interval for υ*: asymptotically, in the long run, the HL interval includes the true realized value υk = υk* in a proportion (1 − α) of cases.

Booth and Hobert18 showed that in generalized linear mixed models (GLMMs), H(h; β, υ)⁻¹ in (3) gives an estimate of the conditional MSE (CMSE), defined by CMSE ≡ Eθ[{υ̂(θ̂) − υ}{υ̂(θ̂) − υ}ᵀ | yo], because the UMSE is the first-order approximation to the CMSE. In the CMSE, the probability statement of the interval is based on the conditional distribution of {υ̂(θ̂) − υ} | yo. Therefore, the probability statement is confined to the current data yo, whereas that of the confidence interval depends upon the repeated sampling of y.

2.2 Empirical Bayes intervals

In the Bayesian framework, fixed parameters θ are treated as random variables with a prior distribution π(θ). The Bayesian statistical model is

B=π(θ)fθ(υ)fθ(y|υ)=π(θ)f(υ|θ)f(y|υ,θ),

where f(υ|θ) = fθ(υ) and f(y|υ, θ) = fθ(y|υ). The maximum likelihood estimator for the fixed unknown parameter θ can be interpreted as the maximum posterior estimator under a uniform prior π(θ) = 1. Similarly, the HL, i.e. h = log B, can be interpreted as the Bayesian probability under the uniform prior.

The EB interval for υ is based on the (conditional) posterior distribution fθ̂(υ|yo) = f(υ|yo, θ̂), where θ̂ is the ML estimator.19 The asymptotic variance of υ given yo is estimated by {−d² log fθ̂(υ|yo)/dυ²}⁻¹. This can also be obtained via h in (2) without computing fθ(υ|yo), because d log fθ(υ|yo)/dυ = dh/dυ and {−d² log fθ(υ|yo)/dυ²}⁻¹ = (−d²h/dυ²)⁻¹. A one-dimensional EB interval can be derived as follows: for υk, a scalar component of υ, the EB (1 − α)-level interval corresponds to the (1 − α) credible region of fθ̂(υk|yo) = f(υk|yo, θ̂). A normal approximation of the type υ̂k ± zα/2·σk is often used, where σk² = var(υk|yo, θ)|θ=θ̂ and zα/2 is the upper α/2 normal quantile.

However, the downside of the EB interval is that it ignores the uncertainty in θ̂. That is, the EB variance estimator, the inverse of the negative Hessian matrix from log fθ(υ|yo), is obtained from (−∂²h/∂υ∂υᵀ)⁻¹, which is the inverse of the bottom right-hand corner of H in (3). In contrast, the HL estimator takes into account the covariance between β̂ and υ̂ − υ, corresponding to the off-diagonal term in the Hessian H in (3), whereas this term is ignored in the EB. This covariance term carries information about the uncertainty in estimating β and its effect on υ̂ − υ.7,9 The EB method is therefore unsatisfactory because it does not reflect the uncertainty caused by estimating θ. Bjørnstad20 also showed that the EB intervals underestimate the uncertainty in υ, occasionally quite severely. See also the discussion in Section 3.5 of Carlin and Louis4.
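The contrast between the two variance estimators can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical 3 × 3 negative Hessian ordered as (β, υ) with one fixed effect and two random effects; the function name, the matrix, and the estimates are illustrative and not taken from the paper.

```python
import numpy as np
from statistics import NormalDist

def hl_and_eb_intervals(H, p, v_hat, alpha=0.05):
    """Return (HL, EB) interval arrays of shape (q, 2).

    H     : (p+q, p+q) negative Hessian of h, ordered as (beta, v)
    p     : number of fixed effects
    v_hat : (q,) random-effect estimates
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # HL: lower-right block of the inverse of the full Hessian
    se_hl = np.sqrt(np.diag(np.linalg.inv(H)[p:, p:]))
    # EB: inverse of the lower-right block alone (ignores the beta-v coupling)
    se_eb = np.sqrt(np.diag(np.linalg.inv(H[p:, p:])))
    hl = np.column_stack([v_hat - z * se_hl, v_hat + z * se_hl])
    eb = np.column_stack([v_hat - z * se_eb, v_hat + z * se_eb])
    return hl, eb

# Hypothetical values for illustration only
H = np.array([[4.0, 1.0, 1.0],
              [1.0, 2.0, 0.0],
              [1.0, 0.0, 2.0]])
v_hat = np.array([0.3, -0.1])
hl, eb = hl_and_eb_intervals(H, p=1, v_hat=v_hat)
print("HL widths:", hl[:, 1] - hl[:, 0])
print("EB widths:", eb[:, 1] - eb[:, 0])
```

In this toy case the off-diagonal blocks are non-zero, so the HL intervals come out wider than the EB intervals.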

2.3 Bayesian credible intervals

The Bayesian credible interval for υ is based on the marginal posterior

$$\pi(\upsilon \mid y_o) = \int \pi(\upsilon, \theta \mid y_o)\, d\theta,$$

where θ is integrated out. The marginal posterior variance of υ is

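$$\mathrm{var}(\upsilon \mid y_o) = E_{\theta \mid y_o}\{\mathrm{var}(\upsilon \mid y_o, \theta)\} + \mathrm{var}_{\theta \mid y_o}\{E(\upsilon \mid y_o, \theta)\}. \tag{5}$$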

Carlin and Gelfand21 noted that the EB variance estimate only approximates the first term in (5), and ignores the second. Laird and Louis22 and Carlin and Gelfand21 proposed the use of a bootstrap method to estimate the second term. In general, Kass and Steffey23 used a Laplace approximation to show that under the improper prior π(θ) = 1, the estimator for var(υ|yo) can be obtained from H(h; θ, υ)⁻¹ = {−∂²h/∂(θ, υ)²}⁻¹. This implies that the marginal posterior distribution of υ|yo is approximately multivariate normal with mean υ̂{θ̂(yo)} and the variance obtained from H(h; θ, υ)⁻¹; the resulting HL interval can be viewed as an approximate fully Bayesian credible interval under π(θ) = 1.
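The decomposition in (5) can be checked on a small discrete example. This is only a sketch: the three-point posterior for θ and the conditional moments below are hypothetical numbers chosen for illustration, not quantities from the paper.

```python
import numpy as np

# Hypothetical posterior pi(theta | y_o) on three support points
w = np.array([0.2, 0.5, 0.3])
cond_mean = np.array([0.1, 0.4, 0.9])    # E(v | y_o, theta)
cond_var = np.array([0.30, 0.25, 0.20])  # var(v | y_o, theta)

# First term of (5): the part an EB plug-in variance approximates
within = np.sum(w * cond_var)
# Second term of (5): variability of the conditional mean over theta
post_mean = np.sum(w * cond_mean)
between = np.sum(w * (cond_mean - post_mean) ** 2)
total_var = within + between

# Direct computation: E(v^2 | y_o) - {E(v | y_o)}^2
direct = np.sum(w * (cond_var + cond_mean ** 2)) - post_mean ** 2
print(within, between, total_var, direct)
```

The gap between `within` and `total_var` is the contribution that a plug-in EB variance leaves out.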

In summary, credible and confidence intervals for the random effects are defined via different considerations. EB and Bayesian credible intervals are based on the observed data yo, whereas the confidence interval from the UMSE is based on the statistical model of y. The HL interval allows both interpretations.

3 Formulation for frailty models

Suppose that the data consist of censored time-to-event observations collected from q clusters. Let Tij be the survival time for the jth observation in the ith cluster, i = 1, …, q, j = 1, …, ni, n = Σi ni. Denote by υi an s-dimensional vector of unobserved log-frailties (random effects) associated with the ith cluster. Given υi, the conditional hazard function of Tij is of the form

$$\lambda_{ij}(t \mid \upsilon_i) = \lambda_0(t) \exp(\eta_{ij}), \tag{6}$$

where λ0(·) is the unknown baseline hazard function, ηij = xijᵀβ + zijᵀυi is the linear predictor for the log-hazard, and xij = (xij1, …, xijp)ᵀ and zij = (zij1, …, zijs)ᵀ are p × 1 and s × 1 covariate vectors corresponding to fixed effects β = (β1, …, βp)ᵀ and log-frailties υi, respectively. We assume that the log-frailties υi are independent and follow a multivariate normal distribution, υi ~ Ns(0, Σ). Here, the covariance matrix Σ = Σ(ϕ) depends on a vector of unknown parameters ϕ. The normal distribution has been used for modelling multi-component24 and correlated frailties25.

Model (6) includes some well-known models as particular cases. In a multicenter medical study, let υi0 be a random intercept or random center effect that modifies the baseline risk for center i, and let υi1 be associated with the treatment effect, i.e., a random treatment effect or random treatment-by-center interaction. In (6), if we consider zij = 1 and υi = υi0 for all i, j, this is the random center or shared frailty model1 with

$$\eta_{ij} = x_{ij}^T \beta + \upsilon_{i0}, \tag{7}$$

where υi0 ~ N(0, σ0²) for all i. Model (7) can be extended as follows. Let β1 be the main treatment effect associated with the treatment indicator xij1 and let βm (m = 2, …, p) be the fixed effects corresponding to the covariates xijm. By taking zij = (xij0, xij1)ᵀ and υi = (υi0, υi1)ᵀ, we have a two-random-interactions model2 with

$$\eta_{ij} = (\beta_0 + \upsilon_{i0}) x_{ij0} + (\beta_1 + \upsilon_{i1}) x_{ij1} + \sum_{m=2}^{p} \beta_m x_{ijm}, \tag{8}$$

where υi = (υi0, υi1)T ~ N2(0, Σ). In this model, by taking xij0 = 1 for all i, j and β0 = 0, we obtain the random coefficient model3 with

$$\eta_{ij} = \upsilon_{i0} + (\beta_1 + \upsilon_{i1}) x_{ij1} + \sum_{m=2}^{p} \beta_m x_{ijm}. \tag{9}$$

To maintain the invariance of model to parameterization of the treatment effect, we allow a general covariance matrix11,25 in (8) and (9):

$$\Sigma = \begin{pmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{pmatrix}, \tag{10}$$

where ρ = σ01/(σ0σ1) is the correlation between υi0 and υi1 within a cluster.
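As an illustration of (10), the covariance matrix can be assembled from (σ0², σ1², ρ) and used to draw correlated log-frailties. This is a minimal sketch; the helper name `frailty_cov`, the seed, and the number of clusters are assumptions made for the example.

```python
import numpy as np

def frailty_cov(sigma0_sq, sigma1_sq, rho):
    """Build Sigma in (10), using sigma01 = rho * sigma0 * sigma1."""
    sigma01 = rho * np.sqrt(sigma0_sq * sigma1_sq)
    return np.array([[sigma0_sq, sigma01],
                     [sigma01, sigma1_sq]])

rng = np.random.default_rng(0)           # arbitrary seed
Sigma = frailty_cov(0.2, 0.2, -0.5)      # gives sigma01 = -0.1
# One row (v_i0, v_i1) per cluster; q = 30 clusters chosen for illustration
v = rng.multivariate_normal(np.zeros(2), Sigma, size=30)
print(Sigma)
print(v[:3])
```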

4 Interval estimators for random effects in frailty models

We now show how the HL confidence intervals for random effects can be extended to frailty models. A particular problem occurs when one or more frailty variance parameters are estimated to be zero, leading to null HL intervals. We propose a general modification to deal with this issue.

4.1 Hierarchical likelihood for frailty models

For observation j of cluster i, let Tij and Cij be the event and censoring times, respectively, and let the response variable be yij = min{Tij, Cij} with event indicator δij = I(Tij ≤ Cij). Ha et al.26 made the following two assumptions:

  • Assumption 1. Given υi, the pairs {(Tij, Cij), j = 1, …, ni} are conditionally independent and both Tij and Cij are also conditionally independent for j = 1, …, ni.

  • Assumption 2. Given υi, the set {Cij, j = 1, …, ni} are noninformative about υi.

In the semi-parametric frailty model (6), the functional form of λ0(t) is unknown. The non-parametric estimator27 of the baseline cumulative hazard function $\Lambda_0(t) = \int_0^t \lambda_0(k)\,dk$ is a step function with jumps at the observed event times. Restricting ourselves to hazard functions of the above form, we have $\Lambda_0(t) = \sum_{k: y_{(k)} \le t} \lambda_{0k}$, where y(1) < … < y(r) are the ordered distinct event times and λ0k = λ0(y(k)). Let y* be the vector of response variables yij*, with yij* = (yij, δij).

Once the data are observed, yoij* = (yoij, δoij). Under Assumptions 1 and 2, the HL for the semi-parametric frailty model (6) is given by

$$h = h\{(\beta, \lambda_0, \phi), \upsilon\} = \sum_{ij} \ell_{1ij} + \sum_i \ell_{2i}, \tag{11}$$

where

$$\sum_{ij} \ell_{1ij} = \sum_{ij} \delta_{oij}\{\log \lambda_0(y_{oij}) + \eta_{ij}\} - \sum_{ij} \{\Lambda_0(y_{oij}) \exp(\eta_{ij})\} = \sum_{k=1}^{r} d_{(k)} \log \lambda_{0k} + \sum_{ij} \delta_{oij} \eta_{ij} - \sum_{k=1}^{r} \lambda_{0k} \Big\{ \sum_{(i,j) \in R_{(k)}} \exp(\eta_{ij}) \Big\},$$

$\ell_{1ij} = \ell_{1ij}(\beta, \lambda_0; y^*_{oij} \mid \upsilon_i)$ is the logarithm of the conditional density function for yij* = yoij* given υi,

$$\ell_{2i} = \ell_{2i}(\phi; \upsilon_i) = -\frac{1}{2} \log\det\{2\pi\Sigma(\phi)\} - \frac{1}{2} \upsilon_i^T \Sigma(\phi)^{-1} \upsilon_i$$

is the logarithm of the density function for υi with parameters ϕ, and ηij = xijᵀβ + zijᵀυi. Here, υ = (υ1ᵀ, …, υqᵀ)ᵀ is the stacked vector of the υi's, λ0 = (λ01, …, λ0r)ᵀ, d(k) is the number of events at y(k), and R(k) = R(y(k)) = {(i, j) : yoij ≥ y(k)} is the risk set at y(k). Since (β, λ0, υ) and ϕ are asymptotically orthogonal as in HGLMs28, we only need to consider the Hessian matrix of υ and ψ = (βᵀ, λ0ᵀ)ᵀ. Let yo* be the vector of observed data points yoij*. The definitions of the Hessian matrix (3) and UMSE (4) are respectively extended to

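$$H(h; \psi, \upsilon) \equiv -\frac{\partial^2 h}{\partial(\psi, \upsilon)\,\partial(\psi, \upsilon)^T} \quad \text{and} \quad \mathrm{UMSE} \equiv E_\psi[\{\hat\upsilon(\hat\psi) - \upsilon\}\{\hat\upsilon(\hat\psi) - \upsilon\}^T]. \tag{12}$$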

Here, υ̂(ψ̂) ≡ υ̂(ψ)|ψ=ψ̂, where υ̂(ψ) is the solution to ∂h/∂υ = 0 for a given ψ. Note that υ̂(ψ) = Eψ(υ|yo*) asymptotically as $N = \min_{1 \le i \le q} n_i \to \infty$. Following Lee and Nelder (1996) and Lee and Ha (2010), it can be shown that H(h; ψ̂, υ̂)⁻¹ gives the first-order approximation to the UMSE in (12), leading to a standard error for υ̂ − υ and an HL confidence interval for υ.

In semi-parametric frailty models (6), the number of λ0k in ψ increases with sample size n. Thus, H(h; ψ̂, υ̂)−1 requires inversion of a high-dimensional (p + q + r) matrix. Following Ha et al.26, we propose the use of the profiled HL, h*, that eliminates λ0:

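$$h^* = h|_{\lambda_0 = \hat\lambda_0} = \sum_k d_{(k)} \log \hat\lambda_{0k} + \sum_{ij} \delta_{oij} \eta_{ij} - \sum_k d_{(k)} + \sum_i \ell_{2i}, \tag{13}$$

where

$$\hat\lambda_{0k}(\beta, \upsilon) = \frac{d_{(k)}}{\sum_{(i,j) \in R_{(k)}} \exp(\eta_{ij})}$$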

are solutions of the estimating equations ∂h/∂λ0k = 0 for k = 1, …, r. Note that h* is proportional to the penalized partial likelihood of Ripatti and Palmgren29. Again, the covariance estimates for υ̂ − υ are obtained from H(h*; β, υ)−1, given by

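$$H(h^*; \beta, \upsilon) = \begin{pmatrix} -\partial^2 h^*/\partial\beta\,\partial\beta^T & -\partial^2 h^*/\partial\beta\,\partial\upsilon^T \\ -\partial^2 h^*/\partial\upsilon\,\partial\beta^T & -\partial^2 h^*/\partial\upsilon\,\partial\upsilon^T \end{pmatrix} = \begin{pmatrix} X^T W^* X & X^T W^* Z \\ Z^T W^* X & Z^T W^* Z + U \end{pmatrix}, \tag{14}$$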

where X and Z respectively denote the model matrices for β and υ, U = −∂²ℓ2/∂υ² = BD(Σ⁻¹, …, Σ⁻¹), with ℓ2 = Σi ℓ2i, is a q* × q* block diagonal matrix with q* = q × s, and the weight matrix W* = W*(β, υ) is given in Appendix 2 of Ha and Lee36. With the use of h*, we only need to invert the (p + q)-dimensional matrix H(h*; β, υ), leading to an efficient computation of the confidence interval for υ.
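The profiled HL in (13) is straightforward to evaluate for given (β, υ). The sketch below assumes right-censored data held in NumPy arrays and a design matrix Z for the stacked vector υ; the function name `profile_hl` and the three-observation example are hypothetical, and the code does not compute W* or the Hessian in (14).

```python
import numpy as np

def profile_hl(beta, v, X, Z, y, delta, Sigma):
    """Evaluate h* in (13) and the profiled baseline hazards.

    X: (n, p), Z: (n, q*s) design for stacked v, y: (n,) observed times,
    delta: (n,) event indicators, Sigma: (s, s) frailty covariance.
    """
    eta = X @ beta + Z @ v
    event_times = np.unique(y[delta == 1])        # y_(1) < ... < y_(r)
    # d_(k): number of events at each distinct event time
    d = np.array([np.sum((y == t) & (delta == 1)) for t in event_times])
    # Sum of exp(eta) over the risk set R_(k) = {(i, j): y_ij >= y_(k)}
    risk = np.array([np.sum(np.exp(eta[y >= t])) for t in event_times])
    lam_hat = d / risk
    # Sum over clusters of l_2i for v_i ~ N_s(0, Sigma)
    s = Sigma.shape[0]
    V = v.reshape(-1, s)                          # row i holds v_i
    _, logdet = np.linalg.slogdet(2 * np.pi * Sigma)
    quad = np.einsum("ij,jk,ik->", V, np.linalg.inv(Sigma), V)
    ell2 = -0.5 * V.shape[0] * logdet - 0.5 * quad
    h_star = np.sum(d * np.log(lam_hat)) + np.sum(delta * eta) - np.sum(d) + ell2
    return h_star, lam_hat

# Hypothetical toy data: 3 observations in 2 clusters, one random intercept
X = np.zeros((3, 1))
Z = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
delta = np.array([1, 0, 1])
h_star, lam_hat = profile_hl(np.array([0.0]), np.zeros(2), X, Z, y, delta,
                             np.array([[1.0]]))
print(h_star, lam_hat)
```

Maximizing this function over (β, υ) for fixed Σ would give the estimates used in the intervals; the optimizer is left out of the sketch.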

As in the previous section, the individual (1 − α)-level HL and EB confidence intervals for the uni-dimensional components υk of υ are of the form

$$\hat\upsilon_k \pm z_{\alpha/2} \cdot \mathrm{SE}(\hat\upsilon_k - \upsilon_k), \tag{15}$$

where υ̂ maximizes the profile HL h* in (13). Let A* be the lower-right-hand corner matrix of dimension s from H(h*; β̂, υ̂)⁻¹, corresponding to −∂²h*/∂υ∂υᵀ. Then, for HL confidence intervals, SE(υ̂k − υk) is √a*kk, where a*kk is the kth diagonal element of A* corresponding to υk. For EB confidence intervals, SE(υ̂k − υk) is simply √h*kk, where h*kk is the kth diagonal element of the matrix (−∂²h*/∂υ∂υᵀ)⁻¹. Therefore, SE(υ̂k − υk) from the HL method is never smaller than that from the EB method, so that the HL interval is at least as wide as the EB interval. In other words, as compared to the EB interval, the HL confidence interval takes into account the correlation between υ̂ − υ and β̂ reflected in the off-diagonal block of the Hessian matrix H(h*; β, υ). For the frailty model (6), Vaida and Xu2 used the EB interval based on the conditional distribution fκ̂(υ|yo*), where κ̂ = (β̂ᵀ, λ̂0ᵀ, ϕ̂ᵀ)ᵀ are the ML estimators. As in the case of GLMMs, this HL interval enjoys a confidence interval interpretation, based on the UMSE, and is also an approximate Bayesian credible interval under the uniform prior.
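The ordering of the HL and EB standard errors can be seen from the partitioned-inverse formula. The following is a brief sketch, assuming H(h*; β̂, υ̂) is positive definite; the block labels H11, H12 and H22 are shorthand introduced here for the blocks of (14).

```latex
\[
H(h^*;\beta,\upsilon) =
\begin{pmatrix} H_{11} & H_{12} \\ H_{12}^T & H_{22} \end{pmatrix},
\qquad
H_{11} = X^T W^* X,\quad H_{12} = X^T W^* Z,\quad H_{22} = Z^T W^* Z + U .
\]
\[
A^* = \left( H_{22} - H_{12}^T H_{11}^{-1} H_{12} \right)^{-1},
\qquad
\left( -\partial^2 h^* / \partial\upsilon\,\partial\upsilon^T \right)^{-1} = H_{22}^{-1} .
\]
\[
H_{12}^T H_{11}^{-1} H_{12} \succeq 0
\;\Longrightarrow\;
H_{22} - H_{12}^T H_{11}^{-1} H_{12} \preceq H_{22}
\;\Longrightarrow\;
A^* \succeq H_{22}^{-1}
\;\Longrightarrow\;
a^*_{kk} \ge h^*_{kk} .
\]
```

Equality holds when H12 = 0, that is, when β̂ and υ̂ − υ carry no cross-information.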

Lee and Nelder9, Lee and Ha7, Lee et al.30 and Ha et al.31 have used the HL intervals in (15) for various random-effect models. However, these intervals become null when variance-component estimates are zero, which can make them liberal when the variance components are small. In this paper, we propose a general modification in order to overcome this shortcoming.

4.2 Non-negative variance-component estimation

As discussed above, for the frailty model (6), the maximum HL estimators of τ = (βT, υT)T are obtained by solving, for given ϕ, the equation

$$\partial h^*/\partial\tau = (\partial h/\partial\tau)|_{\lambda_0 = \hat\lambda_0} = 0. \tag{16}$$

The estimation of ϕ can be carried out by extending Lee et al.’s11 restricted likelihood:

$$p_\tau(h^*) = \Big[ h^* - \frac{1}{2} \log\det\{H(h^*; \tau)/(2\pi)\} \Big]\Big|_{\tau = \hat\tau}, \tag{17}$$

where τ̂ = τ̂(ϕ) = (β̂ᵀ(ϕ), υ̂ᵀ(ϕ))ᵀ and, as before, H(h*; τ) = −∂²h*/∂τ∂τᵀ. The REML estimators for ϕ are obtained by maximizing pτ(h*). The restricted likelihood pτ(h*) is the first-order Laplace approximation to a modified marginal likelihood, which becomes exact as $N = \min_{1 \le i \le q} n_i \to \infty$. We found that increasing N rather than q reduces bias more effectively28: see Lee et al.11 and Ha et al.32 for further justification of asymptotic properties.
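The adjustment term in (17) can be sketched as a small helper. The function name `restricted_likelihood` is hypothetical, and the check below uses a Gaussian toy case, where the Laplace approximation is exact, rather than a frailty model.

```python
import numpy as np

def restricted_likelihood(h_at_mode, H_at_mode):
    """p_tau(h*) in (17): h* - 0.5 * log det{H / (2*pi)}, both evaluated at tau_hat."""
    sign, logdet = np.linalg.slogdet(H_at_mode / (2 * np.pi))
    if sign <= 0:
        raise ValueError("expected a positive determinant at the mode")
    return h_at_mode - 0.5 * logdet

# Toy check: h(tau) = -0.5 * tau' H tau has mode 0 and h(0) = 0, and
# log of the integral of exp(h) is (d/2) log(2*pi) - 0.5 * log det(H).
H = np.diag([2.0, 5.0])
p_val = restricted_likelihood(0.0, H)
exact = 0.5 * 2 * np.log(2 * np.pi) - 0.5 * np.log(np.linalg.det(H))
print(p_val, exact)
```

Using `slogdet` rather than `det` avoids overflow when the Hessian is large, which matters because its dimension grows with the number of clusters.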

This REML method based on (17) can give zero estimates for variance components σs² > 0 (s = 0, 1, 2, …, c).15 This leads to the null confidence interval for υ in (15). This issue was recognized by Morris14 in the context of linear mixed models. To extend the Morris method14, we propose the use of the adjusted likelihood padj, defined as

$$p_{adj} = p_\tau(h^*) + \log\det(\Sigma). \tag{18}$$

Note that the last term in (18) is asymptotically negligible (i.e. the maximum padj estimator has the same asymptotic properties as the maximum pτ(h*) estimator). Furthermore,

$$\exp(p_{adj}) = \exp\{p_\tau(h^*)\} \det(\Sigma) \ge 0$$

and exp(padj) = 0 only if det(Σ) = 0. Thus, by adding the last term we can effectively avoid zero estimates of dispersion parameters. Morris14 proposed this adjustment to avoid zero REML estimators in linear mixed models. Morris and Christiansen33 proposed a formula similar to (18) for a Weibull frailty model with one random component. The adjusted likelihood padj is always defined, even when the original restricted likelihood based upon the marginal likelihood is difficult to obtain. As shown in the subsequent sections, this correction works well in practice.
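The effect of the log det(Σ) term can be illustrated on a toy problem. The sketch below does not use the frailty restricted likelihood pτ(h*); as a stand-in it assumes a simple normal model yi ~ N(0, σ² + 1) with one variance component, so that det(Σ) = σ², and maximizes over a grid. The data values are hypothetical.

```python
import numpy as np

def toy_loglik(s2, y):
    """Marginal log-likelihood (up to a constant) for y_i ~ N(0, s2 + 1)."""
    return -0.5 * len(y) * np.log(s2 + 1.0) - 0.5 * np.sum(y ** 2) / (s2 + 1.0)

# Hypothetical data with small spread, so the unadjusted estimate hits zero
y = np.array([0.5, -0.8, 0.3, 1.0, -0.4])
grid = np.linspace(0.0, 5.0, 5001)

standard = toy_loglik(grid, y)
with np.errstate(divide="ignore"):
    adjusted = standard + np.log(grid)   # add log det(Sigma) = log(s2); -inf at 0

s2_standard = grid[np.argmax(standard)]
s2_adjusted = grid[np.argmax(adjusted)]
print(s2_standard, s2_adjusted)
```

Because the added term tends to −∞ as σ² approaches zero, the adjusted maximizer lies strictly inside the parameter space, which is the mechanism that rules out null intervals.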

In practice, when a zero estimate is obtained for a certain variance component, one may use the model without the corresponding random effects. In this case it is impossible to maintain the stated level of the confidence intervals (CIs). To avoid this drawback, Benjamini and Yekutieli (2005) proposed considering the false coverage rate, which is the proportion of non-covering CIs among all non-null CIs. How to form CIs that control the false coverage rate is a topic for further research. If we want CIs that maintain the stated level, we should use non-negative variance component estimators.

5 Simulation study

We conducted a numerical study, based upon 500 replications of simulated data, in order to compare the operating characteristics of the EB and HL intervals. Following the setup of the Vaida and Xu2 data analysis of a multicenter lung cancer clinical trial, we consider the following two frailty models, described in (7) and (9):

  • M1: ηij = β1xij1 + β2xij2 + υi0,

  • M2: ηij = (β1 + υi1)xij1 + β2xij2 + υi0.

In simulation studies allowing a covariance Σ in (10), we assume λ0(t) = 1, β1 = −0.5, β2 = 0.5, σ0² = σ1² = 0.2 for smaller variances and σ0² = σ1² = 1.0 for larger variances, and ρ = −0.5 in M2. Although not reported here, we found similar results for ρ = 0.5. The binary covariates xij1 and xij2 are each generated from a Bernoulli distribution with success probability 0.5. In the case study described in Section 6, there are 31 institutions (q = 31) and the average number of patients per institution (i.e., average ni) is 18.7. Thus, we consider the following total sample sizes $n = \sum_{i=1}^{q} n_i$: n = 150, 300, 600 and 1200, with (q, ni) = (30, 5), (60, 5), (30, 20) and (60, 20). The censoring times were generated from an exponential distribution with empirically determined parameter values to approximately achieve two right-censoring rates, low (around 15%) and high (around 50%). All of the computations were done using SAS/IML.
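The data-generating scheme for M1 can be sketched as follows. The original computations used SAS/IML; this Python version is only an illustration, and the function name `simulate_m1`, the seed, and the censoring rate parameter 0.2 are assumptions (the paper tuned the censoring parameter empirically and does not report it).

```python
import numpy as np

def simulate_m1(q, n_i, sigma0_sq, cens_rate, rng, beta=(-0.5, 0.5)):
    """Simulate clustered survival data under M1 with lambda_0(t) = 1."""
    n = q * n_i
    cluster = np.repeat(np.arange(q), n_i)
    v0 = rng.normal(0.0, np.sqrt(sigma0_sq), size=q)   # realized log-frailties
    x = rng.binomial(1, 0.5, size=(n, 2))              # two binary covariates
    eta = x @ np.asarray(beta) + v0[cluster]
    # With unit baseline hazard, T_ij is exponential with rate exp(eta_ij)
    T = rng.exponential(1.0 / np.exp(eta))
    # Exponential censoring; cens_rate is a hypothetical tuning value
    C = rng.exponential(1.0 / cens_rate, size=n)
    y = np.minimum(T, C)
    delta = (T <= C).astype(int)
    return {"cluster": cluster, "x": x, "y": y, "delta": delta, "v0": v0}

rng = np.random.default_rng(1)   # arbitrary seed
data = simulate_m1(q=30, n_i=5, sigma0_sq=0.2, cens_rate=0.2, rng=rng)
print("observed censoring proportion:", 1 - data["delta"].mean())
```

In practice `cens_rate` would be adjusted until the observed censoring proportion is close to the target of 15% or 50%.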

The standard HL estimates for fixed parameters β and ϕ = (σ0², σ1², σ01)ᵀ in M1 and M2 using (16) and (17) work well (not shown), as in the simulation results by Ha et al.31. However, the variance components (σ0², σ1²) are sometimes estimated to be zero, leading to null intervals. Table 1 shows the percentages of zero estimates of (σ0², σ1²). They increase with the censoring rate, which confirms the simulation results of Vu and Knuiman12. When σ0² = 0.2 in M1, the percentage of zero estimates of σ0² is 4.4% under 15% censoring with small samples (n = 150), but it is as high as 13.4% under 50% censoring. The trends in M2 are similar to those in M1.

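Table 1.

Simulation results for percentages of zero HL estimates of variance components (σ0², σ1²) over 500 replications under random center (M1) and correlated (M2) frailty models

M1

| σ0² | (q, ni) | Estimate | 15% censoring | 50% censoring |
|---|---|---|---|---|
| 0.2 | (30,5) | σ̂0² | 4.4 | 13.4 |
| 0.2 | (60,5) | σ̂0² | 0.4 | 2.8 |
| 0.2 | (30,20) | σ̂0² | 0 | 0 |
| 0.2 | (60,20) | σ̂0² | 0 | 0 |
| 1.0 | (30,5) | σ̂0² | 0 | 0 |
| 1.0 | (60,5) | σ̂0² | 0 | 0 |
| 1.0 | (30,20) | σ̂0² | 0 | 0 |
| 1.0 | (60,20) | σ̂0² | 0 | 0 |

M2

| (σ0², σ1², σ01) | (q, ni) | Estimate | 15% censoring | 50% censoring |
|---|---|---|---|---|
| (0.2, 0.2, −0.1) | (30,5) | σ̂0² | 1.2 | 2.0 |
| | | σ̂1² | 3.4 | 3.6 |
| | (60,5) | σ̂0² | 0.4 | 0.6 |
| | | σ̂1² | 1 | 3.6 |
| | (30,20) | σ̂0² | 0 | 0 |
| | | σ̂1² | 0.6 | 1.2 |
| | (60,20) | σ̂0² | 0 | 0 |
| | | σ̂1² | 0 | 0.4 |
| (1.0, 1.0, −0.5) | (30,5) | σ̂0² | 0 | 0.2 |
| | | σ̂1² | 0.6 | 0.6 |
| | (60,5) | σ̂0² | 0 | 0 |
| | | σ̂1² | 0 | 0.6 |
| | (30,20) | σ̂0² | 0 | 0 |
| | | σ̂1² | 0 | 0 |
| | (60,20) | σ̂0² | 0 | 0 |
| | | σ̂1² | 0 | 0 |

HL, h-likelihood; q, the number of clusters; ni, cluster size.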

HL(S) and EB(S) denote the HL and EB methods using standard REML estimators for variance components based on (17), whereas HL(A) and EB(A) denote those using adjusted REML estimators based on (18). In each simulation we first generate υ = υ* from fθ(υ). Then, we generate yo* from fθ(y*|υ*), where yo* = (yo, δo) and y* = (y, δ). Based on yo*, we form the EB and HL intervals for υ*. Then, we compute the coverage probabilities of the CIs for the individual υi*, i = 1, ⋯, q, over the 500 replications. From the q coverage probabilities we form box-plots. Figure 1 shows the coverage probabilities of the nominal 95% intervals for individual random effects (υi0's) in M1 under 15% censoring. For a small variance (σ0² = 0.2) and small sample (ni = 5), both HL(S) and EB(S) intervals are liberal because they often give null intervals. Figure 1 shows that the adjustments HL(A) and EB(A) adequately correct this issue. For a large variance (σ0² = 1.0), however, the EB(A) interval does not maintain the nominal level, even when ni is large. In contrast, the HL(A) intervals maintain the nominal 95% level in all cases studied, indicating that it is necessary to correct for the uncertainty in the estimation of β. The coverage probabilities (see Figure 1.1) of intervals for all υi0 in M2 are consistent with the simulation for M1. Although not shown here, for M2, the results for the υi1's are similar to those for the υi0's.
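The per-cluster coverage computation described above can be sketched as follows, assuming the estimates, standard errors and realized random effects from R replications are stored as R × q arrays. The function name and the tiny example arrays are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def coverage_by_cluster(v_hat, se, v_true, alpha=0.05):
    """Proportion of replications whose interval covers the realized v_i.

    v_hat, se, v_true: arrays of shape (R, q); returns an array of shape (q,).
    A null interval (se == 0) covers only if v_hat equals v_true exactly.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    covered = np.abs(v_hat - v_true) <= z * se
    return covered.mean(axis=0)

# Hypothetical example: R = 2 replications, q = 2 clusters
v_true = np.zeros((2, 2))
v_hat = np.array([[0.1, 3.0],
                  [-0.2, 0.5]])
se = np.ones((2, 2))
cov = coverage_by_cluster(v_hat, se, v_true)
print(cov)
```

The q values returned by this function are what would be summarized in a box-plot for each method and sample size.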

Figure 1.

Simulation results for coverage probabilities (y-label) of the nominal 95% (dotted line) EB and HL intervals of individual random effects (υi0's) in frailty model (M1) under 15% censoring, with 500 replications; boxplots for υi0's under σ0² = 0.2 ((a), (b), (c), (d)) and σ0² = 1 ((e), (f), (g), (h)). (30, 5), (60, 5), (30, 20), and (60, 20) in x-label indicate the sample size (q, ni), respectively, corresponding to the total sample size n = 150, 300, 600, and 1200. q is the number of clusters and ni is the cluster size.

Figure 1.1.

Simulation results for coverage probabilities (y-label) of the nominal 95% (dotted line) EB and HL intervals of individual random effects (υi0's) in correlated frailty model (M2) under 15% censoring, with 500 replications; boxplots for υi0's under σ0² = σ1² = 0.2 with ρ = −0.5 ((a), (b), (c), (d)) and σ0² = σ1² = 1 with ρ = −0.5 ((e), (f), (g), (h)). (30, 5), (60, 5), (30, 20), and (60, 20) in x-label indicate the sample size (q, ni), respectively, corresponding to the total sample size n = 150, 300, 600, and 1200. q is the number of clusters and ni is the cluster size.

Table 2 summarizes the mean coverage probabilities of the nominal 95% intervals for individual random effects υi0 in M1 and M2, with 500 replications. Overall, the HL(A) intervals work well, judging from the fact that their mean coverage probabilities are close to the nominal 95% level. In particular, the results for 50% censoring are similar to those for 15% censoring, and they improve as $N = \min_{1 \le i \le q} n_i$ increases. Furthermore, we also conducted simulations under an extreme case in which the cluster size ni ≡ 2. Here we considered three sample sizes, (q, ni) = (50, 2), (100, 2) and (200, 2). Table 2 shows that the proposed HL(A) intervals improve as q increases.

Table 2.

Simulation results for means of coverage probabilities of the nominal 95% EB (empirical Bayes) and HL (h-likelihood) intervals of individual random effects (υi0's) in random center (M1) and correlated (M2) frailty models, with 500 replications

M1

| σ0² | (q, ni) | EB(S), 15% | HL(S), 15% | EB(A), 15% | HL(A), 15% | EB(S), 50% | HL(S), 50% | EB(A), 50% | HL(A), 50% |
|---|---|---|---|---|---|---|---|---|---|
| 0.2 | (30,5) | 0.860 | 0.868 | 0.939 | 0.947 | 0.774 | 0.778 | 0.952 | 0.958 |
| 0.2 | (60,5) | 0.908 | 0.911 | 0.942 | 0.946 | 0.869 | 0.871 | 0.950 | 0.953 |
| 0.2 | (30,20) | 0.930 | 0.944 | 0.935 | 0.950 | 0.927 | 0.936 | 0.938 | 0.949 |
| 0.2 | (60,20) | 0.938 | 0.946 | 0.940 | 0.948 | 0.938 | 0.943 | 0.943 | 0.947 |
| 1.0 | (30,5) | 0.909 | 0.936 | 0.913 | 0.943 | 0.907 | 0.924 | 0.921 | 0.941 |
| 1.0 | (60,5) | 0.927 | 0.941 | 0.929 | 0.943 | 0.926 | 0.934 | 0.932 | 0.941 |
| 1.0 | (30,20) | 0.877 | 0.948 | 0.878 | 0.953 | 0.901 | 0.945 | 0.904 | 0.949 |
| 1.0 | (60,20) | 0.913 | 0.948 | 0.914 | 0.949 | 0.922 | 0.946 | 0.924 | 0.948 |
| 0.2 | (50,2) | 0.802 | 0.808 | 0.960 | 0.968 | 0.740 | 0.744 | 0.974 | 0.979 |
| 0.2 | (100,2) | 0.848 | 0.850 | 0.961 | 0.964 | 0.783 | 0.785 | 0.975 | 0.977 |
| 0.2 | (200,2) | 0.872 | 0.873 | 0.949 | 0.950 | 0.811 | 0.811 | 0.963 | 0.963 |
| 1.0 | (50,2) | 0.901 | 0.914 | 0.920 | 0.938 | 0.875 | 0.884 | 0.927 | 0.934 |
| 1.0 | (100,2) | 0.922 | 0.928 | 0.931 | 0.938 | 0.900 | 0.904 | 0.929 | 0.938 |
| 1.0 | (200,2) | 0.933 | 0.936 | 0.938 | 0.941 | 0.913 | 0.915 | 0.930 | 0.939 |

M2

| (σ0², σ1², σ01) | (q, ni) | EB(S), 15% | HL(S), 15% | EB(A), 15% | HL(A), 15% | EB(S), 50% | HL(S), 50% | EB(A), 50% | HL(A), 50% |
|---|---|---|---|---|---|---|---|---|---|
| (0.2, 0.2, −0.1) | (30,5) | 0.834 | 0.842 | 0.957 | 0.964 | 0.838 | 0.843 | 0.969 | 0.974 |
| (0.2, 0.2, −0.1) | (60,5) | 0.858 | 0.860 | 0.953 | 0.956 | 0.840 | 0.842 | 0.966 | 0.968 |
| (0.2, 0.2, −0.1) | (30,20) | 0.909 | 0.920 | 0.936 | 0.947 | 0.894 | 0.901 | 0.943 | 0.949 |
| (0.2, 0.2, −0.1) | (60,20) | 0.931 | 0.934 | 0.942 | 0.945 | 0.927 | 0.930 | 0.946 | 0.949 |
| (1.0, 1.0, −0.5) | (30,5) | 0.882 | 0.901 | 0.919 | 0.937 | 0.867 | 0.882 | 0.928 | 0.942 |
| (1.0, 1.0, −0.5) | (60,5) | 0.909 | 0.917 | 0.927 | 0.935 | 0.897 | 0.902 | 0.932 | 0.938 |
| (1.0, 1.0, −0.5) | (30,20) | 0.899 | 0.940 | 0.902 | 0.945 | 0.908 | 0.934 | 0.916 | 0.942 |
| (1.0, 1.0, −0.5) | (60,20) | 0.926 | 0.944 | 0.927 | 0.946 | 0.929 | 0.940 | 0.932 | 0.943 |
| (0.2, 0.2, −0.1) | (50,2) | 0.833 | 0.839 | 0.957 | 0.968 | 0.840 | 0.843 | 0.974 | 0.982 |
| (0.2, 0.2, −0.1) | (100,2) | 0.855 | 0.857 | 0.972 | 0.976 | 0.842 | 0.844 | 0.986 | 0.988 |
| (0.2, 0.2, −0.1) | (200,2) | 0.875 | 0.876 | 0.965 | 0.966 | 0.839 | 0.840 | 0.970 | 0.971 |
| (1.0, 1.0, −0.5) | (50,2) | 0.860 | 0.871 | 0.931 | 0.947 | 0.835 | 0.843 | 0.953 | 0.963 |
| (1.0, 1.0, −0.5) | (100,2) | 0.887 | 0.893 | 0.942 | 0.948 | 0.853 | 0.857 | 0.945 | 0.952 |
| (1.0, 1.0, −0.5) | (200,2) | 0.911 | 0.913 | 0.945 | 0.947 | 0.866 | 0.867 | 0.938 | 0.940 |

EB(S) & HL(S), standard EB & HL; EB(A) & HL(A), adjusted EB & HL; 15% and 50%, censoring rates; q, number of clusters; ni, cluster size.

In summary, we recommend the use of HL(A) for confidence intervals of realized values υ* of random effects.

6 Case study: a multicenter lung cancer clinical trial

We examine the data from the EST 1582 multicenter lung cancer trial34. This trial enrolled 579 patients from 31 distinct institutions (centers). The number of patients per institution varied from 1 to 56, with a mean of 18.7 and median of 17. The subjects were randomized to one of two treatment arms, standard chemotherapy (CAV) or an alternating regimen (CAV-HEM). The primary endpoint was the time (in years) from randomization to death. The study had a high mortality rate: of the 579 patients, 569 died (censoring rate = 1.7%), with a median survival time of 0.86 years and maximum follow-up of 8.45 years. The five dichotomous covariates considered are treatment (xij1 is 0 for CAV and 1 for CAV-HEM), presence(1) or absence(0) of bone metastases (xij2), presence(1) or absence(0) of liver metastases (xij3), whether the subject was ambulatory (xij4 = 1) or confined to bed or chair (xij4 = 0), and whether there was weight loss prior to entry (xij5).

These data were previously analyzed using a fully Bayesian approach via Gibbs sampling by Gray3 and by marginal likelihood via Monte Carlo EM by Vaida and Xu2. Vaida and Xu2 used EB interval estimation for random effects. Gray3 used a restricted data set (570 patients from 26 institutions) with cluster size ni ≥ 5 and a single covariate xij1. Such a restriction is often applied with fixed effects because random-effect estimates for centers with fewer patients are imprecise. With a random-effect model, however, this uncertainty is automatically accounted for. Thus, our approach can include all centers, even those with only a single patient. However, all of the authors above assumed ρ = 0 for the two random effects.

Let υi0 and υi1 be random center effects and random treatment effects, and let υi2 be random effects for presence of bone metastases. Following Vaida and Xu2, we consider the following four models, which include Cox model without random effects and three frailty models (FMs):

  • FM0 (Cox model): ηij = β1xij1 + β2xij2 + β3xij3 + β4xij4 + β5xij5;

  • FM1 (model with υi0): ηij = υi0 + β1xij1 + β2xij2 + β3xij3 + β4xij4 + β5xij5;

  • FM2 (model with υi0 & υi1): ηij = υi0 + (β1 + υi1)xij1 + β2xij2 + β3xij3 + β4xij4 + β5xij5;

  • FM3 (model with υi1 & υi2): ηij = (β1 + υi1)xij1 + (β2 + υi2)xij2 + β3xij3 + β4xij4 + β5xij5.
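The four linear predictors above differ only in which random effects enter and which covariate each one multiplies. A minimal Python sketch of this bookkeeping follows; the function name `linear_predictor` and its argument names are illustrative assumptions, not part of the original analysis, and the numeric inputs in the usage line are hypothetical.

```python
import numpy as np

def linear_predictor(model, x, beta, v0=0.0, v1=0.0, v2=0.0):
    """Return eta_ij for one patient of center i under FM0, FM1, FM2 or FM3.

    x    : covariates (x_ij1, ..., x_ij5); x[0] is treatment, x[1] is bone
    beta : fixed effects (beta_1, ..., beta_5)
    v0, v1, v2 : random center, treatment and bone effects of center i
    """
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    fixed = float(x @ beta)  # beta_1*x_ij1 + ... + beta_5*x_ij5
    if model == "FM0":       # Cox model, no random effects
        return fixed
    if model == "FM1":       # random center effect only
        return fixed + v0
    if model == "FM2":       # random center and random treatment effects
        return fixed + v0 + v1 * x[0]
    if model == "FM3":       # random treatment and random bone effects
        return fixed + v1 * x[0] + v2 * x[1]
    raise ValueError(f"unknown model: {model}")

# Hypothetical patient: treated, bone metastases, ambulatory.
eta = linear_predictor("FM3", [1, 1, 0, 1, 0],
                       [-0.23, 0.26, 0.39, -0.65, 0.21], v1=0.1, v2=-0.2)
```

Writing the models this way makes it explicit that FM1 is FM2 with υi1 fixed at zero, and that FM3 drops the random center effect in favor of a random bone effect.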

In the models with two random components (FM2 and FM3) we always allow the two random effects to be correlated. The HL results are listed in Table 3. The HL estimates are similar to the marginal likelihood estimates of Vaida and Xu2. Ha et al.24 and Plummer35 pointed out that the use of the Akaike information criterion (AIC) for model selection can be problematic when there are random effects. To avoid this problem when comparing models that share the same fixed effects, as above, Ha et al.24 proposed an AIC based on the restricted likelihood (17), denoted hAIC:

$$\mathrm{hAIC} = -2\,p_{\tau}(h^{*}) + 2d, \qquad (19)$$

where the restricted likelihood pτ(h*) is a function only of ϕ and d is the number of frailty parameters in ϕ. With the hAIC, FM3 is chosen. Under FM3, $\sigma_2^2 = \mathrm{var}(\upsilon_{i2})$ is much larger than $\sigma_1^2 = \mathrm{var}(\upsilon_{i1})$, which is also confirmed in Figure 2.
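The hAIC differences reported in Table 3 can be checked by adding 2d to each −2pτ(h*) value in the table and subtracting the smallest result. The sketch below does this in Python. The counts d used here (0, 1, 3, 3 for FM0 to FM3, that is, one variance for FM1 and two variances plus one covariance for FM2 and FM3) are an assumption inferred from the model definitions, and the variable and function names are illustrative.

```python
# -2 * p_tau(h*) values as reported in Table 3.
neg2_restricted_loglik = {"FM0": 6112.2, "FM1": 6112.2,
                          "FM2": 6108.2, "FM3": 6100.5}

# Assumed number of frailty parameters d in each model (not stated explicitly
# in the text): none, one variance, and two variances plus a covariance.
n_frailty_params = {"FM0": 0, "FM1": 1, "FM2": 3, "FM3": 3}

def haic_differences(neg2_pl, d):
    """Return hAIC minus the smallest hAIC for each model, rounded to 0.1."""
    haic = {m: neg2_pl[m] + 2 * d[m] for m in neg2_pl}
    best = min(haic.values())
    return {m: round(haic[m] - best, 1) for m in haic}

diffs = haic_differences(neg2_restricted_loglik, n_frailty_params)
selected = min(diffs, key=diffs.get)  # model with the smallest hAIC
```

Under these assumed counts the computed differences agree with the hAIC row of Table 3, and the selected model is FM3.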

Table 3.

HL analyses for lung cancer data under four frailty models (FMs)

| Parameter | FM0 Est. (SE) | FM1 Est. (SE) | FM2 Est. (SE) | FM3 Est. (SE) |
|---|---|---|---|---|
| Trt β1 | −0.25 (0.09) | −0.25 (0.09) | −0.22 (0.10) | −0.23 (0.10) |
| Bone β2 | 0.22 (0.09) | 0.22 (0.09) | 0.22 (0.10) | 0.26 (0.13) |
| Liver β3 | 0.43 (0.09) | 0.43 (0.09) | 0.42 (0.09) | 0.39 (0.09) |
| PS β4 | −0.60 (0.10) | −0.60 (0.10) | −0.64 (0.11) | −0.65 (0.11) |
| WL β5 | 0.20 (0.09) | 0.20 (0.09) | 0.22 (0.09) | 0.21 (0.09) |
| Center $\sigma_0^2$ | | 0.000 | 0.001 | |
| Center-Trt $\sigma_1^2$ | | | 0.059 | 0.053 |
| Center-Bone $\sigma_2^2$ | | | | 0.135 |
| ρ [σ01 or σ12] | | | 0.974 [0.006] | 0.187 [0.016] |
| −2pτ(h*) | 6112.2 | 6112.2 | 6108.2 | 6100.5 |
| hAIC | 5.7 | 7.7 | 7.7 | 0 |

FM0, Cox model without random effects;

FM1, frailty model with one random-center effect;

FM2, correlated frailty model with random center and random treatment-by-center interaction (Center-Trt);

FM3, correlated frailty model with random treatment-by-center (Center-Trt) and random bone-by-center interaction (Center-Bone);

Trt, treatment; PS, ambulatory performance status; WL, weight loss;

pτ (h*), the restricted likelihood given in (17);

hAIC, AIC differences in (19) where the smallest AIC is adjusted to be zero.

Figure 2.

Random effects and their 95% confidence intervals using HL(A) under correlated frailty model (FM3) allowing dependency between random treatment and random bone for the lung cancer trial; (a) random treatment effects; (b) random bone effects; 31 institutions are sorted in increasing order of number of patients.

The HL(A) random-effect estimates and their 95% intervals for each institution under the final model FM3 are plotted in Figure 2(a) and (b). In particular, Figure 2(b) shows substantial institutional variation among the bone effects; the HL(A) intervals for two institutions (22 and 29) do not include zero. Overall, the HL(A) intervals are similar to the EB(S) intervals obtained by Vaida and Xu2.

Table 3 shows that in FM1 and FM2 the HL estimates of the variance of the random center effect, $\sigma_0^2 = \mathrm{var}(\upsilon_{i0})$, are very close to zero. We therefore investigate the behavior of the proposed adjustment method HL(A), as compared with the standard HL(S), in the case of zero frailty variance. For this we consider the correlated model FM2, which generalizes FM1. The HL random-effect estimates and their 95% intervals for each institution under FM2 are plotted in Figure 3. For the random center effects, the HL(S) and HL(A) intervals are shown in Figures 3(a) and 3(b), respectively. Because the estimate of $\sigma_0^2$ is almost zero, the HL(S) intervals in Figure 3(a) are essentially null, whereas the HL(A) intervals in Figure 3(b) have an average length of 0.57 with a standard deviation of 0.05. However, all HL(A) intervals in Figure 3(b) include zero, strongly indicating that the random center effects could be null. By having non-null intervals, HL(A) maintains the stated level of confidence. In particular, the non-null HL(A) intervals and the homogeneity over institutions seen in the HL(A) plot are similar to those in Gray’s Figure 1(a), which uses fully Bayesian credible intervals.

Next, for the random treatment effects we present only the HL(A) intervals, shown in Figure 3(c), because they were similar to the HL(S) intervals. Figure 3(c) shows that there is substantial variation in the treatment effect over institutions.2,3 For example, six institutions (13, 16, 18, 19, 28, 29) noticeably stand out. Gray3 noted that exp(υi1) = exp(β1 + υi1)/exp(β1) is the ratio of the treatment hazard rate in the ith institution to the overall treatment hazard rate, so that a decreasing value of υi1 corresponds to an increase of the treatment effect.25
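As a small numerical illustration of the relation exp(υi1) = exp(β1 + υi1)/exp(β1), the Python sketch below takes β1 = −0.22 from the FM2 column of Table 3 and a hypothetical random treatment effect υi1 = −0.30. The value of υi1 and the function name are assumptions chosen only for illustration; they are not estimates from the trial.

```python
import math

def treatment_hazard_ratios(beta1, v_i1):
    """Return (overall, institution-specific, ratio) treatment hazard ratios."""
    overall = math.exp(beta1)             # exp(beta_1)
    institution = math.exp(beta1 + v_i1)  # exp(beta_1 + v_i1)
    return overall, institution, institution / overall

# beta1 from Table 3 (FM2); v_i1 is a hypothetical institution effect.
overall, institution, ratio = treatment_hazard_ratios(beta1=-0.22, v_i1=-0.30)
```

Here the ratio equals exp(−0.30), which is below one: a negative υi1 pushes the institution-specific hazard ratio further below the overall one, that is, toward a stronger treatment effect in that institution.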

Figure 3.

Random effects and their 95% confidence intervals using HL under correlated frailty model (FM2) allowing dependency between random center and random treatment for the lung cancer trial; (a) random center effects using HL(S); (b) random center effects using HL(A); (c) random treatment effects using HL(A); 31 institutions are sorted in increasing order of number of patients.

7 Discussion

Several methods for generating confidence intervals have been proposed for mixed effects models.15 In frailty models, HL leads to confidence intervals for random effects, improving upon the EB intervals by appropriately accounting for the variability in the estimation of the fixed effects. Numerical studies show that this improvement is substantial in terms of obtaining close to nominal coverage for these intervals, whereas the EB intervals suffer from under-coverage, especially for large variance components. However, these intervals can be null due to zero estimation of the variance components, especially for small sample sizes and/or small variance components. A correction proposed by Morris14 can be extended to avoid null intervals. These methods are straightforward to implement, and integrate seamlessly with the estimation of the general frailty model using HL.

In the literature, the word prediction is often used in reference to random effects to distinguish this from the estimation of the fixed effects. However, in this paper, following Lee et al.11, we use the term estimation, and we view the random effects as already realized unknowns. Accordingly, for simplicity, we use the term confidence interval for the interval that summarizes the uncertainty of estimation of random effects.

Acknowledgements

The authors thank Professor Robert J. Gray for the permission to use the lung cancer trial data from the Eastern Cooperative Oncology Group (ECOG) study (E1582). This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (No. 2009-0088978 and 2010-0011372).

Footnotes

Conflict of interest statement

The Authors declare that there is no conflict of interest.

Contributor Information

Il Do Ha, Department of Asset Management, Daegu Haany University, Gyeongsan 712-715, South Korea idha@dhu.ac.kr.

Florin Vaida, Division of Biostatistics and Bioinformatics, Department of Family and Preventive Medicine, University of California, San Diego, USA fvaida@ucsd.edu.

Youngjo Lee, Department of Statistics, Seoul National University, Seoul 151-742, South Korea youngjo@snu.ac.kr.

References

  • 1. Hougaard P. Analysis of Multivariate Survival Data. New York: Springer; 2000.
  • 2. Vaida F, Xu R. Proportional hazards model with random effects. Statistics in Medicine. 2000;19:3309–3324. doi: 10.1002/1097-0258(20001230)19:24<3309::aid-sim825>3.0.co;2-9.
  • 3. Gray RJ. A Bayesian analysis of institutional effects in multicenter cancer clinical trial. Biometrics. 1994;50:244–253.
  • 4. Carlin BP, Louis TA. Bayes and Empirical Bayes Methods for Data Analysis. 2nd edn. London: Chapman and Hall; 2000.
  • 5. Legrand C, Ducrocq V, Janssen P, Sylvester R, Duchateau L. A Bayesian approach to jointly estimate centre and treatment by centre heterogeneity in a proportional hazards model. Statistics in Medicine. 2005;24:3789–3804. doi: 10.1002/sim.2475.
  • 6. Komarek A, Lesaffre E, Legrand C. Baseline and treatment effect heterogeneity for survival times between centers using a random effects accelerated failure time model with flexible error distribution. Statistics in Medicine. 2007;26:5457–5472. doi: 10.1002/sim.3083.
  • 7. Lee Y, Ha ID. Orthodox BLUP versus HL methods for inferences about random effects in Tweedie mixed models. Statistics and Computing. 2010;20:295–303.
  • 8. Lee Y, Nelder JA. Hierarchical generalized linear models (with discussion). Journal of the Royal Statistical Society, Series B. 1996;58:619–678.
  • 9. Lee Y, Nelder JA. Likelihood inference for models with unobservables: another view (with discussion). Statistical Science. 2009;24:255–293.
  • 10. Meng X-L. What’s the H in H-likelihood: A holy grail or an achilles’ heel? (with discussion). Bayesian Statistics. 2011;9:473–500.
  • 11. Lee Y, Nelder JA, Pawitan Y. Generalised Linear Models with Random Effects: Unified Analysis via h-Likelihood. Chapman and Hall; 2006.
  • 12. Vu HTV, Knuiman MW. A hybrid ML-EM algorithm for calculation of maximum likelihood estimates in semiparametric shared frailty models. Computational Statistics and Data Analysis. 2002;40:173–187.
  • 13. Fay RE, Herriot RA. Estimates of income for small places: an application of James-Stein procedure to Census data. Journal of the American Statistical Association. 1979;74:269–277.
  • 14. Morris CN. Mixed model prediction and small area estimation. Test. 2006;15:72–76.
  • 15. Li H, Lahiri P. An adjusted maximum likelihood method for solving small area estimation problems. Journal of Multivariate Analysis. 2010;101:882–892. doi: 10.1016/j.jmva.2009.10.009.
  • 16. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. Chapman and Hall; 1993.
  • 17. Henderson CR. Best linear unbiased estimation and prediction under a selection model. Biometrics. 1975;31:423–447.
  • 18. Booth JG, Hobert JP. Standard errors of prediction in generalized linear mixed models. Journal of the American Statistical Association. 1998;93:262–272.
  • 19. Morris CN. Parametric empirical Bayes inference: theory and application. Journal of the American Statistical Association. 1983;78:47–59.
  • 20. Bjørnstad JF. Predictive likelihood principle: a review (with discussion). Statistical Science. 1990;5:242–265.
  • 21. Carlin BP, Gelfand AE. Approaches for empirical Bayes confidence intervals. Journal of the American Statistical Association. 1990;85:105–114.
  • 22. Laird NM, Louis TA. Empirical Bayes confidence intervals based on bootstrap samples. Journal of the American Statistical Association. 1987;82:739–750.
  • 23. Kass RE, Steffey D. Approximate Bayesian inference in conditionally independent hierarchical models (parametric empirical Bayes models). Journal of the American Statistical Association. 1989;84:717–726.
  • 24. Ha ID, Lee Y, MacKenzie G. Model selection for multi-component frailty models. Statistics in Medicine. 2007;26:4790–4807. doi: 10.1002/sim.2879.
  • 25. Rondeau V, Michiels S, Liquet B, Pignon JP. Investigating trial and treatment heterogeneity in an individual patient data meta-analysis of survival data by means of the penalized maximum likelihood approach. Statistics in Medicine. 2008;27:1894–1910. doi: 10.1002/sim.3161.
  • 26. Ha ID, Lee Y, Song JK. Hierarchical likelihood approach for frailty models. Biometrika. 2001;88:233–243.
  • 27. Breslow NE. Discussion of Professor Cox’s paper. Journal of the Royal Statistical Society, Series B. 1972;34:216–217.
  • 28. Ha ID, Lee Y. Comparison of hierarchical likelihood versus orthodox best linear unbiased predictor approaches for frailty models. Biometrika. 2005;92:717–723.
  • 29. Ripatti S, Palmgren J. Estimation of multivariate frailty models using penalized partial likelihood. Biometrics. 2000;56:1016–1022. doi: 10.1111/j.0006-341x.2000.01016.x.
  • 30. Lee Y, Jang M, Lee W. Prediction interval for disease mapping using hierarchical likelihood. Computational Statistics. 2011;26:159–179.
  • 31. Ha ID, Sylvester R, Legrand C, MacKenzie G. Frailty modelling for survival data from multi-centre clinical trials. Statistics in Medicine. 2011;30:2144–2159. doi: 10.1002/sim.4250.
  • 32. Ha ID, Noh M, Lee Y. Bias reduction of likelihood estimators in semi-parametric frailty models. Scandinavian Journal of Statistics. 2010;37:307–320.
  • 33. Morris CN, Christiansen C. Fitting Weibull duration models with random effects. Lifetime Data Analysis. 1995;1:347–359. doi: 10.1007/BF00985449.
  • 34. Ettinger DS, Finkelstein DM, Abeloff MD, Ruckdeschel JC, Aisner SC, Eggleston JC. A randomized comparison of standard chemotherapy versus alternating chemotherapy and maintenance versus no maintenance therapy for extensive-stage small-cell lung cancer: a phase III study of the Eastern Cooperative Oncology Group. Journal of Clinical Oncology. 1990;8:230–240. doi: 10.1200/JCO.1990.8.2.230.
  • 35. Plummer M. Penalized loss functions for Bayesian model comparison. Biostatistics. 2008;9:523–539. doi: 10.1093/biostatistics/kxm049.
  • 36. Ha ID, Lee Y. Estimating frailty models via Poisson hierarchical generalized linear models. Journal of Computational and Graphical Statistics. 2003;12:663–681.
  • Benjamini Y, Yekutieli D. False discovery rate-adjusted multiple confidence intervals for selected parameters (with discussion). Journal of the American Statistical Association. 2005;100:71–93.
