Author manuscript; available in PMC: 2011 Aug 26.
Published in final edited form as: Lifetime Data Anal. 2010 Jun 12;17(1):80–100. doi: 10.1007/s10985-010-9169-6

A general joint model for longitudinal measurements and competing risks survival data with heterogeneous random effects

Xin Huang 1, Gang Li 2, Robert M Elashoff 3, Jianxin Pan 4
PMCID: PMC3162577  NIHMSID: NIHMS316707  PMID: 20549344

Abstract

This article studies a general joint model for longitudinal measurements and competing risks survival data. The model consists of a linear mixed effects sub-model for the longitudinal outcome, a proportional cause-specific hazards frailty sub-model for the competing risks survival data, and a regression sub-model for the variance–covariance matrix of the multivariate latent random effects based on a modified Cholesky decomposition. The model provides a useful approach to adjust for non-ignorable missing data due to dropout for the longitudinal outcome, enables analysis of the survival outcome with informative censoring and intermittently measured time-dependent covariates, as well as joint analysis of the longitudinal and survival outcomes. Unlike previously studied joint models, our model allows for heterogeneous random covariance matrices. It also offers a framework to assess the homogeneous covariance assumption of existing joint models. A Bayesian MCMC procedure is developed for parameter estimation and inference. Its performances and frequentist properties are investigated using simulations. A real data example is used to illustrate the usefulness of the approach.

Keywords: Cause-specific hazard, Bayesian analysis, Cholesky decomposition, Mixed effects model, MCMC, Modeling covariance matrices

1 Introduction

Joint modeling of longitudinal and survival data has received a great deal of attention in the past decades in many studies in which both a longitudinal outcome during follow-up and the occurrence of some key events are recorded. In the statistical literature, joint models have been proposed to adjust inferences on longitudinal measurements in the presence of non-ignorable missing values due to dropout (Schluchter 1992; DeGruttola and Tu 1994; Little 1995; Hogan and Laird 1997; Henderson et al. 2000; Elashoff et al. 2007, 2008; Hu et al. 2009); to solve difficulties in Cox proportional hazards model arising from time-dependent covariates which are possibly missing at some event times or subject to substantial measurement error (Faucett and Thomas 1996; Wulfsohn and Tsiatis 1997; Faucett et al. 1998; Wang and Taylor 2001; Xu and Zeger 2001; Song et al. 2002; Brown and Ibrahim 2003; Tseng et al. 2005; Ye et al. 2008); and to assess covariates effects on both endpoints simultaneously (Henderson et al. 2000; Zeng and Cai 2005; Elashoff et al. 2007, 2008; Liu et al. 2008).

All the aforementioned joint models assume that the random effects covariance matrix is the same for all subjects. However, whether this matrix is the same for all subjects (homogeneous) or differs depending on subject-specific characteristics (heterogeneous) is often left unexamined in the modeling. Furthermore, ignoring the heterogeneity can result in biased estimates of the fixed and random effects for the longitudinal outcome (Heagerty and Kurland 2001; Daniels and Zhao 2003). Accounting for heterogeneity in covariance matrices has been discussed by several authors in the field of generalized linear regression models (Chiu et al. 1996), non-linear mixed models (Davidian and Giltinan 1995), and linear mixed models (Pourahmadi and Daniels 2002; Lin et al. 1997; Zhang and Weiss 2000; Daniels and Zhao 2003). Nonetheless, no work has been done on modeling the entire random effects covariance matrix for joint models.

In this paper, we propose an approach that allows a heterogeneous random effects covariance matrix within the framework of joint analysis of longitudinal measurements and competing risks failure time data. In our joint model, a linear mixed effects sub-model is used to characterize the longitudinal measurements and a cause-specific hazards frailty sub-model is used for the competing risks survival data (Prentice and Breslow 1978), together with a regression sub-model for the joint multivariate random effects covariance matrix, which links the first two sub-models. Specifically, we first use a modified Cholesky decomposition to decompose the covariance matrix into a lower-triangular matrix and a diagonal matrix, and then model these matrix entries using regression models (Pourahmadi 1999; Daniels and Zhao 2003). By jointly modeling the random effects covariance matrices, our model is distinct from previously studied joint models (e.g. Hu et al. 2009) that consist of only two sub-models, for the longitudinal and survival outcomes respectively. Our model has several advantages. First, unlike existing joint models that assume a homogeneous covariance matrix, our model allows for heterogeneous covariance matrices. Second, as discussed in the remark of Sect. 2, our model includes homogeneous models as special cases. Third, the covariance model enables dimension reduction. With different choices of regression covariates, it provides a flexible means to model the heterogeneity and reduce the number of variance–covariance parameters to be estimated. Fourth, the resulting estimated covariance matrices of the multivariate random effects are guaranteed to be positive definite, which is not always the case for other existing joint models. Finally, our model provides a useful framework to assess the homogeneous covariance assumption of existing joint models, which is otherwise untestable.

Likelihood-based inference for our model is rather challenging with high-dimensional random effects. We develop a Bayesian MCMC algorithm to fit the joint model. The Gibbs sampling technique, together with Metropolis–Hastings sampling and adaptive rejection sampling (ARS) methods, is used to draw random samples from the full conditional distributions of the parameters. With the Bayesian approach, prior information can be incorporated in a natural way. If no prior information is available, we recommend noninformative priors for the parameters to allow the data to dominate the determination of the posterior distributions.

This paper is organized as follows: in Sect. 2 we describe the joint model formulation. In Sect. 3 we develop the Bayesian estimation and inference methods. In Sect. 4, a real data application is illustrated using the data from the Scleroderma clinical trial (Tashkin et al. 2006). In Sect. 5, the performance of our method is examined by simulation studies. Some concluding remarks are provided in Sect. 6. Details of the MCMC algorithm are deferred to the Appendix.

2 Joint model

Our joint model consists of three components: a linear mixed effect model for the longitudinal measurements, a cause-specific hazards model for the competing risks survival data, and a regression model for the variance–covariance matrix of the multivariate latent random effects based on a modified Cholesky decomposition.

2.1 Longitudinal sub-model

Suppose there are m subjects in the study. For the ith subject at time t, the longitudinal outcome Yi (t) follows a linear mixed effects model:

$$Y_i(t) = X_i^{(1)}(t)^T \beta + Z_i(t)^T U_i + \varepsilon_i(t) \qquad (1)$$

where $X_i^{(1)}(t)$ and $Z_i(t)$ are vectors of covariates associated with the fixed effects β (p × 1) and the random effects $U_i$ (q × 1), respectively. We assume that the measurement error $\varepsilon_i(t)$, which is distributed as $N(0, \sigma^2)$, is independent of $U_i$, and that $\varepsilon_i(t_1) \perp \varepsilon_i(t_2)$ for any $t_1 \neq t_2$.

2.2 Cause-specific hazards sub-model

During follow-up, each subject may experience one of g distinct competing causes of failure or may be right censored. Let Ci = (Ti, Di) be the competing risks survival data on subject i, where Ti is the failure or censoring time, and Di assumes a value from 0, 1, …, g, with Di = 0 indicating a noninformative censored event and Di = k indicating the kth failure type, k = 1, …, g. Dependent (or informative) censoring is treated as one of the g types of failures. The cause-specific hazards sub-model for the competing risks survival data is specified as follows:

$$\lambda_k(t; X_i^{(2)}(t), \upsilon_i, \gamma_k, \nu_k) = \lim_{h \to 0} \frac{P[t \le T_i < t + h,\ D_i = k \mid T_i \ge t,\ X_i^{(2)}(t), \upsilon_i, \gamma_k, \nu_k]}{h} = \lambda_{0k}(t) \exp\{X_i^{(2)}(t)^T \gamma_k + \nu_k \upsilon_i\}. \qquad (2)$$

The function $\lambda_k(t; X_i^{(2)}(t), \upsilon_i, \gamma_k, \nu_k)$ is the instantaneous failure rate from cause k at time t given the vector of covariates $X_i^{(2)}(t)$ and the latent unknown factor $\upsilon_i$, in the presence of all other failure types. The regression coefficient $\nu_k$ represents the effect of the latent variable $\upsilon_i$, with $\nu_1$ set to 1 to ensure identifiability. The parameter $\gamma_k$ represents the effects of the observed covariates $X_i^{(2)}(t)$ on cause k. We further assume that the kth baseline hazard is a step function, $\lambda_{0k}(t) = \lambda_{0k}^{(s)}$ for $t_k^{(s-1)} < t \le t_k^{(s)}$, where $0 < t_k^{(1)} < \cdots < t_k^{(S_k)} < \infty$ is a partition of (0, ∞) and $S_k$ indicates the number of steps for the kth baseline hazard.

2.3 Variance–covariance regression sub-model

We model the association between the longitudinal and survival sub-models by assuming that Ui and vi jointly have a multivariate normal distribution:

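$$W_i = \begin{pmatrix} U_i \\ v_i \end{pmatrix} \sim N_{q+1}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},\ \Sigma_i = \begin{pmatrix} \Sigma_{U_i} & \Sigma_{U v_i} \\ \Sigma_{U v_i}^T & \sigma_{v_i}^2 \end{pmatrix} \right) \qquad (3)$$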

Similar to Pourahmadi (1999), we model the covariance matrices $\Sigma_i$ through a modified Cholesky decomposition $M_i \Sigma_i M_i^T = H_i$, where $H_i$ is a diagonal matrix with positive entries and $M_i$ is the unit lower triangular matrix with $-\varphi_{i,jl}$ as its (j, l)th entry for l < j. This decomposition has a clear statistical interpretation: the below-diagonal entries of $M_i$ are the negatives of the generalized autoregressive parameters (GARP), $\varphi_{i,jl}$, in the autoregressive model

$$W_{ij} = \sum_{l=1}^{j-1} \varphi_{i,jl} W_{il} + e_{ij}, \qquad j = 1, \ldots, q + 1. \qquad (4)$$

The diagonal entries of $H_i$ are the innovation variances (IV) $h_{ij}^2 = \mathrm{var}(e_{ij})$, and we have $\mathrm{cov}(e_{ij}, e_{ik}) = 0$ if $j \neq k$ ($1 \le j, k \le q + 1$ and i = 1, …, m). The GARPs and the logarithms of the IVs are modeled with linear and log link functions:

$$\begin{cases} \varphi_{i,jl} = a_{i,jl}^T \eta_1 \\ \log h_{ij}^2 = b_{ij}^T \eta_2 \end{cases} \qquad \text{for } i = 1, \ldots, m;\ j = 1, \ldots, q + 1;\ l = 1, \ldots, j - 1 \qquad (5)$$

where $a_{i,jl}$ and $b_{ij}$ are covariates, and $\eta_1$ and $\eta_2$ are low-dimensional parameter vectors. For example, $a_{i,jl}$ and $b_{ij}$ may contain group indicators, implying that the random effects covariances are heterogeneous. The homogeneous random effects assumption in existing joint models becomes a testable assumption within our model framework. Furthermore, the resulting estimated covariance matrix is guaranteed to be positive definite. The latent association between the longitudinal measurements and survival outcomes can be assessed by testing the hypothesis $\Sigma_{U v_i} = 0$. Finally, we assume that, conditional on all the covariates and random effects, the longitudinal measurements and the competing risks survival data are independent.
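The mapping from regression parameters to a covariance matrix can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes design vectors of the form (1, G_i), and the function names and all numerical values of η1 and η2 are hypothetical, chosen only to show that any real-valued parameters yield a symmetric positive definite Σ_i.

```python
import numpy as np

def build_covariance(phi, log_iv):
    """Rebuild Sigma from GARPs and log innovation variances.

    phi    : dict mapping (j, l) with j > l (0-based) to the GARP phi_{jl}
    log_iv : sequence of log h_j^2, one per component of W
    """
    d = len(log_iv)
    M = np.eye(d)                      # unit lower triangular matrix
    for (j, l), value in phi.items():
        M[j, l] = -value               # below-diagonal entries are -phi_{jl}
    H = np.diag(np.exp(log_iv))        # innovation variances are positive by construction
    M_inv = np.linalg.inv(M)
    sigma = M_inv @ H @ M_inv.T        # solves M Sigma M^T = H for Sigma
    return sigma, M, H

def subject_covariance(group, eta1, eta2):
    """Sigma_i under the assumed design a_{i,jl} = b_{ij} = (1, G_i)."""
    design = np.array([1.0, float(group)])
    phi = {key: float(design @ np.asarray(coef)) for key, coef in eta1.items()}
    log_iv = [float(design @ np.asarray(coef)) for coef in eta2]
    return build_covariance(phi, log_iv)

# Hypothetical (intercept, group effect) pairs, for illustration only.
eta1 = {(1, 0): (-1.0, 0.5), (2, 0): (-0.5, 0.0), (2, 1): (0.3, -0.6)}
eta2 = [(-1.3, 0.0), (-0.1, 0.4), (-4.6, 0.0)]

sigma0, M0, H0 = subject_covariance(0, eta1, eta2)   # group G = 0
sigma1, M1, H1 = subject_covariance(1, eta1, eta2)   # group G = 1
print(np.linalg.eigvalsh(sigma0), np.linalg.eigvalsh(sigma1))
```

Setting every group effect to zero in this sketch gives the same matrix for both groups, which corresponds to the homogeneous special case.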

Remark: Choice of design vectors for GARP/IV parameters

As mentioned earlier, the choice of the covariate vectors $a_{i,jl}$ and $b_{ij}$ is flexible. For example, a 3-dimensional random effects variance–covariance matrix has six parameters. We can model the homogeneous unstructured covariance matrix by setting $a_{i,jl} = b_{ij} = 1$ for all j = 1, …, 3, l = 1, …, j − 1. If we assume the design vectors contain a subject-dependent covariate, say, a group indicator (G), the unstructured heterogeneous covariance matrix can be modeled with $a_{i,jl} = b_{ij} = (1, G_i)$ for all j = 1, …, 3, l = 1, …, j − 1; that is,

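$$\begin{cases} a_i = (a_{i,jl}) = (1, G_i, 1, G_i, 1, G_i)^T; & \eta_1 = (\eta_{11}^{Int}, \eta_{11}^{G}, \eta_{12}^{Int}, \eta_{12}^{G}, \eta_{13}^{Int}, \eta_{13}^{G})^T \\ b_i = (b_{ij}) = (1, G_i, 1, G_i, 1, G_i)^T; & \eta_2 = (\eta_{21}^{Int}, \eta_{21}^{G}, \eta_{22}^{Int}, \eta_{22}^{G}, \eta_{23}^{Int}, \eta_{23}^{G})^T. \end{cases} \qquad (6)$$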

When there are high-dimensional random effects with limited data, one can impose a restricted covariance structure and assume some of the GARP are the same to reduce the number of parameters.
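As a worked illustration of the decomposition, consider the smallest possible case of one random effect plus the frailty (q + 1 = 2), with design vectors (1, G_i). This two-dimensional case is chosen here only for transparency and is not one of the models fitted in this paper. Inverting $M_i \Sigma_i M_i^T = H_i$ gives

$$M_i = \begin{pmatrix} 1 & 0 \\ -\varphi_{i,21} & 1 \end{pmatrix}, \qquad H_i = \begin{pmatrix} h_{i1}^2 & 0 \\ 0 & h_{i2}^2 \end{pmatrix}, \qquad \Sigma_i = M_i^{-1} H_i M_i^{-T} = \begin{pmatrix} h_{i1}^2 & \varphi_{i,21} h_{i1}^2 \\ \varphi_{i,21} h_{i1}^2 & \varphi_{i,21}^2 h_{i1}^2 + h_{i2}^2 \end{pmatrix},$$

with

$$\varphi_{i,21} = \eta_{11}^{Int} + \eta_{11}^{G} G_i, \qquad \log h_{ij}^2 = \eta_{2j}^{Int} + \eta_{2j}^{G} G_i, \quad j = 1, 2.$$

Since $h_{i1}^2 > 0$ and $\det \Sigma_i = h_{i1}^2 h_{i2}^2 > 0$ for any real parameter values, $\Sigma_i$ is positive definite without constraints on $\eta_1$ and $\eta_2$. The two groups share one covariance matrix exactly when $\eta_{11}^{G} = \eta_{21}^{G} = \eta_{22}^{G} = 0$, which is the homogeneous special case.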

3 Estimation and inference

The standard maximum likelihood method involves integrating out latent variables from the log-likelihood function which is difficult when dealing with high-dimensional variables. We develop a Bayesian estimation procedure and a Markov chain Monte Carlo (MCMC) method for estimation and inference.

3.1 Likelihood

Suppose the longitudinal outcome $Y_i(t)$ is observed at time points $t_{ij}$ for j = 1, …, $n_i$, and denote $Y_i = (Y_{i1}, \ldots, Y_{in_i})$. Let $\Omega = \{\beta, \sigma^2, \gamma, \nu, \lambda_0, \eta_1, \eta_2\}$, where $\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_g)$, $\nu = (\nu_2, \ldots, \nu_g)$ and $\lambda_0 = (\lambda_{01}^{(1)}, \lambda_{01}^{(2)}, \ldots, \lambda_{0g}^{(S_g)})$. It is convenient to work directly with the joint distribution of the observed data (Y, C) and the unobservable random effects W, conditional on Ω, which facilitates the MCMC implementation. The conditional joint density of (Y, C) and W is:

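$$\begin{aligned} p(Y, C, W \mid \Omega) &= \prod_{i=1}^{m} p(Y_i \mid W_i, \Omega)\, p(C_i \mid W_i, \Omega)\, p(W_i \mid \Omega) \\ &\propto \prod_{i=1}^{m} (2\pi\sigma^2)^{-n_i/2} \exp\left\{ -\frac{(Y_i - X_i^{(1)}\beta - Z_i U_i)^T (Y_i - X_i^{(1)}\beta - Z_i U_i)}{2\sigma^2} \right\} \\ &\quad \times \prod_{k=1}^{g} \left( \lambda_k(T_i)^{I(D_i = k)} \exp\{-H_k(T_i)\} \right) \\ &\quad \times \exp\left\{ -\frac{1}{2} \sum_{j=1}^{q+1} \left[ b_{ij}^T \eta_2 + \left( W_{ij} - \sum_{l=1}^{j-1} a_{i,jl}^T \eta_1 W_{il} \right)^2 \exp(-b_{ij}^T \eta_2) \right] \right\} \end{aligned} \qquad (7)$$

where

$$\lambda_k(T_i) = \lambda_{0k}(T_i) \exp\{X_i^{(2)}(T_i)^T \gamma_k + \nu_k v_i\} \qquad (8)$$

and

$$H_k(T_i) = \int_0^{T_i} \lambda_{0k}(t) \exp\left(X_i^{(2)}(t)^T \gamma_k + \nu_k v_i\right) dt. \qquad (9)$$

Under the piecewise constant hazard assumption,

$$H_k(T_i) = \exp(\nu_k v_i) \sum_{s=1}^{S_k} I\left(T_i > t_k^{(s-1)}\right) \lambda_{0k}^{(s)} \int_{t_k^{(s-1)}}^{\min(T_i,\, t_k^{(s)})} \exp\left(X_i^{(2)}(t)^T \gamma_k\right) dt. \qquad (10)$$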
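When the survival covariates do not vary with time, the integral in (10) reduces to the time spent in each step of the baseline hazard. The sketch below illustrates that special case only. The function name and argument layout are hypothetical, and it assumes the first step starts at 0 and the last step extends indefinitely.

```python
import numpy as np

def cumulative_hazard(T, cuts, lam0, x, gamma, nu, v):
    """H_k(T) for a step-function baseline hazard and time-fixed covariates.

    cuts : increasing interior cut points; the first step starts at 0 and
           the last step is assumed to extend to infinity
    lam0 : baseline hazard heights, one per step (len(cuts) + 1 values)
    x, gamma : covariate vector and its coefficients for cause k
    nu, v    : frailty coefficient nu_k and latent variable v_i
    """
    edges = np.concatenate(([0.0], np.asarray(cuts, dtype=float), [np.inf]))
    lower, upper = edges[:-1], edges[1:]
    # Time at risk within each step before T: min(T, upper) - lower, floored at 0.
    exposure = np.clip(np.minimum(T, upper) - lower, 0.0, None)
    baseline = float(np.sum(np.asarray(lam0, dtype=float) * exposure))
    return baseline * float(np.exp(np.dot(x, gamma) + nu * v))

# Three steps with heights 0.1, 0.2, 0.4 and cut points at t = 1 and t = 2.
H = cumulative_hazard(2.5, [1.0, 2.0], [0.1, 0.2, 0.4],
                      x=[0.0], gamma=[0.0], nu=1.0, v=0.0)
print(H)  # 0.1*1 + 0.2*1 + 0.4*0.5, up to floating-point rounding
```

With time-dependent covariates the inner integral would have to be evaluated numerically or piecewise, which this sketch does not attempt.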

3.2 Priors and MCMC sampling procedure

We assume independent priors for Ω. We use normal priors for the parameters β, γ, ν, $\eta_1$ and $\eta_2$, leading to conjugate posteriors for β and some components of $\eta_1$. We use an inverse gamma prior for the measurement error variance $\sigma^2$ and a gamma prior for each step of the kth baseline hazard function $\lambda_{0k}$, for which conjugate posterior distributions are easy to obtain.

Markov chain Monte Carlo (MCMC) methods are used for posterior sampling. The procedure combines sampling directly from full conditional distributions, Metropolis–Hastings (MH) sampling (Hastings 1970; Chib and Greenberg 1995) and adaptive rejection sampling (ARS) (Gilks and Wild 1992). Since the full conditional distributions of the parameters β, $\sigma^2$, and $\lambda_{0k}^{(s)}$ (s = 1, …, $S_k$, k = 1, …, g) are standard distributions, drawing random variates from them is straightforward. For the rest of the parameters and the random effects ($U_i$, $\upsilon_i$), we either use a Metropolis–Hastings step with the normal approximation to the full conditional distribution as the candidate distribution or apply the ARS technique. The technical details on the sampling distributions are given in the Appendix.
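A Metropolis–Hastings step with a fixed normal candidate can be sketched generically as follows. This is an illustration of the general technique on a toy one-dimensional target, not the sampler described in the Appendix: in practice the candidate mean and standard deviation would come from a normal approximation to the relevant full conditional, whereas here they are simply supplied by hand, and the function name is hypothetical.

```python
import numpy as np

def independence_mh(log_target, mean, sd, n_iter, rng, init=None):
    """Metropolis-Hastings with a fixed N(mean, sd^2) candidate distribution."""
    def log_candidate(x):
        # Normal log-density up to a constant; the constant cancels in the ratio.
        return -0.5 * ((x - mean) / sd) ** 2

    current = mean if init is None else init
    draws = np.empty(n_iter)
    accepted = 0
    for it in range(n_iter):
        proposal = rng.normal(mean, sd)
        # log of [target(prop) / target(cur)] * [q(cur) / q(prop)]
        log_ratio = (log_target(proposal) - log_target(current)
                     + log_candidate(current) - log_candidate(proposal))
        if np.log(rng.uniform()) < log_ratio:
            current = proposal
            accepted += 1
        draws[it] = current
    return draws, accepted / n_iter

# Toy target: an unnormalized standard normal log-density.
rng = np.random.default_rng(0)
draws, rate = independence_mh(lambda x: -0.5 * x ** 2,
                              mean=0.0, sd=1.5, n_iter=5000, rng=rng)
print(draws[1000:].mean(), draws[1000:].std(), rate)
```

A candidate that is somewhat wider than the target, as in this toy call, keeps the importance ratio bounded, which is one reason normal approximations are often inflated slightly when used this way.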

The initial values of the parameters for sampling are obtained by modeling the longitudinal data and survival data separately by a linear mixed model and a Cox proportional hazards model. The initial value for $\lambda_{0k}^{(s)}$ (s = 1, …, $S_k$, k = 1, …, g) can be obtained by drawing a random variate from the gamma full conditional distribution described in the Appendix. We estimate the parameters by their posterior medians. Approximate 95% probability intervals are based on the 2.5th and 97.5th percentiles. Standard errors are obtained from the standard deviations of the posterior samples. The convergence of the Gibbs sampler is monitored by examining time series plots of the parameters over iterations and by the multiple-chain approach of Gelman and Rubin (1992).
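The posterior summaries described above are simple functions of the retained draws. The sketch below shows one way to compute them, together with a basic form of the Gelman and Rubin potential scale reduction factor (without the degrees-of-freedom correction). The function names are hypothetical and the draws are assumed to have had the burn-in removed already.

```python
import numpy as np

def summarize(draws):
    """Posterior median, 95% probability interval and standard error."""
    draws = np.asarray(draws, dtype=float)
    lo, med, hi = np.percentile(draws, [2.5, 50.0, 97.5])
    return {"median": med, "ci": (lo, hi), "se": draws.std(ddof=1)}

def gelman_rubin(chains):
    """Basic potential scale reduction factor.

    chains : array of shape (n_chains, n_draws), one row per chain
    """
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    B = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return float(np.sqrt(var_hat / W))

rng = np.random.default_rng(1)
mixed = rng.normal(size=(4, 2000))               # four chains, same target
stuck = mixed + 2.0 * np.arange(4)[:, None]      # chains centred at different values
print(summarize(mixed.ravel()))
print(gelman_rubin(mixed), gelman_rubin(stuck))
```

Values of the factor close to 1 are consistent with convergence, while values well above 1, as for the deliberately separated chains, indicate that the chains have not mixed.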

4 Application

We analyze the data from a scleroderma lung study (SLS) (Tashkin et al. 2006) with our proposed joint model. The study enrolled 158 patients with scleroderma-related interstitial lung disease, randomized to receive either CYC (79 patients) or identical appearing placebo (79 patients) for 12 months. An additional year of follow-up was performed to determine if CYC effects persisted after treatment. The primary outcome is forced vital capacity (FVC, % predicted), measured at 3-month intervals from the baseline. We are interested in evaluating if oral cyclophosphamide (CYC) can either improve the %FVC level of a patient or decrease the risk of treatment failure or death.

Since the full dose of CYC is not reached until month 6, our analysis is based on 6–24 months %FVC scores which includes 140 subjects. We observe 14 treatment failures or deaths, 32 informative and 5 noninformative dropouts. A dropout is non-informative if there is no evidence showing that the dropout is related to the disease or the treatment, and informative otherwise. Since the informative dropout is related to the patient’s disease condition, it not only causes non-ignorable missing data in %FVC, but also is an informatively censored event for treatment failure or death.

We consider two baseline factors in our joint model when assessing the CYC treatment effects: baseline %FVC (FVC0), and lung fibrosis (FIB0). It is suggested by clinicians that the beneficial effects of CYC on pulmonary function continue to increase after stopping treatment at 12 months and eventually begin to wane after 18 months. Therefore, we fit the following linear spline mixed effects model with change point at month 18 for longitudinal measurements %FVC:

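$$\begin{aligned} \%\mathrm{FVC}_{ij} ={}& \beta_0 + \beta_1 \mathrm{FVC0}_i + \beta_2 \mathrm{FIB0}_i + \beta_3 \mathrm{CYC}_i + \beta_4 \mathrm{Time}_{ij} + \beta_5 (\mathrm{Time}_{ij} - 18)_+ \\ &+ \beta_6 \mathrm{FVC0}_i \times \mathrm{CYC}_i + \beta_7 \mathrm{FIB0}_i \times \mathrm{CYC}_i + \beta_8 \mathrm{Time}_{ij} \times \mathrm{CYC}_i + \beta_9 (\mathrm{Time}_{ij} - 18)_+ \times \mathrm{CYC}_i \\ &+ Z_{ij} U_i + \varepsilon_{ij} \end{aligned} \qquad (11)$$

where $U_i$ is the subject-specific random effect and the $\varepsilon_{ij}$ are mutually independent measurement errors.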
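The fixed-effects part of model (11) is an ordinary design matrix once the truncated term $(\mathrm{Time}_{ij} - 18)_+$ is formed. The following sketch builds one row of that matrix and the corresponding random-effects row for the two-random-slope choice. The function names are hypothetical and the covariate values in the example are made up.

```python
import numpy as np

def spline_design_row(fvc0, fib0, cyc, time, knot=18.0):
    """Fixed-effects covariates of model (11), ordered as beta_0, ..., beta_9."""
    time18 = max(time - knot, 0.0)          # truncated linear term (Time - 18)_+
    return np.array([1.0, fvc0, fib0, cyc, time, time18,
                     fvc0 * cyc, fib0 * cyc, time * cyc, time18 * cyc])

def random_effects_row(time, knot=18.0):
    """Z_ij for the two-random-slope specification [Time, Time18]."""
    return np.array([time, max(time - knot, 0.0)])

# A made-up CYC patient observed at months 12 and 21.
print(spline_design_row(fvc0=68.0, fib0=2.0, cyc=1, time=12.0))
print(spline_design_row(fvc0=68.0, fib0=2.0, cyc=1, time=21.0))
print(random_effects_row(21.0))
```

Before month 18 the truncated term is zero, so the slope is governed by β4 (plus β8 in the CYC group); after month 18 the coefficients β5 and β9 are added to it.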

We consider multiple choices for the random effects covariates $Z_i$ and select the model based on the Deviance Information Criterion (DIC) (Spiegelhalter et al. 2002). The DIC has the advantage of being easy to compute using output from a Gibbs sampler and has a form similar to the Akaike Information Criterion (AIC): a goodness-of-fit term measured by the deviance evaluated at the posterior mean of the parameters, and a penalty term defined as twice the effective number of parameters. The effective number of parameters is computed as the mean deviance minus the deviance evaluated at the posterior mean. That is,

$$\mathrm{DIC} = \mathrm{dev}(\bar{\Omega}) + 2 p_D \qquad (12)$$

where $\bar{\Omega}$ is the posterior mean of the parameter Ω, $p_D = \overline{\mathrm{dev}} - \mathrm{dev}(\bar{\Omega})$, and $\overline{\mathrm{dev}}$ is the posterior mean of the deviance (the average of the deviances calculated using the estimated parameters at each step of the MCMC sampler). From the form of the DIC, the smaller the DIC value, the better the proposed model. We note that there are several versions of DIC for missing data models (Celeux et al. 2006; Chen 2006). Here we use the DIC constructed from the conditional distribution, treating both Ω and W as parameters, because it is easy to compute. We conduct a small simulation to evaluate the DIC, which selects the true model in 147 out of 200 datasets, and the effective dimension is always positive.
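Given the deviance evaluated at each retained MCMC draw and at the posterior mean, the DIC follows directly from its definition. The sketch below uses a hypothetical function name and made-up deviance values, purely to show the arithmetic.

```python
import numpy as np

def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, p_D) from per-draw deviances and dev(posterior mean)."""
    mean_dev = float(np.mean(deviance_draws))       # posterior mean deviance
    p_d = mean_dev - deviance_at_posterior_mean     # effective number of parameters
    return deviance_at_posterior_mean + 2.0 * p_d, p_d

# Made-up deviances: mean deviance 12, deviance at the posterior mean 11.
value, p_d = dic([10.0, 12.0, 14.0], 11.0)
print(value, p_d)  # 13.0 1.0
```

Equivalently, the DIC equals the posterior mean deviance plus $p_D$, so a model is penalized once through its average fit and once more through its effective dimension.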

A cause-specific competing risks sub-model is applied to model disease-related dropout (risk 1) and treatment failure or death (risk 2):

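$$\lambda_1(t) = \lambda_{01}(t) \exp\left(\gamma_{11} \mathrm{FVC0}_i + \gamma_{12} \mathrm{FIB0}_i + \gamma_{13} \mathrm{CYC}_i + \gamma_{14} \mathrm{FVC0}_i \times \mathrm{CYC}_i + \gamma_{15} \mathrm{FIB0}_i \times \mathrm{CYC}_i + v_i\right) \qquad (13)$$

and

$$\lambda_2(t) = \lambda_{02}(t) \exp\left(\gamma_{21} \mathrm{FVC0}_i + \gamma_{22} \mathrm{FIB0}_i + \gamma_{23} \mathrm{CYC}_i + \gamma_{24} \mathrm{FVC0}_i \times \mathrm{CYC}_i + \gamma_{25} \mathrm{FIB0}_i \times \mathrm{CYC}_i + \nu_2 v_i\right). \qquad (14)$$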

The latent variables from both sub-models are assumed to have a multivariate normal distribution with mean zero and variance–covariance matrices

$$\Sigma_i = \begin{pmatrix} \Sigma_{U_i} & \Sigma_{U v_i} \\ \Sigma_{U v_i}^T & \sigma_{v_i}^2 \end{pmatrix}.$$

We test the homogeneous random effects covariance matrix assumption by considering subject-dependent covariates for ai jl and bi j. Specifically, we choose ai jl = bi j = (1, CYCi), which allows heterogeneous covariance matrices for different treatment groups, and test the null hypothesis by examining if the 95% credible interval of CYC effects contains zero for all the GARP and IV parameters.

A 3-step baseline hazard function, with the time points defining the steps being equally spaced percentiles of the observed event times, is used for the informatively censored events and for the event of treatment failure or death. Sensitivity analyses with 4- and 5-step baseline hazard functions are conducted and show no significant difference. We apply independent noninformative prior distributions to all the parameters, all of which are assumed to have relatively large variances. The corresponding priors are $\beta_0 \sim N(70, 10^3)$ and $\beta_l \sim N(0, 10^3)$ for l = 1, …, 9; $\sigma^2 \sim IG(10^{-3}, 10^{-3})$; $\gamma_{kr} \sim N(0, 10^3)$ for k = 1, 2 and r = 1, …, 5; $\lambda_{0k}^{(s)} \sim \Gamma(10^{-3}, 10^{-3})$ for s = 1, …, $S_k$ and $S_1 = S_2 = 3$; $\nu_2 \sim N(0, 10^5)$; and each element of $\eta_1$ and $\eta_2 \sim N(0, 10^5)$.

Table 1 summarizes the covariance matrix parameters of the different models, each based on 30,000 iterations of MCMC sampling following a 15,000-iteration “burn-in” period. Since we include baseline %FVC as a fixed effect covariate, we do not consider a random intercept, to avoid possible confounding effects. We consider a one-random-slope (before 18 months) model, a structured two-random-slope model assuming that the entries of the last row of the matrix $M_i$ from the decomposition are the same, and an unstructured two-random-slope model. The structured random effects covariance matrix model might be useful when dealing with a high-dimensional random effects model with limited data. For the last innovation variance parameter, we do not include the CYC effect because of convergence issues. None of the 95% credible intervals for the CYC effects excludes zero. Therefore, we do not have sufficient evidence to reject the homogeneous random effects covariance assumption. All the effective numbers of parameters ($p_D$) are positive, which gives no indication of a possibly poor fit between the models and the data (Spiegelhalter et al. 2002). The conditional DIC we use tends to produce increasing $p_D$ for increasing model complexity, as suggested by Celeux et al. (2006). The two-random-slope model with unstructured covariance matrix has the smallest DIC, indicating that it might provide the best fit for the SLS data. Combining these results, we choose the homogeneous two-random-slope model with unstructured covariance matrix as our final model; its covariance parameters and DIC values are listed in the last column of Table 1.

Table 1.

Random effects covariance matrix parameters for four different models. Entries are Est. (95% CI).

| Parameter | [Time_ij] | [Time_ij, Time18_ij]^a | [Time_ij, Time18_ij]^b | [Time_ij, Time18_ij]^c |
|---|---|---|---|---|
| Generalized autoregressive parameters | | | | |
| η11 Intercept | −0.27 (−1.02, 0.18) | −1.24 (−2.08, −0.38) | −1.01 (−1.81, −0.20) | −1.17 (−1.74, −0.58) |
| η11 CYC | −0.71 (−1.65, 0.08) | −0.67 (−1.82, 0.93) | −0.26 (−1.45, 0.92) | |
| η12 Intercept | | 0.23 (0.03, 0.71) | −0.27 (−1.23, 0.29) | −0.51 (−1.23, −0.01) |
| η12 CYC | | 0.44 (−1.47, 2.00) | −1.48 (−2.64, 0.15) | |
| η13 Intercept | | | 0.36 (0.10, 0.96) | 0.34 (0.08, 0.85) |
| η13 CYC | | | −0.65 (−1.49, 0.13) | |
| Innovation variances | | | | |
| η21 Intercept | −1.49 (−1.87, −1.09) | −1.29 (−1.67, −0.90) | −1.32 (−1.69, −0.91) | −1.31 (−1.59, −1.03) |
| η21 CYC | −0.06 (−0.63, 0.49) | 0.01 (−0.58, 0.59) | −0.01 (−0.59, 0.57) | |
| η22 Intercept | −4.72 (−7.21, −2.19) | 0.23 (−0.53, 0.84) | 0.24 (−0.48, 0.84) | −0.10 (−0.68, 0.38) |
| η22 CYC | | −0.66 (−3.73, 0.45) | −0.68 (−1.95, 0.37) | |
| η23 Intercept | | −5.11 (−7.55, −2.08) | −4.58 (−7.38, −1.79) | −4.62 (−7.26, −1.60) |
| Model fit | | | | |
| DIC | 5661.09 | 5565.38 | 5549.32 | 5540.80 |
| pD | 151.77 | 212.84 | 218.26 | 212.98 |

a Structured heterogeneous two-random-slope model; b Unstructured heterogeneous two-random-slope model; c Unstructured homogeneous two-random-slope model

The results of the selected two-random-slope model are summarized in Table 2. For comparison purposes, we perform separate analysis of the two endpoints, which is done by fitting a linear mixed model with two random slopes (11) for %FVC and a cause-specific hazards frailty model for the competing risks failure time data (13), (14), respectively. The two methods produce similar point estimates and credible intervals for baseline %FVC, lung fibrosis and their interactions with CYC, but give different results on the interactions of CYC and time trends. With the joint model, the significance of the interactions between CYC and time trends indicates that the developing trends of %FVC in the two groups are different. The %FVC declines for the placebo group (β4 = −0.12) but increases for the CYC group (β4 + β8 = 0.14) in the first 18 months. After 18 months the %FVC declines for the CYC group (β4 + β5 + β8 + β9 = −0.45) since the CYC effects decrease gradually after the treatment stops, while a positive slope is found for the placebo group (β4 + β5 = 0.15). However, none of the time trends is significantly different from zero. The difference might be explained by the significant covariances ΣU1υ and ΣU2υ between the random slopes in the longitudinal model and the latent variable of the survival model, which indicates dependence between the longitudinal measurement %FVC and the survival process. We also observe significantly positive coefficient ν2 which shows that there is a latent association between the two competing risks. The negative sign of ΣU1υ and positive sign of ΣU2υ together with positive ν2 indicate that in the first 18 months, there tends to be a lower risk for both treatment failure or death and informatively censored events due to dropout for patients with higher than average increasing rate of %FVC over time; after 18 months, the trend is reversed due to the negative association between the two slopes. 
Such an informative dropout process results in biased estimates of the time trends and attenuated slope changes between the CYC group and the placebo group in the separate analysis. These results are confirmed by the simulation study in a later section. The overall effects of CYC treatment on %FVC scores are evaluated by testing the null hypothesis H0: β3 = β6 = β7 = β8 = β9 = 0, which yields a p-value of 0.01 for the joint model and 0.03 for the separate model.

Table 2.

Analysis of SLS data using the unstructured homogeneous two-random-slope model

| Parameter | Joint analysis: Estimate (95% CI) | Separate analysis: Estimate (95% CI) |
|---|---|---|
| Longitudinal outcome %FVC | | |
| Int (β0) | 65.33 (64.72, 67.87) | 65.94 (64.41, 67.47) |
| FVC0 (β1) | 0.89 (0.80, 0.99) | 0.89 (0.79, 0.99) |
| FIB0 (β2) | −1.85 (−2.94, −0.79) | −1.86 (−2.94, −0.78) |
| CYC (β3) | −0.98 (−3.18, 1.26) | −0.76 (−2.94, 1.42) |
| Time (β4) | −0.12 (−0.29, 0.06) | −0.05 (−0.23, 0.13) |
| Time18 (β5) | 0.27 (−0.17, 0.72) | 0.11 (−0.34, 0.56) |
| FVC0 × CYC (β6) | 0.14 (0.00, 0.28) | 0.14 (0.00, 0.28) |
| FIB0 × CYC (β7) | 1.74 (0.13, 3.27) | 1.78 (0.23, 3.33) |
| Time × CYC (β8) | 0.26 (0.01, 0.50) | 0.21 (−0.04, 0.46) |
| Time18 × CYC (β9) | −0.72 (−1.33, −0.08) | −0.64 (−1.29, 0.00) |
| σ² | 21.55 (19.23, 24.25) | 21.28 (18.80, 23.76) |
| ΣU11 | 0.27 (0.20, 0.36) | 0.25 (0.18, 0.32) |
| ΣU12 | −0.31 (−0.53, −0.14) | −0.31 (−0.49, −0.12) |
| ΣU22 | 1.29 (0.70, 2.14) | 1.34 (0.67, 2.00) |
| p-value for H0: β3 = β6 = β7 = β8 = β9 = 0 | 0.01 | 0.03 |
| Cause-specific hazards (time to informative dropout) | | |
| FVC0 (γ11) | −0.06 (−0.12, −0.01) | −0.06 (−0.12, −0.00) |
| FIB0 (γ12) | 0.22 (−0.27, 0.78) | 0.20 (−0.35, 0.75) |
| CYC (γ13) | 0.23 (−0.60, 1.12) | 0.40 (−0.46, 1.26) |
| FVC0 × CYC (γ14) | 0.10 (0.03, 0.18) | 0.09 (0.03, 0.15) |
| FIB0 × CYC (γ15) | 0.13 (−0.60, 0.83) | 0.07 (−0.64, 0.76) |
| p-value for H01: γ13 = γ14 = γ15 = 0 | 0.08 | 0.07 |
| Cause-specific hazards (time to treatment failure or death) | | |
| FVC0 (γ21) | 0.02 (−0.07, 0.09) | 0.03 (−0.07, 0.13) |
| FIB0 (γ22) | 0.29 (−0.62, 1.19) | 0.28 (−0.80, 1.36) |
| CYC (γ23) | −1.33 (−3.44, 0.21) | −1.14 (−3.26, 0.98) |
| FVC0 × CYC (γ24) | −0.07 (−0.21, 0.06) | −0.08 (−0.24, 0.08) |
| FIB0 × CYC (γ25) | −0.58 (−2.31, 0.91) | −0.88 (−2.78, 1.02) |
| p-value for H02: γ23 = γ24 = γ25 = 0 | 0.39 | 0.48 |
| Random effects for survival endpoint | | |
| ν2 | 3.04 (1.27, 7.65) | −0.31 (−79.80, 81.16) |
| σv² | 0.38 (0.07, 1.42) | 0.04 (0.00, 0.40) |
| Covariance of Ui and υi | | |
| ΣU1υ | −0.25 (−0.51, −0.09) | |
| ΣU2υ | 0.60 (0.21, 1.33) | |

p-Value < 0.05

When modeling the competing risks survival data, the two methods produce similar point estimates and CIs for most parameters and identify the same set of significant effects. The joint model is able to identify the relationship (ν2) between the two competing risks much better than the separate model since the separate model does not rely on the additional information from the longitudinal endpoints. We note that, in our second simulation study in the next section, the estimate for ν2 is not reliable under the current sample size and event rates even for the joint model. Hence we would not overinterpret the quantity in this application. However, the simulation also suggests that the bias of ν2 does not seem to affect the estimation of other parameters in the joint model. No significant overall effects of CYC are identified for the time to treatment failure or death by testing the null hypothesis H02: γ23 = γ24 = γ25 = 0 because of the relatively short follow-up period.

5 Simulation studies

We carry out two simulation studies to assess the performance of our method. In the first simulation, the data are generated with heterogeneous covariance matrices and we want to show how the parameter estimates and standard errors would be affected if we ignore the heterogeneity. The longitudinal measurements are simulated from the following model:

$$Y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 X_{2i} + U_i t_{ij} + \varepsilon_{ij} \qquad (15)$$

where ti j = 0, 0.15, 0.3, …, 3, is the scheduled visit time and X2i ~ Bernoulli(0.5) is a group indicator. The measurement error εi j ~ N (0, 5). We simulate two competing risks failure times with the following cause-specific hazards:

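$$\begin{aligned} \lambda_1(t; X_{1i}, X_{2i}, \upsilon_i, \gamma_1) &= \lambda_{01}(t) \exp\{\gamma_{11} X_{1i} + \gamma_{12} X_{2i} + \upsilon_i\} \\ \lambda_2(t; X_{1i}, X_{2i}, \upsilon_i, \gamma_2, \nu_2) &= \lambda_{02}(t) \exp\{\gamma_{21} X_{1i} + \gamma_{22} X_{2i} + \nu_2 \upsilon_i\} \end{aligned} \qquad (16)$$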

where $X_1 \sim N(2, 1.0)$, and $X_2$ is shared with the longitudinal model. We use constant baseline hazards of 0.12 and 0.25 for risk 1 and risk 2, respectively, to generate the event time data. The random effects are generated from the multivariate normal distribution with covariance matrices $\Sigma_i$, which are decomposed into the GARPs and IVs modeled with covariates $a_{i,jl} = b_{ij} = (1, X_{2i})$. In other words, the covariance matrices are different in the two groups: strong positive correlation in one group and strong negative correlation in the other. The parameters are given in Table 3. With this setup, the rate of risk 1 is approximately 0.40, the rate of risk 2 is 0.38 and the censoring rate is 0.22. Longitudinal responses are missing after the observed or censored event times. The average number of total longitudinal observations is 11.6 per subject. We use $N(0, 10^5)$ priors for each component of β, γ, ν, $\eta_1$ and $\eta_2$, $IG(10^{-3}, 10^{-3})$ for $\sigma^2$, and $\Gamma(10^{-3}, 10^{-3})$ for $\lambda_0$. The simulation is based on 200 Monte Carlo samples with sample sizes of 200 and 500. The MCMC sampling in all simulation studies is run for 5,000 iterations, and the estimation results are based on the last 2,500 iterations.
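With constant baseline hazards, competing event times with the cause-specific hazards in (16) can be generated as the minimum of two independent exponential latent times, conditional on the covariates and the frailty. The sketch below illustrates this step only. The censoring mechanism is not described in the text, so administrative censoring at the last scheduled visit (t = 3) is an assumption made here, and the frailty is drawn on its own with group-specific variances rather than jointly with $U_i$. The sketch is therefore not calibrated to reproduce the reported event and censoring rates.

```python
import numpy as np

def simulate_competing_risks(x1, x2, v, gamma1, gamma2, nu2,
                             lam01=0.12, lam02=0.25, censor_time=3.0, rng=None):
    """Draw (time, cause) with cause 0 = censored, 1 = risk 1, 2 = risk 2."""
    rng = np.random.default_rng() if rng is None else rng
    rate1 = lam01 * np.exp(gamma1[0] * x1 + gamma1[1] * x2 + v)
    rate2 = lam02 * np.exp(gamma2[0] * x1 + gamma2[1] * x2 + nu2 * v)
    t1 = rng.exponential(1.0 / rate1)      # latent time to risk 1
    t2 = rng.exponential(1.0 / rate2)      # latent time to risk 2
    latent = np.minimum(t1, t2)
    time = np.minimum(latent, censor_time)  # assumed administrative censoring
    cause = np.where(latent > censor_time, 0, np.where(t1 <= t2, 1, 2))
    return time, cause

rng = np.random.default_rng(2010)
n = 200
x1 = rng.normal(2.0, 1.0, size=n)
x2 = rng.binomial(1, 0.5, size=n)
# Assumption: frailty variance 1 when X2 = 1 and 0.5 when X2 = 0.
v = rng.normal(0.0, np.sqrt(np.where(x2 == 1, 1.0, 0.5)))
time, cause = simulate_competing_risks(x1, x2, v, gamma1=(0.8, -1.0),
                                       gamma2=(0.5, -1.0), nu2=1.5, rng=rng)
print(np.bincount(cause, minlength=3) / n)  # proportions censored, risk 1, risk 2
```

Longitudinal responses would then be kept only at scheduled visit times not exceeding the generated time for each subject.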

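Table 3.

Comparison of simulated bias, standard error (SE) and coverage probability (CP) between two homogeneous (incorrect) models and a heterogeneous (correct) model (sample size = 200)

| Parameter | True | Homogeneous^a Bias | Homogeneous^a SE | Homogeneous^a CP | Homogeneous Bias | Homogeneous SE | Homogeneous CP | Heterogeneous Bias | Heterogeneous SE | Heterogeneous CP |
|---|---|---|---|---|---|---|---|---|---|---|
| Longitudinal | | | | | | | | | | |
| β0 | 10 | 0.023 | 0.055 | 0.935 | 0.017 | 0.054 | 0.925 | 0.007 | 0.057 | 0.914 |
| β1 | 1.5 | −0.029 | 0.069 | 0.950 | −0.019 | 0.065 | 0.960 | −0.010 | 0.073 | 0.921 |
| β2 | −1 | 0.029 | 0.173 | 0.885 | 0.036 | 0.215 | 0.860 | −0.029 | 0.117 | 0.911 |
| σ² | 1 | −0.002 | 0.30 | 0.965 | 0.001 | 0.029 | 0.965 | 0.002 | 0.033 | 0.938 |
| σu1² | 2.5 | | | | | | | −0.001 | 0.525 | 0.942 |
| σu0² | 0.5 | | | | | | | 0.041 | 0.182 | 0.935 |
| Survival | | | | | | | | | | |
| γ11 | 0.8 | −0.26 | 0.137 | 0.885 | 0.011 | 0.158 | 0.920 | −0.006 | 0.139 | 0.925 |
| γ12 | −1 | 0.133 | 0.289 | 0.840 | 0.047 | 0.416 | 0.880 | −0.024 | 0.302 | 0.932 |
| γ21 | 0.5 | −0.066 | 0.163 | 0.910 | −0.066 | 0.195 | 0.875 | −0.026 | 0.150 | 0.928 |
| γ22 | −1 | 0.378 | 0.347 | 0.755 | 0.350 | 0.426 | 0.810 | −0.007 | 0.359 | 0.932 |
| ν2 | 1.5 | 0.281 | 1.049 | 0.715 | −0.352 | 1.891 | 0.715 | −0.116 | 0.863 | 0.912 |
| σv1² | 1 | | | | | | | 0.026 | 0.706 | 0.825 |
| σv0² | 0.5 | | | | | | | −0.048 | 0.301 | 0.801 |
| Joint covariances | | | | | | | | | | |
| σ1 | 1.5 | | | | | | | 0.216 | 0.785 | 0.805 |
| σ0 | −0.4 | | | | | | | 0.080 | 0.247 | 0.787 |

a Homogeneous model from Hu et al. (2009)

The bold numbers represent relatively large biases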

We analyze the simulated data with a joint model that models the covariance matrices with subject-specific covariates (heterogeneous) and a joint model with subject-independent covariates (homogeneous). We also compare the results with the homogeneous model proposed by Hu et al. (2009). Tables 3 and 4 report the biases, estimated standard errors (the median of estimated standard error), and coverage rates of the 95% credible intervals. The parameters η1 and η2 are transformed back to variance–covariance parameters in the table. It is seen that the heterogeneous joint model gives almost unbiased estimates for all the parameters. Our method for the homogeneous model performed similarly to Hu et al. (2009). Both homogeneous joint analyses lead to large bias in some of the parameter estimates including γ12, γ22 and ν2, which indicates that we may obtain biased parameter estimates for the survival endpoint when combining the information of the longitudinal outcome if the correlation of the two endpoints is incorrectly modeled. Therefore, ignoring the heterogeneity can result in biased estimates and invalid inference.
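The three summaries reported in Tables 3 and 4 can be computed from the per-replicate output as sketched below. The text specifies that the reported standard error is the median of the estimated standard errors. Taking the bias as the mean of the point estimates minus the true value is an assumption of this sketch, and the function name and input values are hypothetical.

```python
import numpy as np

def simulation_summary(estimates, std_errors, lower, upper, truth):
    """Bias, median estimated SE and coverage of 95% intervals over replicates."""
    estimates = np.asarray(estimates, dtype=float)
    bias = float(estimates.mean() - truth)          # assumed: mean estimate minus truth
    se = float(np.median(std_errors))               # median of the estimated SEs
    covered = (np.asarray(lower) <= truth) & (truth <= np.asarray(upper))
    return {"bias": bias, "se": se, "cp": float(covered.mean())}

# Three made-up replicates for a parameter whose true value is 1.0.
result = simulation_summary(estimates=[0.9, 1.1, 1.2],
                            std_errors=[0.1, 0.2, 0.3],
                            lower=[0.7, 0.9, 1.05],
                            upper=[1.1, 1.3, 1.35],
                            truth=1.0)
print(result)
```

In the made-up example the third interval misses the true value, so two of the three intervals cover it.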

Table 4.

Comparison of simulated bias, standard error (SE) and coverage probability (CP) between a homogeneous (incorrect) model and a heterogeneous (correct) model (sample size = 500)

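| Parameter (m = 500) | True | Homogeneous^a Bias | Homogeneous^a SE | Homogeneous^a CP | Homogeneous Bias | Homogeneous SE | Homogeneous CP | Heterogeneous Bias | Heterogeneous SE | Heterogeneous CP |
|---|---|---|---|---|---|---|---|---|---|---|
| Longitudinal | | | | | | | | | | |
| β0 | 10 | 0.024 | 0.035 | 0.895 | 0.024 | 0.035 | 0.893 | 0.001 | 0.037 | 0.916 |
| β1 | 1.5 | −0.032 | 0.044 | 0.875 | −0.031 | 0.043 | 0.902 | −0.002 | 0.045 | 0.948 |
| β2 | −1 | 0.055 | 0.113 | 0.835 | 0.053 | 0.127 | 0.907 | 0.008 | 0.087 | 0.924 |
| σ² | 1 | −0.001 | 0.019 | 0.960 | −0.001 | 0.019 | 0.937 | 0.001 | 0.019 | 0.948 |
| σu1² | 2.5 | | | | | | | 0.007 | 0.346 | 0.920 |
| σu0² | 0.5 | | | | | | | 0.017 | 0.105 | 0.913 |
| Survival | | | | | | | | | | |
| γ11 | 0.8 | −0.004 | 0.085 | 0.890 | −0.017 | 0.091 | 0.917 | −0.005 | 0.087 | 0.920 |
| γ12 | −1 | 0.153 | 0.177 | 0.815 | 0.162 | 0.219 | 0.815 | −0.016 | 0.171 | 0.937 |
| γ21 | 0.5 | −0.001 | 0.099 | 0.900 | −0.033 | 0.101 | 0.922 | −0.012 | 0.093 | 0.920 |
| γ22 | −1 | 0.336 | 0.211 | 0.605 | 0.346 | 0.216 | 0.649 | −0.021 | 0.207 | 0.920 |
| ν2 | 1.5 | 0.102 | 0.525 | 0.825 | 0.152 | 1.516 | 0.809 | −0.075 | 0.843 | 0.937 |
| σv1² | 1 | | | | | | | −0.034 | 0.354 | 0.906 |
| σv0² | 0.5 | | | | | | | −0.030 | 0.182 | 0.899 |
| Joint covariances | | | | | | | | | | |
| σ1 | 1.5 | | | | | | | −0.064 | 0.398 | 0.923 |
| σ0 | −0.4 | | | | | | | 0.009 | 0.145 | 0.882 |

a Homogeneous model from Hu et al. (2009)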

The bold numbers represent relatively large biases

We conduct the second simulation by generating data with a structure similar to that of the SLS. The longitudinal measurements and the competing risks event times are simulated from model (11)–(14) with Zij = (Timeij, Time18ij), where the covariates are generated from distributions close to those observed in the real data. All the parameters of the joint model are set to the estimated values from the joint analysis of the SLS in Table 2. Weibull distributions are used for the true baseline hazard functions, which produce event rates and a censoring rate similar to those in the SLS. The results of the joint model and the separate analysis are compared in Table 5 using 200 simulated datasets with a sample size of m = 140. The MCMC sampler is run for 10,000 iterations, and the estimates are based on the last 5,000 iterations. The joint model produces good point estimates and coverage rates for most of the parameters in the longitudinal sub-model, except for the time trend after 18 months (β5) and the corresponding variance (ΣU22). The separate analysis gives biased estimates for both time trends and their corresponding variances. These biases do not decrease even for a large sample size of 500 (simulation results are not reported here), since they are consequences of the informative dropout process, which cannot be accounted for by the linear mixed effects model alone. In contrast, the biases in the joint model are much reduced with increased sample size. The random effects coefficient ν2, the frailty variance σv² and their standard errors are poorly estimated by the separate competing risks models. The joint model gives a biased estimate for ν2 as well, which suggests that with a small sample size of 140 and low event rates (10% for risk 1 and 23% for risk 2), even the joint analysis may not provide good estimates for the frailty at the survival endpoint.
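The competing risks part of such a simulation can be sketched as follows. This is an illustration under simplifying assumptions rather than the authors' simulation code: the covariates are time-fixed, censoring is administrative at a single time point, the frailty v is drawn directly instead of through the modified Cholesky construction, and every numerical value below is made up rather than taken from Table 2. Each cause k is given the Weibull cause-specific hazard `scales[k] * shapes[k] * t**(shapes[k] - 1) * exp(x @ gammas[k] + nus[k] * v)`, and the observed time is the earliest latent time or the censoring time.

```python
import numpy as np

def simulate_competing_risks(x, v, gammas, nus, shapes, scales, censor_time, rng):
    """Draw (time, cause) pairs; cause 0 denotes censoring, causes are 1..g."""
    x = np.asarray(x, dtype=float)            # (m, p) time-fixed covariates
    v = np.asarray(v, dtype=float)            # (m,) subject-level frailty
    gammas = np.asarray(gammas, dtype=float)  # (g, p) regression coefficients
    nus = np.asarray(nus, dtype=float)        # (g,) frailty coefficients
    shapes = np.asarray(shapes, dtype=float)  # (g,) Weibull shapes
    scales = np.asarray(scales, dtype=float)  # (g,) Weibull scales
    m, g = x.shape[0], gammas.shape[0]

    # Linear predictor of each cause-specific hazard, one column per cause.
    lin_pred = x @ gammas.T + np.outer(v, nus)

    # Invert the cumulative hazard scale * t**shape * exp(lin_pred) at a
    # unit-exponential draw to obtain one latent time per cause.
    unit_exp = rng.exponential(size=(m, g))
    latent = (unit_exp / (scales * np.exp(lin_pred))) ** (1.0 / shapes)

    first = latent.argmin(axis=1)
    event_time = latent[np.arange(m), first]
    censored = event_time > censor_time
    time = np.where(censored, censor_time, event_time)
    cause = np.where(censored, 0, first + 1)
    return time, cause

rng = np.random.default_rng(2010)
m = 140
x = rng.normal(size=(m, 2))
v = rng.normal(scale=0.6, size=m)
time, cause = simulate_competing_risks(
    x, v,
    gammas=[[0.2, -0.1], [0.3, 0.1]],
    nus=[1.0, 1.5],
    shapes=[1.2, 1.5],
    scales=[0.01, 0.02],
    censor_time=24.0,
    rng=rng,
)
```

Tuning the hypothetical `scales` and `shapes` is one way to match target event and censoring rates such as those quoted for the SLS.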

Table 5.

Comparison of simulated bias, standard error (SE) and coverage probability (CP) between joint and separate analyses (sample size = 140)

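| Parameter | True | Joint Bias | Joint SE | Joint CP | Separate Bias | Separate SE | Separate CP |
|---|---|---|---|---|---|---|---|
| Longitudinal: fixed effects | | | | | | | |
| β0 | 66.33 | 0.004 | 0.765 | 0.975 | 0.459 | 0.768 | 0.925 |
| β1 | 0.89 | −0.002 | 0.051 | 0.955 | 0.002 | 0.051 | 0.940 |
| β2 | −1.85 | −0.028 | 0.597 | 0.925 | −0.030 | 0.592 | 0.940 |
| β3 | −0.98 | −0.049 | 1.031 | 0.955 | 0.210 | 1.019 | 0.960 |
| β4 | −0.12 | −0.002 | 0.099 | 0.935 | 0.082 | 0.096 | 0.815 |
| β5 | 0.27 | 0.111 | 0.278 | 0.900 | 0.299 | 0.239 | 0.775 |
| β6 | 0.14 | −0.002 | 0.073 | 0.950 | −0.003 | 0.072 | 0.935 |
| β7 | 1.74 | 0.026 | 0.839 | 0.945 | 0.032 | 0.832 | 0.935 |
| β8 | 0.26 | −0.006 | 0.130 | 0.940 | 0.036 | 0.127 | 0.945 |
| β9 | −0.72 | 0.040 | 0.337 | 0.935 | 0.101 | 0.327 | 0.920 |
| Longitudinal: random effects | | | | | | | |
| σ² | 21.55 | 0.049 | 1.357 | 0.935 | 0.041 | 1.310 | 0.940 |
| ΣU11 | 0.27 | −0.001 | 0.038 | 0.965 | 0.022 | 0.035 | 0.855 |
| ΣU12 | −0.31 | 0.060 | 0.095 | 0.860 | 0.066 | 0.083 | 0.860 |
| ΣU22 | 1.29 | 0.284 | 0.401 | 0.790 | 0.235 | 0.345 | 0.790 |
| Competing risks: fixed effects | | | | | | | |
| γ11 | −0.06 | −0.004 | 0.030 | 0.950 | 0.005 | 0.025 | 0.940 |
| γ12 | 0.22 | 0.077 | 0.320 | 0.925 | 0.017 | 0.270 | 0.930 |
| γ13 | 0.23 | 0.011 | 0.518 | 0.920 | 0.038 | 0.448 | 0.935 |
| γ14 | 0.10 | 0.005 | 0.042 | 0.950 | −0.010 | 0.031 | 0.965 |
| γ15 | 0.13 | 0.063 | 0.434 | 0.920 | −0.064 | 0.345 | 0.955 |
| γ21 | 0.02 | −0.009 | 0.041 | 0.945 | −0.008 | 0.029 | 0.945 |
| γ22 | 0.29 | 0.055 | 0.473 | 0.925 | −0.061 | 0.356 | 0.925 |
| γ23 | −1.33 | −0.052 | 1.107 | 0.925 | −0.098 | 1.049 | 0.915 |
| γ24 | −0.07 | 0.003 | 0.075 | 0.960 | 0.010 | 0.059 | 0.995 |
| γ25 | −0.58 | −0.010 | 0.891 | 0.940 | 0.095 | 0.805 | 0.965 |
| Competing risks: random effects | | | | | | | |
| ν2 | 3.04 | 0.456 | 1.043 | 0.925 | 0.414 | 3.212 | 1.000 |
| σv² | 0.38 | 0.076 | 0.728 | 0.995 | 0.333 | 0.058 | 0.285 |
| Competing risks: joint covariances | | | | | | | |
| ΣU1υ | −0.25 | −0.100 | 0.128 | 0.970 | | | |
| ΣU2υ | 0.60 | −0.022 | 0.418 | 0.945 | | | |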

The bold numbers represent relatively large biases

6 Discussion

For simplicity, we assume in our model that the measurement errors are mutually independent and normally distributed with constant variance. This assumption can be weakened, and our method can be modified to handle correlated normal random errors. Our model also assumes that the longitudinal sub-model and the survival sub-model are independent conditional on the observed data and latent variables. This may not be satisfied in a real study such as the SLS, in which one of the risks in the survival endpoint, treatment failure or death, is partly determined by the longitudinal outcome %FVC. We conducted sensitivity analyses and found that our model is robust to mild violations of the independence assumption.

Our model can be extended to clustered data. Clustered data frequently arise from multi-site clinical trials or from studies across families, in which each site or family can be viewed as a cluster. The cluster effect can be conveniently incorporated as a random effect or as design vectors for the GARP/IV parameters to take into account the heterogeneity across clusters. Similarly, our method can be extended to recurrent event data, where each subject may repeatedly experience a certain phenomenon. In addition, within our joint model framework, the linear mixed sub-model can be extended to the generalized linear mixed effects model (GLMM; Diggle et al. 2002) to handle non-normally distributed data, such as binomial or Poisson outcomes. Due to the complexity of the likelihood function in both GLMMs and joint models, only a few papers have discussed such a generalized joint model framework (Molenberghs et al. 1997; Faucett et al. 1998; Larsen 2005; Yao 2008). Although the posterior sampling distributions for the fixed and random effects in the longitudinal sub-model would need to be changed, the parameters in the survival sub-model and the joint variance–covariance parameters can still be sampled with the algorithm described here. One possible approach to sampling the parameters in the GLMM sub-model is to update the fixed and random effects by constructing a normal proposal distribution whose mean and variance come from a single iteration of weighted least squares based on the previous value (Gamerman 1997).
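To make the weighted least squares proposal concrete, the sketch below builds such a normal proposal for a Poisson log-linear model. It is a simplified illustration in the spirit of Gamerman (1997), not an implementation of the extension itself: it covers fixed effects only, assumes a zero-mean normal prior with precision matrix `prior_precision`, and the function name `wls_proposal` and the toy data are hypothetical.

```python
import numpy as np

def wls_proposal(beta, X, y, prior_precision):
    """Mean and covariance of a normal proposal from one weighted least squares step.

    Poisson outcome with log link; the step is taken from the current value beta.
    """
    eta = X @ beta                       # current linear predictor
    mu = np.exp(eta)                     # current fitted means
    weights = mu                         # working weights for the Poisson log link
    working_y = eta + (y - mu) / mu      # linearized working response

    precision = prior_precision + X.T @ (weights[:, None] * X)
    cov = np.linalg.inv(precision)
    mean = cov @ (X.T @ (weights * working_y))
    return mean, cov

# Intercept-only toy example with a flat prior (zero prior precision).
y = np.array([2.0, 4.0, 3.0, 3.0])
X = np.ones((4, 1))
beta_current = np.array([np.log(y.mean())])
mean, cov = wls_proposal(beta_current, X, y, prior_precision=np.zeros((1, 1)))

rng = np.random.default_rng(1)
beta_candidate = rng.multivariate_normal(mean, cov)
```

The candidate would still have to pass a Metropolis–Hastings accept/reject step, with the reverse proposal density evaluated by calling the same function at the candidate value.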

We finally note that the modified Cholesky decomposition provides an unconstrained and statistically meaningful reparameterization of a covariance matrix, but at the expense of imposing an order among the underlying random variables. Despite this shortcoming, it has been used effectively in various applications, including multivariate quality control, multivariate time series, finance and random effects models (Pourahmadi 2007).
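The reparameterization and its dependence on the variable order can be seen in a small numerical sketch. The code below is illustrative only: the function names are hypothetical, and the 2 × 2 matrix is an arbitrary positive definite example. It finds the unit lower triangular matrix T and the innovation variances d with T Σ Tᵀ = diag(d), then rebuilds Σ from unconstrained regression coefficients and log innovation variances.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, d) with T unit lower triangular and T @ sigma @ T.T == diag(d)."""
    chol = np.linalg.cholesky(sigma)       # sigma = chol @ chol.T
    d_sqrt = np.diag(chol)
    unit_lower = chol / d_sqrt             # scale column j by 1 / d_sqrt[j]
    return np.linalg.inv(unit_lower), d_sqrt ** 2

def rebuild_covariance(phi, log_innov_var):
    """Map unconstrained parameters back to a positive definite covariance matrix.

    phi holds the below-diagonal regression coefficients in row-major order;
    log_innov_var holds the log innovation variances.
    """
    q = len(log_innov_var)
    t = np.eye(q)
    t[np.tril_indices(q, k=-1)] = -np.asarray(phi, dtype=float)
    t_inv = np.linalg.inv(t)
    return t_inv @ np.diag(np.exp(log_innov_var)) @ t_inv.T

sigma = np.array([[2.5, 1.5],
                  [1.5, 1.0]])
t, d = modified_cholesky(sigma)
phi = -t[np.tril_indices(2, k=-1)]         # coefficient of variable 2 on variable 1
rebuilt = rebuild_covariance(phi, np.log(d))

# Reversing the variable order gives a different parameterization of the same matrix.
t_rev, d_rev = modified_cholesky(sigma[::-1, ::-1])
phi_rev = -t_rev[np.tril_indices(2, k=-1)]
```

Any real values of `phi` and `log_innov_var` map to a valid covariance matrix, which is what makes regression on covariates possible, while `phi` and `phi_rev` differ because the decomposition depends on which variable comes first.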

Appendix

This section provides details for the full conditional distributions of the parameters used in the Gibbs sampling algorithm. We use $p(\cdot)$ and $p(\cdot \mid \cdot)$ to denote marginal and conditional densities, respectively, and denote the prior distribution by $p_0(\cdot)$. Based on the modified Cholesky decomposition, the random effect $v_i$ can be written as $v_i = \sum_{l=1}^{q} a_{iql}^{T} \eta_1 U_{il} + e_{i,q+1}$, where $e_{i,q+1} \sim N(0, \exp(b_{i,q+1}^{T} \eta_2))$. Instead of sampling $v_i$ directly, we sample $e_{i,q+1}$, which leads to a faster convergence rate.

  1. Sample β from
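    $$p(\beta \mid \cdot) \propto N\left( \left( \sum_{i=1}^{m} X_i^{(1)T} X_i^{(1)} \right)^{-1} \left( \sum_{i=1}^{m} X_i^{(1)T} (Y_i - Z_i U_i) \right),\ \left( \sum_{i=1}^{m} \frac{X_i^{(1)T} X_i^{(1)}}{\sigma^2} \right)^{-1} \right) p_0(\beta).$$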
  2. Sample σ2 from
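    $$p(\sigma^2 \mid \cdot) \propto IG\left( \frac{\sum_{i=1}^{m} n_i}{2} - 1,\ \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left( Y_{ij} - \beta^{T} X_i^{(1)}(t_{ij}) - U_i^{T} Z(t_{ij}) \right)^2 \right) \times p_0(\sigma^2).$$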
  3. Sample the random effects Ui from
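    $$p(U_i \mid \cdot) \propto N\left( \mu_{U_i \mid Y_i}, \Sigma_{U_i \mid Y_i} \right) \times \prod_{k=1}^{g} \exp\left\{ \left( \sum_{l=1}^{q} a_{iql}^{T} \eta_1 U_{il} + e_{i,q+1} \right) \nu_k I(D_i = k) - H_k(T_i) \right\},$$

    where $\Sigma_{U_i \mid Y_i} = \left( Z_i^{T} Z_i / \sigma^2 + \Sigma_{u_i}^{-1} \right)^{-1}$, $\mu_{U_i \mid Y_i} = \Sigma_{U_i \mid Y_i} \left[ Z_i^{T} (Y_i - X_i^{(1)} \beta) / \sigma^2 \right]$ and $\Sigma_{u_i}^{-1} = \tilde{M}_i^{T} \tilde{H}_i^{-1} \tilde{M}_i$. Here $\tilde{M}_i$ is the q × q matrix consisting of the first q columns and rows of $M_i$, and $\tilde{H}_i$ is the q × q matrix consisting of the first q columns and rows of $H_i$. We use a one-step Metropolis–Hastings algorithm to obtain the update in the sampling sequence, with the normal density from the longitudinal data as the proposal density. The random effects $U_i$ are obtained by first sampling a candidate from the conditional density based on the longitudinal data and then using the conditional likelihood contribution from the survival data to determine whether the new draw is accepted.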

  4. Sample η1 from
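    $$p(\eta_1 \mid \cdot) \propto N\left( \left( \sum_{i=1}^{m} Q_i^{T} H_i^{-1} Q_i \right)^{-1} \left( \sum_{i=1}^{m} Q_i^{T} H_i^{-1} U_i \right),\ \left( \sum_{i=1}^{m} Q_i^{T} H_i^{-1} Q_i \right)^{-1} \right) \times \prod_{k=1}^{g} \exp\left\{ \left( \sum_{l=1}^{q} a_{iql}^{T} \eta_1 U_{il} + e_{i,q+1} \right) \nu_k I(D_i = k) - H_k(T_i) \right\} p_0(\eta_1),$$

    where $Q_i$ is a $q \times q_1$ matrix with first row $Q_{i1} = 0$ and jth row $Q_{ij} = \sum_{l=1}^{j-1} a_{ijl}^{T} U_{il}$ for j = 2, …, q. We sample $\eta_1$ in two steps: the entries that involve only $U_i$ are sampled from the normal conditional density, and the entries that involve both $U_i$ and $v_i$ are sampled with ARS.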

  5. Sample η2 from
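    $$p(\eta_2 \mid \cdot) \propto \exp\left[ -\frac{1}{2} \sum_{i=1}^{m} \left( \sum_{j=1}^{q} \left\{ b_{ij}^{T} \eta_2 + \frac{\left( U_{ij} - \sum_{l=1}^{j-1} a_{ijl}^{T} \eta_1 U_{il} \right)^2}{\exp(b_{ij}^{T} \eta_2)} \right\} + b_{i,q+1}^{T} \eta_2 + \frac{e_{i,q+1}^2}{\exp(b_{i,q+1}^{T} \eta_2)} \right) \right] p_0(\eta_2).$$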

    We use a Metropolis–Hastings step with a normal approximation to the full conditional as the candidate distribution. For details, see Daniels and Pourahmadi (2002).

  6. Sample γkr, k = 1, …, g, r = 1, …, R from
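    $$p(\gamma_{kr} \mid \cdot) \propto \exp\left[ \gamma_{kr} \sum_{i=1}^{m} I(D_i = k) X_{ir}^{(2)}(T_i) - \sum_{i=1}^{m} H_k(T_i) \right] p_0(\gamma_k).$$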

    We use a Metropolis–Hastings step within the single-component sampler to update the values of these parameters. For each parameter, the proposal density is a normal density whose mean is the current value of the parameter and whose standard deviation is four times the standard error of the maximum partial likelihood estimate from a standard Cox model (Wang and Taylor 2001).

  7. Sample νk with ARS from
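    $$p(\nu_k \mid \cdot) \propto \exp\left[ \sum_{i=1}^{m} I(D_i = k)\, \nu_k \left( \sum_{l=1}^{q} a_{iql}^{T} \eta_1 U_{il} + e_{i,q+1} \right) - \sum_{i=1}^{m} \int_0^{T_i} \lambda_{0k} \exp\left( \gamma_k^{T} X_i^{(2)} + \nu_k \left( \sum_{l=1}^{q} a_{iql}^{T} \eta_1 U_{il} + e_{i,q+1} \right) \right) dt \right] p_0(\nu_k).$$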
  8. Sample ei, q+1 (i = 1, …, m) from
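    $$p(e_{i,q+1} \mid \cdot) \propto N\left( 0, \exp(b_{i,q+1}^{T} \eta_2) \right) \times \prod_{k=1}^{g} \exp\left[ e_{i,q+1} \nu_k I(D_i = k) - H_k(T_i) \right].$$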

    The sample is obtained by first drawing a candidate from the assumed normal density and then using the conditional likelihood contribution from the survival data to determine whether the new draw is accepted.

  9. Sample each piece of λ0k (k = 1, …, g) from
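    $$p(\lambda_{0k}^{(s)} \mid \cdot) \propto \Gamma(\alpha_{ks}, \beta_{ks})\, p_0(\lambda_{0k}^{(s)}),$$

    where $\alpha_{ks} = \sum_{i=1}^{m} I(D_i = k,\ t_k^{(s-1)} < T_i \le t_k^{(s)}) + 1$ is one plus the number of type k events occurring in the time interval $(t_k^{(s-1)}, t_k^{(s)}]$, and $\beta_{ks} = \sum_{i=1}^{m} I(T_i > t_k^{(s-1)}) \int_{t_k^{(s-1)}}^{\min(T_i,\, t_k^{(s)})} \exp\left( \gamma_k^{T} X_i^{(2)} + \nu_k v_i \right) dt$, for s = 1, …, $S_k$.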
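Steps 3 and 8 above share the same accept/reject logic: a candidate is drawn from a normal density, and the survival likelihood decides whether it is kept. When the proposal equals the normal factor of the full conditional, that factor cancels and the acceptance ratio reduces to a ratio of survival likelihood contributions. The sketch below illustrates this for the scalar $e_{i,q+1}$ of step 8. It is a simplified illustration, not the authors' implementation: the function names are hypothetical, `offset` stands for the term $\sum_{l} a_{iql}^{T} \eta_1 U_{il}$, and `base_cumhaz[k-1]` stands for the cumulative hazard of cause k at $T_i$ with the frailty term removed.

```python
import numpy as np

def log_survival_contribution(e, offset, nus, event, base_cumhaz):
    """Log survival-likelihood contribution of one subject, as a function of e."""
    v = offset + e                                   # frailty v_i
    log_lik = 0.0
    for k, (nu, h0) in enumerate(zip(nus, base_cumhaz), start=1):
        if event == k:
            log_lik += nu * v                        # log-hazard term for the observed cause
        log_lik -= h0 * np.exp(nu * v)               # cumulative hazard H_k(T_i)
    return log_lik

def update_innovation(e_current, innov_var, offset, nus, event, base_cumhaz, rng):
    """One Metropolis-Hastings update with the N(0, innov_var) density as proposal."""
    e_proposed = rng.normal(0.0, np.sqrt(innov_var))
    log_ratio = (
        log_survival_contribution(e_proposed, offset, nus, event, base_cumhaz)
        - log_survival_contribution(e_current, offset, nus, event, base_cumhaz)
    )
    # log1p(-u) is the log of a uniform draw on (0, 1], which avoids log(0).
    accept = np.log1p(-rng.uniform()) < log_ratio
    return (e_proposed if accept else e_current), bool(accept)

# Toy chain for a single subject who failed from cause 2; all values are made up.
rng = np.random.default_rng(17)
e, n_accept, draws = 0.0, 0, []
for _ in range(2000):
    e, accepted = update_innovation(
        e, innov_var=0.5, offset=0.1, nus=[1.0, 1.5],
        event=2, base_cumhaz=[0.2, 0.3], rng=rng,
    )
    n_accept += accepted
    draws.append(e)
acceptance_rate = n_accept / 2000
```

The same pattern applies to the vector $U_i$ in step 3, with the normal density based on the longitudinal data taking the place of the prior as the proposal.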

Footnotes

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Contributor Information

Xin Huang, Email: xin@amgen.com, Amgen Inc., 1120 Veterans Boulevard, Mail Stop ASF3-3, South San Francisco, CA 94080, USA.

Gang Li, Email: vli@ucla.edu, Department of Biostatistics, School of Public Health, University of California at Los Angeles, Los Angeles, CA 90095, USA.

Robert M. Elashoff, Email: relashof@biomath.medsch.ucla.edu, Department of Biomathematics, University of California at Los Angeles, Los Angeles, CA 90095, USA

Jianxin Pan, Email: jianxin.pan@manchester.ac.uk, School of Mathematics, The University of Manchester, Manchester, UK.

References

  1. Brown ER, Ibrahim JG. A Bayesian semiparametric joint hierarchical model for longitudinal and survival data. Biometrics. 2003;59:221–228. doi: 10.1111/1541-0420.00028.
  2. Celeux G, Forbes F, Robert CP, Titterington DM. Deviance information criteria for missing data models. Bayesian Anal. 2006;4:651–674.
  3. Chen MH. Comments on article by Celeux et al. Bayesian Anal. 2006;4:677–680.
  4. Chib S, Greenberg E. Understanding the Metropolis-Hastings algorithm. Am Stat. 1995;49:327–335.
  5. Chiu TYM, Leonard T, Tsui KW. The matrix-logarithmic covariance model. J Am Stat Assoc. 1996;91:198–210.
  6. Daniels MJ, Pourahmadi M. Bayesian analysis of covariance matrices and dynamic models for longitudinal data. Biometrika. 2002;89:553–566.
  7. Daniels MJ, Zhao YD. Modelling the random effects covariance matrix in longitudinal data. Stat Med. 2003;22:1631–1647. doi: 10.1002/sim.1470.
  8. Davidian M, Giltinan DM. Nonlinear models for repeated measurement data. Chapman and Hall; New York: 1995.
  9. De Gruttola V, Tu XM. Modeling progression of CD4-lymphocyte count and its relationship to survival time. Biometrics. 1994;50:1003–1014.
  10. Diggle P, Heagerty P, Liang KY, Zeger S. Analysis of longitudinal data. Oxford University Press; Oxford: 2002.
  11. Elashoff R, Li G, Li N. An approach to joint analysis of longitudinal measurements and competing risks failure time data. Stat Med. 2007;26:2813–2835. doi: 10.1002/sim.2749.
  12. Elashoff R, Li G, Li N. A joint model for longitudinal measurements and survival data in the presence of multiple failure types. Biometrics. 2008;64:762–771. doi: 10.1111/j.1541-0420.2007.00952.x.
  13. Faucett CL, Thomas DC. Simultaneously modeling of censored survival data and repeated measured covariates: a Gibbs sampling approach. Stat Med. 1996;16:1663–1685. doi: 10.1002/(SICI)1097-0258(19960815)15:15<1663::AID-SIM294>3.0.CO;2-1.
  14. Faucett CL, Schenker N, Elashoff RM. Analysis of censored survival data with intermittently observed time-dependent binary covariates. J Am Stat Assoc. 1998;93:427–437.
  15. Gamerman D. Sampling from the posterior distribution in generalized linear mixed models. Stat Comput. 1997;7:57–68.
  16. Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences (with discussion). Stat Sci. 1992;7:457–511.
  17. Gilks WR, Wild P. Adaptive rejection sampling for Gibbs sampling. Appl Stat. 1992;41:337–348.
  18. Hastings WK. Monte Carlo sampling methods using Markov chains and their applications. Biometrika. 1970;57:97–109.
  19. Heagerty PJ, Kurland BF. Misspecified maximum likelihood estimate and generalised linear mixed models. Biometrika. 2001;88:973–986.
  20. Henderson R, Diggle P, Dobson A. Joint modeling of longitudinal measurements and event time data. Biostatistics. 2000;4:465–480. doi: 10.1093/biostatistics/1.4.465.
  21. Hogan JW, Laird NM. Model-based approaches to analysing incomplete longitudinal and failure time data. Stat Med. 1997;16:259–272. doi: 10.1002/(sici)1097-0258(19970215)16:3<259::aid-sim484>3.0.co;2-s.
  22. Hu WH, Li G, Li N. A Bayesian approach to joint analysis of longitudinal measurements and competing risks failure time data. Stat Med. 2009;29:1601–1619. doi: 10.1002/sim.3562.
  23. Larsen K. The Cox proportional hazards model with a continuous latent variable measured by binary indicators. Biometrics. 2005;61:1049–1055. doi: 10.1111/j.1541-0420.2005.00374.x.
  24. Lin X, Raz J, Harlow SD. Linear mixed models with heterogeneous within-cluster variances. Biometrics. 1997;53:910–923.
  25. Little RJA. Modeling the drop out mechanism in repeated measures studies. J Am Stat Assoc. 1995;90:1112–1121.
  26. Liu L, Ma JZ, O’Quigley J. Joint analysis of multi-level repeated measures data and survival: an application to the end stage renal disease (ESRD) data. Stat Med. 2008;27:5679–5691. doi: 10.1002/sim.3392.
  27. Molenberghs G, Kenward MG, Lesaffre E. The analysis of longitudinal ordinal data with nonrandom drop-out. Biometrika. 1997;84:33–44.
  28. Pourahmadi M. Joint mean-covariance models with applications to longitudinal data: unconstrained parameterization. Biometrika. 1999;86:677–690.
  29. Pourahmadi M. Cholesky decompositions and estimation of a covariance matrix: orthogonality of variance-correlation parameters. Biometrika. 2007;94:1006–1013.
  30. Pourahmadi M, Daniels MJ. Dynamic conditional linear mixed models for longitudinal data. Biometrics. 2002;58:225–231. doi: 10.1111/j.0006-341x.2002.00225.x.
  31. Prentice RL, Breslow NE. Retrospective studies and failure time models. Biometrika. 1978;65:153–158.
  32. Schluchter MD. Methods for the analysis of informatively censored longitudinal data. Stat Med. 1992;11:1861–1870. doi: 10.1002/sim.4780111408.
  33. Song X, Davidian M, Tsiatis AA. A semiparametric likelihood approach to joint modeling of longitudinal and time-to-event data. Biometrics. 2002;58:742–753. doi: 10.1111/j.0006-341x.2002.00742.x.
  34. Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A. Bayesian measures of model complexity and fit (with discussion). J R Stat Soc Ser B. 2002;64:583–639.
  35. Tashkin DP, Elashoff RM, et al. Cyclophosphamide versus placebo in scleroderma lung disease. New Engl J Med. 2006;354:2655–2666. doi: 10.1056/NEJMoa055120.
  36. Tseng YK, Hsieh F, Wang JL. Joint modelling of accelerated failure time and longitudinal data. Biometrika. 2005;92:587–603.
  37. Wang Y, Taylor JMG. Joint modeling longitudinal and event time data with application to acquired immunodeficiency syndrome. J Am Stat Assoc. 2001;96:895–905.
  38. Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics. 1997;53:330–339.
  39. Xu J, Zeger SL. Joint analysis of longitudinal data comprising repeated measures and times to events. Appl Stat. 2001;50:375–387.
  40. Yao F. Functional approach of flexibly modelling generalized longitudinal data and survival time. J Stat Plan Inference. 2008;138:995–1009.
  41. Ye W, Lin XH, Taylor JMG. Semiparametric modeling of longitudinal measurements and time-to-event data: a two-stage regression calibration approach. Biometrics. 2008;64:1238–1246. doi: 10.1111/j.1541-0420.2007.00983.x.
  42. Zeng D, Cai J. Simultaneous modelling of survival and longitudinal data with an application to repeated quality of life measures. Lifetime Data Anal. 2005;11:151–174. doi: 10.1007/s10985-004-0381-0.
  43. Zhang F, Weiss RE. Diagnosing explainable heterogeneity of variance in random effects models. Can J Stat. 2000;28:3–18.
