Abstract
Background
It has not been previously demonstrated whether Bayesian joint modeling (BJM) of disability and survival can, under certain conditions, improve precision of individual survival curves.
Methods
In a longitudinal, observational study, 754 initially non-disabled community-dwelling adults in greater New Haven, CT, were observed monthly for over 10 years.
Results
In this study BJM exploited many monthly observations to demonstrate, relative to a separate Bayesian survival model with adjustment, improved precision of individual survival curves permitting detection of significant differences between survival curves of two similar individuals. The gain in precision was lost when using only those observations from intervals of six, nine, or twelve months.
Conclusion
When there are many repeated measures, BJM of longitudinal functional disability and interval-censored survival can potentially increase the precision of individual survival curves relative to those from a separate Bayesian survival model. This may facilitate the identification of significant differences between individual survival curves, a useful result usually precluded by the large variability inherent to individual level estimates from stand-alone survival models.
Keywords: Bayesian, joint modeling, interval-censored survival, ordinal outcome, functional disability, individual survival curves
The joint modeling of a longitudinal outcome and survival has historically been motivated by the desire to reduce potential bias in the associations of explanatory variables with both longitudinal outcomes and survival when time to the survival event is likely correlated with the longitudinal outcome (Albert PS & Follmann DA, 2000; Cowling BJ, Hutton JL, & Shaw JEH, 2006; Faucett C, Schenker N, & Elashoff RM, 1998; Rizopoulos D, Verbeke G, Lesaffre E, & Vanrenterghem Y, 2008; Tsiatis AA, Degruttola V, & Wulfsohn MS, 1995; Wulfsohn MS & Tsiatis AA, 1997). It has been shown that such bias, which may result from informative censoring, can be reduced by the joint modeling of a longitudinal outcome and time to the censoring event (Aalen OO, 2012; Ibrahim J, Chu H, & Chen LM, 2010). In longitudinal studies of older persons, death is often related to the gerontologic outcomes under study (Gill TM, Allore HG, Gahbauer EA, & Murphy TE, 2010), and may therefore constitute informative censoring. A representative example is the longitudinal association between slow gait speed, an important risk factor, and functional disability, a commonly studied gerontologic outcome. Gerontologic researchers have questioned whether the association between slow gait and longitudinal occurrence of functional disability among older persons, i.e., 70 years and older, is biased by death of participants. Because previous studies have shown that functional disability and death are correlated, especially in the months immediately preceding death (Murphy TE et al., 2011), joint modeling can provide insight as to whether bias is being introduced. Although joint modeling is often utilized to reduce bias in the associations with longitudinal outcomes, this article demonstrates that the survival associations can also be modified in a way that is very useful for the comparison of individual survival curves.
While there are a number of proposed methods with which to implement the sharing of information between the longitudinal and time-to-event outcomes (Ye W, Lin X, & Taylor JMG, 2008), for this application we use the joint modeling paradigm of Henderson (Henderson RJ, Diggle PJ, & Dobson A, 2000). This approach assumes a latent structure driving the two outcomes through the sharing of person-specific random effects, which is congruent with the authors’ concept of an underlying aging process that manifests as longitudinally increasing disability with eventual termination in death. In this article we jointly model the longitudinal trajectory of an important geriatric outcome, e.g., functional disability, whose monthly ordinal values are modeled in sync with the monthly survival status of each participant. Methodologically, we apply the joint modeling approach to a longitudinal ordinal outcome and time to death in a Bayesian framework (Hu W, Mengersen K, & Tong S, 2009; Symons JM, Le HQ, Kreckman KH, Sakr CJ, & Lednar WM, 2009) that features person-specific intercepts in each of the sub-models. Although the Bayesian approach is increasingly showing itself useful in survival analysis (Lin I-F, Chang WP, & Liao Y-N, 2013), to our knowledge there has not been a previous demonstration of the joint modeling of an ordinal outcome and interval-censored survival in the Bayesian context. Using a large cohort of community-dwelling older persons in the U.S., we demonstrate a previously under-reported advantage of joint modeling, i.e., a notable improvement in the precision of jointly estimated individual survival curves.
Methods
Data Source
Data for this study came from the Precipitating Events Project, an on-going longitudinal study of 754 community-living persons, aged 70 or older, who were initially nondisabled in four basic activities of daily living (ADLs) —bathing, dressing, walking, and transferring. The assembly of this cohort has been described in detail elsewhere (Gill TM, Desai MM, Gahbauer EA, Holford TR, & Williams CS, 2001; Hardy SE & Gill TM, 2004). Potential participants were members of a large health plan in New Haven, CT, USA. Exclusion criteria included significant cognitive impairment with no available proxy, life expectancy less than 12 months, plans to move out of the area, and inability to speak English. Only 4.6% of persons contacted refused screening, and 75.2% of those eligible agreed to participate and were enrolled from March 1998 to October 1999. Participants completed in-person home-based comprehensive assessments at baseline and at subsequent 18-month intervals, as well as monthly telephone interviews. Analysis in this study was based on data collected through December 2008, with each individual having up to 7 repeated comprehensive assessments and 129 monthly telephone interviews. The study protocol was approved by the Yale Human Investigation Committee, and all participants provided informed consent.
Explanatory Variables
The analytic model included time invariant covariates, such as baseline age in years centered on the sample average value of 78.4, race, and sex, as well as a time dependent indicator of slow gait speed. The slow gait criterion was met if the participant needed more than 10 seconds to complete the rapid gait test (Gill TM, Gahbauer EA, Allore HG, & Han L, 2006), and was measured at each comprehensive assessment. The natural logarithm of number of months of follow-up was also included in the disability models.
Longitudinal and Survival Outcomes
Both outcomes of interest, i.e., ordinal level of ADL disability and time to death, were measured at the monthly telephone interviews. Because the time of death and length of follow-up vary by individual, each participant’s number of observations differs, ranging from 1 to 129 months. Data were available for 98.7% of the 70500 monthly telephone interviews. Sequential multiple imputation was applied from baseline at monthly iterations to impute the missing data, with decedents and dropouts removed from all iterations subsequent to their final observations (Ning Y, McAvay G, Chaudhry SI, Arnold A, & Allore HG, 2013).
In the Precipitating Events Project, disability in ADLs is measured by the number of basic activities of daily living (among bathing, dressing, walking, and transferring) for which the participant needed help. To facilitate clinical interpretability the count outcome is converted to a three-level ordinal scale as follows: 0 = no disability (person had no ADL limitations); 1 = mild disability (one or two ADL limitations); and 2 = severe disability (three or four ADL limitations). The translation to this ordinal scale has been successfully utilized in previous analyses (Gill TM, Gahbauer EA, Han L, & Allore HG, 2010). Because monthly death status was ascertained from local obituaries and/or a subsequent telephone interview with the family, time to death was measured in months. Through December 2008, 433 (57.4%) participants died with median follow-up of 56.5 months, while 35 (4.6%) dropped out with median follow-up of 23.5 months.
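The conversion from the ADL count to the three-level ordinal scale can be sketched as follows. This is a minimal illustration of the mapping described above; the function name `adl_ordinal_level` is hypothetical and is not taken from the study's code.

```python
def adl_ordinal_level(n_dependent_adls: int) -> int:
    """Map the number of ADLs needing help (0 to 4) to the ordinal scale.

    0 = no disability, 1 = mild (one or two ADLs), 2 = severe (three or four).
    """
    if not 0 <= n_dependent_adls <= 4:
        raise ValueError("ADL count must be between 0 and 4")
    if n_dependent_adls == 0:
        return 0
    if n_dependent_adls <= 2:
        return 1
    return 2


# Ordinal level for every possible count of dependent ADLs
levels = [adl_ordinal_level(n) for n in range(5)]
print(levels)
```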
Separate Bayesian Models
Ordinal Disability over Time
We evaluated the disability outcome with a cumulative logit model that assumes an underlying continuous latent outcome (Yit, for person i at time t). This proportional odds model yields each explanatory variable’s average association with moving to a worsened state of disability in a month’s time. Each increment of the ordinal value represents a worsening of disability, and is assumed to arise from an interval along the latent continuous scale partitioned by specific threshold values, aj : j = 1 or 2, that collectively define the three-level scale. Assuming a logistic distribution with mean μit for Yit, the cumulative probability Qitj of subject i having a value of Yit ≥ aj at time t is given by:
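$$\operatorname{logit}(Q_{itj}) = \mu_{it} - a_j, \qquad \mu_{it} = b_{0i} + \alpha_1 x_{1it} + \dots + \alpha_k x_{kit} \qquad [1]$$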
where x1it … xkit can be time-dependent covariates. The b0i are independently distributed normal random intercepts with standard normal priors. The terms α1 … αk represent, on average, the associations between the fixed covariates and a worsening of disability in any given month. To provide minimal information with reasonable convergence, α’s are assigned normal priors with mean zero and a variance parameter with vaguely dispersed gamma hyper-parameters.
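To make the cumulative logit model concrete, the sketch below computes the three category probabilities implied by a latent logistic outcome with mean μit and unit scale, for which Pr(Yit ≥ aj) is the inverse logit of μit − aj. The threshold values and the zero person-specific intercept are hypothetical, chosen only for illustration; the slow gait coefficient is the log of the odds ratio of 4.66 reported in Table 2. The unit scale of the latent logistic distribution is also an assumption.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probs(mu, thresholds):
    """Return [Pr(level 0), Pr(level 1), Pr(level 2)] for latent mean mu.

    Q_j = Pr(Y >= a_j) = expit(mu - a_j) for thresholds a_1 < a_2.
    """
    q_ge1, q_ge2 = (expit(mu - a) for a in thresholds)
    return [1.0 - q_ge1, q_ge1 - q_ge2, q_ge2]


def odds(p):
    return p / (1.0 - p)


thresholds = (1.0, 3.0)        # hypothetical a_1, a_2
b0i = 0.0                      # hypothetical person-specific intercept
alpha_gait = math.log(4.66)    # log OR for slow gait (Table 2)

p_normal = ordinal_probs(b0i, thresholds)
p_slow = ordinal_probs(b0i + alpha_gait, thresholds)

# Proportional odds: the same odds ratio applies at both thresholds.
or_any = odds(1.0 - p_slow[0]) / odds(1.0 - p_normal[0])
or_severe = odds(p_slow[2]) / odds(p_normal[2])
print(p_normal, p_slow, or_any, or_severe)
```

Both odds ratios equal exp(α) regardless of the thresholds chosen, which is the proportional odds property the model assumes.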
Time to Death
Because time to death is measured in discrete months since baseline and 57.4% of participants died during the study, a large number of ties exist among the survival times. This motivated a method that adjusts for interval censoring (Cox DR & Oakes D, 1984). Used here was a binomial distribution with the complementary log-log link, which is a discrete analog of the continuous proportional hazards model. Specifically, given survival time Ti in discrete units and the time-dependent vector of covariates Xit, the discrete time hazard rate is Pit = Pr[ Ti= t | Ti ≥ t, Xit]. Assuming a multiplicative model for the hazard, the likelihood for interval-censored survival is the same as from a binomial distribution with event probabilities equal to Pit (Allison PD, 1982; Prentice RL et al., 1978). Assuming that the time-to-death process is continuous, it has also been shown that the probability of death in any given month may be modeled as follows:
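$$P_{it} = 1 - \exp\{-\exp(\tau_t + \sigma_{0i} + \beta_1 x_{1it} + \dots + \beta_k x_{kit})\} \qquad [2]$$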
where τt and σ0i are independently distributed random effects that respectively represent month- and person-specific intercepts, and are each assigned normal priors with mean zero. The month specific term τt is assigned a prior variance with vaguely dispersed gamma hyper-parameters and the person-specific term σ0i is assigned unit variance. Because each month has its own survival intercept, the separate survival model contains neither overall intercept nor months of follow-up. The terms β1 … βk represent the fixed, average associations between explanatory variables and probability of death in any given month. They are assigned the same priors as the vector of α’s in the separate longitudinal model.
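The discrete-time survival model with a complementary log-log link can be illustrated with the sketch below, which converts monthly linear predictors into hazards and a survival curve. The month-specific intercept of −5 (held constant across months) is hypothetical; the slow gait coefficient is the log of the hazard ratio of 2.39 from the separate survival model in Table 2.

```python
import math


def monthly_hazard(eta):
    """Inverse complementary log-log link: P = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def survival_curve(etas):
    """S(t) = product over months s <= t of (1 - P_s)."""
    curve, surv = [], 1.0
    for eta in etas:
        surv *= 1.0 - monthly_hazard(eta)
        curve.append(surv)
    return curve


tau = -5.0                     # hypothetical month-specific intercept
beta_gait = math.log(2.39)     # log HR for slow gait (Table 2, separate model)
months = 129

s_normal = survival_curve([tau] * months)
s_slow = survival_curve([tau + beta_gait] * months)

# Under this link the hazard ratio acts as a power on the survival curve:
# S_slow(t) = S_normal(t) ** HR.
print(s_normal[-1], s_slow[-1], s_normal[-1] ** 2.39)
```

This power relationship is why the complementary log-log link is described as the discrete analog of the continuous proportional hazards model.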
Bayesian Joint Models with Shared Random Effects
The joint model formulation concurrently estimates disability and survival sub-models with a shared random intercept b0i which, per Henderson (Henderson RJ, et al., 2000), is multiplied by the random effect r0 in the survival sub-model as follows:
Disability Sub-model:

$$\operatorname{logit}(Q_{itj}) = b_{0i} + \alpha_1 x_{1it} + \dots + \alpha_k x_{kit} - a_j$$

Survival Sub-model:

$$P_{it} = 1 - \exp\{-\exp(\tau_t + r_0 b_{0i} + \beta_1 x_{1it} + \dots + \beta_k x_{kit})\} \qquad [3]$$

where the σ0i in equation 2 has been replaced by the product of two normally distributed random effects, i.e., r0b0i. Note that b0i is the “shared” person-specific random effect that makes information accessible between the sub-models. In contrast, r0 “loads” the person-specific intercept in the survival sub-model and is assigned a normal prior of mean one and a gamma variance with vaguely dispersed hyper-parameters. The r0 term implies that the person-specific random effect in the survival sub-model is a multiple of the person-specific effect calculated by the longitudinal sub-model. The sign of r0 reveals the direction of the correlation between the two outcomes. The two equations in this joint model will be henceforth referred to as the disability sub-model and the survival sub-model, respectively.
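The role of the shared intercept b0i and its loading r0 can be sketched numerically. The example below compares two hypothetical persons with identical covariates whose shared intercepts sit one standard deviation above and below zero, using values reported later in Table 2 and the Results (SD of b0i = 1.88, r0 = 0.34, OR and HR for slow gait from the joint model). The month-specific intercept and the function name are hypothetical.

```python
import math


def joint_linear_predictors(b0i, x, alpha, beta, tau_t, r0):
    """Return (latent disability mean, survival cloglog predictor) for one person-month."""
    mu = b0i + sum(a * xi for a, xi in zip(alpha, x))
    eta = tau_t + r0 * b0i + sum(b * xi for b, xi in zip(beta, x))
    return mu, eta


x = [1.0]                       # slow gait present
alpha = [math.log(4.66)]        # joint-model OR for slow gait (Table 2)
beta = [math.log(1.40)]         # joint-model HR for slow gait (Table 2)
tau_t = -5.0                    # hypothetical month-specific intercept
r0 = 0.34                       # posterior estimate of the loading (Results)

mu_hi, eta_hi = joint_linear_predictors(+1.88, x, alpha, beta, tau_t, r0)
mu_lo, eta_lo = joint_linear_predictors(-1.88, x, alpha, beta, tau_t, r0)

# A positive r0 means the person more prone to disability also has the higher hazard.
hazard_ratio = math.exp(eta_hi - eta_lo)

# SD of the survival intercept implied by r0 * b0i
implied_sd = r0 * 1.88
print(hazard_ratio, implied_sd)
```

The implied standard deviation of about 0.64 agrees with the person-specific intercept SD reported for the survival sub-model in Table 2, which is consistent with the survival intercept being a multiple of the longitudinal one.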
Bayesian Model Fit and Convergence
The separate Bayesian models were first evaluated for compliance with modeling assumptions. For the assumption of proportional odds among the ordinal levels of ADL disability, cumulative logits were plotted against all covariates. To test the fit of the logistic model of monthly occurrence of death, the Hosmer-Lemeshow statistic was calculated. Within the Bayesian framework, model fit was also evaluated with posterior predictive simulations (Gelman A & Hill J, 2007) from three initially disparate Markov chains, and convergence was confirmed using longitudinal plots of each parameter and the Gelman-Rubin statistic as modified by Brooks (Brooks SP & Gelman A, 1998; Gelman A & Rubin DB, 1992). An additional 5,000 post-convergence iterations of the three Markov chains were run to provide posterior estimates of all parameters. The deviance information criterion (DIC) was also used to compare relative fit of all models (Spiegelhalter DJ, Best NJ, Carlin BP, & van der Linde A, 2002). An estimated OR or HR was interpreted as statistically significant if its 95% credible interval excluded the value one. The Bayesian software OpenBUGS was used to fit all models and produce the posterior draws of all parameter values from all separate models and sub-models (Lunn D, Spiegelhalter D, Thomas A, & Best N, 2009). The OpenBUGS code for the joint model approach, whose sub-models include all pertinent information from the separate models for disability and death, is included in the appendices.
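For readers unfamiliar with the convergence diagnostic, the sketch below computes the basic Gelman-Rubin potential scale reduction factor for one scalar parameter from several chains. It is a simplified version that omits the degrees-of-freedom correction introduced by Brooks and Gelman, and it is not the OpenBUGS implementation; the simulated chains are purely illustrative.

```python
import random
import statistics


def gelman_rubin(chains):
    """Basic potential scale reduction factor (R-hat) for equal-length chains."""
    if len(chains) < 2:
        raise ValueError("need at least two chains")
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between_over_n = statistics.variance(chain_means)                    # B / n
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5


rng = random.Random(0)
# Three chains sampling the same distribution (converged)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(5000)] for _ in range(3)]
# Three chains stuck in different regions (not converged)
stuck = [[rng.gauss(shift, 1.0) for _ in range(5000)] for shift in (0.0, 3.0, 6.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(rhat_mixed, rhat_stuck)
```

Values close to one indicate that the between-chain variability is negligible relative to the within-chain variability; values well above one indicate that the chains have not yet mixed.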
Sensitivity Analysis
Because the data illustrating these methods include 129 monthly observations of both outcomes with inconsequential missingness, the amount of longitudinal information they contain is uncommonly rich. For this reason we performed a series of sensitivity analyses to study how the precision of jointly modeled survival curves depends on the frequency of the repeated measures. To that end, joint models were fit to subsets of the full data comprising every 3rd, 6th, 9th, and 12th monthly observation, respectively.
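The construction of the reduced datasets can be sketched as below. The exact rule used to select observations (for example, whether the baseline month is retained, or how the month of death is handled) is not stated in the text, so the rule here, keeping months that are exact multiples of k, is an assumption, and the function name is hypothetical.

```python
def thin_monthly(observations, k):
    """Keep observations from months k, 2k, 3k, ... (months numbered from 1)."""
    return [(month, value) for month, value in observations if month % k == 0]


# One hypothetical participant followed for all 129 months with no disability
full = [(month, 0) for month in range(1, 130)]

counts = {k: len(thin_monthly(full, k)) for k in (3, 6, 9, 12)}
print(counts)
```

Under this rule a participant with complete follow-up contributes 43, 21, 14, and 10 observations to the four subsets, compared with 129 in the full data.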
Results
Baseline Characteristics of Cohort
Important clinical and demographic characteristics of the cohort at baseline are presented in Table 1.
Table 1.
Baseline Characteristics of Participants in the Precipitating Events Project (N= 754)
| Characteristic | Statistic | Baseline Value (n=754) |
|---|---|---|
| Age in years | Mean (range) | 78.4 (70–96) |
| Female | Count (percent) | 487 (64.6) |
| Non-white race | Count (percent) | 72 (9.5) |
| Slow gait speed* | Count (percent) | 322 (42.7) |
*Dichotomous indicator representing a time > 10 seconds to complete a rapid gait test of walking 10 feet forward and 10 feet back as quickly as possible.
Odds Ratios and Hazards Ratios Estimated by Bayesian Joint and Separate Models
With regard to the modeling of ordinal levels of functional disability, plots of cumulative logits against the levels of each covariate were strongly parallel, indicating that the proportional odds assumption was not contradicted. With regard to the logistic models of monthly occurrence of death, the Hosmer-Lemeshow statistic indicated no significant lack of fit. In Table 2 we compare estimated ORs and HRs from the separate disability (equation 1) and survival (equation 2) models with those from their corresponding sub-models (equation 3), using the same covariates throughout. The OR estimates of each covariate tell us, on average, how much the odds of a worsened state of disability increase in a month’s time, relative to the covariate’s reference value. For binary indicators like slow gait and sex, these odds are relative to someone without the condition. For baseline age centered on its sample average, the OR tells us how much the odds of entering a worse state of disability go up for each year above the average age at baseline. Compared to results from the stand-alone disability model, all the ORs from the sub-model of longitudinal disability have similar point estimates and greatly overlapping credible intervals. Slow gait is the strongest predictor of disability, and, being the only time-varying covariate in the model, the most likely to be influenced by the longitudinal information accessed through the person-specific random intercept in the joint model. The fact that its association with longitudinal disability did not differ by modeling approach, i.e., separate versus joint, suggests that the influence of the survival model on the longitudinal estimates is negligible. We believe this is due to the much greater amount of information contributed by the longitudinal response, which can change every month, relative to that from death, which occurs once at most for each participant.
Table 2.
Associations with Ordinal Disability and Interval-Censored Survival from Bayesian Separate and Joint Models
| Model Terms | Separate Longitudinal Model of Ordinal ADL: Odds Ratio* (CI) | Jointly Estimated Longitudinal Sub-model of Ordinal ADL: Odds Ratio* (CI) | Separate Survival Model of Time to Death in Months: Hazard Ratio* (CI) | Jointly Estimated Survival Sub-model of Time to Death in Months: Hazard Ratio* (CI) |
|---|---|---|---|---|
| Age at Baseline Centered on Mean of 78.4 | 1.19 (1.16, 1.20) | 1.19 (1.16, 1.21) | 1.08 (1.05, 1.12) | 1.09 (1.06, 1.12) |
| Slow Gait (time varying+) | 4.66 (4.26, 5.05) | 4.66 (4.22, 5.10) | 2.39 (1.86, 3.06) | 1.40 (1.12, 1.79) |
| Non-white race | 1.31 (1.00, 1.72) | 1.31 (1.01, 1.70) | 0.80 (0.52, 1.25) | 0.93 (0.66, 1.30) |
| Female Sex | 1.39 (1.17, 1.65) | 1.38 (1.17, 1.62) | 0.58 (0.44, 0.75) | 0.65 (0.53, 0.80) |
| Natural Log of Months | 6.04 (5.88, 6.26) | 6.08 (5.88, 6.31) | N/A | N/A |
| Person-Specific Random Intercepts Mean (SD) | −0.02 (1.86) | −0.02 (1.88) | 0.00 (0.55) | −0.01 (0.64) |
| DIC (lower is better) | 49770 | 49718 | 5067 | 5004 |
*Statistical significance interpreted as a 95% credible interval exclusive of one
+Updated every 18 months
ADL = ordinal levels (0, 1, and 2) of disability in activities of daily living denoting none, mild, and severe functional disability
CI = 95% credible interval (Bayesian counterpart to confidence interval)
DIC = deviance information criterion
SD = standard deviation
Regarding survival, the HR estimates of each covariate tell us, on average, the risk of death in a month’s time, relative to the covariate’s reference value. The interpretation of the HRs for the binary covariates and age is similar to that of the ORs for disability. The separate model yields a hazard ratio (HR) of greater than 2 for slow gait (HR = 2.39, 95% credible interval: 1.86–3.06), suggesting that persons with Slow Gait, relative to persons with normal gait, have more than double the risk of death in any given month. However, when modeling survival jointly with disability, the HR for slow gait was only 1.40 (95% credible interval: 1.12–1.79), interpreted as 40% higher risk relative to those with normal gait. As the only time-varying covariate, Slow Gait’s association with risk of death was significantly reduced by the extra information made accessible by the joint model. The HRs for other covariates from both approaches exhibit inconsequential differences, with each year of age over the baseline mean associated with 8–9% higher risk of death and female sex being protective as expected.
The person-specific random intercepts from all models are also described in Table 2, showing means very close to zero in both pairs of separate and sub-models, and similar variances. Because a reduction between 3 and 7 in the value of the deviance information criterion (Spiegelhalter DJ, et al., 2002) is considered an improvement in model fit, each sub-model has better fit than its separately modeled counterpart. The joint model’s point estimate and 95% credible interval for r0, i.e. the loading of the person-specific intercept in the survival sub-model, are 0.34 (0.29, 0.38), indicating that worsening ADL disability is positively correlated with risk of death (Murphy TE, et al., 2011).
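The DIC comparison can be worked through directly from the values in Table 2. The sketch below also includes the standard definition of the DIC as the posterior mean deviance plus the effective number of parameters; the inputs to the `dic` function call are hypothetical, since the deviance components are not reported here.

```python
def dic(mean_deviance, deviance_at_posterior_mean):
    """DIC = Dbar + pD, where pD = Dbar - D(theta_bar)."""
    p_d = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + p_d


# DIC values reported in Table 2
table2_dic = {
    "disability": {"separate": 49770, "joint": 49718},
    "survival": {"separate": 5067, "joint": 5004},
}

# Reduction in DIC when moving from the separate model to the sub-model
reductions = {
    outcome: values["separate"] - values["joint"]
    for outcome, values in table2_dic.items()
}
print(reductions)

# Hypothetical deviance components, shown only to illustrate the formula
example_dic = dic(100.0, 90.0)
```

Both reductions (52 for disability and 63 for survival) are well beyond the range of 3 to 7 cited above as indicating improved fit.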
All of the separately and jointly estimated ORs and HRs reported in Table 2 are derived by drawing 5000 iterations from the posterior joint distribution of all parameters in a given model. Bayesian estimation assumes that the parameters of interest are not fixed values, but rather follow probability distributions that account for uncertainty regarding the exact value of any parameter’s point estimate. Each of the ORs and HRs in Table 2 therefore represents the mean of a unique probability distribution. Figure 1 plots the unique posterior probability distributions for each of the parameters from the survival sub-model. Note that each of these parameters follows a normal distribution and that the mean of each corresponds in value to the natural logarithm of the reported HR. This means that the reported Bayesian estimates are generally robust with respect to variation regarding the value of the parameters.
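The step from posterior draws of a log hazard ratio to a reported HR with a 95% credible interval can be sketched as follows. The draws here are simulated from a normal distribution whose mean and spread are chosen to mimic the jointly estimated HR for slow gait, 1.40 (1.12, 1.79); they are not the study's posterior draws, and the function name is hypothetical.

```python
import math
import random
import statistics


def summarize_log_hr(draws):
    """Return (HR, lower, upper): exp of the posterior mean and of the 2.5th/97.5th percentiles."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int(0.025 * (n - 1))]
    upper = ordered[int(0.975 * (n - 1))]
    return math.exp(statistics.fmean(ordered)), math.exp(lower), math.exp(upper)


# Simulated stand-in for 5000 posterior draws of the log HR
rng = random.Random(1)
mean_log_hr = math.log(1.40)
sd_log_hr = (math.log(1.79) - math.log(1.12)) / (2 * 1.96)
draws = [rng.gauss(mean_log_hr, sd_log_hr) for _ in range(5000)]

hr, ci_lower, ci_upper = summarize_log_hr(draws)
significant = ci_lower > 1.0 or ci_upper < 1.0   # interval excludes one
print(round(hr, 2), round(ci_lower, 2), round(ci_upper, 2), significant)
```

The significance rule applied in the last step is the one stated in the Methods: an estimate is interpreted as significant when its 95% credible interval excludes one.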
Figure 1.
Bayesian Posterior Probability Distributions of Log (Hazard Ratios) from Survival Sub-model of Joint Model of Monthly Disability and Death where ([1] = Age in Years, [2]= Frailty, [3]=Nonwhite, and [4]=Female Sex).
Credible Intervals of Individual Survival Curves from Bayesian Separate and Joint Models
We compared the credible intervals of some individual survival curves that include person-specific intercepts and were derived from the joint and separate modeling approaches. This is illustrated by comparing two individual women, W1 and W2, with the same fixed effect covariate values (white, 75 years old with slow gait at baseline). Because both women had negligible functional disability before month 90, their individual patterns of functional disability, starting in month 91, are presented immediately below the three graphic panels of Figure 2. W1 has recurring severe disability while W2 has only a few isolated months of mild disability. In panel A of Figure 2 the survival probabilities calculated by the separate model for the two women are presented with 95% credible intervals. In this panel the survival probabilities of W2 are consistently higher than those of W1, reflecting their respective patterns of ordinal disability after 90 months, and exhibit large variability. Whereas the survival sub-model receives monthly updates on functional disability from the longitudinal sub-model, the separate survival model is not informed regarding the longitudinal pattern of functional disability. Because higher levels of functional disability are positively associated with probability of death, functional disability is itself a powerful time-varying covariate for survival. It has been long established that the inclusion of informative, time-varying covariates generally improves the precision of model estimates (Rubin DB, 1974). For this reason, in panel B of Figure 2, survival probabilities for the same two women (W1 and W2) are calculated from a second, stand-alone survival model that includes monthly observations of functional disability as a time-varying covariate. Note that in this panel the two women’s survival probabilities are very close until month 80, after which they separate.
The timing of this separation coincides with the divergence of their patterns of functional disability. The credible intervals in panel B are narrower than those in panel A, but still exhibit variability high enough to preclude the detection of a statistically significant difference. In panel C of Figure 2, the survival probabilities of the same two women are calculated from the survival sub-model. The survival probabilities in panel C are higher for both women, and exhibit precision that allows their individual curves to become statistically different from 40 months onward. Examining the panels from left to right (A to B to C) reveals two trends. The first trend shows higher estimated survival probabilities for both women as the adjustment for functional disability becomes increasingly sophisticated. This is a consequence of the reduction in the estimated association for Slow Gait, which in turn raises the survival point estimates. The second trend is towards higher precision, i.e., narrower credible intervals. The latter trend has not been previously demonstrated, but our sensitivity analyses later in this article will demonstrate this to be a function of this data set’s large number of repeated measures of both outcomes.
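The link between posterior precision and the ability to distinguish two individual survival curves can be illustrated with a simplified sketch. It assumes a linear predictor that is constant over time for each person, so that S(t) = exp(−t·exp(η)), and it treats non-overlapping pointwise 95% credible intervals as the criterion for a significant difference; both are assumptions made for illustration, and all numerical values are hypothetical rather than estimates from the study.

```python
import math
import random


def percentile(sorted_vals, p):
    """Nearest-rank style percentile of an already sorted list."""
    return sorted_vals[int(round(p * (len(sorted_vals) - 1)))]


def survival_interval(eta_draws, month):
    """Pointwise 95% credible interval for S(month) with a time-constant predictor."""
    surv = sorted(math.exp(-month * math.exp(eta)) for eta in eta_draws)
    return percentile(surv, 0.025), percentile(surv, 0.975)


def intervals_overlap(a, b):
    return not (a[1] < b[0] or b[1] < a[0])


rng = random.Random(2)


def draws(mean, sd, n=2000):
    return [rng.gauss(mean, sd) for _ in range(n)]


month = 60
# Same posterior means for two hypothetical women, with low and high precision
wide_w1 = survival_interval(draws(-5.0, 0.6), month)
wide_w2 = survival_interval(draws(-5.8, 0.6), month)
narrow_w1 = survival_interval(draws(-5.0, 0.1), month)
narrow_w2 = survival_interval(draws(-5.8, 0.1), month)

overlap_wide = intervals_overlap(wide_w1, wide_w2)
overlap_narrow = intervals_overlap(narrow_w1, narrow_w2)
print(overlap_wide, overlap_narrow)
```

With identical point estimates, only the more precise posterior yields intervals that separate, which mirrors the contrast between the panels of Figure 2.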
Figure 2.
Individual Survival Curves for Two White Women (W1 and W2) Age 75 with Frailty at Baseline and Disparate Histories of Disability from Three Bayesian Models : separate monthly survival model with no adjustment for disability (A), separate monthly survival model with adjustment for disability (B), and joint model of monthly disability and death (C).
We conclude our examination of Figure 2 by discussing why we believe that Panel C provides the best estimates, i.e., least biased and most precise, of these pairs of individual survival curves. The deviance information criterion (DIC) was used to contrast relative fit of all models. The DIC is the Bayesian analog of the corrected Akaike Information Criterion (AIC) for comparison of mixed models (Akaike H, 1973), i.e., those with random effects, and its properties are well established in the literature (Spiegelhalter DJ, et al., 2002). Like the AIC, the DIC is the sum of minus 2 times the log likelihood and a penalty for higher numbers of model terms, meaning that lower values reflect better model fit. The DIC values for Panels A, B, and C are descending in order, indicating sequential improvement of fit. The decrease in DIC between Panels A and B from inclusion of functional disability as a time-varying covariate is small compared to the larger decrease attributable to the joint model. The joint modeling literature has confirmed that unbiased statistical inference, for both longitudinal and survival sub-models, is more likely obtained from the joint modeling framework (Aalen OO, 2012; Ibrahim J, et al., 2010; Tsiatis AA, et al., 1995; Wulfsohn MS & Tsiatis AA, 1997). The individual survival curves in Panel C also demonstrate better precision. For these reasons we believe that Panel C depicts the least biased and most precise individual survival curves for these two women.
Sensitivity Analyses
The longitudinal sub-model coefficients from the four reduced datasets are presented in Table 3. Two of the fixed-in-time covariates, i.e., age at baseline and female sex, shift a trivial amount in all cases. The other fixed covariate, i.e., non-white race, exhibits higher variance and a diminished association for all subsets relative to the full dataset. The instability of the racial association emphasizes one of the limitations of the full dataset, i.e., its low representation of non-white race (9%). Its longitudinal association is further attenuated by the increasing sparsity of the longitudinal data. As expected, the estimated association of the time-varying covariate (Slow Gait) is the most strongly affected by frequency of measurement. Slow Gait exhibits a monotonically increasing association with functional disability as the longitudinal observations become less frequent.
Table 3.
Odds-Ratios (95% Credible Intervals) from a Bayesian Longitudinal Model of Ordinal Functional Disability Jointly Estimated with Interval-Censored Survival Featuring Progressive Longitudinal Removal of Monthly Observations
| Variable | All Monthly Observations | Every Third Monthly Observation | Every Sixth Monthly Observation | Every Ninth Monthly Observation | Every Twelfth Monthly Observation |
|---|---|---|---|---|---|
| Age at Baseline in Years | 1.19 (1.16, 1.21) | 1.15 (1.13, 1.17) | 1.13 (1.11, 1.16) | 1.12 (1.09, 1.14) | 1.11 (1.09, 1.14) |
| Slow Gait (time varying+) | 4.66 (4.22, 5.10) | 6.27 (5.48, 7.13) | 7.04 (5.92, 8.40) | 7.31 (5.98, 8.94) | 7.61 (6.09, 9.55) |
| Non-White Race | 1.31 (1.01, 1.70) | 1.13 (0.84, 1.52) | 1.11 (0.80, 1.54) | 1.11 (0.78, 1.58) | 1.06 (0.73, 1.52) |
| Female Sex | 1.38 (1.17, 1.62) | 1.36 (1.12, 1.65) | 1.33 (1.08, 1.65) | 1.32 (1.04, 1.66) | 1.32 (1.03, 1.68) |
+Updated every 18 months
Functional Disability = ordinal levels (0, 1, and 2) of disability in activities of daily living denoting none, mild, and severe functional disability
The estimated coefficients from fitting the survival sub-model to the four reduced datasets are presented in Table 4. In comparing the full dataset and the four subsets, the three fixed covariates (age at baseline, non-white race, and female sex) exhibit minute changes in point estimates and variability. As in Table 3, the estimated associations of the time-varying covariate Slow Gait increase as observations become less frequent. Lastly, we observe that increasing the longitudinal sparsity of the data shifts the estimated HR (95% credible interval) for Slow Gait in the survival sub-model, e.g., 2.20 (1.74 – 2.78) with every twelfth monthly observation, toward that estimated by the separate survival model in Table 2 from all monthly data, i.e., 2.39 (1.86 – 3.06).
Table 4.
Hazard-Ratios (95% Credible Intervals) from an Interval-Censored Bayesian Survival Model Jointly Estimated with Ordinal Functional Disability Featuring Progressive Longitudinal Removal of Monthly Observations
| Variable | All Monthly Observations | Every Third Monthly Observation | Every Sixth Monthly Observation | Every Ninth Monthly Observation | Every Twelfth Monthly Observation |
|---|---|---|---|---|---|
| Age at Baseline in Years | 1.09 (1.06, 1.12) | 1.07 (1.05, 1.10) | 1.07 (1.05, 1.10) | 1.07 (1.04, 1.09) | 1.07 (1.04, 1.09) |
| Slow Gait (time varying+) | 1.40 (1.12, 1.79) | 1.97 (1.54, 2.52) | 2.04 (1.62, 2.60) | 2.21 (1.73, 2.81) | 2.20 (1.74, 2.78) |
| Non-White Race | 0.93 (0.66, 1.30) | 0.88 (0.62, 1.24) | 0.89 (0.62, 1.25) | 0.88 (0.61, 1.25) | 0.88 (0.60, 1.25) |
| Female Sex | 0.65 (0.53, 0.80) | 0.63 (0.52, 0.78) | 0.63 (0.51, 0.78) | 0.62 (0.51, 0.77) | 0.63 (0.51, 0.77) |
+Updated every 18 months
Functional Disability = ordinal levels (0, 1, and 2) of disability in activities of daily living denoting none, mild, and severe functional disability
The panels of Figure 3 depict the individual survival curves of the two women from the four subsets of data corresponding to every 3rd, 6th, 9th, and 12th monthly observation, respectively. We observe that the point estimates are very similar in all cases, reflecting the similarity of the corresponding coefficients in Table 3. The more meaningful distinction is that as longitudinal observations are increasingly separated, the variability of the women’s individual survival curves grows. Only the reduced dataset containing every 3rd monthly observation retains enough information to enable the statistical separation of the curves demonstrated by the full dataset in Figure 2. This implies that the gain in precision of individual survival curves from joint modeling is contingent upon having frequent measures of both outcomes.
Figure 3.
Individual Survival Curves from Two White Women (W1 and W2) Age 75 with Frailty at Baseline and Disparate Histories of Disability from Bayesian Joint Model of Monthly Disability and Death with Differing Frequencies of Follow-up Observations
Discussion
Largely ignored in the statistical and epidemiological literature is the effect of joint modeling on the precision of individual survival curves. Figures 2 and 3 suggest that in the case of many monthly observations, individual survival curves estimated from the survival sub-model, relative to those from separate survival models with or without monthly adjustment for functional disability, “borrow” information from the sub-model of longitudinal disability in a way that increases precision in the estimation of individual survival. To the authors’ knowledge, the improved precision for individual survival curves attributed here to the joint modeling approach has not been previously reported. The sensitivity analyses show that this benefit depends on the frequency of observations. Our illustrative example of individual survival curves showed that using only those observations at intervals ≥ 6 months precluded the detection of significant differences.
Although the literature has established joint modeling’s potential for elimination of bias in the associations between explanatory variables and a longitudinal outcome, we have observed that there is sometimes no change in those estimated associations subsequent to joint modeling. It is notable that in the widely cited example of Guo and Carlin, there is also little change demonstrated for the parameters associated with the longitudinal outcome (Guo X & Carlin BP, 2004). In our example, joint modeling did not yield any modification of the associations with the longitudinal outcome. There was, however, a meaningful change in the magnitude of the association of the time-varying covariate in the survival sub-model. Because functional disability and the time-varying predictor (Slow Gait) are each positively associated with the probability of death, the attenuation of Slow Gait’s HR for death in the survival sub-model was in part due to information contributed by the disability sub-model. Because the other model terms were not longitudinally informative of the outcomes, their associations were not affected by the sharing of information in the joint framework.
A major strength of this study is the large dataset with up to 129 monthly observations, very little missing data, and over 50% of the participants dying during the follow-up period. The limitations of this analysis include the fact that this is an observational study, implying that inference is not causal and that bias may have been introduced by an imbalance of unmeasured covariates. Furthermore, the dataset comprises older persons who were members of a single health plan in a small urban area; however, the generalizability of our results is enhanced by our high participation rate, which was greater than 75%. With regard to the national population of the U.S., this study population reflects the demographic characteristics of persons aged 65 years or older in New Haven County, which are comparable to those of the United States as a whole, with the exception of race. New Haven County has a larger proportion of non-Hispanic whites in this age group than the United States, i.e., 91% vs. 84% (American Fact Finder, Accessed May 29, 2003).
Conclusions
In this paper we have applied Bayesian joint modeling to monthly observations of functional disability and interval-censored survival among a cohort of community-dwelling older adults with up to 129 months of follow-up. Joint modeling has historically been used to demonstrate the existence of bias, or lack thereof, in the associations between covariates and either a longitudinal outcome or time to survival, when these two outcomes are correlated. We have demonstrated here another potential use, i.e., the generation of more precise individual survival curves by joint modeling in the presence of many repeated measures of both outcomes with negligible missing data. This improved precision allows for the potential discernment of significant differences in survival among persons with the same fixed risk factors for the longitudinal outcome. Because individual survival curves are usually characterized by large variability, the detection of these differences has been elusive. As the field of medicine strives to evaluate personalized medical regimens, higher-precision individual survival curves may prove very useful.
Acknowledgments
FUNDING
Support was received from the National Institute on Aging (R37AG17560 - Gill, Allore and Murphy; 1R21AG033130-01A2 - Murphy). The study was conducted at the Yale Claude D. Pepper Older Americans Independence Center (P30AG21342). Dr. Gill is the recipient of a Midcareer Investigator Award in Patient-Oriented Research (K24AG021507) from the National Institute on Aging. This publication was also made possible in part by CTSA Grant Number UL1 RR024139 (Murphy) from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH), and the NIH Roadmap for Medical Research.
We thank Denise Shepard, BSN, MBA, Andrea Benjamin, BSN, Paula Clark, RN, Martha Oravetz, RN, Shirley Hannan, RN, Barbara Foster, Alice Van Wie, BSW, Patricia Fugal, BS, Amy Shelton, MPH, and Alice Kossack for assistance with data collection; Evelyne Gahbauer, MD, MPH for data management and programming; Wanda Carr and Geraldine Hawthorne for assistance with data entry and management; Peter Charpentier, MPH for development of the participant tracking system; and Joanne McGloin, MDiv, MBA for leadership and advice as the PEP Project Director.
Footnotes
CONFLICT OF INTEREST STATEMENT
The authors have no conflicts relevant to this research.
References
- Aalen OO. Armitage Lecture 2010: Understanding treatment effects: the value of integrating longitudinal data and survival analysis. Statistics in Medicine. 2012;31(18):1903–1917. doi: 10.1002/sim.5324.
- Akaike H. Information theory and an extension of the maximum likelihood principle. Paper presented at the 2nd International Symposium on Information Theory; Budapest. 1973.
- Albert PS, Follmann DA. Modeling repeated count data subject to informative dropout. Biometrics. 2000;56:667–677. doi: 10.1111/j.0006-341x.2000.00667.x.
- Allison PD. Discrete-time methods for the analysis of event histories. Sociological Methods and Research. 1982;15:61–98.
- American Fact Finder. U.S. Census Bureau; [Accessed May 29, 2003]. http://factfinder.census.gov.
- Brooks SP, Gelman A. Alternative methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics. 1998;7:434–455.
- Cowling BJ, Hutton JL, Shaw JEH. Joint modelling of event counts and survival times. Journal Of The Royal Statistical Society Series C. 2006;55(1):31–39.
- Cox DR, Oakes D. Analysis of Survival Data. Chapter 7. London: Chapman & Hall; 1984.
- Faucett C, Schenker N, Elashoff RM. Analysis of censored survival data with intermittently observed time-dependent binary covariates. Journal of the American Statistical Association. 1998;93:427–437.
- Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models (Section 8.4) Cambridge University Press; 2007.
- Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences (with discussion) Statistical Science. 1992;7:457–511.
- Gill TM, Allore HG, Gahbauer EA, Murphy TE. Change in disability after hospitalization or restricted activity in older persons. Journal of the American Medical Association. 2010;304(17):1919–1928. doi: 10.1001/jama.2010.1568.
- Gill TM, Desai MM, Gahbauer EA, Holford TR, Williams CS. Restricted activity among community-living older persons: incidence, precipitants and health care utilization (with editorial) Annals of Internal Medicine. 2001;135:313–321. doi: 10.7326/0003-4819-135-5-200109040-00007.
- Gill TM, Gahbauer EA, Allore HG, Han L. Transitions between frailty states among community-living older persons. Archives of Internal Medicine. 2006;166:418–423. doi: 10.1001/archinte.166.4.418.
- Gill TM, Gahbauer EA, Han L, Allore HG. Trajectories of disability in the last year of life. New England Journal of Medicine. 2010;362(13):1173–1180. doi: 10.1056/NEJMoa0909087.
- Guo X, Carlin BP. Separate and joint modeling of longitudinal and event time data using standard computer packages. The American Statistician. 2004;58(1):16–24. doi: 10.1198/0003130042854.
- Hardy SE, Gill TM. Recovery from disability among community-dwelling older persons. Journal of the American Medical Association. 2004;291:1596–1602. doi: 10.1001/jama.291.13.1596.
- Henderson RJ, Diggle PJ, Dobson A. Joint modelling of longitudinal measurements and event time data. Biostatistics. 2000;1:465–480. doi: 10.1093/biostatistics/1.4.465.
- Hu W, Mengersen K, Tong S. Spatial analysis of notified cryptosporidiosis infections in Brisbane, Australia. Annals of Epidemiology. 2009;19(12):900–907. doi: 10.1016/j.annepidem.2009.06.004.
- Ibrahim J, Chu H, Chen LM. Basic concepts and methods for joint models of longitudinal and survival data. Journal of Clinical Oncology. 2010;28(16):2796–2801. doi: 10.1200/JCO.2009.25.0654.
- Lin I-F, Chang WP, Liao Y-N. Shrinkage methods enhanced the accuracy of parameter estimation using Cox models with small number of events. Journal of Clinical Epidemiology. 2013;66:743–751. doi: 10.1016/j.jclinepi.2013.02.002.
- Lunn D, Spiegelhalter D, Thomas A, Best N. The BUGS project: Evolution, critique and future directions (with discussion) Statistics in Medicine. 2009;28:3049–3082. doi: 10.1002/sim.3680.
- Murphy TE, Han L, Allore HG, Peduzzi PN, Gill TM, Lin H. Treatment of death in the analysis of longitudinal studies of gerontological outcomes. Journals of Gerontology Series A: Biological Sciences and Medical Sciences. 2011;66A(1):109–114. doi: 10.1093/gerona/glq188.
- Ning Y, McAvay G, Chaudhry SI, Arnold A, Allore HG. Results differ by applying distinctive multiple imputation approaches on the longitudinal Cardiovascular Health Study data. Experimental Aging Research. 2013;39(1):27–43. doi: 10.1080/0361073X.2013.741968.
- Prentice RL, Kalbfleisch JD, Peterson AV, Flournoy N, Farewell VT, Breslow NE. The analysis of failure times in the presence of competing risks. Biometrics. 1978;34:541–554.
- Rizopoulos D, Verbeke G, Lesaffre E, Vanrenterghem Y. A two-part joint model for the analysis of survival and longitudinal binary data with excess zeros. Biometrics. 2008;64(2):611–619. doi: 10.1111/j.1541-0420.2007.00894.x.
- Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology. 1974;66(5):688–701.
- Spiegelhalter DJ, Best NJ, Carlin BP, van der Linde A. Bayesian measures of model complexity and fit (with discussion) Journal of the Royal Statistical Society B. 2002;64:583–640.
- Symons JM, Le HQ, Kreckman KH, Sakr CJ, Lednar WM. A Bayesian approach to occupational mortality surveillance. Annals of Epidemiology. 2009;19(9)
- Tsiatis AA, Degruttola V, Wulfsohn MS. Modeling the relationship of survival to longitudinal data measured with error, applications to survival and CD4 counts in patients with AIDS. Journal of the American Statistical Association. 1995;90:27–37.
- Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics. 1997;53:330–339.
- Ye W, Lin X, Taylor JMG. Semiparametric modeling of longitudinal measurements and time-to-event data - a two-stage regression calibration approach. Biometrics. 2008;64:1238–1246. doi: 10.1111/j.1541-0420.2007.00983.x.



