Abstract
Objective: To create personalized estimates of future health and ability status for older adults. Method: Data came from the Cardiovascular Health Study (CHS), a large longitudinal study. Outcomes included years of life, years of healthy life (based on self-rated health), years of able life (based on activities of daily living), and years of healthy and able life. We developed regression estimates using the demographic and health characteristics that best predicted the four outcomes. Internal and external validity were assessed. Results: A prediction equation based on 11 variables accounted for about 40% of the variability for each outcome. Internal validity was excellent, and external validity was satisfactory. The resulting CHS Healthy Life Calculator (CHSHLC) is available at http://healthylifecalculator.org. Conclusion: CHSHLC provides a well-documented estimate of future years of healthy and able life for older adults, who may use it in planning for the future.
Keywords: demography, mortality, active life/physical activity, clinical gerontology, decision-making, life expectancy/longevity, quality of life
Objectives
Older adults often need to make decisions about the future, such as relocation from their current home. Those who expect a long and healthy life may plan for an active retirement and consider a resort community. Those with worse prospects may choose instead to move near their children or to a retirement community with assisted care. There are no documented tools to provide older adults with a personalized estimate of how many healthy and physically able years they may anticipate.
U.S. lifetables (U.S. Lifetable from Social Security Administration, n.d.) show the expected number of additional years of life (YOL), based on a person’s age and sex, but they do not incorporate health characteristics. There are no well-documented tools for estimating a person’s future years of healthy life (YHL), or years in which they will be able to perform basic activities of daily living (YABL).
Our objective was to develop useful and accessible estimates of future YOL, YHL, YABL, and years of healthy and able life (YHABL), based on data from the Cardiovascular Health Study (CHS), a large longitudinal study of persons aged 65 to 99 at baseline. This manuscript describes the process of creating and evaluating the CHS Healthy Life Calculator (CHSHLC). Additional detail is available in an online working paper (Diehr et al., 2015).
Method
Data
Description of the CHS
The CHS, funded by the National Heart, Lung, and Blood Institute, recruited 5,201 older adults in 1990 from Medicare eligibility lists in four U.S. communities. Persons who used wheelchairs at home, were under treatment for cancer, or were not expected to participate for 3 years after baseline were ineligible. More details about the study design can be found in Fried et al. (1991). CHS followed enrollees’ health from baseline in 1990 to the analysis date (2013), providing 23 years of follow-up. A second cohort of 687 African Americans began in 1993 and now has 20 years of follow-up. Participants were contacted every 6 months and were seen in the field centers annually through 1999, and again in 2005-2006. Hundreds of health-related variables were collected at baseline and at the annual clinic visits, and a small number were collected annually or semi-annually by phone throughout follow-up.
Dependent variables
Two health-related variables were measured every year after baseline. Self-rated health was a single question: “Is your health excellent, very good, good, fair, or poor?” and “Healthy” was defined as being in excellent, very good, or good health (as opposed to fair or poor health). ADLs were defined as self-reported difficulty in walking around the house, getting out of a bed or chair, feeding, dressing or bathing oneself, and getting to and using the toilet. A person who had no difficulties with any of those activities was defined as “Able.” We summed the number of years when a person was Alive (YOL), Healthy (YHL), Able (YABL), and both Healthy and Able (YHABL); (Diehr et al., 1995). These variables have been used as outcomes in other CHS publications (Diehr et al., 2008; Diehr, Patrick, Bild, Burke, & Williamson, 1998; Diehr, Thielke, O’Meara, Fitzpatrick, & Newman, 2012; Hirsch et al., 2010; Locke et al., 2013; Longstreth, Diehr, Yee, Newman, & Beauchamp, 2014). (Population average YABL is conceptually similar to “active life expectancy,” which is mentioned further in the “Method” section.)
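The tallying of the four outcomes can be sketched as follows. This is a minimal illustration only: the one-record-per-year layout and the names `YearStatus` and `summarize` are assumptions made for the example, not the CHS data format.

```python
from dataclasses import dataclass


@dataclass
class YearStatus:
    alive: bool
    healthy: bool  # self-rated health excellent, very good, or good
    able: bool     # no difficulty with any ADL


def summarize(years):
    """Count years Alive (YOL), Healthy (YHL), Able (YABL), and both (YHABL)."""
    alive_years = [y for y in years if y.alive]
    return {
        "YOL": len(alive_years),
        "YHL": sum(y.healthy for y in alive_years),
        "YABL": sum(y.able for y in alive_years),
        "YHABL": sum(y.healthy and y.able for y in alive_years),
    }


# Hypothetical person: three years alive, then dead.
history = [
    YearStatus(alive=True, healthy=True, able=True),
    YearStatus(alive=True, healthy=True, able=False),
    YearStatus(alive=True, healthy=False, able=False),
    YearStatus(alive=False, healthy=False, able=False),
]
print(summarize(history))
```

Because a year counts toward YHABL only when both flags are set, YHABL can never exceed YHL or YABL.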
Missing self-reported health and ADL data were imputed by linear interpolation of each person’s observed values over time. In brief, available data were transformed to a scale that included a value for death. Missing values were linearly interpolated over time for each person, and the resulting variables were transformed back to the original scale. Details are available elsewhere (Diehr, 2013). About 14.4% of the self-rated health data and about 28.9% of the ADL data had to be imputed. The latter number was larger because ADL was not collected in the necessary format from 2000 to 2004, and all had to be imputed.
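The interpolation step alone can be sketched as below. The 0-100 scale with 0 standing for death is an assumption made for illustration; the actual transformation and back-transformation are described in Diehr (2013) and are not reproduced here.

```python
def interpolate_missing(values):
    """Fill None entries by linear interpolation between the nearest observed
    neighbours. Leading or trailing gaps are left as None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


# Hypothetical yearly scores on an assumed 0-100 scale (0 = dead, assumption).
scores = [80, None, None, 30, None, 0]
filled_scores = interpolate_missing(scores)
print(filled_scores)
```

Including a value for death on the scale is what lets a gap just before a death be filled in the same way as a gap between two interviews.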
For the 85% of enrollees who died between 1990 and 2013, the observed data were complete. We estimated the additional years for the remaining 15%. For example, for a person who was 65 at baseline and still alive 20 years later, the number of remaining years was estimated from persons who were age 85 and of the same sex, Healthy and Able status at baseline (Diehr et al., 2015, Appendix 1). These estimates were added to the sum of the observed data to provide lifetime data for everyone. The lifetime sums were the outcome variables for the analyses.
Potential predictor variables
CHS collected hundreds of potential predictor variables at the baseline intake. We restricted analysis to about 200 variables that had almost no missing values at baseline and that could easily be reported by the user. These requirements excluded laboratory test results, clinic measurements, and lengthy questionnaires. Space limitations do not permit listing all the variables, but they included measures of personal history, medical history, physical function, cognitive function, physical activity, social support, quality of life, and stressful life events (Diehr et al., 2015, Appendix 7).
Waves of data
For this analysis, we created four waves of data, where Wave 0 consisted of the baseline year and 20 years of follow-up for both cohorts. Wave 1, for the first cohort only, started 1 year after baseline and had 20 years of follow-up from Year 1 to 21, and similarly for Waves 2 and 3 which started 2 and 3 years after the first cohort’s baseline, respectively, and included 20 years of follow-up. There were thus 5,201 × 4 + 687 = 21,491 potential waves; because some enrollees died in the first 3 years, there were actually 20,876 waves. This approach allowed us to use all the data while maintaining the same number of years of follow-up for both cohorts, increased the number of the oldest persons available for analysis, and potentially reduced the likelihood of “healthy volunteer bias” because only about a fourth of the waves started at the true baseline. The disadvantage is that observations were not statistically independent (most persons were in the data set 4 times). As described below, that was handled by restricting analyses where independence was required to a single wave of data.
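The wave arithmetic above can be checked with a few lines. The constant names are illustrative; the counts are the ones reported in the text.

```python
COHORT1_SIZE = 5201      # original cohort, contributes Waves 0 to 3
COHORT2_SIZE = 687       # second cohort, contributes Wave 0 only
WAVES_FOR_COHORT1 = 4

potential_waves = COHORT1_SIZE * WAVES_FOR_COHORT1 + COHORT2_SIZE
observed_waves = 20876   # reported number of waves actually available

# Waves that never started because the enrollee died in the first 3 years.
waves_lost = potential_waves - observed_waves
print(potential_waves, waves_lost)
```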
Analysis
Statistical methods
Quantities such as healthy life expectancy and active life expectancy are usually estimated from transition probability data, using multi-state lifetable methods (Crimmins & Saito, 2001; Diehr et al., 2008; Rogers, Rogers, & Branch, 2001). Here, however, we have lifetime data on the outcomes for each CHS enrollee. This allowed us to use the more flexible multiple regression methods, with the person as the unit of analysis, to screen the data and create the estimation equations.
Selection of predictor variables
The goal was to predict YOL, YHL, YABL, and YHABL for a person with certain attributes. The prediction equations, separate for men and women, needed to include age and the baseline values of Healthy and Able. We next screened the eligible baseline variables to identify a small set of variables that improved the predictions. The variables were screened in two stages. The first stage screened the 200 or so potential variables, as described below and listed in Diehr et al. (2015, Appendix Table 7.1). The second stage screened only variables that users might expect to be included (see below), to improve the face validity of the eventual calculator. Stepwise multiple regressions were used for screening.
Screening for strong predictors
The first screening forced baseline age, Healthy and Able into the regression, and then performed a forward selection regression among all of the remaining eligible baseline variables, with an alpha to enter of 0.0001. This screening used only Wave 0 data, so that observations were statistically independent and the significance levels had some meaning. Variables that were selected in all eight of the regressions (four outcomes for two sexes) were retained. The likelihood of false discovery was limited by the small alpha level and the requirement that each predictor be selected for both men and women.
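The retention rule (keep a variable only if forward selection picked it in all eight regressions) amounts to a set intersection. The sketch below shows only that rule, not the regressions themselves; the variable names and the extra `bed_days` pick are hypothetical inputs chosen for illustration.

```python
from functools import reduce


def retained_in_all(selected_by_regression):
    """Return the variables that appear in every regression's selected set."""
    sets = [set(v) for v in selected_by_regression.values()]
    return reduce(set.intersection, sets)


outcomes = ["YOL", "YHL", "YABL", "YHABL"]
core = {"smoking", "short_of_breath", "diabetes", "n_meds"}

# Hypothetical selection results: one set per (sex, outcome) regression.
selected = {(sex, out): set(core) for sex in ("women", "men") for out in outcomes}
selected[("men", "YOL")].add("bed_days")  # hypothetical pick in one regression only

retained = retained_in_all(selected)
print(sorted(retained))
```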
Screening to improve face validity
The second screening forced in the regression variables chosen above, and then performed a forward selection among variables commonly associated with mortality, self-reported health, or functional status in CHS, even though they were not selected in the first screen (that is, their p value to enter was not less than 0.0001). A less stringent alpha level of .01 was used and the following variables were re-considered in the second screen: bed days in the past 2 weeks, blocks walked in the previous week, hospitalization in the previous year, myocardial infarction (MI), stroke, feeling about life as a whole, number of difficulties with instrumental activities of daily living (IADL), previous angioplasty, coronary bypass surgery, current diagnosis of cancer, taking insulin or hypoglycemic agents, renal disease or failure, and body mass index. Variables were retained if they were selected in all or most of the eight regressions. This screen was restricted to the Wave 3 data (which began 3 years after baseline) to ensure statistical independence and to reduce the chance of healthy cohort bias. The variables selected at this stage were included in the main prediction equation.
The final prediction equations were calculated using all waves of data, because statistical independence was no longer an issue and the larger sample improved the estimation at the oldest ages.
Internal and External Validation
Internal validation involved random assignment of 80% of the enrollees into a “training” sample and the remaining 20% into a “validation” sample. The two-stage variable screening was repeated in the training sample only, and the resulting prediction equations were applied to the validation sample. The root mean squared error (RMSE), defined as the square root of the average squared difference between observed and predicted values, was calculated. This process addressed the issue of over-fitting because the validation sample was not used in creating the prediction equations. Note that this type of validation does not test the specific variables chosen or the regression coefficients, but rather whether the methods used to create the estimates provided good estimates for the validation sample. We also calculated the percentage of estimates that were within ±5 years of the observed values, and the percentage within ±3 years.
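The two accuracy summaries can be computed as in the sketch below. The function names and the four observed and predicted values are made up for illustration; they are not CHS results.

```python
import math


def rmse(observed, predicted):
    """Square root of the average squared difference."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pct_within(observed, predicted, tolerance):
    """Percentage of predictions within +/- tolerance years of the observed value."""
    hits = sum(abs(o - p) <= tolerance for o, p in zip(observed, predicted))
    return 100.0 * hits / len(observed)


# Made-up validation data (years).
observed = [12.0, 3.0, 20.0, 8.0]
predicted = [10.0, 9.0, 16.0, 8.0]

print(rmse(observed, predicted))
print(pct_within(observed, predicted, 5), pct_within(observed, predicted, 3))
```

RMSE penalizes large misses heavily, whereas the "within ±5 years" percentage simply counts how often the estimate lands close to the observed value.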
The external validation used two outside sources of data: the current U.S. lifetable (2015) and unpublished data from a different cohort study. The life expectancies from the current U.S. lifetable are estimates of YOL. We compared the lifetable with the CHS estimates of YOL, and also to the observed data. There are no national estimates of YHL, and we found no study that was strictly comparable with CHS. Instead, we used unpublished data from the Multi-Ethnic Study of Atherosclerosis (MESA), also funded by the National Heart, Lung, and Blood Institute (NHLBI; Protocol for the MESA, 2002). MESA enrollees, who were required to be free of heart disease at baseline, have been followed for 10 years to date. Self-rated health was collected at each survey wave. Using the approach outlined above, we created new prediction equations for 10-year YOL and YHL in CHS, limited to variables that were available in both CHS and MESA, plus a variable indicating heart disease that was set to 0 for all MESA enrollees (see “Results” section). We applied the new CHS equations to the MESA enrollees aged 65 and older, and compared the mean observed and predicted values.
Creation, Documentation, and Beta Testing of the CHSHLC
We created a web-based calculator (the CHSHLC) that asked the user to provide the information for the prediction equations, and then calculated the user’s lifetime expected values. The web pages include documentation in a frequently asked question (FAQ) format. Three convenience samples of older adults were invited to use the calculator and provide feedback. After each wave, we modified the calculator to reflect the user comments. Users are now required to acknowledge that the results may not predict their personal experience, to further discourage them from over-interpreting the results (Diehr et al., 2015, Appendix 2).
Results
Predictor Variables Chosen for the CHSHLC
Descriptive statistics for the four outcome variables (YOL, YHL, YABL, and YHABL) are shown in the first four lines of Table 1. Figure 1 is a histogram of YHABL. Although mean YHABL was 7.18 years, YHABL ranged from 0 to 30. Histograms for the other three outcomes are in Diehr et al. (2015, Appendix 1).
Table 1.
Variable | Baseline wave, women | Baseline wave, men | All waves, women | All waves, men | SD (all) |
---|---|---|---|---|---|
Sample size | 3,393 | 2,495 | 12,047 | 8,829 | |
YOL, years | 14.29 | 11.52 | 13.43 | 10.69 | 7.25 |
YHL, years | 9.71 | 8.36 | 9.11 | 7.69 | 6.66 |
YABL, years | 9.96 | 8.88 | 9.21 | 8.04 | 6.92 |
YHABL, years | 7.47 | 6.79 | 6.87 | 6.12 | 6.32 |
Age, years | 72.52 | 73.28 | 73.80 | 74.62 | 5.61 |
Healthy at baseline * | 0.73 | 0.76 | 0.76 | 0.77 | 0.42 |
Able at baseline * | 0.90 | 0.94 | 0.87 | 0.91 | 0.32 |
Healthy and Able at baseline * | 0.70 | 0.74 | 0.70 | 0.73 | 0.45 |
Short of breath * | 0.42 | 0.34 | 0.41 | 0.32 | 0.48 |
Diabetes * | 0.11 | 0.14 | 0.10 | 0.14 | 0.32 |
Number of prescription meds | 2.48 | 2.17 | 2.55 | 2.29 | 2.23 |
Current smoker * | 0.12 | 0.11 | 0.12 | 0.09 | 0.31 |
Former smoker * | 0.30 | 0.57 | 0.30 | 0.58 | 0.49 |
Never smoked * | 0.57 | 0.32 | 0.58 | 0.33 | 0.50 |
Years since quittingᵃ | 19.08 | 22.62 | 19.01 | 22.81 | 13.70 |
Blocks walked per week | 32.00 | 49.38 | 29.57 | 45.85 | 52.59 |
Number of IADL difficulties | 0.44 | 0.26 | 0.49 | 0.36 | 0.90 |
Feeling about life (1-7) | 2.36 | 2.17 | 2.46 | 2.31 | 0.91 |
MI or stroke * | 0.12 | 0.24 | 0.12 | 0.24 | 0.38 |
Note. Table entries are mean values unless otherwise denoted. “*” indicates the proportion who have the indicated characteristic. YOL = years of life; YHL = years of healthy life; YABL = years of able life; YHABL = years of healthy and able life; IADL = Instrumental activities of daily living; MI = myocardial infarction.
ᵃ Former smokers only.
The predictor variables were chosen in several stages, as previously described. Analyses were done separately for men and women. In the first stage, baseline age was included both as a linear and a log term, to allow the relationship to be non-linear where it was warranted. For baseline self-reported health, we included both the binary “Healthy” variable (1 if excellent, very good, or good; 0 if fair or poor) and also a recode of excellent through poor to 95, 90, 80, 30, and 15, respectively (Diehr et al., 2001). Baseline Able was coded as 0 if the person had difficulty with any of the ADLs, and 1 otherwise. (CHS had relatively few enrollees with 2 or more ADL difficulties.) Baseline HABLE was coded as 1 if the person was both Healthy and Able, 0 otherwise.
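The baseline coding described above can be written compactly. The function and dictionary names are illustrative assumptions; the scores 95, 90, 80, 30, and 15 are the recode given in the text.

```python
# Recode of self-rated health to a 0-100 scale, as given in the text.
SRH_SCORE = {"excellent": 95, "very good": 90, "good": 80, "fair": 30, "poor": 15}


def encode_baseline(self_rated_health, n_adl_difficulties):
    """Return the four baseline health terms used in the regressions."""
    healthy = 1 if self_rated_health in ("excellent", "very good", "good") else 0
    able = 1 if n_adl_difficulties == 0 else 0
    return {
        "healthy": healthy,
        "srh": SRH_SCORE[self_rated_health],
        "able": able,
        "hable": healthy * able,  # 1 only if both Healthy and Able
    }


print(encode_baseline("good", 0))
print(encode_baseline("fair", 2))
```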
The first screen of about 200 baseline variables selected four predictors: smoking, shortness of breath, diabetes, and number of prescription drugs. Smoking history was coded as current smoker, quit <5 years ago, 5 to 9 years, 10 to 14 years, 15 to 19 years, or 20+ years ago. (Never smoked was the reference category.) Shortness of breath (coded 1 for yes, 0 for no) was based on self-report of the symptom when hurrying on the level or walking up a slight hill. Diabetes was coded 1 for persons whose doctor had told them they had diabetes and 0 otherwise. The number of prescription drugs was included on the logarithm scale to reduce the impact of outliers.
The second screen, intended to improve face validity, chose four more variables: a history of MI or stroke, blocks walked in the last week, IADL, and feeling about life as a whole. MI and stroke were combined into a single question in the calculator. Number of blocks walked in the last week (used on the log scale) is a simple measure of physical activity. IADLs (used on the log scale) were defined as difficulty with housework, shopping, meal preparation, money management, or using the telephone. Feeling about life as a whole, rated from delighted (1) to terrible (7), was not as strong a predictor as the others (it was not selected for all eight regressions).
The descriptive statistics for the four outcomes and the variables selected for the calculator are in Table 1. The first two columns are for Wave 0 (true baseline) only, and columns 3 and 4 show Waves 0 to 3 combined. YOL through YHABL are the dependent variables; for example, in the complete data set, women averaged 13.43 YOL but only 6.87 YHABL. The averages for men were a little lower. Mean age at baseline (for all waves combined) was 73.8 for women and 74.6 for men. Only 48 enrollees were aged 90 or older at the true baseline, but the extra waves of data provided a total of 245 persons over 90 for analysis (data not shown). Means for the covariates are also shown. (For binary covariates, the mean is the proportion who have the characteristic.)
Predictions
The proportion of variability explained, R², was .37 for YOL, and .41, .40, and .41 for YHL, YABL, and YHABL, respectively. In the sex-specific regressions, age alone accounted for about 17% of the variability, baseline Healthy and Able for another 13%, the Screen 1 variables for 5% or 6%, and the Screen 2 variables for another 2% or 3% (Diehr et al., 2015, Appendix 3).
The eight regression equations are shown in Table 2. “Coeff” is the regression coefficient and p is the significance level in the final equation. The coefficients should not be over-interpreted because the variables were chosen as the most significant predictors rather than based on theory. The study goal was prediction, not interpretation. The coefficient for age is not easily interpretable because ln(age) is also in the equation. Similarly, Healthy (binary) and self-rated health are both included, as are Able and “Healthy and Able.” None of those coefficients is directly interpretable because of multicollinearity. Three of the remaining variables were used on the log scale (ln[IADL + 1], ln[blocks walked + 1], and ln[number of medications + 1]), also making their coefficients difficult to interpret directly.
Table 2.
Coefficients for women
Variable | YOL Coeff | YOL p | YHL Coeff | YHL p | YABL Coeff | YABL p | YHABL Coeff | YHABL p |
---|---|---|---|---|---|---|---|---|
(Constant) | 357.818 | .000 | 523.610 | .000 | 646.474 | .000 | 544.266 | .000 |
Age | 0.651 | .002 | 1.523 | .000 | 1.918 | .000 | 1.686 | .000 |
ln(age) | −91.579 | .000 | −146.703 | .000 | −181.743 | .000 | −154.581 | .000 |
Healthy | −1.070 | .029 | −3.225 | .000 | −2.855 | .000 | −4.555 | .000 |
SRH (0-100) | 0.052 | .000 | 0.111 | .000 | 0.059 | .000 | 0.094 | .000 |
Able | 0.269 | .307 | −0.709 | .004 | 1.106 | .000 | −0.632 | .005 |
HABLE | −0.051 | .881 | 1.864 | .000 | 1.737 | .000 | 3.330 | .000 |
Shortness of breath | −0.590 | .000 | −1.173 | .000 | −1.044 | .000 | −1.280 | .000 |
Diabetes | −1.993 | .000 | −1.923 | .000 | −1.721 | .000 | −1.521 | .000 |
ln(number of meds) | −0.606 | .000 | −0.834 | .000 | −0.907 | .000 | −0.975 | .000 |
Current smoker | −3.479 | .000 | −3.141 | .000 | −2.841 | .000 | −2.505 | .000 |
Quit <5 years | −2.222 | .000 | −2.338 | .000 | −1.813 | .000 | −1.613 | .000 |
Quit 5-9 years | −1.969 | .000 | −1.630 | .000 | −1.588 | .000 | −1.238 | .000 |
Quit 10-14 years | −1.669 | .000 | −1.563 | .000 | −1.635 | .000 | −1.269 | .000 |
Quit 15-19 years | −1.596 | .000 | −1.228 | .000 | −1.202 | .000 | −1.009 | .000 |
Quit 20+ years | −0.755 | .000 | −0.406 | .005 | −0.598 | .000 | −0.462 | .000 |
ln(blocks + 1) | 0.381 | .000 | 0.351 | .000 | 0.487 | .000 | 0.402 | .000 |
ln(number IADL difficulties) | −1.026 | .000 | −0.935 | .000 | −1.650 | .000 | −1.162 | .000 |
Feeling about life as a whole | −0.175 | .006 | −0.387 | .000 | −0.110 | .063 | −0.305 | .000 |
MI or stroke | −2.139 | .000 | −1.592 | .000 | −1.403 | .000 | −1.006 | .000 |
Coefficients for men
Variable | YOL Coeff | YOL p | YHL Coeff | YHL p | YABL Coeff | YABL p | YHABL Coeff | YHABL p |
---|---|---|---|---|---|---|---|---|
(Constant) | 300.339 | .000 | 405.229 | .000 | 468.808 | .000 | 422.927 | .000 |
Age | 0.518 | .022 | 1.092 | .000 | 1.253 | .000 | 1.201 | .000 |
ln(age) | −76.107 | .000 | −111.684 | .000 | −128.981 | .000 | −117.919 | .000 |
Healthy | −0.475 | .402 | −3.333 | .000 | −2.641 | .000 | −4.445 | .000 |
SRH (0-100) | 0.035 | .000 | 0.099 | .000 | 0.051 | .000 | 0.084 | .000 |
Able | 0.053 | .871 | −0.976 | .001 | 1.062 | .001 | −0.653 | .019 |
HABLE | 0.255 | .545 | 1.744 | .000 | 1.816 | .000 | 3.216 | .000 |
Shortness of breath | −0.635 | .000 | −1.118 | .000 | −0.845 | .000 | −1.082 | .000 |
Diabetes | −1.643 | .000 | −1.611 | .000 | −1.609 | .000 | −1.473 | .000 |
ln(number of meds) | −1.263 | .000 | −1.155 | .000 | −1.214 | .000 | −1.044 | .000 |
Current smoker | −3.631 | .000 | −3.179 | .000 | −3.250 | .000 | −2.825 | .000 |
Quit <5 years | −2.731 | .000 | −2.361 | .000 | −2.131 | .000 | −1.992 | .000 |
Quit 5-9 years | −2.652 | .000 | −2.265 | .000 | −2.205 | .000 | −1.902 | .000 |
Quit 10-14 years | −1.728 | .000 | −1.169 | .000 | −1.406 | .000 | −1.030 | .000 |
Quit 15-19 years | −1.032 | .000 | −0.795 | .000 | −1.199 | .000 | −0.959 | .000 |
Quit 20+ years | −0.436 | .002 | −0.511 | .000 | −0.521 | .000 | −0.508 | .000 |
ln(blocks + 1) | 0.388 | .000 | 0.281 | .000 | 0.389 | .000 | 0.284 | .000 |
ln(number IADL difficulties) | −1.205 | .000 | −0.845 | .000 | −1.181 | .000 | −0.834 | .000 |
Feeling about life as a whole | −0.269 | .000 | −0.411 | .000 | −0.180 | .007 | −0.329 | .000 |
MI or stroke | −1.806 | .000 | −1.442 | .000 | −1.466 | .000 | −1.224 | .000 |
Note. YOL = years of life; YHL = years of healthy life; YABL = years of able life; YHABL = years of healthy and able life; IADL = Instrumental activities of daily living; MI = myocardial infarction; SRH = Self-Rated Health.
The remaining coefficients are more easily interpretable. For example, for women, shortness of breath was associated with 0.6 fewer YOL and 1.2 fewer YHL, after controlling for the other variables in the equation. Variables were highly statistically significant with a few exceptions that can be attributed to multicollinearity. This is not surprising, given the way the variables were chosen.
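To show how a prediction equation is applied, the sketch below evaluates the women's YOL equation using the coefficients from Table 2. Several details are assumptions made for the example: the "+ 1" inside each logarithm (as described in the text), never-smoker as the zero reference category, feeling about life entered as its raw score, and all function and key names. The published calculator may apply further steps not shown here.

```python
import math

# Women's YOL coefficients copied from Table 2.
WOMEN_YOL = {
    "constant": 357.818, "age": 0.651, "ln_age": -91.579,
    "healthy": -1.070, "srh": 0.052, "able": 0.269, "hable": -0.051,
    "short_of_breath": -0.590, "diabetes": -1.993, "ln_meds": -0.606,
    "ln_blocks": 0.381, "ln_iadl": -1.026, "feeling": -0.175,
    "mi_or_stroke": -2.139,
}
WOMEN_YOL_SMOKING = {
    "never": 0.0, "current": -3.479, "quit_lt5": -2.222, "quit_5_9": -1.969,
    "quit_10_14": -1.669, "quit_15_19": -1.596, "quit_20_plus": -0.755,
}


def predict_yol_woman(age, healthy, srh, able, short_of_breath, diabetes,
                      n_meds, smoking, blocks, n_iadl, feeling, mi_or_stroke):
    """Linear predictor for years of life (women), under the assumptions above."""
    c = WOMEN_YOL
    return (c["constant"] + c["age"] * age + c["ln_age"] * math.log(age)
            + c["healthy"] * healthy + c["srh"] * srh + c["able"] * able
            + c["hable"] * healthy * able
            + c["short_of_breath"] * short_of_breath + c["diabetes"] * diabetes
            + c["ln_meds"] * math.log(n_meds + 1)
            + WOMEN_YOL_SMOKING[smoking]
            + c["ln_blocks"] * math.log(blocks + 1)
            + c["ln_iadl"] * math.log(n_iadl + 1)
            + c["feeling"] * feeling + c["mi_or_stroke"] * mi_or_stroke)


# Hypothetical 70-year-old woman in excellent health who reports no walking.
profile = dict(age=70, healthy=1, srh=95, able=1, short_of_breath=0, diabetes=0,
               n_meds=0, smoking="never", blocks=0, n_iadl=0, feeling=1,
               mi_or_stroke=0)
never_smoker = predict_yol_woman(**profile)
current_smoker = predict_yol_woman(**{**profile, "smoking": "current"})
print(round(never_smoker, 2), round(current_smoker, 2))
```

Under these assumptions the never-smoker estimate is about 18.2 years, and switching the smoking category to current smoker lowers it by exactly the tabled coefficient (3.479 years), which is how the directly interpretable coefficients enter the prediction.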
Descriptive statistics for predictions at age 70
Table 3 provides an example of the predictions for 70-year-old women and men at several percentiles of health. For example, in row 1, for 70-year-old CHS women, mean observed YOL was 16.04 years, comparing favorably with a mean predicted value of 15.82 years. Unlike the U.S. lifetable estimate (16.33 years for all 70-year-old-women), we obtained a range of estimates based on personal characteristics. The 5th percentile of the predicted values was 10.80 years, the median was 16.32 years, and the 95th percentile was 18.98 years. For 70-year-old men, the estimates of YOL were lower than for women, and the mean was slightly less than the lifetable estimate.
Table 3.
Outcome | Mean observed | Mean predicted | 5th percentile of predicted | 50th percentile of predicted | 95th percentile of predicted |
---|---|---|---|---|---|
Women (U.S. lifetable = 16.33 YOL) | |||||
YOL | 16.04 | 15.82 | 10.80 | 16.32 | 18.98 |
YHL | 11.03 | 10.96 | 3.57 | 11.72 | 15.22 |
YABL | 11.65 | 11.48 | 5.49 | 12.05 | 15.18 |
YHABL | 8.71 | 8.67 | 2.13 | 9.40 | 12.70 |
Men (U.S. lifetable = 14.03 YOL) | |||||
YOL | 13.47 | 13.27 | 7.69 | 13.68 | 17.11 |
YHL | 9.76 | 9.52 | 2.74 | 10.21 | 13.78 |
YABL | 10.69 | 10.52 | 4.32 | 11.04 | 14.39 |
YHABL | 8.36 | 8.18 | 1.88 | 8.84 | 12.06 |
Note. YOL = years of life; YHL = years of healthy life; YABL = years of able life; YHABL = years of healthy and able life.
There is no national standard for YHL, YABL, or YHABL. The tabled results show that the mean observed and predicted values are close to each other, and that there is a large range of predicted values for both men and women. The CHSHLC estimates are thus close to the national standard (for YOL) and to the observed data, and produce a wide range of estimates rather than estimating everyone at the mean.
Internal and External Validity
Internal validity
To assess internal validity, we repeated the process for creating the prediction rules in the training sample and applied the resulting rules to the validation sample. The same four variables were selected in the first screen of the training sample as in the overall analysis. The RMSE was nearly identical in the training and validation samples; that is, the prediction was nearly as good in the validation sample as in the training sample (Diehr et al., 2015, Appendix 4).
Because RMSE is difficult to interpret, we instead present Table 4, which shows the percentage of estimates that were within ±5 (or 3) years of the observed data. First, consider the column for YOL. Only 42% of the predicted values for 65- to 69-year-olds were within ±5 years of the observed values, but the results improved with age. Prediction was better for YHL, YABL, and YHABL than for YOL. The lower part of the table shows the percent of estimates within 3 years of the observed values. Tables for the % more than 5 years away from the observed are in Diehr et al. (2015, Appendix 5). The percent more than 5 years too high was roughly comparable to the percent more than 5 years too low. These percentages can be approximated from Table 4 as (100 − % within 5 years) / 2. Personalized percentages are presented in the CHSHLC, taken from a regression of a binary variable “within 5 years” on age, sex, and the estimate (equation not shown). An example of the output is shown below.
Table 4.
Percentage of estimates within ±5 years of the observed values
Age | Lifetable | YOL | YHL | YABL | YHABL |
---|---|---|---|---|---|
65.00 | 36 | 42 | 55 | 48 | 58 |
70.00 | 49 | 55 | 68 | 62 | 73 |
75.00 | 59 | 67 | 78 | 76 | 83 |
80.00 | 73 | 76 | 85 | 88 | 91 |
85.00 | 88 | 82 | 84 | 89 | 91 |
90.00 | 98 | 78 | 78 | 84 | 86 |
95.00 | 100 | 100 | 100 | 100 | 100 |
100.00 | 100 | 100 | 100 | 100 | 100 |
Percentage of estimates within ±3 years of the observed values
Age | Lifetable | YOL | YHL | YABL | YHABL |
---|---|---|---|---|---|
65.00 | 25 | 29 | 34 | 29 | 36 |
70.00 | 32 | 35 | 46 | 40 | 50 |
75.00 | 39 | 42 | 54 | 52 | 61 |
80.00 | 49 | 51 | 63 | 63 | 70 |
85.00 | 59 | 61 | 65 | 73 | 73 |
90.00 | 76 | 60 | 62 | 61 | 65 |
95.00 | 67 | 80 | 67 | 80 | 87 |
100.00 | 100 | 100 | 100 | 75 | 75 |
Note. YOL = years of life; YHL = years of healthy life; YABL = years of able life; YHABL = years of healthy and able life.
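The approximation (100 − % within 5 years) / 2 for the share of estimates that are more than 5 years too high (or too low) can be applied to the YOL column of the ±5-year part of Table 4. The function name is illustrative, and the calculator's personalized percentages come from a separate regression that is not reproduced here.

```python
def approx_tail_pct(pct_within_5):
    """Approximate % of estimates more than 5 years too high (or too low),
    assuming misses are split evenly between the two directions."""
    return (100.0 - pct_within_5) / 2.0


# YOL column, within +/-5 years, ages 65 to 90 (from Table 4).
YOL_WITHIN_5 = {65: 42, 70: 55, 75: 67, 80: 76, 85: 82, 90: 78}

tails = {age: approx_tail_pct(pct) for age, pct in YOL_WITHIN_5.items()}
for age, tail in tails.items():
    print(age, tail)
```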
External validity
We first compared predicted YOL with the lifetable estimates. For the entire CHS sample, the mean lifetable values were about .07 years higher than the predicted YOL for men and were about .4 years lower for women, which is reasonably close. In Table 4, only 36% of the lifetable values for 65-year-olds were within ±5 years of the observed values, as compared with 42% for YOL. Agreement between YOL and the lifetable values was quite good on average. Thus, today’s lifetable applied reasonably well to the CHS cohort in 1990. Predicted YOL had a slightly smaller RMSE than the lifetable estimates, probably because it used covariates (data not shown).
We next applied the CHS prediction equations to the MESA data (Diehr et al., 2015, Appendix 6). The MESA population was healthier than the CHS population, because of the difference in eligibility criteria described above. The 10-year CHS predictions underestimated observed MESA data by .3 years for YOL and .6 years for YHL for women, and by .6 and .5 years, respectively, for men. The fit was better at the younger ages. MESA started data collection in about 2000, 10 years later than CHS. This under-prediction may suggest that the CHSHLC will be a little conservative for today’s users, on the order of 6 months in the first 10 years. These results did not involve the actual variables or equations used in the CHSHLC, but do show that the method used to create the CHSHLC could provide reasonable predictions in a later data set.
The CHSHLC
The web-based calculator for the CHSHLC is available at http://healthylifecalculator.org/. As an example of the CHSHLC, consider “Mary,” who is 70 years old and would like to put off making any major changes until she is about 80 (10 years from now). Mary is quite healthy, giving the best possible answers to all the CHSHLC questions. Her prediction results are shown below.
You answered that you are a woman, 70 years old. In our database, people like you (who gave similar answers on these questions) lived, on average, to be 90.0 years old. During these remaining 20.0 years, people like you enjoyed 16.8 years of Healthy life, 16.5 years of Able life, and 14.2 years in which they were both Healthy and Able.
▼How likely is it that I’ll do better?
About half of the people like you did better than their estimates.
Furthermore, approximately . . .
29% | had more than | 25 | years of life (YOL) |
28% | had more than | 21 | years of healthy life (YHL) |
29% | had more than | 21 | years of able life (YABL) |
26% | had more than | 19 | years of healthy and able life (YHABL) |
▼How likely is it that I’ll do worse?
About half of the people like you did worse than their estimates.
Furthermore, approximately . . .
29% | had fewer than | 16 | years of life (YOL) |
28% | had fewer than | 12 | years of healthy life (YHL) |
29% | had fewer than | 12 | years of able life (YABL) |
26% | had fewer than | 10 | years of healthy and able life (YHABL) |
Discussion
We created prediction equations for lifetime YOL, YHL, YABL, and YHABL from a unique data set that had 200 potential predictors and 23 years of follow-up. From them, we created the CHSHLC, for persons aged 65 and older. Documentation is provided for the methods used and the probable accuracy of the predictions.
The predictions should be useful for planning. For example, Mary, who wants to avoid making changes for 10 years, might reason that she will be both healthy and able for 14.2 years, which should allow her to defer thinking about changes until she is 80. But she also has about a 26% chance of having fewer than 9.2 YHABL, and so might prefer to make her plans sooner.
Other Calculators
We have compared our YOL estimates with the U.S. lifetable. Other predictors of life expectancy exist on the Internet, but there is no formal way to compare them with the CHSHLC predictions, because of their lack of documentation or their use of variables not in the CHS data set. We have found no other individual-level predictions of YHL or YABL.
Limitations
The CHS data were well suited for the development of a health-prediction calculator because few assumptions needed to be made about life span and YHL. That is, the outcomes were completely observed for 85% of the sample, and only the final few years needed to be estimated for the others. But the CHS enrollees may not have been representative of all older adults. Eligibility criteria and the likely healthy volunteer effect may also have contributed to a healthier sample. If so, predictions could be too optimistic. Because CHS did not start out with many people who were very old or very sick, predictions may be less accurate for such people. The inclusion of later waves of data may have mitigated these effects. As our average YOL predictions were close to the values in the current U.S. lifetable, these potential problems may not have existed, or their effects may have averaged out.
We restricted the prediction analysis to CHS variables that could be self-reported and were rarely missing. Some important features specific to the health of users may not have been taken into account. For example, their parents may have lived well into their 90s, or they may have a serious disease that was not used in the calculator. Those specific features may already have been accounted for by the health and medication information that was included. The small improvement in the overall R² at each step suggests that additional variables would not have made much overall improvement, even if they did improve predictions for some users.
We could instead have chosen predictor variables in advance, based on theory, and emphasized mutable health behaviors. But that approach might have missed the strongest predictors, such as shortness of breath, or required a much longer calculator. Our approach does not allow us to make individual recommendations about how users might improve their health, but such recommendations were never our intent. Ample health advice is available from other sources.
Other screening approaches might have selected different or even better predictors. Some of the variables removed from consideration might have been stronger predictors in some of the regressions. We might have used a more complex regression model. Interactions with age were considered but not used because they seemed to contribute to over-fitting. Linear regression was used because our goal was to estimate average YOL, YHL, YABL, and YHABL on the original scale (Lumley, Diehr, Emerson, & Chen, 2002). Forward selection was a practical approach for screening the hundreds of available variables. For comparison, we considered a one-step screening approach with an alpha level of .01 for inclusion and no restriction that the variables be the same in all eight regressions. This approach ended up with about 3 times as many predictor variables in each equation, probably included more variables that were significant by chance alone, and improved R² by only about .02 (Diehr et al., 2015, Appendix 3). Our approach seemed satisfactory.
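For readers unfamiliar with forward selection, the following is a minimal sketch of the general technique for a single linear regression, not the procedure actually used for the CHSHLC (which also required the same variables in all eight regressions). At each step it adds the candidate that most increases R² and stops when the gain falls below a threshold. The stopping threshold `min_gain`, the function names, and the toy data are all assumptions made for illustration.

```python
def r_squared(columns, y):
    """R^2 of a least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y).
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting (no singularity check).
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
                 for i in range(n))
    return 1.0 - ss_res / ss_tot


def forward_select(candidates, y, min_gain=0.01):
    """Greedily add the variable that most raises R^2; stop when gain < min_gain."""
    chosen, best_r2 = [], 0.0
    remaining = dict(candidates)
    while remaining:
        scores = {name: r_squared([candidates[c] for c in chosen] + [col], y)
                  for name, col in remaining.items()}
        name = max(scores, key=scores.get)
        if scores[name] - best_r2 < min_gain:
            break
        chosen.append(name)
        best_r2 = scores[name]
        del remaining[name]
    return chosen, best_r2


# Toy data (invented): outcome driven by age and shortness of breath.
candidates = {
    "age": [65, 70, 75, 80, 85, 90, 68, 72],
    "short_of_breath": [0, 1, 0, 1, 0, 1, 1, 0],
    "unrelated": [3, 1, 4, 1, 5, 9, 2, 6],
}
years = [10.6, 6.9, 7.5, 4.1, 4.4, 1.0, 7.7, 8.3]
chosen, r2 = forward_select(candidates, years)
print(chosen, round(r2, 3))
```

A gain-based stopping rule like this keeps the variable list short, which matters for a calculator that users must fill in by hand. A significance-based rule is a common alternative.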
The CHSHLC assumes that a user who is 70 years old today is similar to a person in CHS who was 70 in 1990. There have been many improvements in public health, health behaviors, and health care since then, suggesting that the CHSHLC may be pessimistic. Other changes, such as the increase in antibiotic-resistant bacteria, could have the opposite effect. (Standard lifetable calculations rely on a related assumption that mortality rates calculated for persons currently aged 70 will still apply when a person born today reaches 70.) The strong agreement between the current lifetable and YOL suggests that this concern may not be serious, although the MESA comparison may suggest some underestimation.
Are YHL and YABL Important to Older Adults?
Older adults may disagree about the relative importance of YOL and YHL. For example, in one recent study of heart failure, about half the patients preferred treatments that prolonged survival while a different group favored strategies that reduced survival time but improved quality of life (MacIver et al., 2008). Persons for whom survival is the main consideration might obtain predictions elsewhere. But persons who want to estimate their YHL, YABL, or YHABL will need to use our calculator.
Older adults are also concerned about cognitive decline. Being healthy and able does not guarantee that a person will be cognitively capable. On average, cognitive function in CHS declined at a slower rate than did self-rated health and ADL ability (Diehr, Thielke, Newman, Hirsch, & Tracy, 2013; Diehr, Williamson, Burke, & Psaty, 2002). About 75% of CHS enrollees had more years of “life with good cognition” than they had years of “healthy and able” life (Diehr et al., 2015, Appendix 8). Thus, most users will have good cognition during their healthy and able years, and plans based only on YHABL should be reasonable.
Conclusion
We created a personalized and well-documented calculator for future YOL, YHL, and YABL. The YOL estimates from the CHSHLC are, on average, comparable with the current U.S. lifetables but give a wider range of estimates. Most important, the calculator also estimates the number of years in which the user will be healthy and/or able to perform the ADLs, which are relevant to many life decisions. This seems to be the only published calculator for years of healthy, able, or healthy and able life. For that reason, the CHSHLC should, with proper caveats, be a useful planning tool for older adults.
Footnotes
Authors’ Note: A full list of principal CHS investigators and institutions can be found at CHS-NHLBI.org.
Declaration of Conflicting Interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by contracts HHSN268201200036C, HHSN268200800007C, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086, and grant U01HL080295 from the National Heart, Lung, and Blood Institute (NHLBI), with additional contribution from the National Institute of Neurological Disorders and Stroke (NINDS). Additional support was provided by R01AG023629 from the National Institute on Aging (NIA).
References
- Crimmins E. M., Saito Y. (2001). Trends in healthy life expectancy in the United States, 1970-1990: Gender, racial, and educational differences. Social Science & Medicine, 52, 1629-1641.
- Diehr P. (2013, August). Methods for dealing with death and missing data, and for standardizing different health variables in longitudinal datasets: The Cardiovascular Health Study. UW Biostatistics Working Paper Series (Working Paper No. 390). Retrieved from http://biostats.bepress.com/uwbiostat/paper390.
- Diehr P., Diehr M., Arnold A. M., Yee L., Odden M. C., Hirsch C. H., . . . Newman A. B. (2015, May). Predicting future years of life, health, and functional ability: A healthy life calculator for older adults (UW Biostatistics Working Paper Series 407). Retrieved from http://biostats.bepress.com/uwbiostat/paper407
- Diehr P., O’Meara E. S., Fitzpatrick A., Newman A. B., Kuller L., Burke G. (2008). Weight, mortality, years of healthy life and active life expectancy in older adults. Journal of the American Geriatrics Society, 56, 76-83.
- Diehr P., Patrick D. L., Bild D. E., Burke G. L., Williamson J. D. (1998). Predicting future years of healthy life for older adults. Journal of Clinical Epidemiology, 51, 343-353.
- Diehr P., Patrick D. L., Hedrick S., Rothman M., Grembowski D., Raghunathan T. E., Beresford S. (1995). Including deaths when measuring health status over time. Medical Care, 33, AS164-AS172.
- Diehr P., Patrick D. L., Spertus J., Kiefe C. I., McDonell M., Fihn S. D. (2001). Transforming self-rated health and the SF-36 Scales to include death and improve interpretability. Medical Care, 39, 670-680.
- Diehr P., Thielke S., O’Meara E., Fitzpatrick A., Newman A. (2012). Comparing years of healthy life, measured in 16 ways, for normal weight and overweight older adults. Journal of Obesity. Retrieved from http://www.hindawi.com/journals/jobes/2012/894894/
- Diehr P., Williamson J., Burke G., Psaty B. (2002). The aging and dying process and the health of older adults. Journal of Clinical Epidemiology, 55, 269-278.
- Diehr P. H., Thielke S. M., Newman A. B., Hirsch C. H., Tracy R. (2013). Decline in health for older adults: Five-year change in 13 key measures of standardized health. Journals of Gerontology. Series A: Biological Sciences & Medical Sciences, 68, 1059-1067.
- Fried L. P., Borhani N. O., Enright P., Furberg C. D., Gardin J. M., Kronmal R. A., . . . Newman A. (1991). The cardiovascular health study: Design and rationale. Annals of Epidemiology, 1, 263-276.
- Hirsch C. H., Diehr P., Newman A. B., Gerrior S. A., Pratt C., Lebowitz M. D., Jackson S. A. (2010). Physical activity and years of healthy life in older adults: Results from the Cardiovascular Health Study. Journal of Aging and Physical Activity, 18, 313-334.
- Locke E., Thielke S., Diehr P., Wilsdon A. G., Barr R., Hansel N., . . . Fan V. S. (2013). Effects of respiratory and non-respiratory factors on disability among older adults with airway obstruction: The Cardiovascular Health Study. COPD, 10, 588-596.
- Longstreth W., Diehr P. H., Yee L., Newman A. B., Beauchamp N. (2014). Brain imaging findings in the elderly and years of life, healthy life, and able life over the ensuing 16 years: The Cardiovascular Health Study. Journal of the American Geriatrics Society, 62, 1838-1843.
- Lumley T., Diehr P., Emerson S., Chen L. (2002). The importance of the normality assumption in large public health data sets. Annual Review of Public Health, 23, 151-169.
- MacIver J., Rao V., Delgado D. H., Desai N., Ivanov J., Abbey S., Ross H. J. (2008). Choices: A study of preferences for end-of-life treatments in patients with advanced heart failure. Journal of Heart and Lung Transplantation, 27, 1002-1007.
- Protocol for the Multi-Ethnic Study of Atherosclerosis. (2002). Retrieved from http://www.MESA-nhlbi.org/publicDocs/Protocol/MESAProt000225-updated.doc
- Rogers A., Rogers R. G., Branch L. G. (2001). A multistate analysis of active life expectancy. Public Health Reports, 104, 222-225.
- U.S. Lifetable from Social Security Administration. (n.d.). Retrieved from http://www.socialsecurity.gov/oact/population/longevity.html