American Journal of Epidemiology. 2013 Jul 12;178(6):974–983. doi: 10.1093/aje/kwt054

Predicting Life Expectancy for Community-dwelling Older Adults From Medicare Claims Data

Alai Tan *, Yong-Fang Kuo, James S Goodwin
PMCID: PMC3775541  PMID: 23851579

Abstract

Estimates of life expectancy are useful in assessing whether different prevention strategies are appropriate in different populations. We developed sex-specific Cox proportional-hazard models that use Medicare claims data to predict life expectancy and risk of death at up to 10 years for older adults. We identified a cohort of Medicare beneficiaries 66–90 years of age from the 5% Medicare claims data in 2000 (n = 1,137,311) and tracked each subject's vital status until December 31, 2009. Subjects were split randomly into training and validation samples. Models were developed from the training sample and validated by comparison of predicted to actual survival in the validation sample. The C statistics for the models including predictors of age and Elixhauser comorbidities were 0.76–0.79 for men and women for prediction of death at the 1-, 5-, 7-, and 10-year follow-up periods. More than 80% of subjects with <25% risk of death at 5, 7, and 10 years survived longer than the chosen cutoff years. More than 80% of subjects with ≥75% risk of death at 5, 7, and 10 years died by those cutoff years. The models overestimated the risk of death at 1 year for the high-risk groups. Sex-specific models that use age and Elixhauser comorbidities can accurately predict patient life expectancy and risk of death at 5–10 years.

Keywords: aged, comorbidity, life expectancy, Medicare, mortality, prognosis


Life expectancy is an important factor in assessing the quality of specific preventive services for the elderly (1, 2). For example, in randomized screening mammography trials, no significant mortality rate reduction was found for screened versus unscreened women until 7 years’ follow-up for women 65–70 years of age (3). There is also an approximately 4-year lag between screen-detected tumors and those detected clinically (4, 5). Thus, screening mammography in women with limited life expectancy exposes them to unnecessary breast cancer diagnoses and treatment with no clear benefit. Similar reasoning can be applied to colorectal cancer screening (6, 7) and tight glycemic control for patients with diabetes (8, 9).

Administrative claims data are a key resource for population-based assessments of health-care utilization and outcomes (10, 11). However, prior studies of the extent and appropriateness of health-care utilization have used mainly age-based guidelines (12) that fail to take into account the heterogeneity in health and life expectancy among the elderly (13). Recent attempts to evaluate the overutilization of screening services have focused on specific conditions known to result in limited life expectancy, such as dementia and advanced cancer (14, 15). Few studies (6, 16) have evaluated the use of preventive services in the context of patient life expectancy in the general older population.

The comorbidity measures described by Charlson et al. (17) and Elixhauser et al. (18) have been validated in various settings to predict hospitalization, length of hospitalization, and death. Both are well adapted for use in administrative claims data to adjust for risks and predict death (19–22), but they have not been used for life expectancy prediction. Our study adaptively integrated these comorbidity measures into a model to predict patient life expectancy for the older population from Medicare claims data.

MATERIALS AND METHODS

Data sources

The study used the 5% Medicare claims data (which contain all claims for a randomly selected 5% sample of Medicare beneficiaries) for 1999–2009. The files used include 1) Medicare enrollment files, which include yearly information on patient demographics, monthly eligibility and enrollment information, and vital status; 2) carrier files (claims for physician services); 3) outpatient statistical analysis files (claims for hospital outpatient visits); and 4) Medicare Provider Analysis and Review files (claims for hospital stays).

Study population

We identified 1,137,311 Medicare beneficiaries from the 5% Medicare claims data in 2000 who 1) were 66–90 years of age as of January 1, 2000; 2) had full coverage in Medicare Parts A (hospital care) and B (physician and outpatient services) in 1999; and 3) had no health maintenance organization coverage at any time in 1999. Health maintenance organization enrollees were excluded because their medical services are not fully captured by Medicare claims data. Information on demographics, Medicare entitlement, Parts A and B coverage, and health maintenance organization enrollment was obtained from the Medicare enrollment files.
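The three inclusion criteria above can be expressed as a simple filter. The following is a minimal sketch, not the authors' extraction code (the study used SAS); the `Beneficiary` record and its field names are hypothetical stand-ins for fields in the Medicare enrollment files.

```python
from dataclasses import dataclass
from datetime import date

INDEX_DATE = date(2000, 1, 1)  # age is assessed as of January 1, 2000


@dataclass
class Beneficiary:
    # Hypothetical field names; the real enrollment files use other layouts.
    birth_date: date
    months_part_a_1999: int  # months with Part A coverage in 1999
    months_part_b_1999: int  # months with Part B coverage in 1999
    months_hmo_1999: int     # months with HMO enrollment in 1999


def age_on(birth_date, on):
    """Age in completed years on a given date."""
    years = on.year - birth_date.year
    if (on.month, on.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def is_eligible(b):
    """Apply the three cohort criteria described in the text."""
    age = age_on(b.birth_date, INDEX_DATE)
    return (
        66 <= age <= 90
        and b.months_part_a_1999 == 12
        and b.months_part_b_1999 == 12
        and b.months_hmo_1999 == 0
    )
```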

Measures

Survival was tracked from January 1, 2000, through December 31, 2009, via the field for date of death in the Medicare enrollment files for 2000–2009. Information on month and year of death in the Medicare enrollment file is considered nearly 100% accurate (23). In the event of an occasional nonvalidated day of death, the Centers for Medicare & Medicaid Services assigns the day as the last day of the month (23).

Potential predictors of death included age, sex, comorbidity measures, number of hospitalizations, and number of outpatient visits in the previous 12 months (i.e., during 1999). Age and sex were obtained from the Medicare entitlement file. The number of hospitalizations and number of outpatient visits were extracted from the claims for inpatient and physician services, respectively.

We used 2 comorbidity methods, those of Charlson et al. (17) and Elixhauser et al. (18). The Charlson method generates a weighted index score that is based on 17 comorbid conditions. Several adaptations of the Charlson comorbidity method have been developed (19, 20, 22). Comparative studies (24, 25) have shown that these adaptations were essentially equivalent in terms of predictive and explanatory power. The Elixhauser method includes 31 comorbid conditions listed individually. It contains all the conditions in the Charlson method except myocardial infarction, cerebrovascular diseases, dementia, and leukemia. These 4 conditions were excluded because they were not significantly associated with the prognostic outcomes when the Elixhauser method was developed. We used the International Classification of Diseases, Ninth Revision, coding algorithm of Quan et al. (21) to identify patient comorbid conditions from Medicare claims data according to the Charlson and Elixhauser methods. The claims for physician, outpatient, and inpatient services for the year 1999 were used for comorbidity extraction. For physician and outpatient claims, we considered valid comorbid conditions as those with corresponding diagnoses appearing on 2 or more claims at least 30 days apart (26). For diagnoses on inpatient claims, only 1 claim was required.
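The claim-confirmation rule in the last two sentences can be sketched as follows. This is an illustrative reading of the rule, assuming that "2 or more claims at least 30 days apart" is satisfied whenever the earliest and latest physician or outpatient claims carrying the diagnosis are 30 or more days apart; the function name and arguments are hypothetical.

```python
from datetime import date


def has_condition(inpatient_dates, outpatient_dates, min_gap_days=30):
    """Return True if a diagnosis counts as a valid comorbid condition.

    inpatient_dates: dates of inpatient claims carrying the diagnosis.
    outpatient_dates: dates of physician/outpatient claims carrying it.
    """
    # A single inpatient claim is sufficient.
    if inpatient_dates:
        return True
    # Otherwise require two or more claims at least min_gap_days apart.
    if len(outpatient_dates) < 2:
        return False
    ordered = sorted(outpatient_dates)
    return (ordered[-1] - ordered[0]).days >= min_gap_days
```

Requiring two separated claims is a common safeguard against rule-out diagnoses that appear on a single visit.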

Model development and validation

We randomly split the data into training and validation samples. We used the training sample for model selection and applied the resulting model estimates to the validation sample. First, we used a series of logistic regression models with various combinations of predictors to predict death by 1, 5, 7, and 10 years’ follow-up. C statistics were used to compare the predictive ability of models with various combinations of predictors. We chose the set of predictors with the optimal C statistics for the final life expectancy model. The nonparametric approach by DeLong et al. (27) was used to compare C statistics. The Akaike information criterion (AIC) and percent correctly classified (PCC) were also used to compare the models. The AIC is a measure of relative goodness-of-fit of competing models that also considers parsimony (i.e., for a comparable fit, models with fewer predictors receive lower scores). The PCC measures model accuracy. Higher C statistics, lower AIC, and higher PCC indicate a better model.
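For readers unfamiliar with the comparison metrics, the sketch below shows one standard way to compute a C statistic for a binary outcome (the proportion of death/survivor pairs in which the person who died had the higher predicted risk, with ties counted as one half) and the AIC. It is a plain-Python illustration, not the SAS procedures used in the study, and the quadratic pairwise loop is only practical for small samples.

```python
def c_statistic(predicted_risk, died):
    """Concordance between predicted risk and a binary death indicator."""
    events = [p for p, d in zip(predicted_risk, died) if d]
    nonevents = [p for p, d in zip(predicted_risk, died) if not d]
    if not events or not nonevents:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p_event in events:
        for p_nonevent in nonevents:
            if p_event > p_nonevent:
                concordant += 1.0   # correctly ordered pair
            elif p_event == p_nonevent:
                concordant += 0.5   # tied pair counts one half
    return concordant / (len(events) * len(nonevents))


def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_parameters - 2 * log_likelihood
```

A C statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination.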

We used sex-specific Cox proportional-hazard models because men and women differ in illness incidence, prognosis, and mortality rate (28, 29), and there were also many significant interactions between sex and comorbidities on mortality rate in our data. The coefficient estimates and baseline survival function derived from the training sample were applied to subjects in the validation sample to calculate individual median survival time from January 1, 2000, as a proxy of life expectancy. Kaplan-Meier survival curves were used to plot the actual survival of subjects over the 10-year follow-up period for subjects with predicted life expectancies of <1, <5, <7, and <10 year(s). We calculated the probability of survival at 1-, 5-, 7-, and 10-year follow-up from the estimates from the Cox proportional-hazard models. We then categorized subjects into groups with <25%, 25%–49%, 50%–74%, and ≥75% risk of death at each follow-up period. The observed mortality rate was calculated for each risk group. For each cutoff (1, 5, 7, and 10 years), we also evaluated the validity of our predicted probabilities by using estimates of γ0, γ1, and the C statistic from a logistic regression model, logit(p) = γ0 + γ1 logit(p̂), where p is the observed survival rate and p̂ is the predicted survival probability. Estimated values of γ0 close to 0 and γ1 close to 1 indicate good calibration, whereas a higher value of the C statistic indicates better discrimination (30). We used SAS, version 9.2 (SAS Institute, Inc., Cary, North Carolina) for data extraction and statistical analyses.
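Under a Cox proportional-hazard model, an individual's survival curve is the baseline survival function raised to the power of that individual's relative hazard, S(t | x) = S0(t)^exp(βx), and the median survival time is the first time at which the curve falls to 0.5 or below. The sketch below illustrates this standard relationship. The baseline values in `HYPOTHETICAL_BASELINE` are invented placeholders on a coarse grid, not the study's baseline survival estimates.

```python
import math

# Placeholder (time in years, baseline survival) pairs; NOT the study's values.
HYPOTHETICAL_BASELINE = [(1, 0.99), (5, 0.94), (7, 0.91), (10, 0.85)]


def survival_curve(baseline_survival, linear_predictor):
    """Individual survival curve: S(t | x) = S0(t) ** exp(linear predictor)."""
    relative_hazard = math.exp(linear_predictor)
    return [(t, s0 ** relative_hazard) for t, s0 in baseline_survival]


def median_survival(curve):
    """First time at which survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def risk_of_death(curve, cutoff_years):
    """Predicted probability of dying by the cutoff (1 - survival)."""
    for t, s in curve:
        if t == cutoff_years:
            return 1.0 - s
    raise ValueError("cutoff not on the time grid")
```

With a finer time grid (for example, monthly), the same lookup yields median survival times in fractions of a year. A median of `None` means the predicted survival stays above 0.5 throughout the available follow-up.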

RESULTS

Table 1 presents the distribution of age and comorbidities for subjects in the training sample by sex. The average age was 75.9 (standard deviation, 6.5) years for women and 74.7 (standard deviation, 6.1) years for men. Uncomplicated hypertension was the most frequent comorbidity for both sexes (38.4% for women and 30.1% for men), followed by uncomplicated diabetes (11.8% for women and 12.5% for men), cardiac arrhythmia (10.3% for women and 12.8% for men), and chronic pulmonary disease (9.7% for women and 11.3% for men). Table 1 also presents overall age-adjusted 10-year mortality rate and the mortality rate associated with each comorbidity. The estimates were adjusted for age but not for the other comorbidities. Comorbidities with the highest 10-year mortality rate included metastatic cancer, renal failure, alcohol abuse, drug abuse, and dementia—all with age-adjusted 10-year mortality rates of >85% in both men and women. Sample characteristics and death rates were almost identical for subjects in the training sample and the validation sample. None of the values in the validation sample differed from the training sample by more than 0.1% (data not shown).

Table 1.

Comorbidities and Mortality Rates at 10 Years, by Sex, of the Training Sample of Medicare Beneficiaries Used to Develop Models to Predict Life Expectancy From Medicare Claims Data, United States, 2000–2009^a,b

| Comorbidity | Women (n = 338,382): Proportion of Sample With Comorbidity, % | Women: Age-adjusted 10-year Mortality Rate,^c % | Men (n = 230,274): Proportion of Sample With Comorbidity, % | Men: Age-adjusted 10-year Mortality Rate,^c % |
| --- | --- | --- | --- | --- |
| All | | 50.6 | | 56.7 |
| Circulatory diseases | | | | |
| Congestive heart failure^d,e | 8.2 | 82.3 | 8.8 | 85.8 |
| Cardiac arrhythmia^d | 10.3 | 68.0 | 12.8 | 71.8 |
| Myocardial infarction^e | 2.0 | 73.0 | 3.6 | 74.3 |
| Valvular disease^d | 4.0 | 68.8 | 4.1 | 73.1 |
| Pulmonary circulation disease^d | 0.9 | 83.2 | 0.8 | 86.0 |
| Peripheral vascular disease^d,e | 5.9 | 75.2 | 6.3 | 80.2 |
| Cerebrovascular disease^e | 5.7 | 74.2 | 6.1 | 77.2 |
| Uncomplicated hypertension^d | 38.4 | 54.0 | 30.1 | 60.3 |
| Complicated hypertension^d | 4.5 | 65.2 | 4.0 | 71.3 |
| Respiratory, digestive, and renal diseases | | | | |
| Chronic pulmonary disease^d,e | 9.7 | 75.3 | 11.3 | 80.0 |
| Renal failure^d,e | 1.3 | 90.9 | 2.0 | 92.2 |
| Peptic ulcer disease excluding bleeding^d,e | 0.9 | 68.4 | 0.8 | 73.5 |
| Liver disease^d,e | 0.8 | 77.1 | 0.9 | 82.1 |
| Endocrine, nutritional, and metabolic diseases | | | | |
| Diabetes without chronic complications^d,e | 11.8 | 63.5 | 12.5 | 68.0 |
| Diabetes with chronic complications^d,e | 3.1 | 77.8 | 3.2 | 81.2 |
| Hypothyroidism^d | 10.4 | 56.4 | 3.4 | 63.6 |
| Fluid and electrolyte disorders^d | 6.0 | 79.3 | 4.4 | 84.2 |
| Weight loss^d | 1.5 | 83.8 | 1.2 | 87.8 |
| Obesity^d | 1.1 | 67.5 | 0.7 | 72.0 |
| Neoplasms | | | | |
| Solid tumor without metastasis^d,e | 4.4 | 58.2 | 9.3 | 62.3 |
| Metastatic cancer^d,e | 0.8 | 89.5 | 0.9 | 91.4 |
| Lymphoma^d | 0.5 | 78.2 | 0.6 | 82.6 |
| Blood diseases | | | | |
| Deficiency anemia^d | 3.4 | 71.0 | 2.7 | 76.7 |
| Chronic blood loss anemia^d | 0.9 | 76.5 | 0.8 | 81.7 |
| Coagulopathy^d | 1.6 | 73.8 | 2.1 | 76.9 |
| Leukemia^e | 0.2 | 75.9 | 0.2 | 80.8 |
| Disease of nervous system or connective tissue | | | | |
| Paralysis^d,e | 0.7 | 81.8 | 0.7 | 84.8 |
| Neurological disorders other than paralysis^d | 2.9 | 82.2 | 3.3 | 85.4 |
| Rheumatoid arthritis/collagen disease^d,e | 3.3 | 63.1 | 1.6 | 67.8 |
| Mental disorders | | | | |
| Depression^d | 5.0 | 72.4 | 2.7 | 78.6 |
| Dementia^e | 2.6 | 93.3 | 1.7 | 95.7 |
| Psychoses^d | 1.3 | 84.2 | 0.9 | 88.2 |
| Infectious diseases and substance abuse | | | | |
| Alcohol abuse^d | 0.2 | 85.5 | 0.8 | 88.0 |
| Drug abuse^d | 0.2 | 86.3 | 0.2 | 88.9 |
| Acquired immunodeficiency syndrome^d,e | 0.0 | 75.6 | 0.0 | 77.2 |

a The subjects were identified from the 5% Medicare claims data in 2000. Survival was tracked from January 1, 2000, through December 31, 2009. We randomly split the data into training and validation samples. All values in the validation sample were within 0.1% of those in the training sample.

b For each comorbidity, the difference between the sexes in age-adjusted 10-year mortality rate was tested through the use of a logistic regression model with age, sex, the specified comorbidity, and interaction between sex and comorbidity. All differences between the sexes in age-adjusted 10-year mortality rate are statistically significant (P < 0.001). The statistical tests are overpowered because of large sample size. We emphasize the absolute differences between the sexes. For all comorbidities, men had 1.3%–6.3% higher age-adjusted 10-year mortality rates than women.

c The rates were adjusted for age but not for other comorbidities.

d Included in the comorbidity method of Elixhauser et al. (18).

e Included in the comorbidity method of Charlson et al. (17).

Table 2 shows the C statistics from a series of logistic regression models for 1-, 5-, 7-, and 10-year mortality rates, stratified by sex, for the training sample. Because the results of all the models pointed to the same conclusion, we describe in the text only the results of the model predicting 10-year risk of death among women. The C statistic was 0.74 in the model that included age only. Adding the Charlson comorbidity measure increased the C statistic to 0.79. Substituting the Elixhauser comorbidity measure for the Charlson measure further increased the C statistic to 0.80. The C statistic remained at 0.80 after the addition of the 4 conditions that were included in the Charlson measure but not in the Elixhauser measure (myocardial infarction, cerebrovascular disease, dementia, and leukemia), the number of outpatient visits and hospitalizations in the previous 12 months, and the interactions between age and comorbidities or among comorbidities. In all models for 1-, 5-, 7-, and 10-year risk of death in men and women, the models with age and Elixhauser comorbidities had slightly higher C statistics than did the models with Charlson comorbidity measures (P < 0.001 for all comparisons). The Elixhauser models also had lower AIC statistics and approximately 0.7% higher PCC measures. Addition of prior outpatient visits or hospitalizations or interactions had minimal effect. Thus, we used the Cox proportional-hazard model with age and Elixhauser comorbidities for both men and women as our final model to predict patient life expectancy.

Table 2.

Model Selection on the Basis of a Series of Logistic Regression Models Predicting Death at 1-, 5-, 7-, and 10-year Cutoffs in the Training Sample of Medicare Beneficiaries Used to Develop Models to Predict Life Expectancy From Medicare Claims Data, United States, 2000–2009^a

| Model (cells are C statistics) | Women: Death at 1 Year | Women: Death at 5 Years | Women: Death at 7 Years | Women: Death at 10 Years | Men: Death at 1 Year | Men: Death at 5 Years | Men: Death at 7 Years | Men: Death at 10 Years |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age | 0.69 | 0.71 | 0.73 | 0.74 | 0.67 | 0.69 | 0.70 | 0.72 |
| Age + Charlson comorbidity score^b | 0.78 | 0.77 | 0.78 | 0.79 | 0.76 | 0.75 | 0.75 | 0.76 |
| Age + Charlson comorbidities^c,d | 0.79 | 0.78 | 0.79 | 0.79 | 0.77 | 0.76 | 0.76 | 0.77 |
| Age + Elixhauser comorbidities^c,d | 0.81 | 0.79 | 0.79 | 0.80 | 0.79 | 0.76 | 0.77 | 0.77 |
| Age + Elixhauser + 4 Charlson comorbidities not included in Elixhauser^e | 0.81 | 0.79 | 0.79 | 0.80 | 0.79 | 0.76 | 0.77 | 0.77 |
| Age + Elixhauser + no. of outpatient visits in previous 12 months + no. of hospitalizations in previous 12 months | 0.81 | 0.79 | 0.79 | 0.80 | 0.78 | 0.76 | 0.76 | 0.77 |
| Age + Elixhauser + age × comorbidity interactions | 0.81 | 0.79 | 0.79 | 0.80 | 0.78 | 0.76 | 0.76 | 0.77 |
| Age + Elixhauser + age × comorbidity interactions + 2-way interactions between comorbidities | 0.81 | 0.79 | 0.79 | 0.80 | 0.78 | 0.76 | 0.76 | 0.77 |

a The subjects were identified from the 5% Medicare claims data in 2000. Survival was tracked from January 1, 2000, through December 31, 2009. We randomly split the data into training and validation samples. The training sample was used for model selection.

b In the model with age + Charlson comorbidity score, the Charlson comorbidity index score (17) was included. It was calculated from the weighted sum of 17 individual conditions.

c In the model with age + Charlson comorbidities, the 17 individual Charlson comorbid conditions were included. In the model with age + Elixhauser comorbidities, the 31 individual Elixhauser comorbid conditions (18) were included.

d The nonparametric approach by DeLong et al. (27) showed that the C statistics of the models with age + Elixhauser comorbidities were significantly higher than the models with age + Charlson comorbidities (all P values < 0.001). The models with Elixhauser measures also had lower Akaike information criterion and on average approximately 0.7% higher percent correctly classified than the models with Charlson measures.

e The 4 Charlson comorbidities not included in Elixhauser method were myocardial infarction, cerebrovascular disease, dementia, and leukemia.

The estimates of the final models are presented in Table 3. Each 1-year increase in age was associated with a 9%–11% increased hazard of death after adjustment for comorbidities. Metastatic cancer was associated with the highest hazards of death (hazard ratio = 3.51, 95% confidence interval: 3.37, 3.66 for women; hazard ratio = 3.18, 95% confidence interval: 3.04, 3.33 for men). Congestive heart failure, chronic pulmonary disease, neurological disorders other than paralysis, diabetes with chronic complications, renal failure, acquired immunodeficiency syndrome, lymphoma, weight loss, alcohol abuse, and psychoses were associated with a 46%–97% increased hazard of death. In these multivariable models, a few comorbidities (e.g., uncomplicated and complicated hypertension) were associated with a slightly but significantly lower hazard of death, presumably because their associated comorbidities or complications are also in the models.

Table 3.

Estimates of Hazard of Death From the Sex-specific Cox Proportional-Hazard Models With Age and Elixhauser Comorbidities in the Training Sample of Medicare Beneficiaries Used to Develop Models to Predict Life Expectancy From Medicare Claims Data, United States, 2000–2009^a

| Predictor | Women: HR | Women: 95% CI | Men: HR | Men: 95% CI |
| --- | --- | --- | --- | --- |
| Age, years | 1.107 | 1.106, 1.108 | 1.094 | 1.093, 1.095 |
| Circulatory diseases | | | | |
| Congestive heart failure | 1.637 | 1.610, 1.664 | 1.719 | 1.686, 1.752 |
| Cardiac arrhythmia | 1.211 | 1.193, 1.230 | 1.122 | 1.103, 1.140 |
| Valvular disease | 1.066 | 1.042, 1.090 | 1.033 | 1.007, 1.060 |
| Pulmonary circulation disease | 1.224 | 1.175, 1.277 | 1.187 | 1.128, 1.248 |
| Peripheral vascular disease | 1.425 | 1.400, 1.450 | 1.367 | 1.340, 1.395 |
| Hypertension uncomplicated | 0.965 | 0.955, 0.975 | 0.946 | 0.934, 0.958 |
| Hypertension complicated | 0.888 | 0.868, 0.909 | 0.882 | 0.857, 0.907 |
| Respiratory, digestive, and renal diseases | | | | |
| Chronic pulmonary disease | 1.618 | 1.595, 1.643 | 1.610 | 1.584, 1.636 |
| Renal failure | 1.965 | 1.897, 2.035 | 1.786 | 1.725, 1.849 |
| Peptic ulcer disease excluding bleeding | 0.990 | 0.946, 1.036 | 1.025 | 0.971, 1.083 |
| Liver disease | 1.270 | 1.211, 1.332 | 1.218 | 1.155, 1.283 |
| Endocrine, nutritional, and metabolic diseases | | | | |
| Diabetes without chronic complications | 1.436 | 1.416, 1.457 | 1.327 | 1.305, 1.348 |
| Diabetes with chronic complications | 1.738 | 1.696, 1.780 | 1.616 | 1.572, 1.662 |
| Hypothyroidism | 0.968 | 0.953, 0.983 | 0.971 | 0.944, 1.000 |
| Fluid and electrolyte disorders | 1.241 | 1.218, 1.264 | 1.235 | 1.205, 1.265 |
| Weight loss | 1.524 | 1.476, 1.574 | 1.470 | 1.410, 1.533 |
| Obesity | 0.987 | 0.945, 1.031 | 0.967 | 0.909, 1.029 |
| Neoplasms | | | | |
| Solid tumor without metastasis | 1.265 | 1.237, 1.294 | 1.165 | 1.144, 1.186 |
| Metastatic cancer | 3.508 | 3.365, 3.656 | 3.180 | 3.038, 3.329 |
| Lymphoma | 1.882 | 1.778, 1.991 | 1.956 | 1.843, 2.075 |
| Blood diseases | | | | |
| Deficiency anemia | 1.157 | 1.131, 1.185 | 1.139 | 1.105, 1.173 |
| Chronic blood loss anemia | 1.084 | 1.038, 1.132 | 1.098 | 1.041, 1.157 |
| Coagulopathy | 1.207 | 1.169, 1.247 | 1.141 | 1.103, 1.180 |
| Disease of nervous system or connective tissue | | | | |
| Paralysis | 1.307 | 1.248, 1.370 | 1.323 | 1.253, 1.398 |
| Neurological disorders other than paralysis | 1.778 | 1.736, 1.821 | 1.738 | 1.693, 1.785 |
| Rheumatoid arthritis/collagen vascular disease | 1.224 | 1.194, 1.255 | 1.112 | 1.068, 1.157 |
| Mental disorders | | | | |
| Depression | 1.307 | 1.281, 1.333 | 1.315 | 1.276, 1.355 |
| Psychoses | 1.554 | 1.501, 1.608 | 1.603 | 1.526, 1.684 |
| Infectious diseases and substance abuse | | | | |
| Alcohol abuse | 1.505 | 1.391, 1.628 | 1.455 | 1.378, 1.536 |
| Drug abuse | 1.155 | 1.059, 1.260 | 1.173 | 1.050, 1.310 |
| Acquired immunodeficiency syndrome | 1.969 | 0.939, 4.132 | 1.576 | 1.073, 2.315 |

Abbreviations: CI, confidence interval; HR, hazard ratio.

a The subjects were identified from the 5% Medicare claims data in 2000. Survival was tracked from January 1, 2000, through December 31, 2009. We randomly split the data into training and validation samples. The training sample was used for model development.

From the sex-specific Cox proportional-hazard models in Table 3, we obtained the coefficient estimates (see Web Table 1, available at http://aje.oxfordjournals.org/) and survival function estimates for patients aged 66 years with no comorbidity at baseline (see Web Table 2). The coefficient estimates and baseline survival function estimates were used to calculate median survival times and survival probabilities at 1, 5, 7, and 10 years’ follow-up for men and women in the validation set. Median survival time is used as a proxy for life expectancy (6). For example, for an 82-year-old woman with a history of peripheral vascular disease, uncomplicated hypertension, uncomplicated diabetes, and obesity, our model predicted a life expectancy of 5.6 years. Her probabilities of living longer than 1, 5, 7, and 10 years were estimated to be 0.91, 0.56, 0.39, and 0.20, respectively.
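The first step of such a calculation can be illustrated with the rounded hazard ratios for women published in Table 3. The sketch below computes the hazard of the 82-year-old woman in the example relative to the baseline patient (aged 66 years with no comorbidity). It assumes that age enters the model linearly and that "uncomplicated diabetes" corresponds to the Table 3 row "Diabetes without chronic complications"; the dictionary keys and function name are hypothetical. Converting the relative hazard into a median survival time additionally requires the baseline survival function (Web Table 2), which is not reproduced here.

```python
import math

# Rounded hazard ratios for women, copied from Table 3.
HR_WOMEN = {
    "age_per_year": 1.107,
    "peripheral_vascular_disease": 1.425,
    "hypertension_uncomplicated": 0.965,
    "diabetes_without_chronic_complications": 1.436,
    "obesity": 0.987,
}
BASELINE_AGE = 66  # baseline patient: aged 66 years, no comorbidity


def relative_hazard(age, conditions, hr=HR_WOMEN, baseline_age=BASELINE_AGE):
    """Hazard relative to the baseline patient, assuming a linear age term."""
    log_hazard = (age - baseline_age) * math.log(hr["age_per_year"])
    for name in conditions:
        log_hazard += math.log(hr[name])
    return math.exp(log_hazard)


example = relative_hazard(
    82,
    [
        "peripheral_vascular_disease",
        "hypertension_uncomplicated",
        "diabetes_without_chronic_complications",
        "obesity",
    ],
)
```

Because the published hazard ratios are rounded to 3 decimal places, a result computed this way is only an approximation of what the unrounded coefficients in Web Table 1 would give.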

Table 4 shows the calibration and discrimination of the Cox proportional-hazards models. The models were well calibrated on the basis of estimates of γ0 being close to 0 and γ1 close to 1. The values of γ0 and γ1 for death at 1 year were not as good as those for death at 5 or 10 years. The C statistics, measures of discrimination, were 0.78–0.79 for women and 0.76–0.77 for men. Table 4 also shows the predictive accuracy of the Cox proportional-hazard models by comparing the predicted risk of death to the observed mortality rate. The predicted risk of death was categorized as <25%, 25%–49%, 50%–74%, and ≥75% probability of death. The results were remarkably similar for both sexes and in both the training and validation samples. For 5-, 7-, and 10-year cutoffs, the observed mortality rates fell within the boundaries of the predicted probability of death in each instance. For example, of 338,596 women in the validation data, 66.3% (n = 224,432) had a <25% risk of death at 5 years. About 12.1% of patients with <25% risk of death at 5 years actually died within 5 years. About 6.8% (n = 22,883) and 2.8% (n = 9,319) had 50%–74% and ≥75% risk of death at 5 years, respectively. The observed mortality rates at 5 years for these 2 groups were 63.5% and 80.4%, respectively. For death at 10 years, 22.4% (n = 75,766) of women had <25% risk of death and 16.4% (n = 55,547) had ≥75% risk of death. The observed mortality rates at 10 years in these 2 groups were 16.6% and 88.4%, respectively. The predictions of death at 1 year were less accurate. The models overestimated the risk of death at 1 year for the high-risk groups.
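The comparison of predicted risk categories with observed mortality can be sketched as follows. This is an illustrative reimplementation of the grouping described above, with hypothetical function names; it assumes that each subject has a predicted probability of death by the cutoff and a 0/1 indicator of death by the cutoff.

```python
def risk_group(predicted_risk):
    """Assign one of the four predicted-risk categories used in Table 4."""
    if predicted_risk < 0.25:
        return "<25%"
    if predicted_risk < 0.50:
        return "25%-49%"
    if predicted_risk < 0.75:
        return "50%-74%"
    return ">=75%"


def observed_mortality_by_group(predicted_risk, died):
    """Observed mortality rate (%) within each predicted-risk category."""
    totals = {}
    deaths = {}
    for p, d in zip(predicted_risk, died):
        group = risk_group(p)
        totals[group] = totals.get(group, 0) + 1
        deaths[group] = deaths.get(group, 0) + (1 if d else 0)
    return {g: 100.0 * deaths[g] / totals[g] for g in totals}
```

A model is well calibrated in this sense when the observed rate in each category falls inside that category's boundaries.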

Table 4.

Validation of the Sex-specific Cox Proportional-Hazard Models With Predictors of Age and 31 Elixhauser Comorbidities in the Validation Sample of Medicare Beneficiaries Used to Test Models That Predict Life Expectancy From Medicare Claims Data, United States, 2000–2009^a

Estimates for Model^b: logit(p) = γ0 + γ1 logit(p̂)

| Sex and Cutoff, years | γ0^b | γ1^b | C Statistic^b | Observed Mortality Rate, %: <25% Predicted Risk of Death | Observed Mortality Rate, %: 25%–49% Predicted Risk of Death | Observed Mortality Rate, %: 50%–74% Predicted Risk of Death | Observed Mortality Rate, %: ≥75% Predicted Risk of Death |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Female (n = 338,596) | | | | | | | |
| 1 | 0.312 | 1.119 | 0.79 | 4.3 | 33.4 | 46.6 | 52.4 |
| 5 | 0.075 | 1.065 | 0.78 | 12.1 | 38.1 | 63.5 | 80.4 |
| 7 | 0.052 | 1.074 | 0.79 | 14.0 | 36.8 | 64.6 | 85.0 |
| 10 | 0.013 | 1.081 | 0.79 | 16.6 | 35.4 | 65.2 | 88.4 |
| Male (n = 230,059) | | | | | | | |
| 1 | 0.238 | 1.102 | 0.77 | 5.3 | 34.3 | 44.9 | 66.1 |
| 5 | 0.064 | 1.065 | 0.76 | 14.7 | 36.8 | 64.2 | 83.3 |
| 7 | 0.041 | 1.069 | 0.76 | 17.5 | 35.6 | 65.0 | 86.9 |
| 10 | 0.005 | 1.084 | 0.77 | 20.4 | 34.8 | 63.5 | 89.6 |

a The subjects were identified from the 5% Medicare claims data in 2000. Survival was tracked from January 1, 2000, through December 31, 2009. We randomly split the data into training and validation samples. All values in the training and validation samples were very similar, with differences within 0.01 of each other.

b For each cutoff (1, 5, 7, and 10 years), we evaluated the validity of our predicted probabilities by using estimates of γ0, γ1, and C statistic from a logistic regression model, logit(p) = γ0 + γ1 logit(p̂), where p is the observed survival and p̂ is the predicted survival probability estimated from our sex-specific Cox proportional-hazard model. Estimated values of γ0 close to 0 and γ1 close to 1 indicate good calibration, whereas a higher value of C statistic indicates better discrimination.
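The calibration regression in footnote b can be sketched as a two-parameter logistic regression of the observed 0/1 outcome on the logit of the predicted probability. The code below is a minimal Newton-Raphson fit written for illustration only; it is not the SAS procedure used in the study, and it assumes predicted probabilities strictly between 0 and 1 that take at least two distinct values.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def fit_calibration(predicted, outcome, iterations=25):
    """Estimate (g0, g1) in logit(P(outcome = 1)) = g0 + g1 * logit(predicted)."""
    x = [logit(p) for p in predicted]
    g0, g1 = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (u0, u1) and information matrix [[h00, h01], [h01, h11]].
        u0 = u1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcome):
            mu = 1.0 / (1.0 + math.exp(-(g0 + g1 * xi)))
            w = mu * (1.0 - mu)
            u0 += yi - mu
            u1 += (yi - mu) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system information * delta = score.
        d0 = (h11 * u0 - h01 * u1) / det
        d1 = (h00 * u1 - h01 * u0) / det
        g0 += d0
        g1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return g0, g1
```

When the observed event frequency matches the predicted probability at every level of prediction, the fit returns an intercept of 0 and a slope of 1, which is the pattern the footnote describes as good calibration.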

Figure 1 describes actual survival over the 10-year follow-up for subjects with predicted life expectancies of <1, <5, <7, and <10 years in the validation sample. The Kaplan-Meier survival curves showed that these groups had distinct survival trajectories. For those with a predicted life expectancy of <10 years, 74.7% died within 10 years. For those with a predicted life expectancy of <7 years, 71.6% died within 7 years. For those with <5 years of predicted life expectancy, 69.2% died within 5 years. For those with <1 year of predicted life expectancy, 47.6% died within 1 year. The results were almost identical for subjects in the training sample (data not shown).

Figure 1.

Actual survival rate over 10-year follow-up for subjects with predicted life expectancies of <1, <5, <7, and <10 years, stratified by sex, in a validation sample of Medicare beneficiaries from the 5% Medicare claims data in 2000 who were followed up through 2009. The validation sample was used to test sex-specific Cox proportional-hazard models to predict life expectancy from Medicare claims data.

DISCUSSION

We developed and validated sex-specific predictive models that use age and comorbidities to predict individual life expectancy and risk of death at up to 10 years from Medicare claims data in older people. The C statistics ranged from 0.76 to 0.79, depending on sex and the cutoff year of death (death at 1, 5, 7, or 10 years). The predicted life expectancy and risk of death accurately calibrated and discriminated the observed mortality rate for both sexes in the training and validation data. More than 80% of subjects with <25% risk of death at 5, 7, and 10 years survived longer than the chosen cutoff years. More than 80% of subjects with ≥75% risk of death at 5, 7, and 10 years died within those cutoff years.

To our knowledge, only a few publications focus on life expectancy prediction. The Centers for Disease Control and Prevention publishes annual life tables (31) that contain life expectancy estimates stratified by age, sex, and race/ethnicity. These life tables were generated with the use of actuarial life table methods without consideration of health conditions. Some published prognostic indexes for community-dwelling patients focus on predicting death at a fixed time point through logistic regression modeling (e.g., a prognostic index to predict death at 4 years on the basis of logistic regression) (32). Two prognostic indices were developed with Cox proportional-hazards modeling to estimate life expectancy or risk of death within certain years (33, 34). Carey et al. (34) developed a model to predict the need of frail elderly persons for long-term care by predicting risk of death within 3 years (C statistic = 0.69). An index by Schonberg et al. (33) was initially developed from the National Health Interview Survey to predict death at 5 years (C statistic = 0.75) and was later externally validated to predict death at up to 9 years (35). In comparison, our sex-specific predictive models had comparable or slightly higher discrimination power (C statistics = 0.76–0.81). In addition, the indices by Carey et al. and Schonberg et al. were developed from survey data and contain self-reported functional status. They are not applicable to studies that use administrative claims data.

Our study also showed an adaptive use of well-established comorbidity measures (17–22) to estimate life expectancy from claims data. The inclusion of comorbidity measures significantly improved predictive ability relative to that of models that used age and sex alone. Consistent with findings that the Elixhauser method performed better than the Charlson method in predicting in-hospital death (36), our models with Elixhauser comorbidities also had slightly better predictive power than the models with Charlson measures. This could be due to the larger scope of comorbidities captured by the Elixhauser method. We should note that the C statistics for the models that used the Elixhauser and Charlson measures were very close (Table 2), even though the differences were statistically significant. The comparisons based on C statistics and AICs were overpowered because of the very large number of subjects in the 5% Medicare data. We emphasize the absolute differences in predictive accuracy when the PCC is used. Compared with the models that used Charlson measures, the models that used Elixhauser measures had approximately 0.7% higher PCC, which translated to 4,000 more subjects correctly classified in our sample. The predictive power was not further improved by the addition of 4 Charlson conditions (myocardial infarction, cerebrovascular disease, dementia, and leukemia) not included in the Elixhauser conditions. Two of the comorbidities (peptic ulcers and obesity) were not significant predictors of death (Table 3). However, we retained all 31 Elixhauser comorbidities in the final model to preserve the integrity of the comorbidity measurement. Reducing the number of variables will improve the reliability of the predictive models when the number of predictors exceeds one tenth the number of uncensored event times in the training sample (30). This is not a problem for analyses of claims data because of the very large sample sizes.

One potential application of our predictive model is in assessing the quality of preventive services on the basis of Medicare claims data. Target populations that are appropriate or not appropriate for a service can be more accurately defined by patient life expectancy than by age limit alone (37). For example, tight glycemic control in diabetes mellitus to prevent microvascular complications is not recommended for patients with less than 8 years of life expectancy (38), and colonoscopy screening for colon cancer is not recommended for patients with less than 10 years of life expectancy (39, 40). Use of age, sex, and comorbidity allows for a more accurate assessment of potential overuse and underuse of such preventive services than does age alone. Take the case of a 70-year-old man with congestive heart failure and complicated diabetes. Our model predicts a median survival time of 7.4 years and a 0.34 probability of surviving longer than 10 years. A screening colonoscopy for this man would be inappropriate according to life expectancy but would have been categorized as appropriate on the basis of age alone. In contrast, a screening colonoscopy might be appropriate for a 76-year-old woman with no comorbidities who has a 0.63 probability of surviving >10 years. Extending this concept to the population level, one might define the population not appropriate for screening colonoscopy as those with a median survival time (a proxy of life expectancy) of <10 years. On the basis of this criterion, about 74.7% of subjects with a predicted life expectancy of <10 years actually died within 10 years. Alternatively, one might use a cutoff of >75% probability of death at 10 years to define the population not appropriate for screening colonoscopy. According to this more rigorous definition, 89% of those identified died within 10 years. It is important to recognize that all prognostic indices, including ours, lack precision at the individual level (1, 2). Nevertheless, the rationale and evidence underlying all preventive services are at the population level rather than the individual level. The same is true of quality measures.
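The two population-level definitions discussed above (predicted median survival of <10 years, or >75% predicted probability of death at 10 years) can be compared by computing, for each definition, the share of flagged subjects who actually died within 10 years. The sketch below illustrates that calculation under stated assumptions: the `Subject` record, the function names, and every value in the toy cohort are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    median_survival_years: float  # predicted median survival (proxy of life expectancy)
    p_death_10y: float            # predicted probability of death within 10 years
    died_within_10y: bool         # observed outcome at 10 years


def limited_life_expectancy(s: Subject) -> bool:
    """Definition 1: predicted median survival shorter than 10 years."""
    return s.median_survival_years < 10


def high_risk(s: Subject) -> bool:
    """Definition 2 (stricter): predicted 10-year risk of death above 75%."""
    return s.p_death_10y > 0.75


def share_died(subjects, flag):
    """Among subjects selected by `flag`, the fraction who died within 10 years."""
    flagged = [s for s in subjects if flag(s)]
    if not flagged:
        return None  # nobody met the definition
    return sum(s.died_within_10y for s in flagged) / len(flagged)


# Made-up cohort for illustration only.
toy_cohort = [
    Subject(7.0, 0.65, True),
    Subject(8.9, 0.58, False),
    Subject(4.1, 0.90, True),
    Subject(3.0, 0.95, True),
    Subject(12.5, 0.37, False),
    Subject(9.5, 0.52, True),
]

print(share_died(toy_cohort, limited_life_expectancy))
print(share_died(toy_cohort, high_risk))
```

The stricter definition flags fewer subjects, and a larger share of those flagged die within the horizon, which mirrors the trade-off between the 74.7% and 89% figures reported above.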

Current quality assessments focus heavily on appropriate use, with few indicators of overuse (41). Measurement of overuse is challenging because of the increasing heterogeneity of health status among community-dwelling older adults. Preventive services might be appropriate for old but very healthy people. Assessments of overuse would be less accurate if guidelines' upper age limits (e.g., screening mammography for women ≥75 years of age) were used. Our models enable quality-of-care assessments that are based on life expectancy. For example, we used 100% Medicare data from Texas to evaluate screening mammography utilization at the level of the usual-care providers. We found that providers varied greatly in mammography screening use for women with a limited life expectancy (average rate = 31.1%; range, 10.6%–75.7%) (42), which supports the call for refined quality indicators that include both appropriate use and overuse to improve quality of care (37).

The study has several limitations. The most important is that the models are proposed for use with claims data, specifically Medicare claims. We see their utility in evaluating the appropriateness of health-care utilization among groups of patients, not in predicting life expectancy in the clinical setting. Claims data lack information on factors strongly linked to life expectancy, such as self-rated health, functional status, walking speed, and severity of illness (43, 44). The clinician interacting with an individual patient has a richer array of information with which to estimate survival. For example, a study by Lee et al. (32) suggests that C statistics of up to 0.84 can be achieved by incorporating both comorbidities and functional status into a model. The lack of self-rated health and functional status measures in our models might also account for the lower predictive accuracy for death at 1 year. Studies have reported that functional status, age, and sex are the most powerful predictors of the risk of death at 1 year (43, 45, 46). Of all the Charlson comorbidity conditions, only congestive heart failure and cancer had additional predictive power after adjustment for functional status in predicting death at 1 year (47). Another limitation is that the validity of these models over time has not been tested. For example, advances in the treatment of specific diseases, which can occur rapidly, are likely to weaken the relationships between those diseases and mortality rate over time.

Some prior studies of predictors of survival have used a point score to simplify their application. With current high-speed computing power, such simplification is not necessary and could reduce accuracy. A computer program that applies the coefficient estimates and baseline survival function (provided in Web Tables 1 and 2, respectively) from our predictive model to a patient's age, sex, and comorbidity profile can easily calculate the median survival time (a proxy of life expectancy) and the probability of survival beyond a certain year. A factor that might improve the predictive power of the model is inclusion of ethnicity, which is available in more recent Medicare denominator files (48). We chose to omit ethnicity because of the clear association of membership in certain minority ethnicities with receiving less than adequate medical care. The shorter life expectancy in those minorities, partially a result of inadequate medical care, should not be used against them in providing medical services. We used similar reasoning in deciding not to use a proxy indicator of socioeconomic status (Medicaid eligibility or zip code median income) in the models.
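As an illustration of the calculation described above, the sketch below applies the standard Cox proportional-hazards relationship, S(t | x) = S0(t)^exp(βx), to a covariate profile. It is a sketch under stated assumptions rather than the authors' program: every coefficient and baseline survival value is hypothetical (the actual estimates are in Web Tables 1 and 2), the variable names are invented, the baseline is assumed to describe a subject with all covariates equal to zero, and the median is resolved only to whole years.

```python
import math

# Hypothetical log hazard ratios (placeholders, NOT the Web Table 1 estimates).
COEFFICIENTS = {
    "years_of_age_over_66": 0.09,
    "congestive_heart_failure": 0.60,
    "diabetes_complicated": 0.40,
}

# Hypothetical baseline survival S0(t) at whole years of follow-up for a
# reference subject with all covariates zero (NOT the Web Table 2 values).
BASELINE_SURVIVAL = {
    1: 0.985, 2: 0.968, 3: 0.950, 4: 0.930, 5: 0.908,
    6: 0.885, 7: 0.860, 8: 0.835, 9: 0.808, 10: 0.780,
}


def linear_predictor(covariates):
    """Sum of coefficient * covariate value over the supplied profile."""
    return sum(COEFFICIENTS[name] * value for name, value in covariates.items())


def survival_probability(covariates, year):
    """Cox model survival: S(t | x) = S0(t) ** exp(linear predictor)."""
    return BASELINE_SURVIVAL[year] ** math.exp(linear_predictor(covariates))


def median_survival_year(covariates):
    """First tabulated year at which predicted survival is 0.5 or lower.

    Returns None when the median is not reached within the tabulated horizon.
    """
    for year in sorted(BASELINE_SURVIVAL):
        if survival_probability(covariates, year) <= 0.5:
            return year
    return None


# Hypothetical profile: a 70-year-old with heart failure and complicated diabetes.
profile = {
    "years_of_age_over_66": 4,
    "congestive_heart_failure": 1,
    "diabetes_complicated": 1,
}
print(median_survival_year(profile), survival_probability(profile, 10))
```

A sex-specific version would hold one coefficient set and one baseline survival table per sex, and a finer baseline time grid would allow the median to be estimated in fractions of a year.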

In conclusion, we adaptively used Elixhauser comorbidity measures to predict patient life expectancy from Medicare claims data. The predictive model was well calibrated and showed good predictive discrimination for risk of death by 5–10 years. It might be of value for researchers using Medicare claims to evaluate the quality of preventive care and outcomes in the context of patient life expectancy.

Supplementary Material

Web Tables

ACKNOWLEDGMENTS

Author affiliations: Sealy Center on Aging, University of Texas Medical Branch, Galveston, Texas (Alai Tan, Yong-Fang Kuo, and James S. Goodwin); Department of Preventive Medicine and Community Health, University of Texas Medical Branch, Galveston, Texas (Alai Tan, Yong-Fang Kuo, and James S. Goodwin); and Department of Internal Medicine, University of Texas Medical Branch, Galveston, Texas (Yong-Fang Kuo and James S. Goodwin).

This study was supported by the Cancer Prevention Research Institute of Texas, Austin, Texas (grant RP101207), and by the National Institutes of Health, Bethesda, Maryland (grants K05CA134923 and UL1TR00071).

We thank Dr. Sarah Toombs Smith of the Sealy Center on Aging, University of Texas Medical Branch, Galveston, Texas, for her editorial assistance.

Conflict of interest: none declared.

REFERENCES

  • 1. Gill TM. The central role of prognosis in clinical decision making. JAMA. 2012;307(2):199–200. doi: 10.1001/jama.2011.1992.
  • 2. Yourman LC, Lee SJ, Schonberg MA, et al. Prognostic indices for older adults: a systematic review. JAMA. 2012;307(2):182–192. doi: 10.1001/jama.2011.1966.
  • 3. Nyström L, Andersson I, Bjurstam N, et al. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet. 2002;359(9310):909–919. doi: 10.1016/S0140-6736(02)08020-0.
  • 4. Jørgensen KJ, Gøtzsche PC. Overdiagnosis in publicly organised mammography screening programmes: systematic review of incidence trends. BMJ. 2009;339:b2587. doi: 10.1136/bmj.b2587.
  • 5. Paci E, Miccinesi G, Puliti D, et al. Estimate of overdiagnosis of breast cancer due to mammography after adjustment for lead time. A service screening study in Italy. Breast Cancer Res. 2006;8(6):R68. doi: 10.1186/bcr1625.
  • 6. Walter LC, Lindquist K, Nugent S, et al. Impact of age and comorbidity on colorectal cancer screening among older veterans. Ann Intern Med. 2009;150(7):465–473. doi: 10.7326/0003-4819-150-7-200904070-00006.
  • 7. Goodwin JS, Singh A, Reddy N, et al. Overuse of screening colonoscopy in the Medicare population. Arch Intern Med. 2011;171(15):1335–1343. doi: 10.1001/archinternmed.2011.212.
  • 8. Huang ES, Zhang Q, Gandra N, et al. The effect of comorbid illness and functional status on the expected benefits of intensive glucose control in older patients with type 2 diabetes: a decision analysis. Ann Intern Med. 2008;149(1):11–19. doi: 10.7326/0003-4819-149-1-200807010-00005.
  • 9. Skyler JS, Bergenstal R, Bonow RO, et al. Intensive glycemic control and the prevention of cardiovascular events: implications of the ACCORD, ADVANCE, and VA Diabetes Trials: a position statement of the American Diabetes Association and a Scientific Statement of the American College of Cardiology Foundation and the American Heart Association. J Am Coll Cardiol. 2009;53(3):298–304. doi: 10.1016/j.jacc.2008.10.008.
  • 10. Randolph WM, Mahnken JD, Goodwin JS, et al. Using Medicare data to estimate the prevalence of breast cancer screening in older women: comparison of different methods to identify screening mammograms. Health Serv Res. 2002;37(6):1643–1657. doi: 10.1111/1475-6773.10912.
  • 11. Randolph WM, Goodwin JS, Mahnken JD, et al. Regular mammography use is associated with elimination of age-related disparities in size and stage of breast cancer at diagnosis. Ann Intern Med. 2002;137(10):783–790. doi: 10.7326/0003-4819-137-10-200211190-00006.
  • 12. Armstrong K, Long JA, Shea JA. Measuring adherence to mammography screening recommendations among low-income women. Prev Med. 2004;38(6):754–760. doi: 10.1016/j.ypmed.2003.12.023.
  • 13. Walter LC, Covinsky KE. Cancer screening in elderly patients: a framework for individualized decision making. JAMA. 2001;285(21):2750–2756. doi: 10.1001/jama.285.21.2750.
  • 14. Sima CS, Panageas KS, Schrag D. Cancer screening among patients with advanced cancer. JAMA. 2010;304(14):1584–1591. doi: 10.1001/jama.2010.1449.
  • 15. Mehta KM, Fung KZ, Kistler CE, et al. Impact of cognitive impairment on screening mammography use in older US women. Am J Public Health. 2010;100(10):1917–1923. doi: 10.2105/AJPH.2008.158485.
  • 16. Tan A, Kuo YF, Goodwin JS. Integrating age and comorbidity to assess screening mammography utilization. Am J Prev Med. 2012;42(3):229–234. doi: 10.1016/j.amepre.2011.11.008.
  • 17. Charlson ME, Pompei P, Ales KL, et al. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373–383. doi: 10.1016/0021-9681(87)90171-8.
  • 18. Elixhauser A, Steiner C, Harris DR, et al. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8–27. doi: 10.1097/00005650-199801000-00004.
  • 19. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol. 1992;45(6):613–619. doi: 10.1016/0895-4356(92)90133-8.
  • 20. Klabunde CN, Potosky AL, Legler JM, et al. Development of a comorbidity index using physician claims data. J Clin Epidemiol. 2000;53(12):1258–1267. doi: 10.1016/s0895-4356(00)00256-0.
  • 21. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130–1139. doi: 10.1097/01.mlr.0000182534.19832.83.
  • 22. Romano PS, Roos LL, Jollis JG. Adapting a clinical comorbidity index for use with ICD-9-CM administrative data: differing perspectives. J Clin Epidemiol. 1993;46(10):1075–1079; discussion 1081–1090. doi: 10.1016/0895-4356(93)90103-8.
  • 23. National Center for Health Statistics, Office of Analysis and Epidemiology. Analytic Issues in Using the Medicare Enrollment and Claims Data Linked to NCHS Surveys. Hyattsville, MD: National Center for Health Statistics; 2012. http://www.cdc.gov/nchs/data/datalinkage/cms_medicare_analytic_issues_final.pdf. (Accessed February 1, 2012).
  • 24. Ghali WA, Hall RE, Rosen AK, et al. Searching for an improved clinical comorbidity index for use with ICD-9-CM administrative data. J Clin Epidemiol. 1996;49(3):273–278. doi: 10.1016/0895-4356(95)00564-1.
  • 25. Cleves MA, Sanchez N, Draheim M. Evaluation of two competing methods for calculating Charlson's comorbidity index when analyzing short-term mortality using administrative data. J Clin Epidemiol. 1997;50(8):903–908. doi: 10.1016/s0895-4356(97)00091-7.
  • 26. National Cancer Institute. SEER-Medicare: Calculation of Comorbidity Weights. Bethesda, MD: National Cancer Institute; 2012. http://healthservices.cancer.gov/seermedicare/program/comorbidity.html. (Accessed November 30, 2012).
  • 27. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–845.
  • 28. Waldron I. Sex differences in illness incidence, prognosis and mortality: issues and evidence. Soc Sci Med. 1983;17(16):1107–1123. doi: 10.1016/0277-9536(83)90004-7.
  • 29. Wingard DL. The sex differential in morbidity, mortality, and lifestyle. Annu Rev Public Health. 1984;5:433–458. doi: 10.1146/annurev.pu.05.050184.002245.
  • 30. Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15(4):361–387. doi: 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4.
  • 31. Arias E. United States life tables, 2007. National vital statistics reports; vol 59 no 9. Hyattsville, MD: National Center for Health Statistics; 2011.
  • 32. Lee SJ, Lindquist K, Segal MR, et al. Development and validation of a prognostic index for 4-year mortality in older adults. JAMA. 2006;295(7):801–808. doi: 10.1001/jama.295.7.801.
  • 33. Schonberg MA, Davis RB, McCarthy EP, et al. Index to predict 5-year mortality of community-dwelling adults aged 65 and older using data from the National Health Interview Survey. J Gen Intern Med. 2009;24(10):1115–1122. doi: 10.1007/s11606-009-1073-y.
  • 34. Carey EC, Covinsky KE, Lui LY, et al. Prediction of mortality in community-living frail elderly people with long-term care needs. J Am Geriatr Soc. 2008;56(1):68–75. doi: 10.1111/j.1532-5415.2007.01496.x.
  • 35. Schonberg MA, Davis RB, McCarthy EP, et al. External validation of an index to predict up to 9-year mortality of community-dwelling adults aged 65 and older. J Am Geriatr Soc. 2011;59(8):1444–1451. doi: 10.1111/j.1532-5415.2011.03523.x.
  • 36. Stukenborg GJ, Wagner DP, Connors AF. Comparison of the performance of two comorbidity measures, with and without information from prior hospitalizations. Med Care. 2001;39(7):727–739. doi: 10.1097/00005650-200107000-00009.
  • 37. Lee SJ, Walter LC. Quality indicators for older adults: preventing unintended harms. JAMA. 2011;306(13):1481–1482. doi: 10.1001/jama.2011.1418.
  • 38. Brown AF, Mangione CM, Saliba D, et al. Guidelines for improving the care of the older person with diabetes mellitus. J Am Geriatr Soc. 2003;51(5 Suppl Guidelines):S265–S280. doi: 10.1046/j.1532-5415.51.5s.1.x.
  • 39. Levin B, Lieberman DA, McFarland B, et al. Screening and surveillance for the early detection of colorectal cancer and adenomatous polyps, 2008: a joint guideline from the American Cancer Society, the US Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology. Gastroenterology. 2008;134(5):1570–1595. doi: 10.1053/j.gastro.2008.02.002.
  • 40. McFarland EG, Levin B, Lieberman DA, et al. Revised colorectal screening guidelines: joint effort of the American Cancer Society, U.S. Multisociety Task Force on Colorectal Cancer, and American College of Radiology. Radiology. 2008;248(3):717–720. doi: 10.1148/radiol.2483080842.
  • 41. Physician Consortium for Performance Improvement (PCPI®). PCPI® and PCPI Approved Quality Measures. Chicago, IL: American Medical Association; 2011. http://www.ama-assn.org/apps/listserv/x-check/qmeasure.cgi?submit=PCPI. (Accessed February 1, 2013).
  • 42. Tan A, Kuo Y-F, Elting LS, et al. Refining physician quality indicators for screening mammography in older women: distinguishing appropriate use from overuse. J Am Geriatr Soc. 2013;61(3):380–387. doi: 10.1111/jgs.12151.
  • 43. Inouye SK, Peduzzi PN, Robison JT, et al. Importance of functional measures in predicting mortality among older hospitalized patients. JAMA. 1998;279(15):1187–1193. doi: 10.1001/jama.279.15.1187.
  • 44. Goodwin JS. Gait speed: comment on “rethinking the association of high blood pressure with mortality in elderly adults”. Arch Intern Med. 2012;172(15):1168–1169. doi: 10.1001/archinternmed.2012.2642.
  • 45. Mazzaglia G, Roti L, Corsini G, et al. Screening of older community-dwelling people at risk for death and hospitalization: the Assistenza Socio-Sanitaria in Italia project. J Am Geriatr Soc. 2007;55(12):1955–1960. doi: 10.1111/j.1532-5415.2007.01446.x.
  • 46. Covinsky KE, Justice AC, Rosenthal GE, et al. Measuring prognosis and case mix in hospitalized elders. The importance of functional status. J Gen Intern Med. 1997;12(4):203–208. doi: 10.1046/j.1525-1497.1997.012004203.x.
  • 47. Walter LC, Brand RJ, Counsell SR, et al. Development and validation of a prognostic index for 1-year mortality in older adults after hospitalization. JAMA. 2001;285(23):2987–2994. doi: 10.1001/jama.285.23.2987.
  • 48. Centers for Medicare & Medicaid Services. Final Medicare Part D Data Regulation (CMS-4119-F). Baltimore, MD: Centers for Medicare & Medicaid Services; 2008. http://www.cms.gov/Medicare/Prescription-Drug-Coverage/PrescriptionDrugCovGenIn/downloads/PartDClaimsDataFactSheet.pdf. (Accessed February 1, 2012).

Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press