A large number of studies have linked myriad factors to human survival, encompassing influences such as the social environment as well as proximate determinants such as health conditions and biological markers. For example, an extensive literature explores Link and Phelan’s premise that social conditions are “fundamental causes” of disease that involve access to the necessary resources to both prevent and treat illness (Link and Phelan 1995). A long history of research demonstrates that a large number of such factors, including education, income, and social networks, are significantly associated with subsequent mortality (see, for example, Elo 2009).
Although Link and Phelan were concerned about an overemphasis on individual risk factors at the expense of societal context, researchers have continued to analyze health-related behaviors and a broad range of self-reported health conditions and physical function as correlates of survival. With the increasing availability of health interview surveys, scholars have paid more attention to these proximate determinants. Moreover, as some health interview surveys have moved beyond the collection of self-reported information to include biological and clinical variables, social scientists and health researchers have used these biosocial surveys of population-based samples to expand their set of individual-based risk factors (Crimmins, Kim, and Vasunilashorn 2010; Glei et al. 2014; Newson et al. 2010; Rosero-Bixby and Dow 2012; Seeman et al. 2001). However, these researchers have rarely made explicit forecasts of survival, instead examining the significance of demographic, social, health, and biological variables in statistical models of mortality.
Clinicians have also had a large role in identifying determinants of survival, relying primarily on clinical parameters and symptoms of disease rather than social factors. In contrast to social scientists and epidemiologists, much of their research has focused on the likelihood of survival for chronically ill patients, generally considering a shorter timeframe than social scientists. A set of validated prognostic indexes reflects their efforts (Yourman et al. 2012); these indexes were designed in part to help physicians, health personnel, and patients’ families deal with end-of-life decision-making.
A recent study bridged the gap between the approaches of researchers and clinicians by encompassing many of the variables considered by both groups and by explicitly evaluating their predictive ability (Goldman, Glei, and Weinstein 2016). Goldman and colleagues identified the strongest predictors of five-year survival in national samples of older adults in four countries with similar life expectancy: the United States, England, Costa Rica, and Taiwan. Although the authors identified remarkable consistency in the leading predictors among these four surveys, they recognized that potential heterogeneity may be present within samples, leaving an important question unanswered: Do the variables that best predict survival differ across demographic subgroups (defined by age, sex, and race/ethnicity)? For example, social factors may have less predictive ability among the elderly than for young and middle-aged adults. Researchers have not addressed this type of heterogeneity directly. However, past research reveals variation by age, sex, and race in the association between risk factors and mortality, suggesting a need to stratify prediction models. Such stratification would provide a deeper understanding of the survival process and improve prognosis. In this article, we use self-reported and clinical information in the National Health and Nutrition Examination Survey (NHANES) to assess whether the strongest predictors of survival in the US vary by age, sex, and race/ethnicity.
Background
Assessing the strength of prediction
A critical consideration in mortality prediction is the choice of a metric to assess the strength of predictors. In studies of the determinants of health and survival, social scientists have relied primarily on the magnitude of the effect size—typically an odds ratio or hazard ratio in survival models—along with statistical significance. Although such criteria are useful for identifying risk factors, they are not appropriate for prediction: even a large hazard ratio may fail to discriminate well between those who survive and those who die over a follow-up period.
This limitation of the odds (or hazard) ratio can be understood in terms of two measures frequently employed in health and medical research: sensitivity and specificity. For survival models, sensitivity reflects the probability that a model correctly predicts death among persons who died over the interval, and specificity reflects the corresponding probability that a model makes the correct prediction for survivors. Pepe and colleagues (2004) showed that in order to classify both survivors and decedents correctly (i.e., to have both high sensitivity and high specificity), an odds ratio would need to reach an order of magnitude rarely achieved in health research. In a simple example based on a binary predictor, they demonstrated that even a substantial odds ratio of three can result in poor classification, because the odds ratio is consistent with an infinite number of combinations of sensitivity and specificity, failing to guarantee that both measures are sufficiently large. Thus, instead of effect sizes, health researchers generally rely on discrimination measures to gauge the strength of a prediction, most frequently the area under the receiver operating characteristic curve (AUC). The receiver operating characteristic (ROC) curve is a graph of the true positive rate (i.e., sensitivity) against the false positive rate (i.e., one minus specificity), calculated for each possible cutoff point of the hazard model for distinguishing between survivors and decedents. The area under that curve (AUC) summarizes how well the model discriminates between these two groups and can be interpreted as the probability that the model assigns a higher predicted risk of death to a randomly chosen decedent than to a randomly chosen survivor (Pencina and D’Agostino 2004). An AUC value of 1.0 denotes perfect discrimination, whereas a value of 0.5 indicates that the model performs no better than a coin toss.
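The argument about odds ratios can be made concrete with a small sketch. It is not taken from Pepe and colleagues; it relies only on the standard identity that, for a binary predictor, the odds ratio equals the odds of a positive result among decedents divided by the odds of a positive result among survivors. The function names and the specificity values are illustrative assumptions.

```python
def odds_ratio(sensitivity: float, specificity: float) -> float:
    """Odds ratio of a binary predictor, given its sensitivity and specificity."""
    odds_positive_among_decedents = sensitivity / (1.0 - sensitivity)
    odds_positive_among_survivors = (1.0 - specificity) / specificity
    return odds_positive_among_decedents / odds_positive_among_survivors


def sensitivity_for(target_odds_ratio: float, specificity: float) -> float:
    """Sensitivity implied by a given odds ratio at a given specificity."""
    odds_positive_among_survivors = (1.0 - specificity) / specificity
    odds_positive_among_decedents = target_odds_ratio * odds_positive_among_survivors
    return odds_positive_among_decedents / (1.0 + odds_positive_among_decedents)


if __name__ == "__main__":
    # Hypothetical specificities; each row is consistent with an odds ratio of 3.
    for spec in (0.50, 0.75, 0.90, 0.95):
        sens = sensitivity_for(3.0, spec)
        print(f"specificity={spec:.2f}  sensitivity={sens:.2f}  "
              f"odds ratio={odds_ratio(sens, spec):.1f}")
```

Under these assumptions, an odds ratio of three at 90 percent specificity corresponds to a sensitivity of only 0.25: three of every four decedents would be misclassified as survivors.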
The value of biomarkers for mortality prediction
Biosocial surveys, which include biological markers along with self-reported health and background information, have provided researchers with an opportunity to assess the value of those markers for predicting survival, particularly in comparison with self-reported information that is much easier and less expensive to obtain. Based on their review of prior studies that used biomarkers from multiple physiological systems to predict all-cause mortality and evaluated their discriminative ability (based on the AUC), Glei and colleagues concluded that the addition of biological markers to statistical models of mortality generally improved prediction (Glei et al. 2014). They also found support for the inclusion of biomarkers in their own analysis based on a large set of predictors in the Social Environment and Biomarkers of Aging Study (SEBAS) in Taiwan, although the effects of individual biomarkers, and of changes in these markers, were small. The predictive ability of a set of biomarkers for five-year mortality was approximately equal to that of a set of self-reported health variables, but the collection of biomarkers in the Taiwan study comprised 19 measures whereas the self-reported health collection comprised only six health indicators (although one was based on eight mobility tasks). Still, biomarkers continued to improve prediction when self-reported health information was included in models (ibid.).
Identifying the best predictors of survival for the total population
Goldman, Glei, and Weinstein (2016) undertook a systematic evaluation of what mattered most for survival prediction using 57 variables (25 of which were available in all surveys) derived from four biosocial surveys; these variables reflected a broad range of factors shown by social scientists and health researchers to be linked to survival. Their objective was to identify the variables that best discriminated between survivors and decedents over a five-year period and hence were most important in determining whether an older adult would die during that interval. Their conclusions underscored the importance of self-reported information on health and physical function. The findings were remarkably similar for Taiwan, Costa Rica, England, and the US—four countries with similar life expectancy but diverse cultural traditions and levels of economic development.
Their analysis for the US was based on 2,023 persons aged 50 and older in the 2005–2006 wave of NHANES. The findings, based on models that controlled for age and sex, indicated that three of the top six measures reflected self-assessments of physical function: mobility, instrumental activities of daily living (IADL), and activities of daily living (ADL); the other three measures also reflected health status: an indicator for respondents who were using at least five medications; the simple five-category question about overall self-assessed health (SAH); and heart rate. In addition to heart rate, three other biomarkers (C-reactive protein [CRP], albumin, and homocysteine) were among the 17 (out of 34) variables with a meaningful improvement in prediction, defined as an increment in the AUC value of at least 0.01 (Pencina et al. 2008). Other predictors with an increase in the AUC of at least 0.01 were indicators of whether the respondent had been diagnosed with each of three chronic diseases (heart disease, diabetes, and stroke), the number of hospitalizations within the past 12 months, and four socioeconomic or behavioral variables (exercise, smoking status, education, and income).
Variation in the effects of predictors across demographic groups
Statistical models provide some insight into the potential for variation in the predictive power of variables associated with mortality. As noted above, social scientists estimate survival models in order to identify the correlates and causal factors underlying survival rather than to make explicit forecasts. Their findings derive from effect sizes (e.g., the magnitude of coefficients or odds or hazard ratios) rather than from discrimination measures and thus provide limited insight into the strength of the covariates for the prediction of survival status. We highlight a few prominent examples of research that examines variation by age, sex, or race in the broad set of mortality determinants considered in the present analysis.
Associations between social factors and mortality
One area of extensive research on heterogeneity in covariate effects on mortality relates to how social disparities in survival vary by age. A theoretical perspective referred to as “cumulative advantage” hypothesizes that the disadvantages of lower socioeconomic status cumulate through the life course, resulting in widening inequalities at older ages (DiPrete and Eirich 2006). In contrast, the “age-as-leveler” hypothesis theorizes that the effects of socioeconomic status decrease later in life, possibly owing to selective mortality at younger ages, to an increase in more proximate health risks or biological frailty, or to the introduction of programs designed to improve the well-being of the elderly. Empirical research generally supports the age-as-leveler pattern: measures such as income and education have most often been found to have decreasing effects with age (see, for example, House et al. 1994; Dupre 2007; Lynch 2008; Lauderdale 2001; Markides and Black 1996).
In a similar vein, age-as-leveler arguments have been used to explain diminishing effects of marital status and social support with advancing age (Goldman, Korenman, and Weinstein 1995), as well as declining differentials by race/ethnicity. For example, Markides and Black (1996) found that, for black/white mortality differentials, there is stronger evidence for reduced inequalities with age than for the “double jeopardy” hypothesis of increasing differentials.
Demographers have also identified substantial sex differences in the association between social factors and mortality. For what is hypothesized to be a combination of social and health-related explanations, the longevity advantage of married persons, compared with the single and formerly married, is consistently greater for husbands than for wives (see, for example, Hu and Goldman 1990). Similarly, social integration (i.e., social ties and participation in activities) appears to have a stronger protective impact for men than for women (see, for example, Seeman 1996).
Associations between self-assessments of health and mortality
The simple self-assessed health (SAH) question, which asks respondents to evaluate their overall health in terms of one of four or five adjectives, is one of the most frequently used measures in health research, yet its strong association with mortality is only partly understood. SAH assessments have been shown to reflect a complex cognitive process with the type of information included in the evaluation and the overall assessment of that information (e.g., the reference group) varying by demographic characteristics (see, for example, Idler and Benyamini 1997; Jylhä 2009). Thus, it is not surprising that there is substantial variation across groups in the relationship between SAH and mortality. For example, a recent study found a large decline in the association between SAH and mortality across middle and older ages (Zajacova and Woo 2016), perhaps because older adults evaluated their health more positively than their younger counterparts. Franks and colleagues (2003) found a similar but more general result using four subscales from the 20-item Short Form Health Survey (SF-20), including one on physical function: the self-assessments had a stronger association with mortality among younger respondents.
There is a debate about whether the effect of SAH on mortality should be greater for men or women. Social scientists agree that SAH is a more inclusive measure for women; that is, women are generally more aware of their symptoms and hence are more likely than men to provide accurate reports of their health (Case and Paxson 2005; Benyamini et al. 2003). These relationships are further complicated by the fact that men and women experience different health trajectories at older ages, with women living longer while also experiencing longer periods of declining health (Benyamini et al. 2003). In general, studies have been more likely to find stronger associations between SAH and survival for men than for women, although the evidence remains mixed.
The strength of the association between SAH and mortality varies considerably across ethnic groups, in part because of cultural differences in interpretations of illness and variation in reference groups. In particular, several studies have found that SAH predicts survival less well for blacks than for whites (Assari, Lankarani, and Burgard 2016; Woo and Zajacova 2016), a result that is consistent with racial differences in the association between the SF-20 and mortality (Franks, Gold, and Fiscella 2003). Weaker associations between SAH and mortality are also apparent for Latinos than for whites. Finch et al. (2002) suggested that the large concentration of immigrants among Latinos weakens the association because of their relatively low level of acculturation to US society. Bzostek et al. (2007) underscored numerous potential problems with SAH as a measure of health for Latinos and, more generally, questioned the use of self-reported information to compare health across ethnic populations, particularly for interviews administered in different languages. In light of the consistency of findings based on the simple SAH question and the SF-20 measure of well-being, self-reported measures of health, functioning, and disability are likely to show variation in their predictive ability across racial and ethnic groups.
Despite the many social science studies that have examined potential variation in the effects of covariates on mortality, none has employed a broad range of variables to measure the predictive strength of these variables—that is, their ability to accurately discriminate between survivors and decedents over a determined length of follow-up. Thus, we have no direct prior information regarding the degree to which the leading predictors of survival vary across demographic groups. However, the studies described above suggest several hypotheses. One is that social factors may yield weaker predictions at older ages. A second is that many of these same social factors may be stronger predictors among men than among women. A third is that self-assessments of health, not just the SAH question but also self-evaluations of function and disability, may be weaker predictors among blacks and Latinos than among whites. Although one objective of our analysis is to evaluate the strength of biomarkers vis-à-vis other variables across demographic groups, we have little a priori information that would allow us to speculate about the differential strength of such markers by age, sex, and race/ethnicity.
Methods
Data
We use a cohort study design based on data from the 1999–2006 waves of the US National Health and Nutrition Examination Survey (NHANES) with mortality follow-up through December 31, 2011. Each two-year cycle of NHANES is based on a stratified, multistage probability sample of non-institutionalized civilians living in the US, with oversampling of people with low income, adolescents (aged 12–19), those aged 65+, African Americans, and Mexican Americans. The response rate to the household interview varies from 79 percent (2003–04 wave) to 84 percent (2001–02 wave). We restrict our analysis to respondents aged 20 and older (n=20,311 across 1999–2006 waves). Of those, 93 percent (n=18,986) participated in the examination component, which was performed in the NHANES mobile examination center, and 89 percent (n=18,046) also provided a blood specimen. Among those who provided a blood specimen, 19 (0.1 percent) were excluded because vital status could not be verified. Thus, our analysis sample comprises 18,027 respondents aged 20 and older who provided a blood sample and for whom vital status could be verified.
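The sample derivation above can be checked with a few lines of arithmetic. The counts come from the text; the variable names are illustrative. The check also suggests that the 89 percent figure is expressed relative to all respondents aged 20 and older rather than to those examined, which is an inference from the numbers and not something the text states.

```python
# Counts reported in the text (NHANES 1999-2006, respondents aged 20 and older).
interviewed_aged_20_plus = 20_311
examined = 18_986
provided_blood = 18_046
vital_status_unverified = 19

# Final analysis sample: blood specimen provided and vital status verified.
analysis_sample = provided_blood - vital_status_unverified

# Percentages as (apparently) defined in the text.
pct_examined = 100 * examined / interviewed_aged_20_plus
pct_blood = 100 * provided_blood / interviewed_aged_20_plus
pct_excluded = 100 * vital_status_unverified / provided_blood

if __name__ == "__main__":
    print(f"analysis sample: {analysis_sample:,}")
    print(f"examined: {pct_examined:.0f}%  blood specimen: {pct_blood:.0f}%  "
          f"excluded: {pct_excluded:.1f}%")
```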
Mortality
Vital status as of January 1, 2012 was determined by linkage with death certificate records from the National Death Index. The length of mortality follow-up varies from five years (for those surveyed in late 2006) to 13 years (for those surveyed in early 1999). For consistency, we restrict our analysis to mortality within five years post-exam. The number of respondents who died within five years post-exam was 1,312.
Predictors
Three basic demographic variables—age, sex, and race/ethnicity—form the basis for defining subgroups and also serve as predictors. We classify self-identified race/ethnicity into four categories, using the term Latino synonymously with Hispanic: non-Latino whites, non-Latino blacks, Latinos, and other/mixed race.1 We do not distinguish between foreign-born and native-born respondents within any of these groups.
Based on findings from clinical, community, and broader population-based samples, we include an additional 29 predictors that have been shown to be associated with mortality. One group includes three types of measures of underlying health: a) illness-related measures (i.e., self-reports of doctor-diagnosed diabetes, cancer, stroke, and heart disease; number of hospitalizations; use of medications); b) overall self-reported health and physical function (mobility limitations and two measures of disability: ADL and IADL limitations); and c) biological and clinical measures associated with chronic disease (e.g., standard markers of cardiovascular/metabolic function and of inflammation). A second group comprises less-proximate determinants of mortality, including a) social and demographic factors (e.g., socioeconomic status, marital status, race/ethnicity) and b) health behaviors (e.g., smoking, physical activity). Appendix Table 1 provides details regarding the measurement of all 32 predictors (30 predictors plus age and sex).2
Analytical strategy
To minimize selection biases resulting from missing data, we followed standard practices of multiple imputation (see the Appendix) (Schafer 1999; Rubin 1996). To account for differential response rates and oversampling, we weighted the descriptive statistics (Appendix Table 2). Using unweighted data, we estimated age-specific mortality using a Gompertz hazards model with age as the metric for time. The Gompertz model, which is frequently used to describe age-specific mortality, provides a fit to age-specific death rates almost identical to that of a non-parametric Cox hazard function.
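For readers unfamiliar with the Gompertz specification, a common parameterization writes the hazard at age $x$ as an exponential function of age, with covariates acting proportionally. The symbols $\alpha$, $\beta$, $\gamma$, and $Z$ below are illustrative assumptions and are not notation used in this article:

```latex
h(x \mid Z) = \alpha \, e^{\beta x} \, e^{\gamma^{\top} Z},
\qquad \alpha > 0,
\qquad\text{so that}\qquad
\log h(x \mid Z) = \log \alpha + \beta x + \gamma^{\top} Z .
```

In this form the log-hazard is linear in age, which is why age can serve as the time metric: $\beta$ captures the rate at which mortality rises with age, and $\gamma$ shifts the whole age curve up or down for a given covariate profile.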
To evaluate whether the best predictors of mortality differ by age, we fit models separately for three broad age groups: 20–64 (n=13,310, of whom 251 died), 65–79 (n=4,669, of whom 406 died), and 80+ (n=2,215, of whom 655 died). Because each respondent is aging over the five-year follow-up period, we assigned each respondent’s exposure to the relevant age groups (i.e., one respondent may contribute exposure to more than one broad age group). Within each broad age group, the model accounts for age (to the nearest month) as the time metric. Within each broad age group, we also fit models separately for non-Latino whites, non-Latino blacks, and Latinos; there were too few respondents of other/mixed race to model separately. We controlled for sex in all of these models. Finally, within each broad age group, we fit models separately by sex to assess whether the best predictors differ between men and women. For the rankings presented here, we added each of the remaining predictors one at a time to the baseline model.
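The allocation of one respondent's exposure to more than one broad age group can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: ages are held as whole months, follow-up is truncated at five years, and the names `split_exposure` and `Episode` are hypothetical.

```python
from typing import List, NamedTuple

# Age-group boundaries in months (upper bound exclusive; None means open-ended).
AGE_GROUPS = [("20-64", 20 * 12, 65 * 12),
              ("65-79", 65 * 12, 80 * 12),
              ("80+", 80 * 12, None)]
MAX_FOLLOWUP_MONTHS = 5 * 12  # analysis is restricted to five years post-exam


class Episode(NamedTuple):
    group: str
    entry_age: int  # age in months at the start of the episode
    exit_age: int   # age in months at the end of the episode
    died: bool      # True only for the episode that ends in death


def split_exposure(age_at_exam: int, months_observed: int, died: bool) -> List[Episode]:
    """Split one respondent's follow-up into age-group-specific episodes."""
    observed = min(months_observed, MAX_FOLLOWUP_MONTHS)
    # A death after the five-year window is treated as censored at five years.
    died_in_window = died and months_observed <= MAX_FOLLOWUP_MONTHS
    start, end = age_at_exam, age_at_exam + observed
    episodes = []
    for label, lower, upper in AGE_GROUPS:
        lo = max(start, lower)
        hi = end if upper is None else min(end, upper)
        if lo < hi:  # the respondent spent some time in this age group
            episodes.append(Episode(label, lo, hi, died_in_window and hi == end))
    return episodes


if __name__ == "__main__":
    # Hypothetical respondent: aged 63 years 6 months at exam, survives five years.
    for episode in split_exposure(age_at_exam=762, months_observed=60, died=False):
        print(episode)
```

A respondent examined at age 63.5 who survives the full follow-up contributes 18 months to the 20–64 group and 42 months to the 65–79 group, which is why the group sizes reported above sum to more than the number of respondents.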
Within each of the three broad age groups, we tested interaction terms between each predictor and age to allow for variation in the effect of the predictor across age (i.e., non-proportional hazards). If p<0.05 (two-tailed test) for the age interaction in any of the three broad age groups (see Appendix Table 3, which lists the predictors with non-proportional hazards), we included the interaction along with the main effect for that predictor for all subgroups (i.e., three age groups and within each age group, models by sex and race/ethnicity).
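One way to write such a non-proportional specification is to let the coefficient of a predictor $z$ change linearly with age $x$ within a Gompertz hazard. The linear form of the interaction and the symbols $\alpha$, $\beta$, $\gamma$, and $\delta$ are assumptions made for illustration; the text does not state the exact functional form that was fitted.

```latex
h(x \mid z) = \alpha \, e^{\beta x} \, \exp\!\left( \gamma z + \delta \, z \, x \right),
\qquad
\frac{h(x \mid z = 1)}{h(x \mid z = 0)} = \exp\!\left( \gamma + \delta x \right).
```

Here $\gamma$ is the main effect and $\delta$ is the age interaction. Setting $\delta = 0$ recovers proportional hazards, so the test described above corresponds to $H_0\colon \delta = 0$; when $\delta \neq 0$, the hazard ratio for the predictor rises or falls with age.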
Results are based on the most frequently used discrimination measure, the area under the receiver operating characteristic curve (AUC). Because all hazards models in this analysis include some controls (i.e., all baseline models include an implicit control for age, and those fit separately by age group and by race/ethnicity also control for sex), we assess the ranking of predictors by the increment in the AUC (ΔAUC): the amount by which the AUC value with the predictor included in the model exceeds the value for the baseline model. Although there is no scientific basis for evaluating the magnitude of ΔAUC, Pencina and colleagues (2008) suggest that values of at least 0.01 denote meaningful improvements. We identify this threshold of 0.01, along with the top 10 predictors, in our graphical results (Figures 1–3). We also test for significance of the ΔAUC values, although Pencina, D’Agostino, and Demler (2012) suggest that significance is a less important criterion for judging the impact of a predictive variable than the magnitude of the increment in AUC.
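The ΔAUC criterion can be illustrated with a short sketch. It computes the AUC as the share of decedent-survivor pairs in which the decedent has the higher predicted risk, treating five-year death as a binary outcome. This is a simplification that ignores censoring and is not necessarily the estimator used in the analysis. The function names and the toy risk scores are hypothetical.

```python
from itertools import product
from typing import Sequence

MEANINGFUL_GAIN = 0.01  # threshold attributed in the text to Pencina et al. (2008)


def auc(scores: Sequence[float], died: Sequence[bool]) -> float:
    """Share of decedent-survivor pairs ranked correctly (ties count one half)."""
    decedents = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not decedents or not survivors:
        raise ValueError("need at least one decedent and one survivor")
    concordant = sum(1.0 if a > b else 0.5 if a == b else 0.0
                     for a, b in product(decedents, survivors))
    return concordant / (len(decedents) * len(survivors))


def delta_auc(baseline_scores: Sequence[float],
              extended_scores: Sequence[float],
              died: Sequence[bool]) -> float:
    """Gain in AUC from adding one predictor to the baseline model."""
    return auc(extended_scores, died) - auc(baseline_scores, died)


if __name__ == "__main__":
    # Made-up predicted risks for five respondents, two of whom died.
    died = [True, True, False, False, False]
    baseline = [0.30, 0.10, 0.20, 0.05, 0.15]   # e.g., age and sex only
    extended = [0.40, 0.25, 0.20, 0.05, 0.15]   # baseline plus one predictor
    gain = delta_auc(baseline, extended, died)
    print(f"delta AUC = {gain:.3f}; meaningful: {gain >= MEANINGFUL_GAIN}")
```

Because the gain is measured against the baseline AUC, the same predictor can yield a smaller ΔAUC in a subgroup whose baseline model already discriminates well.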
FIGURE 1. Top ten predictors of five-year age-specific mortality adjusted for sex, modeled separately by broad age groups.
Abbreviations: 5+ Meds, 5+ Medications; ADL, Activities of daily living; AUC, Area under the receiver operating characteristic curve; Educ, Education; Hcy, Homocysteine; Heart dis, History of heart disease; Hosp Stays, Number of hospital stays; IADL, Instrumental activities of daily living; Mar Stat, Marital status; Mobility, Index of mobility limitations; SAH, Self-assessed health status.
NOTE: The gain in AUC is significant at the 0.01 or 0.001 level for all of the top ten predictors in each age group.
FIGURE 3. Predictors of five-year age-specific mortality adjusted for sex, modeled separately by race/ethnicity and broad age groups.
Abbreviations: 5+ Meds, 5+ Medications; ADL, Activities of daily living; AUC, Area under the receiver operating characteristic curve; BMI, Body mass index; Cancer, History of cancer; CRP, C-reactive protein; Diabetes, History of diabetes; Educ, Education; Hcy, Homocysteine; Heart dis, History of heart disease; Hosp Stays, Number of hospital stays; IADL, Instrumental activities of daily living; Mar Stat, Marital status; Mobility, Index of mobility limitations; SAH, Self-assessed health status; SCr, Serum creatinine; Stroke, History of stroke; TC, Total cholesterol.
NOTE: For non-Latino whites, the gain in AUC is significant (p<0.05) for all of the top ten predictors in each age group. For non-Latino blacks, the gain in AUC is significant except for hospitalizations, use of 5+ medications, and albumin at ages 20–64; cancer, BMI, and heart rate at ages 65–79; and at ages 80+, only the top four predictors are significant. For Latinos, the gain in AUC is significant except for albumin, heart rate, smoking, and diabetes at ages 20–64; stroke, hospitalizations, and albumin at ages 65–79; and at ages 80+, only the top four predictors are significant.
Results
Baseline models
Table 1 presents the AUC values for the baseline models, stratified by age group, sex, and race/ethnicity. Sizable increments in the AUC are more difficult to obtain when baseline models already have good discrimination (i.e., moderate to high AUC values). Because the baseline AUC values vary across subgroups (e.g., discrimination is lowest in the middle age group), we focus on predictors that reveal substantial differences in ΔAUC or rank across demographic groups.
TABLE 1. AUC for baseline model by sex, race/ethnicity, and age group

| | Age 20–64 | Age 65–79 | Age 80+ |
|---|---|---|---|
| Total | 0.71 | 0.65 | 0.71 |
| By sex | | | |
| Men | 0.67 | 0.62 | 0.71 |
| Women | 0.73 | 0.61 | 0.69 |
| By race/ethnicity | | | |
| Non-Latino Whites | 0.71 | 0.63 | 0.72 |
| Non-Latino Blacks | 0.75 | 0.65 | 0.72 |
| Latinos | 0.68 | 0.68 | 0.64 |

NOTE: All baseline models account for age (in months) as the underlying time metric. Models by broad age groups and by race/ethnicity also adjust for sex. Data from NHANES 1999–2006 waves with mortality follow-up through December 31, 2011.
Prediction by age
Results are shown in Figure 1 for the three broad age groups. The ΔAUC values for all of the top 10 predictors in each age group are statistically significant. The most highly ranked variables are generally consistent across age, most notably measures of functional limitations and disability (IADL, ADL, and mobility limitations) and SAH. Despite the well-documented inadequacies of SAH, it is the single strongest predictor of mortality below age 80 and ranks fourth for the oldest ages. Two other self-reported measures that reflect chronic illness (number of hospital stays and taking at least five medications) and one biomarker (albumin) are among the 10 leading measures in all age groups. As hypothesized, social variables (education, income, and marital status) are weaker predictors at the oldest ages: none appears among the top 10 predictors for ages 80 and older, in contrast to the relative importance of income and education for the youngest age group and marital status for the middle age group. With regard to behavioral variables, exercise frequency appears among the top 10 variables at ages 65 and older, as does smoking below age 80.
Predictions by sex
The ΔAUC values for all of the top 10 predictors for men and women in each age group are significant. Many of these predictors are the same for men and women (Figure 2): IADL, ADL, and mobility limitations and SAH are among the top 10 predictors for both men and women in all age groups, with the exception of ADL limitations for the youngest men. SAH is the most powerful variable for men and women under age 65. Despite these similarities, the disability measures—IADL and ADL limitations—are consistently weaker predictors for men (assessed in terms of both rank and ΔAUC). Other findings are generally consistent between men and women: albumin is a leading predictor for all groups except the youngest women, use of at least five medications ranks among the top 10 except for women aged 65–79, and smoking is a leading predictor for both sexes in all but the oldest age group. However, some systematic sex differences are apparent: marital status is a stronger indicator of survival for men than for women; a diagnosis of stroke is a stronger predictor for women in all but the youngest age group; and, for ages 80 and older, both homocysteine and serum creatinine are stronger predictors for women whereas heart rate is a stronger predictor for men.
FIGURE 2. Top ten predictors of five-year age-specific mortality, modeled separately by sex and broad age groups.
Abbreviations: 5+ Meds, 5+ Medications; ADL, Activities of daily living; AUC, Area under the receiver operating characteristic curve; Educ, Education; Hcy, Homocysteine; Heart dis, History of heart disease; Hosp Stays, Number of hospital stays; IADL, Instrumental activities of daily living; Mar Stat, Marital status; Mobility, Index of mobility limitations; SAH, Self-assessed health status; SCr, Serum creatinine; Stroke, History of stroke.
NOTE: The gain in AUC is significant (p<0.05) for all of the top ten predictors in each subgroup.
Predictions by race/ethnicity
All top 10 predictors significantly increase the AUC for whites in each age group, whereas this is the case for about two-thirds of predictors for blacks and Latinos (Figure 3; see the note to the figure for details). Differences by race and ethnicity are more apparent than those by sex, although some of this may be attributable to the smaller sample sizes for blacks and Latinos. Nevertheless, SAH and the measures of functioning are among the top 10 measures in all three race/ethnic groups, with the exception of blacks aged 20–64, where ADL and IADL limitations have a low ranking and low discriminatory power. Once again, SAH emerges as the single strongest measure for groups younger than 80 (the only exception is Latinos aged 65–79, where it ranks second). In the oldest age group, SAH drops to a rank of four for whites and blacks and a rank of nine for Latinos. Below age 80, the discriminatory power of these health measures is generally higher for whites than for blacks or Latinos.
Only one disease diagnosis appears for the youngest age group: diabetes for Latinos aged 20–64 (although the ΔAUC is not significant), a major source of morbidity and mortality in the Latino population (Goldman 2016). The importance of disease diagnoses varies considerably across ethnic groups. For example, at ages 65–79, heart disease is strongest for whites, cancer for blacks, and stroke for Latinos. The number of hospitalizations ranks particularly high among blacks below age 80, as does taking at least five medications among both whites and blacks of all ages. Smoking is a leading predictor for all ethnic groups at the youngest ages (although not significant for Latinos), but less consistently so at older ages. Education is a leading and significant predictor for all ethnic groups aged 20–64, but it appears to be more important for blacks and Latinos than for whites in the oldest age group.
Albumin is one of the leading 10 predictors in all ethnic and age groups except older blacks (aged 65+), and it ranks either first or second for the youngest and oldest Latinos (ΔAUC is significant for albumin only for the oldest Latinos and for whites in all age groups). Homocysteine is the only other biomarker that is a strong predictor for all three ethnic groups (in this case for the age group 80+, although it is significant only for whites). It is perhaps surprising that conventional clinical markers—diastolic and systolic blood pressure, the three cholesterol measurements (total cholesterol (TC), HDL cholesterol, and TC/HDL ratio), body mass index (BMI), and waist circumference—are among the weakest predictors across all demographic groups. For the analysis by age, these seven markers are the weakest discriminators among the 30 predictors analyzed here, with only one exception (TC is ranked 17th at ages 80+; results not shown). Only for the analyses by ethnic group do any of these biomarkers make it into the top 10, and only two markers do so: TC is the 8th ranked predictor among blacks aged 20–64 and BMI is the 9th ranked marker among blacks aged 65–79 (BMI is not significant).
Discussion
Our findings are consistent with a recent study that identified self-reported measures of mobility and disability along with overall self-assessed health as the top predictors in the US (also based on NHANES) and in three additional countries: England, Costa Rica, and Taiwan (Goldman, Glei, and Weinstein 2016). Here we demonstrate that these variables are among the top predictors across demographic groups in the US, defined by age, sex, and race/ethnicity. This is not altogether surprising since direct measures of health are more proximate to survival than are social, economic, and behavioral variables. What is more unexpected is that, despite previous research identifying response biases for the SAH question, SAH is often the strongest predictor among all of the variables considered here. Still, alongside these similarities, there is substantial heterogeneity among racial/ethnic groups, more so than by sex or age, particularly with regard to the importance of disease diagnoses and biomarkers for five-year survival.
Our findings provide partial support for the hypotheses described earlier. The predictive strength of social factors is generally attenuated with increasing age: the discriminatory power of income and education declines with age and that of marital status is much weaker for ages 80+ than for the younger age groups. Although these patterns provide support for the age-as-leveler hypothesis, we cannot determine whether the declining strength of social factors is driven largely by selective survival of those with relatively high SES or whether physiology dominates social status at later stages of life. With regard to the declining prognostic strength of smoking with age, the answer is likely to lie with selective mortality.
Consistent with the previous literature that identifies larger effect sizes for social factors among men, we find that marital status is a stronger discriminator of survival for men than for women. This finding has been partly attributed to social support provided by wives and the social control they exert over husbands, both of which improve men’s lifestyles and reduce their risk-taking behaviors (Umberson 1992). Although income is a substantially stronger predictor for men than for women at ages 65–79, the increments in AUC for income and education are roughly similar for men and women in the youngest age group.
We find partial support for our third hypothesis, in which we speculate that self-evaluations of health and functioning are strongest among whites. The discriminatory power of some of these measures is generally higher for whites than for blacks or Latinos, at least below age 80, and other measures of chronic illness are better predictors for whites and blacks than for Latinos. The lower predictive strength of disability measures for blacks, particularly younger blacks, suggests that although disability rates for blacks exceed those for whites, physical limitations may be more strongly associated with non-fatal conditions among blacks. Moreover, Whitson et al. (2011) find that obesity and diabetes account for more than 30 percent of the black/white disparity in disability at older ages. A similar explanation may hold for Latinos, a group characterized by especially high rates of obesity and diabetes (Goldman 2016). An alternative possibility is that, because younger blacks and Latinos are more likely than whites to die from external causes, a higher proportion of their deaths may not be preceded by disabling conditions. Still, SAH and measures of function are the highest-ranking variables for all racial groups.
Disability measures are also weaker predictors for men than for women, but the reasons for this disparity are less likely to stem from differentials in underlying diseases. Women’s higher rates of chronic conditions in general and autoimmune diseases such as rheumatoid arthritis in particular, combined with a likely greater severity of smoking-associated diseases among men, suggest alternative explanations (Gleicher and Barad 2007; Case and Paxson 2005). One possible set of mechanisms comprises sociocultural or social-psychological factors such as a greater reluctance of men to acknowledge or report physical limitations (Kandrack, Grant, and Segall 1991; Murtagh and Hubert 2004).
The biomarker with the best discrimination is serum albumin, which ranks among the top 10 predictors in most subgroups analyzed here. In a recent study that examined the predictive power of a large number of variables in Taiwan, albumin also emerged as the biomarker that best discriminated between survivors and decedents (Goldman, Glei, and Weinstein 2016). Albumin, which is synthesized in the liver and is one of the dominant proteins in plasma, has many functions including maintenance of colloid osmotic pressure in the blood. Numerous studies point to links between serum albumin and a wide range of diseases (for example, coronary disease, stroke, cancer, liver disease, and kidney disease), nutritional status, and survival. These studies also identify an association between albumin and mortality among healthy individuals, in the presence of controls for a wide range of confounders (Klonoff-Cohen, Barrett-Connor, and Edelstein 1992; Goldwasser and Feldman 1997; Levitt and Levitt 2016). Although hypothesized mechanisms are consistent with a causal relationship between low albumin and increased risk of mortality (Djousse et al. 2002; Goldwasser and Feldman 1997), some researchers are dubious about causal links, in part because administration of albumin (even within randomized trials) generally does not improve survival of seriously ill patients and may impose additional risks (Levitt and Levitt 2016; Mendez, McClain, and Marsano 2005). In contrast to most previous studies examining links between albumin and survival that were restricted to older adults, our study underscores the potential importance of albumin levels as a prognostic indicator even in relatively young populations, although it is more likely to be a marker of morbidity and survival risk than a causal factor modifiable by clinical intervention (Levitt and Levitt 2016).
The predictive power of albumin stands in contrast to the weak discrimination provided by standard clinical measures of hypertension, cholesterol levels, and obesity. Although many previous studies have identified small effects of these clinical markers on survival in older adults (e.g., Calle et al. 1999; Lan et al. 2007; Norrish et al. 1995; Pastor-Barriuso et al. 2003), our study suggests that these standard markers are weaker predictors than several other biomarkers (serum creatinine, homocysteine, and C-reactive protein) in addition to albumin. As a key marker of kidney dysfunction, serum creatinine is likely to be a good predictor of mortality because it reflects serious underlying illness. Homocysteine is associated with cardiovascular disease and several other conditions that are related to increased mortality among elderly persons (Peng et al. 2015); some evidence suggests that it is more strongly associated than lipids with myocardial infarction, which often results in sudden death, whereas lipids are a better indicator of the earlier stages of coronary artery disease (Nygård et al. 1997). The importance of inflammatory markers for survival has been established in a large number of studies (e.g., Emerging Risk Factors Collaboration et al. 2010; Newman et al. 2009), as has the superior prognostic ability of these markers compared with standard metabolic and cardiovascular measures (Glei et al. 2014; Goldman, Glei, and Weinstein 2016). As with albumin, however, these biomarkers may not have causal links with survival, nor are they generally clinically modifiable.
The predictive ability of most biomarkers in this analysis is smaller than that of measures of health, functional limitations, and disability. This finding is not surprising, given that most of our measures of health and function integrate information about the respondent’s well-being over different sources of morbidity and different time periods. Functional ability, in particular, is based on a battery of questions that reflect a lifetime’s accumulation of biological processes and thus provides a more stable, comprehensive variable than a one-time measurement of a single biomarker.
For the purposes of mortality prediction, including clinical prognosis and social science analysis of health and longevity, this study would need to be expanded in several ways. Existing prognostic instruments frequently fail to include some of the measures that we have shown to be strong discriminators between survivors and decedents. For example, self-reported variables often provide more accurate predictions than the types of biological markers included in clinical prognosis. In light of substantial correlation among the powerful predictors identified in this study, researchers will need to employ model-building strategies to develop concise prediction instruments and evaluate the instruments with external data. In addition, these instruments may need to be modified according to the timeframe of interest. For example, the strongest predictors for longer-term survival are likely to differ from those identified here for a five-year period. Moreover, as we have shown, although most of the strongest predictors perform well across groups, the instruments will likely require some adaptation on the basis of an individual’s demographic characteristics.
Acknowledgments
This work was supported by the Graduate School of Arts and Sciences, Georgetown University, and the Eunice Kennedy Shriver National Institute of Child Health and Human Development under Award Number P2CHD047879 to Princeton University.
Footnotes
The question about national origin or ancestry in NHANES included the following response categories: “Mexican-American/Mexican”; “Other Hispanic or Latino”; “Both Mexican and Other Hispanic”; and “Not Hispanic.” NHANES recoded all those who self-identified in the second or third category as “Other Hispanic.” We combined that group with “Mexican-American/Mexican” to create the category “Latino.”
Appendix is available at the supporting information tab at wileyonlinelibrary.com/journal/pdr.
References
- Assari Shervin, Lankarani Maryam M, Burgard Sarah. Black–white difference in long-term predictive power of self-rated health on all-cause mortality in United States. Annals of Epidemiology. 2016;26(2):106–114. doi: 10.1016/j.annepidem.2015.11.006.
- Benyamini Yael, Blumstein Tzvia, Lusky Ayala, Modan Baruch. Gender differences in the self-rated health–mortality association: Is it poor self-rated health that predicts mortality or excellent self-rated health that predicts survival? The Gerontologist. 2003;43(3):396–405. doi: 10.1093/geront/43.3.396.
- Bzostek Sharon, Goldman Noreen, Pebley Anne. Why do Hispanics in the USA report poor health? Social Science & Medicine. 2007;65(5):990–1003. doi: 10.1016/j.socscimed.2007.04.028.
- Calle EE, Thun MJ, Petrelli JM, Rodriguez C, Heath CW Jr. Body-mass index and mortality in a prospective cohort of U.S. adults. The New England Journal of Medicine. 1999;341(15):1097–1105. doi: 10.1056/NEJM199910073411501.
- Case Anne, Paxson Christina. Sex differences in morbidity and mortality. Demography. 2005;42(2):189–214. doi: 10.1353/dem.2005.0011.
- Crimmins Eileen, Kim Jung K, Vasunilashorn Sarinnapha. Biodemography: New approaches to understanding trends and differences in population health and mortality. Demography. 2010;47(Suppl):S41–S64. doi: 10.1353/dem.2010.0005.
- DiPrete Thomas A, Eirich Gregory M. Cumulative advantage as a mechanism for inequality: A review of theoretical and empirical developments. Annual Review of Sociology. 2006;32:271–297.
- Djousse L, Rothman KJ, Cupples LA, Levy D, Ellison RC. Serum albumin and risk of myocardial infarction and all-cause mortality in the Framingham Offspring Study. Circulation. 2002;106(23):2919–2924. doi: 10.1161/01.cir.0000042673.07632.76.
- Dupre Matthew E. Educational differences in age-related patterns of disease: Reconsidering the cumulative disadvantage and age-as-leveler hypotheses. Journal of Health and Social Behavior. 2007;48(1):1–15. doi: 10.1177/002214650704800101.
- Elo Irma T. Social class differentials in health and mortality: Patterns and explanations in comparative perspective. Annual Review of Sociology. 2009;35:553–572.
- Emerging Risk Factors Collaboration, Kaptoge S, Di Angelantonio E, Lowe G, Pepys MB, et al. C-reactive protein concentration and risk of coronary heart disease, stroke, and mortality: An individual participant meta-analysis. Lancet. 2010;375(9709):132–140. doi: 10.1016/S0140-6736(09)61717-7.
- Finch Brian K, Hummer Robert A, Reindl Maureen, Vega William A. Validity of self-rated health among Latino(a)s. American Journal of Epidemiology. 2002;155(8):755–759. doi: 10.1093/aje/155.8.755.
- Franks Peter, Gold Marthe R, Fiscella Kevin. Sociodemographics, self-rated health, and mortality in the US. Social Science & Medicine. 2003;56(12):2505–2514. doi: 10.1016/s0277-9536(02)00281-2.
- Glei Dana A, Goldman Noreen, Rodríguez Germán, Weinstein Maxine. Beyond self-reports: Changes in biomarkers as predictors of mortality. Population and Development Review. 2014;40(2):331–360. doi: 10.1111/j.1728-4457.2014.00676.x.
- Gleicher N, Barad DH. Gender as risk factor for autoimmune diseases. Journal of Autoimmunity. 2007;28(1):1–6. doi: 10.1016/j.jaut.2006.12.004.
- Goldman Noreen. Will the Latino mortality advantage endure? Research on Aging. 2016;38(3):263–282. doi: 10.1177/0164027515620242.
- Goldman N, Glei DA, Weinstein M. What matters most for predicting survival? A multinational population-based cohort study. PLOS ONE. 2016;11(7):e0159273. doi: 10.1371/journal.pone.0159273.
- Goldman N, Korenman S, Weinstein R. Marital status and health among the elderly. Social Science & Medicine. 1995;40(12):1717–1730. doi: 10.1016/0277-9536(94)00281-w.
- Goldwasser P, Feldman J. Association of serum albumin and mortality risk. Journal of Clinical Epidemiology. 1997;50(6):693–703. doi: 10.1016/s0895-4356(97)00015-2.
- House JS, et al. The social stratification of aging and health. Journal of Health and Social Behavior. 1994;35(3):213–234.
- Hu Yuanreng, Goldman Noreen. Mortality differentials by marital status: An international comparison. Demography. 1990;27(2):233–250.
- Idler Ellen L, Benyamini Yael. Self-rated health and mortality: A review of twenty-seven community studies. Journal of Health and Social Behavior. 1997;38(1):21–37.
- Jylhä Marja. What is self-rated health and why does it predict mortality? Towards a unified conceptual model. Social Science & Medicine. 2009;69(3):307–316. doi: 10.1016/j.socscimed.2009.05.013.
- Kandrack MA, Grant KR, Segall A. Gender differences in health related behaviour: Some unanswered questions. Social Science & Medicine. 1991;32(5):579–590. doi: 10.1016/0277-9536(91)90293-l.
- Klonoff-Cohen Hillary, Barrett-Connor Elizabeth L, Edelstein Sharon L. Albumin levels as a predictor of mortality in the healthy elderly. Journal of Clinical Epidemiology. 1992;45(3):207–212. doi: 10.1016/0895-4356(92)90080-7.
- Lan Tzuo-Yun, et al. Clinical and laboratory predictors of all-cause mortality in older population. Archives of Gerontology and Geriatrics. 2007;45(3):327–334. doi: 10.1016/j.archger.2007.02.001.
- Lauderdale Diane S. Education and survival: Birth cohort, period, and age effects. Demography. 2001;38(4):551–561. doi: 10.1353/dem.2001.0035.
- Levitt DG, Levitt MD. Human serum albumin homeostasis: A new look at the roles of synthesis, catabolism, renal and gastrointestinal excretion, and the clinical value of serum albumin measurements. International Journal of General Medicine. 2016;9:229–255. doi: 10.2147/IJGM.S102819.
- Link Bruce G, Phelan Jo. Social conditions as fundamental causes of disease. Journal of Health and Social Behavior. 1995;Spec No:80–94.
- Lynch Scott M. Race, socioeconomic status, and health in life-course perspective: Introduction to the Special Issue. Research on Aging. 2008;30(2):127–136.
- Markides Kyriakos S, Black Sandra A. Race, ethnicity, and aging: The impact of inequality. In: Binstock Robert H, George Linda K, editors. Handbook of Aging and the Social Sciences. San Diego: Academic Press; 1996. pp. 153–170.
- Mendez CM, McClain CJ, Marsano LS. Albumin therapy in clinical practice. Nutrition in Clinical Practice: Official Publication of the American Society for Parenteral and Enteral Nutrition. 2005;20(3):314–320. doi: 10.1177/0115426505020003314.
- Murtagh KN, Hubert HB. Gender differences in physical disability among an elderly cohort. American Journal of Public Health. 2004;94(8):1406–1411. doi: 10.2105/ajph.94.8.1406.
- Newman Anne B, et al. Total and cause-specific mortality in the cardiovascular health study. The Journals of Gerontology: Series A, Biological Sciences and Medical Sciences. 2009;64(12):1251–1261. doi: 10.1093/gerona/glp127.
- Newson Rachel S, et al. Predicting survival and morbidity-free survival to very old age. Age (Dordrecht, Netherlands). 2010;32(4):521–534. doi: 10.1007/s11357-010-9154-8.
- Norrish A, North D, Yee RL, Jackson R. Do cardiovascular disease risk factors predict all-cause mortality? International Journal of Epidemiology. 1995;24(5):908–914. doi: 10.1093/ije/24.5.908.
- Nygård Ottar, et al. Plasma homocysteine levels and mortality in patients with coronary artery disease. New England Journal of Medicine. 1997;337:230–237. doi: 10.1056/NEJM199707243370403.
- Pastor-Barriuso R, Banegas JR, Damian J, Appel LJ, Guallar E. Systolic blood pressure, diastolic blood pressure, and pulse pressure: An evaluation of their joint effect on mortality. Annals of Internal Medicine. 2003;139(9):731–739. doi: 10.7326/0003-4819-139-9-200311040-00007.
- Pencina Michael J, D’Agostino Ralph B. Overall C as a measure of discrimination in survival analysis: Model specific population value and confidence interval estimation. Statistics in Medicine. 2004;23(13):2109–2123. doi: 10.1002/sim.1802.
- Pencina Michael J, D’Agostino Ralph B Sr, D’Agostino Ralph B Jr, Vasan Ramachandran S. Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Statistics in Medicine. 2008;27(2):157–172; discussion 207–212. doi: 10.1002/sim.2929.
- Pencina MJ, D’Agostino RB Sr, Demler OV. Novel metrics for evaluating improvement in discrimination: Net reclassification and integrated discrimination improvement for normal variables and nested models. Statistics in Medicine. 2012;31(2):101–113. doi: 10.1002/sim.4348.
- Peng HY, Man CF, Xu J, Fan Y. Elevated homocysteine levels and risk of cardiovascular and all-cause mortality: A meta-analysis of prospective studies. Journal of Zhejiang University Science B. 2015;16(1):78–86. doi: 10.1631/jzus.B1400183.
- Pepe MS, Janes H, Longton G, Leisenring W, Newcomb P. Limitations of the odds ratio in gauging the performance of a diagnostic, prognostic, or screening marker. American Journal of Epidemiology. 2004;159(9):882–890. doi: 10.1093/aje/kwh101.
- Rosero-Bixby Luis, Dow William H. Predicting mortality with biomarkers: A population-based prospective cohort study for elderly Costa Ricans. Population Health Metrics. 2012;10(1):11. doi: 10.1186/1478-7954-10-11.
- Rubin Donald B. Multiple imputation after 18+ years (with discussion). Journal of the American Statistical Association. 1996;91:473–489.
- Schafer Joseph L. Multiple imputation: A primer. Statistical Methods in Medical Research. 1999;8(1):3–15. doi: 10.1177/096228029900800102.
- Seeman Teresa E. Social ties and health: The benefits of social integration. Annals of Epidemiology. 1996;6(5):442–451. doi: 10.1016/s1047-2797(96)00095-6.
- Seeman Teresa E, McEwen Bruce S, Rowe John W, Singer Burton H. Allostatic load as a marker of cumulative biological risk: MacArthur Studies of Successful Aging. Proceedings of the National Academy of Sciences of the United States of America. 2001;98(8):4770–4775. doi: 10.1073/pnas.081072698.
- Umberson D. Gender, marital status and the social control of health behavior. Social Science & Medicine. 1992;34(8):907–917. doi: 10.1016/0277-9536(92)90259-s.
- Whitson HE, et al. Black–white disparity in disability: The role of medical conditions. Journal of the American Geriatrics Society. 2011;59(5):844–850. doi: 10.1111/j.1532-5415.2011.03401.x.
- Woo Hyeyoung, Zajacova Anna. Predictive strength of self-rated health for mortality risk among older adults in the United States: Does it differ by race and ethnicity? Research on Aging. 2016. doi: 10.1177/0164027516637410. [Epub ahead of print]
- Yourman Lindsey C, Lee Sei J, Schonberg Mara A, Widera Eric W, Smith Alexander K. Prognostic indices for older adults: A systematic review. JAMA: The Journal of the American Medical Association. 2012;307(2):182–192. doi: 10.1001/jama.2011.1966.
- Zajacova Anna, Woo Hyeyoung. Examination of age variations in the predictive validity of self-rated health. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences. 2016;71(3):551–557. doi: 10.1093/geronb/gbv050.