International Journal of Epidemiology. 2017 Apr 11;46(4):1285–1294. doi: 10.1093/ije/dyx041

Mortality selection in a genetic sample and implications for association studies

Benjamin W Domingue 1,*, Daniel W Belsky 2, Amal Harrati 3, Dalton Conley 4, David R Weir 5, Jason D Boardman 6
PMCID: PMC5837559  PMID: 28402496

Abstract

Background

Mortality selection occurs when a non-random subset of a population of interest has died before data collection and is unobserved in the data. Mortality selection is of general concern in the social and health sciences, but has received little attention in genetic epidemiology. We tested the hypothesis that mortality selection may bias genetic association estimates, using data from the US-based Health and Retirement Study (HRS).

Methods

We tested mortality selection into the HRS genetic database by comparing HRS respondents who survive until genetic data collection in 2006 with those who do not. We next modelled mortality selection on demographic, health and social characteristics to calculate mortality selection probability weights. We analysed polygenic score associations with several traits before and after applying inverse-probability weighting to account for mortality selection. We tested simple associations and time-varying genetic associations (i.e. gene-by-cohort interactions).

Results

We observed mortality selection into the HRS genetic database on demographic, health and social characteristics. Correction for mortality selection using inverse probability weighting methods did not change simple association estimates. However, using these methods did change estimates of gene-by-cohort interaction effects. Correction for mortality selection changed gene-by-cohort interaction estimates in the opposite direction from increased mortality selection based on analysis of HRS respondents surviving through 2012.

Conclusions

Mortality selection may bias estimates of gene-by-cohort interaction effects. Analyses of HRS data can adjust for mortality selection associated with observables by including probability weights. Mortality selection is a potential confounder of genetic association studies, but the magnitude of confounding varies by trait.

Keywords: Mortality, genetic epidemiology, genotype


Key Messages

  • Mortality selection may confound studies of genetic associations.

  • In data that observe research participants in the years before genotyping, such mortality selection can be modelled and used to correct naïve estimates.

  • Using this approach, we observe little evidence that mortality selection biases simple genetic associations between polygenic scores and their target phenotypes.

  • However, mortality selection does appear to bias estimates of time-varying genetic associations (gene-by-cohort interactions) for traits linked to excess mortality. The direction of bias is toward the null.

Introduction

Among individuals in a birth cohort, those surviving to some threshold age are not a random sample. Traits, behaviours and environments influence survival. As cohorts age, members with disadvantageous characteristics (e.g. smoking) are more likely to die.1 Such ‘mortality selection’ can bias aetiological studies of conditions that cause mortality.2,3 Mortality selection has been raised as a concern in genetic epidemiology4–6 but not systematically evaluated.

Mortality selection is of interest in light of recent ‘gene-by-cohort’ interaction studies that test variation in genetic effects across birth cohorts.7–13 These studies often interpret variation in genetic effects across birth cohorts as reflecting environmental changes occurring across historical periods. However, because DNA has been collected only recently, sampled members of the earliest-born birth cohorts are relatively long-lived. Thus mortality selection into the sample is increasingly pronounced from later to earlier-born birth cohorts. If long-lived individuals differ from the general population in ways that affect genotype-phenotype associations, mortality selection will complicate gene-by-cohort interaction estimates.

We investigated mortality-selection-related bias to estimates of genetic associations in a large sample of US adults followed longitudinally during later life: the Health and Retirement Study (HRS). HRS began collecting data in 1992, and first collected DNA for genotyping in 2006. Thus, we were able to examine characteristics of a sample 14 years before genotyping. We used differences between those who did and did not survive until 2006 to evaluate mortality selection into the genetic sample. We next modelled mortality before DNA collection as a function of demographic, health and social characteristics to estimate probability weights. Finally, we used these probability weights to estimate effects of mortality selection on polygenic score associations with height, body mass index (BMI), educational attainment and smoking behaviour. We considered ‘simple’ polygenic score associations with phenotypes and polygenic score associations with phenotypes that vary across environments, in this case broad environments indexed by birth cohort (i.e. gene-by-cohort interactions). Findings highlight challenges facing investigations of genetic associations in data on older adults, and suggest some solutions.

Mortality selection and missing data

Mortality selection is one cause of missing data.14 When data are missing completely at random (MCAR), i.e. missingness is not associated with parameters of interest, then missingness can be ignored. However, when causes of mortality are of interest, as in much epidemiological research, mortality-selection-related missingness is rarely MCAR and failure to account for mortality selection may cause bias. If the data are missing conditional on observed variables (i.e. missing at random, MAR) then bias can be reduced by appropriate techniques (e.g. inverse probability weighting).

Mortality selection and gene-by-birth-cohort interactions

Genetic effects may vary across environments.15,16 Recent studies have examined variation in genetic effects on complex phenotypes across historical contexts indexed by birth cohort.7–13 Because most genetic data were collected recently, these studies often observe populations from earlier-born birth cohorts at much older ages, which could introduce bias into estimates of gene-by-cohort interaction if individuals sampled at older ages represent a different population from those sampled at younger ages. Suppose that the genetic effect (G) on some outcome (Y) is a function of birth cohort (B) as indexed by year of birth:

Y = β(B)·G+ɛ. (Eqn 1)

Estimates of the effect, β(B0)^T, for a particular birth cohort (B0) may depend upon the specific time (T) when data are collected. In particular, only those who die (D denoting year of death) after the observation window (D > T) are included (data are missing for D < T). If missingness is associated with Y, then β(B0)^T may be biased. Bias will depend on G’s relationship with Y. If effects of G on Y occur after age T−B, any mortality selection (i.e. D < T) is independent of G and Y (i.e. it is MCAR) and missingness will not bias estimates of β(B). If effects of G on Y occur before age T−B, then estimates may contain bias. For example, if smoking causes premature death, then estimates of genetic associations with smoking using DNA collected after onset of smoking-related mortality are potentially biased. However, it is possible to reduce bias by modelling mortality-related missingness as a function of observables. One such approach is inverse probability weighting (IPW).17

We highlight three additional issues. First, under reasonable assumptions, the estimate β(B)^T will be an unbiased estimate of β(B|D > T). Such estimates may be of interest.18 Second, most recent research on birth cohort effects has assumed that β(B) is a linear function of B.8–10 We do the same, assuming:

β(B) = c+dB. (Eqn 2)

Third, estimates of cohort-invariant genetic effects via the model:

Y = β*·G+ɛ (Eqn 3)

may be of interest. If β(B) is nearly constant over a large number of observed birth cohorts, β* is likely to be a reasonable summary of the genetic effect. If not, there is a possibility of specification error that may be difficult to disentangle from bias induced by mortality selection.
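To make the link between Eqns 1 to 3 explicit, the following short derivation substitutes Eqn 2 into Eqn 1. It uses only the symbols already defined above (Y, G, B, c, d, β*, ɛ) and is offered as a sketch of how the pieces fit together, not as the exact estimating equation used in the analysis.

```latex
\begin{align*}
Y &= \beta(B)\cdot G + \varepsilon                 && \text{(Eqn 1)} \\
  &= (c + dB)\cdot G + \varepsilon                 && \text{(substituting Eqn 2)} \\
  &= c\,G + d\,(B\cdot G) + \varepsilon .
\end{align*}
% d is the coefficient on the product term B*G (the gene-by-cohort interaction).
% Eqn 3 is the special case d = 0, in which beta^* = c.
```

Under this linear form, d is the coefficient on the product of birth cohort and genotype, and the cohort-invariant model of Eqn 3 corresponds to d = 0.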

Polygenic scores and phenotypes

We investigated mortality bias to genetic associations using polygenic scores.19,20 Polygenic scores (PGSs) summarize genetic contributions to phenotypic variation across the entire genome. Mortality-selection-related bias is also relevant to study of individual genetic loci. We focused on polygenic scores because they are in wide use in epidemiology21 and have been the focus of most recent gene-by-cohort interaction research. We investigated polygenic scores for smoking initiation, educational attainment and BMI because each has been investigated in the context of gene-by-cohort interactions.8,9,13 We also investigated the polygenic score for height because it is, to date, the most predictive score available.

Smoking and educational attainment are strongly related to mortality.1,22,23 We expect mortality-related missingness to be associated with genetics pertinent to these phenotypes and, as a consequence, to observe mortality-selection-related bias. Higher mortality among smokers should attenuate polygenic score associations with smoking behaviour. Moreover, this attenuation bias should increase with the magnitude of genetic influence on smoking behaviour. Twin studies suggest that the magnitude of genetic influence on smoking behaviour has increased across the 20th century.24 Therefore, we expect modest attenuation bias to polygenic score associations with ever-smoker status and more substantial attenuation bias to polygenic-score-by-cohort interactions (i.e. d > d̂ in estimates of Eqn 2). The same pattern should hold for educational attainment, although there it is the least-educated whom we expect to suffer disproportionate mortality-related attrition.

The relationship between the remaining phenotypes, BMI and height, and mortality is less clear. There are potential confounders of observed relationships between these two anthropometric traits and mortality,25 suggestions that the form of the relationship is nonlinear26 and evidence of heterogeneity in associations.27,28 We therefore do not hold strong prior hypotheses about mortality-selection bias for polygenic score associations with these phenotypes.

Data and Methods

Data

The Health and Retirement Study is a biennial longitudinal study of persons aged 50+, initiated in 1992, examining socioeconomic status and health. Using the RAND Version N data,29 we consider 30 196 respondents in the HRS who provided any data to HRS between 1992 and 2005, i.e. before the beginning of DNA collection for genotyping in 2006. We conceptualize genetic data collection as a two-step selection process (see inset of Figure 1). Respondents first must have survived until genotyping (2006–08) and second, must have consented to participate. Figure 1 illustrates this selection process graphically.

Figure 1.


Description of two-step selection process into sample of genotyped respondents (inset). Main figure shows Kaplan-Meier survival curves for those who died before 2006 (type 1), those who survived through 2006 but were not genotyped (type 2) and those who were genotyped (type 3).
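For readers unfamiliar with the survival curves shown in the figure, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. The function name `kaplan_meier` and the toy follow-up times are illustrative assumptions; they are not HRS data and this is not the authors' code.

```python
import numpy as np

def kaplan_meier(time, died):
    """Product-limit survival estimate for right-censored data.

    time: follow-up time for each person (death or censoring time).
    died: 1/True if the person died at `time`, 0/False if censored.
    Returns the distinct death times and the survival probability
    just after each of them.
    """
    time = np.asarray(time, dtype=float)
    died = np.asarray(died, dtype=bool)
    event_times = np.unique(time[died])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)              # still under observation at t
        deaths = np.sum((time == t) & died)      # deaths occurring exactly at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Hypothetical toy data: 7 people, times in years since first interview
toy_time = [2, 3, 3, 5, 8, 8, 10]
toy_died = [1, 1, 0, 1, 0, 1, 0]
km_times, km_surv = kaplan_meier(toy_time, toy_died)
```

Censored respondents contribute to the risk set until their censoring time but never count as deaths, which is why the curve only steps down at observed death times.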

Phenotypic data

We used measurements of health (diabetes status, heart disease status, depressive symptoms and self-rated health), ever-smoker status, educational attainment, height and BMI collected by HRS before 2006 (when DNA was first collected). Additional detail on phenotypic measures is provided in Supplementary data, available at IJE online.

Genetic data

Genotyping was conducted from DNA samples collected at assessment waves during 2006–08. We obtained data from the NIH Database of Genotypes and Phenotypes (dbGaP). Quality control procedures followed previous work8,10,11 and are described in Supplementary data, available at IJE online. The analysis genotype dataset consisted of 1.7 M genotyped single-nucleotide polymorphisms (SNPs) that passed quality control.

Polygenic scores

Polygenic scores were computed from the genome-wide SNP database using published results30–33 from the original genome-wide association studies (GWAS) using the most recent PGS techniques.34 (Because the GWAS of educational attainment included HRS, we obtained a revised set of results based on meta-analysis excluding HRS.) Briefly, we coded SNP genotypes as counts of alleles weighted according to the associations reported in the published GWAS. We summed weighted allele counts across all SNPs and then standardized to compute polygenic scores.
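The scoring step described above (weighted allele counts summed across SNPs, then standardized) can be sketched as follows. The function name `polygenic_score`, the genotype matrix and the weights are hypothetical toy values chosen for illustration; real scores would use the published GWAS weights and the quality-controlled SNP set.

```python
import numpy as np

def polygenic_score(allele_counts, gwas_weights):
    """Sum of GWAS-weighted allele counts, standardized to mean 0 and SD 1.

    allele_counts: (n_people, n_snps) array of 0/1/2 effect-allele counts.
    gwas_weights:  (n_snps,) array of per-SNP effect estimates.
    """
    allele_counts = np.asarray(allele_counts, dtype=float)
    gwas_weights = np.asarray(gwas_weights, dtype=float)
    raw = allele_counts @ gwas_weights        # weighted allele counts summed over SNPs
    return (raw - raw.mean()) / raw.std()     # standardize within the sample

# Hypothetical toy data: 5 people, 3 SNPs
toy_counts = np.array([[0, 1, 2],
                       [1, 1, 0],
                       [2, 0, 1],
                       [0, 2, 2],
                       [1, 0, 0]])
toy_weights = np.array([0.10, -0.05, 0.20])
toy_pgs = polygenic_score(toy_counts, toy_weights)
```

Because the score is standardized, regression coefficients on it can be read as the change in the phenotype per standard deviation of the score.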

Analysis

We compared survival differences between genotyped and non-genotyped HRS respondents using Kaplan-Meier survival curves35 and Cox proportional hazard models.36 We modelled early mortality using logistic regression. We used two specifications. The first specification included health indicators, ever-smoker status and educational attainment. The second specification added demographic information (birth year). We used fitted values from regression models to construct probability weights which are used as inverse probability weights to adjust estimates for mortality selection (see Supplementary data, available at IJE online).
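As an illustration of how fitted values from a logistic model can be turned into inverse probability weights, the sketch below simulates a cohort in which smokers are less likely to survive to genotyping. Everything here is an assumption made for the example: the simulated survival probabilities, the single covariate, the hand-written Newton-Raphson fitter `fit_logistic`, and the choice of weight (one over the fitted probability of survival). The weights actually used in the study are specified in the Supplementary data.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson. X must contain an intercept column."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                          # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated cohort (hypothetical numbers): smokers survive less often
rng = np.random.default_rng(0)
n = 5000
smoker = rng.binomial(1, 0.4, size=n)
p_true = np.where(smoker == 1, 0.6, 0.9)
survived = rng.binomial(1, p_true)

# Model survival on observables, then weight survivors by 1 / P(survive)
X = np.column_stack([np.ones(n), smoker])
beta = fit_logistic(X, survived)
p_hat = 1.0 / (1.0 + np.exp(-X @ beta))
ipw = 1.0 / p_hat

alive = survived == 1
cohort_prev = smoker.mean()                  # prevalence in the full cohort
naive_prev = smoker[alive].mean()            # prevalence among survivors only
weighted_prev = np.average(smoker[alive], weights=ipw[alive])
```

In this toy setting the survivor-only prevalence understates smoking in the full cohort, while the weighted prevalence recovers it. The recovery is exact here only because the model is saturated in the single binary covariate; with richer covariates the correction depends on the mortality model being well specified.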

Mortality selection analyses included all HRS participants born 1910–59 (n = 28 625). We excluded cohort members born before 1910 because only 12 of these respondents were genotyped. We excluded respondents born after 1960 because they are not categorized into current HRS birth cohorts. In our analyses of gene-by-cohort effects, we focused on those born between 1919 and 1955, because of sample size constraints.

We tested genetic associations with phenotypes using linear regression. (Ever-smoker status was analysed with linear probability models.) Models included polygenic scores, sex and birth year as covariates. We tested gene-by-cohort interactions using the same linear regression models used to test genetic associations, with the addition of a product term modelling interaction between birth cohort and polygenic score.
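A minimal sketch of the regression described above, with a product term for the gene-by-cohort interaction and optional probability weights, is given below. The helper `weighted_ols`, the centring of birth year at 1937, and the simulated coefficients are assumptions made for illustration; the outcome is generated without noise so that the coefficients are recovered exactly.

```python
import numpy as np

def weighted_ols(X, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy for b."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    XtW = X.T * w                      # multiply each observation by its weight
    return np.linalg.solve(XtW @ X, XtW @ y)

# Simulated data (hypothetical values, no noise)
rng = np.random.default_rng(1)
n = 500
pgs = rng.standard_normal(n)                       # standardized polygenic score
birth_year = rng.integers(1919, 1956, size=n)      # 1919 to 1955 inclusive
cohort = (birth_year - 1937) / 10.0                # decades from 1937 (assumed centring)
sex = rng.binomial(1, 0.5, size=n).astype(float)
weights = rng.uniform(1.0, 3.0, size=n)            # stand-in for probability weights

# Columns: intercept, PGS, birth cohort, sex, PGS x birth cohort
design = np.column_stack([np.ones(n), pgs, cohort, sex, pgs * cohort])
true_coef = np.array([1.0, 0.20, 0.05, 0.5, 0.03])
y = design @ true_coef

coef = weighted_ols(design, y, weights)
interaction = coef[4]                              # estimate of d in Eqn 2
```

Setting all weights to one gives the naïve (unweighted) estimate, so the same function can be used to compare scenarios.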

We evaluated mortality-selection-related bias under four scenarios:

  1. naïve scenario: those with genetic data;

  2. data from scenario 1 plus inverse probability weights constructed from model of mortality without birth year;

  3. data from scenario 1 plus inverse probability weights constructed from model of mortality with birth year. This scenario should be least affected by mortality selection;

  4. enhanced mortality selection: those with genetic data and still alive as of most recent data collection (the latest recorded death is in 2013). This scenario should show the most bias due to mortality selection.

All analyses were stratified by self-reported race/ethnicity and sex because of established differences in age-related mortality across demographic categories.

Results

Genotyped HRS respondents were longer-lived as compared with non-genotyped respondents

Genotyped respondents tended to be longer-lived as compared with their non-genotyped age-peers (Table 1, Figure 2). Age-specific mortality hazards for genotyped respondents were 10–22% of those for non-genotyped respondents at mean age of first interview (P < 0.01 for non-Whites and P < 0.001 for Whites). This increased longevity was related to health indicators, smoking behaviour and education (Figure 3). Respondents who survived through 2006 to become eligible for genotyping were healthier as compared with cohort-peers who did not survive through 2006: they were less likely to smoke, less likely to develop heart disease or diabetes, less likely to exhibit depressive symptoms and self-reported better health. They were also more highly educated.

Table 1.

Demographic characteristics of the HRS sample as a function of birth cohort and genotype status

| Cohort name | Birth years | N | % geno sample | Female: % geno sample | Female: % non-geno sample | Hispanic: % geno sample | Hispanic: % non-geno sample | Black: % geno sample | Black: % non-geno sample | Other: % geno sample | Other: % non-geno sample |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All | | 30196 | 40.9 | 59.2 | 54.9 | 9.1 | 8.4 | 13.1 | 16.4 | 4.7 | 4.6 |
| AHEAD | < 1924 | 7743 | 14.5 | 62.8 | 59.0 | 5.1 | 5.9 | 10.8 | 14.5 | 1.0 | 2.4 |
| CODA | 1924–30 | 4181 | 42.8 | 56 | 48.1 | 5.9 | 7.4 | 8.7 | 11.3 | 3.2 | 4.1 |
| HRS | 1931–41 | 10324 | 46.1 | 55.9 | 50.2 | 8.7 | 9.9 | 14.0 | 20.3 | 3.4 | 4.6 |
| War Babies | 1942–47 | 3418 | 55.2 | 62.9 | 60.1 | 8.1 | 7.6 | 14.3 | 16.2 | 4.5 | 5.4 |
| Early Baby Boomers | 1948–53 | 3506 | 61.2 | 58.2 | 53.9 | 12.9 | 15.9 | 15.4 | 19.0 | 8.8 | 13.1 |
| Mid Baby Boomers | 1954–59 | 747 | 62.5 | 80.1 | 78.9 | 16.3 | 13.9 | 11.8 | 13.2 | 10.9 | 12.1 |

Figure 2.


Kaplan-Meier survival curves for genotyped and non-genotyped HRS respondents, split by race and sex.

Figure 3.


Phenotypic means [with LOcally WEighted Scatter-plot Smoother (LOESS) fits] as a function of respondent birth year in genotyped and non-genotyped sample.

We next modelled mortality as a function of the health indicators. Model predictions were reasonably accurate as evaluated by area under the receiver operating characteristic curve (AUC). A straightforward interpretation of AUC is the probability that a randomly selected respondent who died before 2006 would have a higher predicted value as compared with a respondent who survived past 2006. AUCs tended to be > 0.8 for models including birth year and > 0.7 for models without birth year. AUCs were lower for non-White males. Complete results are reported in the Supplementary data, available at IJE online; note that these relatively simple models predict mortality as well as the more complex models.
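The pairwise interpretation of AUC given above can be computed directly, as in the sketch below. The function name `auc_pairwise` and the six toy predictions are illustrative assumptions, not study results; ties between a decedent and a survivor are counted as one half, which is the usual convention.

```python
import numpy as np

def auc_pairwise(pred, died):
    """Share of (decedent, survivor) pairs in which the decedent has the
    higher predicted mortality; tied pairs count as one half."""
    pred = np.asarray(pred, dtype=float)
    died = np.asarray(died, dtype=bool)
    diff = pred[died][:, None] - pred[~died][None, :]   # all decedent-survivor pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Hypothetical predicted probabilities of death before 2006
toy_pred = [0.9, 0.8, 0.4, 0.35, 0.3, 0.1]
toy_died = [1, 1, 0, 1, 0, 0]
toy_auc = auc_pairwise(toy_pred, toy_died)
```

In the toy data, eight of the nine decedent-survivor pairs are ordered correctly, so the AUC is 8/9. The all-pairs comparison is quadratic in sample size, which is fine for illustration but slow for cohorts the size of HRS; rank-based formulas give the same value more efficiently.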

Distributions of some polygenic scores vary across birth cohorts

If individuals who carry advantageous or disadvantageous genotypes have different survival probabilities as compared with cohort-peers who do not carry those genotypes, genotype distributions will vary across birth cohorts observed at a fixed point in time. Figure 4 plots means for BMI, height, education and smoking polygenic scores across HRS birth cohorts. Particularly among males, average educational attainment polygenic score declines from earlier to more recent birth cohorts.

Figure 4.


Mean polygenic score for non-Hispanic Whites split by sex as a function of birth year.

Correction for mortality selection disattenuates estimates of gene-by-cohort interaction effects for education and smoking polygenic scores

Correction for mortality selection using IPW did not substantially change estimates of simple genetic associations (Table 2; see also Supplementary data, available at IJE online). Correction for mortality selection using IPW did alter estimates of gene-by-cohort interaction effects in two cases (Figure 5; see also Supplementary data, available at IJE online). For BMI, height, education and smoking, polygenic score effect sizes increased from earlier to more recent birth cohorts. Across four specifications (enhanced mortality selection, which considered only individuals surviving to 2012; the naïve model; and two models correcting for mortality selection with IPW), magnitudes of increase in polygenic score effect sizes were roughly constant for BMI and height, but varied for education and smoking. After disattenuation via IPW, estimates of gene-by-cohort interaction that account for mortality selection were 74% larger for education in males and 38% larger for smoking in females. In contrast, gene-by-cohort interaction effect estimates were further attenuated in enhanced mortality selection models. A few influential outliers (n = 10) led to the negative interaction for height and birth cohort in males after enhanced mortality selection (see Supplementary data, available at IJE online). After their removal, the estimated interaction term was positive.

Table 2.

Main effect of PGS on phenotype (net of birth year and sex) under four different scenarios

| | Enhanced mortality | Naïve | Weighted, no birth year | Weighted, with birth year |
|---|---|---|---|---|
| BMI | 0.268 | 0.259 | 0.253 | 0.254 |
| SE | 0.011 | 0.010 | 0.010 | 0.010 |
| Height | 0.350 | 0.328 | 0.322 | 0.317 |
| SE | 0.011 | 0.010 | 0.026 | 0.030 |
| Education | 0.221 | 0.227 | 0.232 | 0.227 |
| SE | 0.010 | 0.010 | 0.010 | 0.010 |
| Smoke | 0.117 | 0.115 | 0.112 | 0.108 |
| SE | 0.012 | 0.011 | 0.011 | 0.011 |

SE, standard error.

Figure 5.


Birth-cohort variation in effect sizes for polygenic scores in models with enhanced mortality selection, naïve models and models correcting for mortality selection using inverse probability weighting. Effect sizes are estimated from Equation 4 in the Supplementary data (available at IJE online). Analyses were conducted for non-Hispanic White HRS respondents born 1919–55. Slope plots show changes in polygenic score effects for fitted values one SD above PGS mean minus fitted value one SD below PGS mean (y-axis) across birth cohorts (x-axis, showing year of birth). Barplots show estimated interaction coefficients for all scenarios. Data for women are plotted on the left side of the figure. Data for men are plotted on the right side of the figure.
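The quantity on the y-axis of the slope plots can be related to Eqn 2 with a short calculation. This is a sketch under stated assumptions: the PGS (G) is standardized, the fitted model is linear with a product term as in Eqn 2, and all other covariates (here collected in a hypothetical term f(B, sex)) do not involve G. Equation 4 of the Supplementary data may differ in detail.

```latex
\begin{align*}
\hat{Y}(G, B) &= \hat{c}\,G + \hat{d}\,(B\cdot G) + f(B,\text{sex}), \\
\hat{Y}(G{=}{+}1, B) - \hat{Y}(G{=}{-}1, B)
  &= 2\,(\hat{c} + \hat{d}\,B) \;=\; 2\,\hat{\beta}(B).
\end{align*}
% Terms that do not involve G cancel in the difference, so the plotted
% line has slope 2*d-hat with respect to birth year B.
```

Under these assumptions, a steeper line in a slope plot corresponds directly to a larger estimated interaction coefficient in the accompanying barplot.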

Discussion

The addition of genomic data to social surveys and epidemiological studies opens new research opportunities to understand how the genome influences health and disease. One emerging area of study is variation in genetic effects across birth cohorts.7–13 A threat to validity in such studies is mortality selection. We considered implications of mortality selection in one longitudinal social survey, the HRS. We observed substantial mortality selection into the HRS genetic database; during 1992–2005, HRS respondents who would eventually be genotyped were healthier, better educated and longer-lived as compared with birth-cohort peers who did not provide DNA. This observation is consistent with previous study of the HRS.37

We found evidence for attenuation bias in the estimation of birth cohort-varying genetic effects. This finding, that mortality selection may lead to attenuation, has also been observed in the context of Mendelian randomization.5,6 If certain genetic profiles are associated with increased risks for mortality,38,39 then individuals with such genotypes are less likely to be observed in older samples (e.g. Figure 4).

We acknowledge that we cannot observe the ‘true’ model. We interpret the systematic changes in the birth cohort-varying PGS effects (i.e. increasing as we go from estimates based on enhanced mortality selection to estimates based on weighting to remove effects of mortality selection) as evidence that naïve estimates may be biased. This is contingent upon the model for mortality selection being correctly specified. Moreover, our ability to detect bias is contingent upon selection occurring due to observables (e.g. on smoking status).

The evidence presented here is focused on results from polygenic scores derived from GWAS. GWAS themselves are likely conducted in the presence of substantial mortality selection. Polygenic scores, because they average effects from across the genome, are fairly robust predictors.34,40 Therefore, it may not be surprising that little bias was observed in the estimation of birth cohort-invariant marginal effects. However, even small amounts of bias may be problematic in studies of individual genetic variants (e.g. GWAS) since magnitudes of single-variant effects are mostly small. Moreover, mortality selection may act as a form of collider bias,41 potentially leading to false-positive findings in a GWAS.42 Future GWAS replication studies may consider mortality bias as a potential confounder. The HRS is well-equipped for this purpose. It is used as a validation sample in several recent GWAS33,43 and it would be possible to test the robustness of replication results to mortality selection. The empirical evidence presented with respect to height and BMI suggests that such resulting biases may be small, but mortality bias may be more pernicious in other contexts, especially in studies of traits with clear mortality associations.

Supplementary Data

Supplementary data are available at IJE online.

Funding

Research was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) of the National Institutes of Health (NIH) under Award R21HD078031. Further support was provided by the NIH/NICHD-funded University of Colorado Population Center (R24HD066613). D.W.B. is supported by an Early Career Research Fellowship from the Jacobs Foundation, P30AG034424, P30AG028716 and R01AG032282. A.H. is supported by NIH/NIA R01 AG026291. This research was facilitated by the Social Science Genetic Association Consortium (SSGAC). D.C. is supported by Russell Sage Foundation research grant (83-15-29), “GxE and health inequality across the life course”.


Acknowledgements

This research uses data from the HRS, which is sponsored by the National Institute on Aging (grants NIA U01AG009740, RC2AG036495 and RC4AG039029) and conducted by the University of Michigan.

Conflict of interest: None declared.

References

  • 1. Centers for Disease Control and Prevention. Smoking-attributable mortality, years of potential life lost, and productivity losses–United States, 2000–2004. MMWR Morb Mortal Wkly Rep 2008;57:1226.
  • 2. Vaupel JW, Yashin AI. Heterogeneity’s ruses: some surprising effects of selection on population dynamics. Am Stat 1985;39:176–85.
  • 3. Bergmann MM, Rehm J, Klipstein-Grobusch K. et al. The association of pattern of lifetime alcohol use and cause of death in the European Prospective Investigation into Cancer and Nutrition (EPIC) study. Int J Epidemiol 2013;42:1772–90.
  • 4. Boef AG, le Cessie S, Dekkers OM. Mendelian Randomization Studies in the Elderly. Epidemiology 2015;26:e15–16.
  • 5. Tchetgen EJT, Walter S, Vansteelandt S, Martinussen T, Glymour M. Instrumental variable estimation in a survival context. Epidemiology 2015;26:402–10.
  • 6. Taylor AE, Munafò MR. CARTA consortium. Commentary: Does mortality from smoking have implications for future Mendelian randomization studies? Int J Epidemiol 2014;43:1483–86.
  • 7. Demerath EW, Choh AC, Johnson W. et al. The positive association of obesity variants with adulthood adiposity strengthens over an 80-year period: a gene-by-birth year interaction. Hum Hered 2013;75:175–85.
  • 8. Domingue BW, Conley D, Fletcher J, Boardman JD. Cohort Effects in the Genetic Influence on Smoking. Behav Genet 2016;46:31–42.
  • 9. Liu H, Guo G. Lifetime socioeconomic status, historical context, and genetic inheritance in shaping body mass in middle and late adulthood. Am Sociol Rev 2015;80:705–37.
  • 10. Conley D, Laidley TM, Boardman JD, Domingue BW. Changing Polygenic Penetrance on Phenotypes in the 20th Century Among Adults in the US Population. Sci Rep 2016;6:30348.
  • 11. Conley D, Laidley T, Belsky DW, Fletcher JM, Boardman JD, Domingue BW. Assortative mating and differential fertility by phenotype and genotype across the 20th century. Proc Natl Acad Sci U S A 2016;113:6647–52.
  • 12. Walter S, Mejía-Guevara I, Estrada K, Liu SY, Glymour MM. Association of a Genetic Risk Score With Body Mass Index Across Different Birth Cohorts. JAMA 2016;316:63–69.
  • 13. Beauchamp JP. Genetic evidence for natural selection in humans in the contemporary United States. Proc Natl Acad Sci U S A 2016;113:7774–79.
  • 14. Rubin DB, Little RJ. Statistical Analysis With Missing Data. Hoboken, NJ: Wiley, 2002.
  • 15. Hunter DJ. Gene–environment interactions in human diseases. Nat Rev Genet 2005;6:287–98.
  • 16. Manuck SB, McCaffery JM. Gene-Environment Interaction. Annu Rev Psychol 2014;65:41–70.
  • 17. Seaman SR, White IR. Review of inverse probability weighting for dealing with missing data. Stat Methods Med Res 2013;22:278–95.
  • 18. Levine ME, Crimmins EM. A Genetic Network Associated With Stress Resistance, Longevity, and Cancer in Humans. J Gerontol A Biol Sci Med Sci 2016;71:703–12.
  • 19. Wray NR, Goddard ME, Visscher PM. Prediction of individual genetic risk to disease from genome-wide association studies. Genome Res 2007;17:1520–28.
  • 20. Purcell SM, Wray NR, Stone JL. et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 2009;460:748–52.
  • 21. Dudbridge F. Polygenic Epidemiology. Genet Epidemiol 2016;40:268–72.
  • 22. Hummer RA, Hernandez EM. The Effect of Educational Attainment on Adult Mortality in the United States. Popul Bull 2013;68(1):1.
  • 23. Lundborg P, Lyttkens CH, Nystedt P. The Effect of Schooling on Mortality: New Evidence From 50,000 Swedish Twins. Demography 2016;53:1135–68.
  • 24. Boardman JD, Blalock CL, Pampel FC. Trends in the Genetic Influences on Smoking. J Health Soc Behav 2010;51:108–23.
  • 25. Allebeck P, Bergh C. Height, body mass index and mortality: do social factors explain the association? Public Health 1992;106:375–82.
  • 26. Berrington de Gonzalez A, Hartge P, Cerhan JR. et al. Body-Mass Index and Mortality among 1.46 Million White Adults. N Engl J Med 2010;363:2211–19.
  • 27. Leon DA, Davey Smith G, Shipley M, Strachan D. Adult height and mortality in London: early life, socioeconomic confounding, or shrinkage? J Epidemiol Community Health 1995;49:5–9.
  • 28. Flegal KM, Graubard BI, Williamson DF, Gail MH. Cause-specific excess deaths associated with underweight, overweight, and obesity. JAMA 2007;298:2028–37.
  • 29. Chien S, Campbell N, Hayden O. et al. RAND HRS Data Documentation, Version N. 2014. www.rand.org, Labor & Population, RAND Center for the Study of Aging, Data Products.
  • 30. Wood AR, Esko T, Yang J. et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat Genet 2014;46:1173–86.
  • 31. Locke AE, Kahali B, Berndt SI. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 2015;518:197–206.
  • 32. The Tobacco and Genetics Consortium. Genome-wide meta-analyses identify multiple loci associated with smoking behavior. Nat Genet 2010;42:441–47.
  • 33. Okbay A, Beauchamp JP, Fontana MA. et al. Genome-wide association study identifies 74 loci associated with educational attainment. Nature 2016;533:539–42.
  • 34. Dudbridge F. Power and predictive accuracy of polygenic risk scores. PLoS Genet 2013;9:e1003348.
  • 35. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958;53:457–81.
  • 36. Cox DR. Regression models and life-tables. J R Stat Soc Ser B Methodol 1972;34:187–220.
  • 37. Zajacova A, Burgard SA. Healthier, wealthier, and wiser: a demonstration of compositional changes in aging cohorts due to selective mortality. Popul Res Policy Rev 2013;32:311–24.
  • 38. Marioni RE, Ritchie SJ, Joshi PK. et al. Genetic variants linked to education predict longevity. Proc Natl Acad Sci U S A 2016;113:13366–71.
  • 39. Eppinga RN, Hagemeijer Y, Burgess S. et al. Identification of genomic loci associated with resting heart rate and shared genetic predictors with all-cause mortality. Nat Genet 2016;48:1557–63.
  • 40. Rietveld CA, Conley D, Eriksson N. et al. Replicability and robustness of genome-wide-association studies for behavioral traits. Psychol Sci 2014;25:1975–86.
  • 41. Day FR, Loh P-R, Scott RA, Ong KK, Perry JRB. A Robust Example of Collider Bias in a Genetic Association Study. Am J Hum Genet 2016;98:392–93.
  • 42. Munafo MR, Tilling K, Taylor AE, Evans DM, Davey Smith G. Collider Scope: How selection bias can induce spurious associations. bioRxiv 2016;079707.
  • 43. Okbay A, Baselmans BML, De Neve J-E. et al. Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses. Nat Genet 2016;48:624–33.


Articles from International Journal of Epidemiology are provided here courtesy of Oxford University Press
