Significance
Educational policies may increase or decrease health differences, depending on whether they reinforce or counteract gene-related differences. We investigate whether one such policy affected health differently for people with different genetic backgrounds. We find that the additional education generated by the policy benefited those with higher genetic risk of obesity the most, reducing the gap in unhealthy body size between those in the top and bottom terciles of genetic risk of obesity from 20 to 6 percentage points. Our results challenge the notion of genetic determinism and underscore the role that social policy can have in mitigating possible health differences arising from genetic background.
Keywords: education, health, gene-by-environment, obesity, genetics
Abstract
This work investigates whether genetic makeup moderates the effects of education on health. Low statistical power and endogenous measures of environment have been obstacles to the credible estimation of such gene-by-environment interactions. We overcome these obstacles by combining a natural experiment that generated variation in secondary education with polygenic scores for a quarter-million individuals. The additional schooling affected body size, lung function, and blood pressure in middle age. The improvements in body size and lung function were larger for individuals with high genetic predisposition to obesity. As a result, education reduced the gap in unhealthy body size between those in the top and bottom terciles of genetic risk of obesity from 20 to 6 percentage points.
Educational policies may increase or decrease health differences, depending on whether they reinforce or counteract gene-related differences (1). Both early life experiences, such as education, and genetic factors are independently associated with later-life health (2–4). A growing literature suggests that health may also depend on the interaction between these two factors (5–7). Where strong gene-by-environment (GxE) interactions exist, modest average effects of education may conceal larger effects for populations with particular genotypes and lead to underestimates of the benefits of schooling. We investigate this possibility by testing whether genetic makeup moderates the effect of an additional year of secondary education on middle-age health.
After the publication of high-impact GxE studies (8–10), controversies over the replicability of results tempered the enthusiasm for this research program. Low statistical power and endogenous measures of environment are believed to be the main reasons for the limited replicability (11–13). Many GxE studies are low-powered because behavioral traits tend to be polygenic, meaning that they are influenced by a large number of genetic markers, each with a very small effect (14). Furthermore, the effect size of interactions is typically lower than that of direct effects (11). As a result, much of the previous literature, which focused on individual candidate genes (exceptions include refs. 15–17), was underpowered (12).
In addition, endogenous measures of environment may lead to biased estimates of GxE interactions (18, 19). Measures of environment are “endogenous” when the outcome affects the environment (i.e., “reverse causality”) or when the relationship between the environment and the outcome is confounded by omitted third factors. Endogenous measures are a concern in our context because health in childhood may affect educational attainment (EA), or self-control may drive both schooling decisions and health behaviors.
We overcome these obstacles by combining a natural experiment with polygenic scores (PGSs), which are indices constructed from millions of genetic markers. The natural experiment, a well-known compulsory schooling age reform in the United Kingdom, generated as-good-as-random variation in education, allowing us to obtain causal estimates of the effect of education on health (20, 21). We find that 14% of students completed an additional year of secondary education as a result of this reform. The combination of this experiment with the use of PGSs—instead of a candidate-gene approach—for a sample of a quarter-million individuals makes our analyses appropriately powered (22).
Before the release of the complete genetic data used in this study, we wrote a comprehensive preanalysis plan describing the construction of all variables to be used and the specification of all analyses to be run (ref. 22 and SI Appendix, section A). We strictly follow this plan below. Our plan was informed by our previous work, which used nongenetic data to estimate how education affects the distribution of health in middle age (23). In that paper, we documented that the effects of education on health are concentrated at particular parts of the health distribution, which suggests that such effects vary across individuals (SI Appendix, section C). In this work, we formally test whether the effects of education on health vary across individuals by investigating whether such effects are moderated by genetic makeup.
We use data from the UK Biobank (UKB). These data are restricted, but access can be obtained by following the procedures described at www.ukbiobank.ac.uk/register-apply/.
Following our previous work, we studied three health dimensions: body size, lung function, and blood pressure. To reduce concerns about multiple-hypothesis testing, we constructed an index that is a weighted average of objective outcomes measuring each dimension (24). The body size index includes body mass index (BMI), body fat percentage, and waist–hip ratio. The lung function index includes the forced expiratory volume in 1 s (FEV1), the forced vital capacity (FVC), and the peak expiratory flow (PEF). The blood pressure index includes multiple diastolic and systolic blood pressure measurements. We also constructed a summary index that is a weighted average of the body size, lung function, and blood pressure indices (SI Appendix, section A, VI). We oriented all four indices so that a higher number corresponds to worse health. For each index, we studied two types of outcomes: the continuous index measure and an indicator for whether the index is above a threshold specified in our preanalysis plan (22). These thresholds correspond to the values where we estimated the largest distributional effects in our previous work (23). We used such a threshold in an effort to maximize statistical power: We anticipated that individuals near this threshold would be most responsive to the policy and also exhibit the largest GxE effects. Although selecting the threshold this way leads to upward-biased estimates of the effect of education, it does not lead to biased estimates of the GxE interaction (SI Appendix, section A, XI). Below, we compare our results to more traditional measures and clinical thresholds that may not be as well powered. In SI Appendix, section I, we show that our results are robust to alternative thresholds.
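The index construction described above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes equal weights across standardized components (the actual weighting follows ref. 24 and the preanalysis plan), the threshold value is invented, and the function names `standardize`, `build_index`, and `above_threshold` are hypothetical.

```python
import numpy as np

def standardize(x):
    """Rescale a 1-D array to mean zero and SD one."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def build_index(components, weights=None):
    """Weighted average of standardized components (equal weights assumed by default)."""
    z = np.column_stack([standardize(c) for c in components])
    if weights is None:
        weights = np.full(z.shape[1], 1.0 / z.shape[1])
    return z @ np.asarray(weights, dtype=float)

def above_threshold(index, threshold):
    """Binary outcome: 1 if the index exceeds the threshold, else 0."""
    return (np.asarray(index) > threshold).astype(int)

# Toy values for four hypothetical participants (higher = worse health).
bmi = np.array([22.0, 27.5, 31.0, 35.2])
body_fat = np.array([18.0, 25.0, 33.0, 40.0])
waist_hip = np.array([0.80, 0.88, 0.93, 0.99])

body_size_index = build_index([bmi, body_fat, waist_hip])
unhealthy = above_threshold(body_size_index, threshold=0.5)  # threshold is illustrative
```

Each index yields both outcome types studied in the paper: the continuous measure (`body_size_index`) and the above-threshold indicator (`unhealthy`).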
We constructed PGSs for two traits for which large genome-wide association studies (GWAS) are publicly available: BMI (25) and EA (26). We used UKB data to augment the published GWASs in a way that avoids over-fitting (SI Appendix) and followed a standard set of quality-control protocols (27). Final weights were produced by using LDpred (28). The PGSs were normalized to have mean zero and SD one and oriented so that each PGS was positively correlated with its corresponding outcome. The correlation between these two PGSs is −0.24.
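The normalization and orientation step can be illustrated with a short sketch. It assumes a raw score is already available (the LDpred weighting itself is not reproduced here), and `normalize_pgs` and the toy values are hypothetical.

```python
import numpy as np

def normalize_pgs(raw_score, phenotype):
    """Standardize a raw score to mean 0 and SD 1, then flip its sign if
    needed so that it correlates positively with the phenotype."""
    raw_score = np.asarray(raw_score, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    z = (raw_score - raw_score.mean()) / raw_score.std()
    if np.corrcoef(z, phenotype)[0, 1] < 0:
        z = -z
    return z

# Toy data in which the raw score happens to be negatively related to BMI.
raw = np.array([0.3, -1.2, 0.8, 2.1, -0.5])
bmi = np.array([24.0, 29.0, 23.5, 21.0, 27.0])
pgs = normalize_pgs(raw, bmi)
```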
The literature has proposed several models to explain why genetic predisposition for obesity might interact with education (17, 29). Two examples are the diathesis-stress model and the differential susceptibility model. The diathesis-stress model (also known as the social trigger/compensation model) posits that an unhealthy environment magnifies genetic tendencies for unhealthy behaviors, while a healthy environment protects against genetic risk (30, 31). There is suggestive evidence that physical activity, diet, and one’s obesogenic environment (all of which may be affected by education) may modify the genetic risk for obesity (32–35). The diathesis-stress model therefore predicts that education will cause larger weight losses among those with higher genetic predisposition to obesity. In contrast, the differential susceptibility model hypothesizes that individuals with certain genotypes are more sensitive to environmental conditions (36); these individuals thrive in positive environments, but wilt in negative environments. Assuming that the BMI PGS reflects such sensitivity, this model also predicts that education will cause larger weight losses among those with higher genetic predisposition to obesity.
Similarly, we studied the interaction between education and the EA PGS because the EA PGS, which is thought to capture, among many other things, innate academic ability (26, 37), may moderate the effect of education on health. It is a priori unknown whether individuals with higher genetic predisposition to EA might benefit more or less from an additional year of education. On one hand, individuals with higher EA PGSs may learn more during that year (perhaps, e.g., because they are fast learners or it is easier for them to learn), which could translate into larger health improvements. On the other hand, individuals with lower EA PGSs may have worse health to begin with, such that they may benefit most from a given change in learning. The EA PGS may also capture personality traits and intergenerational pathways, which could alternatively explain why it may moderate the effect of education on health, although the sign of the interaction is also a priori unknown.
Currently, there are no publicly available, sufficiently predictive GWASs for traits related to lung function and blood pressure. We opted therefore to investigate whether the BMI PGS moderated the effects of education on lung function and on blood pressure because BMI is genetically correlated with smoking and with coronary artery disease (38). Moreover, obesity has direct effects on both lung and vascular functions (39–41).
Fig. 1 documents health differences between those with different levels of genetic risk of obesity. Specifically, it plots the fraction of study participants in the bottom, middle, and top terciles of the BMI PGS distribution with a health index above its corresponding threshold. To facilitate the comparison with Fig. 3 estimates, we restricted the sample to participants who were born before September 1, 1957, and who dropped out before age 16. While 11% of those in the bottom PGS tercile had a body size above the threshold, this fraction was almost three times larger (31%) among those in the top tercile. Fig. 1 shows that the BMI PGS is more predictive of the body size (R² = 0.049) and summary indices (R² = 0.021) than of the lung function (R² = 0.002) and blood pressure indices (R² = 0.002). See SI Appendix, section D for the corresponding figures for continuous outcomes and for the predictive power of EA PGS.
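The tercile comparison can be sketched as below. This is an illustrative computation on invented data, not a reproduction of Fig. 1; `tercile_labels` and `share_above_by_tercile` are hypothetical helper names.

```python
import numpy as np

def tercile_labels(pgs):
    """Return 0 (bottom), 1 (middle), or 2 (top) tercile for each score."""
    pgs = np.asarray(pgs, dtype=float)
    cut1, cut2 = np.quantile(pgs, [1 / 3, 2 / 3])
    return np.digitize(pgs, [cut1, cut2])

def share_above_by_tercile(pgs, above):
    """Fraction with an above-threshold outcome within each PGS tercile."""
    labels = tercile_labels(pgs)
    above = np.asarray(above, dtype=float)
    return np.array([above[labels == k].mean() for k in range(3)])

# Toy sample of nine participants, sorted by PGS for readability.
pgs = np.array([-1.5, -1.0, -0.5, -0.1, 0.1, 0.4, 0.9, 1.3, 2.0])
above = np.array([0, 0, 0, 0, 1, 0, 1, 1, 0])
shares = share_above_by_tercile(pgs, above)
gap_top_bottom = shares[2] - shares[0]
```

The quantity `gap_top_bottom` corresponds to the kind of top-versus-bottom tercile gap reported in the text (31% versus 11% for body size).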
In 1972, England, Scotland, and Wales increased the minimum age at which students could drop out of school from 15 to 16 y. The reform affected only students born on or after September 1, 1957, generating a discontinuity in the relationship between education and date of birth.
Fig. 2A shows that the fraction staying in school until age 16 increased discontinuously for those born on or after September 1, 1957. About 83% of those born between September 1956 and August 1957 stayed in school until at least age 16. This fraction is close to 97% among those born between September 1957 and August 1958, the first birth cohort affected by the reform. One can interpret this discontinuous change, which has been documented previously (21, 42), as the effect of the reform on education. In the UKB sample, we estimate that the policy increased the fraction staying in school until age 16 by 14 percentage points (SI Appendix, section E). In our previous work, we showed that the policy also led individuals to obtain more qualifications, earn higher income, and work in occupations with higher socioeconomic status (23).
To estimate the causal effect of education on health, we used a regression discontinuity design (RDD). The RDD compares the health outcomes of individuals born just before and just after September 1, 1957, controlling for cohort trends. Intuitively, individuals born on August 31, 1957, and individuals born on September 1, 1957, were comparable (e.g., in terms of their childhood health) before the reform. In other words, the health of those born on August 31, 1957, provides a counterfactual of the health that those born on September 1, 1957, would have had had they not been forced to stay in school until age 16. For this reason, any later-life health differences between these two groups can be attributed to the causal effect of the additional year of schooling. In SI Appendix, section B, we offer evidence that those born just before and just after September 1, 1957, were comparable before the reform. For example, we show that the two groups are genetically similar. Genetic markers are useful to test the RDD assumption because genotypes are objectively measured, determined at conception, and immutable.
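The logic of the RDD can be illustrated with a minimal sketch that estimates the jump in an outcome at the birth-date cutoff while allowing separate quadratic trends on each side. The data are simulated and noise-free, the function name `rd_jump` is hypothetical, and the sketch omits the covariates used in the paper.

```python
import numpy as np

def rd_jump(dob, y):
    """Estimate the discontinuity in y at dob = 0, with separate quadratic
    trends in dob before and after the cutoff."""
    dob = np.asarray(dob, dtype=float)
    y = np.asarray(y, dtype=float)
    post = (dob >= 0).astype(float)          # born on or after the cutoff
    X = np.column_stack([
        np.ones_like(dob), post,             # intercept and jump at the cutoff
        dob, dob ** 2,                       # pre-cutoff trend
        post * dob, post * dob ** 2,         # change in trend after the cutoff
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

# Date of birth in years relative to the cutoff (toy, noise-free outcome).
dob = np.linspace(-10, 10, 401)
post = (dob >= 0)
y = 0.5 + 0.02 * dob - 0.001 * dob ** 2 - 0.06 * post
jump = rd_jump(dob, y)
```

This jump is a reduced-form quantity: the effect of being born after the cutoff. Converting it into the effect of the additional year of schooling requires scaling by the change in schooling at the cutoff, which is what the 2SLS estimation does.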
To investigate whether the effect of education on health varies with genetic makeup, we compared the discontinuous changes in health of groups with different PGSs, accounting for the differences in the fraction of individuals affected by the reform in different PGS groups. Fig. 2 B and C shows that, among cohorts born before September 1957, those with higher BMI PGSs and those with lower EA PGSs were less likely to stay in school until age 16. As expected, the results in Fig. 2C represent the strongest GxE effect resulting from the reform: The difference in the fraction staying in school until age 16 between the bottom and top EA PGS terciles fell from 18.4 percentage points before the reform to 3.1 percentage points afterward. Because almost everyone stayed in school until at least age 16 after the reform, there was little variation in EA at this level left after the reform to be explained by the EA PGS.
Formally, we estimated the following regression:
$$Health_i = \beta_0 + \beta_1 (Edu16_i \times PGS_i) + \beta_2 Edu16_i + \beta_3 PGS_i + f(DoB_i) + (Edu16_i \times PC_i)'\gamma_1 + PC_i'\gamma_2 + x_i'\alpha + \varepsilon_i \qquad [1]$$
where Health_i is a health outcome; Edu16_i is an indicator for staying in school until age 16; PGS_i is the BMI or EA PGS; f(DoB_i) is a quadratic polynomial in date of birth (we allow for different pretrends and posttrends); PC_i is a vector of the first 15 principal components of the genotypic data; and x_i is a vector of predetermined characteristics, namely age, age-squared, gender, month, and country of birth. We include PC_i and Edu16_i × PC_i to correct for population stratification (43, 44). To account for the endogeneity of Edu16_i and for the differential impacts of the reform on the education of groups with different PGSs, we estimated Eq. 1 through two-stage least squares (2SLS), using the reform as an instrument. The 2SLS estimates the effect of staying in school until age 16 among those affected by the reform (i.e., those who would have dropped out at age 15 in the absence of the reform). In other words, our results cannot be explained by the fact that individuals with lower EA PGSs (or individuals with higher BMI PGSs) were more likely to have been affected by the reform. We restricted the sample to participants of European ancestry born within 10 y of September 1, 1957 (n = 253,715). In SI Appendix, section H, we show that our results are robust to tighter bandwidths and to linear trends.
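A stripped-down sketch of the 2SLS estimation with an interaction term follows. It is an assumption-laden illustration, not the authors' specification: the data are simulated, the date-of-birth trends, principal components, and covariates are omitted, and `two_stage_least_squares` is a hypothetical helper. Staying in school and its interaction with the PGS are instrumented by the reform indicator and its interaction with the PGS.

```python
import numpy as np

def two_stage_least_squares(y, endog, instruments, exog):
    """Return 2SLS coefficients, ordered as [endog columns, exog columns]."""
    X = np.column_stack([endog, exog])        # regressors in the outcome equation
    Z = np.column_stack([instruments, exog])  # instruments plus exogenous controls
    # First stage: fitted values of every regressor given Z.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress the outcome on the fitted regressors.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Simulated data (all values invented for illustration).
rng = np.random.default_rng(0)
n = 5000
pgs = rng.standard_normal(n)
post = (rng.random(n) < 0.5).astype(float)   # stand-in for born on/after the cutoff
u = rng.standard_normal(n)                   # unobserved confounder
edu16 = (0.8 * post + 0.5 * u + rng.standard_normal(n) > 0).astype(float)
health = (1.0 - 0.05 * edu16 * pgs - 0.06 * edu16 + 0.12 * pgs
          + 0.3 * u + 0.5 * rng.standard_normal(n))

endog = np.column_stack([edu16 * pgs, edu16])      # Edu16 x PGS, Edu16
instruments = np.column_stack([post * pgs, post])  # reform x PGS, reform
exog = np.column_stack([np.ones(n), pgs])          # intercept, PGS
beta = two_stage_least_squares(health, endog, instruments, exog)
beta1, beta2 = beta[0], beta[1]  # interaction and main effect of Edu16
```

In this layout `beta1` plays the role of the coefficient on the interaction term and `beta2` the coefficient on Edu16, matching the coefficients tested in the tables.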
Table 1 summarizes the main results (see SI Appendix, sections E and G for additional results). We find that, overall, the effects of education on health depend on the BMI PGS. In five of eight regressions, the P value on β1 is <0.05. In two cases, it is less than the Bonferroni-corrected value 0.05/16 = 0.0031. In contrast, there is no evidence that the effects of education on health depend on the EA PGS: None of the eight regressions has P values on the interaction term <0.05.
Table 1.
| Above threshold: Body size | Above threshold: Lung function | Above threshold: Blood pressure | Above threshold: Summary index | Continuous: Body size | Continuous: Lung function | Continuous: Blood pressure | Continuous: Summary index |
Interaction with BMI PGS | ||||||||
BMI PGS × Edu16 | −0.057*** (0.011) | −0.037** (0.016) | 0.016 (0.018) | −0.101*** (0.016) | 0.028 (0.033) | −0.091** (0.044) | 0.073** (0.036) | −0.010 (0.042) |
Edu16 | −0.060* (0.035) | −0.089* (0.048) | 0.106** (0.051) | −0.058 (0.050) | −0.119 (0.096) | −0.147 (0.124) | 0.118 (0.102) | −0.092 (0.122) |
BMI PGS | 0.124*** (0.010) | 0.048*** (0.015) | 0.025 (0.016) | 0.152*** (0.015) | 0.263*** (0.030) | 0.127*** (0.040) | 0.024 (0.033) | 0.206*** (0.038) |
P value for H0: no effect of education | 7.97 × 10⁻¹⁰ | 6.09 × 10⁻⁵ | 0.004 | 1.50 × 10⁻¹⁴ | 0.455 | 0.002 | 0.002 | 0.545 |
Interaction with EA PGS | ||||||||
EA PGS × Edu16 | −0.013 (0.020) | 0.030 (0.028) | 0.021 (0.030) | 0.049* (0.029) | −0.093 (0.059) | −0.001 (0.074) | −0.024 (0.060) | −0.054 (0.074) |
Edu16 | −0.089** (0.042) | −0.085 (0.056) | 0.117* (0.060) | −0.067 (0.059) | −0.192 (0.119) | −0.172 (0.148) | 0.117 (0.121) | −0.146 (0.147) |
EA PGS | −0.013 (0.020) | −0.043 (0.027) | −0.054* (0.029) | −0.078*** (0.028) | −0.036 (0.057) | −0.060 (0.072) | −0.054 (0.058) | −0.076 (0.072) |
P value for H0: no effect of education | 0.006 | 3.36 × 10⁻⁷ | 0.014 | 2.11 × 10⁻⁸ | 0.257 | 0.037 | 0.016 | 0.576 |
Observations | 249,699 | 203,048 | 253,377 | 200,398 | 249,699 | 203,048 | 253,377 | 200,398 |
Dep. Var. mean among compliers | 0.215 | 0.287 | 0.611 | 0.317 | 0.261 | 0.269 | 0.090 | 0.311 |
The 2SLS estimates are shown. Above threshold is an indicator of whether the health index is greater than the threshold specified in ref. 22. Edu16 is an indicator for staying in school until age 16 and is instrumented by an indicator for being born after September 1, 1957. The P value for H0: no effect of education is the P value from a joint test that β1 = β2 = 0. The last row shows means of the dependent variable (Dep. Var.) among prereform compliers, defined as individuals born before September 1, 1957, who dropped out before age 16. ***P < 0.01, **P < 0.05, *P < 0.1.
We can reject the hypothesis that staying in school until age 16 has no effects on health in middle age. In 12 of 16 regressions, the P value of the joint test that β1 = β2 = 0 is <0.05 and in 10 cases less than the Bonferroni-corrected value of 0.05/16 = 0.0031. The direction of these results is consistent with previous work (23, 42).
For the binary measures of the body size, lung function, and summary indices, the improvements in health are larger for individuals with a higher BMI PGS. Similarly, for the continuous measure of lung function, improvements in health are larger for individuals with higher BMI PGSs. While the estimate for the continuous measure of blood pressure suggests an interaction of the BMI PGS and education, there are reasons to question the credibility of this particular result: its marginal significance (P value of 0.041), the weak direct effect of the PGS (P value of 0.458), and the low power anticipated in the preanalysis plan (in the most optimistic case, 17% power to detect an effect at 5% significance).
The results shown in Table 1 assume that the effect of staying in school until age 16 varies linearly with the PGS. In Fig. 3, we adopt a more nonparametric specification and estimate separate effects for the bottom, middle, and top terciles of the BMI PGS distribution. The bars show point estimates of the effects on the binary outcomes with 95% confidence intervals. Figures presented in SI Appendix, section F for the continuous measures and EA PGS show results qualitatively similar to the corresponding results in Table 1.
Fig. 3 shows that education reduced the differences in body size by genetic risk shown in Fig. 1. For the top tercile of the BMI PGS distribution, staying in school until age 16 reduced the fraction above the body size threshold by 13 percentage points. For the bottom tercile, there was a modest, statistically insignificant increase. As a result, the additional year of education reduced the gap in “unhealthy body size” (i.e., being above the body size threshold) between the top and the bottom PGS terciles from 20 to 6 percentage points.
The above results correspond to indices that were constructed as a weighted average of related health outcomes. While an index has the advantage of being better powered than a single outcome, it has the disadvantage of being a nonstandard composite measure. For comparison with more traditional measures of health, Table 2 shows separate results for the outcomes that compose each index. The upper part shows results for the binary measures. The lower part shows results for the continuous measures. To construct the thresholds for the binary measures of the outcomes, we followed the same procedure used to construct the thresholds for the binary measures of the indices. Note that, because thresholds are calculated separately for each measure, the fraction above the threshold differs across measures and from that of the corresponding index.
Table 2.
| Body size: BMI | Body size: Body fat percentage | Body size: Waist–hip ratio | Lung function: FEV1 | Lung function: FVC | Lung function: PEF | Blood pressure: Diastolic | Blood pressure: Systolic |
Above threshold | ||||||||
BMI PGS × Edu16 | −0.024* (0.013) | −0.065*** (0.012) | −0.031*** (0.010) | −0.044** (0.020) | −0.055*** (0.018) | −0.012 (0.018) | 0.029 (0.018) | 0.018 (0.017) |
Edu16 | −0.068 (0.042) | −0.068* (0.037) | −0.090*** (0.031) | −0.116** (0.059) | −0.115** (0.054) | −0.054 (0.051) | 0.081 (0.051) | 0.107** (0.050) |
BMI PGS | 0.137*** (0.012) | 0.137*** (0.011) | 0.059*** (0.010) | 0.061*** (0.019) | 0.075*** (0.017) | 0.013 (0.016) | 0.016 (0.016) | 0.013 (0.016) |
P value for H0: no effect of education | 0.004 | 7.11 × 10⁻¹² | 4.28 × 10⁻⁸ | 3.81 × 10⁻⁵ | 2.98 × 10⁻⁷ | 0.164 | 0.002 | 0.002 |
Dep. Var. mean among compliers | 0.322 | 0.223 | 0.162 | 0.488 | 0.368 | 0.307 | 0.609 | 0.655 |
Continuous | ||||||||
BMI PGS × Edu16 | 0.03 (0.160) | 0.450** (0.222) | 0.001 (0.002) | 0.053** (0.025) | 0.063** (0.031) | 6.977* (4.119) | 0.922** (0.359) | 0.474 (0.587) |
Edu16 | −0.201 (0.469) | −0.693 (0.641) | −0.009 (0.007) | 0.113 (0.070) | 0.139 (0.088) | 11.196 (11.817) | 0.964 (1.029) | 2.340 (1.689) |
BMI PGS | 1.675*** (0.146) | 1.441*** (0.203) | 0.010*** (0.002) | −0.077*** (0.023) | −0.108*** (0.028) | −6.545* (3.802) | 0.146 (0.329) | 0.619 (0.540) |
P value for H0: no effect of education | 0.913 | 0.127 | 0.363 | 3.21 × 10⁻⁴ | 5.06 × 10⁻⁴ | 0.016 | 4.91 × 10⁻⁴ | 0.057 |
Observations | 252,926 | 249,743 | 253,155 | 203,048 | 203,048 | 203,048 | 253,377 | 253,377 |
Dep. Var. mean among compliers | 28.470 | 32.340 | 0.881 | 2.870 | 3.773 | 413.700 | 83.530 | 135.400 |
The 2SLS estimates are shown. Above threshold is an indicator of whether the outcome is greater than its threshold. Edu16 is an indicator for staying in school until age 16 and is instrumented by an indicator for being born after September 1, 1957. The P value for H0: no effect of education is the P value from a joint test that β1 = β2 = 0. The last row shows means of the dependent variable (Dep. Var.) among prereform compliers, defined as individuals born before September 1, 1957, who dropped out before age 16. ***P < 0.01, **P < 0.05, *P < 0.1.
When analyzing the outcomes separately in Table 2, we reach the same conclusions drawn from the analysis of the indices in Table 1. For the binary measures of the outcomes that compose the body size and the lung function indices, the health improvements are larger for individuals with higher BMI PGSs. Among the body-size measures, the interaction coefficient for BMI is not as significant as the results for the other two outcomes. This illustrates the power gained by using indices: Had we analyzed BMI alone, we would have ignored the rich information available in the body-fat percentage and in the waist–hip ratio outcomes. For the continuous measures of the outcomes that compose the lung function index, the health improvements are also larger for individuals with higher BMI PGSs (for these measures, a higher value corresponds to better health).
To maximize statistical power, the thresholds used to construct the binary measures were chosen as the values where we previously estimated the largest distributional effects of education on health because individuals in this part of the distribution are expected to be most responsive to the policy. These thresholds do not necessarily correspond to clinical cutoffs used for medical diagnosis. For diastolic blood pressure, for example, the threshold is 78.6 for women and 82.6 for men, which is within 3 points of the clinical cutoff used to diagnose stage 1 hypertension (80 mm Hg). For BMI, the threshold is 29.7 for women and 30.1 for men, which is even closer to the clinical cutoff of 30 used to diagnose obesity. For completeness, Table 3 shows results for the following clinical cutoffs: a BMI > 30 (obesity), a FEV1–FVC ratio < 0.7 (chronic obstructive pulmonary disease; COPD), a diastolic blood pressure >80 or a systolic blood pressure >130 (stage 1 hypertension), and a diastolic blood pressure >90 or a systolic blood pressure >140 (stage 2 hypertension).
Table 3.
Clinical cutoffs | ||||
| Obesity: BMI ≥ 30 | COPD: FEV1/FVC ≤ 0.7 | Hypertension, stages 1 and 2: Diastolic ≥ 80 or systolic ≥ 130 | Hypertension, stage 2: Diastolic ≥ 90 or systolic ≥ 140 |
BMI PGS × Edu16 | −0.024* (0.013) | 0.004 (0.013) | 0.028* (0.017) | 0.014 (0.017) |
Edu16 | −0.055 (0.042) | −0.056 (0.038) | 0.060 (0.047) | 0.036 (0.049) |
BMI PGS | 0.136*** (0.012) | −0.009 (0.012) | 0.010 (0.015) | 0.017 (0.016) |
P value for H0: no effect of education | 0.009 | 0.262 | 0.005 | 0.240 |
Observations | 252,926 | 203,048 | 253,377 | 253,377 |
Dep. Var. mean among compliers | 0.315 | 0.144 | 0.717 | 0.422 |
The 2SLS estimates are shown. Edu16 is an indicator for staying in school until age 16 and is instrumented by an indicator for being born after September 1, 1957. The P value for H0: no effect of education is the P value from a joint test that β1 = β2 = 0. The last row shows means of the dependent variable (Dep. Var.) among prereform compliers, defined as individuals born before September 1, 1957, who dropped out before age 16. ***P < 0.01, **P < 0.05, *P < 0.1.
The results for obesity and hypertension are consistent with the results shown in Tables 1 and 2. For example, the additional year of education reduces obesity among those with a BMI PGS one SD above the mean by ∼8 percentage points, while the additional year of education reduces obesity among those with an average BMI PGS by 5.5 percentage points (31.5% of compliers were obese). The P value of the interaction term is 0.073. While this estimate is weaker than the one we found for the binary body size index, recall that obesity as measured solely by BMI ignores information based on body-fat percentage and the waist–hip ratio outcomes, resulting in a lower-powered analysis (Table 2).
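The arithmetic behind these figures follows from the linear-interaction specification. Writing β1 for the coefficient on BMI PGS × Edu16 and β2 for the coefficient on Edu16, the effect of staying in school at a PGS value of s SDs from the mean is β2 + β1·s. With the obesity estimates in Table 3:

$$\begin{aligned}
s = 0:&\quad \hat\beta_2 = -0.055 \quad (5.5 \text{ percentage points}) \\
s = 1:&\quad \hat\beta_2 + \hat\beta_1 \times 1 = -0.055 + (-0.024) = -0.079 \quad (\approx 8 \text{ percentage points})
\end{aligned}$$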
We find, however, no effect of education on COPD and no evidence that such effect varies with one’s BMI PGS. Even though staying in school until age 16 led to increases in FEV1 and FVC (Table 2), COPD was not affected because FEV1 and FVC increased by the same proportion. Despite no effects on COPD, the larger increases in FEV1 and in FVC for those with higher BMI PGSs are consistent with larger improvements in lung function for those with higher genetic risk of obesity.
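As a stylized illustration of why equal proportional gains leave the COPD criterion unchanged, suppose both measures rise by the same factor 1 + g (the symbol g is introduced here only for this example):

$$\frac{(1+g)\,FEV1}{(1+g)\,FVC} = \frac{FEV1}{FVC}$$

The ratio, and hence classification against the 0.7 cutoff, is unaffected even though both FEV1 and FVC improve.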
Our results challenge the notion of genetic determinism (45); yet the question of why we observed larger health improvements for those with higher genetic predisposition to obesity remains. Broadly speaking, the channels through which education is thought to affect health can be divided into two general categories: changes in material resources and changes in health behaviors. Education increases income, giving the more educated access to material resources, more/higher-quality health care, and a healthier diet. Changes in health behaviors may come about for a host of reasons. For example, education may lead individuals to value the future more and provide them with more knowledge, better critical-thinking skills, and the ability to process information.
In previous work (23), we found evidence that the additional year of education increased income and led to healthier diets. Given the results in this paper, it is natural to ask whether these changes were larger among those with higher predisposition to obesity. We find no evidence that this was the case when using UKB data on diet, physical activity, and income (SI Appendix, section K), but we stress that these data have several important limitations. For example, the measures of diet and income are self-reported, and physical activity measures are only available for a subset of the sample. Moreover, since UKB participants have higher socioeconomic status and are healthier than the general population (46), measures of diet, physical activity, and income might have less variation in this sample. These limitations decrease power to find significant interactions.
Overall, this work highlights the importance of maintaining statistical power when conducting GxE research. By combining some of the most powerful PGSs available with a large natural experiment and samples of unprecedented size, we were able to identify a robust interaction of genes and education on health. In view of these results, it may be tempting to adopt a cynical outlook on GxE research: Indeed, finding impactful, exogenous variation in environment for a large, genotyped sample is somewhat rare. Nevertheless, we are optimistic about this research agenda for several reasons. For example, other research designs, such as randomized controlled trials, may be better powered and produce more precise estimates than the RDD used here (47). Relatedly, a higher treatment compliance rate would also increase statistical power. As GWAS samples increase, the predictive power of PGSs will also increase, and new PGSs with reasonable power will become available for a variety of health and behavioral phenotypes. This will allow for a better match between outcomes and PGSs; in our case, it might have been helpful to have sufficiently predictive PGSs for blood pressure and lung function. As a result, as long as researchers are attentive to the statistical power of their studies, we anticipate that this will be a fruitful line of research in the future.
Our work has implications for the literature on social determinants of health, which argues that interventions that increase education, income, or socioeconomic status can improve health (48, 49). Our findings show that the effects of education on health were not uniform across genetic backgrounds, benefitting those with greater genetic risk for obesity more. In other words, education not only affected health, corroborating the social determinants hypothesis, but it also reduced the role played by genetic factors: The association between genetic predisposition to obesity and unhealthy body size was reduced among cohorts who were forced to stay in school longer. Future work in this area may want to include considerations about how the effects of social determinants on health vary across individuals and the potential role of social determinants in moderating the relationship between genetic makeup and health.
Investigating the generalizability of our results will be an important next step. While our estimates have internal validity, they only offer evidence on the causal effects of an additional year of compulsory schooling at age 15 in a specific national and historical context. Historical context and the phenotype being studied have been shown to matter when estimating GxE interactions in smoking behavior (50, 51). Furthermore, other policies may be more effective than changes in the compulsory schooling age when it comes to reducing middle-age obesity rates. As a result, following up on the analyses above with different PGSs, phenotypes, and in different policy contexts will inform whether the findings presented here generalize, increasing our understanding of the role that social policy can have in mitigating possible health differences arising from genetic background.
Acknowledgments
We thank Dan Benjamin, David Cesarini, Benjamin Neale, Arthur Stone, Hans van Kippersluis, and Lauren Schmitz for discussions; seminar participants at the 2018 meeting of the Population Association of America and the National Bureau of Economic Research Summer Institute Aging Workshop for feedback; and Rosie Li, Sean Lee, and Peter Bowers for excellent research assistance. Research reported in this publication was supported by the National Institute on Aging of the National Institutes of Health under Awards K01AG050811 (to S.H.B.), RF1AG055654 (to L.S.C.), and 3P30AG024962-13S1 (to L.S.C.). P.T. thankfully acknowledges funding from the National Institute of Mental Health (Grant 1R01MH107649-03), the National Institute on Aging (Grant R01-AG042568), the Ragnar Söderbergs stiftelse (Ragnar Söderberg Foundation Grant E42/15), the Open Philanthropy Project (Grant 2016-152872), and the Pershing Square Fund for Research on the Foundations of Human Behavior. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or any of the funders. This research has been conducted using the UK Biobank Resource under Application 15666.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The code necessary for replicating the results is publicly available at https://osf.io/9dyfz/.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1802909115/-/DCSupplemental.
References
- 1. Freese J, Shostak S. Genetics and social inquiry. Annu Rev Sociol. 2009;35:107–128.
- 2. Cutler DM, Lleras-Muney A. Education and health: Evaluating theories and evidence. In: House J, Schoeni R, Kaplan G, Pollack H, editors. Making Americans Healthier: Social and Economic Policy as Health Policy. Russell Sage Foundation; New York: 2008.
- 3. Case A, Fertig A, Paxson C. The lasting impact of childhood health and circumstance. J Health Econ. 2005;24:365–389. doi: 10.1016/j.jhealeco.2004.09.008.
- 4. Bulik-Sullivan BK, et al. Schizophrenia Working Group of the Psychiatric Genomics Consortium. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47:291–295. doi: 10.1038/ng.3211.
- 5. Almond D, Currie J, Duque V. Childhood Circumstances and Adult Outcomes: Act II. National Bureau of Economic Research; Cambridge, MA: 2017.
- 6. Amin V, et al. Gene-environment interactions between education and body mass: Evidence from the UK and Finland. Soc Sci Med. 2017;195:12–16. doi: 10.1016/j.socscimed.2017.10.027.
- 7. Galama TJ, Lleras-Muney A, van Kippersluis H. The Effect of Education on Health and Mortality: A Review of Experimental and Quasi-Experimental Evidence. National Bureau of Economic Research; Cambridge, MA: 2018.
- 8. Caspi A, et al. Role of genotype in the cycle of violence in maltreated children. Science. 2002;297:851–854. doi: 10.1126/science.1072290.
- 9. Caspi A, et al. Influence of life stress on depression: Moderation by a polymorphism in the 5-HTT gene. Science. 2003;301:386–389. doi: 10.1126/science.1083968.
- 10. Shanahan MJ, Vaisey S, Erickson LD, Smolen A. Environmental contingencies and genetic propensities: Social capital, educational continuation, and dopamine receptor gene DRD2. AJS. 2008;114(Suppl 1):S260–S286. doi: 10.1086/592204.
- 11. Duncan LE, Keller MC. A critical review of the first 10 years of candidate gene-by-environment interaction research in psychiatry. Am J Psychiatry. 2011;168:1041–1049. doi: 10.1176/appi.ajp.2011.11020191.
- 12. Hewitt JK. Editorial policy on candidate gene association and candidate gene-by-environment interaction studies of complex traits. Behav Genet. 2012;42:1–2. doi: 10.1007/s10519-011-9504-z.
- 13. Munafò MR, Durrant C, Lewis G, Flint J. Gene X environment interactions at the serotonin transporter locus. Biol Psychiatry. 2009;65:211–219. doi: 10.1016/j.biopsych.2008.06.009.
- 14. Chabris CF, Lee JJ, Cesarini D, Benjamin DJ, Laibson DI. The fourth law of behavior genetics. Curr Dir Psychol Sci. 2015;24:304–312. doi: 10.1177/0963721415580430.
- 15. Schmitz LL, Conley D. The effect of Vietnam-era conscription and genetic potential for educational attainment on schooling outcomes. Econ Educ Rev. 2017;61:85–97. doi: 10.1016/j.econedurev.2017.10.001.
- 16. Barth D, Papageorge N, Thom K. Genetic Ability, Wealth and Financial Decision-Making. National Bureau of Economic Research; Cambridge, MA: 2017.
- 17. Boardman JD, et al. Gene-environment interactions related to body mass: School policies and social context as environmental moderators. J Theor Polit. 2012;24:370–388. doi: 10.1177/0951629812437751.
- 18. Schmitz L, Conley D. Modeling gene‐environment interactions with quasi‐natural experiments. J Pers. 2017;85:10–21. doi: 10.1111/jopy.12227.
- 19. D’Onofrio BM, Lahey BB, Turkheimer E, Lichtenstein P. Critical need for family-based, quasi-experimental designs in integrating genetic and social science research. Am J Public Health. 2013;103(Suppl 1):S46–S55. doi: 10.2105/AJPH.2013.301252.
- 20. Lee DS, Lemieux T. Regression discontinuity designs in economics. J Econ Lit. 2010;48:281–355.
- 21. Clark D, Royer H. The effect of education on adult mortality and health: Evidence from Britain. Am Econ Rev. 2013;103:2087–2120. doi: 10.1257/aer.103.6.2087.
- 22. Turley P. 2017. Genetic heterogeneity in treatment effects of the 1972 ROSLA. Available at osf.io/9dyfz. Accessed September 4, 2018.
- 23. Barcellos SH, Carvalho LS, Turley P. 2017. Distributional effects of education on health. Center for Economic and Social Research–Schaeffer working paper 2018-002.
- 24. Anderson ML. Multiple inference and gender differences in the effects of early intervention: A reevaluation of the Abecedarian, Perry Preschool, and Early Training Projects. J Am Stat Assoc. 2008;103:1481–1495.
- 25. Locke AEA, et al. LifeLines Cohort Study; ADIPOGen Consortium; AGEN-BMI Working Group; CARDIOGRAMplusC4D Consortium; CKDGen Consortium; GLGC; ICBP; MAGIC Investigators; MuTHER Consortium; MIGen Consortium; PAGE Consortium; ReproGen Consortium; GENIE Consortium; International Endogene Consortium. Genetic studies of body mass index yield new insights for obesity biology. Nature. 2015;518:197–206. doi: 10.1038/nature14177.
- 26. Okbay A, et al. LifeLines Cohort Study. Genome-wide association study identifies 74 loci associated with educational attainment. Nature. 2016;533:539–542. doi: 10.1038/nature17671.
- 27. Turley P, et al. 23andMe Research Team; Social Science Genetic Association Consortium. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat Genet. 2018;50:229–237. doi: 10.1038/s41588-017-0009-4.
- 28. Vilhjálmsson BJ, et al. Schizophrenia Working Group of the Psychiatric Genomics Consortium, Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) study. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am J Hum Genet. 2015;97:576–592. doi: 10.1016/j.ajhg.2015.09.001.
- 29. Liu H, Guo G. Lifetime socioeconomic status, historical context, and genetic inheritance in shaping body mass in middle and late adulthood. Am Sociol Rev. 2015;80:705–737. doi: 10.1177/0003122415590627.
- 30. Gottesman II, Shields J. A polygenic theory of schizophrenia. Int J Ment Health. 1972;1:107–115.
- 31. Zubin J, Spring B. Vulnerability–A new view of schizophrenia. J Abnorm Psychol. 1977;86:103–126. doi: 10.1037//0021-843x.86.2.103.
- 32. Qi Q, et al. Sugar-sweetened beverages and genetic risk of obesity. N Engl J Med. 2012;367:1387–1396. doi: 10.1056/NEJMoa1203039.
- 33. Ahmad S, et al. InterAct Consortium; DIRECT Consortium. Gene × physical activity interactions in obesity: Combined analysis of 111,421 individuals of European ancestry. PLoS Genet. 2013;9:e1003607. doi: 10.1371/journal.pgen.1003607.
- 34. Demerath EW, et al. The positive association of obesity variants with adulthood adiposity strengthens over an 80-year period: A gene-by-birth year interaction. Hum Hered. 2013;75:175–185. doi: 10.1159/000351742.
- 35. Walter S, Mejía-Guevara I, Estrada K, Liu SY, Glymour MM. Association of a genetic risk score with body mass index across different birth cohorts. JAMA. 2016;316:63–69. doi: 10.1001/jama.2016.8729.
- 36. Belsky J, Pluess M. Beyond diathesis stress: Differential susceptibility to environmental influences. Psychol Bull. 2009;135:885–908. doi: 10.1037/a0017376.
- 37. Kong A, et al. The nature of nurture: Effects of parental genotypes. Science. 2018;359:424–428. doi: 10.1126/science.aan6877.
- 38. Bulik-Sullivan B, et al. ReproGen Consortium; Psychiatric Genomics Consortium; Genetic Consortium for Anorexia Nervosa of the Wellcome Trust Case Control Consortium 3. An atlas of genetic correlations across human diseases and traits. Nat Genet. 2015;47:1236–1241. doi: 10.1038/ng.3406.
- 39. Leone N, et al. Lung function impairment and metabolic syndrome: The critical role of abdominal obesity. Am J Respir Crit Care Med. 2009;179:509–516. doi: 10.1164/rccm.200807-1195OC.
- 40. Lucas CP, Estigarribia JA, Darga LL, Reaven GM. Insulin and blood pressure in obesity. Hypertension. 1985;7:702–706. doi: 10.1161/01.hyp.7.5.702.
- 41. Salome CM, King GG, Berend N. Physiology of obesity and effects on lung function. J Appl Physiol (1985). 2010;108:206–211. doi: 10.1152/japplphysiol.00694.2009.
- 42. Davies NM, Dickson M, Smith GD, Van Den Berg GJ, Windmeijer F. The causal effects of education on health outcomes in the UK Biobank. Nat Hum Behav. 2018;2:117–125. doi: 10.1038/s41562-017-0279-y.
- 43. Price AL, et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38:904–909. doi: 10.1038/ng1847.
- 44. Keller MC. Gene × environment interaction studies have not properly controlled for potential confounders: The problem and the (simple) solution. Biol Psychiatry. 2014;75:18–24. doi: 10.1016/j.biopsych.2013.09.006.
- 45. Jencks C. Heredity, environment, and public policy reconsidered. Am Sociol Rev. 1980;45:723–736.
- 46. Fry A, et al. Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am J Epidemiol. 2017;186:1026–1034. doi: 10.1093/aje/kwx246.
- 47. Schochet PZ. Statistical power for regression discontinuity designs in education evaluations. J Educ Behav Stat. 2009;34:238–266.
- 48. Marmot M, Friel S, Bell R, Houweling TA, Taylor S. Commission on Social Determinants of Health. Closing the gap in a generation: Health equity through action on the social determinants of health. Lancet. 2008;372:1661–1669. doi: 10.1016/S0140-6736(08)61690-6.
- 49. Marmot M, Wilkinson R, editors. Social Determinants of Health. 2nd Ed. Oxford Univ Press; Oxford: 2006.
- 50. Boardman JD. State-level moderation of genetic tendencies to smoke. Am J Public Health. 2009;99:480–486. doi: 10.2105/AJPH.2008.134932.
- 51. Boardman JD, et al. Population composition, public policy, and the genetics of smoking. Demography. 2011;48:1517–1533. doi: 10.1007/s13524-011-0057-9.