Abstract
This paper examines the relationship between health endowment and later-life outcomes in the labour market. The analysis is based on reduced-form models in which labour market outcomes are regressed on genetic variants related to the increased risk of cardiovascular diseases. We use linked Finnish data that have many strengths. Genetic risk scores constitute exogenous measures for health endowment, and accurate administrative tax records on earnings, employment and social income transfers provide a comprehensive account of an individual’s long-term performance in the labour market. The results show that although the direction of an effect is generally consistent with theoretical reasoning, the effects of health endowment on outcomes are statistically weak, and the hypothesis of no effect can be rejected only in one case: genetic endowment related to obesity influences male earnings and employment in prime age. Due to the sample size (N = 1651), the results should be interpreted with caution and should be confirmed in larger samples and in other institutional settings.
Keywords: Genetics, Health endowment, Earnings, Employment, Social income transfers, Reduced-form regression
Highlights
- Labour market outcomes are regressed on genetic risk scores.
- Genetic effects on outcomes are generally statistically weak.
- Endowment related to obesity influences male earnings and employment in prime age.
1. Introduction
Empirical research shows statistically significant associations between individuals’ health and labour market outcomes (Currie, 2009, Conti and Heckman, 2010, Jäckle and Himmler, 2010, Cawley, 2015). However, the findings may not be robust or straightforward to interpret. For example, low income may result in poor health, and health problems can result in lower earnings. Likewise, the observed associations may reflect omitted confounders. For example, cognitive and non-cognitive endowments developed at an early age may be important determinants of both adult health and labour market performance. Furthermore, the underlying mechanisms behind the observed relationships are not easily identified. In addition to poor health, discrimination on the basis of physical appearance (Rooth, 2009, Puhl and Heuer, 2009, Pomeranz and Puhl, 2013) or lower cognitive ability and educational disadvantages in childhood (Hack et al., 2002) may yield worse labour market performance in later life.
Recent empirical research has exploited data on genetic variants across individuals as instruments (the so-called Mendelian randomization, MR) to improve the identification and validity of the estimates. For example, Böckerman et al. (2019), Norton and Han (2008) and Tyrrell, Jones, Beaumont, and Freyling (2016) use genetic instruments to estimate the effect of weight on labour market outcomes and education. Ding, Lehrer, Rosenquist, and Audrain-McGovern (2009) examine the impact of poor health on academic performance, and Von Hinke, Smith, Lawlor, Propper, and Windmeijer (2013) investigate the effect of height on depression symptoms and behavioural problems.
The instrumental variable (IV) assumptions regarding exclusion restrictions for genetic variables are not always examined with sufficient attention. For example, genetic variants measured by either single nucleotide polymorphisms (SNPs) or genetic risk scores (GRSs) may affect outcomes via pathways other than the exposures, i.e., genes have pleiotropic effects (Von Hinke et al., 2016, Greco et al., 2015). Similarly, the effects may depend not only on the value of the exposure but also on the environment; i.e., there are significant gene-environment interactions (Lundborg and Stenberg, 2010, Conley, 2016, Pehkonen et al., 2017). Analysis can also suffer from measurement error and reverse causality if the exposures are incomplete characterizations of all aspects, are time-varying or are collected after measurement of the outcome (Cawley, Han, & Norton, 2011). This complexity implies that caution should be taken when using genetic information as an instrument in instrumental variable research designs (Lawlor et al., 2008, VanderWeele et al., 2016).
In this study, we analyse the relationships between genetic health endowments and later-life outcomes in the labour market. To accomplish this, we apply a reduced-form (RF) approach by regressing outcome variables directly on genetic markers associated with biomarkers that elevate the risk of cardiovascular diseases. As VanderWeele et al. (2016) show, the method achieves robustness gains against several possible biases. In particular, because no data on the exposure are needed, the approach eliminates the sources of bias related to measurement error in exposures, reverse causality and gene-environment interactions. The drawback of the approach is that it only provides a test for the presence of an effect of the exposure on the outcome and does not provide causal estimates of the relationship. However, in many cases, information on the existence and direction of an effect may be sufficient. For example, a test result showing no association between a genetic variant and the outcome, combined with evidence that the genetic variant affects the exposure of interest, suggests that there is no effect of the exposure on the outcome among compliers, i.e., among those whose exposure is raised by the genetic variation.
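The logic of the test can be sketched with a stylized linear system. The notation is introduced here for illustration only and is not taken from the paper: G is the genetic risk score, X the exposure, Y the labour market outcome, and π, β and γ are coefficients. The sketch assumes constant effects and that the exclusion restriction holds, so that G affects Y only through X.

```latex
\begin{align}
X &= \pi G + v && \text{(first stage)} \\
Y &= \beta X + u && \text{(structural equation)} \\
Y &= \underbrace{\beta\pi}_{\gamma}\, G + (\beta v + u) && \text{(reduced form)}
\end{align}
```

If π ≠ 0, then γ = βπ equals zero exactly when β equals zero, so a test of γ = 0 in the reduced form is a test for the presence of an effect of X on Y that requires no data on X. Recovering the magnitude of β would require an estimate of π (β = γ/π), which is what the reduced-form approach deliberately avoids.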
In this study, we measure individuals’ performance in the labour market during the prime working age in terms of earnings, employment months, and social security benefits. The data are drawn from high-quality administrative records (Statistics Finland, SF), and the variables are measured as averages over the 2001–2012 period, thus minimizing the possible influence of idiosyncratic variation from year to year. The labour market outcomes are regressed on genetic health endowments related to established cardiovascular risk factors, including obesity (BMI and waist-hip ratio), increased low-density lipoprotein cholesterol, decreased high-density lipoprotein cholesterol, and increased triglycerides (Danaei et al., 2009, Teslovich et al., 2010, Zhu et al., 2018). GRSs are drawn from genome-wide association studies (GWAS). We apply a data-driven approach: we use risk scores in the data of the Cardiovascular Risk in Young Finns Study (YFS) that can be related to enhanced cardiovascular risk. As such, our study supplements that by Böckerman et al. (2019), which quantifies the effect of one health endowment (obesity) on later-life outcomes in the labour market using the IV approach in the context of the YFS data. Fig. 1 graphically illustrates our empirical approach, in terms of estimators, of exploring the effects of health endowments on later-life outcomes in the labour market.
Fig. 1.
Directed acyclic graph for exploring the effects of enhanced risk of cardiovascular diseases on later-life outcomes in the labour market. Comparison of OLS, IV and RF-estimators.
The results of the reduced-form regressions show that the effects of genetic health endowments on later-life labour market outcomes are generally correctly signed but statistically weak, and, if they exist, they are related to genetic endowments of obesity and apply for men. Furthermore, an increase in the number of SNPs in the risk score is associated with a higher proportion of non-significant estimates for risk scores. The finding may reflect pleiotropic effects of SNPs with a tendency that genetic variants have associations with various traits that oppose the labour market effects. We conclude the paper with a discussion on shortcomings and possible extensions for future research.
2. Data
The linked data used come from three sources. The main data source is the YFS, which collects information on individuals through questionnaires, physical measurements, and blood tests. Eight waves of data have been collected at 3–9-year intervals, starting with the baseline in 1980 and most recently in 2011-12. In 1980, a total of 3,596 persons participated in the study, and all anthropometric measurements were conducted by medical professionals at local health centres (Raitakari, Juonala, Rönnemaa, & Jula, 2008).
The second data source is the Finnish Longitudinal Employer-Employee Data (FLEED) of SF. It is a source of information on labour market outcomes, i.e., employment status, wage compensation and social transfers. SF maintains the FLEED data, which come directly from tax authorities and other administrative registers. The third dataset, the Longitudinal Population Census (LPC), is the source of information on parental education and income.
The linkage of both FLEED and LPC to the YFS data is performed using personal identifiers that are available for both parents and their children. As a result, our study avoids the shortcomings created by errors in record linkages. Such register-based data have less measurement error than self-reported data from surveys. For example, the income data do not suffer from underreporting or recall error, nor are they top coded. This accuracy increases the efficiency of the estimates, which is particularly important in relatively small samples.
2.1. Measures of genetic health endowments
Recent research in health genetics has identified genetic variants that are robustly associated with health traits (Belsky, Moffitt, & Caspi, 2013). In this study, we measured health endowment by GRSs that are associated with a risk of cardiovascular diseases (Teslovich et al., 2010). GRSs consist of SNPs that have been found to be significantly (at least p < 1.0 × 10⁻⁶) associated with traits of obesity (BMI), body fat distribution measured by waist-hip ratio (WHR), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C).
We use GRSs instead of individual SNPs for two reasons. First, GRSs account for more variation in health traits, which increases the statistical power of estimation. Second, they reduce the risk that any individual SNP will bias estimates via an alternative biological pathway. Palmer et al. (2012) provide evidence that supports the use of genetic scores over indicator variables for individual SNPs, and Böckerman et al. (2019) document the benefits of the unweighted risk scores as opposed to the weighted scores in the YFS data.
The measures of genetic health endowment for YFS subjects were drawn in 2009. Genotyping was implemented using the Illumina Bead Chip (Human 670 K), and the genotypes were called using the Illumina clustering algorithm (Teo et al., 2007). SHAPEIT v1 and IMPUTE2 software (Delaneau, Marchini, & Zagury, 2012) were used for genotype imputation with the 1000 Genomes Phase I Integrated Release Version 3 (March 2012 haplotypes) as a reference panel (Howie et al., 2009, 1000 Genomes Project Consortium, 2010).
We follow a data-driven approach and use risk scores that are in our linked data and can be associated with an enhanced risk of cardiovascular diseases. We use unweighted risk scores, calculated as the sum of the genotyped risk alleles or imputed allele dosages carried by an individual, i.e., the number of alleles across the score’s SNPs that put an individual at elevated risk. Because each SNP contributes zero, one or two risk alleles, the GRS for a risk score consisting, for example, of 32 individual SNPs is a count from 0 to 64.
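The construction of an unweighted score can be sketched in a few lines. This is a minimal illustration rather than the YFS pipeline: the function name `unweighted_grs` and the toy dosages are hypothetical, and the only assumption is that each SNP is coded as the number of risk alleles carried (0, 1 or 2, or a fractional dosage when imputed).

```python
def unweighted_grs(dosages):
    """Return the sum of risk-allele dosages for one individual.

    dosages: one number per SNP in the score, each between 0 and 2
    (0, 1 or 2 for genotyped SNPs; fractional values for imputed SNPs).
    """
    total = 0.0
    for d in dosages:
        if not 0 <= d <= 2:
            raise ValueError(f"dosage {d} is outside the range 0 to 2")
        total += d
    return total


# Hypothetical individuals scored on a toy 4-SNP score (possible range 0 to 8).
genotyped = [0, 1, 2, 1]          # integer risk-allele counts
imputed = [0.0, 0.5, 1.5, 1.0]    # imputed allele dosages
print(unweighted_grs(genotyped), unweighted_grs(imputed))
```

With n SNPs the score is bounded by 0 and 2n, so a 32-SNP score cannot exceed 64. A weighted score would instead multiply each dosage by an effect-size weight before summing.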
The YFS data provide alternative risk measures for BMI, WHR, LDL-C and TG. This is an advantage because GRSs with different numbers of SNPs may involve trade-offs between predictive power and bias related to pleiotropic effects. In our case, SNPs may affect later-life labour market outcomes, for example, through cognitive or non-cognitive traits and not only through health endowments.
The risk scores for BMI are based on 32 or 97 SNPs reported in Speliotes, Willer, Berndt, and Loos (2010) and Locke et al. (2015), respectively. For WHR, we use the risk scores consisting of 14 or 16 SNPs identified in Heid et al. (2010). For HDL-C, we use the risk score of 38 SNPs identified in Teslovich et al. (2010). The risk score for LDL-C is based on 14 SNPs reported in Teslovich et al. (2010) or 58 SNPs reported in Hernesniemi et al. (2015). Similarly, the risk scores for TG (25 or 41 SNPs) are based on the work of Teslovich et al. (2010) and Hernesniemi et al. (2015).
We calculated correlations between risk scores and biomarker measures for BMI, WHR, HDL-C, LDL-C and TG in the YFS data for 2011. The findings were consistent with the GWAS evidence; i.e., the associations were statistically significant (p < 0.01) for all biomarkers excluding WHR. The strongest association was between LDL-C and LDL58 (r = 0.243, p < 0.01), and the weakest was between WHR and WHR14 (r = 0.029, p = 0.305).
2.2. Labour market outcome measures
The FLEED of SF is the source of information on employment status, earnings and social transfers. We describe the participant’s labour market outcomes from two complementary perspectives. First, high (low) values for annual earnings and employment months are typical indicators of strong (weak) labour market attachment and markers of positive (negative) externalities related to health endowments (Conti & Heckman, 2010). Second, we augmented this information by social income transfers received (consisting of, e.g., unemployment, housing and disability benefits) that are the main indicators of poor labour market success and markers of negative externalities related to health endowments (Cawley, 2015).
We use three specific measures: the logarithm of the participant’s average annual wage and salary earnings, the share of years employed and the logarithm of the average of the participant’s annual social income transfers received. The outcomes are calculated as averages of the values over the 2001–2012 period. Thus, they capture persistent differences in an individual’s labour market behaviour in the prime working age. In 2012, the YFS participants were between 35 and 50 years old.
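The three averaged outcomes can be illustrated for a single participant as follows. This is a sketch under stated assumptions, not the authors' code: the function name, the yearly toy values and the representation of employment as one true/false flag per year are hypothetical, and the paper does not describe how zero averages are handled before taking logs.

```python
import math


def average_outcomes(earnings, employed, transfers):
    """Collapse yearly records for one person into the three averaged outcomes.

    earnings, transfers: yearly amounts in euros; employed: yearly flags.
    The log is taken of the average, not averaged over yearly logs, so single
    years with zero earnings or zero transfers do not break the calculation.
    math.log raises ValueError if an average itself is zero.
    """
    if not earnings or not (len(earnings) == len(employed) == len(transfers)):
        raise ValueError("need equally long, non-empty yearly series")
    years = len(earnings)
    return {
        "log_avg_earnings": math.log(sum(earnings) / years),
        "share_years_employed": sum(bool(e) for e in employed) / years,
        "log_avg_transfers": math.log(sum(transfers) / years),
    }


# Hypothetical four-year history with one year out of work.
result = average_outcomes(
    earnings=[20000, 22000, 0, 24000],
    employed=[True, True, False, True],
    transfers=[0, 0, 6000, 0],
)
print(result)
```

Averaging before taking the log is what lets the measures capture persistent differences: one atypical year shifts the average only modestly.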
2.3. Covariates and sample considerations
The baseline model controls for the participant’s age and gender, which are predetermined variables. In addition, we augment the baseline model using measures of parental environment (family income and education). This serves two purposes. First, the inclusion may control for biases arising from a correlation of genetic variants of children and their parents and the possibility that parents’ genes influence their own environments and thus the parental socioeconomic status (Lundborg & Stenberg, 2010). Second, additional predetermined covariates reduce the residual variation in the outcome variable, improving the statistical precision of the estimates (Von Hinke, Smith, Lawlor, Propper, & Windmeijer, 2011). This is useful in a relatively small sample.
The main characteristics of the linked data are summarized in Table 1. Two issues are worth noting. First, although the number of YFS participants in our estimation sample (N = 1638–1675) is smaller than the original sample size (N = 3,596), the samples are representative of the baseline (Böckerman et al., 2019). Second, although the sample is relatively small (N = 1675 or less), there is an important advantage over the prior literature: labour market data are not self-reported but come from comprehensive national registers that avoid reporting errors and are not top coded.
Table 1.
Summary statistics.
| Variable | Mean (SD) | N |
|---|---|---|
| Labour market outcomes, averages from 2001–2012 | ||
| Log of average annual earnings | 9.875 (0.886) | 1651 |
| *Log of earnings in 2012 | 9.480 (2.584) | 1638 |
| Share of years employed | 0.858 (0.243) | 1675 |
| *Indicator for being employed in 2012 | 0.880 (0.326) | 1660 |
| Log of average annual social income transfers | 5.665 (2.931) | 1651 |
| *Log of social income transfers in 2012 | 2.449 (3.768) | 1638 |
| Genetic health endowments, risk scores | ||
| BMI 32 SNPs | 29.14 (3.337) | 1651 |
| BMI 97 SNPs* | 2.315 (0.161) | 1651 |
| WHR 14 SNPs | 15.178 (2.367) | 1651 |
| WHR 16 SNPs | 16.250 (2.495) | 1651 |
| TG 25 SNPs | 26.10 (2.886) | 1651 |
| TG 41 SNPs | 0.986 (0.095) | 1651 |
| LDL-C 14 SNPs | 14.466 (2.184) | 1651 |
| LDL-C 58 SNPs | 0.959 (0.078) | 1651 |
| HDL-C 38 SNPs | 44.646 (3.714) | 1651 |
| Parental environment | ||
| University education (1980), mother | 0.073 (0.261) | 1651 |
| University education (1980), father | 0.110 (0.313) | 1651 |
| Income (1980), mother (euros) | 4642.3 (3433.7) | 1651 |
| Income (1980), father (euros) | 8684.4 (5633.1) | 1651 |
| Background information on participants | ||
| Age (2001) | 31.570 (4.914) | 1651 |
| Female (2001) | 0.545 (0.498) | 1651 |
| Married (2001) | 0.451 (0.498) | 1651 |
| BMI (2011) | 26.464 (4.839) | 1185 |
| WHR (2011) | 0.898 (0.088) | 1185 |
| TG (2011) | 1.205 (0.623) | 1185 |
| LDL-C (2011) | 3.247 (0.823) | 1185 |
| HDL-C (2011) | 1.333 (0.322) | 1185 |
Notes: Descriptive statistics are reported for the samples used in the estimations. * = weighted.
3. Results
3.1. Evidence from the reduced-form models
Tables 2 and 3 present the results of regressing labour market outcomes on risk scores. We acknowledge that the relatively small sample size of the YFS limits the statistical power of the analysis. For this reason, the models are first estimated for men and women pooled (Table 2), and thus represent the average effect across both genders. In Table 3, we report the results by gender.
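A reduced-form regression of this kind can be sketched as ordinary least squares with heteroscedasticity-robust standard errors. The example below is an illustration on simulated data, not a replication: all variable names and numbers are made up, and the HC1 small-sample correction is our assumption, since the paper does not state which robust variant it uses.

```python
import numpy as np


def ols_robust(y, X):
    """OLS with HC1 heteroscedasticity-robust standard errors.

    y: (n,) outcome; X: (n, k) regressors including a constant column.
    Returns (coefficients, robust standard errors).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 [sum_i e_i^2 x_i x_i'] (X'X)^-1,
    # scaled by n / (n - k) for the HC1 correction.
    meat = (X * resid[:, None] ** 2).T @ X
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    return beta, np.sqrt(np.diag(cov))


# Simulated data; the "true" coefficient of -0.012 is an arbitrary choice.
rng = np.random.default_rng(0)
n = 2000
grs = rng.binomial(2, 0.45, size=(n, 32)).sum(axis=1).astype(float)
female = rng.integers(0, 2, size=n).astype(float)
age = rng.integers(24, 40, size=n).astype(float)
log_earnings = (10.0 - 0.012 * grs - 0.3 * female + 0.01 * age
                + rng.normal(0.0, 0.9, size=n))

X = np.column_stack([np.ones(n), grs, female, age])
beta, se = ols_robust(log_earnings, X)
print(f"GRS coefficient: {beta[1]:.4f} (robust SE {se[1]:.4f})")
```

The coefficient on the risk score is the reduced-form estimate. Its sign and significance are the quantities of interest, while its magnitude is not a causal effect of the underlying health trait.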
Table 2.
Genetic health endowments and labour market outcomes: estimates of reduced-form models for the 2001–2012 period.
| Labour market outcomes, averages over 2001–2012 | (1) | (2) | (3) |
|---|---|---|---|
| Dependent variable | Log of average earnings | Share of years employed | Log of average social income transfers |
| | (N = 1651) | (N = 1675) | (N = 1651) |
| Genetic risk score | | | |
| BMI 32 | −0.012** | −0.004** | 0.042** |
| | (0.006) | (0.002) | (0.020) |
| BMI 97 | −0.001 | −0.023 | 0.380 |
| | (0.130) | (0.037) | (0.421) |
| WHR 14 | −0.011 | −0.004 | 0.042 |
| | (0.009) | (0.003) | (0.029) |
| WHR 16 | −0.011 | −0.004* | 0.045 |
| | (0.009) | (0.002) | (0.028) |
| HDL 38 | 0.001 | −0.002 | 0.012 |
| | (0.007) | (0.002) | (0.020) |
| LDL 14 | −0.023** | −0.004 | −0.037 |
| | (0.010) | (0.003) | (0.032) |
| LDL 58 | −0.157 | 0.010 | −0.997 |
| | (0.252) | (0.078) | (0.852) |
| TG 25 | 0.011 | 0.003* | −0.040* |
| | (0.007) | (0.002) | (0.024) |
| TG 41 | 0.168 | 0.123** | −0.372 |
| | (0.199) | (0.061) | (0.693) |
Notes: Earnings are measured as the log of average earnings over the period 2001–2012. Employment is measured as the average share of employment years over the period 2001–2012. Social income transfers are measured as the log of average transfers over the period 2001–2012. All models control for gender and cohort. Heteroscedasticity-robust standard errors are reported in parentheses: * statistically significant at the 0.10 level; ** at the 0.05 level; *** at the 0.01 level.
Table 3.
Genetic health endowments and labour market outcomes: estimates of reduced-form models for the 2001–2012 period, men and women separately.
| Labour market outcomes, averages over 2001–2012 | Men | | | Women | | |
|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) |
| Dependent variable | Earnings | Years employed | Social income transfers | Earnings | Years employed | Social income transfers |
| | (N = 751) | (N = 758) | (N = 751) | (N = 900) | (N = 917) | (N = 900) |
| Genetic risk score | | | | | | |
| BMI 32 | −0.016* | −0.005** | 0.074** | −0.008 | −0.003 | 0.016 |
| | (0.009) | (0.002) | (0.031) | (0.008) | (0.002) | (0.025) |
| BMI 97 | 0.031 | −0.032 | 0.643 | −0.020 | −0.017 | 0.208 |
| | (0.206) | (0.055) | (0.668) | (0.169) | (0.051) | (0.538) |
| WHR 14 | −0.020* | −0.002 | 0.066 | −0.004 | −0.005 | 0.024 |
| | (0.012) | (0.004) | (0.046) | (0.013) | (0.004) | (0.036) |
| WHR 16 | −0.023** | −0.004 | 0.093** | −0.002 | −0.005 | 0.008 |
| | (0.011) | (0.003) | (0.045) | (0.012) | (0.004) | (0.034) |
| HDL 38 | −0.010 | −0.005** | 0.013 | 0.011 | 0.001 | 0.013 |
| | (0.011) | (0.003) | (0.030) | (0.008) | (0.002) | (0.026) |
| LDL 14 | −0.019 | −0.004 | −0.080* | −0.029** | −0.005 | 0.008 |
| | (0.015) | (0.004) | (0.048) | (0.012) | (0.004) | (0.042) |
| LDL 58 | −0.106 | −0.039 | −1.870 | −0.209 | 0.045 | −0.255 |
| | (0.406) | (0.116) | (1.386) | (0.321) | (0.106) | (1.058) |
| TG 25 | 0.011 | 0.001 | −0.018 | 0.010 | 0.006** | −0.061* |
| | (0.011) | (0.003) | (0.036) | (0.009) | (0.002) | (0.031) |
| TG 41 | 0.154 | 0.050 | −0.145 | 0.151 | 0.174** | −0.455 |
| | (0.340) | (0.093) | (1.072) | (0.236) | (0.082) | (0.908) |
Notes: Earnings are measured as the log of average earnings over the period 2001–2012. Employment is measured as the average share of employment years over the period 2001–2012. Social income transfers are measured as the log of average transfers over the period 2001–2012. All models control for gender and cohort. Heteroscedasticity-robust standard errors are reported in parentheses: * statistically significant at the 0.10 level; ** at the 0.05 level; *** at the 0.01 level.
In general, the results reveal statistically non-significant or only weakly significant associations between genetic health endowments and later-life outcomes: in 19 out of 27 cases, we cannot reject the null hypothesis of no effect of genetic variant on the labour market outcome (Table 2). Encouragingly, in most cases, the estimates are in accordance with our theoretical reasoning, i.e., the direction of an effect is correct. The estimates that are against our priors are related to the effects of triglycerides (all six estimates) and the outcomes that measure social income received (four cases out of nine).
The results for risk scores of BMI and WHR show the clearest effects. We find four cases that indicate a statistically significant and correctly signed effect of genetic endowment on later-life outcome. BMI32 has a significant negative effect on earnings (p < 0.05) and years in employment (p < 0.05) and a significant positive effect on social income transfers (p < 0.05). WHR16 indicates a significant negative effect on the share of years employed (p < 0.10). However, genetic endowment is not statistically significantly linked to poor later-life performance when the risk score for BMI is based on 97 SNPs or when the risk score for WHR is based on 14 SNPs. Even so, in all cases, the direction of the effect is consistent with our priors.
The results for risk scores for lipoproteins and triglycerides (HDL38, LDL14, LDL58, TG25, and TG41) are muted. There is one correctly signed estimate and there are three incorrectly signed estimates that reach statistical significance. LDL14 indicates a negative effect on earnings (p < 0.05), as expected. An increased risk of higher triglycerides (TG25 and TG41) implies more employment years (p < 0.10 and p < 0.05, respectively), and TG25 also implies lower social income transfers (p < 0.10), in contrast to our a priori beliefs.
Table 3 presents the results separately by gender. These results are consistent with the pooled model; i.e., most of the estimates are imprecise, partly reflecting weak power related to a small sample. However, the results show notable gender differences. For men, there are six statistically significant estimates that are consistent with theoretical reasoning; see columns 1–3. BMI32 has a significant adverse effect on all three later-life outcomes: it lowers earnings (p < 0.10) and years in employment (p < 0.05) and raises social income transfers (p < 0.05). WHR14 and WHR16 indicate a significant negative effect on earnings (p < 0.10 and p < 0.05, respectively), and WHR16 also has a significant positive effect on social income transfers (p < 0.05). For women, there are no statistically significant effects of obesity or body shape on later-life outcomes; see columns 4–6. The findings of gender-specific effects of obesity are consistent with the results documented in Böckerman et al. (2019), Cawley (2004), Kline and Tobias (2008), and Johansson, Böckerman, Kiiskinen, and Heliövaara (2009). As in the pooled sample, the risk scores based on lipoproteins and triglycerides mostly show no statistically significant relationships (24 cases in total), or the effects are contrary to prior beliefs (five cases in total: HDL38 for employment and LDL14 for social income transfers among men; TG25 and TG41 for employment and TG25 for social income transfers among women).
3.2. Extensions and robustness
Our analysis suggests that (i) there is an effect of obesity (BMI and WHR) on several later-life labour market outcomes, and (ii) there are no effects of low- or high-density lipoprotein cholesterol and triglycerides on later-life outcomes or that the effects are counterintuitive. Furthermore, findings are (iii) sensitive to the measurement of genetic endowments and (iv) show differences across genders.
Table 4 extends the analysis by testing the joint significance of two GRSs (BMI32 and WHR16) on earnings, employment and social security benefits. The models are estimated separately for men (columns 1–3) and women (columns 4–6). All models control for age (Panel A). For robustness, we then add parental environment (Panel B) and all remaining GRSs used in Table 3 (Panel C). In Panel D, all controls are included in the model simultaneously. In Table 5, we repeat the analysis using only cross-sectional data from the year 2012. This model provides a tentative check for the possibility that the effects of cardiovascular risk on an individual’s performance in the labour market may materialize only at an older age.
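A joint test of this type can be sketched by comparing a restricted model (controls only) with an unrestricted model (controls plus both risk scores). The example is a minimal illustration on simulated data with hypothetical names and effect sizes. It uses the classical F-statistic based on residual sums of squares, which is our assumption; the paper does not state whether its test is the classical or a heteroscedasticity-robust version.

```python
import numpy as np


def ssr(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def joint_f_stat(y, X_controls, X_tested):
    """F-statistic for H0: all coefficients on the columns of X_tested are zero.

    X_controls must contain the constant; X_tested holds the q tested regressors.
    """
    n = len(y)
    X_full = np.column_stack([X_controls, X_tested])
    q = X_tested.shape[1]
    k = X_full.shape[1]
    ssr_restricted = ssr(y, X_controls)
    ssr_unrestricted = ssr(y, X_full)
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - k))


# Simulated sample; effect sizes are arbitrary and chosen for illustration.
rng = np.random.default_rng(1)
n = 750
age = rng.integers(24, 40, size=n).astype(float)
scores = rng.normal(size=(n, 2))          # stand-ins for two standardized GRSs
controls = np.column_stack([np.ones(n), age])
noise = rng.normal(0.0, 0.9, size=n)

y_effect = 10.0 + 0.01 * age - 0.2 * scores[:, 0] - 0.2 * scores[:, 1] + noise
y_null = 10.0 + 0.01 * age + noise

f_effect = joint_f_stat(y_effect, controls, scores)
f_null = joint_f_stat(y_null, controls, scores)
print(f"F with true effects: {f_effect:.2f}; F under the null: {f_null:.2f}")
```

Under the null hypothesis the statistic follows an F distribution with q and n − k degrees of freedom, from which the p-value is obtained. Testing the two scores jointly is useful when they are correlated and each is individually imprecise.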
Table 4.
Genetic health endowments and labour market outcomes. Joint significance of BMI32 and WHR16 on various outcomes, men and women separately.
| Labour market outcomes, averages over 2001–2012 | Men | | | Women | | |
|---|---|---|---|---|---|---|
| | Earnings | Years employed | Social income transfers | Earnings | Years employed | Social income transfers |
| | (N = 751) | (N = 758) | (N = 751) | (N = 900) | (N = 917) | (N = 900) |
| Panel A | 3.94 | 3.23 | 4.65 | 0.57 | 1.56 | 0.22 |
| | (0.020) | (0.040) | (0.010) | (0.568) | (0.210) | (0.802) |
| Panel B | 3.84 | 3.03 | 4.82 | 0.59 | 1.82 | 0.23 |
| Added: family background | (0.022) | (0.049) | (0.008) | (0.553) | (0.163) | (0.792) |
| Panel C | 3.34 | 4.19 | 5.06 | 0.47 | 0.45 | 0.64 |
| Added: all remaining GRSs | (0.036) | (0.016) | (0.007) | (0.625) | (0.638) | (0.528) |
| Panel D | 3.40 | 4.16 | 5.49 | 0.45 | 0.55 | 0.64 |
| Added: all together | (0.034) | (0.016) | (0.004) | (0.635) | (0.577) | (0.529) |
Table reports F-statistics for a joint test that all GRS coefficients (BMI32 and WHR16) are zero and p-values in parentheses. All models include controls for age.
Table 5.
Genetic health endowments and labour market outcomes. Joint significance of BMI32 and WHR16 on various outcomes, men and women separately.
| Labour market outcomes in 2012 | Men | | | Women | | |
|---|---|---|---|---|---|---|
| | Earnings | Years employed | Social income transfers | Earnings | Years employed | Social income transfers |
| | (N = 745) | (N = 751) | (N = 745) | (N = 893) | (N = 909) | (N = 893) |
| Panel A | 3.54 | 3.71 | 0.81 | 0.55 | 1.30 | 1.11 |
| | (0.030) | (0.025) | (0.446) | (0.578) | (0.273) | (0.330) |
| Panel B | 3.57 | 3.76 | 0.79 | 0.63 | 1.42 | 1.09 |
| Added: family background | (0.029) | (0.024) | (0.453) | (0.531) | (0.243) | (0.336) |
| Panel C | 4.38 | 5.80 | 4.08 | 1.14 | 0.43 | 0.60 |
| Added: all remaining GRSs | (0.013) | (0.003) | (0.017) | (0.319) | (0.651) | (0.551) |
| Panel D | 4.40 | 5.93 | 4.30 | 1.22 | 0.51 | 0.58 |
| Added: all together | (0.013) | (0.003) | (0.014) | (0.294) | (0.598) | (0.561) |
Table reports the F-statistics for a joint test that all GRS coefficients (BMI32 and WHR16) are zero and p-values in parentheses. All models include controls for age.
The results confirm the gender differences in the effects of obesity on later-life outcomes in the labour market. The baseline results for men (Table 4, Panel A, columns 1–3) show statistically significant joint effects of genetic health endowment related to body weight and body shape on earnings (p = 0.020, column 1), employment years (p = 0.040, column 2), and social income transfers (p = 0.010, column 3). No similar findings emerge for women: the corresponding p-values are 0.568, 0.210, and 0.802; see columns 4–6. The difference between genders remains robust after controlling for parental background (Panel B), all remaining risk scores in the data set (Panel C), and both sets of controls simultaneously (Panel D). The use of cross-sectional data from 2012, instead of averaged data for the 2001–2012 period, yields comparable results (Table 5). For women, the results show no effect in any specification; see columns 4–6. For men, the joint test fails to reject the null of no effect only for social income transfers, and only when we control for age alone or for age and family background (Panels A and B). In all other cases, the results are similar to those in Table 4.
4. Discussion
This study uses the YFS data to examine whether genetic endowments related to an enhanced risk for cardiovascular diseases have significant effects on long-term labour market performance in prime age. The reduced-form approach of the study is useful because previous evidence shows that health correlates with labour market outcomes, and several different mechanisms may explain the relationship. Furthermore, the observed relationships may be reciprocal. Testing for the presence of an effect, instead of estimating a causal relationship, has several advantages related to measurement errors in exposures, reverse causality and potential gene-environment interactions.
Our empirical analysis provides two main findings. First, the effects of genetic health endowments related to an enhanced risk of cardiovascular diseases on later-life outcomes in the labour market are generally correctly signed but statistically weak, and if they exist, they are related to endowments associated with obesity and prevail only for men. Second, an increase in the number of SNPs in risk scores tends to result in non-significant estimates. This result may reflect pleiotropic effects of SNPs with a tendency that genetic variants have strong associations with various traits with opposing labour market effects. This finding, in turn, implies that caution is warranted when using genetic information in instrumental variable research designs (Burgess, Bowden, Fall, Ingelsson, & Thompson, 2017).
The possible limitations of the study relate to the sample, the construction and power of GRSs, and the measurement of outcomes. First, the size of the YFS data limits the statistical power and the use of subsamples by gender and wave, although the baseline findings are robust to certain permutations of the sample. Second, the power of GRSs in explaining biomarkers related to cardiovascular risk is limited. On average, the GRSs account only for a small proportion of total variation in health traits. Third, it is not obvious which risk score to use in analysis. The use of a risk score with the highest number of SNPs may entail a trade-off between the predictive power of the GRS and the possible bias related to pleiotropic effects. In our case, SNPs may affect later-life labour market outcomes through unobserved characteristics that are not related to health endowments. Fourth, a risk of cardiovascular disease for individual performance in the labour market may materialize only at an older age: the average age of participants in the sample was 36 years, and therefore, it is possible that the increased genetic risks were not yet fully developed among the participants. Furthermore, the findings are based on data from a single country, Finland, which may raise issues related to generalization of the results. However, Finland is a prominent example of a highly developed Nordic welfare state, and thus, our results may apply to European labour markets as well.
We suggest five extensions for future research. First, the analyses could be confirmed by using information on individual SNPs instead of risk scores to detect and account for possible pleiotropic effects (Von Hinke et al., 2016, Zhu et al., 2018). Alternatively, polygenic risk scores, which account for genome-wide variation and achieve greater predictive power compared with GRSs comprising a small number of genome-wide significant SNPs, could be used (Visscher et al., 2017, Choi et al., 2018, Inouye et al., 2018). Unfortunately, the YFS data that are currently linked to the comprehensive registers of Statistics Finland contain only data on risk scores and no information on individual SNPs or polygenic risk scores. Second, the results should be confirmed in other and preferably larger samples. This research would provide more information and precision, in particular, on the gender differences in the effects of health endowment for later-life outcomes (Cawley, 2004). Third, it would be useful to repeat our analyses by using genetic health endowments that are related to other dysfunctions and abnormal conditions of the body, along with cardiovascular diseases (Zhu et al., 2018). Fourth, the analysis could be augmented with information on parental health endowments (Conley et al., 2015). This approach would alleviate the problems associated with assortative mating and the possibility that parental endowments shape the childhood environment, which, in turn, affects offspring’s later performance in the labour market. Fifth, it could be interesting to augment the analysis by using variables that measure participants’ health behaviour in youth. This approach could provide a more complete picture of the interplay between early genetic endowments, early health behaviour and later labour market outcomes (Conti and Heckman, 2010).
Ethical statement
Ethics approval/Young Finns Data/University of Turku.
Funding
Academy of Finland (117787, 121584, 124282, 126925, 134309, 286284, 41071); Yrjö Jahnsson Foundation (6664, 6647); Palkansaaja Foundation (2013, 2016).
References
- Belsky D.W., Moffitt T.E., Caspi A. Genetics in population health science: Strategies and opportunities. American Journal of Public Health. 2013;103(S1):S73–S83. doi: 10.2105/AJPH.2012.301139.
- Böckerman P., Cawley J., Viinikainen J., Lehtimäki T., Rovio S., Pehkonen J., Raitakari O. The effect of weight on labor market outcomes: An application of genetic instrumental variables. Health Economics. 2019;28:65–77. doi: 10.1002/hec.3828.
- Burgess S., Bowden J., Fall T., Ingelsson E., Thompson S.G. Sensitivity analyses of robust causal inference from mendelian randomization analyses with multiple genetic variants. Epidemiology. 2017;28(1):30–42. doi: 10.1097/EDE.0000000000000559.
- Cawley J. The impact of obesity on wages. Journal of Human Resources. 2004;39(2):451–474.
- Cawley J. An economy of scales: A selective review of obesity’s economic causes, consequences, and solutions. Journal of Health Economics. 2015;43:244–268. doi: 10.1016/j.jhealeco.2015.03.001.
- Cawley J., Han E., Norton E. The validity of genes related to neurotransmitters as instrumental variables. Health Economics. 2011;20(8):884–888. doi: 10.1002/hec.1744.
- Choi S.W., Mak T.S.H., Reilly P. A guide to performing Polygenic Risk Score analyses. bioRxiv. 2018. doi: 10.1101/416545.
- Conley D. Socio-genomic research using genome-wide molecular data. Annual Review of Sociology. 2016;42:275–299.
- Conley D., Domingue B.W., Cesarini D., Dawes C., Rietveld C.A., Boardman J.D. Is the effect of parental education on offspring biased or moderated by genotype? Sociological Science. 2015;2:82–105. doi: 10.15195/v2.a6.
- Conti G., Heckman J.J. Understanding the early origins of the education–health gradient: A framework that can also be applied to analyze gene–environment interactions. Perspectives on Psychological Science. 2010;5(5):585–605. doi: 10.1177/1745691610383502.
- Currie J. Healthy, wealthy and wise: Socioeconomic status, poor health in childhood and human capital development. Journal of Economic Literature. 2009;47(1):87–122.
- Danaei G., Ding L.E., Mozaffarian D., Taylor B., Rehm J., Murray C.J.L. The preventable causes of death in the United States: Comparative risk assessment of dietary, lifestyle, and metabolic risk factors. PLoS Med. 2009;6(4):e1000058. doi: 10.1371/journal.pmed.1000058.
- Delaneau O., Marchini J., Zagury J.F. A linear complexity phasing method for thousands of genomes. Nat Methods. 2012;9(2):179–181. doi: 10.1038/nmeth.1785.
- Ding W., Lehrer S.F., Rosenquist J.N., Audrain-McGovern J. The impact of poor health on academic performance: New evidence using genetic markers. Journal of Health Economics. 2009;28:578–597. doi: 10.1016/j.jhealeco.2008.11.006.
- 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010;467(7319):1061–1073. doi: 10.1038/nature09534.
- Greco M.F.D., Minelli C., Sheehan N.A., Thompson J.R. Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Statistics in Medicine. 2015;34(21):2926–2940. doi: 10.1002/sim.6522.
- Hack M., Flannery D.J., Schluchter M., Cartar L., Borawski E., Klein N. Outcomes in young adulthood for very-low-birth-weight infants. New England Journal of Medicine. 2002;346(3):149–157. doi: 10.1056/NEJMoa010856.
- Heid I.M., Jackson A.U., Randall J.C., Winkler T.W., Qi L., Steinthorsdottir V. Meta-analysis identifies 13 new loci associated with waist-hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nat Genet. 2010;42(11):949–960. doi: 10.1038/ng.685.
- Hernesniemi J.A., Lyytikäinen L.P., Oksala N., Seppälä I., Kleber M.E., Mononen N., Martiskainen M. Predicting sudden cardiac death using common genetic risk variants for coronary artery disease. European Heart Journal. 2015;36(26):1669–1675. doi: 10.1093/eurheartj/ehv106.
- Howie B.N., Donnelly P., Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 2009;5(6):e1000529. doi: 10.1371/journal.pgen.1000529.
- Inouye M., Abraham G., Nelson C.P., Samani N.J. Genomic risk prediction of coronary artery disease in 480,000 adults: implications for primary prevention. Journal of the American College of Cardiology. 2018;72(16):1883–1893. doi: 10.1016/j.jacc.2018.07.079.
- Jäckle R., Himmler O. Health and wages: Panel estimates considering selection and endogeneity. Journal of Human Resources. 2010;45(2):364–406.
- Johansson E., Böckerman P., Kiiskinen U., Heliövaara M. Obesity and labour market success in Finland: The difference between having a high BMI and being fat. Economics and Human Biology. 2009;7:36–45. doi: 10.1016/j.ehb.2009.01.008.
- Kline B., Tobias J.L. The wages of BMI: Bayesian analysis of a skewed treatment–response model with nonparametric endogeneity. Journal of Applied Econometrics. 2008;23:767–793.
- Lawlor D.A., Harbord R.M., Sterne J., Timpson N., Smith G. Mendelian randomization: Using genes as instruments for making inferences in epidemiology. Statistics in Medicine. 2008;27:1133–1163. doi: 10.1002/sim.3034.
- Locke A.E. Genetic studies of body mass index yield new insight for obesity biology. Nature. 2015. doi: 10.1038/nature14177.
- Lundborg P., Stenberg A. Nature, nurture and socioeconomic policy—What can we learn from molecular genetics? Economics and Human Biology. 2010;8(3):320–330. doi: 10.1016/j.ehb.2010.08.002.
- Norton E.C., Han E. Genetic information, obesity and labor market outcomes. Health Economics. 2008;17(9):1089–1104. doi: 10.1002/hec.1383.
- Palmer T.M., Lawlor D.A., Harbord R.M., Sheehan N.A., Tobias J.H., Timpson N.J.…Sterne J.A.C. Using multiple genetic variants as instrumental variables for modifiable risk factors. Statistical Methods in Medical Research. 2012;21:223–242. doi: 10.1177/0962280210394459.
- Pehkonen J., Viinikainen J., Böckerman P., Lehtimäki T., Pitkänen N., Raitakari O. Genetic endowments, parental resources and adult health: Evidence from the Young Finns study. Social Science and Medicine. 2017;188:191–200. doi: 10.1016/j.socscimed.2017.04.030.
- Pomeranz J.L., Puhl R.M. New developments in the law for obesity discrimination protection. Obesity. 2013;21(3):469–471. doi: 10.1002/oby.20094.
- Puhl R.M., Heuer C.A. The stigma of obesity: A review and update. Obesity. 2009;17(5):941–965. doi: 10.1038/oby.2008.636.
- Raitakari O.T., Juonala M., Rönnemaa T., Jula A. Cohort profile: The cardiovascular risk in Young Finns Study. International Journal of Epidemiology. 2008;37:1220–1226. doi: 10.1093/ije/dym225.
- Rooth D.-O. Obesity, attractiveness, and differential treatment in hiring: A field experiment. Journal of Human Resources. 2009;44:710–735.
- Speliotes E.K., Willer C.J., Berndt S.I., Loos R.J.F. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet. 2010;42(11):937–948. doi: 10.1038/ng.686.
- Teo Y.Y., Inouye M., Small K.S., Gwilliam R., Deloukas P., Kwiatkowski D.P., Clark T.G. A genotype calling algorithm for the Illumina BeadArray platform. Bioinformatics. 2007;23(20):2741–2746. doi: 10.1093/bioinformatics/btm443.
- Teslovich T.M., Musunuru K., Smith A.V., Edmondson A.C., Stylianou I.M., Koseki M. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 2010;466(7307):707–713. doi: 10.1038/nature09270.
- Tyrrell J., Jones S.E., Beaumont R., Frayling T.M. Height, body mass index, and socioeconomic status: Mendelian randomisation study in UK Biobank. British Medical Journal. 2016;352:i582. doi: 10.1136/bmj.i582.
- VanderWeele T., Tchetgen E.J., Cornelis M., Kraft P. Methodological challenges in Mendelian randomization. Epidemiology. 2016;25(3):427–435. doi: 10.1097/EDE.0000000000000081.
- Visscher P.M., Wray N.R., Zhang Q., Sklar P., McCarthy M.I., Brown M.A., Yang J. 10 years of GWAS discovery: Biology, function, and translation. American Journal of Human Genetics. 2017;101(1):5–22. doi: 10.1016/j.ajhg.2017.06.005.
- Von Hinke S., Smith G., Lawlor D.A., Propper C., Windmeijer F. Mendelian randomization: The use of genes in instrumental variable analysis. Health Economics. 2011;20:893–896. doi: 10.1002/hec.1746.
- Von Hinke S., Smith G., Lawlor D.A., Propper C., Windmeijer F. The impact of poor health on academic performance: New evidence using genetic markers. Journal of Health Economics. 2013;28:578–597. doi: 10.1016/j.jhealeco.2008.11.006.
- Von Hinke S., Smith G.D., Lawlor D.A., Propper C., Windmeijer F. Genetic markers as instrumental variables. Journal of Health Economics. 2016;45:131–134. doi: 10.1016/j.jhealeco.2015.10.007.
- Zhu Z., Zheng Z., Zhang F., Yang J. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nature Communications. 2018;9:224. doi: 10.1038/s41467-017-02317-2.

