PLOS ONE. 2021 Aug 17;16(8):e0255348. doi: 10.1371/journal.pone.0255348

Polygenic scores for smoking and educational attainment have independent influences on academic success and adjustment in adolescence and educational attainment in adulthood

Brian M Hicks 1,*, D Angus Clark 1, Joseph D Deak 2,3, Jonathan D Schaefer 4, Mengzhen Liu 5, Seonkyeong Jang 5, C Emily Durbin 6, Wendy Johnson 7, Sylia Wilson 4, William G Iacono 5, Matt McGue 5, Scott I Vrieze 5
Editor: Edelyn Verona 8
PMCID: PMC8370636  PMID: 34403414

Abstract

Educational success is associated with greater quality of life and depends, in part, on heritable cognitive and non-cognitive traits. We used polygenic scores (PGS) for smoking and educational attainment to examine different genetic influences on facets of academic adjustment in adolescence and educational attainment in adulthood. PGSs were calculated for participants of the Minnesota Twin Family Study (N = 3225) and included as predictors of grades, academic motivation, and discipline problems at ages 11, 14, and 17 years-old, cigarettes per day from ages 14 to 24 years old, and educational attainment in adulthood (mean age 29.4 years). Smoking and educational attainment PGSs had significant incremental associations with each academic variable and cigarettes per day. About half of the adjusted effects of the smoking and education PGSs on educational attainment in adulthood were mediated by the academic variables in adolescence. Cigarettes per day from ages 14 to 24 years old did not account for the effect of the smoking PGS on educational attainment, suggesting the smoking PGS indexes genetic influences related to general behavioral disinhibition. In sum, distinct genetic influences measured by the smoking and educational attainment PGSs contribute to academic adjustment in adolescence and educational attainment in adulthood.

Introduction

Educational success is important for a variety of life outcomes, including wealth accumulation, health and longevity, and happiness [1–3]. Success in school entails multiple facets, including motivation and enthusiasm for striving toward academic goals, willingness to conform to a school’s standards of conduct, and earning good grades. Consequently, educational success is complex and calls upon a variety of cognitive and non-cognitive traits, including intellectual abilities (learning, memory, reasoning), persistence in pursuit of long-term goals, positive activation for goal striving, self-control, internalizing the importance of academic goals, and cultivating positive relationships with adults in educational settings [4–6].

Meta-analyses of twin studies have found that genetic influences account for about 40% while shared and nonshared environmental influences each account for about 30% of the variation in educational attainment [7, 8]. These cumulative genetic and environmental influences are observed earlier in development on intermediate phenotypes associated with academic success including grades, academic motivation, and disciplinary conformity, though relative proportions of genetic and environmental influence may differ across these domains and with age [9, 10]. Large genome-wide association study (GWAS) meta-analyses of educational attainment have now identified over 1200 genome-wide significant associations (p < 5.0 × 10⁻⁸) with individual single nucleotide polymorphisms (SNPs), a key step in delineating biological processes that contribute to educational attainment [11]. Because effect sizes for individual SNPs are typically very small, polygenic scores (PGS) are often used to aggregate the effects of all SNPs from a GWAS [12]. By weighting all SNPs according to their effect sizes in GWAS, PGS account for about 10% of the variance in educational attainment. PGS can then be used to examine the genetic associations between educational attainment and its known correlates.
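To make the weighting idea concrete, the following minimal sketch computes a PGS as a weighted sum of effect-allele counts. The function name `polygenic_score`, the toy genotypes, and the toy weights are all assumptions invented for illustration; they are not the study's data, and real pipelines adjust the weights for linkage disequilibrium before scoring.

```python
import numpy as np


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele counts, one score per person.

    dosages: (n_people, n_snps) array of effect-allele counts (0, 1, or 2).
    weights: (n_snps,) array of per-SNP effect sizes taken from a GWAS.
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return dosages @ weights


# Hypothetical toy data: 3 people, 4 SNPs (values invented for illustration).
toy_dosages = np.array([[0, 1, 2, 0],
                        [1, 1, 0, 2],
                        [2, 0, 1, 1]])
toy_weights = np.array([0.02, -0.01, 0.03, 0.005])

raw = polygenic_score(toy_dosages, toy_weights)
# Express scores in standard-deviation units within this toy sample.
standardized = (raw - raw.mean()) / raw.std()
print(raw)
print(standardized)
```

Standardizing the raw score is a common convenience so that regression coefficients can be read per standard deviation of the PGS.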

Smoking is a non-cognitive trait that has a strong association with lower educational attainment [13, 14]. Rather than a direct causal effect of education on smoking (or vice versa), however, there has been a long recognition that this association is due to the common influences of third variables. For example, smoking patterns tend to be established in the late teens to early 20’s, which is prior to the completion of higher education but later than when consistent individual differences in factors strongly related to educational attainment (e.g., GPA, academic motivation, discipline problems) have emerged [15]. Further, sib-pair difference analyses have found that familial factors account for the association between smoking and educational attainment [16]. Finally, recent GWAS findings have estimated genetic correlations from r = .27 to .56 between smoking and educational attainment phenotypes, indicating at least some of their familial association is due to common genetic influences [17, 18].

Behavioral disinhibition refers to difficulty inhibiting impulses to engage in socially undesirable or restricted actions [19] and is another non-cognitive trait that has been associated with academic success [4]. Externalizing problems are manifestations of these poor inhibitory abilities and include impulsivity, aggression, rule breaking, oppositionality, hyperactivity, and inattention. They are associated with lower grades, poor academic motivation, and more disciplinary problems, and predict lower educational attainment [20, 21], with most of the overlap attributable to shared genetic influences [9]. Smoking, especially in adolescence, is strongly correlated with externalizing behaviors, alcohol use, and other drug use, all of which are manifestations of a higher-order behavioral disinhibition trait [19, 22–24]. It is possible then that the association between smoking and educational attainment is actually due to the overlap between smoking and the broader trait of behavioral disinhibition.

Recently, we examined the predictive validity of a PGS for having ever been a regular smoker that was derived from the largest GWAS of smoking-related phenotypes to date (N = 1,232,091) [25]. In replication samples, this PGS accounted for 4% of the variance in a similar smoking phenotype and was also significantly associated with use measures of alcohol, cannabis, cocaine, amphetamines, ecstasy, and hallucinogens [25, 26]. Using the same twin sample as in this report, we found that this smoking PGS predicted trajectories of nicotine and alcohol use from ages 14 to 34, even after adjusting for nicotine and alcohol use and a PGS for drinks per week [27].

This smoking PGS was also associated with the externalizing dimension of the Child Behavior Checklist in a large sample of pre-adolescents, even after adjusting for a general factor of psychopathology [28]. We followed up these results and found that the smoking PGS was associated with externalizing problems and personality traits associated with behavioral control—but not internalizing problems and extraversion—from ages 11 to 17 [29]. We concluded that the smoking PGS was also a measure of genetic influences on general behavioral disinhibition rather than smoking or nicotine addiction specifically, and so could be used to investigate the role that genetic influences related to behavioral disinhibition have on the development of other near-neighbor outcomes.

Here, we examined the relative effects of PGSs for educational attainment and smoking on educational attainment in adulthood. We also took a developmental approach and examined associations between the PGSs for smoking and educational attainment and several intermediate phenotypes that contribute to educational success including grades, academic motivation, and disciplinary problems in childhood and adolescence. We operationalized these intermediate academic phenotypes using the stable variance across multiple occasions (ages 11, 14, and 17-years old), which removed time-specific influences and unsystematic measurement error from these measures. We then tested whether the PGSs for educational attainment and smoking had incremental effects over and above each other, and if their effects differed across the different facets of academic adjustment in adolescence and educational attainment in adulthood (mean age 29.4 years). Associations between the PGSs and the intermediate academic variables can be conceptualized as examples of gene-environment correlation processes, wherein genetic dispositions influence the environments people shape or are exposed to, which then influences later outcomes [30, 31]. That is, some genetic influences are mediated through environmental experiences. Consequently, we also fit a path analysis model to delineate the effects of the PGSs for educational attainment and smoking on the intermediate academic variables in adolescence, and tested whether these intermediate variables mediated the associations of the PGSs on the more distal adulthood outcome of educational attainment. Finally, we also examined whether the expressed phenotype of cigarettes per day from ages 14 to 24 years old accounted for the association between the smoking PGS and educational attainment, or if the effect of the smoking PGS was mediated through the academic variables, which would be more consistent with a general effect of behavioral disinhibition.

Methods

Participants

Participants were members of the Minnesota Twin Family Study (MTFS), a longitudinal study of 3762 (52% female) twins (1881 pairs) [32]. All twin pairs were the same sex and lived with at least one biological parent within driving distance to the University of Minnesota laboratories when recruited. Exclusion criteria included any cognitive or physical disability that would interfere with study participation. Twins were recruited the year they turned either 11-years old (n = 2510; ‘younger cohort’) or 17-years old (n = 1252; ‘older cohort’). Twins in the older cohort were born between 1972 and 1979, while twins in the younger cohort were born from 1977 to 1984 and 1988 to 1994. Families were representative of the recruitment area on socioeconomic status, history of mental health treatment, and urban-rural residence [33]. Consistent with the demographics of Minnesota for the target birth years, 96% of participants reported non-Hispanic White ethnicity and race. All study protocols were evaluated and approved by the Institutional Review Board at the University of Minnesota. Written consent was obtained from all participants ages 18 years-old and older; written consent from parents and written assent from participants was obtained for all participants under age 18 years-old.

The younger cohort was assessed at ages 11 (Mage = 11.78 years, SD = 0.43 years) and 14 (Mage = 14.90 years, SD = 0.31 years), and all twins were assessed at ages 17 (Mage = 17.85 years, SD = 0.64 years), 21 (Mage = 21.08 years, SD = 0.79 years), and 24 (Mage = 24.87 years, SD = 0.94 years). All twins from the 1972–1979 and 1977–1984 birth cohorts were also assessed at age 29 (Mage = 29.43 years, SD = 0.67 years), and a subset (n = 866) of the latter cohort was also assessed at age 34 (Mage = 34.62 years, SD = 1.30 years). Table 1 provides the number of participants and descriptive statistics for the measures of academic adjustment at ages 11, 14, and 17. Retention rates were 91.4% and 86.3% at ages 14 and 17, respectively, for the younger cohort. The total sample included 1205 monozygotic (51.5% female) and 676 dizygotic (52.8% female) twin pairs.

Table 1. Descriptive information for academic adjustment variables at ages 11, 14, and 17.

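| Measure | Age 11: M | Age 11: SD | Age 11: N | Age 14: M | Age 14: SD | Age 14: N | Age 14: r11 | Age 17: M | Age 17: SD | Age 17: N | Age 17: r11 | Age 17: r14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Grade Point Average | 3.10 | .66 | 2492 | 3.06 | .79 | 2357 | .64 | 3.02 | .77 | 3449 | .56 | .73 |
| Academic Motivation, Self/Parent Report | 20.89 | 2.50 | 2502 | 19.83 | 2.94 | 2335 | .47 | 19.76 | 3.08 | 3412 | .41 | .62 |
| Academic Motivation, Teacher Report | 19.87 | 3.13 | 1754 | 19.64 | 3.40 | 1808 | .58 | 19.67 | 3.42 | 2249 | .53 | .58 |
| Disciplinary Problems | .00 | .61 | 2429 | .00 | .80 | 2357 | .25 | .00 | .75 | 3512 | .25 | .48 |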

Note. M = mean; SD = standard deviation; N = number of ratings; r11 = correlation with corresponding variable at age 11; r14 = correlation with corresponding variable at age 14.

Assessment

Grade Point Average (GPA)

Twins and their mothers reported on the grades twins typically received in reading/English, math, social studies/history, and science classes by indicating whether the grades were much better than average (A = 4), above average (B = 3), average (C = 2), below average (D = 1) or very much below average, failing (F = 0). This approach was taken to standardize grade assessment and facilitate comparison since participants attended different school districts that employed different grading formats, procedures, and standards. We used the mean rating across class subjects for the GPA variable, and averaged the GPA scores across twin and mother reports (r = .79). The validity of this approach was tested using 67 school transcripts from a random sample of younger cohort twins, and the correlation between the MTFS rating and actual grades was r = .89 [9]. Participants who dropped out of high school reported grades for the last year they attended school.
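The scoring rule described above can be sketched as follows. The function names and the example reports are hypothetical, and the sketch assumes both informants rated all four subjects; how missing informants or subjects were handled is not stated in the text.

```python
from statistics import mean

# Rating scale described in the text: A = 4, B = 3, C = 2, D = 1, F = 0.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def informant_gpa(grades):
    """Mean rating across the class subjects reported by one informant."""
    return mean(GRADE_POINTS[grade] for grade in grades.values())


def composite_gpa(twin_grades, mother_grades):
    """Average of the twin-report and mother-report GPA scores."""
    return mean([informant_gpa(twin_grades), informant_gpa(mother_grades)])


# Hypothetical reports for one twin (invented for illustration).
twin_report = {"reading": "A", "math": "B", "social_studies": "B", "science": "A"}
mother_report = {"reading": "B", "math": "B", "social_studies": "B", "science": "A"}

gpa = composite_gpa(twin_report, mother_report)
print(gpa)
```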

Academic motivation

Twins and their mothers completed a 6-item (α = .83) scale assessing twins’ attitudes about school (interested in school work; enjoys attending school; turns in assignments on time; liked by teachers; has a good attitude about school; motivated to earn good grades) [9]. We used the mean of the self and mother reports (r = .51) for the academic motivation score.

Teachers also rated twins on the same items using a teacher rating form that was completed by up to 3 teachers nominated by the twins. We used the mean rating across teachers whenever more than one teacher rating was available (~75% of participants with teacher rating data had at least two teacher informants). Teacher ratings were collected at each assessment and were available for 69.9%, 72.0%, and 59.8% of participants at ages 11, 14, and 17, respectively. It was Minnesota state policy to place twins from the same pair in separate classrooms whenever possible, which minimized bias due to twin contrast or comparison. The correlation between the teacher and self/mother ratings of academic motivation was r = .55.

Disciplinary problems

Twins and mothers reported on twins receiving school disciplinary actions for misbehavior including: sent to detention or held after school; sent to principal’s office; notes sent home or parents called about student’s behavior; parent-teacher conferences regarding student’s behavior; skipping school or cutting classes; suspended or expelled from school. Responses were coded as 0 = never, 1 = once or twice, 2 = two or more times, and a behavior was considered present if reported by either the twin or mother (r = .62). We estimated disciplinary problems factor scores by fitting a 1-factor confirmatory factor analysis model to the six discipline problems items (mean factor loading = .83).

Cigarettes per day

Smoking was assessed using the average number of cigarettes smoked per day (or equivalent amount of an alternative form of nicotine use such as chews, cigars, etc.) at the target ages of 14, 17, 21, and 24 years old. Free responses were converted to a 0 (no use) to 6 (20 or more cigarettes per day) scale.

Educational attainment

We used the last assessment that a twin reported on their educational experiences (mean and median age 29.4 years, range 19.7 to 39.9 years) to code their highest level of educational attainment (n = 3463; 92.1% of the total sample). The educational attainment variable was coded as follows: 1 = less than high school diploma (9.5%), 2 = high school graduate or GED (9.8%), 3 = vocational degree, some college, or an associate’s degree (30.5%), 4 = bachelor’s degree (31.9%), 5 = master’s level degree (8.8%), 6 = PhD or other advanced professional degree (e.g., MD, JD) (4.2%). Educational attainment was coded as missing (7.9%; n = 299) if the participant did not have data for a post high school assessment (i.e., age 20 or older). Because there was a significant correlation between age and educational attainment (r = .27, p < .001), we regressed educational attainment on age, and used the unstandardized residual score in all analyses.
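The age adjustment amounts to keeping the residuals from a simple regression of attainment on age. A minimal sketch follows, using ordinary least squares in NumPy; the helper name `residualize` and the six data points are assumptions made up for illustration, not study data.

```python
import numpy as np


def residualize(outcome, covariate):
    """Unstandardized residuals from an OLS regression of outcome on covariate."""
    y = np.asarray(outcome, dtype=float)
    x = np.asarray(covariate, dtype=float)
    design = np.column_stack([np.ones_like(x), x])  # intercept + covariate
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


# Hypothetical ages at last assessment and attainment codes (1 to 6 scale).
age = np.array([24.0, 29.0, 29.5, 31.0, 34.5, 39.0])
attainment = np.array([2, 3, 4, 4, 5, 4])

age_adjusted = residualize(attainment, age)
print(age_adjusted)
```

By construction the residuals have mean zero and are uncorrelated with age, so later associations with the adjusted score cannot be driven by age differences at assessment.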

PGS methods

We generated PGSs for smoking and educational attainment from the GWAS summary statistics of large discovery samples for having ever smoked regularly [25] and years of education [11], after removing the MTFS sample that contributed to those GWAS to remove overlap with our study sample. We created smoking PGSs for participants of European ancestry in the MTFS target sample following imputation to the most recent Haplotype Reference Consortium reference panel [34, 35], and restricted to variants with minor allele frequency ≥ .01 and with imputation quality scores greater than 0.7. For the educational attainment PGS, we applied the same QC procedure for summary statistics of educational attainment GWAS [11] and additionally removed the MHC region (chr6:28477797–33448354). We then generated beta weights in the MTFS sample for the resulting ~1 million filtered HapMap3 variants using LDpred v.1.0 [36], including variants of all significance levels (i.e., p ≤ 1) to capture all genetic influences across the genome. We then calculated smoking and educational attainment PGSs in PLINK 1.9 [37] for all participants meeting this study’s inclusion criteria (n = 3225).
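The variant filters stated above can be expressed as a small predicate. This is only an illustrative sketch: the `Variant` record, the identifiers, and the example values are hypothetical, treating the MHC interval as inclusive at both ends is an assumption, and the restriction to HapMap3 variants and the LDpred reweighting step are not shown.

```python
from dataclasses import dataclass

MAF_MIN = 0.01   # keep variants with minor allele frequency >= .01
INFO_MIN = 0.7   # keep variants with imputation quality > 0.7
MHC_CHROM, MHC_START, MHC_END = 6, 28_477_797, 33_448_354  # chr6:28477797-33448354


@dataclass(frozen=True)
class Variant:
    rsid: str     # hypothetical identifier
    chrom: int
    pos: int
    maf: float
    info: float   # imputation quality score


def in_mhc(v):
    """True if the variant falls inside the excluded MHC interval."""
    return v.chrom == MHC_CHROM and MHC_START <= v.pos <= MHC_END


def passes_qc(v, drop_mhc=False):
    """Apply the frequency and imputation filters, optionally dropping the MHC."""
    if v.maf < MAF_MIN or v.info <= INFO_MIN:
        return False
    return not (drop_mhc and in_mhc(v))


# Invented example variants.
variants = [
    Variant("rs_a", 1, 1_000_000, 0.20, 0.95),   # passes every filter
    Variant("rs_b", 2, 5_000_000, 0.005, 0.99),  # fails the frequency filter
    Variant("rs_c", 3, 7_000_000, 0.30, 0.60),   # fails the imputation filter
    Variant("rs_d", 6, 30_000_000, 0.25, 0.90),  # inside the MHC interval
]

smoking_set = [v.rsid for v in variants if passes_qc(v)]
education_set = [v.rsid for v in variants if passes_qc(v, drop_mhc=True)]
print(smoking_set, education_set)
```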

Data analytic strategy

We examined associations among the smoking and educational attainment PGSs and the longitudinal measures of academic adjustment using multiple regression models and random intercept panel models (RI-PM; see Fig 1). In the multiple regression models, we entered an academic variable at a single time point as the outcome and regressed it on the smoking PGS or the educational attainment PGS, as well as the covariates of participant sex and the first five genetic principal components to adjust for ancestral stratification [38].

Fig 1. Conditional random intercept model.

PGS = polygenic risk score; RI = random intercept factor; GPA = grade point average; R = residual factors at age 11, 14, and 17; CVS = set of covariates (the first five genetic principal components and sex). The two PGS predictor model for GPA is depicted here for illustrative purposes, but all conditional random intercept models followed this general structure (with either one or two PGS predictor variables). Variances/residual variances and mean structure omitted from figure for clarity of presentation.

We then fit univariate, unconditional RI-PMs to the longitudinal measures of academic adjustment (GPA11, GPA14, GPA17 in Fig 1). In each model, we specified the measures of a given academic variable at each time point to load on a time-invariant random intercept factor (RI in Fig 1). We fixed factor loadings to 1, and allowed indicator intercepts to vary. We fixed the mean of the random intercept to 0 and estimated its variance freely. The random intercept captured the variance in the indicators shared across time points, that is, the stable trait variance across time [39]. For example, a positive random intercept score indicates that an individual consistently ranked higher than the sample mean across time points. We specified occasion-specific residual factors (R11 through R17 in Fig 1) with factor loadings fixed to 1, means fixed to 0, and variances freely estimated. We added autoregressive paths from one residual factor to the subsequent residual factor. These paths captured the extent to which time point-specific deviations at one time-point were related to time point-specific deviations at the subsequent time points, and were included because not accounting for residual autoregressive variance could lead to biased variance estimates in the intercept factors [40, 41]. We then fit conditional models in which the random intercept factors were either regressed on a single PGS and the control variables (1 PGS model), or both PGSs and the control variables (2 PGS model). We fit a similar conditional RI-PM for cigarettes per day, except that we used data for ages 14, 17, 21, and 24 years old, as there was very little smoking at age 11, and levels of cigarettes per day tend to peak in the early to mid-20’s.
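The model described above can be written compactly as follows. The notation is introduced here for illustration and is not the authors': $y_{i,t}$ is person $i$'s score on a given academic variable at age $t$, $\nu_t$ is the occasion-specific intercept, $\psi$ is the random intercept variance, $\phi_t$ are the autoregressive paths among the residual factors, and $\mathbf{c}_i$ collects the covariates (sex and the first five genetic principal components).

```latex
\begin{aligned}
y_{i,t} &= \nu_t + \mathrm{RI}_i + R_{i,t}, \qquad t \in \{11, 14, 17\},\\
\mathrm{RI}_i &\sim (0,\ \psi) \quad \text{(unconditional model)},\\
R_{i,11} &= \varepsilon_{i,11},\\
R_{i,14} &= \phi_{14}\, R_{i,11} + \varepsilon_{i,14},\\
R_{i,17} &= \phi_{17}\, R_{i,14} + \varepsilon_{i,17},\\
\mathrm{RI}_i &= \gamma_1\, \mathrm{PGS}^{\mathrm{smk}}_i + \gamma_2\, \mathrm{PGS}^{\mathrm{edu}}_i
  + \boldsymbol{\beta}^{\top}\mathbf{c}_i + \zeta_i \quad \text{(2 PGS conditional model)}.
\end{aligned}
```

The unit factor loadings appear as the implicit coefficient of 1 on $\mathrm{RI}_i$ and $R_{i,t}$ in the first line. In the 1 PGS model, one of the two PGS terms in the last line is dropped.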

Finally, we fit a path analysis mediation model to predict educational attainment in adulthood (Fig 2). In this model, the smoking PGS, educational attainment PGS, and control variables (sex and ancestry principal components) were the independent variables, and scores on the random intercepts of the adolescent academic variables and cigarettes per day were the mediator variables. To increase the model’s computational feasibility, we first estimated factor scores for the five random intercepts (via maximum a posteriori estimation) to include in the path analysis model. We specified paths from the independent variables to the five random intercept scores and educational attainment, and from the five random intercepts to educational attainment. We included covariances between all independent variables and specified residual covariances among the random intercept variables.

Fig 2. Educational attainment mediation model.

PGS = polygenic risk score. Circles represent random intercept factor scores. The first five genetic principal components and sex were included alongside the PGSs as predictors of the random intercept factor scores and educational attainment in adulthood. Covariate paths, covariances between predictor variables, covariances between random intercepts, variances/residual variances, and mean structure omitted from figure for clarity of presentation. Random intercepts represented via random intercept factor scores from the initial, unconditional random intercept models.

We fit all models in Mplus version 8.4 [42] using full information maximum likelihood estimation. We derived confidence intervals using clustered (by family) nonparametric percentile bootstrap (1000 draws), which provides reliable assessments of parameter estimate precision under a variety of complex data conditions [43]. We considered a parameter estimate statistically significant if the bootstrapped 95% confidence interval did not include 0, and its p-value was < .005. We used the Mplus Automation Package [44] in R [45] to facilitate the analyses.
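The clustered percentile bootstrap can be illustrated outside Mplus with a short sketch: whole families are resampled with replacement so that co-twins stay together, the statistic is recomputed on each draw, and the 2.5th and 97.5th percentiles of the draws form the interval. The function name, the simulated data, and the choice of a simple correlation as the statistic are assumptions for illustration; the study bootstrapped the parameters of its fitted models.

```python
import numpy as np


def cluster_percentile_ci(data, family_ids, statistic, n_draws=1000, level=0.95, seed=0):
    """Percentile bootstrap CI that resamples whole families with replacement."""
    data = np.asarray(data, dtype=float)
    family_ids = np.asarray(family_ids)
    families = np.unique(family_ids)
    # Row indices for each family, so co-twins are always drawn together.
    rows_by_family = [np.flatnonzero(family_ids == f) for f in families]
    rng = np.random.default_rng(seed)
    draws = np.empty(n_draws)
    for b in range(n_draws):
        picked = rng.integers(0, len(families), size=len(families))
        rows = np.concatenate([rows_by_family[k] for k in picked])
        draws[b] = statistic(data[rows])
    alpha = 1.0 - level
    lower, upper = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return lower, upper


def correlation(xy):
    """Pearson correlation between the two columns of xy."""
    return np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]


# Simulated twin pairs (invented): 200 families, 2 twins each, shared family effect.
sim = np.random.default_rng(1)
n_families = 200
family_ids = np.repeat(np.arange(n_families), 2)
shared = np.repeat(sim.normal(size=n_families), 2)
x = shared + sim.normal(size=2 * n_families)
y = 0.3 * x + sim.normal(size=2 * n_families)
data = np.column_stack([x, y])

lower, upper = cluster_percentile_ci(data, family_ids, correlation)
print(lower, upper)
```

Resampling individuals instead of families would treat co-twins as independent and tend to produce intervals that are too narrow.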

Results

Descriptive information for the academic variables including the N’s at ages 11, 14, and 17 and autocorrelations are reported in Table 1. The GPA (mean autocorrelation = .69), academic motivation (mean autocorrelation = .55), teacher rating of academic motivation (mean autocorrelation = .58), and discipline problems (mean autocorrelation = .37) measures had moderate to high stability over time. Mean-levels of cigarettes per day increased through the age 14 (M = 0.53, SD = 1.29, n = 2334), age 17 (M = 1.36, SD = 1.86, n = 3444), and age 21 (M = 2.00, SD = 2.06, n = 2698) assessments, and then declined slightly at the age 24 (M = 1.79, SD = 2.05, n = 3258) assessment, and exhibited moderate to high rank-order stability (mean autocorrelation = .67). The unconditional RI-PMs were fully saturated and thus perfectly fit the data. The variance component of each random intercept was statistically significant, and the residual structure autoregressive coefficients were small to moderate (mean coefficients of .21 from age 11 to 14, and .40 from 14 to 17).
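The reported mean autocorrelations appear consistent with averaging the two adjacent-wave correlations in Table 1 (age 11 with 14, and age 14 with 17). That reading is an inference from the numbers rather than something the text states, and the small sketch below simply checks it; small differences are expected because the tabled correlations are already rounded.

```python
# Adjacent-wave correlations copied from Table 1: (r between ages 11 and 14,
# r between ages 14 and 17).
ADJACENT_R = {
    "gpa": (0.64, 0.73),
    "motivation_self_parent": (0.47, 0.62),
    "motivation_teacher": (0.58, 0.58),
    "discipline": (0.25, 0.48),
}

# Mean autocorrelations as reported in the Results text.
REPORTED_MEAN = {
    "gpa": 0.69,
    "motivation_self_parent": 0.55,
    "motivation_teacher": 0.58,
    "discipline": 0.37,
}


def mean_adjacent(correlations):
    """Simple average of the adjacent-wave correlations."""
    return sum(correlations) / len(correlations)


for name, rs in ADJACENT_R.items():
    print(name, mean_adjacent(rs), REPORTED_MEAN[name])
```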

GPA

Results for the multiple regression models and RI-PMs for GPA, academic motivation, teacher rating of academic motivation, and disciplinary problems are presented in Table 2. For GPA, all the regression coefficients were statistically significant and small to medium in size for both the smoking (mean β = -.15) and educational attainment (mean β = .23) PGS. In the RI-PMs, the associations for the smoking (β = -.20, 95% CI: -.27, -.14) and educational attainment (β = .35, 95% CI: .29, .42) PGSs were larger than those in the multiple regression models, as expected given removal of time-specific variance including measurement error. When the two PGSs were included in the same RI-PM, both the smoking (β = -.13, 95% CI: -.19, -.07) and educational attainment (β = .32, 95% CI: .26, .40) PGSs remained statistically significant, though the effect size for the smoking PGS decreased by about 35%.
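The percentage decrease quoted for the smoking PGS follows from comparing the coefficient magnitudes in the 1 PGS and 2 PGS models. A minimal sketch of that arithmetic, using the two GPA coefficients reported above (the function name is an assumption):

```python
def percent_reduction(beta_one_pgs, beta_two_pgs):
    """Percent decrease in coefficient magnitude from the 1 PGS to the 2 PGS model."""
    before = abs(beta_one_pgs)
    after = abs(beta_two_pgs)
    return 100.0 * (before - after) / before


# Smoking PGS on the GPA random intercept: -.20 alone, -.13 adjusted.
smoking_gpa_drop = percent_reduction(-0.20, -0.13)
# Educational attainment PGS on the GPA random intercept: .35 alone, .32 adjusted.
education_gpa_drop = percent_reduction(0.35, 0.32)
print(smoking_gpa_drop, education_gpa_drop)
```

Because the inputs are rounded to two decimals, percentages computed this way are approximate.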

Table 2. Standardized regression coefficients from smoking and education polygenic scores (PGS) to outcome variables.

| Outcome | Smoking PGS: Age 11 | Smoking PGS: Age 14 | Smoking PGS: Age 17 | Smoking PGS: RI (1 PGS) | Smoking PGS: RI (2 PGS) | Education PGS: Age 11 | Education PGS: Age 14 | Education PGS: Age 17 | Education PGS: RI (1 PGS) | Education PGS: RI (2 PGS) |
|---|---|---|---|---|---|---|---|---|---|---|
| Grade Point Average | -.13 [-.18, -.07] | -.18 [-.23, -.13] | -.15 [-.19, -.11] | -.20 [-.27, -.14] | -.13 [-.19, -.07] | .20 [.15, .26] | .25 [.21, .30] | .25 [.20, .29] | .35 [.29, .42] | .32 [.26, .40] |
| Academic Motivation, Self/Parent Report | -.10 [-.15, -.05] | -.17 [-.22, -.13] | -.16 [-.20, -.12] | -.22 [-.28, -.16] | -.19 [-.26, -.13] | .06 [.01, .10] | .14 [.10, .19] | .15 [.10, .18] | .17 [.11, .23] | .12 [.07, .19] |
| Academic Motivation, Teacher Report | -.14 [-.19, -.07] | -.19 [-.24, -.14] | -.15 [-.20, -.10] | -.21 [-.27, -.15] | -.15 [-.21, -.09] | .19 [.13, .25] | .25 [.20, .30] | .18 [.13, .23] | .28 [.23, .34] | .25 [.19, .31] |
| Disciplinary Problems | .08 [.03, .13] | .17 [.12, .21] | .16 [.12, .20] | .28 [.20, .36] | .26 [.18, .34] | -.03 [-.08, .02] | -.12 [-.17, -.07] | -.08 [-.12, -.04] | -.15 [-.22, -.07] | -.09 [-.16, -.01] |
| Cigarettes per day | -- | -- | -- | .20 [.16, .24] | .18 [.14, .21] | -- | -- | -- | -.14 [-.18, -.10] | -.10 [-.15, -.06] |
| Educational Attainment | -- | -- | -- | -.19 [-.23, -.15] | -.14 [-.18, -.10] | -- | -- | -- | .26 [.22, .30] | .23 [.19, .27] |

Note. Age 11 = regression paths from PGS to outcome variable at age 11; Age 14 = regression paths from PGS to outcome variable at age 14; Age 17 = regression paths from PGS to outcome variable at age 17; RI = random intercept factor; 1 PGS = coefficients from one PGS predictor model; 2 PGS = coefficients from two PGS predictor model; Bold = 95% confidence interval does not include 0. Only one PGS was entered as a predictor in each one PGS predictor model; both PGSs were entered as predictors simultaneously in the two PGS predictor model. Smoking is a random intercept factor score for nicotine quantity assessed at ages 14, 17, 21, and 24 years old. Educational attainment was adjusted for age (mean and median age 29.4 years, SD = 3.9 years, range 19.7 to 39.9 years). All models include participant sex and first five genetic principal components as covariates (covariate regression paths not included). 95% confidence intervals presented under estimates; estimates for which confidence intervals do not include 0 are presented in bold. Confidence intervals derived via clustered, non-parametric percentile bootstrapping (with 1,000 random draws).

Academic motivation

All regression coefficients were statistically significant for the associations between the measures of academic motivation and the smoking (mean β’s = -.14 and -.16 for self/parent and teacher ratings, respectively) and educational attainment (mean β’s = .12 and .21, for self/parent and teacher ratings, respectively) PGS. In the RI-PM, associations with the random intercept factors for the self/parent and teacher ratings of academic motivation were slightly larger for both the smoking (mean β = -.22) and educational attainment (mean β = .23) PGSs. When the two PGSs were included in the same RI-PMs, the adjusted associations with the smoking (mean β = -.17) and educational attainment (mean β = .19) PGSs remained statistically significant, with an average reduction in effect sizes of about 20%.

Disciplinary problems

All regression coefficients were statistically significant for the associations between disciplinary problems and the smoking PGS (mean β = .14). Associations between disciplinary problems and the educational attainment PGS were significant at ages 14 (β = -.12, 95% CI: -.17, -.07) and 17 (β = -.08, 95% CI: -.12, -.04), but not at age 11 (β = -.03, 95% CI: -.08, .02). In the RI-PM, the association with the random intercept factor of disciplinary problems was much larger for the smoking PGS (β = .28, 95% CI: .20, .36) and slightly larger for the educational attainment PGS (β = -.15, 95% CI: -.22, -.07). When the two PGSs were included in the same RI-PM, the adjusted associations between the random intercept factor and the smoking (β = .26, 95% CI: .18, .34) and educational attainment (β = -.09, 95% CI: -.16, -.01) PGSs remained statistically significant, though the effect size for the educational attainment PGS decreased by about 40%.

Cigarettes per day

In the RI-PM, both the smoking (β = .20, 95% CI: .16, .24) and educational attainment (β = -.14, 95% CI: -.18, -.10) PGSs had significant associations with the random intercept factor for cigarettes per day (see Table 2). These effects remained significant after adjusting for their overlap, though the effects declined by about 29% for the educational attainment PGS and 10% for the smoking PGS.

Educational attainment in adulthood

Both the smoking (β = -.19, 95% CI: -.24, -.15) and educational attainment (β = .26, 95% CI: .22, .30) PGSs had significant associations with educational attainment in adulthood (see Table 2). These effects remained significant after adjusting for their overlap, though the effects declined by about 26% for the smoking PGS and 12% for the educational attainment PGS. Table 3 includes the correlations among the smoking and educational PGSs, estimated scores for random intercept factors of the four academic variables in adolescence and cigarettes per day, and educational attainment in adulthood. The four academic variables had large associations with each other (mean r = |.53|) and educational attainment (r’s = |.35| to |.52|; R2 = .34). Cigarettes per day also had a robust association with educational attainment (r = -.30).

Table 3. Correlations among variables in the mediation model.

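| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1. Smoking PGS | -- | | | | | | | |
| 2. Education PGS | -.23 | -- | | | | | | |
| 3. RI-GPA | -.14 | .24 | -- | | | | | |
| 4. RI-Academic Motivation, Self/Parent report | -.15 | .11 | .58 | -- | | | | |
| 5. RI-Academic Motivation, Teacher report | -.17 | .21 | .67 | .59 | -- | | | |
| 6. RI-Disciplinary Problems | .16 | -.08 | -.38 | -.47 | -.51 | -- | | |
| 7. RI-Cigarettes per day | .20 | -.14 | -.30 | -.36 | -.38 | .37 | -- | |
| 8. Educational Attainment | -.20 | .26 | .52 | .39 | .51 | -.35 | -.30 | -- |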

PGS = polygenic score; RI = random intercept; GPA = grade point average. Genetic principal components 1 through 5 and sex were also included in the mediation model as control variables; correlations for these control variables not presented. Random intercept correlations based on factor scores derived from the initial unconditional random intercept models. Educational attainment was adjusted for age (mean and median age 29.4 years, SD = 3.9 years, range 19.7 to 39.9 years).

Results from the mediation model that estimated the direct and indirect effects of the smoking and educational attainment PGSs via the academic variables in adolescence and cigarettes per day on educational attainment in adulthood are presented in Table 4. Inclusion of the smoking and educational attainment PGSs resulted in a significant increase in explained variance (ΔR2 = .02; R2 = .36; Δχ2(2) = 89.56, p < .001) over and above the four adolescent academic variables and cigarettes per day. Both the smoking (β = |.09| to |.18|) and educational attainment (β = |.05| to |.22|) PGSs had significant associations with the random intercept scores for each academic variable in adolescence and cigarettes per day.

Table 4. Standardized coefficients from educational attainment mediation model.

| Path | GPA | Academic Motivation (Self/Parent) | Academic Motivation (Teacher) | Disciplinary Problems | Cigarettes per day | Educational Attainment |
|---|---|---|---|---|---|---|
| Smoking PGS: PGS → RI | -.09 [-.13, -.05] | -.13 [-.17, -.09] | -.13 [-.17, -.09] | .15 [.11, .19] | .18 [.14, .22] | -- |
| Smoking PGS: PGS → Educational Attainment | -- | -- | -- | -- | -- | -.07 [-.11, -.04] |
| Smoking PGS: PGS → RI → Educational Attainment | -.03 [-.04, -.01] | .00 [-.01, .00] | -.03 [-.04, -.02] | -.01 [-.02, -.01] | -.01 [-.02, .00] | -.08 [-.10, -.05] |
| Education PGS: PGS → RI | .22 [.18, .27] | .09 [.05, .13] | .19 [.15, .23] | -.05 [-.10, -.01] | -.10 [-.14, -.06] | -- |
| Education PGS: PGS → Educational Attainment | -- | -- | -- | -- | -- | .11 [.08, .15] |
| Education PGS: PGS → RI → Educational Attainment | .06 [.05, .08] | .00 [.00, .01] | .04 [.03, .05] | .00 [.00, .01] | .01 [.00, .01] | .12 [.10, .14] |
| Educational Attainment: RI → Educational Attainment | .28 [.24, .33] | .03 [-.02, .07] | .21 [.16, .26] | -.08 [-.12, -.04] | -.05 [-.09, -.02] | -- |

The first five columns are the random intercept factors.

Note. PGS = polygenic score; RI = random intercept; GPA = grade point average; PGS → RI = paths from the PGS to the random intercept; PGS → Educational Attainment = paths from the PGS to educational attainment (i.e., the direct effect); PGS → RI → Educational Attainment = indirect effects from the PGS to educational attainment through the corresponding RI (the final entry in these rows corresponds to the total indirect effect through all RIs); RI → Educational Attainment = paths from the RI to educational attainment; Bold = 95% confidence interval does not include 0. All paths come from a single model with one outcome variable (educational attainment), five intervening variables (the random intercepts), and 6 independent variables (the PGSs and covariates of sex and 5 genetic principal components to adjust for ancestry; coefficients for control variables not presented). In this model, the random intercepts were represented via random intercept factor scores from the initial unconditional random intercept models. Confidence intervals were derived via clustered non-parametric percentile bootstrap with 1,000 draws. Educational attainment was adjusted for age (mean age 29.4 years, SD = 3.9 years, range 19.7 to 39.9 years). R2 = .36.
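The clustered non-parametric percentile bootstrap mentioned in the note can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' analysis code: the data are simulated, the model has a single mediator rather than five, families of two twins serve as the clusters, and all function names and generating parameters (`simulate_families`, `clustered_percentile_ci`, the path values 0.3, 0.4, and 0.1) are hypothetical. The key idea is that whole families are resampled with replacement, so the dependence between twins is preserved in every bootstrap draw.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a*b for the path x -> m -> y."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx  # slope of m on x
    # Partial slope of y on m, controlling for x (two-predictor OLS).
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b


def simulate_families(n_families, rng):
    """Simulated twin pairs; values are made up for illustration only."""
    families = []
    for _ in range(n_families):
        fam_x = rng.gauss(0, 0.7)  # family-level component of the predictor
        fam_m = rng.gauss(0, 0.5)  # family-level component of the mediator
        twins = []
        for _ in range(2):
            x = fam_x + rng.gauss(0, 0.7)
            m = 0.3 * x + fam_m + rng.gauss(0, 0.8)
            y = 0.4 * m + 0.1 * x + rng.gauss(0, 1.0)
            twins.append((x, m, y))
        families.append(twins)
    return families


def clustered_percentile_ci(families, n_draws=1000, level=0.95, seed=0):
    """Percentile bootstrap CI that resamples whole families, not individuals."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_draws):
        resampled = [rng.choice(families) for _ in families]
        x, m, y = zip(*(row for fam in resampled for row in fam))
        estimates.append(indirect_effect(x, m, y))
    estimates.sort()
    lower = estimates[round((1 - level) / 2 * n_draws)]
    upper = estimates[round((1 + level) / 2 * n_draws) - 1]
    return lower, upper


families = simulate_families(200, random.Random(42))
x, m, y = zip(*(row for fam in families for row in fam))
point = indirect_effect(x, m, y)
ci_low, ci_high = clustered_percentile_ci(families)
print(f"indirect effect = {point:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

Resampling individuals instead of families would treat twins as independent observations and would tend to give intervals that are too narrow. The percentile interval also makes no normality assumption about the product a*b, whose sampling distribution is typically skewed.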

Random intercept scores for GPA, teacher ratings of academic motivation, disciplinary problems, and cigarettes per day were in turn significantly associated with educational attainment in adulthood (last row of Table 4). These effects were adjusted for the common variance among all the predictors, and so were substantially smaller than the unadjusted correlations, but were still robust for GPA (β = .28, 95% CI: .24, .33) and teacher ratings of academic motivation (β = .21, 95% CI: .16, .26) and small for disciplinary problems (β = -.08, 95% CI: -.12, -.04) and cigarettes per day (β = -.05, 95% CI: -.09, -.02). Consequently, the smoking and educational attainment PGSs each had small but statistically significant indirect effects on educational attainment via GPA and teacher ratings of academic motivation in adolescence, and the smoking PGS also had a small indirect effect via disciplinary problems.

Cumulatively, the random intercept scores for the four academic variables and cigarettes per day accounted for about 50% of the adjusted effects of the smoking (total indirect effect β = -.08, 95% CI: -.10, -.05) and educational attainment (total indirect effect β = .12, 95% CI: .10, .14) PGSs on educational attainment in adulthood. Finally, the smoking (β = -.07, 95% CI: -.11, -.04) and educational attainment (β = .11, 95% CI: .08, .15) PGSs continued to have small but significant direct effects on educational attainment in adulthood, even after adjusting for their overlap, the four adolescent academic variables, and cigarettes per day.
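The "about 50%" figure can be reproduced from the standardized paths in Table 4 with the product-of-coefficients rule: each specific indirect effect is the PGS → RI path multiplied by the corresponding RI → Educational Attainment path, and the proportion mediated is the total indirect effect divided by the total (direct plus indirect) effect. The sketch below is an illustration, not the authors' code. It uses the two-decimal coefficients from Table 4, so the products only approximate the reported indirect effects, which were estimated from unrounded values; the variable and function names are hypothetical.

```python
# Standardized paths copied from Table 4 (rounded to two decimals there).
mediators = [
    "GPA",
    "Academic motivation (self/parent)",
    "Academic motivation (teacher)",
    "Disciplinary problems",
    "Cigarettes per day",
]
pgs_to_ri = {
    "smoking": [-0.09, -0.13, -0.13, 0.15, 0.18],
    "education": [0.22, 0.09, 0.19, -0.05, -0.10],
}
ri_to_attainment = [0.28, 0.03, 0.21, -0.08, -0.05]
direct_effect = {"smoking": -0.07, "education": 0.11}


def specific_indirect_effects(a_paths, b_paths):
    """Product of coefficients (a * b) for each mediator."""
    return [a * b for a, b in zip(a_paths, b_paths)]


summary = {}
for pgs, a_paths in pgs_to_ri.items():
    specific = specific_indirect_effects(a_paths, ri_to_attainment)
    total_indirect = sum(specific)
    total_effect = total_indirect + direct_effect[pgs]
    summary[pgs] = {
        "specific": dict(zip(mediators, specific)),
        "total_indirect": total_indirect,
        "proportion_mediated": total_indirect / total_effect,
    }
    print(
        f"{pgs}: total indirect = {total_indirect:.3f}, "
        f"proportion mediated = {total_indirect / total_effect:.2f}"
    )
```

With these rounded inputs the total indirect effects come out near -.08 for the smoking PGS and .11 for the education PGS, and the proportion mediated is slightly above one half for both, in line with the text.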

Discussion

The results provided strong evidence that PGSs for smoking and for educational attainment each predicted educational attainment in adulthood. Most importantly, our analyses demonstrated that genetic influences on smoking provide incremental prediction of educational attainment, even after accounting for a PGS specifically designed to predict educational attainment. This indicates that PGSs calibrated on different phenotypes can provide additional information about genetic influences on a target phenotype, even one that has already been the subject of large gene discovery analyses. Our results also illustrate the complexity of a distal outcome such as educational attainment, which is the result of the cumulative influences of numerous genetic and environmental processes.

To begin to parse these processes, we took a developmental approach and also examined associations between the smoking and educational attainment PGSs and several variables associated with academic adjustment in childhood and adolescence. Interestingly, both the smoking and educational attainment PGSs had at least small and significant associations with each facet of academic adjustment we examined, indicating each PGS measures non-specific genetic influences that contribute to a variety of intermediate academic variables. However, there were some indications of specificity for the PGSs, especially after adjusting for their overlap. Specifically, the educational attainment PGS had its strongest association with GPA while the smoking PGS had its strongest association among the academic variables with disciplinary problems, and the strongest association for one PGS was the weakest for the other (see Table 4). These results are relatively intuitive given that, of the variables we examined, grades in middle and high school were the most predictive of later educational attainment [5], and disciplinary problems were the variable most strongly associated with externalizing problems, with which smoking is highly correlated [20].

Given the non-specific associations of both the smoking and educational attainment PGSs, it will be important to continue to establish their construct validity, that is, what these scores measure in terms of their phenotypic associations and the biological processes associated with the specific genes driving their effects. Substantial evidence is mounting that the smoking PGS measures the broader construct of behavioral disinhibition rather than the narrow phenotype of nicotine addiction [19], given its associations with the use of multiple substance classes, externalizing problems, antisocial peers, facets of poor academic adjustment, and low educational attainment [25–29]. The failure to detect an indirect effect of the smoking PGS on educational attainment via cigarettes per day is further evidence that the smoking PGS taps a broader behavioral style than risk for nicotine addiction specifically. The educational attainment PGS has now been associated with longitudinal measurements of reading skills, mental age on IQ tests, and grades, suggesting it measures processes associated with cognitive development and academic skill acquisition [46]. Both PGSs, however, seem to index processes that are eventually expressed as broad psychological processes that likely have cognitive (e.g., intellectual abilities), affective (e.g., positive emotions related to academic activities), and behavioral (e.g., ability to follow directions, persist in tasks, withhold responses) components that contribute to numerous traits and life outcomes [6]. Though GWAS and PGSs are important advances in behavioral genetic methods, delineating the numerous biological, contextual, and psychological linkages between specific genetic markers and life outcomes such as educational attainment will continue to be a complex task.

Cumulatively, the intermediate academic variables accounted for about 50% of the adjusted effects of the PGSs on educational attainment in adulthood. Defining the ‘environment’ broadly, these academic variables are proxies for some aspects of educational context. Because they are genetically influenced, their role in educational attainment reflects gene-environment correlation processes wherein genetic influences contribute to exposure to experiences that then contribute to later educational attainment [30, 31]. For example, genetic influences that contribute to better grades and greater academic motivation likely contribute to receiving more rewards, reinforcement, encouragement from adults for pursuing academic activities, and admission to higher education, which further facilitates reaching the maximal phenotypic expression of a person’s genetic potential.

Notably, we were only able to account for about one third of the variability in educational attainment. This was in spite of several design strengths, including relevant academic variables assessed on multiple occasions using multiple informants and several facets of academic adjustment in addition to the two PGSs. Also, the sample was not racially or ethnically diverse, which likely reduced variability relative to the United States as a whole. Unassessed variables may account for substantial portions of additional variance in educational attainment, such as family attitudes about education and the availability of resources to contribute to obtaining higher levels of education [47]. Whether a person pursues advanced education, however, depends on both idiosyncratic and social-structure factors such as availability of job opportunities not requiring additional education, family and partner relationships, specific academic experiences (e.g., satisfying versus dissatisfying), financial constraints, stereotypes about pursuing certain fields of interest, and incentives to return to school after an extended hiatus. Such factors were not well captured in our models.

The study had other limitations. The PGSs did not identify specific genetic variants that point to biological processes that might account for their associations with educational attainment. Functional genomic information is needed to understand the biological processes accounting for these associations [48, 49]. Also, while the hope is that PGSs will eventually have practical value in predicting individual outcomes and informing intervention efforts, this is not yet viable given the small effect sizes. Further, the sample was restricted to people of European ancestry and persons growing up in Minnesota so it is unclear whether the results generalize to other ancestral groups with different allelic frequencies, or societies with different educational systems (e.g., societies with weaker educational infrastructure and fewer opportunities or those with universal access to higher education). Additionally, societal influences related to racial, ethnic, and gender inequities and discrimination in education and cultural values and resources committed to education might moderate genetic influences measured by the PGSs [50]. Given substantial overlap between ancestry status and socially defined racial/ethnic status, efforts to improve educational outcomes using PGS approaches have the potential to increase existing disparities if these findings are only applicable to people of European ancestry or culturally defined White people, further prioritizing extending these kinds of studies to diverse ancestry and racial/ethnic groups [51].

Despite these limitations, this study extended prior work by demonstrating the incremental value of multiple PGSs to predict educational attainment and some of the intermediate phenotypes related to this distal outcome. Hopefully, continued validation of PGSs and delineation of linkages between biological and environmental processes will contribute to improved educational outcomes and human flourishing.

Data Availability

The data are available in files posted on the Open Science Framework (https://osf.io/92esr/).

Funding Statement

This work was supported by United States Public Health Service grants R37 AA009367 (McGue), R01 AA024433 (Hicks), R21 AA026632 (Wilson), T32 AA028259 (Vasiliou), and T32 AA007477 (Blow) from the National Institute on Alcohol Abuse and Alcoholism, T32 MH015755 (Cicchetti) from the National Institute of Mental Health, and R01 DA034606 (Hicks), R37 DA005147 (Iacono), R01 DA013240 (Iacono), R01 DA044283 (Vrieze), R01 DA037904 (Vrieze), and U01 DA046413 (Vrieze) from the National Institute on Drug Abuse. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Gross C, Jobst A, Jungbauer-Gans M, Schwarze J. Educational returns over the life course. Z Erziehwiss. 2011; 14: 139–53. 10.1007/s11618-011-0195-2
  • 2. Ishida H, Muller W, Ridge JM. Class origin, class destination, and education: A cross-national study of 10 industrial nations. Am J Sociol. 1995; 101(1): 145–93. 10.1086/230701
  • 3. Mackenbach JP, Stirbu I, Roskam AJR, Schaap MM, Menvielle G, Leinsalu M, et al. Socioeconomic inequalities in health in 22 European countries. N Engl J Med. 2008; 358(23): 2468–81. 10.1056/NEJMsa0707519
  • 4. McGue M, Rustichini A, Iacono WG. Cognitive, Noncognitive, and Family Background Contributions to College Attainment: A Behavioral Genetic Perspective. J Pers. 2017; 85(1): 65–78. 10.1111/jopy.12230
  • 5. Richardson M, Abraham C, Bond R. Psychological correlates of university students’ academic performance: A systematic review and meta-analysis. Psychol Bull. 2012; 138(2): 353–87. 10.1037/a0026838
  • 6. Demange PA, Malanchini M, Mallard TT, Biroli P, Cox SR, Grotzinger AD, et al. Investigating the genetic architecture of noncognitive skills using GWAS-by-subtraction. Nature Genet. 2021; 53: 35–44. 10.1038/s41588-020-00754-2
  • 7. Silventoinen K, Jelenkovic A, Sund R, Latvala A, Honda C, et al. Genetic and environmental variation in educational attainment: an individual-based analysis of 28 twin cohorts. Scientific Reports. 2020; 10. 10.1038/s41598-020-69526-6
  • 8. Branigan AR, McCallum KJ, Freese J. Variation in the heritability of educational attainment: An international meta-analysis. Soc Forces. 2013; 92(1): 109–40. 10.1093/sf/sot076
  • 9. Johnson W, McGue M, Iacono WG. Genetic and environmental influences on academic achievement trajectories during adolescence. Developmental Psychology. 2006; 42(3): 514–32. 10.1037/0012-1649.42.3.514
  • 10. Johnson W, McGue M, Iacono WG. Disruptive behavior and school grades: Genetic and environmental relations in 11-year-olds. J Educ Psychol. 2005; 97(3): 391–405. 10.1037/0022-0663.97.3.391
  • 11. Lee JJ, Wedow R, Okbay A, Kong E, Maghzian O, Zacher M, et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nature Genet. 2018; 50(8): 1112–1121. 10.1038/s41588-018-0147-3
  • 12. Torkamani A, Wineinger NE, Topol EJ. The personal and clinical utility of polygenic risk scores. Nat Rev Genet. 2018; 19(9): 581–90. 10.1038/s41576-018-0018-x
  • 13. Escobedo LG, Peddicord JP. Smoking prevalence in US birth cohorts: The influence of gender and education. Am J Public Health. 1996; 86(2): 231–6. 10.2105/ajph.86.2.231
  • 14. Hu MC, Davies M, Kandel DB. Epidemiology and correlates of daily smoking and nicotine dependence among young adults in the United States. Am J Public Health. 2006; 96(2): 299–308. 10.2105/AJPH.2004.057232
  • 15. Farrell P, Fuchs VR. Schooling and health: The cigarette connection. Journal of Health Economics. 1982; 1: 217–30. 10.1016/0167-6296(82)90001-7
  • 16. Gilman SE, Martin LT, Abrams DB, Kawachi I, Kubzansky L, Loucks EB, et al. Educational attainment and cigarette smoking: a causal association? Int J Epidemiol. 2008; 37(3): 615–24. 10.1093/ije/dym250
  • 17. Jang S, Saunders G, Liu M, 23andMe Research Team, Jiang Y, Liu D, Vrieze S. Genetic correlation, pleiotropy, and causal associations between substance use and psychiatric disorder. Psychological Medicine. 2020: 1–11. 10.1017/S003329172000272X
  • 18. Quach BC, Bray MJ, Gaddis NC, Liu M, Palviainen T, Minica CC, et al. Expanding the genetic architecture of nicotine dependence and its shared genetics with multiple traits. Nature Communications. 2020; 11. 10.1038/s41467-020-19265-z
  • 19. Iacono WG, Malone SM, McGue M. Behavioral disinhibition and the development of early-onset addiction: Common and specific influences. Annual Review of Clinical Psychology. 2008; 4: 325–48. 10.1146/annurev.clinpsy.4.022007.141157
  • 20. Hinshaw SP. Externalizing behavior problems and academic underachievement in childhood and adolescence: Causal relationships and underlying mechanisms. Psychol Bull. 1992; 111(1): 127–55. 10.1037/0033-2909.111.1.127
  • 21. Esch P, Bocquet V, Pull C, Couffignal S, Lehnert T, Graas M, et al. The downward spiral of mental disorders and educational attainment: a systematic review on early school leaving. BMC Psychiatry. 2014; 14: 13. 10.1186/s12888-014-0237-4
  • 22. McGue M, Iacono WG, Krueger R. The association of early adolescent problem behavior and adult psychopathology: A multivariate behavioral genetic perspective. Behavior Genetics. 2006; 36(4): 591–602. 10.1007/s10519-006-9061-z
  • 23. Jessor R, Jessor SL. Problem behavior and psychosocial development. New York: Academic Press; 1977.
  • 24. McGue M, Iacono WG. The association of early adolescent problem behavior with adult psychopathology. Am J Psychiat. 2005; 162(6): 1118–24. 10.1176/appi.ajp.162.6.1118
  • 25. Liu MZ, Jiang Y, Wedow R, Li Y, Brazel DM, Chen F, et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nature Genet. 2019; 51(2): 237–244. 10.1038/s41588-018-0307-5
  • 26. Chang L-H, Couvy-Duchesne B, Liu M, Medland SE, Verhulst B, Benotsch EG, et al. Association between polygenic risk for tobacco or alcohol consumption and liability to licit and illicit substance use in young Australian adults. Drug and Alcohol Dependence. 2019; 197: 271–279. 10.1016/j.drugalcdep.2019.01.015
  • 27. Deak JD, Clark DA, Liu M, Schaefer JD, Jang S, Durbin CE, et al. Polygenic risk scores are associated with the development of alcohol and nicotine use problems from adolescence through young adulthood. 2020, July 2. 10.31234/osf.io/axkqb
  • 28. Waszczuk MA, Miao J, Docherty AR, Shabalin AA, Jonas KG, Michelini G, et al. General vs. specific vulnerabilities: Polygenic risk scores and higher-order psychopathology dimensions in the Adolescent Brain Cognitive Development (ABCD) Study. 2020, May 22. 10.31234/osf.io/km6v3
  • 29. Hicks BM, Clark DA, Deak JD, Liu M, Durbin CE, Schaefer JD, et al. Polygenic score for smoking is associated with externalizing psychopathology and disinhibited personality traits but not internalizing psychopathology in adolescence. Clinical Psychological Science. 2021. 10.1177/21677026211002117
  • 30. Scarr S, McCartney K. How people make their own environments: A theory of genotype-environment effects. Child Development. 1983; 54(2): 424–35. 10.1111/j.1467-8624.1983.tb03884.x
  • 31. Krapohl E, Hannigan LJ, Pingault JB, Patel H, Kadeva N, Curtis C, et al. Widespread covariation of early environmental exposures and trait-associated polygenic variation. Proc Natl Acad Sci U S A. 2017; 114(44): 11727–32. 10.1073/pnas.1707178114
  • 32. Wilson S, Haroian K, Iacono WG, Krueger RF, Lee JMJ, Luciana M, et al. Minnesota Center for Twin and Family Research. Twin Res Hum Genet. 2019; 22(6): 746–52. 10.1017/thg.2019.107
  • 33. Iacono WG, Carlson SR, Taylor J, Elkins IJ, McGue M. Behavioral disinhibition and the development of substance use disorders: Findings from the Minnesota Twin Family Study. Dev Psychopathol. 1999; 11: 869–900. 10.1017/s0954579499002369
  • 34. Das S, Forer L, Schonherr S, Sidore C, Locke AE, Kwong A, et al. Next-generation genotype imputation service and methods. Nature Genet. 2016; 48(10): 1284–7. 10.1038/ng.3656
  • 35. McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nature Genet. 2016; 48(10): 1279–83. 10.1038/ng.3643
  • 36. Vilhjalmsson BJ, Yang J, Finucane HK, Gusev A, Lindstrom S, Ripke S, et al. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am J Hum Genet. 2015; 97(4): 576–92. 10.1016/j.ajhg.2015.09.001
  • 37. Chang CC, Chow CC, Tellier L, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015; 4: 16. 10.1186/s13742-015-0047-8
  • 38. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nature Genet. 2006; 38(8): 904–9. 10.1038/ng1847
  • 39. Hamaker EL, Kuiper RM, Grasman R. A Critique of the Cross-Lagged Panel Model. Psychol Methods. 2015; 20(1): 102–16. 10.1037/a0038889
  • 40. Kwok OM, West SG, Green SB. The impact of misspecifying the within-subject covariance structure in multiwave longitudinal multilevel models: a Monte Carlo study. Multivariate Behav Res. 2007; 42(3): 557–92. 10.1080/00273170701540537
  • 41. Sivo S, Fan XT, Witta L. The biasing effects of unmodeled ARMA time series processes on latent growth curve model estimates. Struct Equ Modeling. 2005; 12(2): 215–31. 10.1207/s15328007sem1202_2
  • 42. Muthen LK, Muthen BO. Mplus User’s Guide. 8th ed. Los Angeles, CA: Muthen & Muthen; 2020.
  • 43. Falk CF. Are robust standard errors the best approach for interval estimation with nonnormal data in structural equation modeling? Struct Equ Modeling. 2018; 25(2): 244–66. 10.1080/10705511.2017.1367254
  • 44. Hallquist MN, Wiley JF. MplusAutomation: An R package for facilitating large-scale latent variable analyses in Mplus. Struct Equ Modeling. 2018; 25(4): 621–38. 10.1080/10705511.2017.1402334
  • 45. R Core Team. R: A language and environment for statistical computing. 3.6.0 ed. Vienna, Austria: R Foundation for Statistical Computing; 2019.
  • 46. Belsky DW, Moffitt TE, Corcoran DL, Domingue B, Harrington H, Hogan S, et al. The genetics of success: How single-nucleotide polymorphisms associated with educational attainment relate to life-course development. Psychol Sci. 2016; 27(7): 957–72. 10.1177/0956797616643070
  • 47. Guo G, Stearns E. The social influences on the realization of genetic potential for intellectual development. Soc Forces. 2002; 80(3): 881–910.
  • 48. Salvatore JE, Savage JE, Barr P, Wolen AR, Aliev F, Vuoksimaa E, et al. Incorporating functional genomic information to enhance polygenic signal and identify variants involved in gene-by-environment interaction for young adult alcohol problems. Alcoholism Clin Exp Res. 2018; 42(2): 413–23. 10.1111/acer.13551
  • 49. Kichaev G, Bhatia G, Loh PR, Gazal S, Burch K, Freund MK, et al. Leveraging Polygenic Functional Enrichment to Improve GWAS Power. Am J Hum Genet. 2019; 104(1): 65–75. 10.1016/j.ajhg.2018.11.008
  • 50. Bailey ZD, Krieger N, Agenor M, Graves J, Linos N, Bassett MT. Structural racism and health inequities in the USA: evidence and interventions. Lancet. 2017; 389(10077): 1453–63. 10.1016/S0140-6736(17)30569-X
  • 51. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nature Genet. 2019; 51(4): 584–91. 10.1038/s41588-019-0379-x

Decision Letter 0

Edelyn Verona

23 Apr 2021

PONE-D-21-06130

Polygenic scores for smoking and educational attainment have Independent Influences on academic success and adjustment in adolescence and educational attainment in adulthood

PLOS ONE

Dear Dr. Hicks,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

I was able to obtain two reviews from scholars with expertise on the topic and methods of your study. I reviewed the paper independently. The reviewers were positive about the article; it is well written, with a large and informative dataset. The paper is clear in focus, and the analytic models are thoroughly described and justified. There are some issues that would need to be resolved if the paper is to be accepted for publication in PLOS ONE. First, both reviewers requested more justification for the use of the nicotine PGS to capture genetic variance associated with behavioral disinhibition, which should relate to eventual educational attainment. I was also curious about how you would interpret the positive correlation between the two PGS scores (.23), and their opposite relationships with external criteria. Second, more information is needed on the retention rate by age 29, and how many participants were missing educational attainment data at age 29 (and what assessment time point was used as a substitution in those cases). Third, the size and meaningfulness of the effect sizes can be emphasized much more, and conclusions should align with expected impact of the findings.

As Reviewer 2 also noted, the correlations between the education PGS and academic variables are shown as negative in Table 3, whereas the regression coefficients (Table 2) are all positive. I assume these are typos?

The reviewers made other comments that you should attend to in your response and revision. Thank you for submitting your research for consideration in PLOS ONE.

Please submit your revised manuscript by Jun 06 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Edelyn Verona

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

  1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

  2. PLOS requires an ORCID iD for the corresponding author in Editorial Manager on papers submitted after December 6th, 2016. Please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager. Please see the following video for instructions on linking an ORCID iD to your Editorial Manager account: https://www.youtube.com/watch?v=_xcclfuvtxQ

  3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

4. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This is a nice and clear paper on the relationship of PRS for nicotine and educational attainment on academic process indicators (motivation, gpa, disciplinary issues) as well as measured educational achievement at age 29. A mediation analysis showed that academic process indicators partially mediated relationship between PRSs and outcome. I have several concerns, questions, and suggestions for this paper.

First, it might be good to be clear why PRS for nicotine and educational attainment – rather than for behavioral disinhibition and cognitive ability – are being used. I understand the argument that the nic PRS predicts a host of externalizing psychopathology, but one could make the argument that it’s missing some of the effects to broader externalizing/behavioral disinhibition.

In the literature, are there any LD regressions that document the rG between educational attainment and nicotine?

I was not entirely clear how the authors are testing rGE from their intro setup or their methods. Academic process and outcome variables are going to be influenced by both genes and environment, most certainly. How is a correlation between e.g., nicotine PRS with academic process variables a test of rGE? Either be clear about the logic or tone it down.

Educational attainment was taken from age 17 assessment if age 29 assessment was missing. If someone has e.g., really good grades and is academically on track at age 17, but fails to come in at age 29 is classified as ‘completed high school’, isn’t that a possible misclassification? How do the authors get around that and why not just treat the missing people as missing? Also, what were the retention rates at age 29 from baseline and also from age 17 – maybe I’m missing this portion?

Table 2 was a tad confusing, especially the columns relating to the 2 PRS analysis. If there are two PRSs in that analysis, why is there only a single value for each criterion? I might be missing something obvious here, perhaps.

Possible rater and sex effects. The authors are using sex as a covariate, and they are also collapsing mom and child reports of educational process. This is reasonable, but the readers might also want to know if there are rater and sex effects. It might be good to redo the models by rater and separately by sex and supplement the information (even though the sex effects models might have power issues, they might also highlight if the effects are particularly strong in one sex).

Typo – page 13 – last line in bracket – should be beta rather than a square (something happened with formatting)

Reviewer #2: # PONE-D-21-06130 Polygenic scores for smoking and educational attainment have Independent Influences on academic success and adjustment in adolescence and educational attainment in adulthood

In participants of the Minnesota Twin Family Study, polygenic scores (PGS) for smoking and educational attainment from large GWAMA consortia (after excluding sample overlap) proved to be significant and independent predictors of grades, academic motivation, and discipline problems at ages 11, 14, and 17 years-old, and of educational attainment at age 29. About half of the adjusted effects of the smoking and educational PGSs on educational attainment at age 29 were mediated by the academic variables in adolescence.

The paper is well-written and well-structured and the polygenic risk score methods used are sound and clearly described. My two concerns are (1) the use of a smoking PRS as a stand-in for a behavioural inhibition PRS and (2) the very small added predictive value of both PRS for educational attainment compared to the observed academic variables (3%).

The empirical results are based on a smoking PRS. I believe it would be best to stick with that concept up till the discussion. While I find the idea that the smoking PRS partly captures behavioral disinhibition not unreasonable, I see it more as an annotation of the results that should be put in the discussion and not to be claimed up front as early as the abstract and introduction as ‘a measure of genetic influences on behavioral disinhibition’.

The predictive effect sizes of the two PRS are very small – I find these results not very compatible with the strong conclusions in the summary in the discussion ‘The results provided strong evidence that PGSs for smoking—a measure of genetic influences on behavioral disinhibition—and for educational attainment each predicted educational attainment in adulthood. Most importantly, our analyses demonstrated that genetic influences on behavioral disinhibition provide incremental prediction of educational attainment, even after accounting for a PGS specifically designed to predict educational attainment.’

MINOR:

In table 3 the educational attainment PGS is negatively associated with the academic skills and with educational attainment at age 29. Is this an error or am I missing a reverse coding step? In table 2 the sign of the prediction beta seems OK. Confusing.

Does the .23 correlation between the PGS for smoking and educational attainment reflect pleiotropy? How does it affect the method used?

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Aug 17;16(8):e0255348. doi: 10.1371/journal.pone.0255348.r002

Author response to Decision Letter 0


1 Jun 2021

28 May 2021

Dear Dr. Verona,

We are submitting a revised manuscript entitled, “Polygenic Scores for Smoking and Educational Attainment have Independent Influences on Academic Success and Adjustment in Adolescence and Educational Attainment in Adulthood,” for publication in PLOS One.

The review was very positive, but also noted several points for improvement including some errors in the original manuscript. Also, note that the data has now been uploaded to an OSF page (https://osf.io/92esr/).

Below, we describe revisions in response to feedback from you and the anonymous reviewers:

1) Confusion regarding the direction of associations among variables:

“I was also curious about how you would interpret the positive correlation between the two PGS scores (.23), and their opposite relationships with external criteria.”

“As Reviewer 2 also noted, the correlations between the education PGS and academic variables are shown as negative in Table 3, whereas the regression coefficients (Table 2) are all positive. I assume these are typo’s?”

Unfortunately, there were several typos in the manuscript, which seem to have been a function of having different analysts calculate each polygenic score (PGS) and a third lead analyst fit the regression/SEM models. After a thorough review, all discrepancies have been resolved. Specifically, the smoking and educational attainment PGSs are negatively correlated (r = -.23). The smoking PGS is negatively correlated with the educational attainment outcome and the adolescent academic variables, and positively correlated with disciplinary problems. The educational attainment PGS is positively correlated with the educational attainment outcome and the adolescent academic variables, and negatively correlated with disciplinary problems.

2) More information about the educational attainment outcome:

“Second, more information is needed on the retention rate by age 29, and how many participants were missing educational attainment data at age 29 (and what assessment time point was used as a substitution in those cases).”

“Educational attainment was taken from age 17 assessment if age 29 assessment was missing. If someone has e.g., really good grades and is academically on track at age 17, but fails to come in at age 29 is classified as ‘completed high school’, isn’t that a possible misclassification? How do the authors get around that and why not just treat the missing people as missing? Also, what were the retention rates at age 29 from baseline and also from age 17 – maybe I’m missing this portion?”

We have since learned that the educational attainment variable is more complex than originally described. Rather than all participants reporting educational attainment at age 29, the variable was coded from the last post-high-school assessment at which the participant reported on their educational experiences. Consequently, while the mean, median, and modal age for educational attainment was 29.4 years old (SD = 3.9 years), the range was 19.7 to 39.9 years old (the older ages are due to a subset of twins participating in a target age 34 assessment). If a participant did not have information on educational experiences in adulthood (i.e., target age 20 assessment or older), their educational attainment was coded as missing. As such, educational attainment data were available for 95.2% (3070/3225) of the sample with PGS data and 92.1% (3463/3762) of the total sample. Because there was a significant correlation between age and educational attainment (r = .27, p < .001), we regressed educational attainment on age and used the unstandardized residual score in all analyses. These details regarding the educational attainment outcome variable have been included in the Methods (p. 10) and are provided below:

Educational Attainment. We used the last assessment that a twin reported on their educational experiences (mean and median age 29.4 years, range 19.7 to 39.9 years) to code their highest level of educational attainment (n = 3463; 92.1% of the total sample). The educational attainment variable was coded as follows: 1 = less than high school diploma (9.5%), 2 = high school graduate or GED (9.8%), 3 = vocational degree, some college, or an associate’s degree (30.5%), 4 = bachelor’s degree (31.9%), 5 = master’s level degree (8.8%), 6 = PhD or other advanced professional degree (e.g., MD, JD) (4.2%). Educational attainment was coded as missing (7.9%; n = 299) if the participant did not have data for a post high school assessment (i.e., age 20 or older). Because there was a significant correlation between age and educational attainment (r = .27, p < .001), we regressed educational attainment on age, and used the unstandardized residual score in all analyses.
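The age adjustment described above (regress the outcome on age, keep the unstandardized residual) can be sketched as follows. This is only a minimal illustration, not the authors' analysis code: the function name and the toy values are hypothetical, and a simple one-predictor least-squares fit is assumed.

```python
from statistics import mean


def residualize(y, x):
    """Return unstandardized residuals of y after a simple OLS regression on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual = observed value minus the value predicted from age
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical toy data (NOT from the study): age at last report and the
# 1-6 educational attainment code for five participants.
ages = [19.7, 24.0, 29.4, 34.1, 39.9]
attainment = [2, 3, 4, 4, 5]

residuals = residualize(attainment, ages)
```

By construction the residuals sum to zero and are uncorrelated with age, so any remaining association with a PGS cannot be attributed to differences in age at assessment.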

3) Conceptualization of smoking PGS as a measure of genetic influences for behavioral disinhibition:

“First, both reviewers requested more justification for the use of the nicotine PGS to capture genetic variance associated with behavioral disinhibition, which should relate to eventual educational attainment.”

“First, it might be good to be clear why PRS for nicotine and educational attainment – rather than for behavioral disinhibition and cognitive ability – are being used. I understand the argument that the nic PRS predicts a host of externalizing psychopathology, but one could make the argument that it’s missing some of the effects to broader externalizing/behavioral disinhibition. In the literature, are there any LD regressions that document the rG between educational attainment and nicotine?”

“The empirical results are based on a smoking PRS. I believe it would be best to stick with that concept up till the discussion. While I find the idea that the smoking PRS partly captures behavioral disinhibition not unreasonable, I see it more as an annotation of the results that should be put in the discussion and not to be claimed up front as early as the abstract and introduction as ‘a measure of genetic influences on behavioral disinhibition’.”

We have made a number of changes to address these comments. First, we have added a paragraph describing the association between smoking and educational attainment, and the potential role of common genetic influences in accounting for this association (p. 5); see below:

Smoking is a non-cognitive trait that has a strong association with lower educational attainment (13, 14). Rather than a direct causal effect of education on smoking (or vice versa), however, there has been a long recognition that this association is due to the common influences of third variables. For example, smoking patterns tend to be established in the late teens to early 20’s, which is prior to the completion of higher education but later than when consistent individual differences in factors strongly related to educational attainment (e.g., GPA, academic motivation, discipline problems) have emerged (15). Further, sib-pair difference analyses have found that familial factors account for the association between smoking and educational attainment (16). Finally, recent GWAS findings have estimated genetic correlations from r = .27 to .56 between smoking and educational attainment phenotypes, indicating at least some of their familial association is due to common genetic influences (17, 18).

While we cut some passages that refer to the smoking PGS simply as a measure of behavioral disinhibition, we think it is important to retain the conceptualization that the smoking PGS operates more as a measure of genetic influences related to a broad behavioral disinhibition trait than as a measure of narrow susceptibility to nicotine addiction. However, we now provide more rationale that smoking is highly correlated with externalizing problems and other substance use, which are all manifestations of a broad behavioral disinhibition trait. Further, we now frame the paper as a test of the hypothesis that the smoking PGS is associated with educational attainment due to its overlap with behavioral disinhibition. We also provide more details about recent studies consistent with the hypothesis that the smoking PGS is a measure of genetic influences on behavioral disinhibition. See the relevant text below, which directly follows the paragraph above (pp. 5-6):

Behavioral disinhibition refers to difficulty inhibiting impulses to engage in socially undesirable or restricted actions (19) and is another non-cognitive trait that has been associated with academic success (4). Externalizing problems are manifestations of these poor inhibitory abilities and include impulsivity, aggression, rule breaking, oppositionality, hyperactivity, and inattention. They are associated with lower grades, poor academic motivation, and more disciplinary problems, and predict lower educational attainment (20, 21), with most of the overlap attributable to shared genetic influences (9). Smoking, especially in adolescence, is strongly correlated with externalizing behaviors, alcohol use, and other drug use, all of which are manifestations of a higher-order behavioral disinhibition trait (19, 22-24). It is possible then that the association between smoking and educational attainment is actually due to the overlap between smoking and the broader trait of behavioral disinhibition.

Recently, we examined the predictive validity of a PGS for having ever been a regular smoker that was derived from the largest GWAS of smoking-related phenotypes to date (N = 1,232,091)(25). In replication samples, this PGS accounted for 4% of the variance in a similar smoking phenotype and was also significantly associated with use measures of alcohol, cannabis, cocaine, amphetamines, ecstasy, and hallucinogens (25, 26). Using the same twin sample as in this report, we found that this smoking PGS predicted trajectories of nicotine and alcohol use from ages 14 to 34, even after adjusting for nicotine and alcohol use and a PGS for drinks per week (27).

This smoking PGS was also associated with the externalizing dimension of the Child Behavior Checklist in a large sample of pre-adolescents, even after adjusting for a general factor of psychopathology (28). We followed up these results and found that the smoking PGS was associated with externalizing problems and personality traits associated with behavioral control—but not internalizing problems and extraversion—from ages 11 to 17 (29). We concluded that the smoking PGS was also a measure of genetic influences on general behavioral disinhibition rather than smoking or nicotine addiction specifically, and so could be used to investigate the role that genetic influences related to behavioral disinhibition have on the development of other near-neighbor outcomes.

Finally, we added a random intercept variable for cigarettes per day at ages 14, 17, 21, and 24 years old to the mediation model. The rationale for this is to test whether cigarettes per day, an expressed phenotype for the smoking PGS, either accounted for or provided an indirect path from the smoking PGS to educational attainment. We found that cigarettes per day did not account for the effect of the smoking PGS on educational attainment, and, in fact, the smoking PGS continued to have a significant direct effect on educational attainment even after accounting for all the other variables in the model. Further, cigarettes per day did not even provide a significant indirect path for the effect of the smoking PGS on educational attainment, but the academic variables of GPA and disciplinary problems did. We think this is additional and strong evidence that the smoking PGS measures genetic influences on a broad behavioral style consistent with conceptualizations of behavioral disinhibition. See text below from the Results (pp. 17-19):

Cigarettes per day

In the RI-PM, both the smoking (standardized B = .20, 95% CI: .16, .24) and educational attainment (B = -.14, 95% CI: -.18, -.10) PGSs had significant associations with the random intercept factor for cigarettes per day (see Table 2). These effects remained significant after adjusting for their overlap, though the effects declined by about 29% for the educational attainment PGS and 10% for the smoking PGS.

Educational attainment in adulthood

Both the smoking (standardized B = -.19, 95% CI: -.24, -.15) and educational attainment (B = .26, 95% CI: .22, .30) PGSs had significant associations with educational attainment in adulthood (see Table 2). These effects remained significant after adjusting for their overlap, though the effects declined by about 26% for the smoking PGS and 12% for the educational attainment PGS. Table 3 includes the correlations among the smoking and educational PGSs, estimated scores for random intercept factors of the four academic variables in adolescence and cigarettes per day, and educational attainment in adulthood. The four academic variables had large associations with each other (mean r = |.53|) and educational attainment (r’s = |.35| to |.52|; R2 = .34). Cigarettes per day also had a robust association with educational attainment (r = -.30).
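The size of the decline after mutual adjustment can be approximated from the reported values alone. The sketch below is a rough check under stated assumptions, not the authors' model: it treats the single-PGS standardized coefficients (-.19 and .26) as zero-order correlations, uses the corrected PGS correlation of -.23, ignores covariates and twin clustering, and applies the textbook formula for standardized coefficients with two correlated predictors. Function names are hypothetical.

```python
def adjusted_betas(r_y1, r_y2, r_12):
    """Standardized betas for two correlated predictors of one outcome.

    r_y1, r_y2: correlations of predictors 1 and 2 with the outcome.
    r_12: correlation between the two predictors.
    """
    denom = 1 - r_12 ** 2
    b1 = (r_y1 - r_12 * r_y2) / denom
    b2 = (r_y2 - r_12 * r_y1) / denom
    return b1, b2


def percent_decline(unadjusted, adjusted):
    """Percent reduction in absolute magnitude after adjustment."""
    return 100 * (1 - abs(adjusted) / abs(unadjusted))


# Reported single-PGS associations with adult educational attainment,
# used here as if they were simple correlations (an assumption).
r_smoking, r_education, r_pgs = -0.19, 0.26, -0.23

b_smoking, b_education = adjusted_betas(r_smoking, r_education, r_pgs)
decline_smoking = percent_decline(r_smoking, b_smoking)
decline_education = percent_decline(r_education, b_education)
```

Under these simplifications the computed declines land close to the reported figures of about 26% and 12%; the small differences are expected because the fitted models include additional terms.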

Results from the mediation model that estimated the direct and indirect effects of the smoking and educational attainment PGSs via the academic variables in adolescence and cigarettes per day on educational attainment in adulthood are presented in Table 4. Inclusion of the smoking and educational attainment PGSs resulted in a significant increase in R2 (ΔR2 = .02; R2 = .36; Δχ2(2) = 89.56, p < .001) over and above the four adolescent academic variables and cigarettes per day. Both the smoking (B = |.09| to |.18|) and educational attainment (B = |.05| to |.22|) PGSs had significant associations with the random intercept scores for each academic variable in adolescence and cigarettes per day.
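As a quick check on the reported model comparison: for a chi-square statistic with 2 degrees of freedom, the upper-tail probability has the closed form exp(-x/2), so the p-value can be computed without a statistics library. The function name below is hypothetical and the snippet is only an illustrative sketch.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variate with exactly 2 df."""
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    # With 2 df the chi-square distribution is exponential with mean 2
    return math.exp(-x / 2)


# Reported chi-square difference for adding the two PGSs (2 df)
p_value = chi2_sf_df2(89.56)
significant_at_001 = p_value < 0.001
```

The resulting probability is far below .001, consistent with the reported significance level.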

Random intercept scores for GPA, teacher ratings of academic motivation, disciplinary problems, and cigarettes per day were in turn significantly associated with educational attainment in adulthood (last row Table 4). These effects adjusted for the common variance among all the predictors, and so were substantially smaller than the unadjusted correlations, but were still robust for GPA (B = .28, 95% CI: .24, .28) and teacher ratings of academic motivation (B = .21, 95% CI: .16, .26) and small for disciplinary problems (B = -.08, 95% CI: -.12, -.04) and cigarettes per day (B = -.05, 95% CI: -.09, -.02). Consequently, the smoking and educational attainment PGSs each had small but statistically significant indirect effects on educational attainment via GPA and teacher ratings of academic motivation in adolescence, and the smoking PGS also had a small indirect effect via disciplinary problems.

Cumulatively, the random intercept scores for the four academic variables and cigarettes per day accounted for about 50% of the adjusted effects of the smoking (B = -.08, 95% CI: -.10, -.05) and educational attainment (B = .12, 95% CI: .10, .14) PGSs on educational attainment in adulthood. Finally, the smoking (B = -.07, 95% CI: -.11, -.04) and educational attainment (B = .12, 95% CI: .10, .14) PGSs continued to have small but significant direct effects on educational attainment in adulthood, even after adjusting for their overlap, the four adolescent academic variables and cigarettes per day.
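The "about 50%" figure follows from the usual decomposition of a total effect into indirect and direct parts. The sketch below assumes that the first pair of coefficients reported above are the summed indirect effects and the second pair are the direct effects, and that the adjusted total effect is their sum; the function name is hypothetical and this is not the authors' code.

```python
def proportion_mediated(indirect, direct):
    """Share of the total effect (indirect + direct) carried by the mediators."""
    total = indirect + direct
    if total == 0:
        raise ValueError("total effect is zero; proportion is undefined")
    return indirect / total


# Reported standardized effects on adult educational attainment
smoking_share = proportion_mediated(indirect=-0.08, direct=-0.07)
education_share = proportion_mediated(indirect=0.12, direct=0.12)
```

With the rounded coefficients reported here, both shares come out at roughly one half.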

Given all these results, we think it is best to frame the paper around the conceptualization of the smoking PGS being related to behavioral disinhibition early in the manuscript. One, that was our original conceptualization of the analysis. Two, we think this is the most reasonable interpretation of the results. Three, it is much easier for the reader to understand and interpret the analytic strategy and results if the framing is done at the beginning of the paper.

4) Interpretations of results regarding associations between PGS and educational attainment:

“Third, the size and meaningfulness of the effect sizes can be emphasized much more, and conclusions should align with expected impact of the findings.”

“The predictive effect sizes of the two PRS are very small – I find these results not very compatible with the strong conclusions in the summary in the discussion ‘The results provided strong evidence that PGSs for smoking—a measure of genetic influences on behavioral disinhibition—and for educational attainment each predicted educational attainment in adulthood. Most importantly, our analyses demonstrated that genetic influences on behavioral disinhibition provide incremental prediction of educational attainment, even after accounting for a PGS specifically designed to predict educational attainment.’”

We have reviewed the manuscript and find that all our statements are consistent with the data analysis presented. Note that when we state, “The results provide strong evidence that PGSs for smoking and educational attainment each predicted educational attainment in adulthood.”, we do not state that the PGSs exhibit large effect sizes in their association with educational attainment in adulthood. We think the phrase “the evidence is strong” is appropriate to describe the situation wherein a biological measure that can be assessed in childhood predicts a complex adult outcome, even after accounting for several much more proximal covariates (both in terms of age and phenotypic overlap with the outcome). This is especially true for the smoking PGS, because, in theory, the educational attainment PGS should account for all the genetic influences on the educational attainment phenotype that can be measured using a common array of genetic markers (our results indicate this is not the case). Including cigarettes per day in the analysis, a putatively expressed phenotype of the smoking PGS that has a robust association with educational attainment, further increases the rigor and riskiness of the test of the ability of the smoking and education PGS to predict educational attainment. We also devote substantial text in the Discussion to various limitations to the study including the size of the effects, generalizability, and other potentially relevant variables that influence the strength of inferences that can be drawn given the data and analysis. See text below (pp. 23-24):

Notably, we were only able to account for about one third of the variability in educational attainment. This was in spite of several design strengths including inclusion of relevant academic variables assessed on multiple occasions using multiple informants and several facets of academic adjustment in addition to the two PGSs. Also, the sample was not racially or ethnically diverse, which reduces variability in the United States. Unassessed variables may account for substantial portions of additional variance in educational attainment, such as family attitudes about education and the availability of resources to contribute to obtaining higher levels of education (47). Whether a person pursues advanced education, however, depends on both idiosyncratic and social-structure factors such as availability of job opportunities not requiring additional education, family and partner relationships, specific academic experiences (e.g., satisfying versus dissatisfying), financial constraints, stereotypes about pursuing certain fields of interest, and incentives to return to school after an extended hiatus. Such factors were not well captured in our models.

The study had other limitations. The PGSs did not identify specific genetic variants that point to biological processes that might account for their associations with educational attainment. Functional genomic information is needed to understand the biological processes accounting for these associations (48, 49). Also, while the hope is that PGSs will eventually have practical value in predicting individual outcomes and informing intervention efforts, this is not yet viable given the small effect sizes. Further, the sample was restricted to people of European ancestry and persons growing up in Minnesota so it is unclear whether the results generalize to other ancestral groups with different allelic frequencies, or societies with different educational systems (e.g., societies with weaker educational infrastructure and fewer opportunities or those with universal access to higher education). Additionally, societal influences related to racial, ethnic, and gender inequities and discrimination in education and cultural values and resources committed to education might moderate genetic influences measured by the PGSs (50). Given substantial overlap between ancestry status and socially defined racial/ethnic status, efforts to improve educational outcomes using PGS approaches have the potential to increase existing disparities if these findings are only applicable to people of European ancestry or culturally defined White people, further prioritizing extending these kinds of studies to diverse ancestry and racial/ethnic groups (51).

5) Presentation of PGS regression results.

“Table 2 was a tad confusing, especially the columns relating to the 2 PRS analysis. If there are two PRSs in that analysis, why is there only a single value for each criterion? I might be missing something obvious here, perhaps.”

Coefficients for the 2 PGS models are presented next to the coefficients for the 1 PGS models, to facilitate examining the change in each coefficient after adjusting for its overlap with the other PGS. The 2 PGS results are only provided for the random intercept criterion variables.

6) Gene-environment correlation.

“I was not entirely clear how the authors are testing rGE from their intro setup or their methods. Academic process and outcome variables are going to be influenced by both genes and environment, most certainly. How is a correlation between e.g., nicotine PRS with academic process variables a test of rGE? Either be clear about the logic or tone it down.”

The text being referred to merely acknowledges that the intermediate phenotypes or mediating variables have both genetic and environmental contributions. To the extent we conduct a test of gene-environment correlation, it is to regress the intermediate phenotypes (variables with both genetic and environmental contributions) on the PGSs (genetic-only variables), so that a significant association would be evidence of a gene-environment correlation.

7) Possible rater and sex effects.

“Possible rater and sex effects. The authors are using sex as a covariate, and they are also collapsing mom and child reports of educational process. This is reasonable, but the readers might also want to know if there are rater and sex effects. It might be good to redo the models by rater and separately by sex and supplement the information (even though the sex effects models might have power issues, they might also highlight if the effects are particularly strong in one sex).”

These analyses provide a level of detail that is beyond the scope of our aims and so we do not include them in the manuscript. However, we have posted the data to a public server so that those interested can complete these analyses if they choose.

I confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication elsewhere. A pre-print version of this manuscript was posted on 2/3/21 (https://psyarxiv.com/mueqg/) and a pre-print version of the revision will be posted following submission of the revision. All authors have no conflicts of interest to disclose.

Thank you for considering this manuscript.

Sincerely,

Brian M. Hicks, Ph.D.

Associate Professor

Department of Psychiatry

University of Michigan Medical School

Decision Letter 1

Edelyn Verona

15 Jul 2021

Polygenic scores for smoking and educational attainment have independent influences on academic success and adjustment in adolescence and educational attainment in adulthood

PONE-D-21-06130R1

Dear Dr. Hicks,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Edelyn Verona

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: I have offered various suggestions to improve the paper. It seems that the authors are determined to stand by their choices and did provide quite a bit of text to explain why they feel no improvements were necessary.

Whereas I don't agree (particularly with regard to the 'strong evidence' statements and really also still about the use of a PRS for smoking as a proxy for dis-inhibition in general) - none of this is sufficiently grave to change my overall opinion that this should be published.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Acceptance letter

Edelyn Verona

4 Aug 2021

PONE-D-21-06130R1

Polygenic Scores for Smoking and Educational Attainment have Independent Influences on Academic Success and Adjustment in Adolescence and Educational Attainment in Adulthood

Dear Dr. Hicks:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Edelyn Verona

Academic Editor

PLOS ONE

Associated Data


    Data Availability Statement

    The data are available in files posted on the Open Science Framework (https://osf.io/92esr/).


    Articles from PLoS ONE are provided here courtesy of PLOS
