Cell Genomics. 2025 Nov 25;6(3):101075. doi: 10.1016/j.xgen.2025.101075

Polygenic scores capture genetic modification of the adiposity-cardiometabolic risk factor relationship

Kenneth E Westerman 1,2,3,8, Julie E Gervis 2,3,5, Luke J O’Connor 6,7, Miriam S Udler 2,3,4,5, Alisa K Manning 1,2,3
PMCID: PMC12985365  PMID: 41297543

Summary

Polygenic scores (PGSs) that can predict response to interventions can facilitate precision medicine and are detectable in observational datasets as PGS-by-exposure (PGS×E) interactions. PGSs based on interactions (iPGSs) or variance effects (vPGSs) may be more powerful than standard PGSs for detecting PGS×E, but these have yet to be systematically compared. We describe a generalized pipeline for developing and comparing these PGS types and apply it to detect genetic modification of the relationship between adiposity (measured by BMI) and a broad set of cardiometabolic risk factors. Our applied analysis in the UK Biobank identified significant PGS×BMI for 16/20 risk factors, most consistently for the iPGS approach. Many interactions replicated in All of Us (AoU); for example, we observed a 72% larger BMI-alanine aminotransferase association in the top iPGS decile in AoU. Our study provides a framework for the comparison of PGS×E strategies and informs efforts toward clinically useful response-focused PGSs.

Keywords: gene-environment interactions, polygenic scores, adiposity, cardiometabolic

Highlights

  • Our pipeline compares polygenic score (PGS) types for the detection of interactions

  • PGS-by-adiposity interactions impact cardiometabolic risk factors in the UK Biobank

  • PGSs built from interaction effects show more consistent and replicable interactions

  • PGSs modify the adiposity-liver biomarker association particularly strongly

Polygenic scores can improve the power to detect gene-environment interactions, with implications for genome-wide interventions. Westerman et al. introduce a framework for comparing the performance of several types of polygenic scores, built from genetic main, interaction, and variance effects. They find a broad signal for interactions with adiposity impacting cardiometabolic biomarkers.

Introduction

Clinical decision-making is often based on risk estimates, in which patients at higher risk for a disease are prioritized for lifestyle changes or pharmaceutical treatments. However, individuals can vary widely in their response to these clinical interventions,1 motivating the use of molecular measurements to predict therapeutic response and enable more targeted treatment recommendations. Genetic factors contribute to this inter-individual heterogeneity via gene-environment interactions (G×E), which quantify the genetic effects on the association between some exposure (e.g., a behavior or pharmacological treatment) and the outcome of interest. As with standard risk prediction, G×E testing in epidemiological contexts can uncover stronger effects by combining information from variants across the genome using polygenic scores (PGSs).

Several types of genome-wide statistical tests have been described for the development of PGSs for G×E testing. Summary statistics from a genome-wide association study (GWAS) can be used to produce a standard marginal PGS (mPGS). This approach has successfully detected polygenic G×E,2,3,4 but requires a strong and typically unsatisfied assumption that genetic main effects and interaction effects are proportional genome-wide.5 Alternatively, G×E effects from a genome-wide interaction study can be used to produce an interaction PGS (iPGS). Prior studies have shown that iPGSs can increase the power for PGS×E detection,6 improve genetic prediction performance (explaining more outcome variability),7,8,9,10,11,12 and predict response to interventions.5,13 Finally, genome-wide variance-quantitative trait locus (vQTL) analysis tests genetic associations with the variability, rather than the mean, of quantitative traits; such variance effects can arise from underlying interactions.14,15 These summary statistics can be aggregated into variance PGSs (vPGSs),16,17,18 which have the advantage that their development does not require the explicit modeling of often poorly measured exposure variables.

These PGS types have not yet been compared for their detection of interactions in a systematic way. Here, we propose a generalized pipeline for the generation of PGSs for PGS×E testing and compare the performance of PGSs based on each of these three genome-wide association models. We hypothesize that, by more directly quantifying effect modification, the iPGS and vPGS will detect stronger PGS×E interactions compared to the mPGS. We first explore this hypothesis through simulations, focusing on the impact of exposure distribution and measurement error on the relative performance of these PGS types for detecting interactions. We then conduct extensive analysis of genome-wide genetic modification of the strong known relationship between adiposity and cardiometabolic risk factors (CRFs) in the UK Biobank (UKB) and All of Us (AoU) datasets. As an exposure, we use body mass index (BMI), a measure of adiposity that strongly predicts chronic disease risk and participates in G×Es at the single-variant19,20 and PGS12,21 levels. As outcomes, we use a set of 20 continuous serum CRFs capturing a broad cross-section of physiological processes and genetic architectures.22 The resulting PGS for each CRF will thus quantify the expected degree of change in that CRF in response to weight change.

Results

Conceptual overview of the PGS generation pipeline

This pipeline was designed to produce multiple PGSs based on different underlying association models but test them uniformly for the modification of an exposure-outcome relationship. The analysis pipeline depicted in Figure 1 includes three steps: genome-wide association testing, PGS generation and optimization, and PGS×E testing. Genome-wide scans were performed using each of three statistical approaches. The first is the standard GWAS:

$$Y_i = \beta_0 + \beta_G G_i + \beta_C^T C_i + \epsilon_i,$$

where $Y_i$ is the outcome for individual $i$, $G_i$ is the genotype vector, $C_i$ is a vector of covariates, and $\epsilon_i$ captures the residual error. The genome-wide interaction study (iGWAS) model is a straightforward extension of the GWAS model:

$$Y_i = \beta_0 + \beta_G G_i + \beta_E E_i + \beta_{G \times E} G_i E_i + \beta_C^T C_i + \epsilon_i,$$

where additional terms have been added for $E_i$, the exposure, and its product term with $G_i$. The key estimate of interest from this model is $\beta_{G \times E}$ (the interaction effect) rather than $\beta_G$. Finally, the genome-wide vQTL study (vGWAS) models the genetic effects on trait variability, which can capture the underlying interactions without directly modeling the exposure. This produces an estimate of variability change per allele, denoted here as $\beta_{Gv}$ (see methods).
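As a minimal sketch of the GWAS and iGWAS models above for a single variant, the following fits both by ordinary least squares on simulated data. The helper `ols`, the simulated effect sizes, the allele frequency, and the single covariate are illustrative assumptions and do not come from the study; the vGWAS model is omitted because its form is given only in the methods.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares: return coefficient estimates and standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    return beta, np.sqrt(sigma2 * np.diag(XtX_inv))

rng = np.random.default_rng(0)
n = 50_000
G = rng.binomial(2, 0.3, size=n).astype(float)  # genotype dosage (0/1/2), hypothetical MAF
E = rng.gamma(shape=4.0, scale=0.5, size=n)      # nonnegative exposure with mean 2
C = rng.normal(size=n)                           # a single illustrative covariate

# Simulated outcome with arbitrary (assumed) main and interaction effects
Y = 0.10 * G + 0.30 * E + 0.05 * G * E + 0.20 * C + rng.normal(size=n)

ones = np.ones(n)

# GWAS model: Y ~ G + C; the coefficient on G is the marginal effect beta_G
b_gwas, se_gwas = ols(np.column_stack([ones, G, C]), Y)
beta_G = b_gwas[1]

# iGWAS model: Y ~ G + E + G*E + C; the coefficient on G*E is beta_GxE
b_igwas, se_igwas = ols(np.column_stack([ones, G, E, G * E, C]), Y)
beta_GxE = b_igwas[3]
z_GxE = beta_GxE / se_igwas[3]

print(f"marginal beta_G = {beta_G:.3f}")
print(f"beta_GxE = {beta_GxE:.3f} (z = {z_GxE:.1f})")
```

In this simulated setup the marginal estimate absorbs part of the interaction (roughly the interaction effect times the exposure mean), which is one reason a marginal test can carry some G×E signal when the exposure mean is far from zero.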

Figure 1. Conceptual overview of the analytical approach

(A) A PGS for response, developed using observational data, generalizes the concept of a G×E from a single genetic variant to a continuous, multi-variant score.

(B) Various genome-wide scans, using individual-level data, produce PGSs that might interact with a given exposure, with the optimal choice possibly depending on the specific biological question.

(C) Multiple regression approaches can be used in the genome-wide scan, including genetic main effects (GWAS), interaction effects (iGWAS), and genetic variance effects, i.e., variance-quantitative trait loci (vGWAS).

(D) Practical illustration of this pipeline for one outcome biomarker, using a pruning-and-thresholding (P&T) strategy. Genome-wide summary statistics are first generated in a training data subset and used to develop associated PGSs at a series of p value thresholds. These PGSs are tested for interaction with the exposure in the optimization subset to select an optimal threshold for each PGS type based on the significance of the interaction effect, $\beta_{\mathrm{PGS} \times E}$. Finally, these optimized PGSs are each tested in a similar regression in the held-out testing subset and compared based on the same $\beta_{\mathrm{PGS} \times E}$ estimate.

For each set of genome-wide summary statistics ($\beta_G$, $\beta_{G \times E}$, and $\beta_{Gv}$), a pruning-and-thresholding (P&T) algorithm is used to develop a group of PGSs based on different p value thresholds. We note that more advanced PGS algorithms would be conceptually applicable, but would require additional methods and software development to optimize for the detection of interaction effects. An independent optimization data subset is then used to test for PGS×E interactions and select the optimal p value threshold for each PGS type:

$$Y_i = \beta_0 + \beta_{\mathrm{PGS}} \mathrm{PGS}_i + \beta_E E_i + \beta_{\mathrm{PGS} \times E} \mathrm{PGS}_i E_i + \beta_C^T C_i + \epsilon_i,$$

where $\mathrm{PGS}_i$ refers to an arbitrary PGS type (regardless of the underlying regression), and the optimal threshold is chosen as the one maximizing the significance of $\beta_{\mathrm{PGS} \times E}$. Previous work using the iPGS approach has shown that this inclusion of a PGS main effect in the regression model is necessary to control type I error.6 Finally, the same regression model is fit in a third, independent testing data subset to evaluate the statistical and clinical significance of the $\beta_{\mathrm{PGS} \times E}$ effect. This effect, estimated in an independent dataset, quantifies the expected change in the exposure-outcome relationship for each unit increase in the PGS and allows the direct comparison of the performance of the mPGS, iPGS, and vPGS strategies.
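The threshold-selection step can be sketched as follows, assuming the variants have already been LD pruned and that per-variant weights and p values are available from the training subset. The function names, the toy summary statistics, and the candidate thresholds are hypothetical, and covariates are omitted for brevity; this is an illustration of the idea rather than the authors' implementation.

```python
import numpy as np

def interaction_z(pgs, E, Y):
    """z-statistic for the PGS x E term in Y ~ PGS + E + PGS*E (OLS, no covariates)."""
    X = np.column_stack([np.ones_like(Y), pgs, E, pgs * E])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ Y
    resid = Y - X @ beta
    sigma2 = resid @ resid / (len(Y) - X.shape[1])
    return beta[3] / np.sqrt(sigma2 * XtX_inv[3, 3])

def select_threshold(geno, weights, pvals, E, Y, thresholds):
    """Return (threshold, z) for the PGS with the most significant PGS x E term."""
    best = None
    for t in thresholds:
        keep = pvals < t
        if not keep.any():
            continue  # no variants pass this threshold
        pgs = geno[:, keep] @ weights[keep]
        pgs = (pgs - pgs.mean()) / pgs.std()  # standardize the score
        z = interaction_z(pgs, E, Y)
        # Strict inequality: ties go to the more stringent (earlier) threshold
        if best is None or abs(z) > abs(best[1]):
            best = (t, z)
    return best

# Toy "optimization subset" (all values below are assumed for illustration)
rng = np.random.default_rng(1)
n, m, m_causal = 5_000, 200, 20
geno = rng.binomial(2, 0.3, size=(n, m)).astype(float)
E = rng.gamma(4.0, 0.5, size=n)
true_gxe = np.zeros(m)
true_gxe[:m_causal] = 0.05
Y = 0.3 * E + (geno @ true_gxe) * E + rng.normal(size=n)

# Stand-ins for training-subset summary statistics (noisy weights, toy p values)
weights = true_gxe + rng.normal(scale=0.01, size=m)
pvals = np.where(true_gxe != 0, 1e-9, rng.uniform(size=m))

thresholds = [5e-8, 1e-5, 1e-3, 0.05, 1.0]
best_t, best_z = select_threshold(geno, weights, pvals, E, Y, thresholds)
print(f"selected threshold: {best_t:g} (interaction z = {best_z:.1f})")
```

The selected score would then be carried forward unchanged to the held-out testing subset, so that the final interaction estimate is not biased by the threshold search.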

Simulation results

Prior simulation studies exploring the use of mPGSs and iPGSs for detecting PGS×E interactions have reported several key results. The control of type I error for PGS×E testing requires adjustment for the main effect of the PGS of interest in the context of G-E correlation and residual-environment interaction (i.e., heteroscedasticity), as well as an additional permutation-based testing approach in some contexts.6 Additionally, the relative power of the iPGS compared to the mPGS improves as the correlation between genetic main and interaction effects decreases.5 Here, we sought to expand on a few specific components. First, we integrated the vPGS into type I error analyses within a single simulation and analysis pipeline. Second, we explored nonnegative exposure distributions. Though standard normal exposures are the default in simulations, they do not reflect most true biological quantities, and generative models that include negative exposure values can produce interactions that are difficult to interpret and potentially unrealistic.23 Third, we added simulated exposure measurement error into power comparisons between the three approaches, acknowledging that this is a major challenge for G×E studies.24 The simulation strategy and parameter choices are described in detail in the methods and in Figure S1.

In type I error analyses using a normally distributed E, we found that type I error was controlled for all PGS types regardless of the presence of G or E main effects (Figure 2A). Using a nonnegative (gamma-distributed) E, type I error remained controlled (Figure 2A) and was robust to the G-E correlation (Figure 2B). However, in keeping with theoretical expectations for single-variant G×E analysis, type I error became modestly inflated in the presence of a nonlinear E-Y relationship, and more substantially when this nonlinearity was combined with the G-E correlation25 (Figure 2B). We note that permutation-based testing approaches may more effectively control for model misspecification in iPGS analysis.6 Also, though vPGS type I error estimates were generally consistent with those of other PGS types, they should be interpreted with caution due to their notably lower power in the non-centered exposure simulation scenarios (described below). These type I error findings largely recapitulated the existing observations while validating the control of false positives using the analytical pipeline deployed in this study.

Figure 2. Simulation studies

(A and B) Type I error plots showing the false positive rate for the detection of PGS×E interaction when there is no underlying simulated G×E effect. Scenarios include standard normally distributed, gamma-distributed (shape α = 1, scale θ = 1; producing mean μ = 1), and gamma-distributed (α = 4, θ = 0.5; producing mean μ = 2) exposures, with or without main effects (A). In the gamma (μ = 1) case, G-E correlation and a nonlinear E-Y relationship were additionally varied (B).

(C–E) Power plots show the performance of each PGS type in detecting PGS×E interaction as the variability due to G×E (x axis) is manipulated. Scenarios include manipulation of G and E main effects with a standard normal E (C), manipulation of the E distribution (D), and manipulation of the simulated E measurement error with a standard normal E (E). All G and E main effects, when present, were set to explain 10% of phenotypic variance. For all panels, error bars denote 95% confidence intervals. ICC, intraclass correlation coefficient (lower ICC denotes higher measurement error).

In power analyses using a standard normally distributed E and assuming no correlation between simulated genetic main and interaction effects, we found that the iPGS approach was broadly the most powerful, followed by vPGS and finally mPGS (Figure 2C). This matches expectations, given that the associated statistical test most closely matches the underlying simulated G×E interaction effects. When using nonnegative E distributions, increasing mean E produced a general decrease in power across PGS types, along with a relative increase in power for the mPGS (Figure 2D). As has been discussed in depth in the literature,23 an increasing mean of E raises the correlation of G and G×E product terms; this increases collinearity and thus standard errors for the interaction effect estimates, but also improves the ability of the marginal genetic test to detect G×E.23 This increase in the relative mPGS power is unlikely to be due to false positives, given that we did not observe increased type I error using a gamma-distributed E (Figure 2A). Exposure measurement error hurt the statistical power of all approaches (Figure 2E), but with less proportional impact on the vPGS (which detects interaction patterns without explicitly testing the exposure during PGS development).
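The link between the exposure mean and the correlation of the G and G×E terms can be made concrete with a short derivation. As simplifying assumptions for illustration (not stated in the text), take $G$ to be mean-centered with variance $\sigma_G^2$ and independent of $E$, which has mean $\mu_E$ and variance $\sigma_E^2$:

```latex
\begin{aligned}
\operatorname{Cov}(G, GE) &= \mathbb{E}[G^2 E] - \mathbb{E}[G]\,\mathbb{E}[GE] = \sigma_G^2\,\mu_E, \\
\operatorname{Var}(GE) &= \mathbb{E}[G^2]\,\mathbb{E}[E^2] - \left(\mathbb{E}[G]\,\mathbb{E}[E]\right)^2 = \sigma_G^2\left(\mu_E^2 + \sigma_E^2\right), \\
\operatorname{Corr}(G, GE) &= \frac{\sigma_G^2\,\mu_E}{\sigma_G \cdot \sigma_G \sqrt{\mu_E^2 + \sigma_E^2}} = \frac{\mu_E}{\sqrt{\mu_E^2 + \sigma_E^2}}.
\end{aligned}
```

Under these assumptions, a standard normal exposure gives a correlation of 0, the gamma exposure with α = 1 and θ = 1 (mean 1, variance 1) gives $1/\sqrt{2} \approx 0.71$, and the gamma exposure with α = 4 and θ = 0.5 (mean 2, variance 1) gives $2/\sqrt{5} \approx 0.89$, in line with the direction of the pattern described for Figure 2D.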

Primary PGS development and testing in the UKB

The UKB dataset was used for the primary applied data analysis portion of this investigation. A summary of the relevant multi-ancestry, unrelated subgroup of the UKB population is provided in Table S1, including summaries of the training (70%), optimization (10%), and testing (20%) subsets. First, GWAS, iGWAS, and vGWAS were conducted for each CRF (see methods; CRF details are provided in Table S2). For each of these approaches, summary statistics were linkage disequilibrium (LD) pruned, and PGSs were generated corresponding to a series of p value thresholds. This P&T strategy enabled PGS optimization (i.e., choice of optimal P&T p value threshold) based on the strength of PGS×BMI interaction, rather than the PGS main effect (see methods). Next, a single optimal PGS was created for each combination of CRF and approach by choosing the p value threshold that maximized the significance of the $\beta_{\mathrm{PGS} \times \mathrm{BMI}}$ regression term in the optimization subset (Figure S2; all PGS weights are provided in the supplemental materials). Finally, the PGS performance was evaluated based on the magnitude and significance of the same $\beta_{\mathrm{PGS} \times \mathrm{BMI}}$ term in the testing subset, adjusting for covariates including basic demographics and genetic principal components. As a positive control, we confirmed that this data splitting and PGS development pipeline produced mPGSs with strong marginal effects in the testing subset (Figure S3).

Significance for the primary estimates was assigned based on a Bonferroni threshold adjusting for 10.3 effective CRFs as previously described for analysis of many biomarkers in the UKB (see methods).15,26 Of 20 total CRFs, 16 passed the significance threshold for at least one PGS type. The iPGS reached Bonferroni significance for the greatest number of CRFs (Figures 3A–3C) and generated the most significant PGS×BMI interaction for 11 of the 16 CRFs that reached Bonferroni significance for any approach. All PGS×BMI results are provided in Table S3.
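For orientation, the implied per-test threshold can be worked out as below. The family-wise level $\alpha = 0.05$ is an assumption made here for illustration, since this section does not state it; $M_{\mathrm{eff}} = 10.3$ is the effective number of CRFs given above.

```latex
p < \frac{\alpha}{M_{\mathrm{eff}}} = \frac{0.05}{10.3} \approx 4.9 \times 10^{-3},
\qquad \text{compared with} \qquad
\frac{0.05}{20} = 2.5 \times 10^{-3} \ \text{for 20 independent CRFs.}
```

Using the effective number of tests rather than the raw count relaxes the threshold to reflect correlation among the biomarkers.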

Figure 3. Primary results for optimized PGSs in the UKB testing set

(A) Standardized interaction estimates (units of $SD_Y/SD_{PGS}/SD_{BMI}$) are plotted for all CRFs (x axis) and PGS types (colors).

(B) Example stratified plot showing best-fit lines for the relationship between BMI and log(ALT) within quintiles of the ALT iPGS.

(C) Number of CRFs reaching Bonferroni significance for each approach.

(D) Standardized interaction estimates for each approach, with each point corresponding to a CRF.

(E) Boxplot and individual points representing the absolute values of the genetic correlation between iGWAS main and interaction effects, as computed by LD-score regression.

(F) Interaction effect for the mPGS is plotted against signed genetic correlation estimates.

Of the three PGS types, the iPGS approach most frequently captured the strongest BMI interactions across CRFs. This was not solely explained by better performance on a single, large cluster of correlated CRFs (see Figure S4). The mPGS approach showed substantial negative interaction estimates for some CRFs (i.e., a higher mPGS leads to a decreased BMI-CRF association; Figure 3D), notably for many having an inverse relationship with BMI (Figure S5). This observation fits with a previously described pattern of genetic effect amplification by adverse exposures4 (see discussion for further commentary). The vPGS showed negative interaction estimates for the same set of CRFs, consistent with the strong known relationship between genetic main and variance effects.16,26 Though there were substantial vPGS×E interactions for some CRFs (e.g., alanine aminotransferase [ALT] and aspartate aminotransferase [AST]), the vPGS did not meaningfully improve upon the iPGS for any of these.

As previously noted, the value of the mPGS for the detection of G×E is directly related to the proportionality of genetic main and interaction effects. We quantified this directly by calculating the genetic correlations ($\rho_g$) between the main and interaction effects from the same iGWAS using LD-score regression (Table S4). Of 17 CRFs with $\rho_g$ estimates (the interaction signal was insufficient for $\rho_g$ estimation for three), the magnitude of these correlations ranged from 0.05 (nonsignificant; direct bilirubin) to 0.51 ($p = 5.7 \times 10^{-14}$) (Figure 3E). These magnitudes are much smaller than the perfect correlation assumed by the mPGS approach, agreeing with results from Zhai and colleagues in a different domain of gene-statin interactions.5 Though mPGS interaction effect signs matched $\rho_g$ signs, there was no correlation between the magnitudes of these quantities, although one might be expected theoretically (Pearson correlation between $|\rho_g|$ and $|\beta_{\mathrm{mPGS} \times \mathrm{BMI}}|$ of 0.02; p = 0.9) (Figure 3F).

Sensitivity analyses in the UKB

As demonstrated in our simulation study, a key concern in G×E testing is that interactions can appear as a statistical artifact of the combination of G-E correlation and nonlinearity of the E-Y relationship.25 This issue is particularly relevant in this application, given the highly polygenic nature of BMI. However, sensitivity models including either nonlinear BMI effects (squared BMI main effect term), nonlinear PGS effects, or using robust standard errors did not affect the results (Figures S6A–S6C). Furthermore, when swapping out the iPGS in favor of an mPGS for BMI (which maximizes the achievable PGS-BMI correlation), the results were less strong for most CRFs (Figure S6D). Together, these results suggest that PGS-BMI correlation is not solely responsible for the observed interaction effects.

We used further sensitivity models to explore two questions relevant to PGS×E testing. First, when testing iPGSs for interaction, it may be valuable to adjust for the main effect of an mPGS in addition to the existing iPGS main effect. This could explain additional variability in the outcome and thus improve the significance of the interaction estimate (Jayasinghe and colleagues6; personal communication, D. Jayasinghe). We wanted to avoid this sort of adjustment for multiple PGSs in our primary models for maximal interpretability, but we ran a set of sensitivity analyses including mPGS main effects in iPGS×BMI interaction models (in addition to the iPGS main effect already present in the model). This adjustment did not meaningfully affect the interaction results (Figure S7).

Second, when using an exposure such as BMI that is under strong genetic influence, it is possible to replace the measured exposure with a PGS for that exposure before testing the interaction; ultimately, this results in a form of G×G test. This may be useful in two ways: it can strengthen the causal inference (by using a genetic causal anchor for BMI) and might reveal stronger underlying interactions occurring “upstream” of realized BMI (Figures S8A and S8B). To explore this, we re-fit each of the primary iPGS interaction models after replacing the measured BMI with an mPGS for BMI (the same one used to replace the iPGS above). For most CRFs, the resulting iPGS×mPGS$_{\mathrm{BMI}}$ interactions were nonzero but less significant than those using measured BMI (Figure S8C). This finding is consistent with these interactions involving true causal effects of BMI that are not limited to its genetic component.

Replication in AoU

PGSs were calculated in the AoU based on optimized UKB variant weights for each approach-CRF combination (population summary in Table S5; biomarker metadata in Table S6). Regressions mirroring those in UKB were then run to understand how these scores generalize to a fully independent dataset and population (regression results in Table S7). We saw replication (at nominal p < 0.05) of many of the PGS interactions in the primary, pooled-ancestry dataset: 5/11 for mPGS, 6/13 for iPGS, and 3/8 for vPGS (Figures 4A–4C; full set of results comparing UKB and AoU in Table S8). PGS×BMI interaction effect sizes were strongly associated between the two cohorts, with Pearson correlations of 0.54, 0.70, and 0.69 for the mPGS, iPGS, and vPGS, respectively (Figure 4D). Some of the higher-level patterns observed in UKB were also seen in AoU, including the general outperformance of the iPGS and a directionality of mPGS interaction effects consistent with the amplification model.

Figure 4. Replication results from AoU

(A) Standardized interaction estimates (units of $SD_Y/SD_{PGS}/SD_{BMI}$) are plotted for each combination of CRF (x axis; ordered by decreasing sample size) and PGS type (colors). Results are only shown for CRFs with a sample size greater than 1,000.

(B and C) Replication of significant UKB interactions (black) in AoU (gray). The y axis indicates the number of CRFs with significant interactions, either from all available CRFs (B) or only those with n > 1,000 samples available for replication in AoU (C). Counts for AoU in gray are for only those CRFs that were significant in UKB.

(D) Standardized AoU interaction estimates plotted against UKB estimates for the same PGS type (panels) and CRFs (individual points).

The ancestral and ethnic diversity of the AoU dataset provided an opportunity to not only explore PGS generalizability in a substantially different population from the European-focused UKB but also test its performance in specific population strata (Figure S9). Though existing results suggest that P&T-based PGSs generated in the European-enriched UKB do not generalize as well to non-European populations,27 we did not see a major difference in the PGS×BMI interactions in the pooled (37% non-European) versus European-only subsets (Figure S9C).

Genetic modification of the BMI-ALT relationship

PGS performance was especially strong for two liver-related CRFs, ALT and AST, for which higher levels can indicate liver damage. For both CRFs, the iPGS and vPGS outperformed the mPGS in the UKB testing set (Figure 3) and the AoU replication dataset (Figure 4), with substantial consistency across multiple ancestry groups (Figure S10). On this basis, we further interrogated the performance and biology of these PGSs, focusing on ALT due to its greater specificity for liver function.

In both cohorts, we observed a strong positive correlation between BMI and log(ALT) (Figure 5A). Likewise, iPGS-stratified analysis demonstrated the increasing magnitude of the BMI-log(ALT) association in step with the iPGS, especially in its highest deciles; this was consistent in both cohorts despite a weaker overall BMI-log(ALT) association in AoU (Figure 5B). As another, more clinically applicable angle on this effect heterogeneity, covariate-adjusted BMI-log(ALT) associations were substantially stronger in the top iPGS decile compared to the remaining 90% of the population (Figure 5C). The relative magnitude of this increase was much larger in AoU (72%) than UKB (27%), which can be traced to the smaller general BMI-log(ALT) effect size in AoU.
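The top-decile comparison can be illustrated with a minimal sketch that estimates the BMI slope separately in the top PGS decile and in the remaining 90%, then reports the relative increase. The simulated data, effect sizes, and function names are assumptions for illustration; unlike the covariate-adjusted analysis described above, this sketch uses unadjusted slopes and does not reproduce the reported 72% or 27% values.

```python
import numpy as np

def slope(x, y):
    """Least-squares slope of y on x (model with an intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def relative_increase_top_decile(pgs, bmi, y):
    """BMI slope in the top PGS decile, in the rest, and their relative difference."""
    top = pgs >= np.quantile(pgs, 0.9)
    b_top = slope(bmi[top], y[top])
    b_rest = slope(bmi[~top], y[~top])
    return b_top, b_rest, (b_top - b_rest) / b_rest

# Simulated cohort with arbitrary (assumed) parameters
rng = np.random.default_rng(2)
n = 50_000
pgs = rng.normal(size=n)                    # standardized score
bmi = rng.normal(27.0, 4.5, size=n)         # BMI in kg/m^2
log_alt = 1.0 + 0.02 * bmi + 0.004 * pgs * bmi + rng.normal(scale=0.4, size=n)

b_top, b_rest, rel = relative_increase_top_decile(pgs, bmi, log_alt)
print(f"slope (top decile) = {b_top:.4f}, slope (rest) = {b_rest:.4f}")
print(f"relative increase = {100 * rel:.0f}%")
```

Because the denominator of the relative increase is the baseline slope, a cohort with a smaller overall BMI-log(ALT) slope can show a larger relative increase for a similar absolute difference between strata.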

Figure 5. Exploration of the genetic modification of the BMI-ALT relationship

(A) Smooth spline curves (shrunken cubic spline) of the BMI-CRF relationship with 95% confidence intervals in gray. Curves are based on samples between the 5th and 95th percentiles of the BMI distribution.

(B) Regression effect estimates of BMI on log(ALT) (y axis), stratified by PGS decile (x axis).

(C) Regression effect estimates of BMI on log(ALT) (y axis), stratified by a cutoff for “high” iPGS values (colors correspond to iPGS bin; threshold displayed on x axis).

(D) Heatmap shows regression coefficients linking the mPGS and iPGS for ALT to biologically relevant CRFs and adiposity measures. Stars correspond to associations with p < 0.05.

(E) Diagram shows how the mPGS and iPGS capture different components of the genetic effect on ALT, with only a subset modifying the causal effect of BMI/adiposity.

Moving beyond their regression performance, we make several observations about the biological mechanisms captured by the iPGS for ALT compared to the mPGS. At the optimized P&T p value threshold ($p < 5 \times 10^{-8}$), the iPGS contained signals from only a handful of genomic regions (11 variants in eight loci; Table S9), all of which are within 100 kb of one of the 311 variants composing the mPGS. These eight loci include genes, such as PNPLA3 and TM6SF2, that are well known in liver-related diseases like metabolic dysfunction-associated steatotic liver disease.28 The iPGS and mPGS are also associated differently with other CRFs. For example, the mPGS was positively associated with total cholesterol (TC) and triglycerides (TGs; nonsignificant), suggesting that it captures effects related to broader metabolic dysfunction (associations shown in Figure 5D). In contrast, the iPGS was negatively associated with TC and TGs, consistent with a mechanism in which higher BMI leads to hepatic lipid buildup and genetic factors (encoded by the iPGS) reduce the liver’s ability to export these lipids within lipoproteins (Figure 5E). This phenomenon is known to involve both PNPLA3 and TM6SF2 and explains the discordant effects of some genetic variants on liver disease versus coronary artery disease.28,29 Thus, without explicitly leveraging external variant or pathway annotations, the iPGS for ALT ultimately “selected” BMI-interacting genetic factors from the set of all biological pathways related to liver stress.

To support the above conclusions, we conducted gene set enrichment analysis using the full set of ALT GWAS and iGWAS summary statistics. Indeed, the Reactome gene sets most enriched for iGWAS signal were related to lipoprotein assembly and export (e.g., plasma lipoprotein assembly, $p_{\mathrm{enrichment}} = 2.7 \times 10^{-6}$; Table S10). In contrast, this pathway had minimal enrichment for GWAS signal ($p_{\mathrm{enrichment}} = 0.11$), which instead was enriched for pathways related to cellular remodeling and response to stress (e.g., Rho GTPase cycle, $p_{\mathrm{enrichment}} = 2.6 \times 10^{-6}$).

Genetic modification of the BMI-HDL relationship

The iPGS approach also performed modestly better in detecting BMI interactions for high-density lipoprotein cholesterol (HDL-C), with consistent significance and relative strength of the PGS types across UKB and AoU. The HDL-C example extends two of the observations described above. First, the mPGS and vPGS showed negative interaction effects, as expected given that BMI and HDL-C are inversely correlated. This would be predicted by the phenomenon of genetic amplification: assuming positive synergy among HDL-C-raising factors, an HDL-C-raising genetic score would interact negatively with BMI (since BMI associates with lower HDL-C). We do not see this effect for the iPGS, since variants with the same effect pattern would have negative interaction effect signs in the iGWAS, thus contributing to a lower iPGS and reversed interaction effect direction. Put another way, signs in the iGWAS and iPGS are not tied to the direction of increasing marginal HDL-C values, as they are with a standard GWAS (and often with a vGWAS, given the strong correlation between genetic main and variance effects).

Second, HDL-C is another example in which the iPGS-contributing loci (five in total) comprised a subset of the mPGS-contributing loci. However, unlike the ALT case, we saw little difference in the biological pathways enriched in GWAS and iGWAS signal (Table S11). These primarily corresponded to cholesterol and lipoprotein-related pathways, such as “plasma lipoprotein assembly remodeling and clearance” ($p_{\mathrm{enrichment,\,GWAS}} = 4.4 \times 10^{-9}$ and $p_{\mathrm{enrichment,\,iGWAS}} = 5.5 \times 10^{-3}$). This case, in which the mPGS and iPGS capture the same basic set of biological mechanisms, agrees with a scenario described by Durvasula and Price in which polygenic G×E occurs at the level of the total genetic liability for the trait of interest.30 Importantly, it demonstrates the ability of the iPGS to perform as well as the mPGS for detecting interactions, even when the mPGS is not capturing additional “noise” from non-interacting biological pathways (as was the case for ALT).

Discussion

Two individuals may respond differently to the same change in physiology, clinical treatment, or lifestyle due to polygenic factors modifying the exposure-health relationship. Here, we conducted a comprehensive exploration of PGSs optimized for the detection of this type of polygenic G×E. Our simulations suggested that the iPGS approach is often most powerful, but that the mPGS or vPGS may be more effective in some cases, depending on the distribution and measurement accuracy of the exposure. Our applied analysis revealed an influence of genetics on the BMI-CRF relationship across a broad range of CRFs (most notably markers of liver stress), with the iPGS most frequently capturing the strongest interactions.

The most important contribution from our simulation study was the manipulation of the exposure distribution and measurement precision. Most biological quantities are nonnegative, whether molecular (e.g., concentrations of some factor in blood) or behavioral (e.g., physical activity levels), and this should be accounted for when conducting G×E-focused simulation studies. As previously described, the exposure distribution in the underlying data-generating model is a critical factor in interpreting G×E interaction results and directly impacts the degree to which tests of marginal genetic effects (as used in the mPGS in this study) will capture G×E relationships.23 We note that the exposure measurement quality, while a substantial concern in general, is less problematic for the BMI exposure used in this study’s applied analysis.

Based on the theoretical expectations and our simulation results, we can make some preliminary statements comparing these PGS approaches for G×E detection. The mPGS, which leverages the highest-powered underlying statistical test (compared to G×E or vQTL single-variant tests), will be the most powerful when the true underlying exposure has a mean that is far from zero.23 The mPGS performance will also track with the underlying correlation between genetic main and interaction effects, as demonstrated empirically in our UKB results (Figures 3E and 3F). The iPGS approach will likely be optimal for detecting PGS×E in many cases, given the alignment between its underlying variant-specific statistical test (G×E interaction) and the ultimate PGS×E interaction test of interest. Importantly, its performance depends on having sufficient statistical power in these underlying single-variant interaction tests (which in turn depends on factors like sample size and measurement error). Finally, the vPGS will be most effective in particular cases involving very poor exposure measurement, which disproportionately hurts the performance of the G×E test/iPGS approach that directly uses measured exposure values. Though not included in these simulations, we note that vPGS performance will also degrade as the overall number of modifiers of genetic effects (genetic and non-genetic) increases, since vQTL tests are not specific to the exposure in question.

Our applied analysis considered genome-wide genetic modification of the BMI-CRF relationship. This builds on previous observations of polygenic interactions with adiposity; stratification by BMI has been shown to improve the performance of PGSs for type 2 diabetes (allowing increased contribution of beta cell-related pathways in low-BMI individuals21), and the inclusion of interactions with waist-hip ratio (a related adiposity measure) modestly improves the predictive performance of PGSs for blood biomarkers.12 Here, we explored a broad set of CRFs with a specific focus on comparing the performance of the three PGS types. In this study, the iPGS approach showed the most consistent performance across CRFs, matching expectations, as described above. The less consistent performance of the vPGS may reflect the fact that BMI is measured well, meaning that there is less relative benefit compared to the iPGS. However, in many cases, the vPGS detected equally strong interactions to the iPGS (e.g., for ALT and AST) and mPGS (e.g., for sex hormone-binding globulin and direct bilirubin). Its interaction effects tracked most closely with the mPGS, which is unsurprising given the close relationship between genetic mean and variance effects.16 We also note that vPGSs may have applications beyond capturing G×E, such as predicting within-individual variability over time.17

Our mPGS results are consistent with a series of recent studies supporting amplification as the primary mode of polygenic G×E.4,30,31 In this framework, disease-associated genetic predisposition and disease-associated exposures are synergistic in increasing risk. We note that these studies primarily leverage standard PGSs, which is comparable to our mPGS analyses. Indeed, we saw positive mPGS×BMI interactions for CRFs with positive disease and BMI associations and negative interactions for CRFs with negative associations (i.e., the mPGS and BMI were synergistic; Figure S5). In contrast, iPGS interaction effect signs were uniformly positive, since G×E effects are not constrained to be positively correlated with the outcome. Importantly, this broad pattern of genetic amplification does not mean that the amplification model is correct for every contributing genetic locus.

As noted above, the choice to include an mPGS main effect in iPGS (or vPGS) interaction models is not expected to impact type I error. Rather, it may reduce the standard error of PGS×E estimates by explaining additional variability in the outcome, though our primary UKB results using the iPGS did not meaningfully change when including this adjustment (as shown in Figure S7). In general, this choice of adjustment strategy may differ depending on the goal of the analysis. For many studies that include the PGS×E interactions, the goal is enhancing the outcome prediction beyond solely standard PGSs (as is most commonly explored in the current iPGS literature). In that case, it is naturally important to include the main effect PGS (which will typically explain much more variance than the interaction effect, based on existing results from whole-genome variance components approaches32). However, in cases like the present study, for which the evaluation of a PGS×E effect is of primary interest, adjustment for a separate PGS (beyond that being evaluated for interaction) may complicate interpretation if it is correlated with the focal interaction term of interest.

We saw particularly strong interactions of BMI with the ALT and AST iPGSs, reinforcing known variant-specific findings from a European ancestry subset of the UKB.33 We conducted a more in-depth investigation of the ALT iPGS to explore potential biological mechanisms. Compared to the associated mPGSs, iPGSs had specific contributions from alleles hindering the export of hepatic lipids (known to be increased in obesity24,34) in the form of lipoproteins. This mechanism was supported by inverse associations of the iPGS with circulating TGs and TC as well as by pathway enrichment for lipoprotein assembly and export. In contrast, associations between the ALT mPGS and these traits were positive; this is a more intuitive relationship, given that general metabolic dysfunction increases both liver stress and circulating lipid levels. As referenced above, this contrast reveals a key insight about the iPGS strategy: from the overall genetic architecture of a trait, it has the capacity to naturally “select” subsets of variants related to specific biological mechanisms interacting with the exposure of interest. Our results further converge with findings derived from more hypothesis-driven “partitioned PGSs” constructed based on the consistency of variant effect directions with liver fat versus circulating TGs.35 Our HDL-related findings complement these observations, showing that the iPGS retains similar power to the mPGS even when they do not represent substantially different biological pathways.

This study represents the first effort to directly compare the performance of genetic main, interaction, and variance PGSs within a unified analysis pipeline for detecting interactions. It is strengthened by the incorporation of best practices from existing PGS pipelines and genome-wide studies for each of the relevant approaches, enabling the detection of PGS interactions explaining as little as 0.01% of outcome variance using biobank-scale datasets. Our results support and expand the existing observations about iPGS performance5,6 and the amplification model4,31 and show that genetic factors meaningfully alter the relationship between adiposity and cardiometabolic risk. These findings indicate the potential of genetic scores to contribute to more personalized chronic disease prevention strategies.

Limitations of the study

Despite the methodological results reported here, there remains uncertainty as to the optimal way to generate and test these response-focused PGSs. Especially for vQTL studies, it is not clear which statistical framework is optimal for single-variant studies or associated vPGSs (though recent studies are beginning to make the necessary comparisons36). An additional limitation is that our applied investigation uses only BMI as an exposure and cardiometabolic biomarkers as outcomes. Different exposures, such as lifestyle-related variables or pharmaceutical drugs, might have substantially different “response genetic architectures,” and even within the domain of adiposity, BMI is an imperfect measure that does not capture body fat distribution and reflects other factors such as muscle mass.37 We intentionally selected continuously valued CRFs for this study to accommodate the inclusion of vQTL tests, but the iPGS strategy can be straightforwardly applied to binary outcomes as well, with some additional methodological considerations.6 Finally, the three score types tested here are not exhaustive; for example, machine learning-based approaches can model the sensitivity of exposure-disease relationships to genetic factors.38

Resource availability

Lead contact

Further information and requests should be directed to and will be fulfilled by the lead contact, Kenneth E. Westerman (kewesterman@mgb.org).

Materials availability

This study did not generate new, unique reagents.

Data and code availability

No new genetic or phenotypic data have been generated for this study. The UKB data, including genetic and phenotypic data, are under controlled access but can be obtained through application at https://www.ukbiobank.ac.uk/. UKB will consider data applications from bona fide researchers for health-related research that is in the public interest. AoU controlled tier data are available to authorized users on the Researcher Workbench (https://workbench.researchallofus.org/). Variant-specific weights allowing the calculation of all PGSs described here are provided as a supplemental file (Data S1). The code supporting the conclusions of this manuscript can be found on Zenodo (https://doi.org/10.5281/zenodo.17238511) and GitHub (https://github.com/kwesterman/ipgs).

Acknowledgments

We thank Andrew R. Marderstein for helpful feedback on the manuscript. We gratefully acknowledge the AoU participants for their contributions, without whom this research would not have been possible. We also thank the National Institutes of Health’s All of Us Research Program for making available the participant data examined in this study. Selected diagrams presented in this paper were generated using https://BioRender.com. K.E.W. was supported by K01DK133637. M.S.U. was supported by Doris Duke Foundation award 2022063. A.K.M. was supported by R01HL145025 and U01HG011723.

Author contributions

Conceptualization, K.E.W. and A.K.M.; methodology, K.E.W. and L.J.O.; formal analysis, K.E.W.; investigation, K.E.W. and J.E.G.; writing – original draft, K.E.W. and J.E.G.; writing – review & editing, J.E.G., L.J.O., M.S.U., and A.K.M.; visualization, J.E.G.; and funding acquisition, K.E.W.

Declaration of interests

The authors declare no competing interests.

STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Deposited data
Polygenic score weights | This paper | https://doi.org/10.5281/zenodo.17238445

Software and algorithms
Custom code | This paper | https://doi.org/10.5281/zenodo.17238511
R (v4.x) | R Foundation | https://www.r-project.org
QUAIL | Miao et al.17 | https://github.com/qlu-lab/QUAIL/
PLINK2 | Chang et al.39 | https://www.cog-genomics.org/plink/2.0/
Hail (v0.2) | Hail Team | https://hail.is
ANNOVAR (version 2018-04-16) | Wang et al.40 | https://annovar.openbioinformatics.org/en/latest/
LD-score regression | Bulik-Sullivan et al.41,42 | https://github.com/bulik/ldsc
MAGMA | de Leeuw et al.43 | https://cloufield.github.io/GWASTutorial/09_Gene_based_analysis/

Experimental model and study participant details

UK Biobank cohort

This work was conducted under a Not Human Subjects Research determination for UKB data analysis (NHSR-4298 at the Broad Institute of MIT and Harvard), under UKB application 27892. UKB is a large prospective cohort with both deep phenotyping and molecular data, including genome-wide genotyping, on over 500,000 individuals aged 40–69 living throughout the UK and recruited between 2006 and 2010.44

Genotyping, imputation, and initial quality control on the genetic dataset have been described previously.45 Work was conducted on genetic data release version 3, with imputation to both Haplotype Reference Consortium and 1000 Genomes Project (1KGP). For sensitivity analyses in ancestry-specific data subsets, genetic ancestry labels were retrieved from the Pan-UKBB project.45

Body mass index (BMI; kg/m²), the primary exposure of interest, was collected from assessment center anthropometric measurements. As outcomes, we focused on 20 serum biomarkers related to cardiovascular disease and metabolism, including but not limited to lipids, liver enzymes, glycemic parameters, and kidney function markers (see Table S2). Blood samples were collected at the baseline visit for the majority of participants, and specific biomarkers were measured using colorimetric, enzymatic, and immunoassays (details available at: https://biobank.ctsu.ox.ac.uk/crystal/crystal/docs/serum_biochemistry.pdf).

We excluded individuals who had withdrawn consent by the time of analysis as well as those with diabetes, coronary heart disease, cirrhosis, end-stage renal disease, or a cancer diagnosis within one year prior to their assessment center visit, and those who were pregnant within one year of the assessment center visit. Cholesterol, LDL-C, and Apolipoprotein B were also adjusted for statin use using methods described previously22: in individuals with self-reported use of a statin medication, each of these biomarkers was divided by an adjustment factor (0.749, 0.684, and 0.719, respectively) that had been empirically estimated by Sinnott-Armstrong and colleagues22 in the same population. After these adjustments, a subset of highly skewed biomarkers was log-transformed (see Table S2) and outliers (greater than 5 standard deviations from the mean) were set to missing for BMI and all biomarkers. Finally, we further subset to a group of unrelated samples used for genetic principal components (gPCs) analysis during central genetic data preprocessing.46
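The statin adjustment and the 5-standard-deviation outlier rule can be sketched in a few lines of Python. This is an illustration only: the function names, the dictionary keys, and the use of `None` for missing values are our assumptions, and the intermediate log-transformation step is left out. Only the three adjustment factors come from the text.

```python
import statistics

# Adjustment factors quoted in the text (from Sinnott-Armstrong and colleagues).
# The dictionary keys are hypothetical labels chosen for this sketch.
STATIN_FACTORS = {"cholesterol": 0.749, "ldl_c": 0.684, "apob": 0.719}


def adjust_for_statin(value, biomarker, on_statin):
    """Divide a measured value by its factor for self-reported statin users."""
    if on_statin and biomarker in STATIN_FACTORS:
        return value / STATIN_FACTORS[biomarker]
    return value


def mask_outliers(values, n_sd=5.0):
    """Set values more than n_sd standard deviations from the mean to None."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    sd = statistics.stdev(observed)
    return [
        None if v is None or abs(v - mean) > n_sd * sd else v
        for v in values
    ]
```

Dividing by a factor below one scales treated values upward, approximating the untreated level for statin users.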

After all phenotype preprocessing steps, the unrelated, multi-ancestry UKB sample was randomly subdivided into three groups: training (70%), optimization (10%), and testing (20%). This split devotes a majority of the sample to the generation of genome-wide summary statistics, which contribute to PGS performance and out-of-dataset generalizability, and follows similar splits used in PGS analyses.22,47 Due to the substantial correlation between the 20 CRFs, we calculated a smaller number of “effective” biomarkers in the training set using a PCA-based approach we have previously deployed for blood biomarkers in this dataset.26
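A minimal sketch of the three-way random split, assuming a list of sample identifiers and a fixed random seed. The function name, the seed, and the rounding rule are hypothetical choices for illustration; only the 70/10/20 fractions come from the text.

```python
import random


def split_samples(sample_ids, fractions=(0.7, 0.1, 0.2), seed=0):
    """Randomly partition IDs into training, optimization, and testing sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(fractions[0] * len(ids))
    n_opt = round(fractions[1] * len(ids))
    return {
        "training": ids[:n_train],
        "optimization": ids[n_train:n_train + n_opt],
        # The testing set takes whatever remains after the first two subsets.
        "testing": ids[n_train + n_opt:],
    }
```

Because the subsets are slices of one shuffled list, they are disjoint by construction, which keeps the testing subset fully held out from both summary statistic generation and threshold optimization.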

All of Us cohort

The AoU cohort contains data from over 413,000 participants, of whom more than 245,000 have genetic data available from whole-genome sequencing. AoU operates under a “data passport” model in which project-specific IRB approval is not needed for the analysis of de-identified data. Participants were assigned to specific genetically inferred ancestry groups, including African/African American (AFR), American Admixed/Latino (AMR), East Asian (EAS), European (EUR), and South Asian (SAS). Genetic principal components (gPCs) were available from central genotype preprocessing. Basic covariates, including sex at birth and age (determined from date of birth), were derived from survey responses.

BMI (concept ID: 3038553) was available as measured from outpatient settings (visit occurrence concepts: “Outpatient Visit” or “Office Visit”). To account for multiple measurements, all non-missing BMI values were averaged within each person. Outliers (greater than 5 interquartile ranges [IQRs] from the median) were removed first, followed by values less than 10.

Blood biomarkers were chosen to match the 20 analyzed in the UKB (associated concept names provided in Table S6). They were retrieved as measured in outpatient settings (visit occurrence concepts: “Outpatient Visit”, “Office Visit”, or “Laboratory Visit”) and filtered for measurement using relevant units (Table S6). All valid biomarker values were averaged within each person. Finally, negative values were removed, zero values were imputed with half of the minimum non-zero value, and outliers (greater than 5 IQRs from the median) were removed.
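The final biomarker cleaning steps can be sketched as follows, applied to per-person averages. The function name is hypothetical, and the quartile definition (the default of Python's `statistics.quantiles`) is an assumption, since the text does not specify how IQRs were computed.

```python
import statistics


def clean_biomarker(values, n_iqr=5.0):
    """Apply the three cleaning steps to per-person biomarker averages."""
    # 1. Drop negative values.
    kept = [v for v in values if v >= 0]
    # 2. Replace zeros with half of the smallest non-zero value.
    half_min = min(v for v in kept if v > 0) / 2
    kept = [half_min if v == 0 else v for v in kept]
    # 3. Drop values more than n_iqr IQRs away from the median.
    q1, median, q3 = statistics.quantiles(kept, n=4)
    iqr = q3 - q1
    return [v for v in kept if abs(v - median) <= n_iqr * iqr]
```

Replacing zeros before computing the quartiles matters for biomarkers that are later log-transformed, since a zero would otherwise be undefined on the log scale.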

Method details

Simulation study

A simulation study was conducted to validate existing results in the literature, explore additional conditions under which these scores might have inflated type I error, and compare their statistical power for the detection of PGS×E interaction under varying assumptions about the exposure distribution and measurement characteristics. The simulation workflow and parameter settings are shown in detail in Figure S1. We summarize it here:

First, a single genotype dataset was generated for N = 10,000 samples and M = 100 independent variants, with genotype values drawn from a binomial distribution and minor allele frequencies (MAFs) drawn from a uniform distribution between 0.01 and 0.5.
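A minimal sketch of this genotype-generation step using only the Python standard library. We assume diploid genotypes drawn as Binomial(2, MAF), which the text does not state explicitly; the function name and seed are hypothetical.

```python
import random


def simulate_genotypes(n_samples, n_variants, maf_range=(0.01, 0.5), seed=1):
    """Return (mafs, genotypes) with genotypes[i][j] in {0, 1, 2}."""
    rng = random.Random(seed)
    # One minor allele frequency per variant, drawn uniformly.
    mafs = [rng.uniform(*maf_range) for _ in range(n_variants)]
    genotypes = [
        # Two independent allele draws per variant: Binomial(2, MAF).
        [(rng.random() < maf) + (rng.random() < maf) for maf in mafs]
        for _ in range(n_samples)
    ]
    return mafs, genotypes
```

Calling `simulate_genotypes(10_000, 100)` would match the sample and variant counts described above. Variants are independent by construction because each is drawn separately.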

Next, for each simulation scenario (defined by a specific set of parameters), a set of P independent replicates was generated: 5000 for type I error simulations and 500 for power simulations. To accomplish this, P random, standard normally distributed exposures E were first simulated, with possible contribution from the simulated G based on a specified variance explained ($\sigma^2_{GE}$; set to either 0 or 0.1). To simulate measurement error, we added noise according to a parameter ($\mathrm{ICC}_E$) specifying the amount of variance in the measured E explained by the “true” E. To generate Gamma-distributed (nonnegative) exposures, we used transformations of the existing E variables (after incorporating any G-E correlation and measurement error). Specifically, we used a Gaussian copula (“normal to anything”) approach48: first converting the normally distributed values to probabilities via the standard normal cumulative distribution function, then passing those through the inverse Gamma cumulative distribution function to produce Gamma marginals. We used two Gamma distributions, both with a standard deviation of 1 but having a mean of 1 (shape α = 1, scale θ = 1) or 2 (α = 4, θ = 0.5).
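The “normal to anything” transformation can be sketched with the Python standard library alone. To avoid external dependencies, this sketch handles only integer shape parameters (which covers α = 1 and α = 4) and inverts the Gamma cumulative distribution function by bisection; a library quantile function would normally be used instead. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def gamma_cdf_integer_shape(x, shape, scale):
    """Gamma CDF for integer shape (Erlang), via its closed-form sum."""
    if x <= 0:
        return 0.0
    z = x / scale
    partial = sum(z ** k / math.factorial(k) for k in range(shape))
    return 1.0 - math.exp(-z) * partial


def gamma_quantile(u, shape, scale, tol=1e-10):
    """Invert the Gamma CDF by bisection for a probability u."""
    lo, hi = 0.0, scale
    # Expand the upper bracket until it covers the target probability.
    while gamma_cdf_integer_shape(hi, shape, scale) < u:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gamma_cdf_integer_shape(mid, shape, scale) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def normal_to_gamma(e_values, shape, scale):
    """Map standard normal exposures to Gamma marginals, preserving ranks."""
    std_normal = NormalDist()
    return [gamma_quantile(std_normal.cdf(e), shape, scale) for e in e_values]
```

Because both steps are monotone, the rank order of the original E values is retained, so G-E correlation and measurement error introduced on the normal scale carry over to the Gamma-distributed exposure.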

A corresponding series of P standard normal phenotypes was then generated with contributions from G and E main effects, a G×E product term, and additional random error to produce a final outcome variable with variance one. Parameters specified included variance explained by exposure main effects ($\sigma^2_E$), genetic main effects ($\sigma^2_G$), and their interaction ($\sigma^2_{G \times E}$). For a subset of simulations with Gamma-distributed E, a nonlinear E effect was specified by replacing the main effect of E with an effect of a nonlinear function of E resulting in the same explained variance.

After phenotype simulation, a random 70% of each simulated sample was assigned to the training set, with the remaining 30% assigned to the testing set. For each of these P phenotypes, a “genome-wide” set of results (for each statistical test) was generated in the training subset by performing a series of statistical tests for marginal effects (estimating $\beta_G$), interaction effects (estimating $\beta_{G \times E}$), and variance effects (estimating $\beta_{Gv}$). Marginal and interaction effects were estimated using their respective standard linear models, while variance effects were estimated using the deviation regression model method, which regresses absolute deviations from genotype-specific median values on additively coded genotype values.16 These summary statistics were converted into PGS weights via simple thresholding: regression estimates were used directly for variants with p < 0.05, and weights were otherwise set to zero. Each of the three PGSs was then calculated as a weighted sum based on these weights. Finally, these PGSs (three per phenotype vector) were tested for PGS×E interaction using a significance threshold of p < 0.05 to determine type I error (for $\sigma^2_{G \times E} = 0$) and power (for $\sigma^2_{G \times E} > 0$). Standard errors were calculated based on the fraction of significant results $f$ and number of simulation replicates $P$ as $\mathrm{SE} = \sqrt{f(1-f)/P}$, with 95% CIs calculated as $f \pm 1.96 \times \mathrm{SE}$.
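The thresholding, scoring, and uncertainty calculations in this step can be sketched as below. The function names are hypothetical, and the inputs are assumed to be plain lists of per-variant estimates, p-values, and genotype dosages.

```python
import math


def threshold_weights(estimates, p_values, p_cutoff=0.05):
    """Keep an estimate as the PGS weight if p < cutoff, else use zero."""
    return [b if p < p_cutoff else 0.0 for b, p in zip(estimates, p_values)]


def polygenic_score(genotype_row, weights):
    """Weighted sum of one individual's genotype dosages."""
    return sum(g * w for g, w in zip(genotype_row, weights))


def rejection_rate_ci(n_significant, n_replicates, z=1.96):
    """Fraction of significant replicates with its binomial SE and 95% CI."""
    f = n_significant / n_replicates
    se = math.sqrt(f * (1.0 - f) / n_replicates)
    return f, se, (f - z * se, f + z * se)
```

The same `threshold_weights` function serves all three score types; only the input estimates differ (marginal, interaction, or variance effects). For a nominal 5% rejection rate over 5000 replicates, `rejection_rate_ci(250, 5000)` gives a standard error of roughly 0.003.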

Genome-wide models

For each biomarker of interest, three statistical models were run genome-wide in the UKB training set to generate summary statistics that would inform subsequent PGS development. Each was run on common variants (MAF >1%) with imputation INFO score greater than 0.5.

  • i. Main effects: genome-wide association study (GWAS).

The GWAS model is as follows:

$$Y_i = \beta_0 + \beta_G G_i + \beta_C^T C_i + \epsilon_i$$

where $Y_i$ is the outcome for individual $i$, $G_i$ is the genotype vector, $C_i$ is a vector of covariates, and $\epsilon_i$ captures residual error. Covariates included sex, age, age$^2$, an age-by-sex product term, and 10 gPCs. This model produces $\beta_G$ (genetic main effect) estimates and p-values for each variant. The GWAS models were run using the GEM program49 with no exposure specified and model-based standard errors.

  • ii. Interaction effects: genome-wide interaction study (iGWAS).

The iGWAS model is a straightforward extension of the GWAS model:

$$Y_i = \beta_0 + \beta_G G_i + \beta_E E_i + \beta_{G \times E} G_i E_i + \beta_C^T C_i + \epsilon_i$$

where additional terms have been added for $E_i$, the exposure, and its product term with $G_i$. Covariates matched those from the GWAS, with the addition of exposure-by-gPC product terms for each of the 10 gPCs (as found to be critical in pooled ancestry interaction analyses,50 based on the argument from Keller51). The key estimate of interest from this model is $\beta_{G \times E}$ (the interaction effect), rather than $\beta_G$. The iGWAS models were run using GEM with mean-centered BMI as the exposure and robust standard errors.

  • iii. Variance effects: genome-wide variance study (vGWAS).

Variance-quantitative trait locus (vQTL) analysis quantifies genetic effects on trait variability (rather than mean). This analysis does not directly model interactions, but may nonetheless have greater power in some cases to detect variants supporting interaction with environmental exposures. For example, the deviation regression model from Marderstein and colleagues16 is:

$$Z_i = \beta_0 + \beta_{Gv} G_i + \beta_C^T C_i + \epsilon_i$$
$$Z_i = |Y_{ik} - \tilde{Y}_k|$$

where $k$ indexes the genotype group corresponding to individual $i$ and $\tilde{Y}_k$ is the median phenotype value in genotype group $k$, such that $Z_i$ represents the individual’s absolute deviation from the genotype-specific median. For computational efficiency and statistical robustness, the statistical model used in the applied UKB analysis is the quantile integral linear model (QUAIL).17 A standard quantile regression model follows:

$$Q_Y(\tau \mid G = g) = g\beta_\tau$$

where $\tau$ is the quantile and $\beta_\tau$ is the regression coefficient associated with that quantile. In this setup, $\beta_{1-\tau} - \beta_\tau$ corresponds to the vQTL effect (i.e., the genotype effect on the quantile differs across lower versus higher quantiles). Aggregating information across quantiles results in the quantile-integrated model tested in the QUAIL program:

$$\beta_{QI} = \int_0^{0.5} (\beta_{1-\tau} - \beta_\tau)\, d\tau$$

which can be tested for genetic variants genome-wide (see Miao et al.17 for more details), with these $\beta_{QI}$ estimates (denoted moving forward as $\beta_{Gv}$, indicating “variance”, for clarity) being the primary estimates of interest.

Polygenic score generation and optimization

A series of mPGSs was generated for each biomarker of interest using a basic P&T strategy as implemented in the PRSice-2 program.52 Inputs included GWAS summary statistics (effect estimates and p-values) and an LD reference panel consisting of a random 20,000 individuals from the UKB. For pruning within PRSice-2, a grid of p-value thresholds of 0.05, 0.01, 0.005, …, $1 \times 10^{-7}$, $5 \times 10^{-8}$ was used, along with a clumping radius of 250 kb and $r^2$ threshold of 0.1. Ambiguous variants (A/T or C/G) and those with duplicated rsIDs in the original UKB annotation file were excluded during the PGS development step, prior to pruning.

Though the standard PRSice pipeline includes PGS p-value threshold optimization using linear regression models in the target dataset, we did not use this functionality here since the goal was to optimize for the detection of interaction rather than main effects. Instead, this optimization was performed separately, using a held-out optimization subset of the UKB. Specifically, mPGSs corresponding to each value of the threshold were included in separate regression models including main and exposure interaction effects:

$$Y_i = \beta_0 + \beta_{PGS} PGS_i + \beta_E E_i + \beta_{PGS \times E} PGS_i E_i + \beta_C^T C_i + \epsilon_i$$

Covariates for these optimization regressions were identical to those from the iGWAS (i.e., including 10 BMI-by-gPC product terms). The optimal threshold was chosen to minimize the p-value of the estimated interaction effect, $\beta_{PGS \times E}$. Finally, the PGS corresponding to the optimal p-value threshold was evaluated using the same regression model in the fully held-out testing subset.

PGS generation, optimization, and evaluation proceeded similarly for the iPGS (using $\beta_{G \times E}$ estimates) and vPGS (using $\beta_{Gv}$ estimates).
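The threshold selection rule amounts to an argmin over candidate scores. The sketch below assumes the interaction p-values from the optimization subset have already been computed and collected in a dictionary; the function name and the example p-values are hypothetical and not taken from the study.

```python
def select_threshold(interaction_p_by_threshold):
    """Return the p-value threshold whose PGS gave the smallest interaction p."""
    return min(interaction_p_by_threshold, key=interaction_p_by_threshold.get)


# Hypothetical interaction p-values from the optimization subset, keyed by
# the clumping p-value threshold used to build each candidate score.
example = {5e-8: 0.04, 1e-5: 0.003, 0.01: 0.0007, 0.05: 0.012}
best = select_threshold(example)
```

Selecting on the interaction p-value rather than on main-effect variance explained is what distinguishes this step from the default PRSice-2 optimization. Evaluating the chosen score in a separate testing subset guards against the optimism that this selection would otherwise introduce.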

PGSs were calculated in AoU based on weights determined and optimized in the prior UKB analysis. Genotypes for relevant variants were retrieved from whole-genome sequencing (ACAF threshold callset; v7.1) using the Hail program,53 based on chromosomal location, splitting multi-allelic variants using the split_multi_hts() function. PGS weights were harmonized by (1) flipping the sign of the PGS weight when the counted and non-counted alleles were the reverse of those from the UKB, and (2) dropping variants that were unavailable or for which alleles did not match. Score calculation was run using the “--score” function from PLINK2.39 For computational tractability, in the few scenarios in which the optimal score from UKB included a very large number of variants, only the top 5,000 variants by p-value were used for PGS calculation.
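The two harmonization rules can be sketched as follows. The data structures are hypothetical (dictionaries keyed by a variant identifier, whereas the study matched variants by chromosomal location), and strand flips are not handled in this illustration.

```python
def harmonize_weights(ukb_weights, target_alleles):
    """Align UKB-derived weights to the allele coding of a target dataset.

    ukb_weights: {variant_id: (counted_allele, other_allele, weight)}
    target_alleles: {variant_id: (counted_allele, other_allele)}
    """
    harmonized = {}
    for variant_id, (counted, other, weight) in ukb_weights.items():
        target = target_alleles.get(variant_id)
        if target is None:
            continue  # variant unavailable in the target dataset
        if target == (counted, other):
            harmonized[variant_id] = weight
        elif target == (other, counted):
            harmonized[variant_id] = -weight  # counted allele is reversed
        # otherwise the alleles do not match and the variant is dropped
    return harmonized
```

Flipping the sign is valid for additively coded genotypes because counting the other allele replaces the dosage g with 2 − g, which reverses the direction of the per-allele effect and shifts the score only by a constant.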

AoU regressions mirrored those conducted in the UKB testing set, testing for PGS×BMI interaction as the primary estimate of interest while adjusting for sex at birth, age, age squared, 10 gPCs, and 10 gPC-by-BMI interaction product terms. Primary replication tests were performed in the full, multi-ancestry dataset, with PGS pre-adjusted for ancestry probabilities as previously described.54,55 Ancestry-stratified sensitivity analyses were performed based on genetically inferred ancestry groupings.56

Additional follow-up analyses

Genetic correlations between genetic main and interaction estimates from iGWAS were estimated using bivariate LD-score regression (LDSC).41,42 For each CRF, genetic main effect and interaction effect estimates and p-values (using robust standard errors) were retrieved from the same set of iGWAS summary statistics. LDSC was then run using a European-ancestry linkage disequilibrium reference dataset from the 1000 Genomes Project (https://alkesgroup.broadinstitute.org/LDSCORE/). Given the sensitivity of genetic main effect estimates to the centering of the interaction exposure,23 we reiterate that BMI was mean-centered prior to the iGWAS feeding these LDSC runs.

Variants included in selected PGSs were annotated to genes using ANNOVAR40 (version 2018-04-16; based on genome build GRCh38).

ALT-specific GWAS and iGWAS results were subject to enrichment analysis to prioritize gene sets with enrichment of signal in the surrounding genetic region. p-values from the associated genome-wide summary statistics were used as input to the MAGMA program,43 using the same LD reference panel as above and gene regions defined as 2 kb upstream to 1 kb downstream of the gene limits based on the NCBI database (GRCh37). Gene sets from the Reactome pathway collection57 were downloaded from mSigDB.58

Quantification and statistical analysis

Analytical approaches and regression models are described above in the method details section. All statistical tests were two-sided unless otherwise noted. Comprehensive regression summary statistics and specific sample sizes are provided in the Supplemental Tables. Simulations and all subsequent analyses were conducted using R (versions 4.1 and 4.2)59 unless otherwise noted.

Published: November 25, 2025

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.xgen.2025.101075.

Supplemental information

Document S1. Figures S1–S10 and Tables S1–S9
Table S10. MAGMA gene set enrichment of ALT GWAS and iGWAS summary statistics, related to STAR Methods
Table S11. MAGMA gene set enrichment of HDL GWAS and iGWAS summary statistics, related to STAR Methods
Data S1. PGS weights for each score type and CRF as developed in the UKB, related to STAR Methods
Document S2. Article plus supplemental information

References

  • 1. Schork N.J. Personalized medicine: Time for one-person trials. Nature. 2015;520:609–611. doi: 10.1038/520609a.
  • 2. Truong B., Ruan Y., Haidermota S., Patel A., Surakka I., Hornsby W., Koyama S., Lee S.H., Natarajan P. Modification of coronary artery disease clinical risk factors by coronary artery disease polygenic risk score. Med. 2024;5:459–468.e3. doi: 10.1016/j.medj.2024.02.015.
  • 3. Barcellos S.H., Carvalho L.S., Turley P. Education can reduce health differences related to genetic risk of obesity. Proc. Natl. Acad. Sci. USA. 2018;115:E9765–E9772. doi: 10.1073/pnas.1802909115.
  • 4. Nagpal S., Gibson G. Dual exposure-by-polygenic score interactions highlight disparities across social groups in the proportion needed to benefit. Preprint at medRxiv. 2024. doi: 10.1101/2024.07.29.24311065.
  • 5. Zhai S., Zhang H., Mehrotra D.V., Shen J. Pharmacogenomics polygenic risk score for drug response prediction using PRS-PGx methods. Nat. Commun. 2022;13. doi: 10.1038/s41467-022-32407-9.
  • 6. Jayasinghe D., Momin M.M., Beckmann K., Hyppönen E., Benyamin B., Lee S.H. Mitigating type 1 error inflation and power loss in GxE PRS: Genotype–environment interaction in polygenic risk score models. Genet. Epidemiol. 2024;48:85–100. doi: 10.1002/gepi.22546.
  • 7. Hüls A., Ickstadt K., Schikowski T., Krämer U. Detection of gene-environment interactions in the presence of linkage disequilibrium and noise by using genetic risk scores with internal weights from elastic net regression. BMC Genet. 2017;18:55. doi: 10.1186/s12863-017-0519-1.
  • 8. Arnau-Soler A., Macdonald-Dunlop E., Adams M.J., Clarke T.K., MacIntyre D.J., Milburn K., Navrady L., Generation Scotland. Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Hayward C., et al. Genome-wide by environment interaction studies of depressive symptoms and psychosocial stress in UK Biobank and Generation Scotland. Transl. Psychiatry. 2019;9. doi: 10.1038/s41398-018-0360-y.
  • 9. Lin W.Y., Huang C.C., Liu Y.L., Tsai S.J., Kuo P.H. Polygenic approaches to detect gene-environment interactions when external information is unavailable. Brief. Bioinform. 2019;20:2236–2252. doi: 10.1093/bib/bby086.
  • 10. Werme J., van der Sluis S., Posthuma D., de Leeuw C.A. Genome-wide gene-environment interactions in neuroticism: an exploratory study across 25 environments. Transl. Psychiatry. 2021;11. doi: 10.1038/s41398-021-01288-9.
  • 11. Tang Y., You D., Yi H., Yang S., Zhao Y. IPRS: Leveraging Gene-Environment Interaction to Reconstruct Polygenic Risk Score. Front. Genet. 2022;13. doi: 10.3389/fgene.2022.801397.
  • 12. Di Scipio M., Khan M., Mao S., Chong M., Judge C., Pathan N., Perrot N., Nelson W., Lali R., Di S., et al. A versatile, fast and unbiased method for estimation of gene-by-environment interaction effects on biobank-scale datasets. Nat. Commun. 2023;14. doi: 10.1038/s41467-023-40913-7.
  • 13. Westerman K., Liu Q., Liu S., Parnell L.D., Sebastiani P., Jacques P., DeMeo D.L., Ordovás J.M. A gene-diet interaction-based score predicts response to dietary fat in the Women’s Health Initiative. Am. J. Clin. Nutr. 2020;111:893–902. doi: 10.1093/ajcn/nqaa037.
  • 14. Paré G., Cook N.R., Ridker P.M., Chasman D.I. On the Use of Variance per Genotype as a Tool to Identify Quantitative Trait Interaction Effects: A Report from the Women’s Genome Health Study. PLoS Genet. 2010;6. doi: 10.1371/journal.pgen.1000981.
  • 15. Wang H., Zhang F., Zeng J., Wu Y., Kemper K.E., Xue A., Zhang M., Powell J.E., Goddard M.E., Wray N.R., et al. Genotype-by-environment interactions inferred from genetic effects on phenotypic variability in the UK Biobank. Sci. Adv. 2019;5. doi: 10.1126/sciadv.aaw3538.
  • 16. Marderstein A.R., Davenport E.R., Kulm S., Van Hout C.V., Elemento O., Clark A.G. Leveraging phenotypic variability to identify genetic interactions in human phenotypes. Am. J. Hum. Genet. 2021;108:49–67. doi: 10.1016/j.ajhg.2020.11.016.
  • 17. Miao J., Lin Y., Wu Y., Zheng B., Schmitz L.L., Fletcher J.M., Lu Q. A quantile integral linear model to quantify genetic effects on phenotypic variability. Proc. Natl. Acad. Sci. USA. 2022;119. doi: 10.1073/pnas.2212959119.
  • 18. Johnson R., Sotoudeh R., Conley D. Polygenic Scores for Plasticity: A New Tool for Studying Gene–Environment Interplay. Demography. 2022;59:1045–1070. doi: 10.1215/00703370-9957418.
  • 19. Signer R., Seah C., Young H., Retallick-Townsley K., de Pins A., Cote A., Lee S., Jia M., Johnson J.S., Johnston K., et al. BMI Interacts with the Genome to Regulate Gene Expression Globally, with Emphasis in the Brain and Gut. Preprint at medRxiv. 2024. doi: 10.1101/2024.11.26.24317923.
  • 20. Tang H., Jiang L., Stolzenberg-Solomon R.Z., Arslan A.A., Beane Freeman L.E., Bracci P.M., Brennan P., Canzian F., Du M., Gallinger S., et al. Genome-wide gene-diabetes and gene-obesity interaction scan in 8,255 cases and 11,900 controls from panscan and PanC4 consortia. Cancer Epidemiol. Biomarkers Prev. 2020;29:1784–1791. doi: 10.1158/1055-9965.EPI-20-0275.
  • 21. Ojima T., Namba S., Suzuki K., Yamamoto K., Sonehara K., Narita A., Tohoku Medical Megabank Project Study Group. Biobank Japan Project. Kamatani Y., Tamiya G., et al. Body mass index stratification optimizes polygenic prediction of type 2 diabetes in cross-biobank analyses. Nat. Genet. 2024;56:1100–1109. doi: 10.1038/s41588-024-01782-y.
  • 22. Sinnott-Armstrong N., Tanigawa Y., Amar D., Mars N., Benner C., Aguirre M., Venkataraman G.R., Wainberg M., Ollila H.M., Kiiskinen T., et al. Genetics of 35 blood and urine biomarkers in the UK Biobank. Nat. Genet. 2021;53:185–194. doi: 10.1038/s41588-020-00757-z.
  • 23. Aschard H. A perspective on interaction effects in genetic association studies. Genet. Epidemiol. 2016;40:678–688. doi: 10.1002/gepi.21989.
  • 24. Motsinger-Reif A.A., Reif D.M., Akhtari F.S., House J.S., Campbell C.R., Messier K.P., Fargo D.C., Bowen T.A., Nadadur S.S., Schmitt C.P., et al. Gene-environment interactions within a precision environmental health framework. Cell Genom. 2024;4. doi: 10.1016/j.xgen.2024.100591.
  • 25. Westerman K.E., Sofer T. Many roads to a gene-environment interaction. Am. J. Hum. Genet. 2024;111:626–635. doi: 10.1016/j.ajhg.2024.03.002.
  • 26.Westerman K.E., Majarian T.D., Giulianini F., Jang D.-K., Miao J., Florez J.C., Chen H., Chasman D.I., Udler M.S., Manning A.K., Cole J.B. Variance-quantitative trait loci enable systematic discovery of gene-environment interactions for cardiometabolic serum biomarkers. Nat. Commun. 2022;13:3993. doi: 10.1038/s41467-022-31625-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Wang Y., Kanai M., Tan T., Kamariza M., Tsuo K., Yuan K., Zhou W., Okada Y., BioBank Japan Project. Huang H., et al. Polygenic prediction across populations is influenced by ancestry, genetic architecture, and methodology. Cell Genom. 2023;3 doi: 10.1016/j.xgen.2023.100408. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Simons N., Isaacs A., Koek G.H., Kuč S., Schaper N.C., Brouwers M.C.G.J. PNPLA3, TM6SF2, and MBOAT7 Genotypes and Coronary Artery Disease. Gastroenterology. 2017;152:912–913. doi: 10.1053/j.gastro.2016.12.020. [DOI] [PubMed] [Google Scholar]
  • 29.Liu D.J., Peloso G.M., Yu H., Butterworth A.S., Wang X., Mahajan A., Saleheen D., Emdin C., Alam D., Alves A.C., et al. Exome-wide association study of plasma lipids in >300,000 individuals. Nat. Genet. 2017;49:1758–1766. doi: 10.1038/ng.3977. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Durvasula A., Price A.L. Distinct explanations underlie gene-environment interactions in the UK Biobank. Am. J. Hum. Genet. 2025;112:644–658. doi: 10.1016/j.ajhg.2025.01.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Zhu C., Ming M.J., Cole J.M., Edge M.D., Kirkpatrick M., Harpak A. Amplification is the primary mode of gene-by-sex interaction in complex human traits. Cell Genom. 2023;3 doi: 10.1016/j.xgen.2023.100297. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Pazokitoroudi A., Liu Z., Dahl A., Zaitlen N., Rosset S., Sankararaman S. A scalable and robust variance components method reveals insights into the architecture of gene-environment interactions underlying complex traits. Am. J. Hum. Genet. 2024;111:1462–1480. doi: 10.1016/j.ajhg.2024.05.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Gao C., Marcketta A., Backman J.D., O’Dushlaine C., Staples J., Ferreira M.A.R., Lotta L.A., Overton J.D., Reid J.G., Mirshahi T., et al. Genome-wide association analysis of serum alanine and aspartate aminotransferase, and the modifying effects of BMI in 388k European individuals. Genet. Epidemiol. 2021;45:664–681. doi: 10.1002/gepi.22392. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Younossi Z.M., Koenig A.B., Abdelatif D., Fazel Y., Henry L., Wymer M. Global epidemiology of nonalcoholic fatty liver disease—Meta-analytic assessment of prevalence, incidence, and outcomes. Hepatology. 2016;64:73–84. doi: 10.1002/hep.28431. [DOI] [PubMed] [Google Scholar]
  • 35.Jamialahmadi O., De Vincentis A., Tavaglione F., Malvestiti F., Li-Gao R., Mancina R.M., Alvarez M., Gelev K., Maurotti S., Vespasiani-Gentilucci U., et al. Partitioned polygenic risk scores identify distinct types of metabolic dysfunction-associated steatotic liver disease. Nat. Med. 2024;30:3614–3623. doi: 10.1038/s41591-024-03284-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Zhang X., Bell J.T. Detecting genetic effects on phenotype variability to capture gene-by-environment interactions: a systematic method comparison. G3 (Bethesda). 2024;14 doi: 10.1093/G3JOURNAL/JKAE022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Bray G.A. Beyond BMI. Nutrients. 2023;15 doi: 10.3390/nu15102254. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Naito T., Inoue K., Namba S., Sonehara K., Suzuki K., BioBank Japan. Matsuda K., Kondo N., Toda T., Yamauchi T., et al. Machine learning reveals heterogeneous associations between environmental factors and cardiometabolic diseases across polygenic risk scores. Commun. Med. 2024;4:181. doi: 10.1038/s43856-024-00596-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Chang C.C., Chow C.C., Tellier L.C., Vattikuti S., Purcell S.M., Lee J.J. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Wang K., Li M., Hakonarson H. ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164. doi: 10.1093/nar/gkq603. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Bulik-Sullivan B.K., Loh P.-R., Finucane H.K., Ripke S., Yang J., Schizophrenia Working Group of the Psychiatric Genomics Consortium. Patterson N., Daly M.J., Price A.L., Neale B.M. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 2015;47:291–295. doi: 10.1038/ng.3211. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Bulik-Sullivan B., Finucane H.K., Anttila V., Gusev A., Day F.R., Loh P.-R., ReproGen Consortium. Psychiatric Genomics Consortium. Genetic Consortium for Anorexia Nervosa of the Wellcome Trust Case Control Consortium 3. Duncan L., et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 2015;47:1236–1241. doi: 10.1038/ng.3406. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.de Leeuw C.A., Mooij J.M., Heskes T., Posthuma D. MAGMA: Generalized Gene-Set Analysis of GWAS Data. PLoS Comput. Biol. 2015;11 doi: 10.1371/journal.pcbi.1004219. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Sudlow C., Gallacher J., Allen N., Beral V., Burton P., Danesh J., Downey P., Elliott P., Green J., Landray M., et al. UK Biobank: An Open Access Resource for Identifying the Causes of a Wide Range of Complex Diseases of Middle and Old Age. PLoS Med. 2015;12 doi: 10.1371/journal.pmed.1001779. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Karczewski K.J., Gupta R., Kanai M., Lu W., Tsuo K., Wang Y., Walters R.K., Turley P., Callier S., Baya N., et al. Pan-UK Biobank GWAS improves discovery, analysis of genetic architecture, and resolution into ancestry-enriched effects. medRxiv. 2024 doi: 10.1101/2024.03.13.24303864. Preprint at. [DOI] [PubMed] [Google Scholar]
  • 46.Bycroft C., Freeman C., Petkova D., Band G., Elliott L.T., Sharp K., Motyer A., Vukcevic D., Delaneau O., O’Connell J., et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203–209. doi: 10.1038/s41586-018-0579-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Tanigawa Y., Qian J., Venkataraman G., Justesen J.M., Li R., Tibshirani R., Hastie T., Rivas M.A. Significant sparse polygenic risk scores across 813 traits in UK Biobank. PLoS Genet. 2022;18 doi: 10.1371/journal.pgen.1010105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Cario M.C., Nelson B.L. Industrial Engineering; 1997. Modeling and Generating Random Vectors with Arbitrary Marginal Distributions and Correlation Matrix. [Google Scholar]
  • 49.Westerman K.E., Pham D.T., Hong L., Chen Y., Sevilla-González M., Sung Y.J., Sun Y.V., Morrison A.C., Chen H., Manning A.K. GEM: scalable and flexible gene–environment interaction analysis in millions of samples. Bioinformatics. 2021;37:3514–3520. doi: 10.1093/bioinformatics/btab223. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Pham D.T., Westerman K.E., Pan C., Chen L., Srinivasan S., Isganaitis E., Vajravelu M.E., Bacha F., Chernausek S., Gubitosi-Klug R., et al. Re-analysis and meta-analysis of summary statistics from gene–environment interaction studies. Bioinformatics. 2023;39 doi: 10.1093/bioinformatics/btad730. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Keller M.C. Gene × Environment Interaction Studies Have Not Properly Controlled for Potential Confounders: The Problem and the (Simple) Solution. Biol. Psychiatry. 2014;75:18–24. doi: 10.1016/j.biopsych.2013.09.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Choi S.W., O’Reilly P.F. PRSice-2: Polygenic Risk Score software for biobank-scale data. GigaScience. 2019;8 doi: 10.1093/gigascience/giz082. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Hail Team. Hail 0.2. https://hail.is/.
  • 54.Szczerbinski L., Mandla R., Schroeder P., Porneala B.C., Li J.H., Florez J.C., Mercader J.M., Manning A.K., Udler M.S. Algorithms for the identification of prevalent diabetes in the All of Us Research Program validated using polygenic scores – a new resource for diabetes precision medicine. medRxiv. 2023 doi: 10.1101/2023.09.05.23295061. Preprint at. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Khera A.V., Chaffin M., Zekavat S.M., Collins R.L., Roselli C., Natarajan P., Lichtman J.H., D’onofrio G., Mattera J., Dreyer R., et al. Whole-Genome Sequencing to Characterize Monogenic and Polygenic Contributions in Patients Hospitalized With Early-Onset Myocardial Infarction. Circulation. 2019;139:1593–1602. doi: 10.1161/CIRCULATIONAHA.118.035658. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Bick A.G., Metcalf G.A., Mayo K.R., Lichtenstein L., Rura S., Carroll R.J., Musick A., Linder J.E., Jordan I.K., Nagar S.D., et al. Genomic data in the All of Us Research Program. Nature. 2024;627:340–346. doi: 10.1038/s41586-023-06957-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Milacic M., Beavers D., Conley P., Gong C., Gillespie M., Griss J., Haw R., Jassal B., Matthews L., May B., et al. The Reactome Pathway Knowledgebase 2024. Nucleic Acids Res. 2024;52:D672–D678. doi: 10.1093/nar/gkad1025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Liberzon A., Birger C., Thorvaldsdóttir H., Ghandi M., Mesirov J.P., Tamayo P. The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 2015;1:417–425. doi: 10.1016/j.cels.2015.12.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.R Core Team . R Foundation for Statistical Computing; 2022. R: A Language and Environment for Statistical Computing. [Google Scholar]

Associated Data

Supplementary Materials

Document S1. Figures S1–S10 and Tables S1–S9 (mmc1.pdf)
Table S10. MAGMA gene set enrichment of ALT GWAS and iGWAS summary statistics, related to STAR Methods (mmc2.xlsx)
Table S11. MAGMA gene set enrichment of HDL GWAS and iGWAS summary statistics, related to STAR Methods (mmc3.xlsx)
Data S1. PGS weights for each score type and CRF as developed in the UKB, related to STAR Methods (mmc4.zip)
Document S2. Article plus supplemental information (mmc5.pdf)

Data Availability Statement

No new genetic or phenotypic data have been generated for this study. The UKB data, including genetic and phenotypic data, are under controlled access but can be obtained through application at https://www.ukbiobank.ac.uk/. UKB will consider data applications from bona fide researchers for health-related research that is in the public interest. AoU controlled tier data are available to authorized users on the Researcher Workbench (https://workbench.researchallofus.org/). Variant-specific weights allowing the calculation of all PGSs described here are provided as a supplemental file (Data S1). The code supporting the conclusions of this manuscript can be found on Zenodo (https://doi.org/10.5281/zenodo.17238511) and GitHub (https://github.com/kwesterman/ipgs).


Articles from Cell Genomics are provided here courtesy of Elsevier
