. 2023 Aug 25;14:5196. doi: 10.1038/s41467-023-40913-7

Table 4.

Comparison of MonsterLM with current methods for estimating gene-by-environment contributions

| Method | Description | Advantages | Limitations | MonsterLM |
|---|---|---|---|---|
| StructLMM [29] | Evaluates interaction variance for multiple environmental factors with a single SNP. | Fast, robust model for a variety of different environmental exposures. | Limited to interaction effects of only a single SNP or genotype. | Analyzes variance explained by interactions genome-wide (after LD-pruning). |
| CGI-GREML [30] | Uses a mix of parametrized models and restricted likelihood methods to estimate GxE. | Well-structured for identifying GxE interactions with categorical exposures. | >250 likelihood ratio tests; slow; cannot use continuous traits. | Can analyze continuous traits without categorizing them; quick, efficient Wald test for R². |
| GxEMM [31] | Linear mixed model method to detect GxE interactions across the genome and a single exposure. | Multiple parametrizations available to efficiently model GxE interaction effects. | Small sample size only; minimal number of SNPs. | Can analyze a large sample size with many SNPs via genotype partitioning and the conjugate gradient method. |
| GRSxE [32] | Method to detect total GxE interactions with a Gene-Risk Score. | Estimates the GxE contribution for all possible environmental factors with SNPs. | Assumes each SNP interacts equally with E. | Accounts for the unique interaction effect of each SNP with E. |
| LEMMA [33] | Linear mixed model method to detect GxE interactions across the genome and an estimated linear combination of exposures. | Considers the impact of overlapping environmental exposures when computing total GxE contributions across the genome. | Requires parametrization and model specification; uses an estimated linear combination of exposures, assuming all E’s interact with the same SNPs. | Tests for specific interaction with E rather than a linear combination of E. |
| GxEsum [34] | Estimates genome-wide GxE variance using GWAS summary statistics. | Has controlled type I error rates and unbiased GxE estimates; efficient computation-wise; can also be applied to binary disease traits; summary statistics advantages. | Disadvantages of using summary statistics include information limitations and population stratification bias susceptibility. | Individual-level data advantages include better handling of LD and complex interactions. |
| LDSC GxE [35] | Estimates genome-wide GxE variance using GxE GWAS (GWIS) summary statistics. GWIS replaces GWAS in standard LDSC regression [37] to estimate h²_GxE. | Utilizes GWAS summary statistics; computationally efficient; typically robust to confounding from stratification and common environmental effects. | Risk of h² underestimation in high LD or low polygenicity regions; bias when LD scores from reference population and GWAS mismatch. | Individual-level data advantages include better handling of LD and complex interactions; robust in low polygenicity scenarios. |
| MTG2 IGE [28, 36] | Estimates variances explained by additive effects of exposure variables, by ExE interactions, and covariance between genetic effects and exposomic effects. | Precise estimates with low standard deviations; flexible for a variety of exposure-based interactions. | Requires generation of a genomic relationship matrix file (grm file) using PLINK, which requires >1 TB RAM as N > 100,000. | Does not require generating a genomic relationship matrix, is computationally efficient, and can use biobank-scale individual-level data. |

Existing methods of biobank GxE estimation compared to MonsterLM. A description of eight methods for estimating GxE, with their respective advantages, limitations, and comparisons to MonsterLM.
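
To make the contrast in the GRSxE row concrete, the following is a minimal sketch on simulated data, not the MonsterLM implementation or any of the tabulated tools. It assumes independent standardized SNPs (a stand-in for LD-pruned genotypes), a single standardized exposure E, and fixed-magnitude interaction effects with random signs. The function names, sample sizes, and variance settings are hypothetical choices for illustration. The sketch estimates GxE variance as the gain in adjusted R² from adding one interaction term per SNP, and compares that with a stylized equal-weight score-by-E term.

```python
import numpy as np

def adjusted_r2(y, X):
    """Adjusted R² of an ordinary least-squares fit of y on X (intercept added here)."""
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    p = X1.shape[1] - 1  # number of predictors, excluding the intercept
    return 1.0 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))

def simulate(n=5000, m=50, var_g=0.2, var_gxe=0.1, seed=0):
    """Simulate a phenotype with additive and SNP-specific interaction effects.

    All settings are illustrative assumptions, not values from the source.
    """
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, m))      # stand-in for standardized, LD-pruned SNPs
    E = rng.standard_normal(n)           # standardized exposure
    GxE = G * E[:, None]                 # one interaction column per SNP
    # Fixed-magnitude effects with random signs, so each SNP interacts differently
    b = rng.choice([-1.0, 1.0], size=m) * np.sqrt(var_g / m)
    c = rng.choice([-1.0, 1.0], size=m) * np.sqrt(var_gxe / m)
    noise = rng.standard_normal(n) * np.sqrt(1.0 - var_g - var_gxe)
    y = G @ b + GxE @ c + noise
    return y, G, E, GxE

y, G, E, GxE = simulate()

# Baseline model: additive SNP effects plus the exposure main effect
base = adjusted_r2(y, np.column_stack([G, E]))

# Per-SNP interaction terms: each SNP gets its own interaction coefficient
gxe_r2 = adjusted_r2(y, np.column_stack([G, E, GxE])) - base

# Stylized equal-weight score: a single interaction term shared by all SNPs
score_x_e = G.sum(axis=1) * E
grs_r2 = adjusted_r2(y, np.column_stack([G, E, score_x_e])) - base

print(f"GxE variance, per-SNP interaction terms: {gxe_r2:.3f}")
print(f"GxE variance, equal-weight score term:   {grs_r2:.3f}")
```

Under these assumptions the per-SNP model should recover roughly the simulated interaction variance, while the single shared term captures little of it because the interaction effects differ in sign across SNPs. Fitting all interaction columns by dense least squares does not scale to biobank data, which is where the genotype partitioning and conjugate gradient approach named in the table comes in.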