Table 4.
Comparison of current methods for estimating gene-by-environment contributions with MonsterLM
Method | Description | Advantages | Limitations | MonsterLM |
---|---|---|---|---|
StructLMM29 | Evaluates interaction variance for multiple environmental factors with a single SNP. | Fast, robust model for a variety of different environmental exposures. | Limited to interaction effects of only a single SNP or genotype. | Analyzes variance explained by interactions genome-wide (after LD-pruning). |
CGI-GREML30 | Uses a mix of parametrized models and restricted likelihood methods to estimate GxE. | Well-structured for identifying GxE interactions with categorical exposures. | Requires >250 likelihood ratio tests; slow; cannot use continuous traits. | Can analyze continuous traits without categorizing them; quick, efficient Wald test for R². |
GxEMM31 | Linear mixed model method to detect GxE interactions across the genome and a single exposure. | Multiple parametrizations available to efficiently model GxE interaction effects. | Small sample size only; Minimal number of SNPs. | Can analyze a large sample size with many SNPs via genotype partitioning and conjugate gradient method. |
GRSxE32 | Method to detect total GxE interactions with a Gene-Risk Score. | Estimates the GxE contribution for all possible environmental factors with SNPs. | Assumes each SNP interacts equally with E. | Accounts for the unique interaction effect of each SNP with E. |
LEMMA33 | Linear mixed model method to detect GxE interactions across the genome and an estimated linear combination of exposures. | Considers the impact of overlapping environmental exposures when computing total GxE contributions across the genome. | Requires parametrization and model specification; uses an estimated linear combination of exposures, assuming all exposures interact with the same SNPs. | Tests for specific interaction with E rather than a linear combination of E. |
GxEsum34 | Estimates genome-wide GxE variance using GWAS summary statistics. | Has controlled type I error rates and unbiased GxE estimates; computationally efficient; can also be applied to binary disease traits; retains the advantages of summary statistics. | Summary statistics carry limited information and are susceptible to population stratification bias. | Individual-level data advantages include better handling of LD and complex interactions. |
LDSC GxE35 | Estimates genome-wide GxE variance using GxE GWAS (GWIS) summary statistics. GWIS replaces GWAS in standard LDSC regression37 to estimate this variance. | Utilizes GWAS summary statistics; computationally efficient; typically robust to confounding from stratification and common environmental effects. | Risk of underestimation in high LD or low polygenicity regions; bias when LD scores from the reference population do not match the GWAS sample. | Individual-level data advantages include better handling of LD and complex interactions; robust in low polygenicity scenarios. |
MTG2 IGE28, 36 | Estimates variances explained by additive effects of exposure variables, by ExE interactions, and covariance between genetic effects and exposomic effects. | Precise estimates with low standard deviations; flexible for a variety of exposure-based interactions. | Requires generation of a genomic relationship matrix (GRM) file using PLINK, which requires >1TB RAM when N > 100,000. | Does not require generating a genomic relationship matrix, is computationally efficient, and can use biobank-scale individual-level data. |
Existing methods of biobank GxE estimation compared to MonsterLM. A description of eight methods for estimating GxE contributions, with their respective advantages, limitations, and comparisons to MonsterLM.