eLife. 2024 Jun 24;13:e90459. doi: 10.7554/eLife.90459

Discovering non-additive heritability using additive GWAS summary statistics

Samuel Pattillo Smith, Gregory Darnell, Dana Udwin, Julian Stamp, Arbel Harpak, Sohini Ramachandran, Lorin Crawford
Editor: George H Perry
PMCID: PMC11196113  PMID: 38913556

Abstract

LD score regression (LDSC) is a method to estimate narrow-sense heritability from genome-wide association study (GWAS) summary statistics alone, making it a fast and popular approach. In this work, we present interaction-LD score (i-LDSC) regression: an extension of the original LDSC framework that accounts for interactions between genetic variants. By studying a wide range of generative models in simulations, and by re-analyzing 25 well-studied quantitative phenotypes from 349,468 individuals in the UK Biobank and up to 159,095 individuals in BioBank Japan, we show that the inclusion of a cis-interaction score (i.e. interactions between a focal variant and proximal variants) recovers genetic variance that is not captured by LDSC. For each of the 25 traits analyzed in the UK Biobank and BioBank Japan, i-LDSC detects additional variation contributed by genetic interactions. The i-LDSC software and its application to these biobanks represent a step towards resolving further genetic contributions of sources of non-additive genetic effects to complex trait variation.

Research organism: Human

Introduction

Heritability is defined as the proportion of phenotypic trait variation that can be explained by genetic effects (Bulik-Sullivan et al., 2015b; Bulik-Sullivan et al., 2015a; Shi et al., 2016). Until recently, studies of heritability in humans relied on family studies that were typically small and had known relatedness structures among individuals (Zaitlen et al., 2013; Polderman et al., 2015). Due to advances in genomic sequencing and the steady development of statistical tools, it is now possible to obtain reliable heritability estimates from biobank-scale data sets of unrelated individuals (Bulik-Sullivan et al., 2015b; Shi et al., 2016; Hou et al., 2019; Pazokitoroudi et al., 2020). Computational and privacy considerations with genome-wide association studies (GWAS) in these larger cohorts have motivated a recent trend to estimate heritability using summary statistics (i.e. estimated effect sizes and their corresponding standard errors). In the GWAS framework, additive effect sizes and standard errors for individual single nucleotide polymorphisms (SNPs) are estimated by regressing phenotype measurements onto the allele counts of each SNP independently. Through the application of this approach over the last two decades, it has become clear that many traits have a complex and polygenic basis; that is, hundreds to thousands of individual genetic loci across the genome often contribute to the genetic basis of variation in a single trait (Yengo et al., 2018).
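The per-SNP regression described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the paper: the function name `marginal_gwas` and the toy data are hypothetical, and no covariates or relatedness corrections are modeled.

```python
import numpy as np

def marginal_gwas(X, y):
    """Regress y on each SNP separately (simple regression with an intercept).

    X: (N, J) array of allele counts in {0, 1, 2}; y: (N,) quantitative trait.
    Returns per-SNP slope estimates and their standard errors.
    """
    N, J = X.shape
    Xc = X - X.mean(axis=0)              # centering absorbs the intercept
    yc = y - y.mean()
    sxx = (Xc ** 2).sum(axis=0)          # per-SNP sum of squares
    beta_hat = Xc.T @ yc / sxx           # one slope per SNP
    resid = yc[:, None] - Xc * beta_hat  # (N, J) residuals, one column per SNP
    sigma2 = (resid ** 2).sum(axis=0) / (N - 2)
    se = np.sqrt(sigma2 / sxx)
    return beta_hat, se

# Toy data (hypothetical): 500 individuals, 20 SNPs, one causal SNP.
rng = np.random.default_rng(0)
X = rng.binomial(2, 0.3, size=(500, 20)).astype(float)
y = 0.5 * X[:, 0] + rng.normal(size=500)
beta_hat, se = marginal_gwas(X, y)
chi2 = (beta_hat / se) ** 2              # per-SNP chi-square statistics
```

The pairs `(beta_hat, se)` are the summary statistics referred to in the text; the squared ratio gives one chi-square statistic per SNP.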

Many statistical methods have been developed to improve the estimation of heritability from GWAS summary statistics (Bulik-Sullivan et al., 2015b; Shi et al., 2016; Speed and Balding, 2019; Song et al., 2022). The most widely used of these approaches is linkage disequilibrium (LD) score regression and the corresponding LDSC software (Bulik-Sullivan et al., 2015b), which corrects for inflation in GWAS summary statistics by modeling the relationship between the variance of SNP-level effect sizes and the sum of correlation coefficients between focal SNPs and their genomic neighbors (i.e. the LD score of each variant). The formulation of the LDSC framework relies on the fact that the expected relationship between chi-square test statistics (i.e. the squared magnitude of GWAS allelic effect estimates) and LD scores holds when complex traits are generated under the infinitesimal (or polygenic) model which assumes: (i) all causal variants have the same expected contribution to phenotypic variation and (ii) causal variants are uniformly distributed along the genome. Initial simulations in Bulik-Sullivan et al. showed that violations of these assumptions can be tolerated to a point, but begin to affect the estimation of narrow-sense heritability once a certain proportion of variants have nonzero effects. Importantly, the estimand of the LDSC model is the proportion of phenotypic variance attributable to additive effects of genotyped SNPs. The main motivation behind the LDSC model is that, for polygenic traits, many marker SNPs tag nonzero effects. This may simply arise because some of these SNPs are in LD with causal variants (Bulik-Sullivan et al., 2015b) or because their statistical association is the product of a confounding factor such as population stratification.
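As a rough illustration of what an LD score is, the sketch below sums squared sample correlations between each focal SNP and its neighbors. The function name `ld_scores` and the SNP-count window are assumptions made for this example; the sketch uses raw squared correlations and does not reproduce any windowing or bias adjustment that the LDSC software may apply.

```python
import numpy as np

def ld_scores(X, window=None):
    """Sum of squared correlations between each SNP and its neighbors.

    X: (N, J) genotype matrix. window: number of SNPs on either side of the
    focal SNP to include (None means all SNPs). Hypothetical helper.
    """
    N, J = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each SNP
    R = Z.T @ Z / N                            # sample LD (correlation) matrix
    R2 = R ** 2
    if window is not None:
        idx = np.arange(J)
        mask = np.abs(idx[:, None] - idx[None, :]) <= window
        R2 = R2 * mask                         # zero out SNPs outside the window
    return R2.sum(axis=1)

# Toy data (hypothetical): SNPs 0 and 1 are perfectly correlated.
rng = np.random.default_rng(0)
a, b = rng.normal(size=100), rng.normal(size=100)
X = np.column_stack([a, a, b])
scores = ld_scores(X)
```

A SNP in perfect LD with a neighbor has an LD score of at least 2 (its own contribution plus the neighbor's), which is why well-tagged regions carry larger expected chi-square statistics under the polygenic model.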

As of late, there have been many efforts to build upon and improve the LDSC framework. For example, recent work has shown that it is possible to estimate the proportion of phenotypic variation explained by dominance effects (Palmer et al., 2023) and local ancestry (Chan et al., 2023) using extensions of the LDSC model. One limitation of LDSC is that, in practice, it only uses the diagonal elements of the squared LD matrix in its formulation which, while computationally efficient, does not account for information about trait architecture that is captured by the off-diagonal elements. This tradeoff helps LDSC to scale genome-wide, but it has also been shown to lead to heritability estimates with large standard error (Ning et al., 2020, Zhang et al., 2021, Song et al., 2022). Recently, newer approaches have attempted to reformulate the LDSC model by using the eigenvalues of the LD matrix to leverage more of the information present in the correlation structure between SNPs (Shi et al., 2016, Song et al., 2022).

In this paper, we show that the LDSC framework can be extended to estimate greater proportions of genetic variance in complex traits (i.e. beyond the variance that is attributable to additive effects) when a subset of causal variants is involved in a gene-by-gene (G×G) interaction. Indeed, recent association mapping studies have shown that G×G interactions can drive heterogeneity of causal variant effect sizes (Patel et al., 2022). Importantly, non-additive genetic effects have been proposed as one of the main factors that explains ‘missing’ heritability—the proportion of heritability not explained by the additive effects of variants (Eichler et al., 2010).

The key insight we highlight in this manuscript is that SNP-level GWAS summary statistics can provide evidence of non-additive genetic effects contributing to trait architecture if there is a nonzero correlation between individual-level genotypes and their statistical interactions. We present the ‘interaction-LD score’ regression model or i-LDSC: an extension of the LDSC framework which recovers ‘missing’ heritability by leveraging this ‘tagged’ relationship between linear and nonlinear genetic effects. To validate the performance of i-LDSC in simulation studies, we focus on synthetic trait architectures that have been generated with contributions stemming from second-order and cis-acting statistical SNP-by-SNP interaction effects; however, note that the general concept underlying i-LDSC can easily be extended to other sources of non-additive genetic effects (e.g. gene-by-environment interactions). The main difference between i-LDSC and LDSC is that the i-LDSC model includes an additional set of ‘cis-interaction’ LD scores in its regression model. These scores measure the amount of phenotypic variation contributed by genetic interactions that can be explained by additive effects. In practice, these additional scores are efficient to compute and require nothing more than access to a representative pairwise LD map, the same input required for LD score regression.

Through extensive simulations, we show that i-LDSC recovers substantial non-additive heritability that is not captured by LDSC when genetic interactions are indeed present in the generative model for a given complex trait. More importantly, i-LDSC has a calibrated type I error rate and does not overestimate contributions of genetic interactions to trait variation in simulated data when only additive effects are present. While analyzing 25 complex traits in the UK Biobank and BioBank Japan, we illustrate that pairwise interactions are a source of ‘missing’ heritability captured by additive GWAS summary statistics—suggesting that phenotypic variation due to non-additive genetic effects is more pervasive in human phenotypes than previously reported. Specifically, we find evidence of tagged genetic interaction effects contributing to heritability estimates in all of the 25 traits in the UK Biobank, and 23 of the 25 traits we analyzed in the BioBank Japan. We believe that i-LDSC, with our development of a new cis-interaction score, represents a significant step towards resolving the true contribution of genetic interactions.

Results

Overview of the interaction-LD score regression model

Interaction-LD score regression (i-LDSC) is a statistical framework for estimating heritability (i.e. the proportion of trait variance attributable to genetic variance). Here, we will give an overview of the i-LDSC method and its corresponding software, as well as detail how its underlying model differs from that of LDSC (Bulik-Sullivan et al., 2015b). We will assume that we are analyzing a GWAS data set $\mathcal{D} = \{\mathbf{X}, \mathbf{y}\}$ where $\mathbf{X}$ is an $N \times J$ matrix of genotypes with $J$ denoting the number of SNPs (each of which is encoded as {0, 1, 2} copies of a reference allele at each locus $j$) and $\mathbf{y}$ is an $N$-dimensional vector of measurements of a quantitative trait. The i-LDSC framework only requires summary statistics of individual-level data: namely, marginal effect size estimates for each SNP $\hat{\boldsymbol{\beta}}$ and a sample LD matrix $\mathbf{R}$ (which can be provided via reference panel data).

We begin by considering the following generative linear model for complex traits

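$$\mathbf{y} = b_0 + \mathbf{X}\boldsymbol{\beta} + \mathbf{W}\boldsymbol{\theta} + \boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, (1 - H^2)\mathbf{I}), \tag{1}$$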

where $b_0$ is an intercept term; $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_J)$ is a $J$-dimensional vector containing the true additive effect sizes for an additional copy of the reference allele at each locus on $\mathbf{y}$; $\mathbf{W}$ is an $N \times M$ matrix of (pairwise) cis-acting SNP-by-SNP statistical interactions between some subset of causal SNPs, where columns of this matrix are assumed to be the Hadamard (element-wise) product between genotypic vectors of the form $\mathbf{x}_j \circ \mathbf{x}_k$ for the $j$-th and $k$-th variants; $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_M)$ is an $M$-dimensional vector containing the interaction effect sizes; $\boldsymbol{\varepsilon}$ is a normally distributed error term with mean zero and variance $1 - H^2$, the proportion of phenotypic variation not explained by genetic effects (Bulik-Sullivan et al., 2015b), where $H^2$ denotes the broad-sense heritability of the trait; and $\mathbf{I}$ denotes an $N \times N$ identity matrix. For convenience, we will assume that the genotype matrix (column-wise) and the trait of interest have been mean-centered and standardized (Strandén and Christensen, 2011; de Los Campos et al., 2013; Zhou et al., 2013). Lastly, we will let the intercept term $b_0$ be a fixed parameter and we will assume that the effect sizes are each normally distributed with variances proportional to their individual contributions to trait heritability (Yang et al., 2010; Wu et al., 2011; Zhou et al., 2013; Crawford et al., 2017)

$$\beta_j \sim \mathcal{N}(0, \varphi_\beta^2 / J), \qquad \theta_m \sim \mathcal{N}(0, \varphi_\theta^2 / M). \tag{2}$$

Effectively, we say that $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = \varphi_\beta^2$ is the proportion of phenotypic variation contributed by additive SNP effects under the generative model, while $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = \varphi_\theta^2$ makes up the proportion of phenotypic variation contributed by genetic interactions. The appropriateness of treating genetic effects as random variables in analytical derivations has been questioned (de Los Campos et al., 2015); we will later justify the theory presented here with simulation results showing that i-LDSC accurately recovers non-additive genetic variance in Equation 1 under a broad range of conditions.
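A minimal sketch of drawing one trait from the generative model in Equations 1 and 2 is given below. The function name `simulate_trait`, the list `pairs` of interacting SNP indices, and the toy genotypes are hypothetical. The split $\varphi_\beta^2 = \rho H^2$ and $\varphi_\theta^2 = (1 - \rho) H^2$ is an assumption made here for illustration, with $\rho$ the share of heritability due to additive effects, and the intercept $b_0$ is set to zero because the genotypes are standardized. The sketch does not rescale the realized components to hit the target variances exactly.

```python
import numpy as np

def simulate_trait(X, pairs, H2=0.6, rho=0.5, seed=0):
    """Draw one phenotype from y = X beta + W theta + eps (Equations 1 and 2).

    X: (N, J) genotypes; pairs: list of (j, k) interacting SNP indices.
    Assumes phi_beta^2 = rho * H2 and phi_theta^2 = (1 - rho) * H2.
    """
    rng = np.random.default_rng(seed)
    N, J = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)           # standardized genotypes
    # Each column of W is the Hadamard product of two genotype vectors.
    W = np.column_stack([Z[:, j] * Z[:, k] for j, k in pairs])
    M = W.shape[1]
    phi_b2, phi_t2 = rho * H2, (1.0 - rho) * H2
    beta = rng.normal(0.0, np.sqrt(phi_b2 / J), size=J)   # Equation 2
    theta = rng.normal(0.0, np.sqrt(phi_t2 / M), size=M)  # Equation 2
    eps = rng.normal(0.0, np.sqrt(1.0 - H2), size=N)      # residual noise
    y = Z @ beta + W @ theta + eps                        # Equation 1, b0 = 0
    return y, beta, theta, W

# Toy data (hypothetical): 200 individuals, 10 SNPs, two interacting pairs.
rng = np.random.default_rng(42)
X = rng.binomial(2, 0.4, size=(200, 10)).astype(float)
pairs = [(0, 1), (2, 3)]
y, beta, theta, W = simulate_trait(X, pairs, H2=0.6, rho=0.5)
```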

There are two key takeaways from the generative model specified above. First, Equation 2 implies that the additive and non-additive components in Equation 1 are orthogonal to each other. In other words, $\mathbb{E}[\boldsymbol{\beta}^\top \mathbf{X}^\top \mathbf{W} \boldsymbol{\theta}] = \mathbb{E}[\boldsymbol{\beta}^\top] \mathbf{X}^\top \mathbf{W} \mathbb{E}[\boldsymbol{\theta}] = 0$. This is important because it means that there is a unique partitioning of genetic variance when studying a trait of interest. The second key takeaway is that the genotype matrix $\mathbf{X}$ and the matrix of genetic interactions $\mathbf{W}$ themselves are correlated despite being linearly independent (see Materials and methods). This property stems from the fact that the pairwise interaction between two SNPs is encoded as the Hadamard product of two genotypic vectors in the form $\mathbf{w}_m = \mathbf{x}_j \circ \mathbf{x}_k$ (which is a nonlinear function of the genotypes).

A central objective in GWAS is to infer how much phenotypic variation can be explained by genetic effects. To achieve that objective, a key consideration is allowing for the possibility that non-additive sources of genetic variation are partially explained by additive effect size estimates obtained from GWAS analyses (Hill et al., 2008). If we assume that the genotype and interaction matrices are correlated, then $\mathbf{X}$ and $\mathbf{W}$ are not completely orthogonal (i.e. such that $\mathbf{X}^\top \mathbf{W} \neq \mathbf{0}$) and the following relationship between the moment matrix $\mathbf{X}^\top \mathbf{y}$, the observed marginal GWAS summary statistics $\hat{\boldsymbol{\beta}}$, and the true coefficient values $\boldsymbol{\beta}$ from the generative model in Equation 1 holds in expectation (see Materials and methods)

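$$\mathbb{E}[\mathbf{X}^\top \mathbf{y}] = (\mathbf{X}^\top \mathbf{X})\boldsymbol{\beta} + (\mathbf{X}^\top \mathbf{W})\boldsymbol{\theta} \iff \mathbb{E}[\hat{\boldsymbol{\beta}}] = \mathbf{R}\boldsymbol{\beta} + \mathbf{V}\boldsymbol{\theta} \tag{3}$$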

where $\mathbf{R}$ is a sample estimate of the LD matrix, and $\mathbf{V}$ represents a sample estimate of the correlation between the individual-level genotypes $\mathbf{X}$ and the span of genetic interactions between causal SNPs in $\mathbf{W}$. Intuitively, the term $\mathbf{V}\boldsymbol{\theta}$ can be interpreted as the subset of pairwise interaction effects that are tagged by the additive effect estimates from the GWAS. Note that, when (i) non-additive genetic effects do not contribute to the overall architecture of a trait (i.e. such that $\boldsymbol{\theta} = \mathbf{0}$) or (ii) the genotype and interaction matrices $\mathbf{X}$ and $\mathbf{W}$ are uncorrelated, the equation above simplifies to a relationship between LD and summary statistics that is assumed in many GWAS analyses and methods (Hormozdiari et al., 2014; Nakka et al., 2016; Zhu and Stephens, 2017; Zhang et al., 2018; Zhu and Stephens, 2018; Cheng et al., 2020; Demetci et al., 2021).
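The relationship in Equation 3 can be checked numerically on a noise-free genetic value. In the sketch below, the scaling $\mathbf{R} = \mathbf{X}^\top\mathbf{X}/N$ and $\mathbf{V} = \mathbf{X}^\top\mathbf{W}/N$ is an assumption made for illustration (the exact estimators are described in Materials and methods), and the toy genotypes, interacting pairs, and effect sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, J = 400, 6

# Hypothetical standardized genotypes and two interacting SNP pairs.
X = rng.binomial(2, 0.3, size=(N, J)).astype(float)
X = (X - X.mean(axis=0)) / X.std(axis=0)
pairs = [(0, 1), (2, 3)]
W = np.column_stack([X[:, j] * X[:, k] for j, k in pairs])

beta = rng.normal(size=J)
theta = rng.normal(size=len(pairs))
g = X @ beta + W @ theta      # genetic value without the error term

R = X.T @ X / N               # sample LD matrix (assumed 1/N scaling)
V = X.T @ W / N               # genotype-by-interaction moments (assumed 1/N scaling)

lhs = X.T @ g / N             # marginal estimates implied by the genetic value
rhs = R @ beta + V @ theta    # right-hand side of Equation 3
```

Because the error term has mean zero, `lhs` and `rhs` agree exactly here; with noise added to `g`, the equality would hold only in expectation.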

The goal of i-LDSC is to increase estimates of genetic variance by accounting for sources of non-additive genetic effects that can be explained by additive GWAS summary statistics. To do this, we extend the LD score regression framework and the corresponding LDSC software (Bulik-Sullivan et al., 2015b). Here, according to Equation 3, we note that $\hat{\boldsymbol{\beta}} \sim \mathcal{N}(\mathbf{R}\boldsymbol{\beta} + \mathbf{V}\boldsymbol{\theta}, \lambda\mathbf{R})$ where $\lambda$ is a scale variance term due to uncontrolled confounding effects (Guan and Stephens, 2011; Song et al., 2022). Next, we condition on $\boldsymbol{\Theta} = (\boldsymbol{\beta}, \boldsymbol{\theta})$ and take the expectation of chi-square statistics $\boldsymbol{\chi}^2 = N\hat{\boldsymbol{\beta}}\hat{\boldsymbol{\beta}}^\top$ to yield

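$$
\begin{aligned}
\mathbb{E}[\hat{\boldsymbol{\beta}}\hat{\boldsymbol{\beta}}^\top]
&= \mathbb{E}\big[\mathbb{E}[\hat{\boldsymbol{\beta}}\hat{\boldsymbol{\beta}}^\top \mid \boldsymbol{\Theta}]\big] \\
&= \mathbb{E}\big[\mathbb{V}[\hat{\boldsymbol{\beta}} \mid \boldsymbol{\Theta}] + \mathbb{E}[\hat{\boldsymbol{\beta}} \mid \boldsymbol{\Theta}]\,\mathbb{E}[\hat{\boldsymbol{\beta}} \mid \boldsymbol{\Theta}]^\top\big] \\
&= \mathbb{E}\big[\lambda\mathbf{R} + (\mathbf{R}\boldsymbol{\beta} + \mathbf{V}\boldsymbol{\theta})(\mathbf{R}\boldsymbol{\beta} + \mathbf{V}\boldsymbol{\theta})^\top\big] \\
&= \mathbb{E}\big[\lambda\mathbf{R} + \mathbf{R}\boldsymbol{\beta}\boldsymbol{\beta}^\top\mathbf{R} + 2\mathbf{R}\boldsymbol{\beta}\boldsymbol{\theta}^\top\mathbf{V}^\top + \mathbf{V}\boldsymbol{\theta}\boldsymbol{\theta}^\top\mathbf{V}^\top\big] \\
&= \lambda\mathbf{R} + \left(\frac{\varphi_\beta^2}{J}\right)\mathbf{R}^2 + \left(\frac{\varphi_\theta^2}{M}\right)\mathbf{V}^2.
\end{aligned}
\tag{4}
$$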
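The last equality in Equation 4 follows from the prior in Equation 2. A short sketch of that step is given below, assuming (as the orthogonality argument above does) that $\boldsymbol{\beta}$ and $\boldsymbol{\theta}$ are independent with mean zero, and reading $\mathbf{V}^2$ as shorthand for $\mathbf{V}\mathbf{V}^\top$; the identity matrices $\mathbf{I}_J$ and $\mathbf{I}_M$ are notation introduced only for this illustration.

```latex
\begin{aligned}
\mathbb{E}[\boldsymbol{\beta}\boldsymbol{\beta}^\top] &= \frac{\varphi_\beta^2}{J}\,\mathbf{I}_J,
\qquad
\mathbb{E}[\boldsymbol{\theta}\boldsymbol{\theta}^\top] = \frac{\varphi_\theta^2}{M}\,\mathbf{I}_M,
\qquad
\mathbb{E}[\boldsymbol{\beta}\boldsymbol{\theta}^\top] = \mathbb{E}[\boldsymbol{\beta}]\,\mathbb{E}[\boldsymbol{\theta}]^\top = \mathbf{0}, \\
\mathbb{E}[\mathbf{R}\boldsymbol{\beta}\boldsymbol{\beta}^\top\mathbf{R}] &= \frac{\varphi_\beta^2}{J}\,\mathbf{R}\mathbf{R} = \frac{\varphi_\beta^2}{J}\,\mathbf{R}^2, \\
\mathbb{E}[\mathbf{V}\boldsymbol{\theta}\boldsymbol{\theta}^\top\mathbf{V}^\top] &= \frac{\varphi_\theta^2}{M}\,\mathbf{V}\mathbf{V}^\top, \\
\mathbb{E}[2\,\mathbf{R}\boldsymbol{\beta}\boldsymbol{\theta}^\top\mathbf{V}^\top] &= 2\,\mathbf{R}\,\mathbb{E}[\boldsymbol{\beta}\boldsymbol{\theta}^\top]\,\mathbf{V}^\top = \mathbf{0}.
\end{aligned}
```

The cross term therefore drops out, which is what allows the additive and interaction contributions to enter the expected chi-square statistics as two separate terms.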

We define $\ell_j = \sum_k r_{jk}^2$ as the LD score for the additive effect of the $j$-th variant (Bulik-Sullivan et al., 2015b), and $f_j = \sum_m v_{jm}^2$ as the ‘cis-interaction’ LD score, which encodes the pairwise interaction between the $j$-th variant and all other variants within a genomic window that is a pre-specified number of SNPs wide (Crawford et al., 2017). By considering only the diagonal elements of the LD matrix in the first term, similar to the original LDSC approach (Bulik-Sullivan et al., 2015b; Song et al., 2022), we get the following simplified regression model

$$\mathbb{E}[\boldsymbol{\chi}^2] \approx \mathbf{1} + \boldsymbol{\ell}\tau + \boldsymbol{f}\vartheta \tag{5}$$

where $\boldsymbol{\chi}^2 = (\chi_1^2, \ldots, \chi_J^2)$ is a $J$-dimensional vector of chi-square summary statistics, and $\boldsymbol{\ell} = (\ell_1, \ldots, \ell_J)$ and $\boldsymbol{f} = (f_1, \ldots, f_J)$ are $J$-dimensional vectors of additive and cis-interaction LD scores, respectively. Furthermore, we define the variance components $\tau = N\varphi_\beta^2/J$ and $\vartheta = N\varphi_\theta^2/M$ as the additive and non-additive regression coefficients of the model, and $\mathbf{1}$ is the intercept meant to model the bias factor due to uncontrolled confounding effects (e.g. cryptic relatedness structure). In practice, we efficiently compute the cis-interaction LD scores by considering only a subset of interactions between each $j$-th focal SNP and SNPs within a cis-proximal window around the $j$-th SNP. In our validation studies and applications, we base the width of this window on the observation that LD decays outside of a window of 1 centimorgan (cM); therefore, SNPs outside the 1 cM window centered on the $j$-th SNP will not significantly contribute to its LD scores. Note that the width of this window can be relaxed in the i-LDSC software when appropriate. We fit the i-LDSC model using weighted least squares to estimate regression parameters and derive p-values for identifying traits that have significant statistical evidence of tagged cis-interaction effects by testing the null hypothesis $H_0: \vartheta = 0$. Importantly, under the null model of a trait being generated by only additive effects, the i-LDSC model in Equation 5 reduces to an infinitesimal model (Fisher, 1999) or, in the case some variants have no effect on the trait, a polygenic model.
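The regression in Equation 5 can be sketched as an ordinary weighted least-squares fit. This is a simplified illustration rather than the i-LDSC software: the function names `cis_interaction_scores` and `fit_ildsc` are hypothetical, the intercept is estimated freely (an assumption), the weights are left to the caller, and no standard errors or p-values are computed.

```python
import numpy as np

def cis_interaction_scores(V):
    """f_j = sum_m v_jm^2, the row-wise sum of squares of V (J x M)."""
    return (np.asarray(V, dtype=float) ** 2).sum(axis=1)

def fit_ildsc(chi2, ell, f, weights=None):
    """Weighted least squares for chi2 ~ intercept + tau * ell + vartheta * f.

    weights: optional per-SNP regression weights (uniform if None).
    Returns (intercept, tau, vartheta). Hypothetical helper.
    """
    A = np.column_stack([np.ones_like(ell), ell, f])   # design matrix
    w = np.ones_like(chi2) if weights is None else np.asarray(weights, dtype=float)
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns weighted into ordinary least squares.
    coef, *_ = np.linalg.lstsq(A * sw[:, None], chi2 * sw, rcond=None)
    intercept, tau, vartheta = coef
    return intercept, tau, vartheta

# Noise-free toy scores (hypothetical values) with known coefficients.
ell = np.linspace(1.0, 50.0, 200)
f = 1.0 + np.cos(ell) ** 2
chi2 = 1.0 + 0.02 * ell + 0.5 * f
intercept, tau, vartheta = fit_ildsc(chi2, ell, f)
```

With noise-free inputs the fit returns the coefficients used to build `chi2`; on real summary statistics, the estimate of `vartheta` is the quantity tested against the null hypothesis of no tagged cis-interaction effects.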

Lastly, we want to note the empirical observation that the additive ($\boldsymbol{\ell}$) and interaction ($\boldsymbol{f}$) LD scores are only weakly correlated. This is important because it indicates that the presence of cis-interaction LD scores in the model specified in Equation 5 has little-to-no influence over the estimate for the additive coefficient $\tau$. Instead, the inclusion of $\boldsymbol{f}$ creates a multivariate model that can identify the proportion of variance explained by both additive and non-additive effects in summary statistics. In other words, we can interpret $\hat{\vartheta}$ as an estimate of the phenotypic variation explained by tagged cis-acting interaction effects. The concept of additive genetic effects partially explaining non-additive variation has also been described in various studies from quantitative genetics (Hill et al., 2008; Hivert et al., 2021; Mäki-Tanila and Hill, 2014). Under Hardy-Weinberg equilibrium, it can be shown that the additive variance explained by $J$ SNPs takes on the following form (Materials and methods) (Falconer and Mackay, 1983)

$$\sigma_A^2 = \sum_{j=1}^{J} 2p_j(1 - p_j)\left[\beta_j + 2\sum_{k \neq j}^{J} p_k\theta_{jk}\right]^2. \tag{6}$$

The expression for the additive variance $\sigma_A^2$ in Equation 6 is important because it represents the theoretical upper bound on the proportion of total phenotypic variance that can be recovered from GWAS summary statistics using the i-LDSC framework. As a result, we use the sum of coefficient estimates $\hat{\tau} + \hat{\vartheta}$, which is bounded above by $\sigma_A^2$, to construct i-LDSC heritability estimates. A full derivation of the cis-interaction regression framework and details about its corresponding implementation in our software i-LDSC can be found in Materials and methods.
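Equation 6 is straightforward to evaluate for a small example. In the sketch below, the function name `additive_variance` is hypothetical, $p_j$ is assumed to be the reference allele frequency of SNP $j$ (the symbol is not defined in this section), and the interaction effects are stored as a $J \times J$ array with entry `theta[j, k]` for the pair $(j, k)$.

```python
import numpy as np

def additive_variance(p, beta, theta):
    """Evaluate Equation 6.

    p: (J,) allele frequencies (assumed meaning of p_j); beta: (J,) additive
    effects; theta: (J, J) pairwise interaction effects theta[j, k].
    """
    p = np.asarray(p, dtype=float)
    beta = np.asarray(beta, dtype=float)
    theta = np.asarray(theta, dtype=float).copy()
    np.fill_diagonal(theta, 0.0)             # the inner sum excludes k == j
    avg_effect = beta + 2.0 * theta @ p      # bracketed term in Equation 6
    return float(np.sum(2.0 * p * (1.0 - p) * avg_effect ** 2))

# Two SNPs at frequency 0.5 (hypothetical values).
p = [0.5, 0.5]
beta = [1.0, 0.0]
no_interaction = additive_variance(p, beta, np.zeros((2, 2)))
with_interaction = additive_variance(p, beta, [[0.0, 0.5], [0.5, 0.0]])
```

In this toy case the additive variance is 0.5 without interactions and 1.25 with them, which illustrates how interaction effects can inflate the variance captured by additive terms.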

Detection of tagged pairwise interaction effects using i-LDSC in simulations

We illustrate the power of i-LDSC across different genetic trait architectures via extensive simulation studies (Materials and methods). We generate synthetic phenotypes using real genome-wide genotype data from individuals of self-identified European ancestry in the UK Biobank. To do so, we first assume that traits have a polygenic architecture where all SNPs have a nonzero additive effect. Next, we randomly select a set of causal cis-interaction variants and divide them into two interacting groups (Materials and methods). One may interpret the SNPs in group #1 as being the ‘hubs’ in an interaction map (Crawford et al., 2017), whereas SNPs in group #2 are selected to be variants within some kilobase (kb) window around each SNP in group #1. We assume a wide range of simulation scenarios by varying the following parameters:

  • heritability: $H^2$ = 0.3 and 0.6;

  • proportion of phenotypic variation that is generated by additive effects: $\rho$ = 0.5, 0.8, and 1;

  • percentage of SNPs selected to be in group #1: 1%, 5%, and 10%;

  • genomic window used to assign SNPs to group #2: ± 10 and ± 100 kb.

We also varied the correlation between SNP effect size and minor allele frequency (MAF; as discussed in Schoech et al., 2019). All results presented in this section are based on 100 different simulated phenotypes for each parameter combination.
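The simulation design listed above can be enumerated as a parameter grid. The sketch below is illustrative only: the variable names are hypothetical, the minor allele frequency dependency is left out of the grid, and the split of $H^2$ into additive variance $\rho H^2$ and interaction variance $(1 - \rho) H^2$ is an assumption based on how $\rho$ is used in the figure captions.

```python
from itertools import product

H2_values = (0.3, 0.6)                 # heritability
rho_values = (0.5, 0.8, 1.0)           # share generated by additive effects
group1_fractions = (0.01, 0.05, 0.10)  # share of SNPs placed in group #1
window_kb = (10, 100)                  # +/- window used to assign group #2
n_replicates = 100                     # simulated phenotypes per combination

grid = [
    {
        "H2": h2,
        "rho": rho,
        "frac_group1": frac,
        "window_kb": win,
        "additive_var": rho * h2,              # assumed split of H2
        "interaction_var": (1.0 - rho) * h2,   # zero when rho = 1 (null case)
    }
    for h2, rho, frac, win in product(H2_values, rho_values, group1_fractions, window_kb)
]

total_phenotypes = len(grid) * n_replicates
```

Settings with `rho = 1.0` correspond to the null scenario in which no interaction variance is simulated, so the group #1 fraction and window size have no effect there.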

Figure 1 demonstrates that i-LDSC robustly detects significant tagged non-additive genetic variance, regardless of the total number of causal interactions genome-wide. Instead, the power of i-LDSC depends on the proportion of phenotypic variation that is generated by additive versus interaction effects ($\rho$), and its power tends to scale with the window size used to compute the cis-interaction LD scores (see Materials and methods). i-LDSC shows a similar performance for detecting tagged cis-interaction effects when the effect sizes of causal SNPs depend on their minor allele frequency and when we varied the number of SNPs assigned to be in group #2 within 10 kb and 100 kb windows, respectively (Figure 1—figure supplements 1–5).

Figure 1. Power of the i-LDSC framework to detect tagged pairwise genetic interaction effects on simulated data.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of causal cis-interaction variants and divide them into two interacting groups. The group #1 SNPs are chosen to be 1%, 5%, and 10% of the total number of SNPs genome-wide (see the x-axis in each panel). These interact with the group #2 SNPs which are selected to be variants within a ± 10 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency $\alpha = 0$ (see Materials and methods). Panels (A) and (B) are results with simulations using a heritability $H^2 = 0.3$, while panels (C) and (D) were generated with $H^2 = 0.6$. We also varied the proportion of heritability contributed by additive effects to (A, C) $\rho = 0.5$ and (B, D) $\rho = 0.8$, respectively. Here, we are blind to the parameter settings used in the generative model and run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. Results are based on 100 simulations per parameter combination and the horizontal bars represent standard errors. Generally, the performance of i-LDSC increases with larger heritability and lower proportions of additive variation. Note that LDSC is not shown here because it does not search for tagged interaction effects in summary statistics.


Figure 1—figure supplement 1. Power calculations for the i-LDSC framework to detect tagged pairwise genetic interaction effects on simulated data using a ± 10 kilobase (kb) window to generate cis-interactions around a focal SNP with a moderate minor allele frequency dependency α=-0.5 for effect sizes.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of causal cis-interaction variants and divide them into two interacting groups. The group #1 SNPs are chosen to be 1%, 5%, and 10% of the total number of SNPs genome-wide (see the x-axis in each panel). These interact with the group #2 SNPs which are selected to be variants within a ± 10 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with minor allele frequency dependency $\alpha = -0.5$ (see Materials and methods). Panels (A) and (B) are results of simulations where the total heritability explained by additive SNP effects and cis-interaction effects is $H^2 = 0.3$, while panels (C) and (D) were generated with $H^2 = 0.6$. We also varied the proportion of phenotypic variation explained by additive SNP effects to (A, C) $\rho = 0.5$ and (B, D) $\rho = 0.8$, respectively. Here, we are blind to the parameter settings used in the generative model and run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. Results are based on 100 simulations per parameter combination and the horizontal black bars represent standard errors.

Figure 1—figure supplement 2. Power calculations for the i-LDSC framework to detect tagged pairwise genetic interaction effects on simulated data using a ± 10 kilobase (kb) window to generate cis-interactions around a focal SNP with a strong minor allele frequency dependency $\alpha = -1$ for effect sizes.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of causal cis-interaction variants and divide them into two interacting groups. The group #1 SNPs are chosen to be 1%, 5%, and 10% of the total number of SNPs genome-wide (see the x-axis in each panel). These interact with the group #2 SNPs which are selected to be variants within a ± 10 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with minor allele frequency dependency $\alpha = -1$ (see Materials and methods). Panels (A) and (B) are results of simulations where the total heritability explained by additive SNP effects and cis-interaction effects is $H^2 = 0.3$, while panels (C) and (D) were generated with $H^2 = 0.6$. We also varied the proportion of phenotypic variation explained by additive SNP effects to (A, C) $\rho = 0.5$ and (B, D) $\rho = 0.8$, respectively. Here, we are blind to the parameter settings used in the generative model and run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. Results are based on 100 simulations per parameter combination and the horizontal black bars represent standard errors.

Figure 1—figure supplement 3. Power calculations for the i-LDSC framework to detect tagged pairwise genetic interaction effects on simulated data using a ± 10 kilobase (kb) window to generate cis-interactions around a focal SNP with no minor allele frequency dependency $\alpha = 0$ for effect sizes.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of causal cis-interaction variants and divide them into two interacting groups. The group #1 SNPs are chosen to be 1%, 5%, and 10% of the total number of SNPs genome-wide (see the x-axis in each panel). These interact with the group #2 SNPs which are selected to be variants within a ± 10 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency $\alpha = 0$ (see Materials and methods). Panels (A) and (B) are results of simulations where the total heritability explained by additive SNP effects and cis-interaction effects is $H^2 = 0.3$, while panels (C) and (D) were generated with $H^2 = 0.6$. We also varied the proportion of phenotypic variation explained by additive SNP effects to (A, C) $\rho = 0.5$ and (B, D) $\rho = 0.8$, respectively. Here, we are blind to the parameter settings used in the generative model and run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. Results are based on 100 simulations per parameter combination and the horizontal black bars represent standard errors.

Figure 1—figure supplement 4. Power calculations for the i-LDSC framework to detect tagged pairwise genetic interaction effects on simulated data using a ± 100 kilobase (kb) window to generate cis-interactions around a focal SNP with a moderate minor allele frequency dependency $\alpha = -0.5$ for effect sizes.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of causal cis-interaction variants and divide them into two interacting groups. The group #1 SNPs are chosen to be 1%, 5%, and 10% of the total number of SNPs genome-wide (see the x-axis in each panel). These interact with the group #2 SNPs which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with minor allele frequency dependency $\alpha = -0.5$ (see Materials and methods). Panels (A) and (B) are results of simulations where the total heritability explained by additive SNP effects and cis-interaction effects is $H^2 = 0.3$, while panels (C) and (D) were generated with $H^2 = 0.6$. We also varied the proportion of phenotypic variation explained by additive SNP effects to (A, C) $\rho = 0.5$ and (B, D) $\rho = 0.8$, respectively. Here, we are blind to the parameter settings used in the generative model and run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. Results are based on 100 simulations per parameter combination and the horizontal black bars represent standard errors.

Figure 1—figure supplement 5. Power calculations for the i-LDSC framework to detect tagged pairwise genetic interaction effects on simulated data using a ± 100 kilobase (kb) window to generate cis-interactions around a focal SNP with a strong minor allele frequency dependency $\alpha = -1$ for effect sizes.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of causal cis-interaction variants and divide them into two interacting groups. The group #1 SNPs are chosen to be 1%, 5%, and 10% of the total number of SNPs genome-wide (see the x-axis in each panel). These interact with the group #2 SNPs which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with minor allele frequency dependency $\alpha = -1$ (see Materials and methods). Panels (A) and (B) are results of simulations where the total heritability explained by additive SNP effects and cis-interaction effects is $H^2 = 0.3$, while panels (C) and (D) were generated with $H^2 = 0.6$. We also varied the proportion of phenotypic variation explained by additive SNP effects to (A, C) $\rho = 0.5$ and (B, D) $\rho = 0.8$, respectively. Here, we are blind to the parameter settings used in the generative model and run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. Results are based on 100 simulations per parameter combination and the horizontal black bars represent standard errors.

Importantly, i-LDSC does not falsely identify putative non-additive genetic effects in GWAS summary statistics when the synthetic phenotype was generated by only additive effects ($\rho = 1$). Figure 2 illustrates the performance of i-LDSC under the null hypothesis $H_0: \vartheta = 0$, with the type I error rates for different estimation window sizes of the cis-interaction LD scores highlighted in panel A. Here, we also show that, when no genetic interaction effects are present, i-LDSC unbiasedly estimates the cis-interaction coefficient in the regression model to be $\hat{\vartheta} = 0$ (Figure 2B), robustly estimates the heritability (Figure 2C), and provides well-calibrated p-values when assessed over many traits (Figure 2D). This behavior is consistent across different MAF-dependent effect size distributions, and p-value calibration is not sensitive to misspecification of the estimation windows used to generate the cis-interaction LD scores (Figure 2—figure supplements 1–2).
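The calibration checks described above reduce to two simple computations on the p-values collected across replicates. The sketch below is a generic illustration with hypothetical function names and placeholder p-values; it is not taken from the i-LDSC software.

```python
import numpy as np

def type1_error(pvals, alpha=0.05):
    """Fraction of null replicates whose p-value falls below alpha."""
    return float(np.mean(np.asarray(pvals, dtype=float) < alpha))

def qq_points(pvals):
    """Expected versus observed -log10(p) under a uniform null distribution."""
    p = np.sort(np.asarray(pvals, dtype=float))
    n = p.size
    expected = (np.arange(1, n + 1) - 0.5) / n   # uniform plotting positions
    return -np.log10(expected), -np.log10(p)

# Placeholder p-values (hypothetical) from four null replicates.
example_pvals = [0.01, 0.2, 0.5, 0.04]
rate = type1_error(example_pvals)
expected_q, observed_q = qq_points(example_pvals)
```

For a well-calibrated test, the type I error rate should sit close to the nominal threshold and the QQ points should follow the diagonal; the same `type1_error` computation applied to replicates simulated with interactions gives an empirical power estimate.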

Figure 2. The i-LDSC framework is well-calibrated under the null hypothesis and does not identify evidence of tagged non-additive effects when polygenic traits are generated by only additive effects.

In these simulations, synthetic trait architecture is made up of only additive genetic variation (i.e. ρ=1). Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency α=0 (see Materials and methods). Here, we are blind to the parameter settings used in the generative model and run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. (A) Mean type I error rate using the i-LDSC framework across an array of estimation window sizes for the cis-interaction LD scores. This is determined by assessing the p-value of the cis-interaction coefficient (ϑ) in the i-LDSC regression model and checking whether p < 0.05. (B) Estimates of the cis-interaction coefficient (ϑ). Since traits were simulated with only additive effects, these estimates should be centered around zero. (C) Estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) where the true additive variance is set to H2ρ=0.6. (D) QQ-plot of the p-values for the cis-interaction coefficient (ϑ) in i-LDSC. Results are based on 100 simulations per parameter combination and the horizontal bars represent standard errors.

Figure 2—figure supplement 1. The i-LDSC framework is well-calibrated under the null hypothesis and does not identify evidence of tagged non-additive effects when polygenic traits are generated by only additive effects and a moderate minor allele frequency dependency α=-0.5 for effect sizes.

In these simulations, synthetic trait architecture is made up of only additive genetic variation (i.e. ρ=1). Coefficients for additive and interaction effects were simulated with minor allele frequency dependency α=-0.5 (see Materials and methods). Here, we are blind to the parameter settings used in the generative model and run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. (A) Mean type I error rate using the i-LDSC framework across an array of estimation window sizes for the cis-interaction LD scores. This is determined by assessing the p-value of the cis-interaction coefficient (ϑ) in the i-LDSC regression model and checking whether p < 0.05. (B) Estimates of the cis-interaction coefficient (ϑ). Since traits were simulated with only additive effects, these estimates should be centered around zero. (C) Estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) where the true additive variance is set to H2ρ=0.6. (D) QQ-plot of the p-values for the cis-interaction coefficient (ϑ) in i-LDSC. Results are based on 100 simulations per parameter combination and the horizontal black bars represent standard errors.
Figure 2—figure supplement 2. The i-LDSC framework is well-calibrated under the null hypothesis and does not identify evidence of tagged non-additive effects when polygenic traits are generated by only additive effects and a strong minor allele frequency dependency α=-1 for effect sizes.

In these simulations, synthetic trait architecture is made up of only additive genetic variation (i.e. ρ=1). Coefficients for additive and interaction effects were simulated with minor allele frequency dependency α=-1 (see Materials and methods). Here, we are blind to the parameter settings used in the generative model and run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. (A) Mean type I error rate using the i-LDSC framework across an array of estimation window sizes for the cis-interaction LD scores. This is determined by assessing the p-value of the cis-interaction coefficient (ϑ) in the i-LDSC regression model and checking whether p < 0.05. (B) Estimates of the cis-interaction coefficient (ϑ). Since traits were simulated with only additive effects, these estimates should be centered around zero. (C) Estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) where the true additive variance is set to H2ρ=0.6. (D) QQ-plot of the p-values for the cis-interaction coefficient (ϑ) in i-LDSC. Results are based on 100 simulations per parameter combination and the horizontal black bars represent standard errors.

One of the innovations that i-LDSC offers over the traditional LDSC framework is increased heritability estimates after the identification of non-additive genetic effects that are tagged by GWAS summary statistics. Here, we applied both methods to the same set of simulations in order to understand how LDSC behaves for traits generated with cis-interaction effects. Figure 3 depicts boxplots of the heritability estimates for each approach and shows that, across an array of different synthetic phenotype architectures, LDSC captures less of the phenotypic variance explained by all genetic effects. It is important to note that i-LDSC can yield upwardly biased heritability estimates when the cis-interaction scores are computed over genomic window sizes that are too small; however, these estimates become more accurate for larger window size choices (Figure 3—figure supplement 1). In contrast to LDSC, which aims to capture phenotypic variance attributable to the additive effects of genotyped SNPs, i-LDSC accurately partitions genetic effects into additive versus cis-interacting components, which in turn generally allows i-LDSC to capture more genetic variance. The mean absolute errors between the true generative heritability and the heritability estimates produced by i-LDSC and LDSC are shown in Supplementary files 1 and 2, respectively. Generally, the error in heritability estimates is higher for LDSC than it is for i-LDSC across each of the scenarios that we consider.
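To make the partitioning idea concrete, the sketch below regresses per-SNP χ2 statistics on an intercept, an additive LD score, and a cis-interaction LD score, and reads off the two slopes as stand-ins for τ and ϑ. It is an illustration under stated assumptions, not the i-LDSC implementation: the function name `fit_two_score_regression` is hypothetical, all data are synthetic, and plain unweighted least squares is used here, whereas LDSC-style software typically relies on weighted regression with block-jackknife standard errors.

```python
import numpy as np

def fit_two_score_regression(chi2, ld_score, cis_score):
    """Unweighted least squares of chi2 on [1, ld_score, cis_score].

    Returns (intercept, tau_hat, theta_hat).
    """
    X = np.column_stack([np.ones_like(ld_score), ld_score, cis_score])
    coef, *_ = np.linalg.lstsq(X, chi2, rcond=None)
    return tuple(float(c) for c in coef)

# Synthetic toy inputs (assumed values, chosen only for illustration)
rng = np.random.default_rng(1)
n_snps = 5000
ld_score = rng.gamma(shape=2.0, scale=10.0, size=n_snps)
cis_score = rng.gamma(shape=2.0, scale=5.0, size=n_snps)
true_tau, true_theta = 0.02, 0.01
chi2 = (1.0 + true_tau * ld_score + true_theta * cis_score
        + rng.normal(scale=0.1, size=n_snps))

intercept, tau_hat, theta_hat = fit_two_score_regression(
    chi2, ld_score, cis_score)
```

Dropping the `cis_score` column from the design matrix gives the single-score regression that this sketch uses as a stand-in for LDSC, which makes it easy to see how signal carried by the second score is otherwise left unmodeled.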

Figure 3. i-LDSC robustly and accurately estimates the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) in simulations of polygenic traits, compared to LDSC, because it accounts for interaction effects tagged in additive GWAS summary statistics.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank (see Materials and methods). All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of interacting variants and divide them into two groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs, which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency α=0 (see Materials and methods). Here, we assume a heritability of (A) H2=0.3 or (B) H2=0.6 (marked by the black dotted lines, respectively), and we vary the proportion contributed by additive effects with ρ={0.2,0.4,0.6,0.8}. The grey dotted lines represent the total contribution of additive effects in the generative model for the synthetic traits (H2ρ). i-LDSC outperforms LDSC in recovering heritability across each scenario. Results are based on 100 simulations per parameter combination.

Figure 3—figure supplement 1. i-LDSC robustly and accurately estimates the proportions of phenotypic variance explained (PVE) by genetic effects in polygenic traits by accounting for interaction effects tagged by GWAS summary statistics.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of interacting variants and divide them into two groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs, which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. Coefficients for additive and cis-interaction effects were simulated with no minor allele frequency dependency α=0 (see Materials and methods). Here, we assume that the total heritability explained by additive SNP and cis-interaction effects is (A) H2=0.3 or (B) H2=0.6 (marked by the black dotted lines, respectively), and we vary the proportion contributed by additive effects with ρ={0.2,0.4,0.6,0.8}. The grey dotted line represents the total contribution of additive effects in the generative model for the synthetic traits (H2ρ). We run i-LDSC while computing the cis-interaction LD scores using different estimation windows of ± 5, ± 10, ± 25, and ± 50 SNPs, respectively. These results help motivate the selection of scores calculated using a ± 50 SNP window in our empirical analyses. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 2. Performance of LDSC and i-LDSC on simulated polygenic traits with architectures that are determined by additive, cis-interaction, and gene-by-environment (G×E) effects.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of interacting variants and divide them into two groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs, which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. G×E effects were simulated using an amplification model (Zhu et al., 2023; see Materials and methods) where we split the sample population in half to emulate two subsets of individuals coming from different environments. We randomly draw variant effect sizes for the first environment from a standard Gaussian distribution. Then effect sizes for the second environment are set to be the product of the effect sizes from the first environment and an amplifier w=[1.1,1.2,…,2] (see the x-axis in each panel). Both the cis-interaction and G×E effects were set to explain a quarter of the total phenotypic variation and the remaining half was explained by additive SNP effects. Panels (A) and (B) show estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) from LDSC and i-LDSC, respectively. Panels (C) and (D) show i-LDSC estimates of the phenotypic variation explained by tagged non-additive genetic effects using the cis-interaction LD score (i.e. estimates of ϑ). We assume the total heritability explained by all genetic effects to be (A, C) H2=0.6 and (B, D) H2=0.3. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 3. Performance of LDSC and i-LDSC on simulated polygenic traits with architectures that are determined by additive, cis-interaction, and gene-by-ancestry (G×Ancestry) effects with principal components (PCs) included in the GWAS model to correct for additional structure.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of interacting variants and divide them into two groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs, which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. G×Ancestry effects were simulated as the product of individual genotypes and the SNP loadings for each of the first 10 PCs (see the x-axis in each panel). Both the cis-interaction and G×Ancestry effects were set to explain a quarter of the total phenotypic variation and the remaining half was explained by additive SNP effects. The proportion of genotypic variance explained by each PC is shown in green. Panels (A) and (B) show estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) from LDSC and i-LDSC, respectively. Panels (C) and (D) show i-LDSC estimates of the phenotypic variation explained by tagged non-additive genetic effects using the cis-interaction LD score (i.e. estimates of ϑ). We assume the total heritability explained by all genetic effects to be (A, C) H2=0.6 and (B, D) H2=0.3. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 4. Performance of LDSC and i-LDSC on simulated polygenic traits with architectures that are determined by additive, cis-interaction, and gene-by-ancestry (G×Ancestry) effects without correcting for the additional structure in the GWAS analysis.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select a set of interacting variants and divide them into two groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs, which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. G×Ancestry effects were simulated as the product of individual genotypes and the SNP loadings for each of the first 10 PCs (see the x-axis in each panel). Both the cis-interaction and G×Ancestry effects were set to explain a quarter of the total phenotypic variation and the remaining half was explained by additive SNP effects. The proportion of genotypic variance explained by each PC is shown in green. Panels (A) and (B) show estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) from LDSC and i-LDSC, respectively. Panels (C) and (D) show i-LDSC estimates of the phenotypic variation explained by tagged non-additive genetic effects using the cis-interaction LD score (i.e. estimates of ϑ). We assume the total heritability explained by all genetic effects to be (A, C) H2=0.6 and (B, D) H2=0.3. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 5. Performance of LDSC and i-LDSC on simulated polygenic traits with architectures that are determined by only additive and gene-by-environment (G×E) effects.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). G×E effects were simulated using an amplification model (Zhu et al., 2023; see Materials and methods) where we split the sample population in half to emulate two subsets of individuals coming from different environments. We randomly draw variant effect sizes for the first environment from a standard Gaussian distribution. Then effect sizes for the second environment are set to be the product of the effect sizes from the first environment and an amplifier w=[1.1,1.2,…,2] (see the x-axis in each panel). Additive and G×E effects were set to explain half of the phenotypic variation. Note that unlike results depicted in Figure 3—figure supplement 2, there are no cis-interaction effects that affect trait architecture. Here, panels (A) and (B) show estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) from LDSC and i-LDSC, respectively. Panels (C) and (D) show i-LDSC estimates of the phenotypic variation explained by tagged non-additive genetic effects using the cis-interaction LD score (i.e. estimates of ϑ). We assume the total heritability explained by all genetic effects to be (A, C) H2=0.6 and (B, D) H2=0.3. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 6. Performance of LDSC and i-LDSC on simulated polygenic traits with architectures that are determined by only additive and gene-by-ancestry (G×Ancestry) effects with principal components (PCs) included in the GWAS model to correct for additional structure.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). G×Ancestry effects were simulated as the product of individual genotypes and the SNP loadings for each of the first 10 PCs (see the x-axis in each panel). Additive and G×Ancestry effects were set to explain half of the phenotypic variation. The proportion of genotypic variance explained by each PC is shown in green. Note that unlike results depicted in Figure 3—figure supplement 3, there are no cis-interaction effects that affect trait architecture. Here, panels (A) and (B) show estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) from LDSC and i-LDSC, respectively. Panels (C) and (D) show i-LDSC estimates of the phenotypic variation explained by tagged non-additive genetic effects using the cis-interaction LD score (i.e. estimates of ϑ). We assume the total heritability explained by all genetic effects to be (A, C) H2=0.6 and (B, D) H2=0.3. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 7. Performance of LDSC and i-LDSC on simulated polygenic traits with architectures that are determined by only additive and gene-by-ancestry (G×Ancestry) effects without correcting for the additional structure in the GWAS analysis.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). G×Ancestry effects were simulated as the product of individual genotypes and the SNP loadings for each of the first 10 PCs (see the x-axis in each panel). Additive and G×Ancestry effects were set to explain half of the phenotypic variation. The proportion of genotypic variance explained by each PC is shown in green. Note that unlike results depicted in Figure 3—figure supplement 4, there are no cis-interaction effects that affect trait architecture. Here, panels (A) and (B) show estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) from LDSC and i-LDSC, respectively. Panels (C) and (D) show i-LDSC estimates of the phenotypic variation explained by tagged non-additive genetic effects using the cis-interaction LD score (i.e. estimates of ϑ). We assume the total heritability explained by all genetic effects to be (A, C) H2=0.6 and (B, D) H2=0.3. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 8. Performance of LDSC and i-LDSC on simulated traits with sparse architectures that are determined by only additive effects.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. Here, traits were generated with solely additive effects where only variants with the top or bottom {1,5,10,25,50,100} percentile of LD scores were given nonzero coefficients in the generative model (see the x-axis in each panel). Panels (A) and (B) show estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) from LDSC and i-LDSC, respectively. Panels (C) and (D) show i-LDSC estimates of the phenotypic variation explained by tagged non-additive genetic effects using the cis-interaction LD score (i.e. estimates of ϑ). We assume the total heritability explained by all genetic effects to be (A, C) H2=0.6 and (B, D) H2=0.3. Results are based on 100 simulations per parameter combination. The overall takeaway is that breaking the assumed relationship between LD scores and chi-squared test statistics (i.e. that they are generally positively correlated) led to unbounded estimates of heritability for both LDSC and i-LDSC in all but the (polygenic) scenario when 100% of SNPs contributed to phenotypic variation.
Figure 3—figure supplement 9. The non-additive component estimates in i-LDSC are robust to unobserved additive effects in a haplotype.

Synthetic trait architectures are simulated such that a substantial proportion of genetic variance is explained by an additive effect that is not directly observed. The goal of these simulations was to assess how these unobserved effects influence the estimation of the non-additive variance component in the i-LDSC model. In each simulation, we generated haplotypes that each contain 5000 variants. Next, we select either (A, B) a single causal variant with only an additive effect or (C, D) a set of ten causal variants with only additive effects. In each case, the causal variants have a MAF that is randomly selected between: (i) (0.01, 0.1), (ii) (0.1, 0.2), (iii) (0.2, 0.3), (iv) (0.3, 0.4), or (v) (0.4, 0.5) as depicted on the x-axis. The corresponding additive effect size for each causal variant across the haplotypes is simulated to be inversely proportional to its MAF (Schoech et al., 2019). On the y-axis, we measure the difference (Δ) between i-LDSC coefficient estimates when every variant is included in the model versus when the haplotype causal variants are omitted for two different trait architectures with broad-sense heritability set to (A, C) H2=0.6 and (B, D) H2=0.3. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 10. The i-LDSC framework protects against the false discovery of non-additive genetic variance when causal interacting SNPs are unobserved and the proportion of genetic variance explained by additive effects is equal to ρ= 0.5.

Synthetic trait architectures are simulated such that a substantial proportion of genetic variance is explained by pairwise genetic interaction effects that are not directly observed. The goal of these simulations was to assess how these unobserved effects influence the estimation of the non-additive variance component in the i-LDSC model. In each simulation, we generated haplotypes that each contain 5000 variants. Every SNP in the genome had at least a small additive effect. The corresponding additive effect size for each variant across the haplotypes is simulated to be inversely proportional to its MAF (Schoech et al., 2019). We then set (A, C) 1% or (B, D) 5% of causal variants in each haplotype to have non-zero interaction effects. On the y-axis, we measure the difference (Δ) between i-LDSC coefficient estimates when every variant is included in the model versus when the specified percentage of variants with pairwise genetic interaction effects are omitted for two different trait architectures with broad-sense heritability set to (A, B) H2=0.6 and (C, D) H2=0.3. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 11. The i-LDSC framework protects against the false discovery of non-additive genetic variance when causal interacting SNPs are unobserved and the proportion of genetic variance explained by additive effects is equal to ρ= 0.8.

Synthetic trait architectures are simulated such that a substantial proportion of genetic variance is explained by pairwise genetic interaction effects that are not directly observed. The goal of these simulations was to assess how these unobserved effects influence the estimation of the non-additive variance component in the i-LDSC model. In each simulation, we generated haplotypes that each contain 5000 variants. Every SNP in the genome had at least a small additive effect. The corresponding additive effect size for each variant across the haplotypes is simulated to be inversely proportional to its MAF (Schoech et al., 2019). We then set (A, C) 1% or (B, D) 5% of causal variants in each haplotype to have non-zero interaction effects. On the y-axis, we measure the difference (Δ) between i-LDSC coefficient estimates when every variant is included in the model versus when the specified percentage of variants with pairwise genetic interaction effects are omitted for two different trait architectures with broad-sense heritability set to (A, B) H2=0.6 and (C, D) H2=0.3. Results are based on 100 simulations per parameter combination.
Figure 3—figure supplement 12. Bias in LDSC and i-LDSC estimates when the additive and interaction effect sizes in the generative model of complex traits are correlated.

To simulate synthetic trait architectures, we first simulated additive effects for each variant to be MAF-dependent (i.e. α=-1). Here, we set the corresponding interaction effect sizes to have a correlation with the additive effect sizes equal to r={-1,-0.8,-0.6,…,0.6,0.8,1} (labeled across the x-axis). On the y-axis, we measure the bias in the LDSC and i-LDSC estimates of phenotypic variance explained (PVE) by genetic effects. In each simulation, we generate traits with an equal proportion of variance explained by additive and interaction effects and a total broad-sense heritability set to (A) H2=0.6 and (B) H2=0.3. Results are based on 100 simulations for each parameter value.
Figure 3—figure supplement 13. Bias in LDSC and i-LDSC estimates when interaction effect sizes in the generative model of complex traits are a linear or squared function of the additive effects.

To simulate synthetic trait architectures, we first simulated additive effects for each variant to be MAF-dependent (i.e. α=-1). Here, we set the corresponding interaction effect sizes to be either (A, C) a linear function or (B, D) a squared function of the additive effects with a scaling factor q={0.1,0.2,…,0.8,1} (labeled across the x-axis). On the y-axis, we measure the bias in the LDSC and i-LDSC estimates of the phenotypic variance explained (PVE) by genetic effects. In each simulation, we generate traits with an equal proportion of variance explained by additive and interaction effects and a total broad-sense heritability set to (A, B) H2=0.6 and (C, D) H2=0.3. Results are based on 100 simulations for each parameter value.

Next, we perform an additional set of simulations where we explore other common generative models for complex trait architecture that involve non-additive genetic effects. Specifically, we compare heritability estimates from LDSC and i-LDSC in the presence of additive effects, cis-acting interactions, and a third source of genetic variance stemming from either gene-by-environment (G×E) or gene-by-ancestry (G×Ancestry) effects. Details on how these components were generated can be found in Materials and methods. In general, i-LDSC underestimates overall heritability when additive effects and cis-acting interactions are present alongside G×E (Figure 3—figure supplement 2) and/or G×Ancestry effects when PCs are included as covariates (Figure 3—figure supplement 3). Notably, when PCs are not included to correct for residual stratification, both LDSC and i-LDSC can yield unbounded heritability estimates greater than 1 (Figure 3—figure supplement 4). Interestingly, when we omit cis-interactions from the generative model (i.e. the genetic architecture of simulated traits is only made up of additive and G×E or G×Ancestry effects), i-LDSC will still estimate a nonzero genetic variance component with the cis-interaction LD scores (Figure 3—figure supplements 5–7). Collectively, these results empirically show the important point that cis-interaction scores are not enough to recover missing genetic variation for all types of trait architectures; however, they are helpful in recovering phenotypic variation explained by statistical interaction effects. Recall that the linear relationship between (expected) χ2 test statistics and LD scores proposed by the LDSC framework holds when complex traits are generated under the polygenic model where all causal variants have the same expected contribution to phenotypic variation. When cis-interactions affect genetic architecture (e.g. in our earlier simulations in Figure 3), these assumptions are violated in LDSC, but the inclusion of the additional nonlinear scores in i-LDSC helps recover the relationship between the expectation of χ2 test statistics and LD.
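As a rough illustration of the amplification idea used for the G×E component in these simulations (the sample is split in half and second-environment effect sizes are the first-environment effect sizes multiplied by an amplifier w), consider the sketch below. The function name `simulate_amplified_component` and the toy genotype matrix are hypothetical, and the rescaling needed to hit a target share of phenotypic variance is deliberately left out; the authors' exact procedure is described in their Materials and methods.

```python
import numpy as np

def simulate_amplified_component(genotypes, w, rng):
    """Genetic component where the second half of the sample carries
    effect sizes amplified by a factor w relative to the first half."""
    n, p = genotypes.shape
    half = n // 2
    beta_env1 = rng.standard_normal(p)   # first-environment effect sizes
    beta_env2 = w * beta_env1            # amplified second-environment effects
    component = np.empty(n)
    component[:half] = genotypes[:half] @ beta_env1
    component[half:] = genotypes[half:] @ beta_env2
    return component, beta_env1, beta_env2

# Toy genotypes: 200 individuals, 50 SNPs, allele frequency 0.3 (assumed)
rng = np.random.default_rng(2)
genotypes = rng.binomial(2, 0.3, size=(200, 50)).astype(float)
component, beta_env1, beta_env2 = simulate_amplified_component(
    genotypes, w=1.5, rng=rng)
```

With w = 1 the two environments share identical effects and the component reduces to a purely additive term, which is why the amplifier in the simulations starts just above 1.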

As a further demonstration of how i-LDSC performs when assumptions of the original LD score model are violated, we also generated synthetic phenotypes with sparse architectures using the spike-and-slab model (Zhou et al., 2013). Here, traits were simulated with solely additive effects, but this time only variants with the top or bottom {1,5,10,25,50,100} percentile of LD scores were given nonzero effects (see Materials and methods). Breaking the relationship assumed under the LDSC framework between LD scores and chi-squared statistics (i.e. that they are generally positively correlated) led to unbounded estimates of heritability in all but the (polygenic) scenario when 100% of SNPs contributed to the phenotypic variation (Figure 3—figure supplement 8).
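The selection rule for these sparse architectures can be sketched as follows. This is a hedged illustration, not the authors' code: the helper `causal_mask_by_ld_percentile` is a hypothetical name, the percentile cutoff uses NumPy's default linear interpolation, and nonzero effects are simply drawn from a standard Gaussian rather than from the spike-and-slab parameterization used in the paper.

```python
import numpy as np

def causal_mask_by_ld_percentile(ld_scores, percentile, tail="top"):
    """Boolean mask of SNPs in the top or bottom `percentile` of LD scores."""
    ld = np.asarray(ld_scores, dtype=float)
    if tail == "top":
        return ld >= np.percentile(ld, 100 - percentile)
    if tail == "bottom":
        return ld <= np.percentile(ld, percentile)
    raise ValueError("tail must be 'top' or 'bottom'")

# Toy LD scores 1..100 so the selected counts are easy to verify
ld_scores = np.arange(1, 101, dtype=float)
top_mask = causal_mask_by_ld_percentile(ld_scores, 10, tail="top")
bottom_mask = causal_mask_by_ld_percentile(ld_scores, 10, tail="bottom")

# Only selected SNPs receive a nonzero additive effect
rng = np.random.default_rng(3)
beta = np.where(top_mask, rng.standard_normal(ld_scores.size), 0.0)
```

Setting the percentile to 100 selects every SNP, which corresponds to the fully polygenic scenario.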

Finally, we performed a set of polygenic simulations to assess if i-LDSC estimates of non-additive genetic variance could be spuriously inflated due to either (i) unobserved additive effects (see, for example, Hemani et al., 2014), (ii) unobserved SNPs that are involved in genetic interactions, or (iii) nonzero correlation between the additive and interaction effect sizes in the generative model (i.e. breaking the independence assumption in Equation 2). In the first setting, we observed that, across a range of both minor allele frequencies and effect sizes, the omission of causal haplotypes had a negligible effect on the estimated value of the coefficients in i-LDSC (Figure 3—figure supplement 9). We hypothesize this is due to the fact that the simulations were done for polygenic architectures where all SNPs have at least an additive effect. As a result, not observing a small subset of SNPs does not hinder the ability of i-LDSC to estimate genetic variance because the effect size of each SNP is small. If these simulations were conducted for sparse architectures, we would have likely seen a greater impact on i-LDSC, although we have already shown the LD score regression framework to be uncalibrated for traits with sparse genetic architectures (again see Figure 3—figure supplement 8). In the second setting, we observed that the i-LDSC framework protects against the false discovery of non-additive genetic effects and underestimates the variance component ϑ when causal variants involved in pairwise interactions were unobserved (Figure 3—figure supplements 10 and 11). As a direct comparison, estimates of the additive variance component τ in i-LDSC were not affected by the unobserved interacting variants. Lastly, in the third setting, we observed that the mean estimate of the genetic variance in both LDSC and i-LDSC had a slight upward bias as the correlation between additive and interaction effect sizes in the generative model increased; however, the median of these bias estimates was still near zero across all simulated scenarios and their corresponding replicates (Figure 3—figure supplements 12 and 13).
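One standard way to generate interaction effect sizes with a chosen correlation r to the additive effect sizes is to mix the standardized additive effects with independent Gaussian noise. The sketch below assumes this construction purely for illustration; the function name `correlated_effects` is hypothetical and the authors' exact generative procedure may differ.

```python
import numpy as np

def correlated_effects(additive, r, rng):
    """Draw effects whose population correlation with `additive` is r."""
    a = np.asarray(additive, dtype=float)
    a_std = (a - a.mean()) / a.std()         # standardize additive effects
    noise = rng.standard_normal(a.shape)     # independent component
    return r * a_std + np.sqrt(1.0 - r**2) * noise

rng = np.random.default_rng(4)
additive = rng.standard_normal(20000)
interaction = correlated_effects(additive, r=0.6, rng=rng)
sample_r = float(np.corrcoef(additive, interaction)[0, 1])
```

At r = 1 or r = -1 the noise term vanishes and the interaction effects become an exact linear function of the additive effects, the extreme case in which the independence assumption is most strongly broken.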

Application of i-LDSC to the UK Biobank and BioBank Japan

To assess whether pairwise interaction genetic effects are significantly affecting estimates of heritability in empirical biobank data, we applied i-LDSC to 25 continuous quantitative traits from the UK Biobank and BioBank Japan (Supplementary file 3). Protocols for computing GWAS summary statistics for the UK Biobank are described in the Materials and methods, while pre-computed summary statistics for BioBank Japan were downloaded directly from the consortium website (https://pheweb.jp/downloads). We release the cis-acting SNP-by-SNP interaction LD scores used in our analyses on the i-LDSC GitHub repository. These scores were computed from two reference groups in the 1000 Genomes: 489 individuals from the European superpopulation (EUR) and 504 individuals from the East Asian (EAS) superpopulation (see also Supplementary files 4 and 5).

In each of the 25 traits we analyzed in the UK Biobank, we detected significant proportions of estimated genetic variation stemming from tagged pairwise cis-interactions (Table 1). This includes many canonical traits of interest in heritability analyses: height, cholesterol levels, urate levels, and both systolic and diastolic blood pressure. Our findings in Table 1 are supported by multiple published studies identifying evidence of non-additive effects playing a role in the architectures of different traits of interest. For example, Li et al. (2020) found evidence for genetic interactions that contributed to the pathogenesis of coronary artery disease. It was also recently shown that non-additive genetic effects play a significant role in body mass index (Song et al., 2022). Generally, we find that the traditional LDSC produces lower estimates of trait heritability because it does not consider the additional sources of genetic signal that i-LDSC does (Table 1). In BioBank Japan, 23 of the 25 traits analyzed had a significant nonlinear component detected by i-LDSC, with HDL and triglyceride levels being the only exceptions.

Table 1. i-LDSC heritability estimates and p-values highlighting statistically significant contributions of tagged pairwise genetic interaction effects for 25 traits in the UK Biobank and BioBank Japan.

Here, LDSC heritability estimates are included as a baseline. The difference between the approaches is that the i-LDSC heritability estimates include proportions of phenotypic variation that are explained by tagged non-additive variation (see columns with estimates of ϑ). Note that all 25 traits analyzed in the UK Biobank and 23 of the 25 traits analyzed in BioBank Japan have a statistically significant amount of tagged non-additive genetic effects as detected by the cis-interaction LD score (p < 0.05). The two traits without significant tagged non-additive genetic effects in BioBank Japan were HDL (p = 0.081) and Triglyceride (p = 0.110). These traits are indicated by *. The i-LDSC p-values are related to the estimates of the ϑ coefficients which are also displayed in Figure 4.

Trait | UKB (LDSC) | UKB (i-LDSC) | UKB ϑ̂ | UKB p-value | BBJ (LDSC) | BBJ (i-LDSC) | BBJ ϑ̂ | BBJ p-value
Basophil | 0.0250 | 0.0315 | 0.0065 | 1.572 × 10^-12 | 0.0684 | 0.1548 | 0.0864 | 0.025
BMI | 0.1757 | 0.2349 | 0.0592 | 3.083 × 10^-84 | 0.1667 | 0.2656 | 0.0989 | 2.438 × 10^-18
Cholesterol | 0.0954 | 0.0974 | 0.0020 | 1.821 × 10^-16 | 0.0629 | 0.1268 | 0.0639 | 2.740 × 10^-4
CRP | 0.0354 | 0.0414 | 0.0060 | 9.845 × 10^-12 | 0.0202 | 0.1625 | 0.1423 | 0.020
DBP | 0.0940 | 0.1203 | 0.0263 | 1.118 × 10^-65 | 0.0605 | 0.1267 | 0.0662 | 1.675 × 10^-7
EGFR | 0.1521 | 0.1999 | 0.0478 | 1.187 × 10^-46 | 0.1010 | 0.1225 | 0.0215 | 4.232 × 10^-5
Eosinophil | 0.1055 | 0.1375 | 0.0320 | 1.230 × 10^-18 | 0.0785 | 0.1973 | 0.1188 | 0.001
HBA1C | 0.0906 | 0.1083 | 0.0177 | 1.578 × 10^-26 | 0.1057 | 0.1308 | 0.0251 | 0.031
HDL* | 0.1599 | 0.1768 | 0.0169 | 9.636 × 10^-37 | 0.1590 | 0.1838 | 0.0248 | 0.081
Height | 0.3675 | 0.4815 | 0.1140 | 1.038 × 10^-64 | 0.3941 | 0.7336 | 0.3395 | 7.433 × 10^-33
Hematocrit | 0.1078 | 0.1352 | 0.0274 | 2.479 × 10^-25 | 0.0752 | 0.0928 | 0.0176 | 3.689 × 10^-5
Hemoglobin | 0.1177 | 0.1433 | 0.0256 | 4.284 × 10^-27 | 0.0702 | 0.0752 | 0.0050 | 9.037 × 10^-4
LDL | 0.0802 | 0.0859 | 0.0057 | 5.087 × 10^-13 | 0.0745 | 0.1438 | 0.0693 | 0.018
Lymphocyte | 0.0402 | 0.0501 | 0.0099 | 4.906 × 10^-19 | 0.0844 | 0.1757 | 0.0913 | 5.479 × 10^-5
MCH | 0.1361 | 0.1597 | 0.0236 | 1.785 × 10^-25 | 0.1536 | 0.2831 | 0.1295 | 1.042 × 10^-5
MCHC | 0.0317 | 0.0364 | 0.0047 | 3.730 × 10^-12 | 0.0571 | 0.0650 | 0.0079 | 0.027
MCV | 0.1630 | 0.1902 | 0.0272 | 1.180 × 10^-29 | 0.1530 | 0.2818 | 0.1288 | 1.042 × 10^-5
Monocyte | 0.0788 | 0.0955 | 0.0167 | 5.257 × 10^-18 | 0.0888 | 0.1549 | 0.0661 | 0.004
Neutrophil | 0.1102 | 0.1391 | 0.0289 | 1.777 × 10^-33 | 0.1191 | 0.2114 | 0.0923 | 5.050 × 10^-5
Platelet | 0.1992 | 0.2447 | 0.0455 | 2.303 × 10^-37 | 0.1565 | 0.2436 | 0.0871 | 7.724 × 10^-9
RBC | 0.1574 | 0.1933 | 0.0359 | 3.292 × 10^-31 | 0.1203 | 0.2068 | 0.0865 | 5.972 × 10^-8
SBP | 0.0954 | 0.1201 | 0.0247 | 8.660 × 10^-75 | 0.0769 | 0.1604 | 0.0835 | 9.075 × 10^-10
Triglycerides* | 0.1061 | 0.1204 | 0.0143 | 1.410 × 10^-26 | 0.1171 | 0.2670 | 0.1499 | 0.110
Urate | 0.1217 | 0.1550 | 0.0333 | 9.642 × 10^-38 | 0.1395 | 0.3462 | 0.2067 | 0.015
WBC | 0.0962 | 0.1250 | 0.0288 | 9.866 × 10^-34 | 0.1024 | 0.2266 | 0.1242 | 1.346 × 10^-8
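
As a reading aid for Table 1, the tabulated values suggest that each i-LDSC estimate equals the LDSC baseline plus the estimated ϑ component, to four decimal places. The short Python sketch below checks this arithmetic for a handful of rows transcribed from the table. It is only an observation about the reported numbers, not a description of how the i-LDSC software computes them, and the names `table1_rows` and `max_decomposition_gap` are hypothetical.

```python
# Selected rows transcribed from Table 1:
# (LDSC estimate, estimated theta component, i-LDSC estimate).
table1_rows = {
    ("BMI", "UKB"): (0.1757, 0.0592, 0.2349),
    ("BMI", "BBJ"): (0.1667, 0.0989, 0.2656),
    ("Height", "UKB"): (0.3675, 0.1140, 0.4815),
    ("Height", "BBJ"): (0.3941, 0.3395, 0.7336),
    ("Urate", "UKB"): (0.1217, 0.0333, 0.1550),
    ("Urate", "BBJ"): (0.1395, 0.2067, 0.3462),
}


def max_decomposition_gap(rows):
    """Largest |LDSC + theta_hat - i-LDSC| across the supplied rows."""
    return max(abs(ldsc + theta - ildsc) for ldsc, theta, ildsc in rows.values())


gap = max_decomposition_gap(table1_rows)
print(f"largest gap across the selected rows: {gap:.4f}")
```

A gap below the rounding precision of the table (0.0005) is consistent with reading the ϑ columns as the increase of the i-LDSC estimate over the LDSC baseline.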

For each of the 25 traits that we analyzed, we found that the i-LDSC heritability estimates are significantly correlated with corresponding estimates from LDSC in both the UK Biobank ($r^2=0.988$, $P=5.936\times10^{-24}$) and BioBank Japan ($r^2=0.849$, $P=6.061\times10^{-11}$) as shown in Figure 4A. Additionally, we found that the heritability estimates for the same traits between the two biobanks are highly correlated according to both LDSC ($r^2=0.848$, $P=7.166\times10^{-11}$) and i-LDSC ($r^2=0.666$, $P=6.551\times10^{-7}$) analyses as shown in Figure 4B. After comparing the i-LDSC heritability estimates to LDSC, we then assessed whether there was a significant difference in the amount of phenotypic variation explained by the non-additive genetic effect component in the GWAS summary statistics derived from the UK Biobank and BioBank Japan (i.e. comparing the estimates of ϑ; see Figure 4—figure supplement 1A). We show that, while heterogeneous between traits, the phenotypic variation explained by genetic interactions is of relatively similar magnitude in both biobanks ($r^2=0.372$, $P=0.0119$). Notably, the trait with the most significant evidence of tagged cis-interaction effects in GWAS summary statistics is height, which is known to have a highly polygenic architecture.

Figure 4. The i-LDSC framework recovers heritability and provides estimates of tagged cis-interactions in GWAS summary statistics (ϑ) for 25 quantitative traits in the UK Biobank and BioBank Japan.

(A) In both the UK Biobank (green) and BioBank Japan (purple), estimates of phenotypic variance explained (PVE) by genetic effects from i-LDSC and LDSC are highly correlated for 25 different complex traits. The Spearman correlation coefficients between heritability estimates from LDSC and i-LDSC for the UK Biobank and BioBank Japan are $r^2=0.989$ and $r^2=0.850$, respectively. The y=x dotted line represents the values at which estimates from both approaches are the same. (B) PVE estimates from the UK Biobank are better correlated with those from the BioBank Japan across 25 traits using LDSC (Spearman $r^2=0.848$) than i-LDSC (Spearman $r^2=0.666$). (C) Both the original and stratified LDSC models recover the same amount of PVE when the cis-interaction LD score is included as an additional component in the UK Biobank analysis (Spearman $r^2=0.989$). These models are listed as i-LDSC and s+i-LDSC, respectively. For s+i-LDSC, we included 97 functional annotations from Gazal et al. to estimate heritability. (D) Estimates of non-additive variance components in i-LDSC versus s+i-LDSC (Spearman $r^2=0.184$). While not statistically significant in the stratified analysis with the additional annotations, the non-additive component still makes nonzero contributions to the PVE estimation for all 25 traits in the UK Biobank (see Tables 1 and 2).


Figure 4—figure supplement 1. Additional results from applying LDSC and i-LDSC for 25 quantitative traits in the UK Biobank and BioBank Japan.


(A) i-LDSC estimates of the phenotypic variation explained by tagged non-additive genetic effects using the cis-interaction LD score (i.e., estimates of ϑ) between traits in the UK Biobank and BioBank Japan (Spearman $r^2=0.372$). (B) Estimates of i-LDSC and LDSC intercept terms for 25 traits analyzed in the UK Biobank and BioBank Japan. Intercept terms using LDSC and i-LDSC are highly correlated in both the UK Biobank (Spearman $r^2=0.888$) and BioBank Japan (Spearman $r^2=0.813$). The x=y dotted line represents points for when the two sets of estimates are equal.

The intercepts estimated by LDSC and i-LDSC are also highly correlated in both the UK Biobank and the BioBank Japan (Figure 4—figure supplement 1B). Recall that these intercept estimates represent the confounding factor due to uncontrolled effects. For LDSC, this does include phenotypic variation that is due to unaccounted for pairwise statistical genetic interactions. The i-LDSC intercept estimates tend to be correlated with, but are generally different from, those computed with LDSC, empirically indicating that non-additive genetic variation is partitioned away and is missed when using the standard LD score alone. This result shows similar patterns in both the UK Biobank ($r^2=0.888$, $P=1.962\times10^{-12}$) and BioBank Japan ($r^2=0.813$, $P=7.814\times10^{-10}$).

Lastly, we performed an additional analysis in the UK Biobank where the cis-interaction scores are included as an annotation alongside 97 other functional categories in the stratified-LD score regression framework and its software s-LDSC (Gazal et al., 2017; Materials and methods). Here, s-LDSC heritability estimates still showed an increase with the interaction scores versus when the publicly available functional categories were analyzed alone, albeit at a much smaller magnitude (Table 2). The contributions from the pairwise interaction component to the overall estimate of genetic variance ranged from 0.005 for MCHC (P=0.373) to 0.055 for HDL (P=0.575; Figure 4C and D). Furthermore, in this analysis, the estimates of the non-additive components were no longer statistically significant for any of the traits in the UK Biobank (Table 2). Despite this, these results highlight the ability of the i-LDSC framework to identify sources of ‘missing’ phenotypic variance explained in heritability estimation. Importantly, moving forward, we suggest using the cis-interaction scores with additional annotations whenever they are available, as this provides more conservative estimates of the role of non-additive effects on trait architecture.

Table 2. Comparison of s-LDSC and i-LDSC estimates of phenotypic variance explained (PVE) by genetic effects for 25 complex traits in the UK Biobank.

Here, we use stratified LD score regression (s-LDSC) to partition heritability across different genomic elements (Finucane et al., 2015). We used 97 functional annotations from Gazal et al. to estimate heritability in 25 traits. We then appended cis-interaction LD scores as an additional annotation to obtain heritability estimates (this method is referred to as s+i-LDSC in the table). p-values for the s+i-LDSC model detailing the contributions of tagged non-additive genetic effects for 25 traits are provided in the last column. Note that, while not statistically significant in this stratified analysis with the additional annotations, the non-additive component still makes nonzero contributions to the PVE estimation for all 25 traits.

Trait | UKB PVE (s-LDSC) | UKB PVE (s+i-LDSC) | s+i-LDSC p-value
Basophil | 0.0363 | 0.0375 | 0.4728
BMI | 0.2100 | 0.2482 | 0.8126
Cholesterol | 0.1042 | 0.1358 | 0.6202
CRP | 0.0452 | 0.0524 | 0.6483
DBP | 0.1228 | 0.1441 | 0.6125
EGFR | 0.1826 | 0.2105 | 0.8507
Eosinophil | 0.1403 | 0.1578 | 0.1867
HBA1C | 0.1040 | 0.1275 | 0.6917
HDL | 0.1820 | 0.2373 | 0.5754
Height | 0.4315 | 0.4726 | 0.5224
Hematocrit | 0.1416 | 0.1646 | 0.3956
Hemoglobin | 0.1504 | 0.1795 | 0.2299
LDL | 0.0858 | 0.1131 | 0.8812
Lymphocyte | 0.0545 | 0.0651 | 0.1453
MCH | 0.1497 | 0.1545 | 0.0968
MCHC | 0.0450 | 0.0496 | 0.3728
MCV | 0.1814 | 0.1930 | 0.1530
Monocyte | 0.1085 | 0.1431 | 0.5421
Neutrophil | 0.1320 | 0.1599 | 0.2499
Platelet | 0.2317 | 0.2628 | 0.7371
RBC | 0.1933 | 0.2223 | 0.3197
SBP | 0.1206 | 0.1419 | 0.1100
Triglycerides | 0.1335 | 0.1621 | 0.5301
Urate | 0.1530 | 0.1736 | 0.1177
WBC | 0.1221 | 0.1482 | 0.5155

Discussion

In this paper, we present i-LDSC, an extension of the LD score regression framework which aims to recover missing heritability from GWAS summary statistics by incorporating an additional score that measures the non-additive genetic variation that is tagged by genotyped SNPs. Here, we demonstrate how i-LDSC builds upon the original LDSC model through the development of new ‘cis-interaction’ LD scores which help to investigate signals of cis-acting SNP-by-SNP interactions (Figure 1 and Figure 1—figure supplements 1–5). Through extensive simulations, we show that i-LDSC is well-calibrated under the null model when polygenic traits are generated only by additive effects (Figure 2 and Figure 2—figure supplements 1–2), we highlight that i-LDSC provides greater heritability estimates over LDSC when traits are indeed generated with cis-acting SNP-by-SNP interaction effects (Figure 3 and Figure 3—figure supplement 1, and Supplementary files 1 and 2), and we tested the robustness of i-LDSC on phenotypes where assumptions of the original LD score model are violated (Figure 3—figure supplements 2–13). Finally, in real data, we show examples of many traits with estimated GWAS summary statistics that tag cis-interaction effects in the UK Biobank and BioBank Japan (Figure 4 and Figure 4—figure supplement 1, Tables 1 and 2, and Supplementary files 3-5). We have made i-LDSC a publicly available command line tool that requires minimal updates to the computing environment used to run the original implementation of LD score regression. In addition, we provide pre-computed cis-interaction LD scores calculated from the European (EUR) and East Asian (EAS) reference populations in the 1000 Genomes phase 3 data (see Data and Software Availability under Materials and Methods).

The current implementation of the i-LDSC framework offers many directions for future development and applications. First, an area of future work would be to explore how the relationship between cis-interaction LD scores and interaction effect sizes from the generative model of complex traits might bias heritability estimates provided by i-LDSC (e.g., similar to the relationship we explored between the standard LD scores and linear effect sizes in Figure 3—figure supplement 8). Second, as we showed with our simulation studies (Figure 3—figure supplements 2–8), the cis-interaction LD scores that we propose are not always enough to recover explainable non-additive genetic effects for all types of trait architectures. While we focus on pairwise cis-acting SNP-by-SNP statistical interactions in this work, the theoretical concepts underlying i-LDSC can easily be adapted to other types of interactions as well. Third, in our analysis of the UK Biobank and BioBank Japan, we showed that the inclusion of additional categories via frameworks such as stratified LD score regression (Finucane et al., 2015) can be used to provide more refined heritability estimates from GWAS summary statistics while accounting for linkage (see results in Table 1 versus Table 2). A key part of our future work is to continue to explore whether considering functional annotation groups would also improve our ability to identify tagged non-additive genetic effects. Lastly, we have only focused on analyzing one phenotype at a time in this study. However, many previous studies have extensively shown that modeling multiple phenotypes can often dramatically increase power (Runcie et al., 2020; Stamp et al., 2022). Therefore, it would be interesting to extend the i-LDSC framework to multiple traits to study nonlinear genetic correlations in the same way that LDSC was recently extended to uncover additive genetic correlation maps across traits (Naqvi et al., 2021).

Materials and methods

Generative statistical model for complex traits

Our goal in this study is to reanalyze summary statistics from genome-wide association studies (GWAS) and estimate heritability while accounting for both additive genetic associations and tagged interaction effects. We begin by assuming the following generative linear model for complex traits which can be seen as an extended view of Equation 1 in the main text

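$$\mathbf{y} = b_0 + \mathbf{X}\boldsymbol{\beta} + \mathbf{X}_D\boldsymbol{\omega} + \mathbf{W}\boldsymbol{\theta} + \boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, (1-H^2)\mathbf{I}), \tag{7}$$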

where $\mathbf{y}$ denotes an N-dimensional vector of phenotypic states for a quantitative trait of interest measured in N individuals; $b_0$ is an intercept term; $\mathbf{X}$ is an $N\times J$ matrix of genotypes, with J denoting the number of single nucleotide polymorphisms (SNPs) encoded as {0,1,2} copies of a reference allele at each locus; $\boldsymbol{\beta}=(\beta_1,\ldots,\beta_J)$ is a J-dimensional vector containing the true additive effect sizes for an additional copy of the reference allele at each locus on $\mathbf{y}$; $\mathbf{X}_D$ is an $N\times J$ matrix that represents the dominance for each genotype encoded as {0,1,1} with corresponding effect sizes $\boldsymbol{\omega}$; $\mathbf{W}$ is an $N\times M$ matrix of genetic interactions; $\boldsymbol{\theta}=(\theta_1,\ldots,\theta_M)$ is an M-dimensional vector containing the interaction effect sizes; $\boldsymbol{\varepsilon}$ is a normally distributed error term with mean zero and variance scaled according to the proportion of phenotypic variation not explained by the broad-sense heritability of the trait, denoted by $H^2$; and $\mathbf{I}$ denotes an $N\times N$ identity matrix. Note that the encoding for dominance in $\mathbf{X}_D$ was chosen because it imposes orthogonality with the genotype encoding in $\mathbf{X}$ (Purcell et al., 2007; Vitezica et al., 2017; Palmer et al., 2023).

For convenience, we will assume that the genotype matrix (column-wise), the dominance matrix (also column-wise), and trait of interest have all been standardized (Strandén and Christensen, 2011; de Los Campos et al., 2013; Zhou et al., 2013). Furthermore, while the matrix $\mathbf{W}$ could encode any source of non-additive genetic interactions (e.g. gene-by-environmental effects) in theory, we limit our focus in this study to trait architectures that have been generated with contributions stemming from cis-acting statistical SNP-by-SNP (or pairwise) interactions. To that end, we assume that the columns of $\mathbf{W}$ are the Hadamard (element-wise) product between genotypic vectors of the form $\mathbf{x}_j\circ\mathbf{x}_k$ for the j-th and k-th variants. We also want to point out that the generative formulation of Equation 7 can easily be extended to accommodate other fixed effects (e.g. age, sex, or genotype principal components), as well as other random effects terms that can be used to account for sample non-independence due to other environmental factors.

As a final set of assumptions, we will let the intercept term b0 be a fixed parameter while allowing the other coefficients to follow independent Gaussian distributions with variances proportional to their individual contributions to the trait heritability (Yang et al., 2010; Wu et al., 2011; Zhou et al., 2013; Jiang and Reif, 2015; Crawford et al., 2017)

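$$\beta_j \sim \mathcal{N}(0, \varphi_\beta^2/J), \qquad \omega_j \sim \mathcal{N}(0, \varphi_\omega^2/J), \qquad \theta_m \sim \mathcal{N}(0, \varphi_\theta^2/M), \tag{8}$$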

for $j=1,\ldots,J$ and $m=1,\ldots,M$. The broad-sense heritability of the trait is defined as $H^2=\varphi_\beta^2+\varphi_\omega^2+\varphi_\theta^2$. Under the generative model in Equation 7, we then say that $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}]=\varphi_\beta^2$ is the proportion of phenotypic variation contributed by additive SNP effects, $\mathbb{V}[\mathbf{X}_D\boldsymbol{\omega}]=\varphi_\omega^2$ is the proportion of phenotypic variation contributed by dominance effects, and the set of interactions involving some subset of causal SNPs contribute the remaining proportion to the heritability $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}]=\varphi_\theta^2$. As we mentioned in the main text, we recognize that the appropriateness of treating genetic effects as random variables in analytical derivations has been questioned (de Los Campos et al., 2015), but our simulation studies show that i-LDSC accurately recovers non-additive genetic variance in Equation 7 under a broad range of conditions.

Orthogonality between additive and non-additive genetic effects

Assuming that the effect sizes {𝜷,𝝎,𝜽} in Equation 8 follow independent and zero mean Gaussian distributions leads to orthogonality between the additive and non-additive components in Equation 7. Since the genotypes 𝐗 and the dominance values 𝐗D are fixed orthogonal matrices, it is straightforward to show that Cov[𝐗𝜷,𝐗D𝝎]=0 (Vitezica et al., 2017; Palmer et al., 2023). The same relationship can be shown for the additive and the pairwise interaction genetic effects where

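$$\begin{aligned}
\mathrm{Cov}[\mathbf{X}\boldsymbol{\beta},\mathbf{W}\boldsymbol{\theta}] &= \mathbb{E}[\boldsymbol{\beta}^\top\mathbf{X}^\top\mathbf{W}\boldsymbol{\theta}] - \mathbb{E}[\boldsymbol{\beta}^\top\mathbf{X}^\top]\,\mathbb{E}[\mathbf{W}\boldsymbol{\theta}] \\
&= \mathbb{E}\Big[\sum_{r}\sum_{s}\beta_r(\mathbf{X}^\top\mathbf{W})_{rs}\theta_s\Big] - \mathbb{E}[\boldsymbol{\beta}]^\top\mathbf{X}^\top\mathbf{W}\,\mathbb{E}[\boldsymbol{\theta}] \\
&= \sum_{r}\sum_{s}(\mathbf{X}^\top\mathbf{W})_{rs}\,\mathbb{E}[\beta_r\theta_s] - \mathbf{0}^\top\mathbf{X}^\top\mathbf{W}\,\mathbf{0} \\
&= \sum_{r}\sum_{s}(\mathbf{X}^\top\mathbf{W})_{rs}\,\mathbb{E}[\beta_r]\,\mathbb{E}[\theta_s] = 0
\end{aligned} \tag{9}$$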

with 𝐱j and 𝐰m denoting the j-th and m-th column of the individual-level genotype matrix 𝐗 and the interaction matrix 𝐖, respectively. Note that a similar derivation to Equation 9 can also be done for the dominance and pairwise genetic interaction effects. This concept of orthogonality is important because we want to preserve a unique partitioning of genetic variance when modeling a trait of interest.

Genotypes and their interactions are correlated despite being linearly independent

The design matrices $\mathbf{X}$ and $\mathbf{W}$ in Equation 7 are not linearly dependent because the pairwise interactions between two SNPs are encoded as the Hadamard product of two genotypic vectors in the form $\mathbf{x}_j\circ\mathbf{x}_k$ (which is a nonlinear function). Linear dependence would have implied that one could find a transformation between a SNP and an interaction term in the form $\mathbf{w}_m=c\times\mathbf{x}_j$ for some constant c. However, despite their linear independence, $\mathbf{X}$ and $\mathbf{W}$ are themselves not orthogonal and still have a nonzero correlation. This implies that the inner product between genotypes and their interactions is nonzero, $\mathbf{X}^\top\mathbf{W}\neq\mathbf{0}$. To see this, we focus on a focal SNP $\mathbf{x}_j$ and consider three different types of interactions:

  • Scenario I: Interaction of a focal SNP with itself ($\mathbf{x}_j\circ\mathbf{x}_j$).

  • Scenario II: Interaction of a focal SNP with a different SNP ($\mathbf{x}_j\circ\mathbf{x}_k$).

  • Scenario III: Interaction between a pair of SNPs that both differ from the focal SNP ($\mathbf{x}_k\circ\mathbf{x}_l$).

The following derivations rely on the fact that: (1) we assume that genotypes have been mean-centered and scaled to have unit variance, and (2) under Hardy-Weinberg equilibrium, SNPs marginally follow a binomial distribution $\mathbf{x}_j\sim\mathrm{Bin}(2,p)$ where p represents the minor allele frequency (MAF) (Wray et al., 2007; Lippert et al., 2013).

Scenario I

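The covariance between a focal SNP and an interaction with itself is $\mathrm{Cov}[\mathbf{x}_j,\mathbf{x}_j\circ\mathbf{x}_j]=\mathbb{E}[\mathbf{x}_j^3]-\mathbb{E}[\mathbf{x}_j]\,\mathbb{E}[\mathbf{x}_j^2]$. With mean-centered SNPs, this is proportional to $\mathbb{E}[\mathbf{x}_j^3]=(q-p)/\sqrt{2pq}$, which is the skewness of the binomial distribution where, again, p = MAF and q = 1-MAF of the j-th SNP.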
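
To make Scenario I concrete, the following minimal Python sketch enumerates the three genotype classes of a Bin(2, p) SNP, standardizes them, and compares the exact third moment with the closed-form binomial skewness $(q-p)/\sqrt{2pq}$. The function names are illustrative and are not part of the i-LDSC software.

```python
from math import comb, sqrt


def standardized_third_moment(p):
    """Exact E[z^3] for z = (x - 2p) / sqrt(2pq), where x ~ Bin(2, p)."""
    q = 1.0 - p
    mean, sd = 2.0 * p, sqrt(2.0 * p * q)
    # Binomial probabilities for genotypes 0, 1, and 2.
    pmf = {g: comb(2, g) * p**g * q ** (2 - g) for g in (0, 1, 2)}
    return sum(prob * ((g - mean) / sd) ** 3 for g, prob in pmf.items())


def binomial_skewness(p):
    """Closed-form skewness of Bin(2, p): (q - p) / sqrt(2pq)."""
    q = 1.0 - p
    return (q - p) / sqrt(2.0 * p * q)


for maf in (0.05, 0.2, 0.5):
    print(maf, standardized_third_moment(maf), binomial_skewness(maf))
```

The two quantities agree for every allele frequency, and both vanish at p = 0.5, where the genotype distribution is symmetric. Rarer variants are therefore more strongly correlated with their own squared term.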

Scenario II

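Assume that we have two SNPs, $\mathbf{x}_j\sim\mathrm{Bin}(2,p_j)$ and $\mathbf{x}_k\sim\mathrm{Bin}(2,p_k)$, where $p_j$ and $p_k$ represent their respective minor allele frequencies. We want to compute the correlation between $\mathbf{x}_j$ and the interaction $\mathbf{x}_j\circ\mathbf{x}_k$, where $\mathrm{Cov}[\mathbf{x}_j,\mathbf{x}_j\circ\mathbf{x}_k]=\mathbb{E}[\mathbf{x}_j^2\mathbf{x}_k]-\mathbb{E}[\mathbf{x}_j]\,\mathbb{E}[\mathbf{x}_j\mathbf{x}_k]$. Again, with the mean-centered assumption, the covariance is proportional to the expectation $\mathbb{E}[\mathbf{x}_j^2\mathbf{x}_k]$. Here, with SNPs taking on values {0,1,2}, the joint distribution between $\mathbf{x}_j^2$ and $\mathbf{x}_k$ can be written out as follows (Kang and Jung, 2001):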

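 | $\mathbf{x}_j^2=0$ | $\mathbf{x}_j^2=1$ | $\mathbf{x}_j^2=4$
$\mathbf{x}_k=0$ | $u_{jk}^2$ | $2u_{jk}(1-p_k-u_{jk})$ | $(1-p_k-u_{jk})^2$
$\mathbf{x}_k=1$ | $2u_{jk}(1-p_j-u_{jk})$ | $2u_{jk}(u_{jk}+p_j+p_k-1)+2(1-p_j-u_{jk})(1-p_k-u_{jk})$ | $2(u_{jk}+p_j+p_k-1)(1-p_k-u_{jk})$
$\mathbf{x}_k=2$ | $(1-p_j-u_{jk})^2$ | $2(u_{jk}+p_j+p_k-1)(1-p_j-u_{jk})$ | $(u_{jk}+p_j+p_k-1)^2$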

where $u_{jk}=(1-p_j)(1-p_k)+r_{jk}\sqrt{p_jp_k(1-p_j)(1-p_k)}$ and $r_{jk}$ is the Pearson correlation or linkage disequilibrium (LD) between the j-th and k-th SNPs.
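
The table can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it treats $u_{jk}$ as the frequency of the haplotype carrying neither counted allele, derives the remaining three haplotype frequencies from the allele frequencies, and forms genotypes by pairing two independent haplotypes (random mating). The helper names and the example values of $p_j$, $p_k$, and $r_{jk}$ are hypothetical.

```python
from itertools import product
from math import sqrt


def haplotype_freqs(pj, pk, r):
    """Two-locus haplotype frequencies keyed by (allele at j, allele at k)."""
    u = (1 - pj) * (1 - pk) + r * sqrt(pj * pk * (1 - pj) * (1 - pk))
    return {(0, 0): u, (0, 1): 1 - pj - u, (1, 0): 1 - pk - u, (1, 1): u + pj + pk - 1}


def genotype_table(pj, pk, r):
    """Joint distribution of (x_j, x_k) from two independently drawn haplotypes."""
    h = haplotype_freqs(pj, pk, r)
    table = {}
    for (a1, b1), (a2, b2) in product(h, repeat=2):
        key = (a1 + a2, b1 + b2)
        table[key] = table.get(key, 0.0) + h[(a1, b1)] * h[(a2, b2)]
    return table


def correlation(table):
    """Pearson correlation between x_j and x_k implied by the joint table."""
    ej = sum(p * xj for (xj, _), p in table.items())
    ek = sum(p * xk for (_, xk), p in table.items())
    cov = sum(p * (xj - ej) * (xk - ek) for (xj, xk), p in table.items())
    vj = sum(p * (xj - ej) ** 2 for (xj, _), p in table.items())
    vk = sum(p * (xk - ek) ** 2 for (_, xk), p in table.items())
    return cov / sqrt(vj * vk)


pj, pk, r = 0.3, 0.2, 0.4  # illustrative allele frequencies and LD
table = genotype_table(pj, pk, r)
u = haplotype_freqs(pj, pk, r)[(0, 0)]
# Closed-form entry for x_j = 1 (so x_j^2 = 1) and x_k = 1, as in the table.
middle_cell = 2 * u * (u + pj + pk - 1) + 2 * (1 - pj - u) * (1 - pk - u)
# Raw-scale (0/1/2 coding) expectation of x_j^2 * x_k.
expected_xj2_xk = sum(p * xj**2 * xk for (xj, xk), p in table.items())
print(sum(table.values()), correlation(table), expected_xj2_xk)
```

Under these assumptions the enumerated probabilities sum to one, reproduce the closed-form cells, and imply a genotype correlation equal to the supplied $r_{jk}$.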

Scenario III

The covariance between a focal SNP and an interaction with a pair of different SNPs, $\mathrm{Cov}[\mathbf{x}_j,\mathbf{x}_k\circ\mathbf{x}_l]$, will be nonzero if the j-th SNP is correlated with either variant (i.e., $r_{jk}\neq0$ or $r_{jl}\neq0$).

Traditional estimation of additive GWAS summary statistics

As previously mentioned, the key to this work is that SNP-level GWAS summary statistics can also tag non-additive genetic effects when there is a nonzero correlation between individual-level genotypes and their interactions (as defined in Equation 7). Throughout the rest of this section, we will use $\mathbf{X}^\top\mathbf{X}/N$ to denote the LD or pairwise correlation matrix between SNPs. We will then let $\mathbf{R}$ represent an LD matrix empirically estimated from external data (e.g. directly from GWAS study data, or using a pairwise LD map from a population that is representative of the samples analyzed in the GWAS study). The important property here is the following

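$$\mathbb{E}[\mathbf{X}^\top\mathbf{X}]\approx N\mathbf{R}, \qquad \mathbb{E}[\mathbf{x}_j^\top\mathbf{x}_j]\approx N, \qquad \mathbb{E}[\mathbf{x}_j^\top\mathbf{x}_k]\approx Nr_{jk} \tag{10}$$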

where the term rjk is again defined as the Pearson correlation coefficient between the j-th and k-th SNPs, respectively.

In traditional GWAS, summary statistics of the true additive effects $\boldsymbol{\beta}=(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{y}$ in Equation 7 are typically derived by computing a marginal least squares estimate with the observed data

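$$\hat{\beta}_j=(\mathbf{x}_j^\top\mathbf{x}_j)^{-1}\mathbf{x}_j^\top\mathbf{y} \quad\Longleftrightarrow\quad \hat{\boldsymbol{\beta}}=\mathrm{diag}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{y}. \tag{11}$$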

There are two key identities that may be taken from Equation 11. The first uses Equation 10 and is the approximate relationship (in expectation) between the moment matrix $\mathbf{X}^\top\mathbf{y}$ and the linear effect size estimates $\hat{\boldsymbol{\beta}}$:

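$$\mathbb{E}[\mathbf{X}^\top\mathbf{y}]=\mathbb{E}[\mathrm{diag}(\mathbf{X}^\top\mathbf{X})\hat{\boldsymbol{\beta}}]\approx N\hat{\boldsymbol{\beta}}. \tag{12}$$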

The second key point combines Equations 10 and 12 to describe the asymptotic relationship between the observed marginal GWAS summary statistics 𝜷^ and the joint coefficient values 𝜷 where (in expectation)

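$$\mathbb{E}[\boldsymbol{\beta}]=\mathbb{E}[(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{y}]\approx(N\mathbf{R})^{-1}N\hat{\boldsymbol{\beta}}=\mathbf{R}^{-1}\hat{\boldsymbol{\beta}}. \tag{13}$$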

After some algebra, the above mirrors a high-dimensional regression model (in expectation) where 𝜷^=𝐑𝜷 with the estimated summary statistics as the response variables and the empirically estimated LD matrix acting as the design matrix (Hormozdiari et al., 2014; Hormozdiari et al., 2016; Zhang et al., 2018; Cheng et al., 2020; Demetci et al., 2021). Theoretically, the resulting coefficients output from this high-dimensional model are the desired true effect size estimates used to generate the phenotype of interest.

Additive GWAS summary statistics with tagged interaction effects

When interactions contribute to the architecture of complex traits (i.e. $\boldsymbol{\theta}\neq\mathbf{0}$), the marginal GWAS summary statistics derived using least squares in Equation 11 will also explain non-additive variation when there is a nonzero correlation between genotypes and their interactions. To see this, we use the concept of ‘omitted variable bias’ (Barreto and Howland, 2005) where the fitted model aims to estimate the true additive coefficients $\boldsymbol{\beta}$ but does not account for contributions from the non-additive components which also contribute to trait architecture. In this case, we get the following

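$$\hat{\boldsymbol{\beta}}=\mathrm{diag}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{y}=\mathrm{diag}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top[\mathbf{X}\boldsymbol{\beta}+\mathbf{X}_D\boldsymbol{\omega}+\mathbf{W}\boldsymbol{\theta}+\boldsymbol{\varepsilon}]. \tag{14}$$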

Since we assume that the genotypes are orthogonal to the dominance encoding in Equation 7, we know that $\mathbf{X}^\top\mathbf{X}_D=\mathbf{0}$. This simplifies the above to the following

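$$\hat{\boldsymbol{\beta}}=\mathrm{diag}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{X}\boldsymbol{\beta}+\mathrm{diag}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{W}\boldsymbol{\theta}+\mathrm{diag}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\boldsymbol{\varepsilon} \tag{15}$$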

where the matrix $\mathbf{X}^\top\mathbf{W}$ (which we showed to be nonzero) can be interpreted as the sample correlation between individual-level genotypes and the cis-interactions between causal SNPs. By taking the expectation using Equations 10 and 12, we get the following alternative (approximate) relationship between the observed marginal GWAS summary statistics $\hat{\boldsymbol{\beta}}$ and the true coefficient values $\boldsymbol{\beta}$

$$\mathbb{E}[\hat{\boldsymbol{\beta}}]=\mathbf{R}\boldsymbol{\beta}+\mathbf{V}\boldsymbol{\theta}, \tag{16}$$

which results from our initial assumption that the residuals are normally distributed with mean zero, $\mathbb{E}[\boldsymbol{\varepsilon}]=\mathbf{0}$, in Equation 7. Here, we define $\mathbf{V}$ to represent a sample estimate of the correlation between the individual-level genotypes and the non-additive genetic interaction matrix such that $\mathbb{E}[\mathbf{X}^\top\mathbf{W}]\approx N\mathbf{V}$. Similar to the LD matrix $\mathbf{R}$, the correlation matrix $\mathbf{V}$ is also assumed to be computed from reference panel data. Intuitively, when $\boldsymbol{\theta}\neq\mathbf{0}$ there is additional phenotypic variation contributed by pairwise interactions that can be explained by GWAS effect size estimates. Moreover, when $\mathbf{V}\boldsymbol{\theta}=\mathbf{0}$, the relationship in Equation 16 converges onto the conventional asymptotic assumption (in expectation) between GWAS summary statistics and the true additive coefficients in Equation 13 (Hormozdiari et al., 2014; Hormozdiari et al., 2016; Zhang et al., 2018; Cheng et al., 2020; Demetci et al., 2021).
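
The omitted-variable argument can be illustrated with a small simulation. The Python sketch below is a toy example under assumptions that are not taken from the paper: a hypothetical allele-copying scheme generates LD among four SNPs, interactions are formed only between adjacent SNPs, the interaction columns are standardized, and the phenotype is noise-free so that the in-sample analogue of Equation 16 holds exactly rather than in expectation. All names and effect sizes are illustrative.

```python
import random


def standardize(column):
    """Center a column and scale it to unit (population) variance."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / sd for v in column]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


random.seed(1)
N, J = 300, 4


def simulate_haplotype():
    """Each allele copies its left neighbour with probability 0.7, inducing LD."""
    alleles = [int(random.random() < 0.3)]
    for _ in range(J - 1):
        copy_previous = random.random() < 0.7
        alleles.append(alleles[-1] if copy_previous else int(random.random() < 0.3))
    return alleles


genotypes = []
for _ in range(N):
    h1, h2 = simulate_haplotype(), simulate_haplotype()
    genotypes.append([a + b for a, b in zip(h1, h2)])

# X and W are stored as lists of standardized columns.
X = [standardize([row[j] for row in genotypes]) for j in range(J)]
pairs = [(j, j + 1) for j in range(J - 1)]  # adjacent "cis" pairs only
W = [standardize([a * b for a, b in zip(X[j], X[k])]) for j, k in pairs]
M = len(W)

beta = [0.5, -0.2, 0.0, 0.3]  # illustrative additive effects
theta = [0.4, 0.0, -0.3]      # illustrative interaction effects

# Noise-free phenotype: y = X beta + W theta.
y = [sum(beta[j] * X[j][i] for j in range(J)) + sum(theta[m] * W[m][i] for m in range(M))
     for i in range(N)]

# Marginal least squares estimates (Equation 11), one SNP at a time.
beta_hat = [dot(X[j], y) / dot(X[j], X[j]) for j in range(J)]

R = [[dot(X[j], X[k]) / N for k in range(J)] for j in range(J)]  # sample LD matrix
V = [[dot(X[j], W[m]) / N for m in range(M)] for j in range(J)]  # genotype-interaction correlations

additive_only = [sum(R[j][k] * beta[k] for k in range(J)) for j in range(J)]
predicted = [additive_only[j] + sum(V[j][m] * theta[m] for m in range(M)) for j in range(J)]

print([round(b, 4) for b in beta_hat])
print([round(p, 4) for p in predicted])
print([round(a, 4) for a in additive_only])
```

In this toy setting the marginal estimates match $\mathbf{R}\boldsymbol{\beta}+\mathbf{V}\boldsymbol{\theta}$ to numerical precision, while $\mathbf{R}\boldsymbol{\beta}$ alone does not. This is the sense in which additive summary statistics tag interaction effects.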

Connection to quantitative genetics theory

The concept of additive genetic effects partially explaining non-additive variation has also been described in classical quantitative genetics (Hill et al., 2008; Hivert et al., 2021; Mäki-Tanila and Hill, 2014). Consider an individual genotyped at J loci each with major and minor alleles A and B, respectively. Let $p_j$ be the allele frequency of A at the j-th locus, $a_j$ denote the additive effect, $[aa]_{jk}$ be the additive-by-additive (pairwise) interaction effect between loci j and k, and $[aaa]_{jkl}$ represent a third order interaction between loci j, k, and l. For simplicity in presentation, assume that dominance only makes a small contribution to the genetic variance (Palmer et al., 2023; Pazokitoroudi et al., 2021; Zhu et al., 2015). The population mean is given as the following

$$\mu=2\sum_{j=1}^{J}p_ja_j+4\sum_{j=1}^{J}\sum_{k>j}^{J}p_jp_k[aa]_{jk}+8\sum_{j=1}^{J}\sum_{k>j}^{J}\sum_{l>k>j}^{J}p_jp_kp_l[aaa]_{jkl}+\cdots \tag{17}$$

We follow the assumption that the genetic variation in human complex traits can predominantly be explained by additive effects, with the remaining variation being mostly explained by additive-by-additive effects (Weinreich et al., 2018; Jiang and Reif, 2015; Fisher, 1919; Lynch and Walsh, 1998). As a result, we will ignore the higher order interaction terms in Equation 17. Under Hardy-Weinberg equilibrium, we can find the average effect by taking the first derivative of the population mean with respect to the frequency of the increasing allele (Mäki-Tanila and Hill, 2014; Hivert et al., 2021). For the j-th SNP, the average effect (including terms up to second-order interaction) is given by the following

$$\eta_j=\frac{1}{2}\left(\frac{\partial\mu}{\partial p_j}\right)=a_j+2\sum_{k\neq j}^{J}p_k[aa]_{jk}+\mathcal{O}([aaa]_{jkl}) \tag{18}$$

which notably contains both the additive effect and a summation of additive-by-additive interactions between pairs of loci. The additive genetic variance for the j-th SNP takes on the following form

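$$\sigma_A^2(j)=2p_j(1-p_j)\Big[a_j+2\sum_{k\neq j}^{J}p_k[aa]_{jk}\Big]^2=2p_j(1-p_j)\Big[a_j^2+4a_j\sum_{k\neq j}^{J}p_k[aa]_{jk}+4\Big(\sum_{k\neq j}^{J}p_k[aa]_{jk}\Big)^2\Big] \tag{19}$$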

which is the product of the square of the average effect in Equation 18 and the heterozygosity at the j-th locus $\mathbb{V}[\mathbf{x}_j]=2p_j(1-p_j)$ (again assuming that SNPs marginally follow a binomial distribution). The total additive variance is then obtained by summing over the J loci such that $\sigma_A^2=\sum_j\sigma_A^2(j)$ (Falconer and Mackay, 1983).

We can derive a parallel construction for additive genetic variance using the generative random effect model presented in Equation 7 (Hivert et al., 2021). Here, we will leverage that, with genotype data taken for N individuals, $\sum_i x_{ij}/N=2p_j$. Ignoring the assumed small contributions from dominance effects, the population mean for a quantitative trait $\mathbf{y}$ can be written as the following

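$$\begin{aligned}
\mu=\frac{1}{N}\sum_{i=1}^{N}y_i&=\frac{1}{N}\sum_{i=1}^{N}\Big[b_0+\sum_{j=1}^{J}x_{ij}\beta_j+\sum_{j=1}^{J}\sum_{k>j}^{J}x_{ij}x_{ik}\theta_{jk}+\varepsilon_i\Big] \\
&=b_0+2\sum_{j=1}^{J}p_j\beta_j+4\sum_{j=1}^{J}\sum_{k>j}^{J}p_jp_k\theta_{jk}+\frac{1}{N}\sum_{i=1}^{N}\varepsilon_i.
\end{aligned} \tag{20}$$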

To find the average effect for the j-th locus, we this time take the first derivative of the population mean in Equation 20 with respect to the allele frequency such that

$$\eta_j=\frac{1}{2}\left(\frac{\partial\mu}{\partial p_j}\right)=\beta_j+2\sum_{k\neq j}^{J}p_k\theta_{jk} \tag{21}$$

which, similar to the theoretical form in quantitative genetics, also contains both the additive effect of the j-th SNP and additional terms encoding the interaction effect between the j-th SNP and all other variants in the data. Once again, under Hardy-Weinberg equilibrium, the additive variance for the j-th SNP takes the following form

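$$\sigma_A^2(j)=2p_j(1-p_j)\Big[\beta_j+2\sum_{k\neq j}^{J}p_k\theta_{jk}\Big]^2=2p_j(1-p_j)\Big[\beta_j^2+4\beta_j\sum_{k\neq j}^{J}p_k\theta_{jk}+4\Big(\sum_{k\neq j}^{J}p_k\theta_{jk}\Big)^2\Big] \tag{22}$$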

where we can explicitly draw connections between the two frameworks by setting $\beta_j=a_j$ and $\theta_{jk}=[aa]_{jk}$. Note that when there are no non-additive effects (such that $\boldsymbol{\theta}=\mathbf{0}$), the above reduces to $\sigma_A^2=\sum_j2p_j(1-p_j)\beta_j^2$, which resembles the classical form for the additive genetic variance (Lynch and Walsh, 1998).
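
The average-effect calculation can be sanity-checked numerically. The Python sketch below is illustrative only: it encodes the population mean of Equation 20 without the residual term, differentiates it by central finite differences, and compares the result with the closed form of Equation 21. The allele frequencies, effect sizes, and function names are hypothetical.

```python
def population_mean(p, beta, theta, b0=0.0):
    """Equation 20 without the residual term; theta maps (j, k) with j < k to an effect."""
    additive = 2 * sum(pj * bj for pj, bj in zip(p, beta))
    pairwise = 4 * sum(p[j] * p[k] * t for (j, k), t in theta.items())
    return b0 + additive + pairwise


def average_effect(p, beta, theta, j):
    """Equation 21: beta_j + 2 * sum over k != j of p_k * theta_jk."""
    total = beta[j]
    for (a, b), t in theta.items():
        if a == j:
            total += 2 * p[b] * t
        elif b == j:
            total += 2 * p[a] * t
    return total


def numerical_average_effect(p, beta, theta, j, h=1e-6):
    """Half the central finite-difference derivative of the mean with respect to p_j."""
    up, down = list(p), list(p)
    up[j] += h
    down[j] -= h
    return 0.5 * (population_mean(up, beta, theta) - population_mean(down, beta, theta)) / (2 * h)


def additive_variance(p, beta, theta, j):
    """Equation 22: heterozygosity times the squared average effect."""
    return 2 * p[j] * (1 - p[j]) * average_effect(p, beta, theta, j) ** 2


p = [0.1, 0.3, 0.25]                                  # illustrative allele frequencies
beta = [0.2, -0.1, 0.05]                              # illustrative additive effects
theta = {(0, 1): 0.3, (0, 2): -0.2, (1, 2): 0.1}      # illustrative pairwise effects

for j in range(len(p)):
    print(j, average_effect(p, beta, theta, j), numerical_average_effect(p, beta, theta, j))
```

Because the mean is linear in each $p_j$ once the other frequencies are held fixed, the finite-difference derivative matches the closed form up to rounding. Setting `theta` to an empty dictionary recovers the purely additive variance $2p_j(1-p_j)\beta_j^2$.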

Full derivation of interaction LD score regression

In order to derive the interaction LD score (i-LDSC) regression framework, recall that our goal is to recover missing heritability from GWAS summary statistics by incorporating an additional score that measures the non-additive genetic variation that is tagged by genotyped SNPs. To do this, we build upon the LD score regression framework and the LDSC software (Bulik-Sullivan et al., 2015b). Here, we assume nonzero contributions from cis-acting pairwise interaction effects in the generative model of complex traits as in Equation 16, and we use the observed least squares estimates from Equation 11 to compute chi-square statistics $\chi_j^2=N\hat{\beta}_j^2$ for every variant $j=1,\ldots,J$ in the data. Taking the expectation of these statistics yields

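$$\mathbb{E}[\chi_j^2] = N\,\mathbb{E}[\hat{\beta}_j^2] = N\left[\mathbb{V}[\hat{\beta}_j] + \left(\mathbb{E}[\hat{\beta}_j]\right)^2\right]. \tag{23}$$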

We can simplify Equation 23 in two steps. First, by combining the prior assumption in Equation 8 and the asymptotic approximation in Equation 16, we can show that the marginal expectation (i.e. when not conditioning on the true coefficients) is $\mathbb{E}[\hat{\beta}_j] = 0$ for all variants. Second, by conditioning on the generative model from Equation 7, we can use the law of total variance to simplify $\mathbb{V}[\hat{\beta}_j]$ where

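$$\begin{aligned}
\mathbb{V}[\hat{\beta}_j] &= \mathbb{E}\left[\mathbb{V}[\hat{\beta}_j \mid \mathbf{X}]\right] + \mathbb{V}\left[\mathbb{E}[\hat{\beta}_j \mid \mathbf{X}]\right] \\
&\approx \mathbb{E}\left[\mathbb{V}[\mathbf{x}_j^\top\mathbf{y}/N \mid \mathbf{X}]\right] + 0 \\
&= \mathbb{E}\left[\frac{1}{N^2}\,\mathbf{x}_j^\top\left\{\mathbb{V}[\mathbf{y} \mid \mathbf{X}]\right\}\mathbf{x}_j\right] \\
&= \mathbb{E}\left[\frac{1}{N^2}\,\mathbf{x}_j^\top\left\{\frac{\varphi_\beta^2}{J}\mathbf{X}\mathbf{X}^\top + \frac{\varphi_\omega^2}{J}\mathbf{X}_D\mathbf{X}_D^\top + \frac{\varphi_\theta^2}{M}\mathbf{W}\mathbf{W}^\top + (1-H^2)\mathbf{I}\right\}\mathbf{x}_j\right] \\
&= \mathbb{E}\left[\frac{1}{N^2}\left\{\frac{\varphi_\beta^2}{J}\mathbf{x}_j^\top\mathbf{X}\mathbf{X}^\top\mathbf{x}_j + \frac{\varphi_\omega^2}{J}\mathbf{x}_j^\top\mathbf{X}_D\mathbf{X}_D^\top\mathbf{x}_j + \frac{\varphi_\theta^2}{M}\mathbf{x}_j^\top\mathbf{W}\mathbf{W}^\top\mathbf{x}_j + N(1-H^2)\right\}\right] \\
&= \mathbb{E}\left[\frac{1}{N^2}\left\{\frac{\varphi_\beta^2}{J}\mathbf{x}_j^\top\mathbf{X}\mathbf{X}^\top\mathbf{x}_j + \frac{\varphi_\theta^2}{M}\mathbf{x}_j^\top\mathbf{W}\mathbf{W}^\top\mathbf{x}_j + N(1-H^2)\right\}\right]
\end{aligned}$$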

since $\mathbf{x}_j^\top\mathbf{X}_D = \mathbf{0}$. Using the same logic from the original LDSC regression framework (Bulik-Sullivan et al., 2015b), we can use Isserlis’ theorem (Isserlis, 1918) to write the above in terms of more familiar quantities based on sample correlations

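$$\frac{1}{N^2}\,\mathbf{x}_j^\top\mathbf{X}\mathbf{X}^\top\mathbf{x}_j = \sum_{k=1}^{J}\tilde{r}_{jk}^2, \qquad \frac{1}{N^2}\,\mathbf{x}_j^\top\mathbf{W}\mathbf{W}^\top\mathbf{x}_j = \sum_{m=1}^{M}\tilde{v}_{jm}^2 \tag{24}$$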

where $\tilde{r}_{jk}$ is used to denote the sample correlation between additively-coded genotypes at the j-th and k-th variants, and $\tilde{v}_{jm}$ is used to denote the sample correlation between the genotype of the j-th variant and the m-th genetic interaction on the phenotype of interest (again see Equation 16). Furthermore, we can use the delta method (only displaying terms up to $\mathcal{O}(1/N^2)$) to show that (in expectation)

$$\mathbb{E}[\tilde{r}_{jk}^2] \approx r_{jk}^2 + (1-r_{jk}^2)/N, \qquad \mathbb{E}[\tilde{v}_{jm}^2] \approx v_{jm}^2 + (1-v_{jm}^2)/N. \tag{25}$$

Next, we can then approximate the quantities in Equation 24 via the following

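$$\mathbb{E}\left[\sum_{k=1}^{J}\tilde{r}_{jk}^2\right] \approx \ell_j + (J-\ell_j)/N, \qquad \mathbb{E}\left[\sum_{m=1}^{M}\tilde{v}_{jm}^2\right] \approx f_j + (M-f_j)/N \tag{26}$$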
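The finite-sample inflation in Equations 25 and 26 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it computes LD scores from sample correlations and, optionally, applies a correction obtained by solving Equation 25 for $r_{jk}^2$. The helper names are hypothetical, and the correction shown is this algebraic inversion rather than necessarily the estimator used in the LDSC or i-LDSC software.

```python
import numpy as np

def adjusted_r2(r2_sample, n):
    """Solve E[r~^2] ~ r^2 + (1 - r^2)/N (Equation 25) for r^2."""
    return (n * r2_sample - 1.0) / (n - 1.0)

def ld_scores(X, adjust=True):
    """LD scores ell_j = sum_k r_jk^2 (k = j included) from an N x J genotype matrix."""
    n = X.shape[0]
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardise each SNP
    r2 = (Xs.T @ Xs / n) ** 2                   # squared sample correlations
    if adjust:
        r2 = adjusted_r2(r2, n)
    return r2.sum(axis=1)

# Toy genotypes: independent SNPs, so every true LD score equals 1.
rng = np.random.default_rng(0)
X = rng.binomial(2, 0.3, size=(500, 20)).astype(float)
ell_raw = ld_scores(X, adjust=False)   # inflated by roughly (J - 1)/N
ell_adj = ld_scores(X, adjust=True)
```

Because the simulated SNPs are independent, the unadjusted scores sit above 1 by an amount on the order of $(J - \ell_j)/N$, as Equation 26 predicts.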

where $\ell_j$ is the corresponding LD score for the additive effect of the j-th variant and $f_j$ represents the “interaction” LD score between the j-th SNP and all other variants in the data set (Crawford et al., 2017). Altogether, this leads to the specification of the univariate framework for the j-th SNP

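$$\mathbb{E}[\chi_j^2] \approx N\left[\left(\frac{\varphi_\beta^2}{J}\right)\ell_j + \left(\frac{\varphi_\theta^2}{M}\right)f_j + \frac{1}{N}(1-H^2)\right] = \ell_j\tau + f_j\vartheta + 1 \tag{27}$$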

where we define $\tau = N\varphi_\beta^2/J$ as an estimate of the additive genetic signal, the coefficient $\vartheta = N\varphi_\theta^2/M$ as an estimate of the proportion of phenotypic variation explained by tagged pairwise interaction effects, and 1 is the intercept meant to model the misestimation due to uncontrolled confounding effects (e.g. cryptic relatedness and population stratification). Similar to the original LDSC formulation, an intercept greater than one indicates significant bias. Note that the simplification for many of the terms above, such as $(1-H^2)/N \approx 1/N$, results from our assumption that the number of individuals in our study is large. For example, the sample sizes for each biobank-scale study considered in the analyses of this manuscript are at least on the order of $N \geq 10^4$ observations (see Supplementary file 5). Altogether, we can jointly express Equation 27 in multivariate form as

$$\mathbb{E}[\boldsymbol{\chi}^2] \approx \boldsymbol{\ell}\tau + \boldsymbol{f}\vartheta + \mathbf{1} \tag{28}$$

where $\boldsymbol{\chi}^2 = (\chi_1^2, \ldots, \chi_J^2)$ is a $J$-dimensional vector of chi-square summary statistics, and $\boldsymbol{\ell} = (\ell_1, \ldots, \ell_J)$ and $\boldsymbol{f} = (f_1, \ldots, f_J)$ are $J$-dimensional vectors of additive and cis-interaction LD scores, respectively. It is important to note that, while $\boldsymbol{\chi}^2$ must be recomputed for each trait of interest, both vectors $\boldsymbol{\ell}$ and $\boldsymbol{f}$ only need to be constructed once per reference panel or set of individual-level genotypes (see next section for efficient computational strategies).

To identify summary statistics that have significant tagged interaction effects, we test the null hypothesis H0:ϑ=0. The i-LDSC software package implements the same model fitting strategy as LDSC. Here, we use weighted least squares to fit the joint regression in Equation 28 such that

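$$\hat{\vartheta} = (\boldsymbol{f}^\top\boldsymbol{\Psi}\boldsymbol{f})^{-1}\boldsymbol{f}^\top\boldsymbol{\Psi}\boldsymbol{\chi}^2, \qquad \psi_{jj} = \left[\ell_j\hat{\tau} + f_j\hat{\vartheta} + 1\right]^{-2} \tag{29}$$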

where $\boldsymbol{\Psi}$ is a $J \times J$ diagonal weight matrix with nonzero elements set to values inversely proportional to the conditional variance $\mathbb{V}[\chi_j^2 \mid \ell_j, f_j] = \psi_{jj}^{-1}$ to adjust for both heteroscedasticity and over-estimation of the summary statistics for each SNP (Bulik-Sullivan et al., 2015b). Standard errors for each coefficient estimate are derived via a jackknife over blocks of SNPs in the data (Finucane et al., 2015), and we then use those standard errors to derive p-values with a two-sided test (i.e. testing the alternative hypothesis $H_A: \vartheta \neq 0$). It is worth noting that the block-jackknife approach tends to be conservative and yield larger standard errors for hypothesis testing (Efron, 1982). As an alternative, we could first run i-LDSC using the block-jackknife procedure over all traits in a study and then use the average of the standard errors to calculate the statistical significance of coefficient estimates; but we do not explore this strategy here and leave that for future work. The quantitative genetics expression for the additive variance $\sigma_A^2$ in Equation 22 is important because it represents the theoretical upper bound on the proportion of phenotypic variance that can be explained from GWAS summary statistics via i-LDSC. Using this relationship, we can write the following (approximate) inequality
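The fitting and testing steps described above can be sketched as follows. This is a minimal illustration rather than the i-LDSC implementation: it assumes a joint three-column design (additive LD score, cis-interaction LD score, intercept), a small fixed number of reweighting passes, contiguous equally sized jackknife blocks, and a normal approximation for the two-sided p-value. All function names and the toy coefficients are hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def wls(A, y, w):
    """Weighted least squares: minimise sum_j w_j * (y_j - A_j @ b)^2."""
    Aw = A * w[:, None]
    return np.linalg.solve(A.T @ Aw, Aw.T @ y)

def fit_ildsc(chi2, ell, f, n_iter=2):
    """Fit E[chi2_j] ~ ell_j*tau + f_j*vartheta + intercept (Equation 28)."""
    A = np.column_stack([ell, f, np.ones_like(ell)])
    coef = wls(A, chi2, np.ones_like(chi2))              # unweighted start
    for _ in range(n_iter):
        tau, vartheta = coef[0], coef[1]
        fitted = np.maximum(ell * tau + f * vartheta + 1.0, 1e-8)
        coef = wls(A, chi2, fitted ** -2.0)              # psi_jj as in Equation 29
    return coef                                          # tau, vartheta, intercept

def block_jackknife(chi2, ell, f, n_blocks=200):
    """Delete-one-block jackknife estimates and standard errors."""
    blocks = np.array_split(np.arange(len(chi2)), n_blocks)  # contiguous SNP blocks
    full = fit_ildsc(chi2, ell, f)
    loo = np.array([fit_ildsc(np.delete(chi2, b), np.delete(ell, b), np.delete(f, b))
                    for b in blocks])
    pseudo = n_blocks * full - (n_blocks - 1) * loo      # jackknife pseudo-values
    return pseudo.mean(axis=0), np.sqrt(pseudo.var(axis=0, ddof=1) / n_blocks)

# Toy data with made-up coefficients (not values from this study).
rng = np.random.default_rng(1)
J = 4000
ell = rng.uniform(1.0, 50.0, J)
f = rng.uniform(1.0, 30.0, J)
chi2 = (0.02 * ell + 0.01 * f + 1.0) * rng.chisquare(1, J)

est, se = block_jackknife(chi2, ell, f)
z = est[1] / se[1]
p_value = erfc(abs(z) / sqrt(2.0))   # two-sided test of H0: vartheta = 0
```

Scaled chi-square draws have variance proportional to the square of their mean, which is why weights of the form $\psi_{jj}$ in Equation 29 are a natural choice for this toy example.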

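$$\hat{\tau} + \hat{\vartheta} \lesssim \sum_{j=1}^{J} 2p_j(1-p_j)\left[\beta_j + 2\sum_{k\neq j}^{J} p_k\theta_{jk}\right]^2 = \sigma_A^2. \tag{30}$$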

For all analyses in this paper, we estimate the proportion of phenotypic variance explained by genetic effects using the sum of the coefficients $\hat{\tau} + \hat{\vartheta}$ (i.e. the estimated additive component plus the additional genetic variance explained by the tagged pairwise interaction effects).

Efficient computation of cis-interaction LD scores

In practice, cis-interaction LD scores in i-LDSC can be computed efficiently by exploiting two key opportunities for optimization. First, given $J$ SNPs, the full matrix of genome-wide interaction effects $\mathbf{W}$ contains on the order of $J(J-1)/2$ total pairwise interactions. However, to compute the cis-interaction score for each SNP, we can simply replace the full $\mathbf{W}$ matrix with a subsetted matrix $\mathbf{W}_j$ which includes only interactions involving the j-th SNP. Analogous to the original LDSC formulation (Bulik-Sullivan et al., 2015b), we consider only interactive SNPs within a cis-window proximal to the focal j-th SNP for which we are computing the i-LDSC score. In the original LDSC model, this is based on the observation that LD decays outside of a window of 1 centimorgan (cM) (Bulik-Sullivan et al., 2015b); therefore, SNPs outside the 1 cM window centered on the j-th SNP will not significantly contribute to its LD score. The second opportunity for optimization comes from the fact that the matrix of interaction effects for any focal SNP, $\mathbf{W}_j$, does not need to be explicitly generated. Referencing Equation 24, the i-LDSC scores are defined as $\mathbf{x}_j^\top\mathbf{W}_j\mathbf{W}_j^\top\mathbf{x}_j/N^2$. This can be re-written as $\mathbf{x}_j^\top(\mathbf{D}_j\mathbf{X}^{(j)})(\mathbf{D}_j\mathbf{X}^{(j)})^\top\mathbf{x}_j/N^2$, where $\mathbf{D}_j = \mathrm{diag}(\mathbf{x}_j)$ is a diagonal matrix with the j-th genotype as its nonzero elements (Crawford et al., 2017) and $\mathbf{X}^{(j)}$ denotes the subset of SNPs within a cis-window proximal to the focal j-th SNP. This means that the i-LDSC score for the j-th SNP can be simply computed as the following

$$f_j \approx \frac{1}{N^2}\,(\mathbf{x}_j^2)^\top\mathbf{X}^{(j)}\mathbf{X}^{(j)\top}(\mathbf{x}_j^2). \tag{31}$$

With these simplifications, the computational complexity of generating i-LDSC scores reduces to that of computing LD scores, modulo a vector-by-vector Hadamard product which, for each SNP, is a constant factor of $N$ (i.e. the number of genotyped individuals).
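The equivalence between the explicit construction and the shortcut in Equation 31 can be checked with a short sketch. The code below assumes standardized genotypes, a window measured in SNP counts, and that the focal SNP is excluded from its own window; these choices and the function names are illustrative assumptions, not a description of the i-LDSC software.

```python
import numpy as np

def window_columns(J, j, window):
    """Indices of SNPs within +/- `window` of focal SNP j, excluding j itself."""
    lo, hi = max(0, j - window), min(J, j + window + 1)
    return [k for k in range(lo, hi) if k != j]

def cis_score_fast(X, j, window=50):
    """Equation 31: uses the elementwise square of x_j, never forms W_j."""
    n, J = X.shape
    cols = window_columns(J, j, window)
    xj2 = X[:, j] * X[:, j]                  # Hadamard product x_j * x_j
    v = X[:, cols].T @ xj2                   # one entry per cis-interaction
    return float(v @ v) / n ** 2

def cis_score_explicit(X, j, window=50):
    """Same quantity, building W_j = diag(x_j) X^(j) explicitly."""
    n, J = X.shape
    cols = window_columns(J, j, window)
    Wj = X[:, [j]] * X[:, cols]              # N x (window size) interaction matrix
    v = Wj.T @ X[:, j]
    return float(v @ v) / n ** 2

# Toy standardized genotypes (simulated, independent SNPs).
rng = np.random.default_rng(2)
G = rng.binomial(2, 0.4, size=(200, 120)).astype(float)
X = (G - G.mean(axis=0)) / G.std(axis=0)

fast = cis_score_fast(X, j=60, window=50)
explicit = cis_score_explicit(X, j=60, window=50)
```

Both functions return the same value; the fast version avoids allocating the interaction matrix for each focal SNP.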

Coefficient estimates as determined by cis-interaction window size

When computing cis-interaction LD scores, the most important decision is choosing the number of interacting SNPs to include in $\mathbf{X}^{(j)}$ (or equivalently $\mathbf{W}_j$ for each j-th focal SNP in the calculation of $f_j$ in Equation 31). The i-LDSC framework considers different estimating windows to account for our lack of a priori knowledge about the ‘correct’ non-additive genetic architecture of traits. Theoretically, one could follow previous work (Guan and Stephens, 2011; Carbonetto and Stephens, 2012; Zhou et al., 2013; Zhu and Stephens, 2017; Zhu and Stephens, 2018; Demetci et al., 2021) by considering an $L$-valued grid of possible SNP interaction window sizes. After fitting a series of i-LDSC regressions with cis-interaction LD scores $\boldsymbol{f}^{(l)}$ generated under the $L$ different window sizes, we could compute normalized importance weights using their maximized likelihoods via the following

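$$\pi^{(l)} = \frac{\mathcal{L}(\boldsymbol{\ell}, \boldsymbol{f}^{(l)}; \hat{\boldsymbol{\beta}})}{\sum_{l'}\mathcal{L}(\boldsymbol{\ell}, \boldsymbol{f}^{(l')}; \hat{\boldsymbol{\beta}})}, \qquad \sum_{l=1}^{L}\pi^{(l)} = 1. \tag{32}$$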

As a final step in the model fitting procedure, we could then compute averaged estimates of the coefficients τ and ϑ by marginalizing (or averaging) over the L-different grid combinations of estimating windows

$$\hat{\tau} = \sum_{l=1}^{L}\pi^{(l)}\hat{\tau}^{(l)}, \qquad \hat{\vartheta} = \sum_{l=1}^{L}\pi^{(l)}\hat{\vartheta}^{(l)}. \tag{33}$$

This final step can be viewed as an analogy to model averaging where marginal estimates are computed via a weighted average using the importance weights (Hoeting et al., 1999). In the current study, we explore the utility of cis-interaction LD scores generated with different window sizes ± 5, ± 10, ± 25, and ± 50 SNPs around each j-th focal SNP. In practice, we find that cis-interaction LD scores that are calculated using larger windows lead to the most robust estimates of heritability while also not over representing the total phenotypic variation explained by tagged non-additive genetic effects (see Figure 3—figure supplement 1). Therefore, unless otherwise stated, we use cis-interaction LD scores calculated with a ± 50 SNP interaction window for all simulations and real data analyses conducted in this work. For a direct comparison between choosing a single window size versus the model averaging strategy described above, see Supplementary files 1 and 2.
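The model averaging in Equations 32 and 33 can be sketched in a few lines. The example assumes that a maximized log-likelihood is available for each window size and works on the log scale for numerical stability; the function names and all numbers below are made up for illustration.

```python
import numpy as np

def importance_weights(loglik):
    """Equation 32 on the log scale: pi_l proportional to exp(loglik_l)."""
    loglik = np.asarray(loglik, dtype=float)
    w = np.exp(loglik - loglik.max())     # subtract the max to avoid overflow
    return w / w.sum()

def model_average(estimates, loglik):
    """Equation 33: weighted average of per-window coefficient estimates."""
    return float(importance_weights(loglik) @ np.asarray(estimates, dtype=float))

# Hypothetical results for windows of +/- 5, 10, 25, and 50 SNPs.
loglik = [-1204.2, -1203.1, -1201.7, -1201.5]
vartheta_hat = [0.010, 0.014, 0.019, 0.021]

weights = importance_weights(loglik)
vartheta_avg = model_average(vartheta_hat, loglik)
```

Windows with higher maximized likelihood receive more weight, so the averaged estimate lies between the smallest and largest per-window estimates.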

Relationship between minor allele frequency and effect size

The LDSC software computes LD scores using annotations over equally spaced minor allele frequency (MAF) bins. These annotations enable the per trait relationship between the MAF and the effect size of each variant in the genome to vary based on the discrete category (or MAF bin) it is placed into. This additional flexibility is intended to help LDSC be more robust when estimating heritability. The relationship between MAF and effect size is already implicitly encoded in the LDSC formulation since we assume genotypes are normalized. When normalizing by the variance of each SNP (or equivalently its MAF), we make the assumption that rare variants inherently have larger effect sizes. There exists a true functional relationship between MAF and effect size which is likely to be somewhere between the two extremes of (i) normalizing each SNP by its MAF and (ii) allowing the variance per SNP to be dictated by its MAF.

Recent approaches have proposed using a single parameter α to better represent the nonlinear relationship between MAF and variant effect size. The main idea is that this α not only provides the same additional flexibility to LDSC as the MAF-based discrete annotations, but it also empirically yields even more precise heritability estimates (Zabad et al., 2021). Namely, we use

$$\ell_j(c) := \sum_{k} L_{jk}(\alpha)\,a_c(k), \qquad L_{jk}(\alpha) = r_{jk}^2\,\mathbb{V}[\mathbf{x}_k]^{1-\alpha} \tag{34}$$

where ac(k) is the annotation value for the c-th categorical bin. The α parameter is unknown in practice and needs to be estimated for any given trait. While standard ranges for α can be used for heritability estimates, we use a restricted maximum likelihood (REML) based method which was recently developed (Schoech et al., 2019).
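To make the construction in Equation 34 concrete, the sketch below computes $\alpha$-weighted LD scores for a set of annotation bins. It takes the exponent exactly as printed in Equation 34 and assumes that $\mathbb{V}[\mathbf{x}_k] = 2p_k(1-p_k)$ under Hardy-Weinberg equilibrium; the function name and toy inputs are hypothetical.

```python
import numpy as np

def alpha_ld_scores(r2, var_x, annot, alpha):
    """Equation 34: ell_j(c) = sum_k r_jk^2 * Var[x_k]^(1 - alpha) * a_c(k).

    r2    : J x J matrix of squared correlations r_jk^2
    var_x : length-J vector of per-SNP genotype variances
    annot : J x C matrix of annotation values a_c(k)
    """
    L = r2 * var_x[None, :] ** (1.0 - alpha)   # weight column k by Var[x_k]^(1 - alpha)
    return L @ annot                           # J x C matrix of scores

# Toy inputs: three unlinked SNPs and one all-ones annotation.
p = np.array([0.5, 0.2, 0.05])
var_x = 2.0 * p * (1.0 - p)                    # assumes Hardy-Weinberg equilibrium
r2 = np.eye(3)
annot = np.ones((3, 1))

scores = alpha_ld_scores(r2, var_x, annot, alpha=-1.0)
```

For unlinked SNPs the score of each variant reduces to its own variance raised to the power $1-\alpha$, which makes the effect of $\alpha$ easy to inspect.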

In the i-LDSC software, we use this α construction to handle the relationship between MAF and variant effect size for two specific reasons. First, by constructing the LD scores using α, we more accurately capture the variation in chi-square test statistics due to additive effects (Zabad et al., 2021). Second, we note that there is correlation between MAF and (i) LD scores, (ii) cis-interaction LD scores, and (iii) trait architecture. To that end, if we do not properly condition on MAF, additional bias arises, and we may falsely attribute some amount of variation in the chi-square test statistics to LD or the tagged interaction effects. Therefore, in our formulation, we include an α term on the LD scores to condition on this effect. We demonstrate in simulations that this removes the bias introduced by the relationship between MAF and trait architecture, and it mitigates potential inflation of type I error rates in the i-LDSC test.

Estimation of allele frequency parameters

In the main text, we analyzed 25 complex traits in both the UK Biobank and BioBank Japan data sets. In order to account for minor allele frequency (MAF) dependent trait architecture, we calculated α values for each trait that had not been analyzed by previous studies (Schoech et al., 2019). The α estimates for each of the 25 traits analyzed in this study are shown in Supplementary file 4. Intuitively, α parameterizes the weighting of the effects of each individual variant given its frequency in the study cohort and can take on values in the range of [–1,0]. More negative values of α indicate that lower frequency variants contribute more to the observed variation in a trait of interest, whereas values of α closer to zero indicate that common variants contribute a greater amount of variation to observed trait values.

We took α values for 11 traits (again see Supplementary file 4) that had previously been calculated by Schoech et al. For the remaining 14 traits analyzed in this study, we followed the estimation protocol described in the same manuscript. Specifically, using the variants passing the quality control step in our pipeline for 25,000 randomly selected individuals in the UK Biobank cohort, we constructed MAF-dependent genetic relatedness matrices for values of $\alpha \in \{-1, -0.95, -0.9, \ldots, 0\}$ using the GRM-MAF-LD software (Schoech, 2018). We then used the GCTA software (Yang et al., 2011) to obtain heritability and likelihood estimates using REML for each α-trait pairing. Finally, we fit a trait-specific profile likelihood across the range of α values and estimated the maximum likelihood value of α using a natural cubic spline.

Simulation studies

We used a simulation scheme to generate synthetic quantitative traits and SNP-level summary statistics under multiple genetic architectures using real genome-wide data from individuals of self-identified European ancestry in the UK Biobank. Here, we consider phenotypes that have some combination of additive effects, cis-acting interactions, and a third source of genetic variance stemming from either gene-by-environment (G×E) or gene-by-ancestry (G×Ancestry) effects. For each scenario, we select some set of SNPs to be causal and assume that complex traits are generated via the following general linear model

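$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{W}\boldsymbol{\theta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \delta^2\mathbf{I}), \tag{35}$$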

where $\mathbf{y}$ is an $N$-dimensional vector containing all the phenotypes; $\mathbf{X}$ is an $N \times J$ matrix of genotypes encoded as 0, 1, or 2 copies of a reference allele; $\boldsymbol{\beta}$ is a $J$-dimensional vector of additive effect sizes for each SNP; $\mathbf{W}$ is an $N \times M$ matrix which holds all pairwise interactions between the randomly selected subset of the interacting SNPs with corresponding effects $\boldsymbol{\theta}$; $\mathbf{Z}$ is an $N \times K$ matrix of either G×E or G×Ancestry interactions with coefficients $\boldsymbol{\gamma}$; and $\boldsymbol{\varepsilon}$ is an $N$-dimensional vector of environmental noise. The phenotypic variation is assumed to be $\mathbb{V}[\mathbf{y}] = 1$. All additive and interaction effect sizes for SNPs are randomly drawn from independent standard Gaussian distributions and then rescaled so that they explain a fixed proportion of the phenotypic variance $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] + \mathbb{V}[\mathbf{W}\boldsymbol{\theta}] + \mathbb{V}[\mathbf{Z}\boldsymbol{\gamma}] = H^2$. Note that we do not assume any specific correlation structure between the effect sizes $\boldsymbol{\beta}$, $\boldsymbol{\theta}$, and $\boldsymbol{\gamma}$. We then rescale the random error term such that $\mathbb{V}[\boldsymbol{\varepsilon}] = (1-H^2)$. In the main text, we compare the traditional LDSC to its direct extension in i-LDSC. For each method, GWAS summary statistics are computed by fitting a single-SNP univariate linear model via least squares where $\hat{\beta}_j = (\mathbf{x}_j^\top\mathbf{x}_j)^{-1}\mathbf{x}_j^\top\mathbf{y}$ for each of the $j = 1, \ldots, J$ SNPs in the data. These effect size estimates are used to derive the chi-square test statistics $\chi_j^2 = N\hat{\beta}_j^2$. We implement both LDSC and i-LDSC with the LD matrix $\mathbf{R} = \mathbf{X}^\top\mathbf{X}/N$ and the cis-interaction correlation matrix $\mathbf{V} = \mathbf{X}^\top\mathbf{W}/N$ being computed using a reference panel of 489 individuals from the European superpopulation (EUR) of the 1000 Genomes Project (https://mathgen.stats.ox.ac.uk/impute/data_download_1000G_phase1_integrated.html). The resulting matrices $\mathbf{R}$ and $\mathbf{V}$ are used to compute the additive and cis-interaction LD scores, respectively.
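A minimal sketch of this generative scheme, restricted to the additive and interaction terms of Equation 35 (the $\mathbf{Z}\boldsymbol{\gamma}$ term is omitted), is given below. It assumes standardized genotypes, simulated rather than UK Biobank data, and interactions between adjacent SNP pairs; all function names and parameter values are illustrative.

```python
import numpy as np

def scale_to_variance(component, target):
    """Rescale a vector so that its sample variance equals `target`."""
    return component * np.sqrt(target / component.var())

def simulate_trait(X, W, h2, rho, rng):
    """y = X beta + W theta + eps with V[X beta] = rho*H2 and V[W theta] = (1-rho)*H2."""
    y = scale_to_variance(X @ rng.standard_normal(X.shape[1]), rho * h2)
    if rho < 1:                                   # rho = 1 is the additive-only null
        y = y + scale_to_variance(W @ rng.standard_normal(W.shape[1]), (1 - rho) * h2)
    eps = scale_to_variance(rng.standard_normal(X.shape[0]), 1 - h2)
    return y + eps

def gwas_chi2(X, y):
    """Per-SNP least squares estimates and chi-square statistics N * beta_hat^2."""
    beta_hat = (X.T @ y) / np.einsum("ij,ij->j", X, X)   # (x_j' x_j)^{-1} x_j' y
    return X.shape[0] * beta_hat ** 2

# Toy standardized genotypes and a handful of adjacent-pair interactions.
rng = np.random.default_rng(4)
G = rng.binomial(2, 0.3, size=(500, 50)).astype(float)
X = (G - G.mean(axis=0)) / G.std(axis=0)
W = np.column_stack([X[:, k] * X[:, k + 1] for k in range(10)])

y = simulate_trait(X, W, h2=0.3, rho=0.8, rng=rng)
chi2 = gwas_chi2(X, y)
```

Each component is rescaled separately, so the total phenotypic variance is only approximately one in any finite sample because of chance correlations between components.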

Polygenic simulations with cis-interactions

In our first set of simulations, we consider phenotypes with polygenic architectures that are made up of only additive and cis-acting SNP-by-SNP interactions. Here, we begin by assuming that every SNP in the genome has at least a small additive effect on the traits of interest. Next, when generating synthetic traits, we assume that the additive effects make up a fraction $\rho$ of the heritability while the pairwise interactions make up the remaining $(1-\rho)$. Equivalently, the proportion of phenotypic variance explained by additivity is $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = \rho H^2$, while the proportion explained by interactions is $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = (1-\rho)H^2$. The setting of $\rho = 1$ represents the limiting null case for i-LDSC where the variation of a trait is driven solely by additive effects. Here, we use the same simulation strategy used in Crawford et al. where we divide the causal cis-interaction variants into two groups. One may view the SNPs in group #1 as being the ‘hubs’ of an interaction map. SNPs in group #2 are selected to be variants within some kilobase (kb) window around each SNP in group #1. Given different parameters for the generative model in Equation 35, we simulate data mirroring a wide range of genetic architectures by toggling the following parameters:

  • heritability: $H^2 = 0.3$ and $0.6$;

  • proportion of phenotypic variation that is generated by additive effects: ρ= 0.5, 0.8, and 1;

  • percentage of SNPs selected to be in group #1: 1% (sparse), 5%, and 10% (polygenic);

  • genomic window used to assign SNPs to group #2: ± 10 and ± 100 kilobase (kb);

  • allele frequency parameter: $\alpha = -1$, $-0.5$, and $0$.

All figures and tables show the mean performances (and standard errors) across 100 simulated replicates.
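For orientation, the parameter settings listed above can be enumerated programmatically. The sketch below assumes a fully crossed design, which is an assumption made here for illustration; the text above does not state whether every combination was run. The dictionary keys are hypothetical names.

```python
from itertools import product

# Parameter values taken from the list above; key names are illustrative.
grid = {
    "h2": [0.3, 0.6],
    "rho": [0.5, 0.8, 1.0],
    "group1_pct": [1, 5, 10],
    "window_kb": [10, 100],
    "alpha": [-1.0, -0.5, 0.0],
}

# One dictionary per combination, assuming every setting is crossed with every other.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
n_replicates = 100
n_datasets = len(configs) * n_replicates
```

Under the fully crossed assumption this yields 108 configurations, each of which would be simulated 100 times.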

Polygenic simulations with gene-by-environmental effects

In our second set of simulations, we continue to consider phenotypes with polygenic architectures that are made up of only additive and cis-acting SNP-by-SNP interactions; however, now we also consider each trait to have contributions stemming from nonzero G×E effects. Here, both the additive and cis-interaction effects are simulated in the same way as previously described where, for the two groups of interacting variants, 10% of SNPs were selected to be in group #1 and we chose ±10 kb windows to assign SNPs to group #2. To create G×E effects, we follow a simulation strategy implemented by Zhu et al. and split our sample population in half to emulate two subsets of individuals coming from different environments. We randomly draw the effect sizes for the first environment from a standard Gaussian distribution which we denote as $\boldsymbol{\gamma}_1$. We then selected an amplification coefficient $w$ and set the effect sizes of the G×E interactions in the second environment to be a scaled version of the first environment effects where $\boldsymbol{\gamma}_2 = w\boldsymbol{\gamma}_1$. In this paper, we generate traits with heritability $H^2 = \{0.3, 0.6\}$ and amplification coefficients set to $w = \{1.1, 1.2, \ldots, 2\}$. For the first set of simulations, we hold the proportion of phenotypic variation explained by the different genetic components constant by fixing:

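  • $H^2 = 0.3$: $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = 0.15$; $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = 0.075$; and $\mathbb{V}[\mathbf{Z}\boldsymbol{\gamma}] = 0.075$;

  • $H^2 = 0.6$: $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = 0.3$; $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = 0.15$; and $\mathbb{V}[\mathbf{Z}\boldsymbol{\gamma}] = 0.15$;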

where $\mathbf{Z} = [\mathbf{X}_1, \mathbf{X}_2]$ is the set of genotypes split according to environment and $\boldsymbol{\gamma} = [\boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2]$. To test the sensitivity of the cis-interaction LD scores to other sources of non-additive variation, we also repeated the same simulations where there were only additive and G×E effects contributing equally to trait architecture:

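  • $H^2 = 0.3$: $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = 0.15$; $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = 0$; and $\mathbb{V}[\mathbf{Z}\boldsymbol{\gamma}] = 0.15$;

  • $H^2 = 0.6$: $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = 0.3$; $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = 0$; and $\mathbb{V}[\mathbf{Z}\boldsymbol{\gamma}] = 0.3$.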

Again all figures show the mean performances (and standard errors) across 100 simulated replicates.
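One way to read the G×E construction above is sketched below: the sample is split at random into two equally sized environments, effects for the first environment are drawn from a standard Gaussian, and effects for the second are the same values multiplied by the amplification coefficient. The random assignment of individuals to environments, the function name, and the toy data are assumptions made for illustration.

```python
import numpy as np

def gxe_component(X, w, rng):
    """Genetic values with environment-specific effects gamma_2 = w * gamma_1."""
    n, J = X.shape
    in_env1 = rng.permutation(n) < n // 2     # random half of the sample
    gamma1 = rng.standard_normal(J)           # effects in environment 1
    gamma2 = w * gamma1                       # amplified effects in environment 2
    g = np.where(in_env1, X @ gamma1, X @ gamma2)
    return g, in_env1, gamma1

# Toy standardized genotypes (simulated).
rng = np.random.default_rng(5)
G = rng.binomial(2, 0.3, size=(400, 30)).astype(float)
X = (G - G.mean(axis=0)) / G.std(axis=0)

g, in_env1, gamma1 = gxe_component(X, w=1.5, rng=rng)
g_scaled = g * np.sqrt(0.075 / g.var())       # e.g. V[Z gamma] = 0.075 when H2 = 0.3
```

After rescaling, the component contributes the fixed share of phenotypic variance listed in the settings above.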

Polygenic simulations with gene-by-ancestry effects

In our third set of simulations, we consider phenotypes with polygenic architectures that are made up of additive, cis-interaction, and G×Ancestry effects. Here, we follow Sohail et al. and first run a matrix decomposition on the individual-level genotype matrix $\mathbf{X} = \mathbf{U}\mathbf{Q}$ where $\mathbf{U}$ is a unitary $N \times K$ score matrix, $\mathbf{Q}$ is a $K \times J$ loadings matrix, and $K$ represents the number of (predetermined) principal components (PCs). To generate G×Ancestry interactions, we then create the matrix $\mathbf{Z}_k = \mathbf{X}\mathbf{q}_k$ where $\mathbf{q}_k$ is a $J$-dimensional vector of SNP loadings for the k-th principal component. In this paper, we generate traits with heritability $H^2 = \{0.3, 0.6\}$ and interaction effects taken over $k = 1, \ldots, 10$ principal components. For the first set of simulations, we hold the proportion of phenotypic variation explained by the different genetic components constant by fixing:

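  • $H^2 = 0.3$: $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = 0.15$; $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = 0.075$; and $\mathbb{V}[\mathbf{Z}\boldsymbol{\gamma}] = 0.075$;

  • $H^2 = 0.6$: $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = 0.3$; $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = 0.15$; and $\mathbb{V}[\mathbf{Z}\boldsymbol{\gamma}] = 0.15$.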

To test the sensitivity of the cis-interaction LD scores to other sources of non-additive variation, we also repeated the same simulations where there were only additive and G×Ancestry effects contributing equally to trait architecture:

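  • $H^2 = 0.3$: $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = 0.15$; $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = 0$; and $\mathbb{V}[\mathbf{Z}\boldsymbol{\gamma}] = 0.15$;

  • $H^2 = 0.6$: $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = 0.3$; $\mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = 0$; and $\mathbb{V}[\mathbf{Z}\boldsymbol{\gamma}] = 0.3$.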

Note that, for each case, we generate summary statistics in two ways: (i) including the top 10 PCs as covariates in the marginal linear model to correct for population structure and (ii) not correcting for any population structure. Again all figures show the mean performances (and standard errors) across 100 simulated replicates.
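The decomposition and the two ways of generating summary statistics can be sketched as follows. This illustration assumes that the decomposition is a truncated singular value decomposition of column-centered genotypes and that including PCs as covariates is equivalent to projecting them out before the marginal regression; the function names and toy data are hypothetical.

```python
import numpy as np

def pca_decomposition(X, n_pcs=10):
    """Truncated SVD so that X is approximately U @ Q (U: N x K, Q: K x J)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_pcs], s[:n_pcs, None] * Vt[:n_pcs]

def ancestry_interactions(X, Q):
    """Column k holds Z_k = X q_k, where q_k is the k-th row of Q."""
    return X @ Q.T

def residualize(v, U):
    """Remove the part of v explained by the orthonormal columns of U."""
    return v - U @ (U.T @ v)

# Toy column-centered genotypes (simulated, no real population structure).
rng = np.random.default_rng(6)
G = rng.binomial(2, 0.3, size=(300, 40)).astype(float)
X = G - G.mean(axis=0)

U, Q = pca_decomposition(X, n_pcs=10)
Z = ancestry_interactions(X, Q)
y = rng.standard_normal(300)
y_corrected = residualize(y, U)   # case (i): top PCs treated as covariates
```

In case (ii), the uncorrected `y` would be regressed on each SNP directly, leaving any structure captured by the PCs in the summary statistics.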

Sparse simulation study design with additive effects

In this set of simulations, we consider phenotypes with sparse architectures (Zhou et al., 2013). Here, traits were simulated with solely additive effects such that $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] = H^2$, but this time only variants with the top or bottom {1, 5, 10, 25, 50, 100} percentile of LD scores were given nonzero coefficients (a similar simulation approach was also previously implemented in both Bulik-Sullivan et al., 2015b and Lee et al., 2018). We once again generate traits with heritability $H^2 = \{0.3, 0.6\}$. We also want to note that, in each of these specific analyses, synthetic trait architectures were generated using all UK Biobank genotyped variants that passed initial preprocessing and quality control (see next section). Since not all of these SNPs are HapMap3 SNPs, some variants were omitted from the LDSC and i-LDSC regression. Overall, as shown in the main text with results taken over 100 replicates, breaking the assumed relationship between LD scores and chi-squared statistics (i.e. that they are generally positively correlated) led to unbounded estimates of heritability in all but the (more polygenic) scenario when 100% of SNPs contributed to phenotypic variation.
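Selecting causal variants from the tails of the LD score distribution can be sketched as below. The helper name, the rounding rule for the number of selected variants, and the toy LD scores are assumptions made for illustration.

```python
import numpy as np

def select_by_ld_percentile(ell, pct, tail="top"):
    """Indices of SNPs in the top or bottom `pct` percent of LD scores."""
    k = max(1, int(round(len(ell) * pct / 100.0)))   # number of causal SNPs
    order = np.argsort(ell)                          # ascending LD score
    return order[-k:] if tail == "top" else order[:k]

# Toy LD scores: SNP index equals its LD score rank.
ell = np.arange(100, dtype=float)
top5 = select_by_ld_percentile(ell, 5, tail="top")
bottom1 = select_by_ld_percentile(ell, 1, tail="bottom")
everything = select_by_ld_percentile(ell, 100)

# Only the selected variants receive nonzero additive effects.
rng = np.random.default_rng(7)
beta = np.zeros(len(ell))
beta[top5] = rng.standard_normal(len(top5))
```

Setting the percentile to 100 recovers the fully polygenic case in which every SNP contributes.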

Polygenic simulations with unobserved additive effects

In this next set of simulations, we consider another extension of the polygenic case where a portion of the variants with only additive genetic effects are not observed due to ascertainment or other quality control procedures. It was found in Hemani et al., 2014 that an initial set of signals pointing towards evidence of genetic interactions was actually better explained using linear models of unobserved variants in the same haplotype. Here, we test whether the i-LDSC framework is prone to overestimate the non-additive genetic variance when additive effects in the same haplotype are not included in the model. In each simulation, we generated haplotypes that each contain 5000 variants. Next, we select either a single causal variant with only an additive effect or a set of ten causal variants with only additive effects, each having an MAF that is randomly selected between: (i) (0.01, 0.1), (ii) (0.1, 0.2), (iii) (0.2, 0.3), (iv) (0.3, 0.4), and (v) (0.4, 0.5). The corresponding additive effect size for each causal variant across the haplotype is simulated to be inversely proportional to its MAF. For this analysis, we measure the difference between i-LDSC coefficient estimates when every variant is included in the model versus when the haplotype causal variants are omitted for two different trait architectures with broad-sense heritability set to $H^2 = 0.3$ and $0.6$. Differences in the component estimates between the observed and unobserved single additive variant models are shown in Figure 3—figure supplement 9A and B. Similar estimates when the larger number of ten additive variants are unobserved in each haplotype are shown in Figure 3—figure supplement 9C and D. If i-LDSC were prone to overestimating the non-additive effects, then the omission of the variants with only significant additive effects would lead to increased estimates of $\tau$ and $\vartheta$. However, across a range of generative broad-sense heritabilities and haplotype architectures, we observe that estimates of $\tau$ and $\vartheta$ are robust. Intuitively, this is likely due to the fact that these simulations were done under polygenic trait architectures where, as a result, the omission of a few causal variants with small marginal effect sizes has little impact on the ability to estimate genetic variance.

Polygenic simulations with unobserved interaction effects

In this set of simulations, we extend the polygenic case to a setting where a portion of the variants involved in genetic interactions are unobserved. Similar to the case with unobserved additive effects, the purpose of these simulations is to assess whether the i-LDSC framework is prone to false discovery of non-additive genetic variance when causal interacting SNPs are not included during the estimation of GWAS summary statistics. In each simulation, we generated haplotypes that each contain 5000 variants. Traits were simulated using the generative model in Equation 35 with both additive and interaction effects such that $\mathbb{V}[\mathbf{X}\boldsymbol{\beta}] + \mathbb{V}[\mathbf{W}\boldsymbol{\theta}] = H^2$. Here, every SNP in the genome had at least a small additive effect with a corresponding effect size that was drawn to be inversely proportional to its MAF. Only 1% or 5% of variants within each haplotype had causal non-zero interaction effects. However, when running i-LDSC, only a percentage of the interacting SNPs {1%, 5%, 10%, 25% or 50%} were included in the estimation of $\hat{\vartheta}$. We once again generate traits with heritability $H^2 = \{0.3, 0.6\}$ such that the proportion of genetic variance explained by additive effects was equal to $\rho = \{0.5, 0.8\}$. As with the other simulation scenarios, all synthetic traits were generated using UK Biobank genotyped variants that passed initial preprocessing and quality control (see next section). Since not all of these SNPs are HapMap3 SNPs, some variants were omitted from the i-LDSC regression analyses. Overall, as discussed in the main text with results taken over 100 replicates, i-LDSC underestimated values of $\hat{\vartheta}$ when there were unobserved interacting variants (see Figure 3—figure supplements 10 and 11). As expected, estimates of the additive variance component $\hat{\tau}$, on the other hand, were not affected.

Polygenic simulations with correlated additive and interaction effects

In our last set of simulations, we sought to better understand how the relationship between the additive ($\boldsymbol{\beta}$) and interaction ($\boldsymbol{\theta}$) coefficients in the generative model of complex traits could potentially bias the additive and non-additive variance component estimates in LDSC and i-LDSC. To that end, we performed a set of simulations where we varied the correlation between the two sets of effects. Specifically, we first drew a set of additive effect sizes for each variant using the MAF-dependent procedure described above (i.e. $\alpha = -1$). We next selected a subset of the causal variants to be in cis-interactions. Here, we set the interaction effect sizes to covary with the additive effect size vector in two different ways. In the first, we drew the additive and interaction effect sizes from a multivariate normal such that their correlation was equal to $r = \{-1, -0.8, -0.6, \ldots, 0.6, 0.8, 1\}$ (see Figure 3—figure supplement 12). In the second, we amplified the interaction effects to be a linear function $\boldsymbol{\theta} = \boldsymbol{\beta} \times q$ (Figure 3—figure supplement 13A and C) or a squared function $\boldsymbol{\theta} = \boldsymbol{\beta}^2 q$ (Figure 3—figure supplement 13B and D) of the additive effects where $q = \{0.1, 0.2, \ldots, 0.9, 1\}$. While testing 100 replicates for each value of $q$, we observed that the mean estimate of genetic variance had a slight upward bias as the correlation between the additive and interaction effect sizes in the generative model increased; however, the range between the first and third quartiles of these bias estimates covered zero in all results. We evaluated this behavior for multiple broad-sense heritability levels $H^2 = 0.3$ and $0.6$.
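The two ways of coupling additive and interaction effects can be sketched as follows. For simplicity the sketch draws additive effects from a standard Gaussian rather than the MAF-dependent procedure, and it generates correlated effects through the conditional construction $\theta = r\beta + \sqrt{1-r^2}\,z$, which for unit-variance effects matches a bivariate normal with correlation $r$. Function names and values are illustrative.

```python
import numpy as np

def correlated_effects(beta, r, rng):
    """Interaction effects with population correlation r to unit-variance beta."""
    z = rng.standard_normal(beta.shape)          # independent noise
    return r * beta + np.sqrt(1.0 - r ** 2) * z

def amplified_effects(beta, q, squared=False):
    """theta = beta * q (linear) or theta = beta^2 * q (squared)."""
    return (beta ** 2) * q if squared else beta * q

rng = np.random.default_rng(3)
beta = rng.standard_normal(20000)                # simplified: standard Gaussian draws

theta = correlated_effects(beta, 0.6, rng)
sample_r = np.corrcoef(beta, theta)[0, 1]        # close to 0.6 in large samples

theta_linear = amplified_effects(beta, q=0.5)
theta_squared = amplified_effects(beta, q=0.5, squared=True)
```

At the endpoints $r = \pm 1$ the noise term vanishes and the interaction effects become exact copies (or sign-flipped copies) of the additive effects.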

Preprocessing for the UK Biobank and BioBank Japan

In order to apply the i-LDSC framework to 25 continuous traits in the UK Biobank (Bycroft et al., 2018), we first downloaded genotype data for 488,377 individuals in the UK Biobank using the ukbgene tool (https://biobank.ctsu.ox.ac.uk/crystal/download.cgi) and converted the genotypes using the provided ukbconv tool (https://biobank.ctsu.ox.ac.uk/crystal/refer.cgi?id=149660). Phenotype data for the 25 continuous traits were also downloaded for those same individuals using the ukbgene tool. Individuals identified by the UK Biobank as having high heterozygosity, excessive relatedness, or aneuploidy were removed (1,550 individuals). After separating individuals into self-identified ancestral cohorts using data field 21000, unrelated individuals were selected by randomly choosing an individual from each pair of related individuals. This resulted in N = 349,469 white British individuals to be included in our analysis. We downloaded imputed SNP data from the UK Biobank for all remaining individuals and removed SNPs with an information score below 0.8. Information scores for each SNP are provided by the UK Biobank (http://biobank.ctsu.ox.ac.uk/crystal/refer.cgi?id=1967).

Quality control for the remaining genotyped and imputed variants was then performed on each cohort separately using the following steps. All structural variants were first removed, leaving only single nucleotide polymorphisms (SNPs) in the genotype data. Next, all AT/CG SNPs were removed to avoid possible confounding due to sequencing errors. Then, SNPs with minor allele frequency less than 1% were removed using the PLINK 2.0 (Chang et al., 2015) command --maf 0.01. We then removed all SNPs found to be out of Hardy-Weinberg equilibrium, using the PLINK --hwe 0.000001 flag to remove all SNPs with a Fisher’s exact test p-value $< 10^{-6}$. Finally, all SNPs with missingness greater than 1% were removed using the PLINK --mind 0.01 flag.

We then performed a genome-wide association study (GWAS) for each trait in the UK Biobank on the remaining 8,981,412 SNPs. SNP-level GWAS effect sizes were calculated using PLINK and the --glm flag (Chang et al., 2015). Age, sex, and the first 20 principal components were included as covariates for all traits analyzed (Sohail et al., 2019). Principal component analysis was performed using FlashPCA 2.0 (Abraham et al., 2017) on a set of independent markers derived separately for each ancestry cohort using the PLINK command --indep-pairwise 100 10 0.1. Using the parameters --indep-pairwise removes all SNPs that have a pairwise correlation above 0.1 within a 100 SNP window, then slides forward in increments of ten SNPs genome-wide.

In order to analyze data from BioBank Japan, we downloaded publicly available GWAS summary statistics for the 25 traits listed in Supplementary file 5 from https://pheweb.jp/downloads. Summary statistics used age, sex, and the first ten principal components as confounders in the initial GWAS study. We then used individuals from the East Asian (EAS) superpopulation from the 1000 Genomes Project Phase 3 to calculate paired LDSC and i-LDSC scores from a reference panel. We pruned the reference panel using the PLINK command --indep-pairwise 100 10 0.5 to limit the computational time of calculating scores (Chang et al., 2015). This resulted in reference scores for 1,164,666 SNPs that are included on the i-LDSC GitHub repository (https://github.com/lcrawlab/i-LDSC). Using summary statistics from BioBank Japan, with scores calculated from the EAS population in the 1000 Genomes, we obtained i-LDSC heritability estimates for each of the 25 traits.

Acknowledgements

We thank Jeffrey P Spence, Roshni Patel, Matthew Aguirre, Mineto Ota, and our anonymous referees for insightful comments on an earlier version of this manuscript as well as the Harpak, Ramachandran, and Crawford Labs for helpful discussions. This research was conducted in part using computational resources and services at the Center for Computation and Visualization at Brown University. This research was also conducted using the UK Biobank Resource under Application Numbers 14649 (LC) and 22419 (SR). SP Smith and D Udwin were trainees supported under the Brown University Predoctoral Training Program in Biological Data Science (NIH T32 GM128596). SP Smith was also supported by NIH RF1AG073593. SP Smith and A Harpak were also supported by NIH R35 GM151108 to A Harpak. G Darnell was supported by NSF Grant No. DMS-1439786 while in residence at the Institute for Computational and Experimental Research in Mathematics (ICERM) in Providence, RI. This research was supported in part by an Alfred P Sloan Research Fellowship and a David & Lucile Packard Fellowship for Science and Engineering awarded to L Crawford. This research was also partly supported by US National Institutes of Health (NIH) grant R01 GM118652, NIH grant R35 GM139628, and National Science Foundation (NSF) CAREER award DBI-1452622 to S Ramachandran. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of any of the funders.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Lorin Crawford, Email: lcrawford@microsoft.com.

George H Perry, Pennsylvania State University, United States.

Funding Information

This paper was supported by the following grants:

  • National Institutes of Health T32 GM128596 to Samuel Pattillo Smith, Dana Udwin.

  • National Institutes of Health RF1 AG073593 to Samuel Pattillo Smith.

  • National Institutes of Health R35 GM151108 to Samuel Pattillo Smith, Arbel Harpak.

  • National Science Foundation DMS-1439786 to Gregory Darnell.

  • Alfred P. Sloan Foundation to Lorin Crawford.

  • David and Lucile Packard Foundation to Lorin Crawford.

  • National Institutes of Health R01 GM118652 to Sohini Ramachandran.

  • National Institutes of Health R35 GM139628 to Sohini Ramachandran.

  • National Science Foundation DBI-1452622 to Sohini Ramachandran.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Software, Formal analysis, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Conceptualization, Software, Formal analysis, Validation, Investigation, Methodology, Writing – original draft, Writing – review and editing.

Formal analysis, Validation, Investigation, Visualization, Writing – review and editing.

Investigation, Methodology, Writing – review and editing.

Formal analysis, Supervision, Funding acquisition, Investigation, Writing – review and editing.

Conceptualization, Resources, Data curation, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Additional files

Supplementary file 1. Comparison of LDSC and i-LDSC estimates of the proportion of phenotypic variance explained (PVE) by genetic effects (i.e., estimated heritability) when the true heritability is set to H² = 0.3 for polygenic traits.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e., creating a polygenic trait architecture). Next, we randomly select a set of interacting variants and divide them into two groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs, which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency (α = 0; see Materials and Methods). Here, we assume a heritability H² = 0.3 and vary the proportion contributed by additive effects with ρ = {0.2, 0.4, 0.6, 0.8}. We run i-LDSC while computing the cis-interaction LD scores using different estimating windows of ± 5, ± 10, ± 25, and ± 50 SNPs. The “average” column represents results using model averaging over the different estimating windows (see Materials and Methods). We report the mean estimates of heritability (with standard errors in parentheses) and use mean absolute error (MAE) to quantify the difference between the two methods. Results are based on 100 simulations per parameter combination. As shown in Figure 3—figure supplements 3 and 1, LDSC does not capture the contribution of non-additive genetic effects to trait variation.

elife-90459-supp1.xlsx (10.3KB, xlsx)
Supplementary file 2. Comparison of LDSC and i-LDSC estimates of the proportion of phenotypic variance explained (PVE) by genetic effects (i.e., estimated heritability) when the true heritability is set to H² = 0.6.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e., creating a polygenic trait architecture). Next, we randomly select a set of interacting variants and divide them into two interacting groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs, which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency (α = 0; see Materials and Methods). Here, we assume a heritability H² = 0.6 and vary the proportion contributed by additive effects with ρ = {0.2, 0.4, 0.6, 0.8}. We run i-LDSC while computing the cis-interaction LD scores using different estimating windows of ± 5, ± 10, ± 25, and ± 50 SNPs. The “average” column represents results using model averaging over the different estimating windows (see Materials and Methods). We report the mean estimates of heritability (with standard errors in parentheses) and use mean absolute error (MAE) to quantify the difference between the two methods. Results are based on 100 simulations per parameter combination. As shown in Figure 3—figure supplements 3 and 1, LDSC does not capture the additional contribution of non-additive genetic effects to trait variation.

elife-90459-supp2.xlsx (10.3KB, xlsx)
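The simulation design described for Supplementary files 1 and 2 can be sketched, under simplifying assumptions, as follows. This is not the authors' simulation code. The function name simulate_trait is hypothetical, the interaction window is expressed in neighboring SNP indices rather than kilobases, effect sizes are drawn from a standard normal distribution, and each component is rescaled so that additive effects, interaction effects, and noise have sample variances ρH², (1 − ρ)H², and 1 − H², respectively.

```python
import numpy as np


def simulate_trait(X, H2=0.3, rho=0.5, frac_group1=0.1, window=5, seed=0):
    """Simulate a trait with additive and cis-interaction effects.

    X is an (n_individuals, n_snps) genotype matrix with no monomorphic
    columns (assumed). Returns the trait and its three components.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each SNP

    # Every SNP has an additive effect (polygenic architecture).
    additive = X @ rng.standard_normal(p)

    # Group #1: a random 10% of SNPs; group #2: their neighbors in a window.
    group1 = rng.choice(p, size=max(1, int(frac_group1 * p)), replace=False)
    interaction = np.zeros(n)
    for j in group1:
        for k in range(max(0, j - window), min(p, j + window + 1)):
            if k != j:
                interaction += rng.standard_normal() * X[:, j] * X[:, k]

    noise = rng.standard_normal(n)

    def rescale(v, target_var):
        # Center and scale a component to the requested sample variance.
        v = v - v.mean()
        return v * np.sqrt(target_var / v.var())

    parts = {
        "additive": rescale(additive, rho * H2),
        "interaction": rescale(interaction, (1.0 - rho) * H2),
        "noise": rescale(noise, 1.0 - H2),
    }
    y = parts["additive"] + parts["interaction"] + parts["noise"]
    return y, parts
```

Because the three components are rescaled separately and are not forced to be orthogonal in the sample, the variance of the simulated trait is only approximately one.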
Supplementary file 3. Abbreviations used throughout this study for 14 of the quantitative traits analyzed.

The remaining 11 traits analyzed were Basophil count, Cholesterol, Eosinophil count, Height, Hematocrit, Hemoglobin, Lymphocyte count, Monocyte count, Neutrophil count, and Triglyceride levels. These are not abbreviated in the main text.

elife-90459-supp3.xlsx (9.8KB, xlsx)
Supplementary file 4. Trait-specific α parameters for each of the 25 traits analyzed.

Here, α values are used to weight each variant based on its minor allele frequency to account for frequency-dependent architectures in each trait. The ∗ indicates α parameters that were taken directly from Schoech et al., 2019. The α parameters for other traits were calculated using the protocol described in that paper. Expansions of trait abbreviations are given in Supplementary file 3.

elife-90459-supp4.xlsx (10KB, xlsx)
Supplementary file 5. Number of individuals and total SNPs included in the analysis of each trait in BioBank Japan.
elife-90459-supp5.xlsx (10.3KB, xlsx)
MDAR checklist

Data availability

Source code and tutorials for implementing interaction-LD score regression via the i-LDSC package are written in Python and are publicly available online on GitHub (https://github.com/lcrawlab/i-LDSC; copy archived at Crawford and Smith, 2024). Files of LD scores, cis-interaction LD scores, and GWAS summary statistics used for our analyses of the UK Biobank and BioBank Japan can be downloaded from the Harvard Dataverse. The traditional and stratified LD score regression models (LDSC and s-LDSC) were fit using the default software settings, unless otherwise stated in the main text. Source code for these approaches was downloaded from https://github.com/bulik/ldsc (Bulik-Sullivan et al., 2020). When applying s-LDSC, we used 97 functional annotations from Gazal et al., 2017 to estimate heritability. Data from the UK Biobank Resource (Bycroft et al., 2018) were made available under Application Numbers 14649 and 22419. Data can be accessed by direct application to the UK Biobank.

The following dataset was generated:

Smith S, Darnell G, Udwin D, Stamp J, Harpak A, Ramachandran S, Crawford L. 2023. Replication Data for: Discovering non-additive heritability using additive GWAS summary statistics. Harvard Dataverse.

References

  1. Abraham G, Qiu Y, Inouye M. FlashPCA2: principal component analysis of Biobank-scale genotype datasets. Bioinformatics. 2017;33:2776–2778. doi: 10.1093/bioinformatics/btx299.
  2. Barreto H, Howland F. Introductory Econometrics: Using Monte Carlo Simulation with Microsoft Excel. Cambridge University Press; 2005.
  3. Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, Day FR, Loh P-R, Duncan L, Perry JRB, Patterson N, Robinson EB, Daly MJ, Price AL, Neale BM, ReproGen Consortium. Psychiatric Genomics Consortium. Genetic Consortium for Anorexia Nervosa of the Wellcome Trust Case Control Consortium 3. An atlas of genetic correlations across human diseases and traits. Nature Genetics. 2015a;47:1236–1241. doi: 10.1038/ng.3406.
  4. Bulik-Sullivan BK, Loh P-R, Finucane HK, Ripke S, Yang J, Schizophrenia Working Group of the Psychiatric Genomics Consortium. Patterson N, Daly MJ, Price AL, Neale BM. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nature Genetics. 2015b;47:291–295. doi: 10.1038/ng.3211.
  5. Bulik-Sullivan B, Finucane H, Walters RK, Gazal S, Poterba T. LDSC (LD score). v1.0.1. GitHub. 2020. https://github.com/bulik/ldsc
  6. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, Motyer A, Vukcevic D, Delaneau O, O’Connell J, Cortes A, Welsh S, Young A, Effingham M, McVean G, Leslie S, Allen N, Donnelly P, Marchini J. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203–209. doi: 10.1038/s41586-018-0579-z.
  7. Carbonetto P, Stephens M. Scalable variational inference for bayesian variable selection in regression, and its accuracy in genetic association studies. Bayesian Analysis. 2012;7:73–108. doi: 10.1214/12-BA703.
  8. Chan TF, Rui X, Conti DV, Fornage M, Graff M, Haessler J, Haiman C. Estimating Heritability Explained by Local Ancestry and Evaluating Stratification Bias in Admixture Mapping from Summary Statistics. bioRxiv. 2023. doi: 10.1101/2023.04.10.536252.
  9. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8.
  10. Cheng W, Ramachandran S, Crawford L. Estimation of non-null SNP effect size distributions enables the detection of enriched genes underlying complex traits. PLOS Genetics. 2020;16:e1008855. doi: 10.1371/journal.pgen.1008855.
  11. Crawford L, Zeng P, Mukherjee S, Zhou X. Detecting epistasis with the marginal epistasis test in genetic mapping studies of quantitative traits. PLOS Genetics. 2017;13:e1006869. doi: 10.1371/journal.pgen.1006869.
  12. Crawford L, Smith SP. Interaction-LD score (I-LDSC) regression. swh:1:rev:2d828d50502a341a8148f14cde5825c812a04f90. Software Heritage. 2024. https://archive.softwareheritage.org/swh:1:dir:67d977f98c37f23ab7de3a5cbb104492dfb138c6;origin=https://github.com/fred-atherden/90459-clone;visit=swh:1:snp:4b2ff84ebe13052a497fa5775ce0fa97fbe4cfb4;anchor=swh:1:rev:2d828d50502a341a8148f14cde5825c812a04f90
  13. de Los Campos G, Vazquez AI, Fernando R, Klimentidis YC, Sorensen D. Prediction of complex human traits using the genomic best linear unbiased predictor. PLOS Genetics. 2013;9:e1003608. doi: 10.1371/journal.pgen.1003608.
  14. de Los Campos G, Sorensen D, Gianola D. Genomic heritability: what is it? PLOS Genetics. 2015;11:e1005048. doi: 10.1371/journal.pgen.1005048.
  15. Demetci P, Cheng W, Darnell G, Zhou X, Ramachandran S, Crawford L. Multi-scale inference of genetic trait architecture using biologically annotated neural networks. PLOS Genetics. 2021;17:e1009754. doi: 10.1371/journal.pgen.1009754.
  16. Efron B. The Jackknife, the Bootstrap and Other Resampling Plans. SIAM; 1982.
  17. Eichler EE, Flint J, Gibson G, Kong A, Leal SM, Moore JH, Nadeau JH. Missing heritability and strategies for finding the underlying causes of complex disease. Nature Reviews. Genetics. 2010;11:446–450. doi: 10.1038/nrg2809.
  18. Falconer DS, Mackay TFC. Quantitative Genetics. Longman; 1983.
  19. Finucane HK, Bulik-Sullivan B, Gusev A, Trynka G, Reshef Y, Loh P-R, Anttila V, Xu H, Zang C, Farh K, Ripke S, Day FR, ReproGen Consortium. Schizophrenia Working Group of the Psychiatric Genomics Consortium. RACI Consortium. Purcell S, Stahl E, Lindstrom S, Perry JRB, Okada Y, Raychaudhuri S, Daly MJ, Patterson N, Neale BM, Price AL. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nature Genetics. 2015;47:1228–1235. doi: 10.1038/ng.3404.
  20. Fisher RA. XV.—The correlation between relatives on the supposition of mendelian inheritance. Transactions of the Royal Society of Edinburgh. 1919;52:399–433. doi: 10.1017/S0080456800012163.
  21. Fisher RA. The Genetical Theory of Natural Selection: A Complete Variorum Edition. Oxford University Press; 1999.
  22. Gazal S, Finucane HK, Furlotte NA, Loh P-R, Palamara PF, Liu X, Schoech A, Bulik-Sullivan B, Neale BM, Gusev A, Price AL. Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nature Genetics. 2017;49:1421–1427. doi: 10.1038/ng.3954.
  23. Guan Y, Stephens M. Bayesian variable selection regression for genome-wide association studies and other large-scale problems. The Annals of Applied Statistics. 2011;5:1780–1815. doi: 10.1214/11-AOAS455.
  24. Hemani G, Shakhbazov K, Westra H-J, Esko T, Henders AK, McRae AF, Yang J, Gibson G, Martin NG, Metspalu A, Franke L, Montgomery GW, Visscher PM, Powell JE. Detection and replication of epistasis influencing transcription in humans. Nature. 2014;508:249–253. doi: 10.1038/nature13005. [Retracted]
  25. Hill WG, Goddard ME, Visscher PM. Data and theory point to mainly additive genetic variance for complex traits. PLOS Genetics. 2008;4:e1000008. doi: 10.1371/journal.pgen.1000008.
  26. Hivert V, Sidorenko J, Rohart F, Goddard ME, Yang J, Wray NR, Yengo L, Visscher PM. Estimation of non-additive genetic variance in human complex traits from a large sample of unrelated individuals. American Journal of Human Genetics. 2021;108:786–798. doi: 10.1016/j.ajhg.2021.02.014.
  27. Hoeting JA, Madigan D, Raftery AE, Volinsky CT. Bayesian model averaging: a tutorial (with comments by M. Clyde, David Draper and E. I. George, and a rejoinder by the authors). Statistical Science. 1999;14:382–417. doi: 10.1214/ss/1009212519.
  28. Hormozdiari F, Kostem E, Kang EY, Pasaniuc B, Eskin E. Identifying causal variants at loci with multiple signals of association. Genetics. 2014;198:497–508. doi: 10.1534/genetics.114.167908.
  29. Hormozdiari F, van de Bunt M, Segrè AV, Li X, Joo JWJ, Bilow M, Sul JH, Sankararaman S, Pasaniuc B, Eskin E. Colocalization of GWAS and eQTL signals detects target genes. American Journal of Human Genetics. 2016;99:1245–1260. doi: 10.1016/j.ajhg.2016.10.003.
  30. Hou K, Burch KS, Majumdar A, Shi H, Mancuso N, Wu Y, Sankararaman S, Pasaniuc B. Accurate estimation of SNP-heritability from biobank-scale data irrespective of genetic architecture. Nature Genetics. 2019;51:1244–1251. doi: 10.1038/s41588-019-0465-0.
  31. Isserlis L. On a formula for the product-moment coefficient of any order of a normal frequency distribution in any number of variables. Biometrika. 1918;12:134–139. doi: 10.1093/biomet/12.1-2.134.
  32. Jiang Y, Reif JC. Modeling epistasis in genomic selection. Genetics. 2015;201:759–768. doi: 10.1534/genetics.115.177907.
  33. Kang SH, Jung SH. Generating correlated binary variables with complete specification of the joint distribution. Biometrical Journal. 2001;43:263–269. doi: 10.1002/1521-4036(200106)43:3<263::AID-BIMJ263>3.0.CO;2-5.
  34. Lee JJ, McGue M, Iacono WG, Chow CC. The accuracy of LD Score regression as an estimator of confounding and genetic correlations in genome-wide association studies. Genetic Epidemiology. 2018;42:783–795. doi: 10.1002/gepi.22161.
  35. Li Y, Cho H, Wang F, Canela-Xandri O, Luo C, Rawlik K, Archacki S, Xu C, Tenesa A, Chen Q, Wang QK. Statistical and functional studies identify epistasis of cardiovascular risk genomic variants from genome-wide association studies. Journal of the American Heart Association. 2020;9:e014146. doi: 10.1161/JAHA.119.014146.
  36. Lippert C, Quon G, Kang EY, Kadie CM, Listgarten J, Heckerman D. The benefits of selecting phenotype-specific variants for applications of mixed models in genomics. Scientific Reports. 2013;3:1815. doi: 10.1038/srep01815.
  37. Lynch M, Walsh B. Genetics and Analysis of Quantitative Traits. Sinauer Sunderland; 1998.
  38. Mäki-Tanila A, Hill WG. Influence of gene interaction on complex trait variation with multilocus models. Genetics. 2014;198:355–367. doi: 10.1534/genetics.114.165282.
  39. Nakka P, Raphael BJ, Ramachandran S. Gene and network analysis of common variants reveals novel associations in multiple complex diseases. Genetics. 2016;204:783–798. doi: 10.1534/genetics.116.188391.
  40. Naqvi S, Sleyp Y, Hoskens H, Indencleef K, Spence JP, Bruffaerts R, Radwan A, Eller RJ, Richmond S, Shriver MD, Shaffer JR, Weinberg SM, Walsh S, Thompson J, Pritchard JK, Sunaert S, Peeters H, Wysocka J, Claes P. Shared heritability of human face and brain shape. Nature Genetics. 2021;53:830–839. doi: 10.1038/s41588-021-00827-w.
  41. Ning Z, Pawitan Y, Shen X. High-definition likelihood inference of genetic correlations across human complex traits. Nature Genetics. 2020;52:859–864. doi: 10.1038/s41588-020-0653-y.
  42. Palmer DS, Zhou W, Abbott L, Wigdor EM, Baya N, Churchhouse C, Seed C, Poterba T, King D, Kanai M, Bloemendal A, Neale BM. Analysis of genetic dominance in the UK Biobank. Science. 2023;379:1341–1348. doi: 10.1126/science.abn8455.
  43. Patel RA, Musharoff SA, Spence JP, Pimentel H, Tcheandjieu C, Mostafavi H, Sinnott-Armstrong N, Clarke SL, Smith CJ, V.A. Million Veteran Program. Durda PP, Taylor KD, Tracy R, Liu Y, Johnson WC, Aguet F, Ardlie KG, Gabriel S, Smith J, Nickerson DA, Rich SS, Rotter JI, Tsao PS, Assimes TL, Pritchard JK. Genetic interactions drive heterogeneity in causal variant effect sizes for gene expression and complex traits. American Journal of Human Genetics. 2022;109:1286–1297. doi: 10.1016/j.ajhg.2022.05.014.
  44. Pazokitoroudi A, Wu Y, Burch KS, Hou K, Zhou A, Pasaniuc B, Sankararaman S. Efficient variance components analysis across millions of genomes. Nature Communications. 2020;11:4020. doi: 10.1038/s41467-020-17576-9.
  45. Pazokitoroudi A, Chiu AM, Burch KS, Pasaniuc B, Sankararaman S. Quantifying the contribution of dominance deviation effects to complex trait variation in biobank-scale data. American Journal of Human Genetics. 2021;108:799–808. doi: 10.1016/j.ajhg.2021.03.018.
  46. Polderman TJC, Benyamin B, de Leeuw CA, Sullivan PF, van Bochoven A, Visscher PM, Posthuma D. Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nature Genetics. 2015;47:702–709. doi: 10.1038/ng.3285.
  47. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC. PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics. 2007;81:559–575. doi: 10.1086/519795.
  48. Runcie D, Cheng H, Crawford L. Mega-Scale Linear Mixed Models for Genomic Predictions with Thousands of Traits. bioRxiv. 2020. doi: 10.1101/2020.05.26.116814.
  49. Schoech A. Grm-Maf-LD. GitHub. 2018. https://github.com/arminschoech/GRM-MAF-LD
  50. Schoech AP, Jordan DM, Loh P-R, Gazal S, O’Connor LJ, Balick DJ, Palamara PF, Finucane HK, Sunyaev SR, Price AL. Quantification of frequency-dependent genetic architectures in 25 UK Biobank traits reveals action of negative selection. Nature Communications. 2019;10:790. doi: 10.1038/s41467-019-08424-6.
  51. Shi H, Kichaev G, Pasaniuc B. Contrasting the genetic architecture of 30 complex traits from summary association data. American Journal of Human Genetics. 2016;99:139–153. doi: 10.1016/j.ajhg.2016.05.013.
  52. Sohail M, Maier RM, Ganna A, Bloemendal A, Martin AR, Turchin MC, Chiang CW, Hirschhorn J, Daly MJ, Patterson N, Neale B, Mathieson I, Reich D, Sunyaev SR. Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies. eLife. 2019;8:e39702. doi: 10.7554/eLife.39702.
  53. Song S, Jiang W, Zhang Y, Hou L, Zhao H. Leveraging LD eigenvalue regression to improve the estimation of SNP heritability and confounding inflation. American Journal of Human Genetics. 2022;109:802–811. doi: 10.1016/j.ajhg.2022.03.013.
  54. Speed D, Balding DJ. SumHer better estimates the SNP heritability of complex traits from summary statistics. Nature Genetics. 2019;51:277–284. doi: 10.1038/s41588-018-0279-5.
  55. Stamp J, DenAdel A, Weinreich D, Crawford L. Leveraging the Genetic Correlation between Traits Improves the Detection of Epistasis in Genome-Wide Association Studies. bioRxiv. 2022. doi: 10.1101/2022.11.30.518547.
  56. Strandén I, Christensen OF. Allele coding in genomic evaluation. Genetics, Selection, Evolution. 2011;43:25. doi: 10.1186/1297-9686-43-25.
  57. Vitezica ZG, Legarra A, Toro MA, Varona L. Orthogonal estimates of variances for additive, dominance, and epistatic effects in populations. Genetics. 2017;206:1297–1307. doi: 10.1534/genetics.116.199406.
  58. Weinreich DM, Lan Y, Jaffe J, Heckendorn RB. The influence of higher-order epistasis on biological fitness landscape topography. Journal of Statistical Physics. 2018;172:208–225. doi: 10.1007/s10955-018-1975-3.
  59. Wray NR, Goddard ME, Visscher PM. Prediction of individual genetic risk to disease from genome-wide association studies. Genome Research. 2007;17:1520–1528. doi: 10.1101/gr.6665407.
  60. Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X. Rare-variant association testing for sequencing data with the sequence kernel association test. The American Journal of Human Genetics. 2011;89:82–93. doi: 10.1016/j.ajhg.2011.05.029.
  61. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, Goddard ME, Visscher PM. Common SNPs explain a large proportion of the heritability for human height. Nature Genetics. 2010;42:565–569. doi: 10.1038/ng.608.
  62. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. American Journal of Human Genetics. 2011;88:76–82. doi: 10.1016/j.ajhg.2010.11.011.
  63. Yengo L, Sidorenko J, Kemper KE, Zheng Z, Wood AR, Weedon MN, Frayling TM, Hirschhorn J, Yang J, Visscher PM, GIANT Consortium. Meta-analysis of genome-wide association studies for height and body mass index in ∼700000 individuals of European ancestry. Human Molecular Genetics. 2018;27:3641–3649. doi: 10.1093/hmg/ddy271.
  64. Zabad S, Ragsdale AP, Sun R, Li Y, Gravel S. Assumptions about frequency‐dependent architectures of complex traits bias measures of functional enrichment. Genetic Epidemiology. 2021;45:621–632. doi: 10.1002/gepi.22388.
  65. Zaitlen N, Kraft P, Patterson N, Pasaniuc B, Bhatia G, Pollack S, Price AL. Using extended genealogy to estimate components of heritability for 23 quantitative and dichotomous traits. PLOS Genetics. 2013;9:e1003520. doi: 10.1371/journal.pgen.1003520.
  66. Zhang Y, Qi G, Park J-H, Chatterjee N. Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits. Nature Genetics. 2018;50:1318–1326. doi: 10.1038/s41588-018-0193-x.
  67. Zhang Y, Lu Q, Ye Y, Huang K, Liu W, Wu Y, Zhong X, Li B, Yu Z, Travers BG, Werling DM, Li JJ, Zhao H. SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits. Genome Biology. 2021;22:262. doi: 10.1186/s13059-021-02478-w.
  68. Zhou X, Carbonetto P, Stephens M. Polygenic modeling with bayesian sparse linear mixed models. PLOS Genetics. 2013;9:e1003264. doi: 10.1371/journal.pgen.1003264.
  69. Zhu Z, Bakshi A, Vinkhuyzen AAE, Hemani G, Lee SH, Nolte IM, van Vliet-Ostaptchouk JV, Snieder H, LifeLines Cohort Study. Esko T, Milani L, Mägi R, Metspalu A, Hill WG, Weir BS, Goddard ME, Visscher PM, Yang J. Dominance genetic variation contributes little to the missing heritability for human complex traits. American Journal of Human Genetics. 2015;96:377–385. doi: 10.1016/j.ajhg.2015.01.001.
  70. Zhu X, Stephens M. Bayesian large-scale multiple regression with summary statistics from genome-wide association studies. The Annals of Applied Statistics. 2017;11:1561–1592. doi: 10.1214/17-aoas1046.
  71. Zhu X, Stephens M. Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes. Nature Communications. 2018;9:4361. doi: 10.1038/s41467-018-06805-x.
  72. Zhu C, Ming MJ, Cole JM, Edge MD, Kirkpatrick M, Harpak A. Amplification is the primary mode of gene-by-sex interaction in complex human traits. Cell Genomics. 2023;3:100297. doi: 10.1016/j.xgen.2023.100297.

Editor's evaluation

George H Perry

This study provides a valuable investigation into whether phenotypic variance due to interactions between genetic variants can be measured using genome-wide association summary statistics. The authors present a convincing method, i-LDSC, that uses statistics on the correlations between genotypes at different loci (linkage disequilibrium) to estimate the phenotypic variance explained by both additive genetic effects and pairwise interactions.

Decision letter

Editor: George H Perry

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting the paper "Partitioning Tagged Non-Additive Genetic Effects in Summary Statistics Provides Evidence of Pervasive Epistasis in Complex Traits" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The reviewers have opted to remain anonymous.

Comments to the Authors:

We are sorry to say that, after consultation with the reviewers, we have decided that this work will not be considered further for publication by eLife. As the manuscript is currently written, it is not clear how the definitions of additive effects and pairwise interaction effects relate to definitions used in quantitative genetics. Further, both reviewers raised concerns about the empirical results comparing LDSC to MELD. We would be willing to consider a resubmission if the authors are able to address all of the reviewers' concerns.

Reviewer #1 (Recommendations for the authors):

In this study, Darnell and colleagues propose a new method to quantify the contribution of cis-epistasis (i.e., interactions between nearby genetic variants) to the heritability of complex traits. Their method, MELD, is an extension of the popular LD score regression methodology. The authors perform simulations conditional on real genotypes to assess the consistency of their estimators of SNP-based heritability and apply their methods to 25 complex traits measured in participants of the UK Biobank and Biobank Japan. The authors conclude that unaccounted epistasis biases estimates of (narrow sense) SNP-based heritability.

I have concerns regarding the method and the conclusions of this study.

1. Additive and non-additive effects are not orthogonal in the proposed model. The authors used the generative models in Equations (1) and (2) to define the narrow sense heritability. However, this definition is incorrect because the additive and non-additive components of their model are not orthogonal. A key point to understand is that there is an additive genetic variance even when all genetic effects arise from statistical interactions between loci. This is the core issue discussed in many papers cited by the authors (e.g., Hill, Goddard and Visscher) but this does not seem to be fully grasped in this study. Unfortunately, I think there is an important confusion between what quantitative genetics classically terms as additive genetic variance and what the authors define here. The same distinction applies for dominance variance. Therefore, I don't think that what the authors term as a "bias" is justified here. That being said, it could still be that what MELD estimates (i.e., not the narrow sense heritability) can teach us something interesting about the genetic architecture of complex traits. I'm just not sure what exactly, but my next point below may help.
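The point that additive genetic variance exists even when every genetic effect arises from an interaction can be illustrated with a minimal sketch. The setup below is an assumption made for illustration only (two unlinked loci, hypothetical allele frequencies, a genetic value equal to the product of the two allele counts); it is not the authors' generative model. The additive part is taken to be the least-squares projection of the genetic value onto the allele counts.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
p1, p2 = 0.3, 0.2  # hypothetical allele frequencies

# Two unlinked biallelic loci coded as 0/1/2 allele counts.
x1 = rng.binomial(2, p1, size=n).astype(float)
x2 = rng.binomial(2, p2, size=n).astype(float)

# Genetic value built purely from a product term: no main effects are specified.
g = x1 * x2

# The least-squares projection of g onto the allele counts defines the additive part.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, g, rcond=None)
additive = X @ coef

# Share of the total genetic variance captured by the additive projection.
frac_additive = additive.var() / g.var()
print(f"fraction of genetic variance that is additive: {frac_additive:.2f}")
```

Under these assumptions a substantial share of the variance is additive even though the generating model contains only a product term, because the average effect of each allele depends on the mean genotype at the other locus.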

2. Observations are consistent with alternative explanations. The linear relationship between (expected) chi-square statistics and LD scores postulated by the LD score regression methodology holds under a number of assumptions: (i) all variants are causal and (ii) all variants explain the same amount of variance. However, if (i) is violated then the expected chi-square statistic at a given SNP k is an affine function of the squared correlation between SNP k and all causal variants (in the vicinity). As a consequence, if causal variants are enriched in low or high LD regions of the genome, then extra (non-linear) terms could be needed to represent the relationship between chi-square and LD. I believe that the real data application of MELD is affected by the non-random distribution of causal variants with respect to LD. The authors could test this prediction by sampling causal variants as a function of their LD scores, simulating a trait, and applying MELD to the resulting GWAS summary statistics. Another related explanation could be if causal variants are poorly imputed. This should also create a signal detectable by MELD.
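The relationship described in this comment can be written out as a short worked equation. This is a sketch under the usual simplifying assumptions (standardized genotypes, independent zero-mean effects with variance $\sigma_j^2$, sample size $N$, no confounding, and sampling noise in the correlations ignored); the symbols $\mathcal{C}$ for the set of causal variants and $\ell_k$ for the LD score are notation introduced here.

```latex
% General case: only the causal variants in the set C contribute.
\mathbb{E}\left[\chi^2_k\right] \approx 1 + N \sum_{j \in \mathcal{C}} r_{jk}^2 \, \sigma_j^2

% Special case: all M variants are causal with equal variance \sigma_j^2 = h^2 / M,
% which recovers the linear relationship with the LD score \ell_k.
\mathbb{E}\left[\chi^2_k\right] \approx 1 + \frac{N h^2}{M} \, \ell_k ,
\qquad \ell_k = \sum_{j=1}^{M} r_{jk}^2
```

When the causal set is enriched in low or high LD regions, the sum in the first line is no longer proportional to $\ell_k$, which is the departure from linearity that the comment refers to.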

3. Estimates of narrow sense heritability from MELD are biased. Figure S8 shows that MELD estimates of narrow sense heritability are biased. The authors propose a model averaging approach, which is claimed to work. However, I remain sceptical that this is the solution to the problem. I believe that the model is trying to fit a very peculiar genetic architecture where there are many causal variants within narrow windows (e.g., 100 kb) all interacting with each other. While there is evidence of widespread allelic heterogeneity (e.g., many causal variants at a given locus), evidence of widespread local epistasis is clearly lacking.

Other comments.

a) Table S6: Estimates from LDSC are inconsistent with previously published data. For example, the LDSC estimates for height and BMI are 0.815 and 0.506, respectively. These estimates are much larger than REML estimates obtained with WGS data (~0.7 and ~0.3). This cannot be true. The MELD estimates for the two traits are 0.57 and 0.282 respectively. While the latter estimates make more sense, I'd be curious to see what is the part attributed by MELD to non-additivity. Sorry if that was reported somewhere but I could not see those.

b) Line 88 : Previous methods have been developed to estimate dominance variance using an LD score regression framework.

c) Line 111 (and elsewhere): polygenicity is not a confounding factor like genetic relatedness and population stratification.

d) Lines 137-138: I think this is incorrect. GWAS do not make assumptions (i) and (ii).

e) Lines 145 – 148: V * theta is not a bias because marginal effects and joint effects are different. Equation (3) uses the same notation for both. However, even when theta=0, β_hat would never equal β (unless R = Identity matrix, i.e. all SNPs are independent).

f) Lines 179 – 184: "Instead, …". This is evidence that the additive and non-additive components are not orthogonal. I'm expecting non-additive effects to explain variance on top of additive genetic variance.

g) Line 209: "phenotype they were" – I think that "they" should be deleted.

Reviewer #2 (Recommendations for the authors):

The authors present an approach for variance component estimation using summary statistics. In particular, they extend the univariate LDSC regression framework to include a parameter capturing local pairwise epistasis, in addition to the usual intercept and additive components, a method they call "MELD". The authors achieve this by computing a second set of LD scores that index the extent to which a given SNP tags SNP-SNP products within a local window (as opposed to tagging individual SNPs in vanilla LDSC regression). They perform a variety of simulations showing that their method is well calibrated under the standard additive LDSC model as well as their proposed generative model including local pairwise epistatic effects. Finally, they apply their method to a variety of phenotypes in the UK and Japan biobanks, identifying substantial putative non-additive genetic variance.
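The idea of a second set of scores that measure how strongly a SNP tags SNP-SNP products can be sketched as follows. This is a deliberately simplified illustration, not the computation performed by the MELD or i-LDSC software: the function names, the window definition (a fixed number of neighbouring SNPs on each side) and the use of plain squared sample correlations are all assumptions made here.

```python
import numpy as np

def standardize(a):
    """Center each column and scale it to unit variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

def ld_scores(G, window):
    """Sum of squared correlations between SNP j and SNPs within `window` positions."""
    Z = standardize(G)
    n, m = Z.shape
    R2 = (Z.T @ Z / n) ** 2
    scores = np.empty(m)
    for j in range(m):
        lo, hi = max(0, j - window), min(m, j + window + 1)
        scores[j] = R2[j, lo:hi].sum()
    return scores

def interaction_scores(G, window):
    """Sum of squared correlations between SNP j and products of nearby SNP pairs."""
    Z = standardize(G)
    n, m = Z.shape
    scores = np.zeros(m)
    for j in range(m):
        lo, hi = max(0, j - window), min(m, j + window + 1)
        for k in range(lo, hi):
            for l in range(k + 1, hi):
                w = Z[:, k] * Z[:, l]          # pairwise product term
                sd = w.std()
                if sd > 0:
                    r = np.mean(Z[:, j] * (w - w.mean())) / sd
                    scores[j] += r ** 2
    return scores

# Toy genotypes: 2000 individuals, 12 independent SNPs (hypothetical values).
rng = np.random.default_rng(1)
G = rng.binomial(2, 0.3, size=(2000, 12)).astype(float)
ell = ld_scores(G, window=3)
ell_int = interaction_scores(G, window=3)
```

The triple loop is kept for readability; a practical implementation would need to avoid it at biobank scale.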

1. In their analyses comparing LDSC and MELD narrow-sense heritability estimates in the UK biobank (UKB), the LDSC heritability estimates are far higher than I've seen elsewhere. For instance, the authors report h2 estimates of 0.815 and 0.506 for height and BMI, respectively. In the Neale lab UKB h2 browser, which used a fairly liberal set of covariates (nealelab.github.io/UKBB_ldsc/h2_browser.html), these same phenotypes have h2 estimates of 0.485 and 0.249. Evans and colleagues report (doi.org/10.1038/s41588-018-0108-x) LDSC h2 estimates of 0.259 and 0.231, also in the UKB. I can only assume an error in the present work has resulted in such large LDSC h2 values. As such, it is hard to meaningfully compare the MELD and LDSC h2 estimates.

2. The authors' simulations only cover a subset of common generative models that I'd like to see before interpreting their findings. Specifically, they don't present simulations under

  • additive + GxE effects

  • the above + local pairwise epistatic effects

  • additive + long range pairwise epistasis (e.g., at loci on different chromosomes)

  • the above + local pairwise epistatic effects

  • additive + ancestry-by-G effects (e.g., the product of individual genotypes and a genomic PC)

  • the above + local pairwise epistatic effects

which I would like to see both for their proposed method and for LDSC. They do not need to present a method that will perform well under all of these scenarios, only demonstrate how their method performs under other plausible generative models. To the extent that MELD outperforms LDSC under the MELD generative model, it may perform relatively poorly under alternative architectures.

3. It is believed that much of the signal we see in LDSC h2 estimates reflects the effects of unmeasured variants tagged by, e.g., a million or so HapMap3 SNPs. How does MELD perform when one or both of the SNPs with epistatic effects aren't directly measured in the provided data?

4. There is an analogy between LDSC regression and Haseman-Elston (HE) regression (doi.org/10.1214/17-AOAS1052). What would the HE regression equivalent of MELD be? Answering this question will better situate MELD in existing methodological literature.

5. It would be helpful to also present the MELD broad-sense h2 estimates whenever MELD and LDSC narrow-sense h2 estimates are compared.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for submitting your article "Accounting for statistical non-additive interactions enables the recovery of missing heritability from GWAS summary statistics" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and George Perry as the Senior Editor.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

While the reviewers and editors appreciate that the manuscript has improved substantially since the initial submission – and that the method may be picking up something interesting – a key point that was made in the initial review was not addressed adequately: the authors have not clarified the relationship between the parameters their method estimates and the traditional decomposition of the genetic variance into additive and non-additive components. The traditional definition of the additive variance is an orthogonal decomposition, where the non-additive component is defined as the component that is orthogonal to the best linear prediction of the genetic values. This is not the case in the authors' method since they require that the genotype matrix and genetic interaction matrix are correlated in order for their method to pick up any 'epistasis' component. However, the authors claim to recover additional variance beyond the additive variance even from additive GWAS summary statistics. Unless the authors can clarify in detail how their parameters and empirical results relate to traditionally defined additive and non-additive variance components, we will not be able to publish the manuscript.

A further point that was raised during the consultation between the editors and reviewers was whether the authors' method is vulnerable to the artefact that led to a retraction of a Nature paper on cis-epistasis affecting gene expression: https://www.nature.com/articles/s41586-021-03766-y. This article was retracted because they found that what appeared to be epistasis was better explained by interaction genotypes better tagging haplotypes with causal variants.

We would require that the authors address the above points in addition to the point raised by reviewer 2 about whether the s-LDSC results should change the interpretation given to the authors' empirical results about the magnitude of epistasis.

Reviewer #2 (Recommendations for the authors):

This paper has a lot of strong points, and I commend the authors for the effort and ingenuity expended in tackling the difficult problem of estimating epistatic (non-additive) genetic variance from GWAS summary statistics. The mere possibility of the estimated univariate regression coefficient containing a contribution from epistasis, as represented in the manuscript's Equation 3 and elsewhere, is intriguing in and of itself.

Is i-LDSC Estimating Epistasis?

Perhaps the issue that has given me the most pause is uncertainty over whether the paper's method is really estimating the non-additive genetic variance, as this has been traditionally defined in quantitative genetics with great consequences for the correlations between relatives and evolutionary theory (Fisher, 1930, 1941; Lynch and Walsh, 1998; Burger, 2000; Ewens, 2004).

Let us call the expected phenotypic value of a given multiple-SNP genotype the total genetic value. If we apply least-squares regression to obtain the coefficients of the SNPs in a simple linear model predicting the total genetic values, then the partial regression coefficients are the average effects of gene substitution and the variance in the predicted values resulting from the model is called the additive genetic variance. (This is all theoretical and definitional, not empirical. We do not actually perform this regression.) The variance in the residuals (the differences between the total genetic values and the additive predicted values) is the non-additive genetic variance. Notice that this is an orthogonal decomposition of the variance in total genetic values. Thus, in order for the variance in Wθ to qualify as the non-additive genetic variance, it must be orthogonal to Xβ.

At first, I very much doubted whether this is generally true. And I was not reassured by the authors' reply to Reviewer 1 on this point, which did not seem to show any grasp of the issue at all. But to my surprise I discovered in elementary simulations of Equation 1 above that for mean-centered X1 and X2, (X1β1+X2β2) is uncorrelated with X1X2θ for seemingly arbitrary correlation between X1 and X2. A partition of the outcome's variance between these two components is thus an orthogonal decomposition after all. Furthermore, the result seems general for any number of independent variables and their pairwise products. I am also encouraged by the report that standard and interaction LD Scores are ‘lowly correlated’ (line 179), meaning that the standard LDSC slope is scarcely affected by the inclusion of interaction LD Scores in the regression; this behavior is what we should expect from an orthogonal decomposition.
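An elementary simulation of the kind described here might look like the sketch below. The bivariate normal distribution, the correlation of 0.8 and the effect values are assumptions chosen for illustration; the reviewer's own simulation settings are not given. The Gaussian choice matters: it makes third moments such as E[X1²X2] vanish, which need not hold for skewed variables.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
rho = 0.8                               # hypothetical correlation between X1 and X2
beta1, beta2, theta = 0.5, -0.3, 0.4    # hypothetical fixed effects

# Mean-centered, correlated predictors drawn from a bivariate normal (an assumption).
cov = np.array([[1.0, rho], [rho, 1.0]])
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
x1, x2 = x[:, 0], x[:, 1]

additive = x1 * beta1 + x2 * beta2      # linear component
interaction = x1 * x2 * theta           # pairwise product component

# Sample correlation between the two components.
corr = np.corrcoef(additive, interaction)[0, 1]
print(f"corr(additive, interaction) = {corr:.4f}")
```

With these settings the sample correlation is close to zero even though X1 and X2 are strongly correlated with each other.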

I have therefore come to the view that the additional variance component estimated by i-LDSC has a close correspondence with the epistatic (non-additive) genetic variance after all.

In order to make this point transparent to all readers, however, I think that the authors should put much more effort into placing their work into the traditional framework of the field. It was certainly not intuitive to multiple reviewers that Xβ is orthogonal to Wθ. There are even contrary suggestions. For if $(X\beta)^\top W\theta = \beta^\top X^\top W\theta$ is to equal zero, we know that we can't get there by $X^\top W$ equaling zero because then the method has nothing to go on (e.g., line 139). We thus have a quadratic form, each term being the weighted product of an average (additive) effect and an interaction coefficient, needing to cancel out to equal zero. I wonder if the authors can put forth a rigorous argument or compelling intuition for why this should be the case.

In the case of two polymorphic sites, quantitative genetics has traditionally partitioned the total genetic variance into the following orthogonal components:

  • additive genetic variance, $\sigma_A^2$, the numerator of the narrow-sense heritability;

  • dominance genetic variance, $\sigma_D^2$;

  • additive-by-additive genetic variance, $\sigma_{AA}^2$;

  • additive-by-dominance genetic variance, $\sigma_{AD}^2$; and

  • dominance-by-dominance genetic variance, $\sigma_{DD}^2$.

See Lynch and Walsh (1998, pp. 88-92) for a thorough numerical example. This decomposition is not arbitrary or trivial, since each component has a distinct coefficient in the correlations between relatives. Is it possible for the authors to relate the variance associated with their Wθ to this traditional decomposition? Besides justifying the work in this paper, the establishment of a relationship can have the possible practical benefit of allowing i-LDSC estimates of non-additive genetic variance to be checked against empirical correlations between relatives. For example, if we know from other methods that $\sigma_D^2$ is negligible but that i-LDSC returns a sizable $\sigma_{AA}^2$, we might predict that the parent-offspring correlation should be equal to the sibling correlation; a sizable $\sigma_D^2$ would make the sibling correlation higher. Admittedly, however, such an exercise can get rather complicated for the variance contributed by pairs of SNPs that are close together (Lynch and Walsh, 1998, pp. 146-152).
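The prediction about relatives can be made concrete with the standard textbook coefficients for unlinked loci under random mating and no shared environment. The sketch below uses those coefficients; the function names and the numerical variance components are hypothetical, and phenotypic variance is taken to be one so that covariances can be read as correlations.

```python
def parent_offspring_cov(var_a, var_aa=0.0):
    """Parent-offspring covariance: 1/2 additive + 1/4 additive-by-additive."""
    return 0.5 * var_a + 0.25 * var_aa

def full_sib_cov(var_a, var_d=0.0, var_aa=0.0, var_ad=0.0, var_dd=0.0):
    """Full-sib covariance with the usual coefficients for each component."""
    return (0.5 * var_a + 0.25 * var_d + 0.25 * var_aa
            + 0.125 * var_ad + 0.0625 * var_dd)

# Hypothetical components, expressed as fractions of a phenotypic variance of 1.
var_a, var_aa = 0.4, 0.1

po = parent_offspring_cov(var_a, var_aa)
fs_no_dominance = full_sib_cov(var_a, var_d=0.0, var_aa=var_aa)
fs_with_dominance = full_sib_cov(var_a, var_d=0.08, var_aa=var_aa)
```

With negligible dominance variance the two covariances coincide, and adding dominance variance raises only the sibling value, which is the pattern described in the paragraph above. These coefficients assume unlinked loci, so they do not cover the closely linked SNP pairs mentioned at the end of that paragraph.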

I would also like the authors to clarify whether LDSC consistently overestimates the narrow-sense heritability in the case that pairwise epistasis is present. The figures seem to show this. I have conflicting intuitions here. On the one hand, if GWAS summary statistics can be inflated by the tagging of epistasis, then it seems that LDSC should overestimate heritability (or at least this should be an upwardly biasing factor; other factors may lead the net bias to be different). On the other hand, if standard and interaction LD Scores are lowly correlated, then I feel that the inclusion of interaction LD Score in the regression should not strongly affect the coefficient of the standard LD Score. Relatedly, I find it rather curious that i-LDSC seems increasingly biased as the proportion of genetic variance that is non-additive goes up, but perhaps this is not too important, since such a high ratio of narrow-sense to broad-sense heritability is not realistic.

How Much Epistasis Is i-LDSC Detecting?

I think the proper conclusion to be drawn from the authors' analyses is that statistically significant epistatic (non-additive) genetic variance was not detected. Specifically, I think that the analysis presented in Supplementary Table S6 should be treated as a main analysis rather than a supplementary one, and the results here show no statistically significant epistasis. Let me explain.

Most serious researchers, I think, treat LDSC as an unreliable estimator of narrow-sense heritability; it typically returns estimates that are too low. Not even the original LDSC paper pressed strongly to use the method for estimating h2 (Bulik-Sullivan et al., 2015). As a practical matter, when researchers are focused on estimating absolute heritability with high accuracy, they usually turn to GCTA/GREML (Evans et al., 2018; Wainschtein et al., 2022).

One reason for low estimates with LDSC is that if SNPs with higher LD Scores are less likely to be causal or to have large effect sizes, then the slope of univariate LDSC will not rise as much as it ‘should’ with increasing LD Score. This was a scenario actually simulated by the authors and displayed in their Supplementary Figure S15. [Incidentally, the authors might have acknowledged earlier work in this vein. A simulation inducing a negative correlation between LD Scores and $\chi^2$ statistics was presented by Bulik-Sullivan et al. (2015, Supplementary Figure 7), and the potentially biasing effect of a correlation over SNPs between LD Scores and contributed genetic variance was a major theme of Lee et al. (2018).] A negative correlation between LD Score and contributed variance does seem to hold for a number of reasons, including the fact that regions of the genome with higher recombination rates tend to be more functional. In short, the authors did very well to carry out this simulation and to show in their Supplementary Figure S15 that this flaw of LDSC in estimating narrow-sense heritability is also a flaw of i-LDSC in estimating broad-sense heritability. But they should have carried the investigation at least one step further, as I will explain below.

Another reason for LDSC being a downwardly biased estimator of heritability is that it is often applied to meta-analyses of different cohorts, where heterogeneity (and possibly major but undetected errors by individual cohorts) lead to attenuation of the overall heritability (de Vlaming et al., 2017).

The optimal case for using LDSC to estimate heritability, then, is incorporating the LD-related annotation introduced by Gazal et al. (2017) into a stratified-LDSC (s-LDSC) analysis of a single large cohort. This is analogous to the calculation of multiple GRMs defined by MAF and LD in the GCTA/GREML papers cited above. When this was done by Gazal et al. (2017, Supplementary Table 8b), the joint impact of the improvements was to increase the estimated narrow-sense heritability of height from 0.216 to 0.534.

All of this has at least a few ramifications for i-LDSC. First, the authors do not consider whether a relationship between their interaction LD Scores and interaction effect sizes might bias their estimates. (This would be on top of any biasing relationship between standard LD Scores and linear effect sizes, as displayed in Supplementary Figure S15.) I find some kind of statistical relationship over the whole genome, induced perhaps by evolutionary forces, between cis-acting epistasis and interaction LD Scores to be plausible, albeit without intuition regarding the sign of any resulting bias. The authors should investigate this issue or at least mention it as a matter for future study. Second, it might be that the authors are comparing the estimates of broad-sense heritability in Table 1 to the wrong estimates of narrow-sense heritability. Although the estimates did come from single large cohorts, they seem to have been obtained with simple univariate LDSC rather than s-LDSC. When the estimate of h2 obtained with LDSC is too low, some will suspect that the additional variance detected by i-LDSC is simply additive genetic variance missed by the downward bias of LDSC. Consider that the authors' own Supplementary Table S6 gives s-LDSC heritability estimates that are consistently higher than the LDSC estimates in Table 1. E.g., the estimated h2 of height goes from 0.37 to 0.43. The latter figure cuts quite a bit into the estimated broad-sense heritability of 0.48 obtained with i-LDSC.

Here we come to a critical point. Lines 282-286 are not entirely clear, but I interpret them to mean that the manuscript's Equation 5 was expanded by stratifying into the components of s-LDSC and this was how the estimates in Supplementary Table S6 were obtained. If that interpretation is correct, then the scenario of i-LDSC picking up missed additive genetic variance seems rather plausible. At the very least, the increases in broad-sense heritability reported in Supplementary Table S6 are smaller in magnitude and not statistically significant. Perhaps what this means is that the headline should be a negligible contribution of pairwise epistasis revealed by this novel and ingenious method, analogous to what has been discovered with respect to dominance (Hivert et al., 2021; Pazokitoroudi et al., 2021; Okbay et al., 2022; Palmer et al., 2023).

References

Bulik-Sullivan, B., Loh, P.-R., Finucane, H. K., Ripke, S., Yang, J., Schizophrenia Working Group of the Psychiatric Genomics Consortium, Patterson, N., Daly, M. J., Price, A. L., and Neale, B. M. (2015). LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nature Genetics, 47, 291-295.

Burger, R. (2000). The mathematical theory of selection, recombination, and mutation. Wiley.

de Vlaming, R., Okbay, A., Rietveld, C. A., Johannesson, M., Magnusson, P. K. E., Uitterlinden, A. G., van Rooij, F. J. A., Hofman, A., Groenen, P. J. F., Thurik, A. R., and Koellinger, P. D. (2017). Meta-GWAS Accuracy and Power (MetaGAP) calculator shows that hiding heritability is partially due to imperfect genetic correlations across studies. PLoS Genetics, 13, e1006495.

Evans, L. M., Tahmasbi, R., Vrieze, S. I., Abecasis, G. R., Das, S., Gazal, S., Bjelland, D. W., de Candia, T. R., Haplotype Reference Consortium, Goddard, M. E., Neale, B. M., Yang, J., Visscher, P. M., and Keller, M. C. (2018). Comparison of methods that use whole genome data to estimate the heritability and genetic architecture of complex traits. Nature Genetics, 50, 737-745.

Ewens, W. J. (2004). Mathematical population genetics I. Theoretical introduction (2nd ed.). Springer.

Fisher, R. A. (1930). The genetical theory of natural selection. Oxford University Press.

Fisher, R. A. (1941). Average excess and average effect of a gene substitution. Annals of Eugenics, 11, 53-63.

Gazal, S., Finucane, H. K., Furlotte, N. A., Loh, P.-R., Palamara, P. F., Liu, X., Schoech, A., Bulik-Sullivan, B., Neale, B. M., Gusev, A., and Price, A. L. (2017). Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nature Genetics, 49, 1421-1427.

Hivert, V., Sidorenko, J., Rohart, F., Goddard, M. E., Yang, J., Wray, N. R., Yengo, L., and Visscher, P. M. (2021). Estimation of non-additive genetic variance in human complex traits from a large sample of unrelated individuals. American Journal of Human Genetics, 108, 786-798.

Lee, J. J., McGue, M., Iacono, W. G., and Chow, C. C. (2018). The accuracy of LD Score regression as an estimator of confounding and genetic correlations in genome-wide association studies. Genetic Epidemiology, 42, 783-795.

Lynch, M., and Walsh, B. (1998). Genetics and the analysis of quantitative traits. Sinauer.

Okbay, A., Wu, Y., Wang, N., Jayashankar, H., Bennett, M., Nehzati, S. M., Sidorenko, J., Kweon, H., Goldman, G., Gjorgjieva, T., Jiang, Y., Hicks, B., Tian, C., Hinds, D. A., Ahlskog, R., Magnusson, P. K. E., Oskarsson, S., Hayward, C., Campbell, A., … Young, A. I. (2022). Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals. Nature Genetics, 54, 437-449.

Palmer, D. S., Zhou, W., Abbott, L., Wigdor, E. M., Baya, N., Churchhouse, C., Seed, C., Poterba, T., King, D., Kanai, M., Bloemendal, A., and Neale, B. M. (2023). Analysis of genetic dominance in the UK Biobank. Science, 379, 1341-1348.

Pazokitoroudi, A., Chiu, A. M., Burch, K. S., Pasaniuc, B., and Sankararaman, S. (2021). Quantifying the contribution of dominance deviation effects to complex trait variation in biobank-scale data. American Journal of Human Genetics, 108, 799-808.

Wainschtein, P., Jain, D., Zheng, Z., TOPMed Anthropometry Working Group, NHLBI Trans-Omics for Precision Medicine Consortium, Cupples, L. A., Shadyab, A. H., McKnight, B., Shoemaker, B. M., Mitchell, B. D., Psaty, B. M., Kooperberg, C., Liu, C.-T., Albert, C. M., Roden, D., Chasman, D. I., Darbar, D., Lloyd-Jones, D. M., Arnett, D. K., … Visscher, P. M. (2022). Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nature Genetics, 54, 263-273.

Abstract: Here and elsewhere, the term cis is used. Parenthetically it is explained that a cis-interaction score captures interactions between a focal variant and nearby variants. Does this mean that i-LDSC only tries to estimate the variance contributed by pairwise interactions between nearby SNPs? That is, does the method say nothing about statistical interactions between SNPs on different chromosomes, say? The paper should make a clear statement about this point somewhere.

lines 54-58: I do not understand what is being asserted here. I think it might clear some things up to point out that the developers of LDSC did not rest content with the assumption that every SNP has an effect. E.g., they conducted simulations showing that the intercept and slope remain unbiased estimators, so long as certain other conditions are met, as the proportion of SNPs with a nonzero effect ranged from one all the way down to $10^{-4}$ (Bulik-Sullivan et al., 2015, Supplementary Figures 3 and 4).

line 177: I suggest choosing a term other than ‘infinitesimal model’. Some kind of model where every SNP has an effect does not need to hold in order for standard LDSC to work.

line 505: ‘delete-one jackknife’ is confusing, because each block removed in a jackknife replicate contains more than one SNP. Leaving out the ‘delete-one’ is better.
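The distinction drawn in this comment, that each jackknife replicate removes a whole block of SNPs rather than a single SNP, can be sketched as follows. This is a generic block jackknife for a simple regression slope, not the authors' implementation; the function name, the toy LD scores and the toy test statistics are hypothetical.

```python
import numpy as np

def block_jackknife_slope(x, y, n_blocks):
    """Slope of y on x with a block-jackknife standard error."""
    def slope(xs, ys):
        xc = xs - xs.mean()
        return float(xc @ (ys - ys.mean()) / (xc @ xc))

    full = slope(x, y)
    blocks = np.array_split(np.arange(len(x)), n_blocks)
    # Each replicate drops one contiguous block of SNPs, not one SNP.
    loo = np.array([slope(np.delete(x, b), np.delete(y, b)) for b in blocks])
    # Pseudo-values; their spread gives the jackknife standard error.
    pseudo = n_blocks * full - (n_blocks - 1) * loo
    se = float(np.sqrt(pseudo.var(ddof=1) / n_blocks))
    return full, se

rng = np.random.default_rng(3)
m = 5000
ld = rng.gamma(shape=2.0, scale=20.0, size=m)       # hypothetical LD scores
chi2 = 1.0 + 0.01 * ld + rng.normal(0.0, 1.0, m)    # toy statistics, true slope 0.01
est, se = block_jackknife_slope(ld, chi2, n_blocks=200)
```

Dropping contiguous blocks is intended to respect local dependence between neighbouring SNPs, which a SNP-by-SNP deletion would ignore.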

lines 513-518: The justifications for the shortcuts in the computation of interaction LD Scores seem a bit handwavy to me. What about SNPs with which the j-th SNP is highly correlated? Is it possible to explain why these do not matter so much?

line 615: Now here is a variance decomposition which, I think, is truly questionable. Why does it make sense to say that the variance attributable to statistical interaction between genetic and non-genetic variables is part of the broad-sense heritability?

Figure S8: What makes windows of 25 or 50 SNPs for calculating interaction LD Scores the best for getting an estimate of the broad-sense heritability with i-LDSC? Why does a smaller window lead to inflation? This is not fully explained. I am sometimes fine with trying several values of an adjustable setting and picking whatever seems to work the best, but here I am more than a bit curious.

Figure S16: Is this a duplication of Figure 4D?

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Discovering non-additive heritability using additive GWAS summary statistics" for further consideration by eLife. Your revised article has been evaluated by George Perry (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below. If you think you are unable to fully address the reviewer's comments (which pertain to similar issues identified in the last round of reviews), we would advise you not to submit a revised manuscript.

Reviewer #1 (Recommendations for the authors):

1. I don't have issues with the authors' derivations but I do still think the authors have not yet successfully related their variance decomposition to the standard variance decomposition in classical quantitative genetics. Here's a simple example demonstrating this:

For simplicity, let the generative model be $y = X_1\beta_1 + X_2\beta_2 + W\theta + e$, where $X_1$ and $X_2$ are independent and standardized and $W = \mathrm{standardized}(X_1 X_2)$. Denote $r_1 = \mathrm{cov}(X_1, W) \neq 0$ and $r_2 = \mathrm{cov}(X_2, W) \neq 0$, with $\mathrm{cov}(X_1, X_2) = 0$, $e$ independent of everything else, and $y$ having zero mean and unit variance.

The classical additive genetic variance will be the R squared from regressing y on X1 and X2, i.e.,

$$V_g = \max_b \mathrm{cor}(y, X_1 + X_2 b)^2 = \max_b \left[\frac{\mathrm{cov}(y, X_1 + X_2 b)^2}{1 + b^2}\right] = \max_b \left[\frac{\mathrm{cov}(X_1\beta_1 + X_2\beta_2 + W\theta + e,\; X_1 + X_2 b)^2}{1 + b^2}\right] = \max_b \left[\frac{(\beta_1 + \theta r_1 + \beta_2 b + \theta r_2 b)^2}{1 + b^2}\right]$$

Doing some calculus, we get $V_g = (\beta_1^2 + \beta_2^2) + 2\theta(\beta_1 r_1 + \beta_2 r_2) + \theta^2(r_1^2 + r_2^2)$. So the classical genetic variance is larger than the additive variance component in the authors' model (here $\beta_1^2 + \beta_2^2$)! It will include the part of the epistatic component tagged by additive SNPs X, which is nonzero despite $X\beta$ and $W\theta$ being orthogonal.
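The calculus step can be checked numerically. Writing $a = \beta_1 + \theta r_1$ and $c = \beta_2 + \theta r_2$, the quantity being maximized is $(a + cb)^2/(1 + b^2)$, whose maximum over $b$ is $a^2 + c^2$ (attained at $b = c/a$); expanding $a^2 + c^2$ gives the closed form stated above. The sketch below compares a grid search with that closed form. The numerical values of the effects and of $r_1$, $r_2$ are hypothetical.

```python
import numpy as np

beta1, beta2, theta = 0.3, 0.2, 0.25   # hypothetical effects
r1, r2 = 0.1, 0.15                     # hypothetical cov(X1, W) and cov(X2, W)

def squared_corr(b):
    """Squared correlation between y and X1 + X2*b under the stated model."""
    num = (beta1 + theta * r1 + (beta2 + theta * r2) * b) ** 2
    return num / (1.0 + b ** 2)

# Brute-force maximization over a fine grid of b values.
b_grid = np.linspace(-50, 50, 200_001)
v_numeric = squared_corr(b_grid).max()

# Closed form from the expression derived in the text.
v_closed = ((beta1**2 + beta2**2)
            + 2 * theta * (beta1 * r1 + beta2 * r2)
            + theta**2 * (r1**2 + r2**2))

print(v_numeric, v_closed, beta1**2 + beta2**2)
```

For these values the maximized quantity exceeds $\beta_1^2 + \beta_2^2$, in line with the argument that the classical quantity absorbs the tagged part of the interaction term.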

I imagine you should be able to figure out the classical additive genetic variance (and epistatic etc) from the authors' model, though this isn't present in their current parametrization. If you can do that, I think it's far easier to interpret.

2. I'm not satisfied with the authors' response to the editors' comment here:

"If these simulations were conducted for sparse architectures, we would have likely seen a greater impact on i-LDSC; although, we want to note that we have already shown the LD score regression framework to be uncalibrated for traits with sparse genetic architectures (see Figure 3 —figure supplement 8)."

The issue here isn't that the results will be biased; LDSC regression is biased for estimating h2 with missing variants but this bias is easy to understand: missing causal variants lead to lower estimates. The question for your method is what happens to the epistatic variance component. It's fine if it's attenuated; it's not fine if it leads to a false positive signal of epistasis. Simulations to clarify this are necessary.

Reviewer #2 (Recommendations for the authors):

I focus my review in this round on the issues raised by the editor and also on a point of my own: accepting the results at face value, how are we to interpret their statistical significance and importance?

Orthogonal decomposition of the genetic variance

I do not think the authors did much to clarify whether their partition of the genetic variance is an orthogonal decomposition in accord with traditional quantitative-genetic theory. Looking at their line 138 and Equation 8, I see that they make use of the assumption that the additive effects and interaction coefficients are independent random variables with zero means. This kind of assumption has become common in statistical genetics (e.g., Yang et al., 2010), and I'm pretty sure the authors think they are simply following the lead of the original LDSC paper. But, as a few papers have pointed out (e.g., de los Campos et al., 2015), this assumption does not make a lot of sense since it is properties of individuals (genotypes, residuals) that are random rather than the properties of SNPs. And even if we are taking an average over different locations in the genome rather than over a population of individuals, why should the mean of the betas be zero? If we always count the derived allele, then the mean might well be negative if there is any tendency for mutations to have a depressing effect on the trait in question. Also, why independent? The recent report that SNPs with larger main effects tend to exhibit more dominance (Palmer et al., 2023) might lead us to expect that pairs of SNPs with larger main effects might show stronger statistical interactions.

Despite the fact that the ‘symbolic convention’ of E(β) = 0 has often been used to justify surprisingly robust tools, I think that it should no longer be retained in statistical genetics. It is confusing, it might be used in the future in an attempt to justify results that are really far off, and it is known already to prove too much. The supposed proofs of GCTA and LDSC seem to show that they are definitely unbiased, but they are in fact biased under conditions that cannot be explained in terms of this genetic-effects-as-random-variables notion.

My strong suggestion is to justify the orthogonality of Xβ and Wθ in some different way. One such way might be to prove that Equation 1 in my first review is an orthogonal decomposition, treating the betas and thetas as fixed constants and the X's as random variables. I can see intuitively why this is the case. Suppose that X1 and X2 are highly or even perfectly correlated. If X1 is mean-centered, then it should be uncorrelated with the square of mean-centered X2. Roughly speaking, each x1 should be balanced by another of opposite sign, and since they'll both be weighted by the square of x2 in the sum, they will cancel each other out.
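The intuition in the preceding paragraph can be written out for the two-variable case. The following is only a sketch, assuming X_1 and X_2 are mean-centered random variables and β_1, β_2, θ are fixed constants (the two-variable setup and the symmetry condition are assumptions made here for illustration, not statements taken from the manuscript):

```latex
\begin{align*}
\operatorname{Cov}\!\left(X_1\beta_1 + X_2\beta_2,\; X_1 X_2\,\theta\right)
  &= \theta\,\beta_1\,\mathbb{E}\!\left[X_1^2 X_2\right]
   + \theta\,\beta_2\,\mathbb{E}\!\left[X_1 X_2^2\right]
   && \text{since } \mathbb{E}[X_1] = \mathbb{E}[X_2] = 0, \\
\text{and if } X_1 = X_2:\quad
\operatorname{Cov}\!\left(X_1,\; X_2^2\right)
  &= \mathbb{E}\!\left[X_1^3\right].
\end{align*}
```

Both third-order moments are zero whenever the joint distribution of (X_1, X_2) is symmetric about the origin (for example, bivariate normal), whatever the correlation between X_1 and X_2. For skewed variables these moments need not vanish exactly, so the cancellation argument should be read as holding under that symmetry condition.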

Another way might be to add some caveats. Something like: "The appropriateness of treating genetic effects as random variables in analytical derivations has been questioned. Later, we will justify the theory presented here with simulation results showing that i-LDSC accurately recovers the magnitude of the epistatic variance in Eq. (1) under a broad range of conditions."

A related suggestion is to add a qualifier at some point in the text to the use of the term "additive-by-additive" or the symbol σ²_AA. The authors do not really explain the relationship between their epistatic variance component associated with product terms and the traditional σ²_AA, σ²_AD, and σ²_DD. Or, instead of a special qualifier, maybe use a different term and symbol altogether. Maybe the term should be "epistatic" or "pairwise-epistatic." Sometimes the symbol σ²_I is used to denote epistatic genetic variance, the I standing for "interaction" in the statistical sense (e.g., Falconer and Mackay, 1996). This seems less potentially misleading.

Might interaction LD scores better tag single variants, as in the retracted paper of Hemani et al.?

I am satisfied with the response of the authors, based on their new simulations.

The statistical significance and substantive importance of the estimated epistatic variance components

Previously, the authors put forth their heritability estimates with no functional annotations as their main results and treated their estimates with functional annotations as supplementary. I argued that the latter estimates should be treated as the main. In their revision, the authors put both sets of results in their main text. I suppose that this is a reasonable compromise.

The authors argue that the results with functional annotations (i.e., s-LDSC) cannot be dismissed even though they are not statistically significant. They point out in their reply that the standard error calculated by LDSC with the block jackknife is conservative. I had neglected this point, with which I agree. The block-jackknife standard error seems not to vanish with infinite sample size, probably because there are inherent differences between blocks of the genome. The sample size used by the authors (~350,000) might be large enough for this feature of LDSC to be a problem.

Sometimes the LDSC developers in a situation like this will take an average estimate over all traits in an analysis and then use the block jackknife to calculate the significance of it. Perhaps the authors here could calculate the average percentage increase in the heritability; this might turn out to be better powered. But this is just a suggestion, and the authors do not need to take the trouble in order to secure my assent to publication.
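For readers unfamiliar with the block jackknife mentioned above, the following is a minimal sketch of a delete-one-block jackknife standard error for a regression slope. It assumes equal-sized contiguous blocks and an ordinary (unweighted) least-squares slope; the function names, block count, and toy data are hypothetical and do not reproduce the LDSC software.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (intercept included)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def block_jackknife_se(x, y, n_blocks=200):
    """Delete-one-block jackknife estimate and standard error of the OLS slope.

    Assumes the observations are ordered (e.g. along the genome) and that
    the blocks are of equal size.
    """
    blocks = np.array_split(np.arange(x.size), n_blocks)  # contiguous blocks
    full = ols_slope(x, y)
    leave_one_out = np.empty(n_blocks)
    for b, idx in enumerate(blocks):
        keep = np.ones(x.size, dtype=bool)
        keep[idx] = False  # drop one block, refit on the rest
        leave_one_out[b] = ols_slope(x[keep], y[keep])
    # Pseudo-values; their sample variance yields the jackknife variance.
    pseudo = n_blocks * full - (n_blocks - 1) * leave_one_out
    se = float(np.sqrt(pseudo.var(ddof=1) / n_blocks))
    return full, se

# Toy data (hypothetical): true slope of 2 with unit-variance noise.
rng = np.random.default_rng(0)
x_toy = rng.normal(size=20_000)
y_toy = 2.0 * x_toy + rng.normal(size=20_000)
slope_hat, slope_se = block_jackknife_se(x_toy, y_toy)
print(f"slope = {slope_hat:.3f}, block-jackknife SE = {slope_se:.4f}")
```

In this toy example the observations are independent, so the block structure does not matter; with correlated neighbouring SNPs, deleting whole blocks is what keeps the standard error from being too small.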

References

de los Campos, G., Sorensen, D., and Gianola, D. (2015). Genomic heritability: What is it? PLoS Genetics, 11, e1005048.

Falconer, D.S., and Mackay, T.F.C. (1996). Introduction to quantitative genetics (4th ed.). Longman.

Palmer, D.S. et al. (2023). Analysis of genetic dominance in the UK Biobank. Science, 379, 1341-1349.

Yang, J. et al. (2010). Common SNPs explain a large proportion of the heritability for human height. Nature Genetics, 42, 565-569.

eLife. 2024 Jun 24;13:e90459. doi: 10.7554/eLife.90459.sa2

Author response


[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Reviewer #1 (Recommendations for the authors):

In this study, Darnell and colleagues propose a new method to quantify the contribution of cis-epistasis (i.e., interactions between nearby genetic variants) to the heritability of complex traits. Their method, MELD, is an extension of the popular LD score regression methodology. The authors perform simulations conditional on real genotypes to assess the consistency of their estimators of SNP-based heritability and apply their methods to 25 complex traits measured in participants of the UK Biobank and Biobank Japan. The authors conclude that unaccounted epistasis biases estimates of (narrow sense) SNP-based heritability.

I have concerns regarding the method and the conclusions of this study.

We thank the reviewer for carefully reading our manuscript and we appreciate the constructive comments. Our responses and edits to the specific comments are given below.

1. Additive and non-additive effects are not orthogonal in the proposed model. The authors used the generative models in Equations (1) and (2) to define the narrow sense heritability. However, this definition is incorrect because the additive and non-additive components of their model are not orthogonal. A key point to understand is that there is an additive genetic variance even when all genetic effects arise from statistical interactions between loci. This is the core issue discussed in many papers cited by the authors (e.g., Hill, Goddard and Visscher) but this does not seem to be fully grasped in this study. Unfortunately, I think there is an important confusion between what quantitative genetics classically terms as additive genetic variance and what the authors define here. The same distinction applies for dominance variance. Therefore, I don't think that what the authors term as a "bias" is justified here. That being said, it could still be that what MELD estimates (i.e., not the narrow sense heritability) can teach us something interesting about the genetic architecture of complex traits. I'm just not sure what exactly, but my next point below may help.

2. Observations are consistent with alternative explanations. The linear relationship between (expected) chi-square statistics and LD scores postulated by the LD score regression methodology holds under a number of assumptions: (i) all variants are causal and (ii) explain the same amount of variance. However, if (i) is violated then the expected chi-square statistic at a given SNP k is an affine function of the squared correlation between SNP k and all causal variants (in the vicinity). As a consequence, if causal variants are enriched in low or high LD regions of the genome, then extra (non-linear) terms could be needed to represent the relationship between chi-square and LD. I believe that the real data application of MELD is affected by the non-random distribution of causal variants with respect to LD. The authors could test this prediction by sampling causal variants as a function of their LD scores, simulating a trait, and applying MELD to the resulting GWAS summary statistics. Another related explanation could be if causal variants are poorly imputed. This should also create a signal detectable by MELD.

We will attempt to address the previous two comments together because they are related. First, we want to thank the reviewer for bringing up these important points and we apologize for not appropriately representing the concept of additive genetic variance as it is classically defined in the quantitative genetics literature. If we define additive genetic variance as being any type of effect that can be explained by a linear model, then we agree that the term “bias” is not appropriate.

In the revised manuscript, we follow many of the reviewer’s suggestions as we reframe the contributions of our work. To begin, we now highlight that the original LD score regression (LDSC) relies on the fact that the expected relationship between chi-square test statistics (i.e., the squared magnitude of GWAS allelic effect estimates) and LD scores holds when complex traits are generated under the infinitesimal (or polygenic) model. Importantly, the estimand of the LDSC model is the proportion of phenotypic variance attributable to additive effects of genotyped SNPs. In this paper, we show that the LDSC framework can be extended to estimate greater proportions of genetic variance in complex traits (i.e., beyond the variance that is attributable to additive effects) when a subset of causal variants is involved in a gene-by-gene (G×G) interaction.
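As a point of reference for the relationship described above, here is a minimal sketch of the basic LD score regression idea. It assumes the standard expectation E[χ²_j] = 1 + N·h²·ℓ_j/M with no confounding term, where N is the sample size, M the number of SNPs, and ℓ_j the LD score of SNP j. The regression is deliberately unweighted, and the function names, the uniform LD-score distribution, and the toy parameter values are all hypothetical; this is not the LDSC or i-LDSC implementation.

```python
import numpy as np

def simulate_chi2(ld_scores, n_samples, h2, rng):
    """Draw chi-square statistics whose mean is 1 + N * h2 * l_j / M."""
    m = ld_scores.size
    expected = 1.0 + n_samples * h2 * ld_scores / m
    # A chi-square(1) draw has mean 1, so scaling it sets the target mean.
    return expected * rng.chisquare(df=1, size=m)

def ldsc_h2(chi2, ld_scores, n_samples):
    """Unweighted least-squares fit of chi2 on LD score.

    Returns (intercept, h2), where h2 = slope * M / N.
    """
    m = ld_scores.size
    design = np.column_stack([np.ones(m), ld_scores])
    (intercept, slope), *_ = np.linalg.lstsq(design, chi2, rcond=None)
    return float(intercept), float(slope * m / n_samples)

rng = np.random.default_rng(0)
ld = rng.uniform(1.0, 200.0, size=50_000)        # toy LD scores
chi2 = simulate_chi2(ld, n_samples=100_000, h2=0.5, rng=rng)
intercept_hat, h2_hat = ldsc_h2(chi2, ld, n_samples=100_000)
print(f"intercept = {intercept_hat:.2f}, h2 = {h2_hat:.3f}")
```

The slope recovers the simulated heritability only because the toy data follow the assumed linear relationship; when causal variants are concentrated in low- or high-LD regions, as in the reviewer's scenario, that relationship no longer holds.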

The key theoretical insight we now leverage is that SNP-level GWAS summary statistics can provide evidence of non-additive genetic effects contributing to trait architecture if there is a nonzero correlation between individual-level genotypes and their statistical interactions. Our newly named “interaction-LD score” (i-LDSC) model therefore aims to recover missing heritability from GWAS summary statistics by incorporating an additional score that measures the nonadditive genetic variation that is tagged by genotyped SNPs. We then demonstrate how i-LDSC builds upon the original LDSC model through the development of new “cis-interaction” LD scores which help to investigate signals stemming from cis-acting SNP-by-SNP interactions.

In our new simulations, we show that the cis-interaction LD scores that we propose are not always enough to recover missing genetic variation for all types of trait architectures. Following the reviewer’s suggestion, we did a study where we generated synthetic phenotypes with sparse architectures from a spike-and-slab model (Zhou et al. 2013, PLOS Genetics). Here, traits were simulated with solely additive effects, but this time only variants with the top or bottom {1, 5, 10, 25, 50, 100} percentiles of LD scores were given non-zero coefficients. Breaking the assumed relationship between LD scores and chi-squared test statistics (i.e., that they are generally positively correlated) led to unbounded estimates of heritability in all but the (polygenic) scenario when 100% of SNPs contributed to the phenotypic variance for both LDSC and i-LDSC. Results for these simulations can be found in Figure S15.

3. Estimates of narrow sense heritability from MELD are biased. Figure S8 shows that MELD estimates of narrow sense heritability are biased. The authors propose a model averaging approach, which is claimed to work. However, I remain sceptical that this is the solution to the problem. I believe that the model is trying to fit a very peculiar genetic architecture where there are many causal variants within narrow windows (e.g., 100 kb) all interacting with each other. While there is evidence of widespread allelic heterogeneity (e.g., many causal variants at a given locus), evidence of widespread local epistasis is clearly lacking.

This is a great point, and we appreciate the reviewer for bringing this up. Indeed, different types of interaction scores used within the i-LDSC model will fit to different types of genetic architectures. When the cis-interaction scores are computed over genomic window sizes that are too narrow, i-LDSC can yield upwardly biased heritability estimates (e.g., Figure S8). In practice, we find that cis-interaction LD scores that are calculated using larger windows lead to the most accurate estimates of heritability while also not overrepresenting the total phenotypic variation explained by tagged non-additive genetic effects. Therefore, unless otherwise stated, we use cis-interaction LD scores calculated with a ±50 SNP interaction window for all simulations and real data analyses conducted in this work. We now make this point explicit in the revised manuscript (see new lines 225-228 and 549-554).

Other comments.

a) Table S6: Estimates from LDSC are inconsistent with previously published data. For example, the LDSC estimates for height and BMI are 0.815 and 0.506, respectively. These estimates are much larger than REML estimates obtained with WGS data (~0.7 and ~0.3). This cannot be true. The MELD estimates for the two traits are 0.57 and 0.282 respectively. While the latter estimates make more sense, I'd be curious to see what is the part attributed by MELD to non-additivity. Sorry if that was reported somewhere but I could not see those.

We really appreciate the reviewer catching this discrepancy in the heritability estimates we report and those commonly reported in the literature. We did an extensive review of the LD score regression framework, the use of certain covariates in the software, and fixed our implementation of the weighted least squares algorithm to estimate model coefficients. This corrected the inflated heritability estimates that we were seeing in our analyses for traits in the UK Biobank and BioBank Japan. For example, for height and body mass index (BMI) in the UK Biobank, we now get heritability estimates around 0.368 and 0.176 for LDSC, respectively. For those same traits, i-LDSC produces heritability estimates 0.482 and 0.235, respectively. The updated results for the real data analyses can be found in the new Figures 4 and S16 and the new Tables 1 and S6.

We apologize for not making the estimates σ̂ (i.e., the proportion of heritability estimates from i-LDSC attributed to non-additivity) more readily available in our initial submission. These now can be found in the newly revised Table 1 along with (1) heritability estimates from LDSC and i-LDSC, and (2) P-values highlighting statistically significant contributions of tagged non-additive genetic variation for 25 traits in the UK Biobank and BioBank Japan.

b) Line 88 : Previous methods have been developed to estimate dominance variance using an LD score regression framework.

We appreciate the reviewer for highlighting this omission in our previous submission. In the newly revised manuscript, we now include a reference to Palmer et al. (2023), Science (see line 65) where the LD score regression framework was extended to account for dominance effects.

c) Line 111 (and elsewhere): polygenicity is not a confounding factor like genetic relatedness and population stratification.

We thank the reviewer for pointing out this error and have removed all references to polygenicity as a source of confounding.

d) Lines 137-138: I think this is incorrect. GWAS do not make assumptions (i) and (ii).

We agree with the reviewer that our assumptions about GWA studies were not consistent with the literature. In the revision, we have edited this part of the manuscript to read (see new lines 135-141):

“A central objective in GWAS studies is to infer how much phenotypic variation can be explained by genetic effects. To achieve that objective, a key consideration involves incorporating the possibility of non-additive sources of genetic variation to be correlated with and explained by additive effect size estimates obtained from GWAS analyses (Hill et al. 2008, PLOS Genetics). If we assume that the genotype and interaction matrices X and W are not completely orthogonal (i.e., such that X⊤W ≠ 𝟎) then the following relationship between the moment matrix X⊤y, the observed marginal GWAS summary statistics β̂, and the true coefficient values β from the generative model in Eq. (1) holds in expectation (see Materials and methods)…”

e) Lines 145 – 148: V * theta is not a bias because marginal effects and joint effects are different. Equation (3) uses the same notation for both. However, even when theta=0, β_hat would never equal β (unless R = Identity matrix, i.e. all SNPs are independent).

We agree with the reviewer, and we apologize again for not using definitions that are consistent with the quantitative genetics literature. We have removed the language we used previously and now write the following (see new lines 145-148):

“Intuitively, the term Vθ can be interpreted as the non-additive effects that are tagged by the additive effect estimates from the GWAS study. Note that, when (i) non-additive genetic effects play a negligible role on the overall architecture of a trait (i.e., such that θ = 𝟎) or (ii) the genotype and interaction matrices X and W do not share the same column space (i.e., such that X⊤W = 𝟎), the equation above simplifies to a relationship between LD and summary statistics that is assumed in many common GWAS studies and methods…”

f) Lines 179 – 184: "Instead, …". This is evidence that the additive and non-additive components are not orthogonal. I'm expecting non-additive effects to explain variance on top of additive genetic variance.

This is a great catch. As we mentioned in a response to a previous comment made by the reviewer, we fixed our implementation of the weighted least squares algorithm to estimate model coefficients. Using the appropriate model fitting strategy resulted in the expected result where heritability estimates for i-LDSC are always larger than those from LDSC, indicating that the variance explained by the cis-interaction LD score is on top of the variance captured by the additive LD score alone. The updated results for the real data analyses can be found in the new Figures 4 and S16 and the new Tables 1 and S6.

g) Line 209: "phenotype they were" – I think that "they" should be deleted.

We appreciate the reviewer for reading our manuscript so closely. We have fixed this typo.

Reviewer #2 (Recommendations for the authors):

The authors present an approach for variance component estimation using summary statistics. In particular, they extend the univariate LDSC regression framework to include a parameter capturing local pairwise epistasis, in addition to the usual intercept and additive components, a method they call "MELD". The authors achieve this by computing a second set of LD scores that index the extent to which a given SNP tags SNP-SNP products within a local window (as opposed to tagging individual SNPs in vanilla LDSC regression). They perform a variety of simulations showing that their method is well calibrated under the standard additive LDSC model as well as their proposed generative model including local pairwise epistatic effects. Finally, they apply their method to a variety of phenotypes in the UK and Japan biobanks, identifying substantial putative non-additive genetic variance.

We thank the reviewer for their in-depth reading of our manuscript and feel that their suggestions have vastly improved our work. Our responses to the specific comments are given below.

1. In their analyses comparing LDSC and MELD narrow-sense heritability estimates in the UK biobank (UKB), the LDSC heritability estimates are far higher than I've seen elsewhere. For instance, the authors report h2 estimates of 0.815 and 0.506 for height and BMI, respectively. In the Neale lab UKB h2 browser, which used a fairly liberal set of covariates (nealelab.github.io/UKBB_ldsc/h2_browser.html), these same phenotypes have h2 estimates of 0.485 and 0.249. Evans and colleagues report (doi.org/10.1038/s41588-018-0108-x) LDSC h2 estimates of 0.259 and 0.231, also in the UKB. I can only assume an error in the present work has resulted in such large LDSC h2 values. As such, it is hard to meaningfully compare the MELD and LDSC h2 estimates.

We really appreciate the reviewer catching this discrepancy in the heritability estimates we report and those commonly reported in the literature. We did an extensive review of the LD score regression framework, the use of certain covariates in the software, and fixed our implementation of the weighted least squares algorithm to estimate model coefficients. More specifically, we altered the regression weights that were recommended by the original authors of the LDSC model (e.g., HapMap3 SNPs with no MHC region) and we believe that it corrected for correlation in chi-squared statistics due to LD and heteroskedasticity between variants with small and large chi-squared statistics. This procedure lessened the inflation in the heritability estimates that we were seeing in our initial analyses for traits in the UK Biobank and BioBank Japan. For example, for height and body mass index (BMI) in the UK Biobank, we now get heritability estimates around 0.368 and 0.176 for LDSC, respectively. For those same traits, i-LDSC produces heritability estimates 0.482 and 0.235, respectively. The updated results for the real data analyses can be found in the new Figures 4 and S16 and the new Tables 1 and S6.

We also apologize for not making the estimates σ̂ (i.e., the proportion of heritability estimates from i-LDSC attributed to non-additivity) more readily available in our initial submission. These now can be found in the newly revised Table 1 along with (1) heritability estimates from LDSC and i-LDSC, and (2) P-values highlighting statistically significant contributions of tagged non-additive genetic variation for 25 traits in the UK Biobank and BioBank Japan.

2. The authors' simulations only cover a subset of common generative models that I'd like to see before interpreting their findings. Specifically, they don't present simulations under

  • additive + GxE effects

  • the above + local pairwise epistatic effects

  • additive + long range pairwise epistasis (e.g., at loci on different chromosomes)

  • the above + local pairwise epistatic effects

  • additive + ancestry-by-G effects (e.g., the product of individual genotypes and a genomic PC)

  • the above + local pairwise epistatic effects

which I would like to see both for their proposed method and for LDSC. They do not need to present a method that will perform well under all of these scenarios, only demonstrate how their method performs under other plausible generative models. To the extent that MELD outperforms LDSC under the MELD generative model, it may perform relatively poorly under alternative architectures.

We thank the reviewer for providing us with this suggestion. As suggested, in the revised manuscript, we now compare heritability estimates from both LDSC and i-LDSC in the presence of additive effects, cis-acting interactions, and a third source of genetic variance stemming from either gene-by-environment (G×E) or gene-by-ancestry (G×Ancestry) effects. The results for these analyses can be found in the new Figures S9-S14. As we report in new lines 240-254:

“In general, i-LDSC underestimates overall heritability when additive effects and cis-acting interactions are present alongside G×E (Figure S9) and/or G×Ancestry effects when PCs are included as covariates (Figure S10). Notably, when PCs are not included to correct for residual stratification, both LDSC and i-LDSC can yield unbounded heritability estimates greater than 1 (Figure S11). Also interestingly, when we omit cis-interactions from the generative model (i.e., the genetic architecture of simulated traits is only made up of additive and G×E or G×Ancestry effects), i-LDSC will still estimate a nonzero genetic variance component with the cis-interaction LD scores (Figures S12-S14). Collectively, these results empirically show the important point that cis-interaction scores are not enough to recover missing genetic variation for all types of trait architectures; however, they are helpful in recovering phenotypic variation explained by statistical cis-interaction effects. Recall that the linear relationship between (expected) χ2 test statistics and LD scores proposed by the LDSC framework holds when complex traits are generated under the polygenic model where all causal variants have the same expected contribution to phenotypic variation. When cis-interactions affect genetic architecture (e.g., in our earlier simulations in Figure 3), these assumptions are violated in LDSC, but the inclusion of the additional nonlinear scores in i-LDSC help recover the relationship between the expectation of χ2 test statistics and LD.”

3. It is believed that much of the signal we see in LDSC h2 estimates reflects the effects of unmeasured variants tagged by, e.g., a million or so HapMap3 SNPs. How does MELD perform when one or both of SNPs with epistatic effects aren't directly measured in the provided data?

This is a great suggestion. In the revised version of the manuscript, we performed the new simulations using all the UK Biobank genotype variants to assign causal SNPs contributing additive and nonlinear effects (see new Figures S9-S15). However, some of these are not HapMap3 SNPs and are ultimately not included in the LDSC or i-LDSC regression analyses to produce heritability estimates. This presents a realistic scenario where variants with both additive and non-additive genetic effects are omitted (lines 688-698). Exceptions to this setup are the initial set of simulations, which we use to illustrate the power of i-LDSC in a model where the exact effects of all variants are observed and included in the LDSC or i-LDSC regression (these include Figures 1-3 and S1-S8).

4. There is an analogy between LDSC regression and Haseman-Elston (HE) regression (doi.org/10.1214/17-AOAS1052). What would the HE regression equivalent of MELD be? Answering this question will better situate MELD in existing methodological literature.

We appreciate the reviewer for making this connection. Indeed, there is a direct connection between the LDSC framework and variance component approaches such as Haseman-Elston (HE) regression (Zhou 2017, Annals of Applied Statistics). While deriving the HE regression equivalent of i-LDSC might be outside the scope of this particular paper, as part of future work, we can think of i-LDSC as a multiple random effect model and explore alternative fitting algorithms such as MQS which is based on a method of moments and produces estimates that are mathematically identical to the HE regression (Zhou 2017, Annals of Applied Statistics; Crawford et al. 2017, PLOS Genetics; Zhu and Zhou 2020, Computational and Structural Biotechnology Journal).

5. It would be helpful to also present the MELD broad-sense h2 estimates whenever MELD and LDSC narrow-sense h2 estimates are compared.

Completely agree. We now report both LDSC and i-LDSC heritability estimates together whenever they are referenced. This is reflected in the new Figures 3-4 and S8-15 and Tables 1 and S6.

[Editors’ note: what follows is the authors’ response to the second round of review.]

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

While the reviewers and editors appreciate that the manuscript has improved substantially since the initial submission – and that the method may be picking up something interesting – a key point that was made in the initial review was not addressed adequately: the authors have not clarified the relationship between the parameters their method estimates and the traditional decomposition of the genetic variance into additive and non-additive components. The traditional definition of the additive variance is an orthogonal decomposition, where the non-additive component is defined as the component that is orthogonal to the best linear prediction of the genetic values. This is not the case in the authors' method since they require that the genotype matrix and genetic interaction matrix are correlated in order for their method to pick up any 'epistasis' component. However, the authors claim to recover additional variance beyond the additive variance even from additive GWAS summary statistics. Unless the authors can clarify in detail how their parameters and empirical results relate to traditionally defined additive and non-additive variance components, we will not be able to publish the manuscript.

A further point that was raised during the consultation between the editors and reviewers was whether the authors' method is vulnerable to the artefact that led to a retraction of a Nature paper on cis-epistasis affecting gene expression: https://www.nature.com/articles/s41586-021-03766-y. This article was retracted because they found that what appeared to be epistasis was better explained by interaction genotypes better tagging haplotypes with causal variants.

We would require that the authors address the above points in addition to the point raised by reviewer 2 about whether the s-LDSC results should change the interpretation given to the authors' empirical results about the magnitude of epistasis.

We thank the editors for their continued consideration of our manuscript. To address the concern about the lack of relationship between the parameters in i-LDSC and traditional decompositions of genetic variance, we have added new derivations around the model formulation. We now fully formalize the interaction component used in the i-LDSC model as an estimate of the phenotypic variance explained by additive-by-additive interactions between genetic variants (see the updated Results and Materials and Methods). This provides two key takeaways that also address some of the reviewer’s concerns. First, we show that our model does assume that additive, dominance, and additive-by-additive genetic effects are orthogonal to each other. This is important because it means that there is a unique partitioning of genetic variance when studying a trait of interest. The second key takeaway is that the genotype matrix X and the matrix of genetic interactions W themselves are correlated despite being linearly independent. This property stems from the fact that the additive-by-additive effects between two SNPs are encoded as the Hadamard product of two genotypic vectors in the form w_m = x_j ∘ x_k (which is a nonlinear function of the genotypes). In the revised manuscript, new edits corresponding to these changes can be found around Eqs. (2)-(4) and Eqs. (6)-(9).
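To make the Hadamard-product encoding concrete, the following is a minimal sketch that builds an interaction matrix from standardized genotypes by taking element-wise products of nearby SNP pairs. The function names, the toy genotypes, and the pairing rule (each SNP paired with the next `window` SNPs) are hypothetical simplifications; the windowing and scaling used by the i-LDSC software are not reproduced here.

```python
import numpy as np

def standardize(genotypes):
    """Center and scale each SNP column to mean 0 and variance 1."""
    return (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)

def cis_interaction_matrix(x, window):
    """Stack element-wise products x_j * x_k for all pairs with 0 < k - j <= window."""
    n_snps = x.shape[1]
    pairs = [(j, k)
             for j in range(n_snps)
             for k in range(j + 1, min(j + window, n_snps - 1) + 1)]
    w = np.column_stack([x[:, j] * x[:, k] for j, k in pairs])
    return w, pairs

rng = np.random.default_rng(1)
# Hypothetical toy genotypes: 5,000 individuals, 6 independent SNPs coded 0/1/2.
maf = rng.uniform(0.1, 0.5, size=6)
g = rng.binomial(2, maf, size=(5_000, 6)).astype(float)
x = standardize(g)
w, pairs = cis_interaction_matrix(x, window=2)
print(w.shape, pairs[:3])
```

Each column of `w` is a nonlinear function of two columns of `x`, so it cannot be written as a linear combination of genotype columns in general, even though it may still be correlated with them.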

We additionally performed new sets of simulations. In the first, we investigated the possibility of additive-by-additive interaction effects estimated by i-LDSC being inflated by additive effects in a haplotype that are unobserved during the model fitting of i-LDSC (i.e., a scenario highlighted in the paper mentioned by both the editors and the reviewer). Generally, we observed that, across a range of both minor allele frequencies and effect sizes, the omission of causal haplotypes had a negligible effect on the estimated value of the coefficients in i-LDSC (see new Figure 3 —figure supplement 9). We hypothesize this is because the simulations were done for polygenic architectures where all SNPs have at least an additive effect. As a result, not observing a small subset of SNPs does not hinder the ability of i-LDSC to estimate genetic variance because the effect size of each SNP is small. If these simulations were conducted for sparse architectures, we would have likely seen a greater impact on i-LDSC; although, we want to note that we have already shown the LD score regression framework to be uncalibrated for traits with sparse genetic architectures (see Figure 3 —figure supplement 8).

Please find our responses to the reviewer below. We are very grateful for the constructive criticism and do agree that addressing these concerns has made our manuscript even stronger.

Reviewer #2 (Recommendations for the authors):

This paper has a lot of strong points, and I commend the authors for the effort and ingenuity expended in tackling the difficult problem of estimating epistatic (non-additive) genetic variance from GWAS summary statistics. The mere possibility of the estimated univariate regression coefficient containing a contribution from epistasis, as represented in the manuscript's Equation 3 and elsewhere, is intriguing in and of itself.

Is i-LDSC Estimating Epistasis?

Perhaps the issue that has given me the most pause is uncertainty over whether the paper's method is really estimating the non-additive genetic variance, as this has been traditionally defined in quantitative genetics with great consequences for the correlations between relatives and evolutionary theory (Fisher, 1930, 1941; Lynch and Walsh, 1998; Burger, 2000; Ewens, 2004).

Let us call the expected phenotypic value of a given multiple-SNP genotype the total genetic value. If we apply least-squares regression to obtain the coefficients of the SNPs in a simple linear model predicting the total genetic values, then the partial regression coefficients are the average effects of gene substitution and the variance in the predicted values resulting from the model is called the additive genetic variance. (This is all theoretical and definitional, not empirical. We do not actually perform this regression.) The variance in the residuals (the differences between the total genetic values and the additive predicted values) is the non-additive genetic variance. Notice that this is an orthogonal decomposition of the variance in total genetic values. Thus, in order for the variance in Wθ to qualify as the non-additive genetic variance, it must be orthogonal to Xβ.

At first, I very much doubted whether this is generally true. And I was not reassured by the authors' reply to Reviewer 1 on this point, which did not seem to show any grasp of the issue at all. But to my surprise I discovered in elementary simulations of Equation 1 above that for mean-centered $X_1$ and $X_2$, $(X_1\beta_1 + X_2\beta_2)$ is uncorrelated with $X_1 X_2 \theta$ for seemingly arbitrary correlation between $X_1$ and $X_2$. A partition of the outcome's variance between these two components is thus an orthogonal decomposition after all. Furthermore, the result seems general for any number of independent variables and their pairwise products. I am also encouraged by the report that standard and interaction LD Scores are ‘lowly correlated’ (line 179), meaning that the standard LDSC slope is scarcely affected by the inclusion of interaction LD Scores in the regression; this behavior is what we should expect from an orthogonal decomposition.
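A check of this kind can be sketched in a few lines of Python. This is only a minimal illustration, not the reviewer's actual simulation: it assumes jointly Gaussian predictors (the distribution used by the reviewer is not stated), and the sample size, correlation, and coefficients are hypothetical values chosen for the example. For mean-centered variables, cov(X1, X1·X2) equals the third moment E[X1²·X2], which vanishes for symmetric joint distributions such as the Gaussian; other distributions may behave differently.

```python
import random


def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


def simulate(n=50_000, rho=0.8, beta1=0.5, beta2=-0.3, theta=0.7, seed=0):
    """Return cor(X1, X2) and cor(additive part, product term).

    Assumption: X1 and X2 are jointly Gaussian with correlation rho.
    All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1.append(z1)
        x2.append(rho * z1 + (1 - rho ** 2) ** 0.5 * z2)
    # Mean-center both predictors before forming the product.
    m1, m2 = sum(x1) / n, sum(x2) / n
    x1 = [a - m1 for a in x1]
    x2 = [b - m2 for b in x2]
    additive = [beta1 * a + beta2 * b for a, b in zip(x1, x2)]
    interaction = [theta * a * b for a, b in zip(x1, x2)]
    return pearson(x1, x2), pearson(additive, interaction)


r_x, r_parts = simulate()
print(f"cor(X1, X2) = {r_x:.3f}; cor(additive, interaction) = {r_parts:.3f}")
```

Under these assumptions the two predictors are strongly correlated while the additive and product components are close to uncorrelated, up to sampling noise.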

I have therefore come to the view that the additional variance component estimated by i-LDSC has a close correspondence with the epistatic (non-additive) genetic variance after all.

In order to make this point transparent to all readers, however, I think that the authors should put much more effort into placing their work into the traditional framework of the field. It was certainly not intuitive to multiple reviewers that Xβ is orthogonal to Wθ. There are even contrary suggestions. For if $(X\beta)^\top W\theta = \beta^\top X^\top W\theta$ is to equal zero, we know that we can't get there by $X^\top W$ equaling zero because then the method has nothing to go on (e.g., line 139). We thus have a quadratic form (each term being the weighted product of an average (additive) effect and an interaction coefficient) needing to cancel out to equal zero. I wonder if the authors can put forth a rigorous argument or compelling intuition for why this should be the case.

In the case of two polymorphic sites, quantitative genetics has traditionally partitioned the total genetic variance into the following orthogonal components:

  • additive genetic variance, $\sigma_A^2$, the numerator of the narrow-sense heritability;

  • dominance genetic variance, $\sigma_D^2$;

  • additive-by-additive genetic variance, $\sigma_{AA}^2$;

  • additive-by-dominance genetic variance, $\sigma_{AD}^2$; and

  • dominance-by-dominance genetic variance, $\sigma_{DD}^2$.

See Lynch and Walsh (1998, pp. 88-92) for a thorough numerical example. This decomposition is not arbitrary or trivial, since each component has a distinct coefficient in the correlations between relatives. Is it possible for the authors to relate the variance associated with their Wθ to this traditional decomposition? Besides justifying the work in this paper, the establishment of a relationship can have the possible practical benefit of allowing i-LDSC estimates of non-additive genetic variance to be checked against empirical correlations between relatives. For example, if we know from other methods that $\sigma_D^2$ is negligible but that i-LDSC returns a sizable $\sigma_{AA}^2$, we might predict that the parent-offspring correlation should be equal to the sibling correlation; a sizable $\sigma_D^2$ would make the sibling correlation higher. Admittedly, however, such an exercise can get rather complicated for the variance contributed by pairs of SNPs that are close together (Lynch and Walsh, 1998, pp. 146-152).
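The prediction about relatives can be made concrete with the standard textbook expansion of the genetic covariance between relatives, Cov = r·σ_A² + u·σ_D² + r²·σ_AA² + r·u·σ_AD² + u²·σ_DD². The sketch below assumes unlinked loci, random mating, no shared environment, and a phenotypic variance of 1 (so covariances are correlations); the variance values are hypothetical and are not estimates from the manuscript.

```python
def relative_covariance(r, u, var_a, var_d=0.0, var_aa=0.0, var_ad=0.0, var_dd=0.0):
    """Genetic covariance between relatives under the textbook expansion.

    r: additive coefficient of relatedness
    u: probability that the pair shares both alleles identical by descent
    Assumes unlinked loci, random mating, and no shared environment.
    """
    return (r * var_a + u * var_d + r ** 2 * var_aa
            + r * u * var_ad + u ** 2 * var_dd)


PARENT_OFFSPRING = dict(r=0.5, u=0.0)
FULL_SIBS = dict(r=0.5, u=0.25)

# Hypothetical variance components (phenotypic variance scaled to 1).
po_no_dom = relative_covariance(**PARENT_OFFSPRING, var_a=0.4, var_aa=0.08)
sib_no_dom = relative_covariance(**FULL_SIBS, var_a=0.4, var_aa=0.08)
sib_with_dom = relative_covariance(**FULL_SIBS, var_a=0.4, var_d=0.06, var_aa=0.08)

print(po_no_dom, sib_no_dom, sib_with_dom)
```

With negligible dominance variance the parent-offspring and full-sib values coincide, and adding dominance variance raises only the full-sib value, which matches the reviewer's reasoning for unlinked loci.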

I would also like the authors to clarify whether LDSC consistently overestimates the narrow-sense heritability in the case that pairwise epistasis is present. The figures seem to show this. I have conflicting intuitions here. On the one hand, if GWAS summary statistics can be inflated by the tagging of epistasis, then it seems that LDSC should overestimate heritability (or at least this should be an upwardly biasing factor; other factors may lead the net bias to be different). On the other hand, if standard and interaction LD Scores are lowly correlated, then I feel that the inclusion of interaction LD Score in the regression should not strongly affect the coefficient of the standard LD Score. Relatedly, I find it rather curious that i-LDSC seems increasingly biased as the proportion of genetic variance that is non-additive goes up, but perhaps this is not too important, since such a high ratio of narrow-sense to broad-sense heritability is not realistic.

We thank the reviewer for taking the time to thoughtfully offer more context on how we might situate the i-LDSC framework within the greater context of traditional quantitative genetics. We now formalize the interaction component used in the i-LDSC model as an estimate of the phenotypic variance explained by additive-by-additive interactions between genetic variants (which we denote by σ^AA to follow the conventional notation). In the newly revised Material and Methods, we also show how the i-LDSC model can be formulated to include dominance effects in a more general framework. Our updated derivations provide two key takeaways.

First, we assume that the additive and interaction effect sizes in the general model $(\beta, \theta)$ are each normally distributed with variances proportional to their individual contributions to trait heritability: $\beta_j \sim \mathcal{N}(0, \sigma_A^2)$ and $\theta_m \sim \mathcal{N}(0, \sigma_{AA}^2)$. This independence assumption implies that the additive and non-additive components $X\beta$ and $W\theta$ are orthogonal, where $\mathbb{E}[\beta^\top X^\top W\theta] = \mathbb{E}[\beta^\top] X^\top W \, \mathbb{E}[\theta] = 0$. This is important because, as the reviewer points out, it means that there is a unique partitioning of genetic variance when studying a trait of interest. In the revised version of the manuscript, we show this derivation in the main text (see lines 129-143). We also extend this derivation in the Materials and methods where we show the same result even after we include the presence of dominance effects in the generative model (see lines 415-417 and 438-457).

Second, we show that the genotype matrix X and the matrix of genetic interactions W are not linearly dependent because the additive-by-additive effects between two SNPs are encoded as the Hadamard product of two genotypic vectors in the form $w_m = x_j x_k$ (which is a nonlinear function of the genotypes). Linear dependence would have implied that one could find a transformation between a SNP and an interaction term in the form $w_m = c \times x_j$ for some constant $c$. However, despite their linear independence, X and W are themselves not orthogonal and still have a nonzero correlation. This implies that the inner product between genotypes and their interactions is nonzero, $X^\top W \neq 0$. To see this, we focus on a focal SNP $x_j$ and consider three different types of interactions:

  • Scenario I: Interaction between a focal SNP with itself ($x_j x_j$).

  • Scenario II: Interaction between a focal SNP with a different SNP ($x_j x_k$).

  • Scenario III: Interaction between a focal SNP with a pair of different SNPs ($x_k x_l$).

In the Materials and methods of the revised manuscript, we now provide derivations showing when we would expect nonzero correlation between X and W, which rely on the fact that: (1) we assume that genotypes have been mean-centered and scaled to have unit variance, and (2) under Hardy-Weinberg equilibrium, SNPs marginally follow a binomial distribution $x_j \sim \text{Bin}(2, p)$ where $p$ represents the minor allele frequency (MAF) (Wray et al. 2007, Genome Res; Lippert et al. 2013, Sci Rep). These new additions are given in new lines 460-485.
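For Scenario I, the two stated assumptions are enough to compute the covariance exactly. The sketch below is an illustration using the standard skewness of a binomial variable, not a reproduction of the manuscript's derivation; the function names and MAF values are hypothetical. Because the standardized genotype x has mean 0 and variance 1, cov(x, x·x) equals E[x³], which for Bin(2, p) is (1 − 2p)/√(2p(1 − p)).

```python
from math import comb, sqrt


def genotype_pmf(p):
    """P(g = 0, 1, 2) for a biallelic SNP under Hardy-Weinberg equilibrium."""
    return {g: comb(2, g) * p ** g * (1 - p) ** (2 - g) for g in (0, 1, 2)}


def cov_snp_with_own_square(p):
    """Exact cov(x, x*x) for the standardized genotype x = (g - 2p) / sqrt(2p(1-p))."""
    sd = sqrt(2 * p * (1 - p))
    pmf = genotype_pmf(p)
    # x has mean 0 and variance 1, so cov(x, x^2) reduces to E[x^3].
    return sum(prob * ((g - 2 * p) / sd) ** 3 for g, prob in pmf.items())


for p in (0.05, 0.25, 0.5):
    closed_form = (1 - 2 * p) / sqrt(2 * p * (1 - p))  # skewness of Bin(2, p)
    print(p, round(cov_snp_with_own_square(p), 4), round(closed_form, 4))
```

The covariance is nonzero whenever the MAF differs from 0.5 and grows as the allele becomes rarer. Scenarios II and III additionally depend on the LD between SNPs and are not covered by this sketch.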

Lastly, we agree with the reviewer that our results indicate that LDSC inflates estimates of SNP-based narrow-sense heritability. Our intuition for why this happens is largely consistent with the reviewer’s first point: since GWAS summary statistics can be inflated by the tagging of non-additive genetic variance, it makes sense that LDSC should overestimate heritability. LDSC uses a univariate regression without the inclusion of cis-interaction scores. A simple consequence of “omitted variable bias” is likely at work: since LDSC does not explicitly account for contributions from the tagged non-additive components, which also contribute to the variance in the GWAS summary statistics, the estimate for the coefficient $\sigma_A^2$ becomes slightly inflated.
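The omitted-variable argument can be illustrated with a noise-free toy regression. This is a generic sketch of omitted variable bias, not the LDSC or i-LDSC estimator: the six "scores", the intercept of 1, and the coefficients `tau` and `theta` are all hypothetical. When the response depends on two positively correlated regressors and only one is included, the univariate slope equals the true coefficient plus the omitted coefficient times the slope of the omitted regressor on the included one.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical toy scores for six SNPs (not real LD scores).
ld_score = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cis_score = [0.2, 0.1, 0.4, 0.3, 0.6, 0.5]
tau, theta = 0.02, 0.05  # hypothetical true coefficients

# Noise-free test statistics generated from both scores.
chi2 = [1.0 + tau * l + theta * f for l, f in zip(ld_score, cis_score)]

slope_univariate = ols_slope(ld_score, chi2)
expected = tau + theta * ols_slope(ld_score, cis_score)

print(f"true tau = {tau}, univariate slope = {slope_univariate:.5f}")
```

The size of the inflation scales with how strongly the two sets of scores covary, so weakly correlated scores would produce only a slight upward shift.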

How Much Epistasis Is i-LDSC Detecting?

I think the proper conclusion to be drawn from the authors' analyses is that statistically significant epistatic (non-additive) genetic variance was not detected. Specifically, I think that the analysis presented in Supplementary Table S6 should be treated as a main analysis rather than a supplementary one, and the results here show no statistically significant epistasis. Let me explain.

Most serious researchers, I think, treat LDSC as an unreliable estimator of narrow-sense heritability; it typically returns estimates that are too low. Not even the original LDSC paper pressed strongly to use the method for estimating $h^2$ (Bulik-Sullivan et al., 2015). As a practical matter, when researchers are focused on estimating absolute heritability with high accuracy, they usually turn to GCTA/GREML (Evans et al., 2018; Wainschtein et al., 2022).

One reason for low estimates with LDSC is that if SNPs with higher LD Scores are less likely to be causal or to have large effect sizes, then the slope of univariate LDSC will not rise as much as it ‘should’ with increasing LD Score. This was a scenario actually simulated by the authors and displayed in their Supplementary Figure S15. [Incidentally, the authors might have acknowledged earlier work in this vein. A simulation inducing a negative correlation between LD Scores and $\chi^2$ statistics was presented by Bulik-Sullivan et al. (2015, Supplementary Figure 7), and the potentially biasing effect of a correlation over SNPs between LD Scores and contributed genetic variance was a major theme of Lee et al. (2018).] A negative correlation between LD Score and contributed variance does seem to hold for a number of reasons, including the fact that regions of the genome with higher recombination rates tend to be more functional. In short, the authors did very well to carry out this simulation and to show in their Supplementary Figure S15 that this flaw of LDSC in estimating narrow-sense heritability is also a flaw of i-LDSC in estimating broad-sense heritability. But they should have carried the investigation at least one step further, as I will explain below.

Another reason for LDSC being a downwardly biased estimator of heritability is that it is often applied to meta-analyses of different cohorts, where heterogeneity (and possibly major but undetected errors by individual cohorts) lead to attenuation of the overall heritability (de Vlaming et al., 2017).

The optimal case for using LDSC to estimate heritability, then, is incorporating the LD-related annotation introduced by Gazal et al. (2017) into a stratified-LDSC (s-LDSC) analysis of a single large cohort. This is analogous to the calculation of multiple GRMs defined by MAF and LD in the GCTA/GREML papers cited above. When this was done by Gazal et al. (2017, Supplementary Table 8b), the joint impact of the improvements was to increase the estimated narrow-sense heritability of height from 0.216 to 0.534.

All of this has at least a few ramifications for i-LDSC. First, the authors do not consider whether a relationship between their interaction LD Scores and interaction effect sizes might bias their estimates. (This would be on top of any biasing relationship between standard LD Scores and linear effect sizes, as displayed in Supplementary Figure S15.) I find some kind of statistical relationship over the whole genome, induced perhaps by evolutionary forces, between cis-acting epistasis and interaction LD Scores to be plausible, albeit without intuition regarding the sign of any resulting bias. The authors should investigate this issue or at least mention it as a matter for future study. Second, it might be that the authors are comparing the estimates of broad-sense heritability in Table 1 to the wrong estimates of narrow-sense heritability. Although the estimates did come from single large cohorts, they seem to have been obtained with simple univariate LDSC rather than s-LDSC. When the estimate of $h^2$ obtained with LDSC is too low, some will suspect that the additional variance detected by i-LDSC is simply additive genetic variance missed by the downward bias of LDSC. Consider that the authors' own Supplementary Table S6 gives s-LDSC heritability estimates that are consistently higher than the LDSC estimates in Table 1. E.g., the estimated $h^2$ of height goes from 0.37 to 0.43. The latter figure cuts quite a bit into the estimated broad-sense heritability of 0.48 obtained with i-LDSC.

Here we come to a critical point. Lines 282-286 are not entirely clear, but I interpret them to mean that the manuscript's Equation 5 was expanded by stratifying into the components of s-LDSC and this was how the estimates in Supplementary Table S6 were obtained. If that interpretation is correct, then the scenario of i-LDSC picking up missed additive genetic variance seems rather plausible. At the very least, the increases in broad-sense heritability reported in Supplementary Table S6 are smaller in magnitude and not statistically significant. Perhaps what this means is that the headline should be a negligible contribution of pairwise epistasis revealed by this novel and ingenious method, analogous to what has been discovered with respect to dominance (Hivert et al., 2021; Pazokitoroudi et al., 2021; Okbay et al., 2022; Palmer et al., 2023).

This is an excellent question raised by the reviewer and, again, we really appreciate such a thoughtful and thorough response. First, we completely agree with the reviewer that the s-LDSC estimates previously included in the Supplementary Material should instead be discussed in the main text of the manuscript. In the revision, we have now moved the old Supplemental Table S6 to be the new Table 2. Second, we also agree that the conclusions about the magnitude of additive-by-additive effects should be based upon variance explained when using the cis-interaction score in addition to scores specific to different biological annotations when available, per s-LDSC.

However, we want to respectfully disagree that the results indicate a negligible contribution of additive-by-additive genetic variance to all the traits we analyzed (see Figure 4D). Although the additive-by-additive genetic variance component is not significant in any trait in the UK Biobank, there is little reason to expect that it would be, given the inclusion of 97 other biological annotations from the s-LDSC model. Indeed, in the s-LDSC paper itself, the authors look only for enrichment of heritability for a given annotation, not a statistically significant test statistic. It is also worth noting that jackknife approaches tend to be conservative and yield slightly larger standard errors for hypothesis testing. Taking all the great points that the reviewer mentioned into account, we believe that a moderate stance to the interpretation of our results is one that: (i) emphasizes the importance of using s-LDSC with the cis-interaction score to better assess the variance explained by additive-by-additive interaction effects and (ii) allows for the significance of the additive-by-additive component to not be the only factor when determining the importance of the role of non-additive effects in shaping trait architecture.

In the revision, we now write the following in lines 331-343:

“Lastly, we performed an additional analysis in the UK Biobank where the cis-interaction scores are included as an annotation alongside 97 other functional categories in the stratified-LD score regression framework and its software s-LDSC (Materials and methods). Here, s-LDSC heritability estimates still showed an increase with the interaction scores versus when the publicly available functional categories were analyzed alone, albeit at a much smaller magnitude (Table 2). The contributions from the additive-by-additive component to the overall estimate of genetic variance ranged from 0.005 for MCHC (P = 0.373) to 0.055 for HDL (P = 0.575) (Figures 4C and 4D). Furthermore, in this analysis, the estimates of the additive-by-additive components were no longer statistically significant for any of the traits in the UK Biobank (Table 2). Despite this, these results highlight the ability of the i-LDSC framework to identify sources of “missing” phenotypic variance explained in heritability estimation. Importantly, moving forward, we suggest using the cis-interaction scores with additional annotations whenever they are available as it provides more conservative estimates of the role of additive-by-additive effects on trait architecture.”

Lastly, in the Discussion, we now mention that an area of future work would be to explore how the relationship between cis-interaction LD scores and interaction effect sizes might bias heritability estimates from i-LDSC (e.g., similar to the relationship explored between standard LD scores and linear effect sizes in Figure 3 —figure supplement 8). See new lines 364-367.

Abstract: Here and elsewhere, the term cis is used. Parenthetically it is explained that a cis-interaction score captures interactions between a focal variant and nearby variants. Does this mean that i-LDSC only tries to estimate the variance contributed by pairwise interactions between nearby SNPs? That is, does the method say nothing about statistical interactions between SNPs on different chromosomes, say? The paper should make a clear statement about this point somewhere.

This is a great suggestion. We believe that cis in the genetics context is commonly used to refer to physically colocated loci on the same chromosome but have updated the description of the cis-interaction in the abstract (lines 29-30) to make this explicit.

Lines 54-58: I do not understand what is being asserted here. I think it might clear some things up to point out that the developers of LDSC did not rest content with the assumption that every SNP has an effect. E.g., they conducted simulations showing that the intercept and slope remain unbiased estimators, so long as certain other conditions are met, as the proportion of SNPs with a nonzero effect ranged from one all the way down to $10^{-4}$ (Bulik-Sullivan et al., 2015, Supplementary Figures 3 and 4).

We apologize for not being clearer here in the previous submission. We now highlight the initial set of simulations performed by Bulik-Sullivan et al. (2015) in testing the tolerance of the LDSC framework to violations of assumptions of the infinitesimal model in new lines 58-60.

Line 177: I suggest choosing a term other than ‘infinitesimal model’. Some kind of model where every SNP has an effect does not need to hold in order for standard LDSC to work.

We agree and have added that, if variance in a trait is determined only by additive effects, the model may also be polygenic (see new lines 185-186).

Line 505: ‘delete-one jackknife’ is confusing, because each block removed in a jackknife replicate contains more than one SNP. Leaving out the ‘delete-one’ is better.

We agree that this was confusing and have removed ‘delete-one’ from the manuscript.
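For readers unfamiliar with the procedure under discussion, a block jackknife removes one contiguous block of observations per replicate rather than a single observation. The sketch below is a generic illustration for a simple regression slope, not the i-LDSC implementation: the function names, the toy data, and the choice of four blocks are hypothetical, and it uses the usual equal-block variance formula (g − 1)/g · Σ(θ̂ᵢ − θ̄)².

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def block_jackknife_se(x, y, n_blocks):
    """Standard error of the OLS slope from a block jackknife.

    Assumption: observations are ordered (e.g. by genomic position) and are
    split into contiguous blocks of (nearly) equal size.
    """
    n = len(x)
    bounds = [round(i * n / n_blocks) for i in range(n_blocks + 1)]
    estimates = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        # Drop one whole block (many observations), then re-estimate.
        keep_x = x[:start] + x[stop:]
        keep_y = y[:start] + y[stop:]
        estimates.append(ols_slope(keep_x, keep_y))
    mean_est = sum(estimates) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((e - mean_est) ** 2 for e in estimates)
    return var ** 0.5


# Hypothetical toy data: a slope of 2 plus a small deterministic wiggle.
x = [float(i) for i in range(1, 21)]
y = [2.0 * a + 0.1 * ((7 * i) % 5 - 2) for i, a in enumerate(x)]
se = block_jackknife_se(x, y, n_blocks=4)
print(f"slope = {ols_slope(x, y):.4f}, block-jackknife SE = {se:.4f}")
```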

Lines 513-518: The justifications for the shortcuts in the computation of interaction LD Scores seem a bit handwavy to me. What about SNPs with which the j-th SNP is highly correlated? Is it possible to explain why these do not matter so much?

This is a great catch. The previous description of this step was not correct. In the revised version of the manuscript, this procedure is now described as follows (see new lines 589-593):

“In practice, cis-interaction LD scores in i-LDSC can be computed efficiently through realizing two key opportunities for optimization. First, given $J$ SNPs, the full matrix of genome-wide interaction effects W contains on the order of $J(J - 1)/2$ total pairwise interactions. However, to compute the cis-interaction score for each SNP, we can simply replace the full W matrix with a subsetted matrix $W_j$ which includes only interactions involving the j-th SNP.”
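The saving described in this passage is easy to quantify. The sketch below only counts columns and is not the i-LDSC software: both helper functions are hypothetical, and 5,000 SNPs is used purely as an example size.

```python
def n_pairwise_interactions(n_snps):
    """Number of distinct SNP pairs among n_snps SNPs: J(J - 1) / 2."""
    return n_snps * (n_snps - 1) // 2


def interactions_involving(j, n_snps):
    """Index pairs (j, k) with k != j, i.e. the columns a subsetted W_j would keep."""
    return [(min(j, k), max(j, k)) for k in range(n_snps) if k != j]


n_snps = 5_000  # example size only
total_pairs = n_pairwise_interactions(n_snps)
pairs_for_one_snp = len(interactions_involving(0, n_snps))
print(total_pairs, pairs_for_one_snp)
```

For this example the full W would hold 12,497,500 interaction columns, whereas the subset for a single focal SNP holds 4,999.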

Line 615: Now here is a variance decomposition which, I think, is truly questionable. Why does it make sense to say that the variance attributable to statistical interaction between genetic and non-genetic variables is part of the broad-sense heritability?

We define the broad-sense heritability here to be the phenotypic variance explained by any genetic effects, even if there is a genetic effect with an environmental variable. The initial set of revisions suggested a model where a proportion of the phenotypic variation was explained by G×Ancestry or G×E effects. That previous reviewer argued that, although these are not explicitly included in foundational models of variance decomposition, they do offer a useful insight into how i-LDSC estimates might be affected by the presence of genetic interactions with environmental variables.

Figure S8: What makes windows of 25 or 50 SNPs for calculating interaction LD Scores the best for getting an estimate of the broad-sense heritability with i-LDSC? Why does a smaller window lead to inflation? This is not fully explained. I am sometimes fine with trying several values of an adjustable setting and picking whatever seems to work the best, but here I am more than a bit curious.

We selected the 50 SNP windows for our analysis of empirical traits in the UK Biobank and Biobank Japan because it mitigated any upward bias in the estimates of total heritability explained. We believe that smaller windows provide noisier estimates because they consider only a few interactions and are thus prone to over-estimate the impact of additive-by-additive effects between a focal variant and the rest of the genome.

Figure S16: Is this a duplication of Figure 4D?

Great catch. Indeed, these two figures were duplicated. Since we have updated Figure 4D to illustrate a different set of parameter estimates with s-LDSC, this comparison of the intercepts in BioBank Japan and the UK Biobank is now only included as the new Figure 4 —figure supplement 1B.

[Editors’ note: what follows is the authors’ response to the third round of review.]

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below. If you think you are unable to fully address the reviewer's comments (which pertain to similar issues identified in the last round of reviews), we would advise you not to submit a revised manuscript.

Reviewer #1 (Recommendations for the authors):

1. I don't have issues with the authors' derivations but I do still think the authors have not yet successfully related their variance decomposition to the standard variance decomposition in classical quantitative genetics. Here's a simple example demonstrating this.

For simplicity, let the generative model be $y = X_1\beta_1 + X_2\beta_2 + W\theta + e$, where $X_1$ and $X_2$ are independent and standardized and $W = \text{standardize}(X_1 X_2)$, denoting $r_1 = \text{cov}(X_1, W) \neq 0$ and $r_2 = \text{cov}(X_2, W) \neq 0$, with $\text{cov}(X_1, X_2) = 0$, $e$ independent of everything else, and $y$ having zero mean and unit variance.

The classical additive genetic variance will be the R-squared from regressing $y$ on $X_1$ and $X_2$:

$$V_g = \max_b \text{cor}(y, X_1 + X_2 b)^2 = \max_b \left[\frac{\text{cov}(y, X_1 + X_2 b)^2}{1 + b^2}\right] = \max_b \left[\frac{\text{cov}(X_1\beta_1 + X_2\beta_2 + W\theta + e,\; X_1 + X_2 b)^2}{1 + b^2}\right] = \max_b \left[\frac{(\beta_1 + \theta r_1 + \beta_2 b + \theta r_2 b)^2}{1 + b^2}\right]$$

Doing some calculus, we get $V_g = (\beta_1^2 + \beta_2^2) + 2\theta(\beta_1 r_1 + \beta_2 r_2) + \theta^2(r_1^2 + r_2^2)$. So the classical genetic variance is larger than the additive variance component in the authors' model (here $\beta_1^2 + \beta_2^2$)! It will include the part of the epistatic component tagged by additive SNPs X, which is nonzero despite Xβ and Wθ being orthogonal.
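The "some calculus" step can be checked numerically. Writing a = β1 + θ·r1 and c = β2 + θ·r2, the maximum of (a + c·b)²/(1 + b²) over b is a² + c² by the Cauchy-Schwarz inequality, attained at b = c/a, and expanding a² + c² gives the expression quoted above. The sketch below compares a grid search with that closed form; the parameter values are hypothetical and are treated simply as given inputs.

```python
def r_squared_direction(b, beta1, beta2, theta, r1, r2):
    """cov(y, X1 + b*X2)^2 / (1 + b^2) for the reviewer's toy model."""
    cov = (beta1 + theta * r1) + (beta2 + theta * r2) * b
    return cov ** 2 / (1 + b ** 2)


def classical_additive_variance(beta1, beta2, theta, r1, r2):
    """Closed form quoted by the reviewer."""
    return ((beta1 ** 2 + beta2 ** 2)
            + 2 * theta * (beta1 * r1 + beta2 * r2)
            + theta ** 2 * (r1 ** 2 + r2 ** 2))


# Hypothetical parameter values, used only for illustration.
params = dict(beta1=0.3, beta2=0.2, theta=0.4, r1=0.1, r2=-0.05)

grid = [i / 1000 for i in range(-5000, 5001)]  # candidate values of b
grid_max = max(r_squared_direction(b, **params) for b in grid)
closed = classical_additive_variance(**params)

print(f"grid maximum = {grid_max:.6f}, closed form = {closed:.6f}")
```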

I imagine you should be able to figure out the classical additive genetic variance (and epistatic etc) from the authors' model, though this isn't present in their current parametrization. If you can do that, I think it's far easier to interpret.

We thank the reviewer for continuing to encourage us to connect our random effect (linear mixed model) parameterization with the classical form of additive genetic variance. As the reviewer points out, the concept of additive genetic effects partially explaining non-additive variation has also been described in classical quantitative genetics (Hill et al. 2008, PLOS Genetics; Hivert et al. 2021, Am J Hum Genet; Mäki-Tanila and Hill 2014, Genetics). In the revised version of the manuscript, we now include the following connection between our model and classical quantitative genetics (see subsection “Connection to quantitative genetics theory”, subsection “Full derivation of interaction LD score regression” and subsection “Overview of the interaction-LD score regression model”).

We hope that the combination of these new derivations better relate our variance decomposition to the standard variance decomposition in classical quantitative genetics.

2. I'm not satisfied with the authors' response to the editors' comment here:

"If these simulations were conducted for sparse architectures, we would have likely seen a greater impact on i-LDSC; although, we want to note that we have already shown the LD score regression framework to be uncalibrated for traits with sparse genetic architectures (see Figure 3 —figure supplement 8)."

The issue here isn't that the results will be biased; LDSC regression is biased for estimating $h^2$ with missing variants, but this bias is easy to understand: missing causal variants lead to lower estimates. The question for your method is what happens to the epistatic variance component. It's fine if it's attenuated; it's not fine if it leads to a false positive signal of epistasis. Simulations to clarify this are necessary.

This is a good point brought up by the reviewer. We apologize for not considering this in the previous submission. In the newly revised manuscript, we now include an additional simulation scenario where we generate synthetic traits but run i-LDSC where some number of causal SNPs involved in pairwise interactions are not observed (see new lines 282, 292-296, and 861-879, and the new Figure 3 —figure supplement 10 and 11). In the text, Materials and methods, we describe this simulation scenario as the following:

“In this set of simulations, we extend the polygenic case to a setting where a portion of the variants involved in genetic interactions are unobserved. Similar to the case with unobserved additive effects, the purpose of these simulations is to assess whether the i-LDSC framework is prone to false discovery of non-additive genetic variance when causal interacting SNPs are not included during the estimation of GWAS summary statistics. In each simulation, we generated haplotypes that each contain 5,000 variants. Traits were simulated using the generative model in Eq. (35) with both additive and interaction effects such that $\mathbb{V}[X\beta] + \mathbb{V}[W\theta] = H^2$. Here, every SNP in the genome had at least a small additive effect with a corresponding effect size that was drawn to be inversely proportional to its MAF. Only 1% or 5% of variants within each haplotype had causal non-zero interaction effects. However, when running i-LDSC, only a percentage of the interacting SNPs {1%, 5%, 10%, 25%, or 50%} were included in the estimation of $\hat{\vartheta}$. We once again generate traits with heritability $H^2 = \{0.3, 0.6\}$ such that the proportion of genetic variance explained by additive effects was equal to $\rho = \{0.5, 0.8\}$. As with the other simulation scenarios, all synthetic traits were generated using UK Biobank genotyped variants that passed initial preprocessing and quality control (see next section). Since not all of these SNPs are HapMap3 SNPs, some variants were omitted from the i-LDSC regression analyses. Overall, as discussed in the main text with results taken over 100 replicates, i-LDSC underestimated values of $\hat{\vartheta}$ when there were unobserved interacting variants (see Figure 3 —figure supplement 10 and 11). As expected, estimates of the additive variance component $\hat{\tau}$, on the other hand, were not affected.”

Reviewer #2 (Recommendations for the authors):

I focus my review in this round on the issues raised by the editor and also on a point of my own: accepting the results at face value, how are we to interpret their statistical significance and importance?

Orthogonal decomposition of the genetic variance

I do not think the authors did much to clarify whether their partition of the genetic variance is an orthogonal decomposition in accord with traditional quantitative-genetic theory. Looking at their line 138 and Equation 8, I see that they make use of the assumption that the additive effects and interaction coefficients are independent random variables with zero means. This kind of assumption has become common in statistical genetics (e.g., Yang et al., 2010), and I'm pretty sure the authors think they are simply following the lead of the original LDSC paper. But, as a few papers have pointed out (e.g., de los Campos et al., 2015), this assumption does not make a lot of sense since it is properties of individuals (genotypes, residuals) that are random rather than the properties of SNPs. And even if we are taking an average over different locations in the genome rather than over a population of individuals, why should the mean of the betas be zero? If we always count the derived allele, then the mean might well be negative if there is any tendency for mutations to have a depressing effect on the trait in question. Also, why independent? The recent report that SNPs with larger main effects tend to exhibit more dominance (Palmer et al., 2023) might lead us to expect that pairs of SNPs with larger main effects might show stronger statistical interactions.

Despite the fact that the ‘symbolic convention’ of E[β]=0 has often been used to justify surprisingly robust tools, I think that it should no longer be retained in statistical genetics. It is confusing, it might be used in the future in an attempt to justify results that are really far off, and it is known already to prove too much. The supposed proofs of GCTA and LDSC seem to show that they are definitely unbiased, but they are in fact biased under conditions that cannot be explained in terms of this genetic-effects-as-random-variables notion.

My strong suggestion is to justify the orthogonality of Xβ and Wθ in some different way. One such way might be to prove that Equation 1 in my first review is an orthogonal decomposition, treating the betas and thetas as fixed constants and the X's as random variables. I can see intuitively why this is the case. Suppose that $X_1$ and $X_2$ are highly or even perfectly correlated. If $X_1$ is mean-centered, then it should be uncorrelated with the square of mean-centered $X_2$. Roughly speaking, each $x_1$ should be balanced by another of opposite sign, and since they'll both be weighted by the square of $X_2$ in the sum, they will cancel each other out.

Another way might be to add some caveats. Something like: "The appropriateness of treating genetic effects as random variables in analytical derivations has been questioned. Later, we will justify the theory presented here with simulation results showing that i-LDSC accurately recovers the magnitude of the epistatic variance in Eq. (1) under a broad range of conditions."

We appreciate the reviewer for bringing up these important points. Indeed, when formulating the i-LDSC model, we wanted to follow the lead of the original LDSC paper to draw more explicit parallels with our approach. Taking this into account and following the suggestion of the reviewer, we now include caveats in the text when assuming that the additive effects and interaction coefficients are independent random variables with zero means in the generative model. In the revision, we now write the following in lines 131-134:

“While the appropriateness of treating genetic effects as random variables in analytical derivations has been questioned (de los Campos et al. 2015, PLOS Genet), later, we will justify the theory presented here with simulation results showing that i-LDSC accurately recovers non-additive genetic variance in Eq. (1) under a broad range of conditions.”

We also include a similar caveat in the Materials and methods (see new lines 456-459):

“As we mentioned in the main text, we recognize that the appropriateness of treating genetic effects as random variables in analytical derivations has been questioned (de los Campos et al. 2015, PLOS Genet), but our simulation studies show that i-LDSC accurately recovers non-additive genetic variance in Eq. (7) under a broad range of conditions.”

In reference to the reviewer’s first point about needing to clarify how our partition of variance relates to traditional quantitative genetics theory, we also want to highlight our response to Reviewer #1 above, where we now derive a formal connection with our model (see also subsection “Connection to quantitative genetics theory”, subsection “Full derivation of interaction LD score regression” and subsection “Overview of the interaction-LD score regression model” in the newly revised manuscript).

A related suggestion is to add a qualifier at some point in the text to the use of the term "additive-by-additive" or the symbol $\sigma^2_{AA}$. The authors do not really explain the relationship between their epistatic variance component associated with product terms and the traditional $\sigma^2_{AA}$, $\sigma^2_{AD}$, and $\sigma^2_{DD}$. Or, instead of a special qualifier, maybe use a different term and symbol altogether. Maybe the term should be "epistatic" or "pairwise-epistatic." Sometimes the symbol $\sigma^2_I$ is used to denote epistatic genetic variance, the I standing for "interaction" in the statistical sense (e.g., Falconer and Mackay, 1996). This seems less potentially misleading.

This is a great suggestion. In the newly revised manuscript, we now reserve the term “additive-by-additive” and the notation $\sigma^2_A$ and $\sigma^2_{AA}$ for when we make connections between our random effect model and theoretical concepts in classical quantitative genetics (again, see subsection “Connection to quantitative genetics theory”, subsection “Full derivation of interaction LD score regression” and subsection “Overview of the interaction-LD score regression model”). Following the reviewer’s points, we also use a different variable for the regression coefficient in the i-LDSC model, changing the old $\hat{\sigma}^2_{AA}$ to a new symbol $\hat{\vartheta}$. The i-LDSC model is now formulated as the following (e.g., see new Eqs. (5), (27), and (28)):

$$\mathbb{E}[\chi^2] \propto \ell\tau + f\vartheta + 1$$

We describe the coefficient $\hat{\vartheta}$ as an estimate of the proportion of phenotypic variation explained by tagged “pairwise interaction” effects. We hope that the combination of these changes is less misleading for readers.
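To make the shape of this regression concrete, the following is a minimal sketch, not the i-LDSC implementation. It reads the model as a linear regression of per-SNP chi-square statistics on LD scores (coefficient τ), cis-interaction LD scores (coefficient ϑ), and a constant. The function name, the toy inputs, the Gaussian noise, and the use of unweighted least squares are all assumptions made for illustration; the published software may weight observations and scale terms differently.

```python
import numpy as np


def fit_two_score_regression(chi2, ld_scores, interaction_scores):
    """Unweighted least squares of chi-square statistics on two scores and a constant."""
    design = np.column_stack(
        [ld_scores, interaction_scores, np.ones_like(ld_scores)]
    )
    coef, *_ = np.linalg.lstsq(design, chi2, rcond=None)
    tau_hat, vartheta_hat, intercept_hat = coef
    return tau_hat, vartheta_hat, intercept_hat


# Hypothetical toy inputs; none of these numbers come from the paper.
rng = np.random.default_rng(0)
n_snps = 5000
ld_scores = rng.gamma(shape=2.0, scale=50.0, size=n_snps)
interaction_scores = rng.gamma(shape=2.0, scale=10.0, size=n_snps)
true_tau, true_vartheta = 0.01, 0.05

# Assumption: the proportionality is treated as an equality, with any
# scaling constants absorbed into the two coefficients.
expected_chi2 = ld_scores * true_tau + interaction_scores * true_vartheta + 1.0
chi2 = expected_chi2 + rng.normal(scale=0.2, size=n_snps)  # illustrative noise only

tau_hat, vartheta_hat, intercept_hat = fit_two_score_regression(
    chi2, ld_scores, interaction_scores
)
print(f"tau_hat={tau_hat:.4f}, vartheta_hat={vartheta_hat:.4f}, "
      f"intercept={intercept_hat:.3f}")
```

In this toy setting the two slopes are recovered separately because the two scores are generated independently; with real LD scores and cis-interaction LD scores the regressors may be correlated, which would widen the uncertainty on both coefficients.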

Might interaction LD scores better tag single variants, as in the retracted paper of Hemani et al.?

I am satisfied with the response of the authors, based on their new simulations.

We really appreciate the suggestions for the additional simulation studies during the previous round of review.

The statistical significance and substantive importance of the estimated epistatic variance components

Previously, the authors put forth their heritability estimates with no functional annotations as their main results and treated their estimates with functional annotations as supplementary. I argued that the latter estimates should be treated as the main. In their revision, the authors put both sets of results in their main text. I suppose that this is a reasonable compromise.

The authors argue that the results with functional annotations (i.e., s-LDSC) cannot be dismissed even though they are not statistically significant. They point out in their reply that the standard error calculated by LDSC with the block jackknife is conservative. I had neglected this point, with which I agree. The block-jackknife standard error seems not to vanish with infinite sample size, probably because there are inherent differences between blocks of the genome. The sample size used by the authors (~350,000) might be large enough for this feature of LDSC to be a problem.

Sometimes the LDSC developers in a situation like this will take an average estimate over all traits in an analysis and then use the block jackknife to calculate the significance of it. Perhaps the authors here could calculate the average percentage increase in the heritability; this might turn out to be better powered. But this is just a suggestion, and the authors do not need to take the trouble in order to secure my assent to publication.

We thank the reviewer for pointing this out. We were unaware that sometimes the LDSC developers will take an average estimate over all traits in an analysis and then use the block jackknife to calculate statistical significance. While we elected not to do that here, we will keep this strategy in mind for future work. We also added the following statement referencing the possibility of this approach in new lines 639-643:

“It is worth noting that the block-jackknife approach tends to be conservative and yield larger standard errors for hypothesis testing (Efron 1982). As an alternative, we could first run i-LDSC using the block-jackknife procedure over all traits in a study and then use the average of the standard errors to calculate the statistical significance of coefficient estimates; but we do not explore this strategy here and leave that for future work.”
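For readers unfamiliar with the procedure discussed in this exchange, the following is a generic delete-one-block jackknife sketch for the standard error of regression coefficients. It is not necessarily the exact variant implemented in LDSC or i-LDSC; the function names, the number of blocks, and the toy data are assumptions made for illustration.

```python
import numpy as np


def ols_slope_intercept(x, y):
    """Least-squares fit of y on x; returns [slope, intercept]."""
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def block_jackknife_se(estimator, x, y, n_blocks):
    """Delete-one-block jackknife standard errors over contiguous blocks."""
    n = len(x)
    blocks = np.array_split(np.arange(n), n_blocks)
    leave_one_out = []
    for block in blocks:
        keep = np.ones(n, dtype=bool)
        keep[block] = False  # drop one contiguous block, refit on the rest
        leave_one_out.append(estimator(x[keep], y[keep]))
    leave_one_out = np.asarray(leave_one_out)
    centered = leave_one_out - leave_one_out.mean(axis=0)
    variance = (n_blocks - 1) / n_blocks * (centered ** 2).sum(axis=0)
    return np.sqrt(variance)


# Hypothetical toy data: independent observations with a known slope.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

se = block_jackknife_se(ols_slope_intercept, x, y, n_blocks=50)
analytic_slope_se = 1.0 / np.sqrt(n)  # approximate, since Var(x) and noise variance are about 1
print(f"jackknife slope SE={se[0]:.4f}, analytic slope SE={analytic_slope_se:.4f}")
```

With independent toy observations the jackknife and analytic standard errors should be of similar size. In genomic data, blocks are contiguous stretches of the genome, so between-block heterogeneity can enter the jackknife variance, which is the behaviour the reviewer describes.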

Associated Data

    Data Citations

    1. Smith S, Darnell G, Udwin D, Stamp J, Harpak A, Ramachandran S, Crawford L. 2023. Replication Data for: Discovering non-additive heritability using additive GWAS summary statistics. Harvard Dataverse.

    Supplementary Materials

    Supplementary file 1. Comparison of LDSC and i-LDSC estimates of the proportion of phenotypic variance explained (PVE) by genetic effects (i.e., estimated heritability) when the true heritability is set to $H^2 = 0.3$ for polygenic traits.

    Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e., creating a polygenic trait architecture). Next, we randomly select a set of interacting variants and divide them into two groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs, which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency, $\alpha = 0$ (see Materials and Methods). Here, we assume a heritability $H^2 = 0.3$ and vary the proportion contributed by additive effects with $\rho = \{0.2, 0.4, 0.6, 0.8\}$. We run i-LDSC while computing the cis-interaction LD scores using different estimating windows of ± 5, ± 10, ± 25, and ± 50 SNPs. The “average” column represents results using model averaging over the different estimating windows (see Materials and Methods). We report the mean estimates of heritability (with standard errors in parentheses) and use mean absolute error (MAE) to quantify the difference between the two methods. Results are based on 100 simulations per parameter combination. As shown in Figure 3—figure supplements 3 and 1, LDSC does not capture the contribution of non-additive genetic effects to trait variation.

    elife-90459-supp1.xlsx (10.3KB, xlsx)
    Supplementary file 2. Comparison of LDSC and i-LDSC estimates of the proportion of phenotypic variance explained (PVE) by genetic effects (i.e., estimated heritability) when the true heritability is set to $H^2 = 0.6$.

    Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e., creating a polygenic trait architecture). Next, we randomly select a set of interacting variants and divide them into two interacting groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs, which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency, $\alpha = 0$ (see Materials and Methods). Here, we assume a heritability $H^2 = 0.6$ and vary the proportion contributed by additive effects with $\rho = \{0.2, 0.4, 0.6, 0.8\}$. We run i-LDSC while computing the cis-interaction LD scores using different estimating windows of ± 5, ± 10, ± 25, and ± 50 SNPs. The “average” column represents results using model averaging over the different estimating windows (see Materials and Methods). We report the mean estimates of heritability (with standard errors in parentheses) and use mean absolute error (MAE) to quantify the difference between the two methods. Results are based on 100 simulations per parameter combination. As shown in Figure 3—figure supplements 3 and 1, LDSC does not capture the additional contribution of non-additive genetic effects to trait variation.

    elife-90459-supp2.xlsx (10.3KB, xlsx)
    Supplementary file 3. Abbreviations used throughout this study for 14 of the quantitative traits analyzed.

    The remaining 11 traits analyzed were Basophil count, Cholesterol, Eosinophil count, Height, Hematocrit, Hemoglobin, Lymphocyte count, Monocyte count, Neutrophil count, and Triglyceride levels. These are not abbreviated in the main text.

    elife-90459-supp3.xlsx (9.8KB, xlsx)
    Supplementary file 4. Trait-specific α parameters for each of the 25 traits analyzed.

    Here, α values are used to weight each variant based on its minor allele frequency to account for frequency-dependent architectures in each trait. The ∗ indicates α parameters that were taken directly from Schoech et al. The α parameters for other traits were calculated using the protocol used in that paper. Expansions of trait abbreviations are given in Supplementary file 3.

    elife-90459-supp4.xlsx (10KB, xlsx)
    Supplementary file 5. Number of individuals and total SNPs included in the analysis of each trait in BioBank Japan.
    elife-90459-supp5.xlsx (10.3KB, xlsx)
    MDAR checklist
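The simulation design described for Supplementary files 1 and 2 can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' simulation code. It uses randomly generated genotypes instead of UK Biobank data, defines the interaction window in SNP indices rather than kilobases, draws effects on centered-and-scaled genotypes without attempting to reproduce the α parameterization, and uses hypothetical names (`simulate_trait`, `rescale`, `frac_group1`, `window`).

```python
import numpy as np


def rescale(component, target_variance):
    """Center a vector and rescale it to have exactly the target sample variance."""
    centered = component - component.mean()
    return centered * np.sqrt(target_variance / centered.var())


def simulate_trait(genotypes, h2, rho, frac_group1=0.1, window=5, rng=None):
    """Simulate a trait with additive and pairwise cis-interaction components."""
    rng = np.random.default_rng() if rng is None else rng
    n, p = genotypes.shape
    x = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)

    # Every SNP has an additive effect (polygenic architecture).
    additive = x @ rng.normal(size=p)

    # Group #1: a random fraction of SNPs; group #2: their index neighbours.
    n_group1 = max(1, int(frac_group1 * p))
    group1 = rng.choice(p, size=n_group1, replace=False)
    interaction = np.zeros(n)
    for j in group1:
        for k in range(max(0, j - window), min(p, j + window + 1)):
            if k != j:
                interaction += rng.normal() * x[:, j] * x[:, k]

    noise = rng.normal(size=n)

    # Split total heritability h2 into additive (rho) and interaction (1 - rho) parts.
    additive = rescale(additive, rho * h2)
    interaction = rescale(interaction, (1.0 - rho) * h2)
    noise = rescale(noise, 1.0 - h2)
    return additive + interaction + noise, additive, interaction


# Hypothetical toy genotypes: 500 individuals, 200 SNPs.
rng = np.random.default_rng(2)
maf = rng.uniform(0.05, 0.5, size=200)
genotypes = rng.binomial(2, maf, size=(500, 200)).astype(float)
y, additive, interaction = simulate_trait(genotypes, h2=0.3, rho=0.8, rng=rng)
print(additive.var(), interaction.var(), y.var())
```

Each component is rescaled to its target sample variance separately, so the total trait variance is only approximately 1 because the components may have small sample covariances with one another.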

    Data Availability Statement

    Source code and tutorials for implementing interaction-LD score regression via the i-LDSC package are written in Python and are publicly available online on GitHub (copy archived at Crawford and Smith, 2024). Files of LD scores, cis-interaction LD scores, and GWAS summary statistics used for our analyses of the UK Biobank and BioBank Japan can be downloaded from the Harvard Dataverse. All analyses with the traditional and stratified LD score regression frameworks (LDSC and s-LDSC) were run using the default settings, unless otherwise stated in the main text. Source code for these approaches was downloaded from https://github.com/bulik/ldsc (Bulik-Sullivan et al., 2020). When applying s-LDSC, we used 97 functional annotations from Gazal et al., 2017 to estimate heritability. Data from the UK Biobank Resource (Bycroft et al., 2018) was made available under Application Numbers 14649 and 22419. Data can be accessed by direct application to the UK Biobank.

    The following dataset was generated:

    Smith S, Darnell G, Udwin D, Stamp J, Harpak A, Ramachandran S, Crawford L. 2023. Replication Data for: Discovering non-additive heritability using additive GWAS summary statistics. Harvard Dataverse.


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
