Abstract
The relative contributions of additive versus non-additive interactions in the regulation of complex traits remains controversial. This may be in part because large-scale epistasis has traditionally been difficult to detect in complex, multi-cellular organisms. We hypothesized that it would be easier to detect interactions using mouse chromosome substitution strains that simultaneously incorporate allelic variation in many genes on a controlled genetic background. Analyzing metabolic traits and gene expression levels in the offspring of a series of crosses between mouse chromosome substitution strains demonstrated that inter-chromosomal epistasis was a dominant feature of these complex traits. Epistasis typically accounted for a larger proportion of the heritable effects than those due solely to additive effects. These epistatic interactions typically resulted in trait values returning to the levels of the parental CSS host strain. Due to the large epistatic effects, analyses that did not account for interactions consistently underestimated the true effect sizes due to allelic variation or failed to detect the loci controlling trait variation. These studies demonstrate that epistatic interactions are a common feature of complex traits and thus identifying these interactions is key to understanding their genetic regulation.
Author summary
Most complex traits and diseases are regulated by the combined influence of multiple genetic variants. However, it remains controversial whether these genetic variants independently influence complex traits, and therefore the impact of each variant could be simply added together (additivity), or whether the variants work together to influence trait variation, in which case the combined impact of multiple variants would differ from the summed impact of each individual variant (epistasis). In this study in mice, we discovered that the genetic regulation of blood sugar levels and gene expression in the liver were predominantly controlled by non-additive interactions, whereas body weight was predominantly controlled by additive interactions. Remarkably, the expression level of nearly 25% of all genes in the liver was controlled by non-additive interactions. The non-additive interactions typically acted to return trait values to the levels detected in control mice, thus contributing to a reduction in trait variation. We also demonstrated that not accounting for non-additive interactions significantly underestimated the phenotypic effect of a genetic variant on a particular genetic background, suggesting that many previously identified risk loci may have significantly larger effects on disease susceptibility in a subset of individuals. These studies highlight the importance of understanding interactions between genetic variants to better understand disease risk and personalize clinical care.
Introduction
The genetic basis of complex traits and diseases results from the combined action of many genetic variants [1]. However, it remains unclear whether these variants act individually in an additive manner or via non-additive epistatic interactions. Epistasis has been widely observed in model organisms such as S. cerevisiae [2,3], C. elegans [4], D. melanogaster [5] and M. musculus [6]. However, it has been more difficult to detect in humans, potentially due to their diverse genetic backgrounds, low allele frequencies, limited sample sizes, complexity of interactions, insufficient effect sizes, and methodological limitations [7,8]. Nonetheless, a number of genome-wide interaction-based association studies in humans have provided evidence for epistasis in a variety of complex traits and diseases [9–15]. However, concerns remain over whether observed epistatic interactions are due to statistical or experimental artifacts [16,17].
To better understand the contribution of epistasis to complex traits, we studied mouse chromosome substitution strains (CSSs) [18]. For each CSS, a single chromosome in a host strain is replaced by the corresponding chromosome from a donor strain. This provides an efficient model for mapping quantitative trait loci (QTLs) on a fixed genetic background. This is in contrast to populations with many segregating variants such as advanced intercross lines [19], heterogeneous stocks [20], or typical analyses in humans. Given the putative importance of genetic background effects in complex traits [21,22], we hypothesized the fixed genetic backgrounds of CSSs can provide a novel means for detecting genetic interactions on a large-scale [18,23]. Previous studies of CSSs with only a single substituted chromosome suggested that non-additive epistatic interactions between loci were a dominant feature of complex traits [6]. However, to identify the interacting loci, or at least their chromosomal locations, requires the analysis of genetic variation in multiple genomic contexts [24]. We thus extended the analysis of single chromosome substitutions by analyzing a series of CSSs with either one or two substituted chromosomes, collectively representing the pairwise interactions between genetic variants on the substituted chromosomes. This experimental design can directly identify and map loci that are regulated by epistasis by analyzing the phenotypic effects of genetic variants on multiple fixed genetic backgrounds. Here we report the widespread effects of epistasis in controlling complex traits and gene expression. The detection of true epistatic interactions will improve our understanding of trait heritability and genetic architecture as well as provide insights into the biological pathways that underlie disease pathophysiology [25]. Knowing about epistasis will also be essential for guiding precision medicine-based decisions by interpreting specific variants in appropriate contexts.
Results
Contribution of epistasis to metabolic traits
Body weight and fasting plasma glucose levels were measured in a total of 766 control and CSS mice (S1 and S2 Tables, S1 Fig). The CSSs included 240 mice that were heterozygous for one A/J-derived chromosome and 444 mice that were heterozygous for two different A/J-derived chromosomes, both on otherwise B6 backgrounds. The CSSs with two A/J-derived chromosomes represented all pairwise interactions between the individual A/J-derived chromosomes. For example, comparisons were made between strain B6, strains (B6.A3 x B6)F1 and (B6 x B6.A10)F1 which were both heterozygous for a single A/J-derived chromosome (Chr. 3 and 10, respectively), and strain (B6.A3 x B6.A10)F1 which was heterozygous for A/J-derived chromosomes 3 and 10 (S2 Fig). A complete list of the strains analyzed is shown in S2 Table. Quantitative trait loci (QTLs) were identified for both body weight and plasma glucose levels that were due to main effects and interaction effects. Of note, due to the nature of the CSS experimental design, the regions defined by the identified QTLs correspond to the entire substituted chromosome and contain many allelic variants that may contribute to trait regulation. Additionally, due to the study design, only QTLs with dominant or semi-dominant effects could be assessed.
Joint F-tests for main effects on body weight indicated that the chromosome substitutions influenced body weight (males p = 0.0028; females p = 0.0008; meta p = 1.4e-05). Similarly, joint F-tests tests for main effects on plasma glucose levels demonstrated a significant effect of the chromosome substitutions (males p = 0.0082; females p = 0.00011; meta p = 1.4e-05). QTLs with main effects on body weight were mapped to chromosomes 8 (main effect: 1.23g; average effect: 1.02g) and 17 (main effect: -1.13g; average effect: -1.11g) (S3 Table). Note that we define main effects as the effect of a chromosome substitution as estimated by a model which includes all pairwise interaction terms, thus taking into account context-dependent genetic background effects. In contrast, the average effect is estimated using a model that does not include any interaction terms; the latter is similar to the analyses performed in a typical GWAS study. QTLs with main effects on fasting glucose were mapped to chromosomes 3 (main effect: 25.0 mg/dL; average effect: 9.61 mg/dL), 5 (main effect: 15.6 mg/dL; average effect: 6.02 mg/dL), and 4 (main effect: 17.5 mg/dL; average effect: 6.61 mg/dL) (S3 Table).
Joint F-tests for interaction effects on body weight were not significant (males p = 0.19; females p = 0.83; meta p = 0.44), and therefore epistatic interactions on body weight were not further investigated. However, joint F-tests for interaction effects on plasma glucose demonstrated the importance of epistasis in regulating this trait (males p = 0.002; females p = 0.003; meta p = 8.99e-05). In fact, among the males and females respectively, epistasis accounted for 43% (95% confidence interval: 23%-75%) and 72% (95% confidence interval: 37%-97%) of the heritable effects on plasma glucose levels. The discrepant results for the contribution of interactions to body weight and plasma glucose are likely reflected in the difference between whether QTLs for these traits were detected using the main effect model or the average effect model (S3 Table). For plasma glucose, only 1 of the 3 QTLs identified using the main effect model was also identified using the average effect model, and no new QTLs were identified with the average effect model. In contrast, both of the QTLs for body weight identified using the main effect model were also identified using the average effect model, and 2 new QTLs were identified on chromosomes 6 and 10. This suggests that for a trait regulated by epistatic interactions, the ability to successfully identify QTLs is greatly enhanced by accounting for these interactions. However, for a trait regulated primarily by additive effects, a model incorporating interactions can be detrimental to QTL identification.
To identify specific epistatic interactions, we tested explicit hypotheses for inter-chromosomal pairwise interactions on plasma glucose levels. Among the 15 CSS crosses analyzed, 5 crosses demonstrated inter-chromosomal epistatic interactions that altered plasma glucose levels (Fig 1, S3 and S4 Figs). Interestingly, in all 5 crosses demonstrating interactions, one chromosome substitution increased fasting glucose levels relative to the control B6 strain. These main effects raised plasma glucose levels by an average of 12.3 mg/dL in males and 17.8 mg/dL in females. However, in all 5 observed interactions the average plasma glucose levels in the double CSSs were closer to the control B6 strain than any single CSS was. Furthermore, in 4 of the 5 interactions, the plasma glucose levels in the double CSS did not differ statistically from the control strain B6 (p value > 0.1). Thus, the chromosome substitution driving the increase in plasma glucose on a B6 background had no effect on glucose levels when the genetic background was altered by the second chromosome substitution.
Regulation of gene expression by epistasis
As hepatic gluconeogenesis is a key determinant of plasma glucose levels in healthy insulin-sensitive mice [26], the hepatic gene expression patterns of control and CSS male mice were analyzed to better understand the molecular mechanisms underlying the epistatic regulation of plasma glucose. The RNA-Seq data was filtered for genes expressed in the liver, leaving 13,289 genes that were tested for differential expression associated with both main and interaction effects. A total of 6,101 main effect expression QTLs (meQTLs) were identified (FDR < 0.05) (Fig 2, S4 Table). Those meQTL genes located on the substituted chromosome were classified as cis-meQTLs (Fig 2, red) whereas the meQTL genes not located on the substituted chromosome were classified as trans-meQTLs (Fig 2, blue). Among all possible genes regulated by a cis-meQTL, on average 11.48% of these genes in each strain had a cis-meQTL (range: 5.54% - 22.09%) (S5 Table). Similarly, among all possible genes regulated by a trans-meQTL, on average 5.42% (range: 0.08% to 19.26%) of these genes were regulated by a trans-meQTL (S5 Table). The percentage of cis- and trans-meQTLs in each strain demonstrated a strong positive correlation (Spearman’s r = 1.0) but the proportion of cis-eQTLs was always greater than the proportion of trans-eQTLs. Strain (B6 x B6.A8)F1 had both the highest percentage of genes with cis-meQTLs (22.09%) and trans-meQTLs (19.26%), whereas strain (B6 x B6.A5)F1 had both the lowest percentage of genes with cis-meQTLs (5.54%) and trans-meQTLs (0.08%). This suggests that trans-meQTLs are being driven by the cumulative action of many cis-effects rather than a single or small number or major transcriptional regulators (S5 Fig). Among the genes regulated by a meQTL(s), 41.98% (1615 out of 3847) were regulated by multiple meQTLs (Range: 2–6) (S6 and S7 Tables). For example, Brca2 is regulated by 5 trans-meQTLs mapped to chromosomes 4, 6, 8, 10 and 14 (S6 Fig, S7 Table), demonstrating that hepatic Brca2 expression is regulated by allelic variation throughout the genome. In addition to the well-known role of Brca2 in breast cancer susceptibility, Brca2 has been implicated in hepatocellular carcinoma risk [27–29].
In addition to the meQTLs regulated by substitution of a single chromosome, the analysis of double CSSs enabled the detection of eQTLs with additive and interaction effects between the substituted chromosomes. The expression of Zkscan3 represents an example of additivity, with the substitution of A/J-derived chromosomes 8 and 17 each individually increasing the expression of Zkscan3 relative to control B6 mice (S7A Fig). In the double CSS strain (B6.A17 x B6.A8)F1, the effects of each individual chromosome substitution are combined in an additive manner to result in yet higher expression than either of the single CSSs (S7A Fig). The additive effects of the Zkscan3 meQTLs detected by RNA-Seq were confirmed by quantitative reverse transcription PCR (S7B Fig), as were 4/5 additional meQTLs demonstrating additivity (S8 Table).
In addition to examples of additivity, interaction expression QTLs (ieQTLs) were identified that were jointly regulated by genetic variation on two substituted chromosomes. The ieQTLs, similar to the meQTLs, were divided into cis-ieQTLS and trans-ieQTLs, with cis-ieQTLs defined by differentially expressed genes located on either one of the two substituted chromosomes and trans-ieQTLs representing differentially expressed genes that are not located on either substituted chromosome. A total of 4,283 ieQTLs were identified (S9 Table). Among all possible genes regulated by a cis-ieQTL or trans-ieQTL, 2.01% and 2.16% of genes were regulated by a cis- or trans-ieQTL respectively (Table 1). The combination of A/J-derived chromosomes 8 and 14 yielded the most ieQTLs (n = 2,305) including cis-ieQTLs regulating the expression of 17.56% of all genes on chromosomes 8 or 14 and trans-ieQTLs regulating the expression of 17.32% of all genes throughout the remainder of the genome. Overall, the ieQTLs demonstrated a similar positive correlation as the meQTLs (Spearman’s r = 0.92) (S8 Fig), although there was no enrichment for cis-ieQTLs. Among the genes regulated by an ieQTL(s), 32.35% (945 out of 2921) were regulated by multiple ieQTLs (Range: 2–7) (S10 and S11 Tables). For example, Agt expression is decreased in strain (B6.A8 x B6)F1 relative to control B6 mice; however, interactions between one or more alleles on chromosome 8 and chromosomes 6, 3, 17, and 14 all result in expression levels of Agt that did not differ from the control strain (Fig 3).
Table 1. Interection effects on gene expression.
Cross | Cis-ieQTLs | Trans-ieQTLs | Subtypes of ieQTLs | |||||||
---|---|---|---|---|---|---|---|---|---|---|
Percentage of genes with cis-ieQTL | Number of cis-ieQTLs | Number of genes on substituted chromosomes | Percentage of genes with trans-ieQTL | Number of trans-ieQTLs | Number of genes on other chromosomes | Percentage of genes with synergistic ieQTLs | Number of synergistic ieQTLs | Percentage of genes with antagonistic ieQTLs | Number of antagonistic ieQTLs | |
A14:A8 | 17.56% | 199 | 1133 | 17.32% | 2106 | 12156 | 6% | 129 | 94% | 2176 |
A6:A8 | 6.81% | 96 | 1409 | 5.89% | 700 | 11880 | 2% | 15 | 98% | 781 |
A14:A4 | 3.86% | 48 | 1244 | 3.57% | 430 | 12045 | 6% | 31 | 94% | 447 |
A6:A10 | 1.81% | 24 | 1325 | 0.58% | 69 | 11964 | 1% | 1 | 99% | 92 |
A14:A10 | 1.62% | 17 | 1049 | 2.05% | 251 | 12240 | 1% | 3 | 99% | 265 |
A6:A4 | 0.86% | 13 | 1520 | 0.58% | 68 | 11769 | 0% | 0 | 100% | 81 |
A3:A8 | 0.77% | 11 | 1434 | 0.77% | 91 | 11855 | 2% | 2 | 98% | 100 |
A3:A10 | 0.44% | 6 | 1350 | 0.56% | 67 | 11939 | 0% | 0 | 100% | 73 |
A17:A8 | 0.23% | 3 | 1329 | 0.00% | 0 | 11960 | 0% | 0 | 100% | 3 |
A17:A4 | 0.14% | 2 | 1440 | 0.48% | 57 | 11849 | 2% | 1 | 98% | 58 |
A17:A10 | 0.08% | 1 | 1245 | 0.03% | 4 | 12044 | 0% | 0 | 100% | 5 |
A14:A5 | 0.00% | 0 | 1410 | 0.17% | 20 | 11879 | 0% | 0 | 0% | 0 |
A17:A5 | 0.00% | 0 | 1606 | 0.00% | 0 | 11683 | 0% | 0 | 0% | 0 |
A3:A5 | 0.00% | 0 | 1711 | 0.00% | 0 | 11578 | 0% | 0 | 0% | 0 |
A6:A5 | 0.00% | 0 | 1686 | 0.00% | 0 | 11603 | 0% | 0 | 100% | 20 |
All | 2.01% | 420 | 20891 | 2.16% | 3863 | 178444 | 4% | 182 | 96% | 4101 |
Context-dependent effects on gene expression
We next tested whether the interaction effects on gene expression were synergistic (positive epistasis) or antagonistic (negative epistasis) (S9 Fig). Synergistic refers to an increased difference in gene expression levels between the double CSS and the control B6 strain beyond that expected based on an additive model, whereas antagonistic refers to a decreased difference. The regulation of Agxt was an example of an antagonistic interaction, with main effects from substituted chromosomes 6 and 8 each individually decreasing Agxt expression, whereas this effect was lost in the double chromosome substitution strain (Fig 4A). In contrast, the regulation of Cyp3a16 represented an example of synergistic interaction with the detection of an ieQTL in the absence of a meQTLs (Fig 4B). Among the ieQTLs, antagonistic interactions accounted for 96% (n = 4101) while synergistic interactions accounted for 4% (n = 182) (Table 1). Remarkably, for 80% of the antagonistic interactions (3285/4101), gene expression in one or both of the single CSSs differed from the control B6 strain (a meQTL), whereas expression in the double CSS reverted to control levels (p > 0.1 relative to strain B6). To again validate the RNA-Seq data using an independent method, RT-qPCR was performed for a subset of genes with antagonistic (n = 13) and synergistic (n = 10) interactions. Replication by RT-qPCR confirmed the detection of epistasis in 61% (p <0.05) of the genes tested (Antagonistic: 8/13; Synergistic: 6/10) (S8 Table).
Significant contribution of epistasis to trait heritability
Given that the ieQTLs regulated approximately 2% of all genes expressed in the liver (Table 1), we sought to quantify the contribution of genetic interactions to the heritable component of all genes. First, an empirical Bayes quasi-likelihood F-test identified 6,684 genes out of the 12,325 genes expressed in the liver for which there was evidence of genetic control within the population of CSSs (FDR<0.05). The average proportion of heritable variation attributable to interactions across these genes was 0.56 (1st quartile: 0.43 – 3rd quartile: 0.68) (Fig 5A). When the same analysis was restricted to only genes with a statistically significant (FDR<0.05) contribution of interactions to gene expression levels (n = 3,236 genes), the proportion of heritable variation attributable to interactions increased to 0.66 (1st quartile: 0.56, 3rd quartile: 0.74) (Fig 5B). For comparison, a simulation study was conducted using artificial data to model pure additivity in the absence of interactions, with a resulting estimate of heritability of 0.13 (1st quartile: 0.05, 3rd quartile: 0.19) (Fig 5C), which provides an estimate of the background noise in this measurement. Thus, genetic interactions are a major contributor to the regulation of gene expression.
Discussion
CSSs, which have a simplified and fixed genetic background, were used to identify widespread and likely concurrent epistatic interactions. This systematic analysis of mammalian double CSSs demonstrated that epistatic interactions controlled the majority of the heritable variation in both fasting plasma glucose levels and hepatic gene expression (Fig 5). Among genes expressed in the liver, the expression level of 24% were regulated, at least in part, by epistasis (Fig 5). This number is remarkable considering that only dominant or semi-dominant effects were tested, only a single tissue and time point were examined, allelic variation from only two inbred strains of mice were included, and only 15 pairwise strain combinations of CSSs were tested out of a possible 462 combinations of double CSSs. The prevalence of epistatic interactions provides a potential molecular mechanism underlying the highly dependent nature of complex traits on genetic background [21,22,30,31]. Interpreting the effect of individual allelic variants will thus be severely limited by population-style analyses that fail to account for possible contextual effects. Nonetheless, progress is being made in this field, including in diseases such as multiple sclerosis (MS), which is a complex genetic disease whose risk is highly associated with family history [32]. For example, MS risk alleles in DDX39B (rs2523506) and IL7R (rs2523506A) together significantly increase MS risk considerably more than either variant independently [15]. Based on the considerable number of interactions detected in the CSS crosses, context-dependent interactions such as that between DDX39B and IL7R in MS are likely widespread and may therefore represent a significant source of missing heritability for complex traits and diseases [33,34].
Although epistasis was a dominant factor regulating fasting glucose levels, the same effect was not detected in the regulation of body weight. It is not clear if this is due to different genetic architectures between these two traits or whether this was due to the limited genetic variation between the B6 and A/J strains. The body weight studies were conducted in mice fed a standard rodent chow, whereas differences in body weight between strains B6 and A/J are significantly more pronounced when challenged with a high-fat diet [35,36]. Alternately, a recent meta-analysis of trait heritability in twin studies identified significant variation in the role of additive and non-additive variation among different traits, with suggestive evidence for non-additive effects in 31% of traits [37]. Among the traits analyzed, genetic regulation of neurological, cardiovascular, and ophthalmological traits were among the most consistent with solely additive effects, whereas traits related to reproduction and dermatology were more often consistent with non-additive interactions. Among the metabolic traits studied, 40% of the 464 traits studied were consistent with a contribution of non-additive interactions [37]. It is interesting to speculate whether some traits that may have a more direct effect on fitness (e.g. reproduction) are more likely to involve multiple non-additive effectors in order to maintain a narrow phenotypic or developmental range [38].
Although many inter-chromosomal non-additive interactions were identified in mice, it remains unclear whether these interactions are attributable to bigenic gene-gene interactions or to higher-order epistasis involving multiple loci located on a substituted chromosome. Studies in yeast that dissected the genetic architecture of epistasis demonstrated that gene-gene interactions played a minor role among the heritable effects attributable to epistasis, thus primarily implicating higher order interactions [2]. Yet, other studies in yeast that methodically tested pairs of gene knockouts for interactions identified a number of gene-gene interactions [39]. Additional evidence for both high-order epistasis with three, four, and even more mutations [40] as well as bigenic gene-gene interactions [41] have been identified, and it seems likely that both will underlie interactions detected in the CSS studies. This is because the use of CSSs to study the allelic variation found on an entire chromosome in tandem equally enables the detection of bigenic and higher-order interactions, although it does not distinguish between these two possibilities by identifying the number of contributing variants on the substituted chromosomes without further mapping studies. This property of CSSs may contribute to the robust detection of epistasis using the CSS experimental platform relative to genetic mapping studies in populations with many independently segregating variants, which are often underpowered to identify higher-order interactions [42]. However, to formally test this and determine the relative contribution of each, higher resolution genetic mapping of the epistatic interactions will be necessary to better understand their molecular nature [43]. Higher resolution mapping studies should eventually shed light on whether the chromosome-level properties discovered in this study are consistent with those for SNP-level interactions. Based on previous studies of complex trait QTLs in single-CSS studies, chromosome-level QTLs demonstrated a similar genetic architecture as that found in higher resolution QTLs including large effect sizes, similar direction of effects, and suggestive evidence of widespread epistasis [23,44]. Thus, it seems likely that discoveries made based on chromosome-level analysis of epistasis, will apply equally to studies involving individual genetic variants. For example, genetic variants in Cntnap2 were identified by higher resolution mapping studies of chromosome-level QTLs in CSSs, that were associated with opposing effects on body weight depending on epistatic interactions with intra-chromosomal variation in the genetic background [45].
Perhaps the most significant outcome of the epistasis detected was the high degree of constancy in the light of context dependence, such that the interactions usually returned trait values to the levels detected in control mice. Remarkably, this is just as Waddington predicted 75 years ago, a phenomenon he referred to as canalization [46] and has been observed in previous studies[47–51]. Canalization refers to the likelihood of an organism to proceed towards one developmental outcome, despite variation in the process along the way. This variation can be influenced by among other things the numerous functional genetic variants present in a typical human genome, which may contain thousands of variants that alter gene function [52]. We find that the overwhelming majority of genetic interactions return trait values to levels seen in control strains, which would act to reduce phenotypic variation among developmental outcomes. Studies of epistasis in tomato plants detected by analyzing short chromosomal regions on different genetic backgrounds identified a similar bias towards antagonistic epistasis relative to synergistic epistasis[50]. A bias towards antagonistic interactions was also detected in large-scale gene-gene interactions studies in yeast, although with a lower frequency of antagonistic relative to synergistic interactions[49,53]. Thus, our results are concordant with other studies that the majority of epistatic interactions are antagonistic, and together suggest that when larger tracts of DNA are assessed for interactions the effects are even more likely to be antagonistic. This robustness in the face of considerable genetic variation is central to the underlying properties of canalization. These genetic interactions therefore represent a mechanism for storing genetic variation within a population, without reducing individual fitness. This stored genetic variation could then enable populations to more quickly adapt to environmental changes [54].
Finally, the consistently greater effect sizes of main effects relative to average effects suggests that GWAS-type studies, in both human and model organisms, consistently underestimate true effect sizes in at least a subset of individuals. For example, a large F2 intercross between inbred mice carrying a mutation that results in a nonfunctional allele of the growth hormone releasing hormone receptor (Ghrhr) on either a B6 or C3H genetic background identified widespread antagonistic epistasis, albeit with small contributions to overall trait heritability relative to additive effects [47]. Similarly, epistatic interactions were identified in the Diversity Outbred mice resulting in small contributions to the overall heritability of metabolic-related traits [55]. These studies contrast the large contribution of epistasis to trait heritability identified using the CSS paradigm (Fig 5), mirroring the contrasting portraits of genetic architecture identified based on differing genetic structures of these experimental populations [23]. The CSS paradigm examines context-dependent effects on individual genotypes and typically identifies QTLs with large effect sizes. Alternatively, GWAS-type studies average effects across a population of heterogeneous genotypes and typically identify QTLs with small phenotypic effects. However, perhaps most relevant is that the relatively simpler genotypes of CSSs enable greater depth analyzing fewer unique genotypes, potentially capturing what would be rare genotypic combinations in a segregating cross or human population. Therefore, the key to enabling precision medicine, which like the CSS studies is focused on the effect of a variant on one specific genetic background, is to identify in which subset of individuals a particular variant has a significant effect. The consideration of epistasis in treatment, although in its infancy, remains a promising avenue for improving clinical treatment regimens, including predicting drug response in tumors [56] and guiding antibiotic drug-resistance [57]. However, true precision medicine will necessitate a more comprehensive understanding of how genetic background, across many loci, affects single variant substitutions.
Materials and methods
Ethics statement
All mice were cared for as described under the Guide for the Care and Use of Animals, eighth edition (2011) and all experiments were approved by IACUC and carried out in an AAALAC approved facility. The IACUC protocol numbers were 2013–0098 and 2016–0064. Mice were anesthetized with isoflurane prior to retro-orbital bleeding and subsequently euthanized by cervical dislocation for tissue collection.
Mice
Chromosome substitution strains (CSS) and control strains were purchased from The Jackson Laboratory. These strains include C57BL/6J-Chr3A/J/NaJ mice (Stock #004381) (B6.A3), C57BL/6J-Chr4A/J/NaJ mice (Stock #004382) (B6.A4), C57BL/6J-Chr5A/J/NaJ mice (Stock #004383) (B6.A5), C57BL/6J-Chr6A/J/NaJ mice (Stock #004384) (B6.A6), C57BL/6J-Chr8A/J/NaJ mice (Stock #004386) (B6.A8), C57BL/6J-Chr10A/J/NaJ mice (Stock #004388) (B6.A10), C57BL/6J-Chr14A/J/NaJ mice (Stock #004392) (B6.A14), C57BL/6J-Chr17A/J/NaJ mice (Stock #004395) (B6.A17) and C57BL/6J (Stock #000664). Mice were maintained by brother-sister matings. All mice used for experiments were obtained from breeder colonies at Case Western Reserve University. Mice were housed in ventilated racks with access to food and water ad libitum and maintained at 21°C on a 12-hour light/12-hour dark cycle. Male mice from strains B6, B6.A4, B6.A5, B6.A10 strains and B6.A8 were bred with female mice from strains B6, B6.A3, B6.A6, B6.A14 and B6.A17 strain. The offspring were weaned at 3 weeks of age. The number of offspring analyzed from each cross is shown in S2 Table for both body weight and plasma glucose, although glucose levels were not measured in one mouse each from the following strains: (B6 x B6.A10)F1, (B6.A14 x B6)F1, (B6.A17 x B6.A10)F1, (B6.A3 x B6.A10)F1, (B6.A6 x B6.A4)F1, (B6.A14 x B6.A5)F1 and (B6.A6 x B6.A5)F1. The mice analyzed from each cross were derived from at least three independent breeding cages. No blinding to the genotypes was undertaken.
Mouse phenotyping
At 5 weeks of age, mice were fasted 16 hours overnight and body weight was measured. Mice were anesthetized with isoflurane and fasting blood glucose levels were measured via retro-orbital bleeds using an OneTouch Ultra2 meter (LifeScan, Milpitas, CA, USA). Mice were subsequently euthanized by cervical dislocation and the caudate lobe of the liver was collected and immediately placed in RNAlater (Thermo Fisher Scientific, Waltham, MA, USA).
Trait analysis
To analyze the body weight and fasting plasma glucose data, linear regression was used with a main effects term and a term for each pairwise interaction for the males and females separately. In the glucose data, 5 observations were Winserized by setting a ceiling of 4 median absolute deviations from the median. Any values larger than the ceiling (165 mg/dL) were set to the ceiling. Additionally, interactions where one of the crosses contained less than 5 mice were not analyzed leading to the removal of the (B6.A4 x B6.A3)F1 mice, the female (B6.A8 x B6.A14)F1 and the male (B6.A8 x B6.A3)F1 mice. For each trait and for each sex, we estimated a linear model with the following predictors: (1) maternal substitution, (2) paternal substitution and (3) the interaction of maternal by paternal substitution. In these models, the reference strain was B6. The sexes may potentially differ in residual variance and in the effect of the chromosome substitutions (i.e. gene by sex interaction). To handle these differences transparently, we estimated and reported models for each sex separately. Within each of the above models, two joint linear hypothesis tests were performed of the following hypothesis: (a) there were no main effects (i.e. terms (1) and (2) in the model above were all 0), and (b) there were no interaction effects (i.e. terms (3) in above model were all 0). These linear hypothesis tests were carried out using the “linearHypothesis” function in the “car” package [58] and with the anova function in R. Fisher’s method was used to combine these p-values from males and females [59]. Similar results were obtained using a full 3-way interaction model including all interactions between sex, maternal substitution and paternal substitution. In this approach, the test of the null hypothesis that all main effects in males and females were 0 had a p-value of 3.168e-05 and 1.17e-05 for weight and glucose respectively, while the overall test for interaction had a p-value of 0.44 and 0.00011 for weight and glucose respectively. Inverse-variance meta-analysis was used to combine the coefficient estimates from the males and females. If and are estimated genetic effects for males and females respectively then the IVW estimator is where . Thus, while the genetic effects may potentially differ between males and females, the combined results represent a weighted average of the effect in males and in females. To account for potential non-normality, heteroscedasticity and multiple testing, we created 10,000 bootstrap data sets by sampling with replacement from each cross and sex combination. Studentized bootstraps (i.e. using pivotal statistics) were used to create confidence intervals for the coefficients and p-values. Multiple tests were adjusted for by comparing the observed test statistics to the maximum bootstrap test statistic as described elsewhere [60]. P-values were adjusted for multiple comparisons separately for each trait and separately for the main effects and interactions. As an alternative to the meta-analysis approach, we also fit a linear model adjusting for sex as a covariate. Results of this analysis are reported in S12 and S13 Tables. The proportion of the genetic variance explained by interactions was estimated as (RFull−RAdditive)/ RFull where RAdditive and RFull are the adjusted coefficients of determination for the model with only main effects and for the full interaction model respectively. The adjusted coefficients of determination are an estimate of the proportion of variation in the trait which is explained by the model. Note that RFull and RAdditive share the same denominator (i.e. the total trait variation). Thus, total trait variation cancels out of the quantity (RFull—RAdditive)/ RFull so that the quantity represents the amount of genetic variation that cannot be explained by main effects only. Using the adjusted version of the coefficient of determination helps account for potential overfitting. Bootstrap confidence intervals of this proportion were calculated.
Sample preparation for RNA-Seq
Liver tissue stored in RNAlater was homogenized using a Tissumizer Homogenizer (Tekmar, Cincinnati, OH, USA). Total RNA was isolated using the PureLink RNA purification kit (Thermo Fisher Scientific, Waltham, MA, USA). A sequencing library was generated using the TruSeq Stranded Total RNA kit (Illumina, San Diego, CA, USA). RNA samples were sequenced on Illumina HiSeq2500s with single-end 50 base pair reads [61]. Library preparation and RNA sequencing were performed by the CWRU genomics core (Director, Dr. Alex Miron). A total of 7,269,450,186 reads were generated across four flow cells, with an average of 47,204,222 ± 928,913 [range: 14,561,990–76,538,825] reads per sample. Sequencing quality was assessed by FastQC [62], which identified an average per base quality score of 35.46.
RNA-Seq data analysis
To maximize statistical power, 20 samples were selected for analysis from the control B6 group, 8 samples were selected from the single CSS groups, and 5 samples were selected from the double CSS groups. A total of 154 control and CSS mice were analyzed, including 20 B6 mice, 63 mice that were heterozygous for one A/J-derived chromosome, and 71 mice that were heterozygous for two different A/J-derived chromosomes. Only male mice were analyzed to avoid complications due to sex differences in gene expression. The B6.A4 x B6.A3 and B6.A8 x B6.A3 crosses were poor breeders and thus we did not obtain 5 samples to analyze from these crosses.
Reads were aligned using TopHat2 (2.0.10) [63] to the reference mm10 genome with the GENCODE vM7 annotations as a guide. Because the reference genome is comprised of sequence from strain B6, sequencing reads from a B6-derived chromosome are more accurately mapped than reads from an A/J-derived chromosome [64]. To avoid potential mapping biases, we created an “individualized genome” of the A/J mouse strain using the program Seqnature [64] with variant calls from the Mouse Genomes Project that were downloaded from The Sanger Institute [65]. Reads that were not mapped to the B6 genome were then mapped to the individualized AJ genome with TopHat2. HTSeq-count [66] and the GENCODE vM7 gene annotations[67] were used to count the number of reads for each gene feature. After filtering to remove duplicate reads, unmapped reads, low quality reads, and reads mapped to non-GENCODE regions of the genome, an average of 16,506,775 ± 439,754 [range: 4,638,701–30,465,477] reads were mapped to GENCODE regions per sample. There was no significant difference in the mapping efficiency (number of mapped reads / total number of reads) between the control B6 samples and any of the CSS strains either genome-wide (S10A Fig) or on the substituted chromosome (S10B Fig). This suggests that the sequence differences on the A/J chromosomes did not reduce mapping efficiency in the CSSs.
Graphical depictions of the distribution CPM (counts per million) were used to remove the following 3 outlier samples: E171, E305, and E570 (S1 Table). Genes where less than 75% of the samples had a count greater than or equal to 15 were considered to be expressed at low levels in liver and were removed leaving 13,289 genes that were considered expressed. To enhance reproducibility and reduce the dependence between the genes, svaseq [68] was used to create 5 surrogate variables that served as covariates in subsequent modeling.
EdgeR [69] was used to fit a model with main effects and pairwise interactions between each chromosome substitution. EdgeR uses a log link function, and thus departure from additivity in EdgeR is departure from a multiplicative model on the gene expression level. For each gene an interaction model was fit which included the following terms: (1) maternal substitution, (2) paternal substitution, (3) the interaction of maternal by paternal substitution, and (4) the SVA covariates. For all models, “B6” was used as the reference for the categorical chromosome substitution predictors.
A stratified FDR approach was used for the analysis of both meQTLs and ieQTLs [70]. For meQTLs, we tested for associations between every combination of chromosome substitutions in the study with every unfiltered gene in the RNA-Seq data. These hypothesis tests were stratified by chromosome and cis vs. trans. The method of Benjamini and Hochberg [71] was applied within each strata to control the false discovery rate. Similarly, the hypothesis tests for the ieQTLs were stratified by each chromosome combination and cis/trans. The stratified FDR approach has been shown to be more powerful when the proportion of true hypothesis differs by strata. The chromosome-chromosome interactions with FDR < 0.05 were divided into the categories synergistic and antagonistic based on the gene expression differences between the double CSS strain and the control strain relative to that predicted by an additive model (S9 Fig). Spearman’s r was used to summarize the association between several variables in the analysis. A Spearman’s r of 1 implies that the rank order of the values for two variables is the same. To estimate the amount of variation attributable to interaction, we fit an additive model in EdgeR which did not include any interaction terms. We then calculated for each individual and gene the fitted values assuming that the individual’s covariates (i.e. the SVA surrogate variables) were set to 0 and thus do not contribute to the variation. We calculate SSFull as the sum of the mean centered and squared fitted values for the full model including interaction, SAdditive was calculated similarly for the additive model. We calculated the proportion of the genetic variation explained by interactions as (SSFull—SAdditive) / SFull. This proportions is only meaningful when there is genetic variation to be explained. To filter out only genes with evidence of genetic control, using the full model for each gene, we tested the overall joint null hypothesis that all mouse strains had the same average expression level using the empirical Bayes quasi-likelihood F-tests test as implemented in EdgeR. This allowed us to classify some genes as showing evidence of genetic control. Only these genes were looked at further. The estimator (SSFull—SAdditive) / SFull may be slightly biased upward due to overfitting. However, the mean value for this statistic among the genes with no significant interaction (FDR > 0.5) was 0.25 (1st quartile: 0.20, 3rd quartile: 0.32) (Fig 5B), which gives one estimate of the upper bound on the possible bias. Here, the overall test that the interaction terms were all 0 was carried out using the Bayes quasi-likelihood F-tests test as implemented in EdgeR. To assess any potential bias stemming from the arbitrary selection of an FDR > 0.5, we performed a simulation study to independently approximate the upper limit on this bias. Using the fitted values (i.e. predicted mean) from the additive model described above, we simulated counts for each gene and individual from a Poisson distribution. The full and additive model was fit to the simulated data set, and the variance explained (SSFull—SAdditive) / SFull was calculated for each gene. The simulation was repeated 100 times and the average variance explained by interaction was averaged across all simulations for each gene. The mean for the amount of genetic variance explained by interaction under this simulated additive model was 0.13 (1st quartile: 0.05, 3rd quartile: 0.19) (Fig 5C). This gives another estimate of the upper bound on the possible bias.
Multiple testing correction
For both the analysis of mouse phenotypes and RNA-Seq data it is necessary to account for multiple testing in order to avoid a large number of false positive findings. The approaches to multiple testing for the mouse phenotypes and RNA-Seq data are fundamentally different because the number hypotheses being tested were very different. For the mouse phenotype data, there were a relatively small number of targeted hypotheses, and thus the conservative and more confirmatory approach of controlling the family-wise type I error was applied. In this case, the genetic scan for each of the small number of traits was considered to be a separate question (i.e. the main effects for each trait and interaction effects for each trait were considered a separate “family” of hypotheses). For the large number traits analyzed in the RNA-Seq data, a less conservative and more hypothesis generating approach known as the stratified FDR was applied.
Quantitative PCR (qPCR)
Tissue was homogenized using TissueLyser II (Qiagen, Valencia, CA, USA) and total RNA was isolated using the PureLink RNA purification kit with TRIzol protocol (Thermo Fisher Scientific, Waltham, MA, USA). Total RNA was reverse transcribed using the high capacity cDNA reverse transcription kit (Applied Biosystems, Carlsbad, CA, USA). The sequences for each primer are listed in S14 Table. The qPCR reactions were performed with the power SYBR green PCR Master Mix (Thermo Fisher Scientific, Waltham, MA, USA) and run on a Bio Rad CFX Connect Real Time System (Bio Rad, Hercules, CA, USA). Expression levels were calculated using the ΔΔCt method relative to the Rplp0 control gene.
Supporting information
Acknowledgments
This research was supported by the Genomics Core Facility of the CWRU School of Medicine’s Genetics and Genome Sciences Department. We would like to thank Dr. Alex Miron and the entire Genomic Core staff. This work made use of the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University. We would also like to thank Mrs. Li Wang for her valuable assistance.
Data Availability
The RNA-Seq data is available from GEO under the accession number GSE93591.
Funding Statement
This work was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (https://www.niddk.nih.gov/) grant DK099533 to DAB. AC was supported by grant G201603152079889 from the Sigma Xi Scientific Research Society (https://www.sigmaxi.org/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1.Fu W, O’Connor TD, Akey JM. Genetic architecture of quantitative traits and complex diseases. Curr Opin Genet Dev. 2013;23: 678–683. doi: 10.1016/j.gde.2013.10.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Bloom JS, Ehrenreich IM, Loo WT, Lite T-LV, Kruglyak L. Finding the sources of missing heritability in a yeast cross. Nature. 2013;494: 234–237. doi: 10.1038/nature11867 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Jasnos L, Korona R. Epistatic buffering of fitness loss in yeast double deletion strains. Nat Genet. 2007;39: 550–554. doi: 10.1038/ng1986 [DOI] [PubMed] [Google Scholar]
- 4.Lehner B, Crombie C, Tischler J, Fortunato A, Fraser AG. Systematic mapping of genetic interactions in Caenorhabditis elegans identifies common modifiers of diverse signaling pathways. Nat Genet. 2006;38: 896–903. doi: 10.1038/ng1844 [DOI] [PubMed] [Google Scholar]
- 5.Huang W, Richards S, Carbone MA, Zhu D, Anholt RRH, Ayroles JF, et al. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci. 2012;109: 15553–15559. doi: 10.1073/pnas.1213423109 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Shao H, Burrage LC, Sinasac DS, Hill AE, Ernest SR, O’Brien W, et al. Genetic architecture of complex traits: Large phenotypic effects and pervasive epistasis. Proc Natl Acad Sci. 2008;105: 19910–19914. doi: 10.1073/pnas.0810388105 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Mackay TFC. Epistasis and Quantitative Traits: Using Model Organisms to Study Gene-Gene Interactions. Nat Rev Genet. 2014;15: 22–33. doi: 10.1038/nrg3627 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Huang W, Mackay TFC. The Genetic Architecture of Quantitative Traits Cannot Be Inferred from Variance Component Analysis. PLoS Genet. 2016;12: e1006421 doi: 10.1371/journal.pgen.1006421 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Liu Y, Xu H, Chen S, Chen X, Zhang Z, Zhu Z, et al. Genome-wide interaction-based association analysis identified multiple new susceptibility Loci for common diseases. PLoS Genet. 2011;7: e1001338 doi: 10.1371/journal.pgen.1001338 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Verma SS, Cooke Bailey JN, Lucas A, Bradford Y, Linneman JG, Hauser MA, et al. Epistatic Gene-Based Interaction Analyses for Glaucoma in eMERGE and NEIGHBOR Consortium. PLoS Genet. 2016;12: e1006186 doi: 10.1371/journal.pgen.1006186 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Huang Y, Wang C, Yao Y, Zuo X, Chen S, Xu C, et al. Molecular Basis of Gene-Gene Interaction: Cyclic Cross-Regulation of Gene Expression and Post-GWAS Gene-Gene Interaction Involved in Atrial Fibrillation. PLoS Genet. 2015;11: e1005393 doi: 10.1371/journal.pgen.1005393 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Kirino Y, Bertsias G, Ishigatsubo Y, Mizuki N, Tugal-Tutkun I, Seyahi E, et al. Genome-wide association analysis identifies new susceptibility loci for Behçet’s disease and epistasis between HLA-B*51 and ERAP1. Nat Genet. 2013;45: 202–207. doi: 10.1038/ng.2520 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Hemani G, Shakhbazov K, Westra H-J, Esko T, Henders AK, McRae AF, et al. Detection and replication of epistasis influencing transcription in humans. Nature. 2014;508: 249–253. doi: 10.1038/nature13005 [DOI] [PMC free article] [PubMed] [Google Scholar] [Retracted]
- 14.Hu T, Chen Y, Kiralis JW, Collins RL, Wejse C, Sirugo G, et al. An information-gain approach to detecting three-way epistatic interactions in genetic association studies. J Am Med Inform Assoc JAMIA. 2013;20: 630–636. doi: 10.1136/amiajnl-2012-001525 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Galarza-Muñoz G, Briggs FBS, Evsyukova I, Schott-Lerner G, Kennedy EM, Nyanhete T, et al. Human Epistatic Interaction Controls IL7R Splicing and Increases Multiple Sclerosis Risk. Cell. 2017;169: 72–84.e13. doi: 10.1016/j.cell.2017.03.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Wood AR, Tuke MA, Nalls MA, Hernandez DG, Bandinelli S, Singleton AB, et al. Another explanation for apparent epistasis. Nature. 2014;514: E3–E5. doi: 10.1038/nature13691 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Fish AE, Capra JA, Bush WS. Are Interactions between cis-Regulatory Variants Evidence for Biological Epistasis or Statistical Artifacts? Am J Hum Genet. 2016;99: 817–830. doi: 10.1016/j.ajhg.2016.07.022 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Nadeau JH, Singer JB, Matin A, Lander ES. Analysing complex genetic traits with chromosome substitution strains. Nat Genet. 2000;24: 221–225. doi: 10.1038/73427 [DOI] [PubMed] [Google Scholar]
- 19.Darvasi A, Soller M. Advanced intercross lines, an experimental population for fine genetic mapping. Genetics. 1995;141: 1199–1207. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Talbot CJ, Nicod A, Cherny SS, Fulker DW, Collins AC, Flint J. High-resolution mapping of quantitative trait loci in outbred mice. Nat Genet. 1999;21: 305–308. doi: 10.1038/6825 [DOI] [PubMed] [Google Scholar]
- 21.Sackton TB, Hartl DL. Genotypic Context and Epistasis in Individuals and Populations. Cell. 2016;166: 279–287. doi: 10.1016/j.cell.2016.06.047 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Chow CY. Bringing genetic background into focus. Nat Rev Genet. 2016;17: 63–64. doi: 10.1038/nrg.2015.9 [DOI] [PubMed] [Google Scholar]
- 23.Buchner DA, Nadeau JH. Contrasting genetic architectures in different mouse reference populations used for studying complex traits. Genome Res. 2015;25: 775–791. doi: 10.1101/gr.187450.114 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Rapp JP, Garrett MR, Deng AY. Construction of a double congenic strain to prove an epistatic interaction on blood pressure between rat chromosomes 2 and 10. J Clin Invest. 1998;101: 1591–1595. doi: 10.1172/JCI2251 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Mackay TF, Moore JH. Why epistasis is important for tackling complex human disease genetics. Genome Med. 2014;6: 125 doi: 10.1186/gm561 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Brown MS, Goldstein JL. Selective versus total insulin resistance: a pathogenic paradox. Cell Metab. 2008;7: 95–96. doi: 10.1016/j.cmet.2007.12.009 [DOI] [PubMed] [Google Scholar]
- 27.Stoppa-Lyonnet D. The biological effects and clinical implications of BRCA mutations: where do we go from here? Eur J Hum Genet EJHG. 2016;24 Suppl 1: S3–9. doi: 10.1038/ejhg.2016.93 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Wang K, Lim HY, Shi S, Lee J, Deng S, Xie T, et al. Genomic landscape of copy number aberrations enables the identification of oncogenic drivers in hepatocellular carcinoma. Hepatol Baltim Md. 2013;58: 706–717. doi: 10.1002/hep.26402 [DOI] [PubMed] [Google Scholar]
- 29.Kan Z, Zheng H, Liu X, Li S, Barber TD, Gong Z, et al. Whole-genome sequencing identifies recurrent mutations in hepatocellular carcinoma. Genome Res. 2013;23: 1422–1433. doi: 10.1101/gr.154492.113 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Gerke J, Lorenz K, Cohen B. Genetic Interactions Between Transcription Factors Cause Natural Variation in Yeast. Science. 2009;323: 498–501. doi: 10.1126/science.1166426 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Gerke J, Lorenz K, Ramnarine S, Cohen B. Gene–Environment Interactions at Nucleotide Resolution. PLOS Genet. 2010;6: e1001144 doi: 10.1371/journal.pgen.1001144 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Sawcer S, Franklin RJM, Ban M. Multiple sclerosis genetics. Lancet Neurol. 2014;13: 700–709. doi: 10.1016/S1474-4422(14)70041-9 [DOI] [PubMed] [Google Scholar]
- 33.Zuk O, Hechter E, Sunyaev SR, Lander ES. The mystery of missing heritability: Genetic interactions create phantom heritability. Proc Natl Acad Sci. 2012;109: 1193–1198. doi: 10.1073/pnas.1119675109 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, et al. Finding the missing heritability of complex diseases. Nature. 2009;461: 747–753. doi: 10.1038/nature08494 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Buchner DA, Burrage LC, Hill AE, Yazbek SN, O’Brien WE, Croniger CM, et al. Resistance to diet-induced obesity in mice with a single substituted chromosome. Physiol Genomics. 2008;35: 116–122. doi: 10.1152/physiolgenomics.00033.2008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Hill-Baskin AE, Markiewski MM, Buchner DA, Shao H, DeSantis D, Hsiao G, et al. Diet-induced hepatocellular carcinoma in genetically predisposed mice. Hum Mol Genet. 2009;18: 2975–2988. doi: 10.1093/hmg/ddp236 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Polderman TJC, Benyamin B, de Leeuw CA, Sullivan PF, van Bochoven A, Visscher PM, et al. Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nat Genet. 2015;47: 702–709. doi: 10.1038/ng.3285 [DOI] [PubMed] [Google Scholar]
- 38.Siegal ML, Bergman A. Waddington’s canalization revisited: Developmental stability and evolution. Proc Natl Acad Sci. 2002;99: 10528–10532. doi: 10.1073/pnas.102303999 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Tong AH, Evangelista M, Parsons AB, Xu H, Bader GD, Pagé N, et al. Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science. 2001;294: 2364–2368. doi: 10.1126/science.1065810 [DOI] [PubMed] [Google Scholar]
- 40.Sailer ZR, Harms MJ. Detecting High-Order Epistasis in Nonlinear Genotype-Phenotype Maps. Genetics. 2017;205: 1079–1088. doi: 10.1534/genetics.116.195214 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Lagator M, Igler C, Moreno AB, Guet CC, Bollback JP. Epistatic Interactions in the Arabinose Cis-Regulatory Element. Mol Biol Evol. 2016;33: 761–769. doi: 10.1093/molbev/msv269 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Taylor MB, Ehrenreich IM. Higher-order genetic interactions and their contribution to complex traits. Trends Genet TIG. 2015;31: 34–40. doi: 10.1016/j.tig.2014.09.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Nadeau JH, Forejt J, Takada T, Shiroishi T. Chromosome substitution strains: gene discovery functional analysis and systems studies. Mamm Genome Off J Int Mamm Genome Soc. 2012;23: 693–705. doi: 10.1007/s00335-012-9426-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Yazbek SN, Buchner DA, Geisinger JM, Burrage LC, Spiezio SH, Zentner GE, et al. Deep congenic analysis identifies many strong, context-dependent QTLs, one of which, Slc35b4, regulates obesity and glucose homeostasis. Genome Res. 2011;21: 1065–1073. doi: 10.1101/gr.120741.111 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Buchner DA, Geisinger JM, Glazebrook PA, Morgan MG, Spiezio SH, Kaiyala KJ, et al. The juxtaparanodal proteins CNTNAP2 and TAG1 regulate diet-induced obesity. Mamm Genome Off J Int Mamm Genome Soc. 2012;23: 431–442. doi: 10.1007/s00335-012-9400-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Waddington C. H. Canalization of development and the inheritance of acquired characters. Nature. 1942;150: 563–565. [Google Scholar]
- 47.Tyler AL, Donahue LR, Churchill GA, Carter GW. Weak Epistasis Generally Stabilizes Phenotypes in a Mouse Intercross. PLOS Genet. 2016;12: e1005805 doi: 10.1371/journal.pgen.1005805 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Gonzalez PN, Pavlicev M, Mitteroecker P, Pardo-Manuel de Villena F, Spritz RA, Marcucio RS, et al. Genetic structure of phenotypic robustness in the collaborative cross mouse diallel panel. J Evol Biol. 2016;29: 1737–1751. doi: 10.1111/jeb.12906 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Costanzo M, VanderSluis B, Koch EN, Baryshnikova A, Pons C, Tan G, et al. A global genetic interaction network maps a wiring diagram of cellular function. Science. 2016;353 doi: 10.1126/science.aaf1420 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Guerrero RF, Muir CD, Josway S, Moyle LC. Pervasive antagonistic interactions among hybrid incompatibility loci. PLoS Genet. 2017;13: e1006817 doi: 10.1371/journal.pgen.1006817 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Bastepe M, Fröhlich LF, Linglart A, Abu-Zahra HS, Tojo K, Ward LM, et al. Deletion of the NESP55 differentially methylated region causes loss of maternal GNAS imprints and pseudohypoparathyroidism type Ib. Nat Genet. 2005;37: 25–27. doi: 10.1038/ng1487 [DOI] [PubMed] [Google Scholar]
- 52.Chun S, Fay JC. Identification of deleterious mutations within three human genomes. Genome Res. 2009;19: 1553–1561. doi: 10.1101/gr.092619.109 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Segrè D, DeLuna A, Church GM, Kishony R. Modular epistasis in yeast metabolism. Nat Genet. 2005;37: 77–83. doi: 10.1038/ng1489 [DOI] [PubMed] [Google Scholar]
- 54.Carlborg O, Jacobsson L, Ahgren P, Siegel P, Andersson L. Epistasis and the release of genetic variation during long-term selection. Nat Genet. 2006;38: 418–420. doi: 10.1038/ng1761 [DOI] [PubMed] [Google Scholar]
- 55.Tyler AL, Ji B, Gatti DM, Munger SC, Churchill GA, Svenson KL, et al. Epistatic Networks Jointly Influence Phenotypes Related to Metabolic Disease and Gene Expression in Diversity Outbred Mice. Genetics. 2017;206: 621–639. doi: 10.1534/genetics.116.198051 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Weigelt B, Reis-Filho JS. Epistatic interactions and drug response. J Pathol. 2014;232: 255–263. doi: 10.1002/path.4265 [DOI] [PubMed] [Google Scholar]
- 57.Wong A. Epistasis and the Evolution of Antimicrobial Resistance. Front Microbiol. 2017;8 doi: 10.3389/fmicb.2017.00246 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Fox J, Weisberg S. An R Companion to Applied Regression, Second Edition Sage Publications; 2011. [Google Scholar]
- 59.Michael Dewey (2016). metap: meta-analysis of significance values. R package version 0.7.).
- 60.Westfall P. H. & Young S. S. Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. Wiley; 1993. [Google Scholar]
- 61.Chhangawala S, Rudy G, Mason CE, Rosenfeld JA. The impact of read length on quantification of differentially expressed genes and splice junction detection. Genome Biol. 2015;16: 131 doi: 10.1186/s13059-015-0697-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Andrews S. FastQC: a quality control tool for high throughput sequence data [Internet]. 2010. Available: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
- 63.Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14: R36 doi: 10.1186/gb-2013-14-4-r36 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Munger SC, Raghupathy N, Choi K, Simons AK, Gatti DM, Hinerfeld DA, et al. RNA-Seq Alignment to Individualized Genomes Improves Transcript Abundance Estimates in Multiparent Populations. Genetics. 2014;198: 59–73. doi: 10.1534/genetics.114.165886 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Keane TM, Goodstadt L, Danecek P, White MA, Wong K, Yalcin B, et al. Mouse genomic variation and its effect on phenotypes and gene regulation. Nature. 2011;477: 289–294. doi: 10.1038/nature10413 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Anders S, Pyl PT, Huber W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31: 166–169. doi: 10.1093/bioinformatics/btu638 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Mudge JM, Harrow J. Creating reference gene annotation for the mouse C57BL6/J genome assembly. Mamm Genome Off J Int Mamm Genome Soc. 2015;26: 366–378. doi: 10.1007/s00335-015-9583-x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Leek JT. svaseq: removing batch effects and other unwanted noise from sequencing data. Nucleic Acids Res. 2014;42: e161–e161. doi: 10.1093/nar/gku864 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26: 139–140. doi: 10.1093/bioinformatics/btp616 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Sun L, Craiu RV, Paterson AD, Bull SB. Stratified false discovery control for large-scale hypothesis testing with application to genome-wide association studies. Genet Epidemiol. 2006;30: 519–530. doi: 10.1002/gepi.20164 [DOI] [PubMed] [Google Scholar]
- 71.Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J R Stat Soc Ser B Methodol. 1995;57: 289–300. [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The RNA-Seq data is available from GEO under the accession number GSE93591.