Genetics. 2013 Dec 2;196(2):413–425. doi: 10.1534/genetics.113.157503

Additive, Epistatic, and Environmental Effects Through the Lens of Expression Variability QTL in a Twin Cohort

Gang Wang *,1, Ence Yang *,1, Candice L Brinkmeyer-Langford *, James J Cai *,†,2
PMCID: PMC3914615  PMID: 24298061

Abstract

The expression of a gene can vary across individuals in the general population, as well as between monozygotic twins. This variable expression is assumed to be due to the influence of both genetic and nongenetic factors. Yet little evidence supporting this assumption has been obtained from empirical data. In this study, we used expression data from a large twin cohort to investigate the influences of genetic and nongenetic factors on variable gene expression. We focused on a set of expression variability QTL (evQTL)—i.e., genetic loci associated with the variance, as opposed to the mean, of gene expression. We identified evQTL for 99, 56, and 79 genes in lymphoblastoid cell lines, skin, and fat, respectively. The differences in gene expression, measured by the relative mean difference (RMD), tended to be larger between pairs of dizygotic (DZ) twins than between pairs of monozygotic (MZ) twins, showing that genetic background influenced the expression variability. Furthermore, a more profound RMD was observed between pairs of MZ twins whose genotypes were associated with greater expression variability than the RMD found between pairs of MZ twins whose genotypes were associated with smaller expression variability. This suggests that nongenetic (e.g., environmental) factors contribute to the variable expression. Lastly, we demonstrated that the formation of evQTL is likely due to partial linkages between eQTL SNPs that are additively associated with the mean of gene expression; in most cases, no epistatic effect is involved. Our findings have implications for understanding divergent sources of gene expression variability.

Keywords: expression variability, evQTL, additive genetic effect, genetic interaction, epistasis, twin cohort


VARIATION and variability are central concepts in biology (Hallgrímsson and Hall 2005). Although often used interchangeably in the scientific literature, the two are not synonymous. Variation refers to the differences among individuals, whereas variability refers to the potential of a population to vary (Wagner 1995; Wagner and Altenberg 1996). In many cases, greater phenotypic variability (e.g., transcriptional noise) is disadvantageous (Kemkemer et al. 2002; Bahar et al. 2006; Wang and Zhang 2011) unless it gives rise to greater organismal plasticity—first at the level of an individual organism and eventually at the population level. Genetic factors resulting in more variable phenotypes become favored when they enable a population to more effectively respond to environmental changes (Hill and Zhang 2004; Kaern et al. 2005; Acar et al. 2008; Zhang et al. 2009). Thus, understanding to what extent and in what ways genotypes influence phenotypic variability is of fundamental importance.

Much effort has been focused on identifying genetic loci such as expression quantitative trait loci, or eQTL (Stranger et al. 2005, 2007; Choy et al. 2008; Montgomery et al. 2010; Pickrell et al. 2010; Montgomery and Dermitzakis 2011), that affect the average value of a phenotype, while ignoring those that affect the variance of a phenotype. However, there is increasing evidence across species for genetic loci that affect the variance of phenotype (Queitsch et al. 2002; Jimenez-Gomez et al. 2011; Ronnegard and Valdar 2011; Perry et al. 2012; Shen et al. 2012; Yang et al. 2012). Recently we introduced the concept of expression variability QTL, or evQTL (Hulse and Cai 2013). By definition, an evQTL is a genetic locus linked to or associated with genetic variation influencing the variance of gene expression in a population. To identify evQTL, we previously adapted the method developed by Ronnegard and Valdar (2011) for detecting vQTL based on the double generalized linear model (dglm) (Verbyla and Smyth 1998). The dglm method tests for inequality in expression variances and measures the contribution of genetic variants to the expression heteroscedasticity. It compares the fit of a full model, which takes into account the contribution of genotype to both the mean and the variance of gene expression simultaneously, with that of a mean model, which takes into account only the contribution of genotype to the mean and ignores the contribution to the variance. A significant result of a dglm test indicates a nonrandom association between genotypes and gene expression variances. Using this method, we have conducted a genome-wide scan for evQTL in the human genome (Hulse and Cai 2013).

How an evQTL is created in the first place is not clear. We consider two possible scenarios, emphasizing either the role of environmental or genetic factors. The first possibility is that specific genetic variants disrupt the stabilizing genetic architecture that buffers stochastic variation in phenotype. As a result of such an effect of decanalization, along with the sensitization of the stabilizer (e.g., heat-shock protein 90), the phenotype becomes more sensitive to external environment and varies more greatly between individuals (Ronnegard and Valdar 2011; Hulse and Cai 2013).

The other possible scenario concerns the role of genetic interactions in the formation of evQTL. Through either epistatic or nonepistatic (e.g., additive and dominance) effects within or between loci, genetic interactions contribute to genotypic variance. Epistasis may increase the variance of a quantitative trait (Pare et al. 2010; Ronnegard and Valdar 2012). However, it is extremely difficult to distinguish the contributions of epistatic or nonepistatic effects to variable expression of genes. Epistasis, in particular, is known to produce predominantly additive and dominance genetic variance when the low-frequency alleles of some SNPs are involved (Cheverud and Routman 1995).

Here we investigated the distribution and formation of evQTL by leveraging the existing dataset (Grundberg et al. 2012) derived from a population-based cohort of twin studies (Moayyeri et al. 2013). We interrogated this dataset for evQTL and investigated the roles of genetic and nongenetic factors in the formation of the evQTL we identified. The twin cohort offered a unique advantage for studying the relative contributions of various factors that influence expression variability. Importantly, comparing expression data of monozygotic and dizygotic twins allowed us to distinguish between genetic and nongenetic effects. In the following sections, we present the descriptive statistics for expression variability in the twin cohort, describe the detection of evQTL, and finally estimate the relative contributions of genetic and nongenetic factors, as well as epistatic and nonepistatic effects, to the creation of evQTL.

Materials and Methods

The TwinsUK dataset

We obtained the TwinsUK dataset including both genotype and expression data, which had been used in the eQTL study of Grundberg et al. (2012). Here, we briefly describe the cohort and data processing performed in this previous study (Grundberg et al. 2012). The TwinsUK cohort includes 856 female individuals of European descent recruited from the TwinsUK adult twin registry (Spector and Williams 2006; Moayyeri et al. 2013). Subcutaneous adipose tissue, skin tissue, and lymphoblastoid cell lines (LCLs) were collected from each individual. Genotyping was performed with a combination of Illumina HumanHap300, HumanHap610Q, 1-M Duo and 1.2-M Duo 1-M chips. Genotypes were called with the Illuminus calling algorithm (Teo et al. 2007), and SNPs were filtered for minor allele frequency (MAF) of >5%. Gene expression levels were measured in LCLs, skin, and adipose (Grundberg et al. 2012). Expression profiling of the samples, each with either two or three technical replicates, was performed using Illumina Human HT-12 V3 BeadChips (Illumina). All samples were randomized before array hybridization, and replicates were hybridized on different BeadChips. Raw data were imported to Illumina BeadStudio software, and probes with fewer than three beads present were excluded. Log2-transformed expression signals were normalized separately per tissue, with quantile normalization of the replicates of each individual followed by quantile normalization across all individuals (Nica et al. 2011).

In this study, we used available gene expression data for both individuals of a twin pair. All 48,804 probe sequences were mapped by BLAST to the reference genome (hg18), and probes found to map to more than one location were not used. Polymorphisms in the target mRNA sequence can greatly affect the binding affinity of microarray probe sequences, leading to false-positive and false-negative signals with any other polymorphisms in linkage disequilibrium (LD) (Ramasamy et al. 2013). To control for this, we used a comprehensive compendium of SNPs in European (CEU) population of the 1000 Genomes Project Consortium (2012) to remove an additional 13,600 probes found to anneal in regions with SNPs present at a MAF of >5%. Similarly, probes mapping to nonautosomal locations were excluded from further analysis. Finally, 35,078 probes were left for our analysis.

The coefficient of variation (CV) is used as a normalized measure of the dispersion of expression distribution (Maheshri and O’Shea 2007; Ansel et al. 2008; Ronnegard and Valdar 2011). The CV was computed as CV=σ/μ, where σ and μ are the standard deviation and the mean of gene expression levels, respectively. LD block plots were obtained by using HaploView (Barrett et al. 2005).
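As an illustration of the CV defined above, a minimal Python sketch follows. The function name and the example values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
import statistics

def coefficient_of_variation(values):
    """Return CV = sigma / mu for one probe's expression values."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return sigma / mu

# Hypothetical log2 expression values for one probe across five individuals.
probe_expression = [8.0, 8.2, 7.9, 8.1, 7.8]
probe_cv = coefficient_of_variation(probe_expression)
print(round(probe_cv, 4))
```

Because the expression values are log2-transformed and quantile-normalized, CVs computed this way are small in absolute terms, which is in line with the range reported in the Results.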

Identification of evQTL using the dglm method

First, we used the F–K test as a filter to greatly reduce the number of SNPs subjected to computationally intensive model fitting. We then adapted the dglm method (Verbyla and Smyth 1998) to test for inequality in expression variances and measure the contribution of genetic variants to the expression heteroscedasticity. We considered the following model: $y_i = \mu + x_i\beta + g_i\alpha + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2 \exp(g_i\theta))$, where $y_i$ indicates a gene expression trait of individual $i$, $g_i$ is the genotype at the given SNP (encoded as 0, 1, or 2 for homozygous rare, heterozygous, and homozygous common alleles, respectively), $\varepsilon_i$ is the residual with variance $\sigma^2$, and $\theta$ is the corresponding vector of coefficients of genotype $g_i$ on the residual variance. Age of subjects and the batch of data collection were modeled as covariates $x_i$. With this full model, both mean and variance of expression $y_i$ were controlled by SNP genotype $g_i$. We coded the fitting procedure using the dglm package in R. A snippet of R code for the dglm analysis is available in the supporting information of Ronnegard and Valdar (2011). We assumed that the input gene expression data were approximately normally distributed, conditional on the evQTL and covariates, and set family = Gaussian in the dglm R code to specify the error distribution and link function used. We tested each input probe–SNP pair and obtained two P-values: Pdispersion and Pmean, for the effects of genotypes on the variance and the mean of expression levels, respectively (Ronnegard and Valdar 2011). Probe–SNP pairs for which the algorithm did not converge during computation were discarded. To control for the effect of outlier expression data points, permutation tests (Stranger et al. 2005) were conducted for all Pdispersion significant pairs. Specifically, for each probe–SNP pair, we performed 10,000 permutations of expression phenotype relative to SNP genotypes. An association was considered significant if the observed Pdispersion was lower than the threshold of the 0.001 tail of the distribution of the Pdispersion from the 10,000 permutations (Ppermutation < 0.001).
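The permutation step can be sketched in Python as follows. This is not the dglm-based procedure used in the study: as a stand-in, the sketch uses a simple statistic (the absolute log ratio of expression variances between the two homozygous genotype groups) and reports a conventional empirical P-value instead of comparing Pdispersion values. All function names, the statistic, and the toy data are hypothetical.

```python
import math
import random
import statistics

def variance_log_ratio(expression, genotypes):
    """Stand-in dispersion statistic: |log(var(genotype 2) / var(genotype 0))|."""
    group0 = [y for y, g in zip(expression, genotypes) if g == 0]
    group2 = [y for y, g in zip(expression, genotypes) if g == 2]
    return abs(math.log(statistics.variance(group2) / statistics.variance(group0)))

def permutation_p_value(expression, genotypes, statistic, n_perm=999, seed=0):
    """Shuffle expression relative to genotypes and count statistics >= observed."""
    rng = random.Random(seed)
    observed = statistic(expression, genotypes)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(shuffled, genotypes) >= observed:
            hits += 1
    return (1 + hits) / (1 + n_perm)

# Toy data: genotype 2 carriers are far more variable than genotype 0 carriers.
toy_genotypes = [0] * 30 + [2] * 30
toy_expression = [0.1, -0.1] * 15 + [5.0, -5.0] * 15
p_perm = permutation_p_value(toy_expression, toy_genotypes, variance_log_ratio)
print(p_perm)
```

With 10,000 permutations, as in the study, the smallest attainable empirical P-value under this convention is 1/10,001, which is below the 0.001 threshold described above.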

Single-cell expression and mRNA decay rate

Expression levels of 96 genes were measured in 1440 single lymphoblastoid cells by qPCR assays in another study (Livak et al. 2013). We used these data to compute the CV of expression of the same gene in different cells. The mRNA decay rates of 16,823 genes were estimated in 70 human LCLs (Pai et al. 2012). We obtained the mRNA decay rate data to compute the average mRNA decay rate for each gene among these LCL samples.

Identification of interacting SNPs

We used a two-step procedure to identify SNPs that may “interact” with evSNPs. Assuming an additive interaction between the SNP to be identified and an evSNP, we first partitioned individuals into L and S groups according to genotypes of the evSNP, which were associated with large (L) and small (S) variances of gene expression. Next we scanned genome-wide SNPs. For each SNP, we computed heterozygosity of the polymorphic site among individuals in L and S groups as $\mathrm{Het}_L = 1 - \sum_{i=1}^{n} p_{i,L}^2$ and $\mathrm{Het}_S = 1 - \sum_{i=1}^{n} p_{i,S}^2$, respectively, where $p_{i,L}$ and $p_{i,S}$ are the allele frequencies of the SNP in the L and S groups and, for diallelic SNPs, $n = 2$. All SNPs were then ranked by the value of $\mathrm{Het}_L - \mathrm{Het}_S$ and the top 100 SNPs with largest values were selected for further analysis. In the next step, a typical eQTL (not evQTL) analysis (Stranger et al. 2005) was conducted among individuals of the L group. In other words, for each top SNP with high genotype heterozygosity difference, a simple linear regression was performed between the SNP’s genotypes and gene expression. The most significant SNPs were retained after applying a Bonferroni adjusted P-value cutoff = 0.05 and were reported as candidate interacting SNPs. To maintain sample independence, only one group of the twin sets was used in this analysis.
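The first step of this procedure (ranking SNPs by the heterozygosity difference between L and S groups) can be sketched in Python as below. The function names, SNP identifiers, and genotype vectors are hypothetical, and genotypes are assumed to be coded as 0, 1, or 2 copies of one allele at a diallelic SNP.

```python
def heterozygosity(genotypes):
    """Het = 1 - sum(p_i^2) for a diallelic SNP with genotypes coded 0/1/2."""
    p = sum(genotypes) / (2 * len(genotypes))  # frequency of the counted allele
    return 1.0 - (p ** 2 + (1.0 - p) ** 2)

def rank_by_het_difference(genotypes_L, genotypes_S, top=100):
    """Rank SNPs by Het_L - Het_S (largest first) and keep the top candidates."""
    diff = {
        snp: heterozygosity(genotypes_L[snp]) - heterozygosity(genotypes_S[snp])
        for snp in genotypes_L
    }
    return sorted(diff, key=diff.get, reverse=True)[:top]

# Hypothetical genotypes of two SNPs among L-group and S-group individuals.
group_L = {"snpA": [0, 1, 2, 1], "snpB": [2, 2, 2, 2]}
group_S = {"snpA": [2, 2, 2, 2], "snpB": [2, 2, 2, 2]}
candidates = rank_by_het_difference(group_L, group_S)
print(candidates)
```

In this toy case snpA is polymorphic only in the L group, so it ranks first; such SNPs are the ones carried forward to the regression step.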

Results

Expression and genotype data

To investigate the genetic influences underlying variable gene expression, we revisited the published expression data (Grundberg et al. 2012) of the MuTHER (Multiple Tissue Human Expression Resource) project (Nica et al. 2011). In that study, gene expression was measured for LCL, adipose tissue (subcutaneous fat), and skin (tissue biopsies) using Illumina Human HT-12 V3 BeadChips. These tissues were sampled from a cohort of 856 female twins from the TwinsUK adult registry, including 154 monozygotic (MZ) twin pairs, 232 dizygotic (DZ) twin pairs and 84 singletons (Moayyeri et al. 2013). After quality control, expression data for 825 (adipose and LCL) and 705 (skin) individuals were retained (Grundberg et al. 2012). For each tissue, we downloaded the processed MuTHER expression data files deposited at ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) using accession E-TABM-1140. The data were the quantile-normalized log2-transformed expression signals. Quantile normalization was performed first across the replicates of a single individual and then across all individuals as described in Grundberg et al. (2012). Along with the expression data, we also obtained the genotype data of this cohort (Grundberg et al. 2012). In our analysis, all available twin pairs with complete expression and genotype information were included, corresponding to 134 MZ and 195 DZ pairs with LCL profiles, 139 MZ and 188 DZ pairs with adipose profiles, and 105 MZ and 148 DZ pairs with skin profiles. Members of the TwinsUK cohort have health and lifestyle characteristics that are comparable to those of population singletons (Andrew et al. 2001). Because of this, we were able to use this cohort as a representative general population to investigate both genetic and nongenetic factors behind expression variability in this study.

Expression variability in the twin cohort

Here we present basic, descriptive statistics for expression data (independent of genotype information), with particular attention to disparities in gene expression among individuals. We chose to focus on the LCL data for this analysis, due to the availability of additional expression-related statistics (such as single-cell expression data and mRNA decay data).

We used the quantile-normalized log2-transformed expression data in all analyses throughout the article unless otherwise indicated. From these data, we first determined that expression values for most probes (n = 35,078) approximately fit the normal distribution: 97% of probes had a skewness between −0.80 and 0.80 and a kurtosis of ∼3.0 (Supporting Information, Figure S1A); <7% of probes were rejected by the Shapiro–Wilk test of normality with Bonferroni adjustment to the level of α = 0.01. These results justified the use of the Gaussian error distribution and link function in our dglm model (Materials and Methods). Retrospectively, we showed that the profile distributions for evQTL probes are approximately normal before and after Box–Cox transformation (Figure S1B).
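For reference, skewness and kurtosis of a probe's expression values can be computed from central moments, as in the Python sketch below. The moment-based (population) estimators and the function name are assumptions, since the text does not specify the estimators used. Kurtosis is given in its non-excess form, for which a normal distribution has a value of 3.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Hypothetical, perfectly symmetric values: skewness is zero by construction.
skew, kurt = skewness_and_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
print(skew, kurt)
```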

To measure the level of dispersion of gene expression values, we computed the CV for each probe. The CVs ranged from 0.0024 [for ILMN_1765043 (RPL38)] to 0.2115 [for ILMN_1715169 (HLA-DRB1)], with a median of 0.0154. The distributions of CVs measured in subcohorts are indistinguishable from one another such as when comparing one group of MZ twin sets with the other (i.e., MZ 1 vs. MZ 2) or comparing a group of MZ twin sets with a group of DZ twin sets (e.g., MZ 1 vs. DZ 1) (Figure 1A). Probe data points are located along or close to the 1–1 diagonal line in the CV–CV scatter plot for the majority of probes, regardless of the CV–CV comparison between MZ 1 and MZ 2 or between MZ 1 and DZ 1 (Figure 1B). These results indicate that the extent and overall distribution of expression variability measured between individuals across different MZ and DZ cohorts are highly similar when all genes are taken into account.

Figure 1.


Distributions of expression variability in LCLs. (A) Distribution of CVs of gene expression (probe n = 35,078) measured in MZ and DZ twins. MZ 1 is the set of first members of all MZ twin pairs and MZ 2 is the set of second members of all MZ twin pairs. Similarly, DZ 1 is the set of first members of all DZ twin pairs and DZ 2 is the set of second members of all DZ twin pairs. (B) Scatter plot of CVs of gene expression (probe n = 35,078) in MZ 1 against those in MZ 2 (blue) or DZ 1 (red) cohorts. (C) Scatter plot of median RMD between pairs of MZ twins against median RMD between pairs of DZ twins. Each blue dot indicates a single expression probe (or a gene) and the position of the blue dot indicates the median value of RMD of expression between all MZ pairs (MZ 1 − MZ 2) on the x-axis and that between all DZ pairs (DZ 1 − DZ 2) on the y-axis. The red line is based on quadratic regression to show a more pronounced difference between MZ and DZ with greater RMD. (D) Scatter plot of CVs of gene expression (n = 59) in single cells against CVs of gene expression in MZ 1. (E) Scatter plot of mean mRNA decay rate against CVs of gene expression in the MZ 1 cohort. The red line is based on the linear regression.

Next, we measured expression differences between each pair of twins. For each probe, we computed the relative mean difference (RMD) in expression between MZ twin pairs and DZ twin pairs, separately. For a pair of MZ twins, for example, the RMD was computed using $\mathrm{RMD} = \frac{\frac{1}{2}\,|y_{\mathrm{MZ1}} - y_{\mathrm{MZ2}}|}{\bar{y}}$, where $\bar{y}$ is the arithmetic mean of the expression levels, $y_{\mathrm{MZ1}}$ and $y_{\mathrm{MZ2}}$, for the MZ twin pair. For most probes, the median RMD of expression between DZ pairs is larger than it is between MZ pairs, as indicated by the fact that most genes are located above the 1–1 diagonal line in the scatter plot (Figure 1C). That is to say, the normalized difference in gene expression between DZ pairs (DZ 1 and DZ 2) tends to be larger than that between MZ pairs (MZ 1 and MZ 2), suggesting that genetic factors influence expression variability for most of these genes.
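The RMD of a twin pair and the per-probe median across pairs can be illustrated with a short Python sketch. The function names and expression values are hypothetical; only the formula is taken from the text.

```python
import statistics

def relative_mean_difference(y1, y2):
    """RMD = (1/2)|y1 - y2| / mean(y1, y2) for one twin pair and one probe."""
    pair_mean = (y1 + y2) / 2.0
    return 0.5 * abs(y1 - y2) / pair_mean

def median_rmd(pairs):
    """Median RMD of one probe across a list of (twin 1, twin 2) expression pairs."""
    return statistics.median(relative_mean_difference(a, b) for a, b in pairs)

# Hypothetical log2 expression values of one probe in three twin pairs.
example_pairs = [(8.0, 8.4), (7.9, 8.1), (8.2, 8.2)]
rmd_first_pair = relative_mean_difference(*example_pairs[0])
rmd_median = median_rmd(example_pairs)
print(rmd_first_pair, rmd_median)
```

Computing this median separately for MZ and DZ pairs gives the two coordinates of each point in Figure 1C.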

To determine the influence of single-cell expression variability on population-level expression variability, we computed the CVs of expression for a selection of genes whose expression levels have been measured in single LCL cells (Livak et al. 2013). No correlation between the single-cell CVs and the between-individual CVs measured was detected for MZ 1 (Spearman’s correlation test, P = 0.21, n = 59; Figure 1D). This suggests a limited contribution of single-cell expression variability (or transcriptional noise at the single-cell level) to the variability between individuals (or transcriptional noise at the population level).

Finally, we hypothesized that variable gene expression may be due to different mRNA decay rates for different genes. To test this, we used the mRNA decay rate data from the study of Pai et al. (2012). The correlation between mean mRNA decay rate and CV of expression among genes is not consistent, as shown by the opposite signs of the two correlation coefficients: Spearman’s ρ = −0.027 (P = 0.00498) and Pearson’s r = 0.044 (P = 4e-6, n = 11,083; Figure 1E). Thus, gene expression variability showed no consistent correlation with the mRNA decay rates of genes.
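Opposite signs for the two coefficients can arise when a few extreme values dominate the Pearson coefficient while the rank-based Spearman coefficient follows the bulk of the data. The Python sketch below illustrates this with made-up numbers; it is not a reanalysis of the decay rate data, and the rank function assumes there are no ties.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n of the values (assumes no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up values: a decreasing trend plus one extreme point.
toy_decay = [1, 2, 3, 4, 5, 6]
toy_cv = [4, 3, 2, 1, 0, 100]
r_toy = pearson(toy_decay, toy_cv)
rho_toy = spearman(toy_decay, toy_cv)
print(r_toy, rho_toy)
```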

Genetic variants underlying expression variability

To systematically assess the genetic influence on expression variability, we identified genome-wide evQTL using the method we previously established (Hulse and Cai 2013). We focused on cis-acting evQTL by limiting our search to those SNPs that flanked probes within 1.0 Mb on either side.

After filtering for quality control (Materials and Methods), a total of 35,078 probes were available for analysis. On average, each probe corresponded to 1212 SNPs in the 2-Mb cis-region (i.e., 6 SNPs per 10 kb). For each SNP–probe pair, we conducted a three-step test to determine the evQTL relationship as described previously (Hulse and Cai 2013). Briefly, we first tested for the homogeneity of variances in gene expression among different genotype groups using the Fligner–Killeen (F–K) test (Fligner and Killeen 1976). Only those SNPs with a P < 0.01 (following Fraser and Schadt 2010) were carried on to the next step of analysis. We then applied the dglm method (Ronnegard and Valdar 2011) to each SNP–probe pair, ultimately computing Pdispersion for a total of 1,251,611 SNP–probe pairs. To account for multiple tests performed between these probe and SNP pairs, we used the threshold of Pdispersion < $1 \times 10^{-8}$, which is roughly equivalent to Bonferroni-adjusted P < 0.01, to assess the genome-wide significance. Finally, we conducted permutation tests for each significant SNP–probe pair to control for the influence of outlier data points on the dglm results (Materials and Methods). The detection of evQTL was performed independently for each of the two groups of twin sets. The assignment of individual twins to each group was random and did not influence the overall results. Each evQTL detected with one group of twin sets was then validated with the other group of twin sets to confirm its authenticity. For all three tissues, concordance was prevalent (Figure 2A) and the cases of discordance were mostly due to outliers present in one group of twins but not in the other group. The direction of effect (association with increased or decreased gene expression variability) was the same between the two groups of the twin sets for all evaluated SNPs.
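The two approximations quoted in this paragraph can be checked with simple arithmetic, shown here as a short Python sketch (variable names are illustrative only).

```python
# Bonferroni adjustment: family-wise alpha divided by the number of tests.
n_tests = 1_251_611          # SNP-probe pairs tested with dglm
alpha = 0.01
bonferroni_threshold = alpha / n_tests   # roughly 8e-9, i.e. close to 1e-8

# SNP density in the cis-region: 1212 SNPs over 2 Mb, expressed per 10 kb.
snps_per_probe = 1212
cis_region_bp = 2_000_000
snps_per_10kb = snps_per_probe / (cis_region_bp / 10_000)

print(bonferroni_threshold, snps_per_10kb)
```

The exact Bonferroni value is slightly below the rounded threshold of 1 × 10^-8 used in the text, which is why the two are described as roughly equivalent.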

Figure 2.


Numbers of evQTL in LCL, skin, and fat. (A) Venn diagrams of evQTL genes detected in two groups of twin sets. Each group of the twin sets is composed of one set of unrelated twin individuals. Overlapping areas of the Venn diagrams contain numbers of validated evQTL genes identified with both sets of twins. (B) Numbers correspond to evQTL genes within a subset of tissues. (C) One example of evQTL shared by all three tissues: evQTL at SEMA4G.

A total of 99, 56, and 79 genes were identified and confirmed to have at least one validated cis-evQTL SNP (evSNP) in LCLs, skin, and fat, respectively (Table S1). Eight genes (corresponding to 4.2% of all unique evQTL genes) were shared in all three tissues (Figure 2B). One of these shared evQTL genes, SEMA4G, is given as an example to illustrate the consistent influence of genotypes on the variance of gene expression across the three tissues (Figure 2C). All evQTL shared across tissues showed the same directional effect, defined as either increasing or decreasing the variance of gene expression. That is to say, the directionality of some evQTL effects is not tissue- or cell-type specific. However, understanding regulatory variability in many different tissues might yield insights into the basic biological processes that influence tissue differentiation.

Given that many evQTL genes have more than one cis-evSNP, we examined the structure of haplotypes of these multiple cis-evSNPs. We found that cis-evSNPs of the same gene are likely to be located within the same LD block and that typically these blocks contained only a few prominent haplotypes (see Figure S2 for an example involving gene ALG11). This suggests that multiple evSNPs are likely to be linked with the same causal variant. We furthermore retrieved the ancestral allele information for SNPs from the 1000 Genomes Project Consortium (2012). The prediction of ancestral alleles was based on the phylogenetic trees constructed with sequences of human, chimpanzee, orangutan, and rhesus macaque. We found that ancestral alleles of evSNPs are more likely to be associated with smaller expression variability than derived alleles (Fisher’s exact test: P = 0.0036, 0.022, and 0.036 for LCLs, skin, and fat, respectively).

Dissect genetic and nongenetic effects of evQTL

Twin data facilitated the dissection of the contributions of genetic and nongenetic factors. Variability measured between pairs of DZ twins is expected to be larger than that between pairs of MZ twins, as the phenotypic difference between DZ pairs may result from both genetic and environmental (nongenetic) effects while differences between genetically identical MZ pairs are attributable to the environment, assuming that the environments influencing MZ and DZ twin individuals are essentially identical. Figure 3 depicts the difference in expression level of evQTL gene AXIN2 in three genotypes (GG, AG, and AA) defined by rs740026. Figure 3, A and B illustrate genotypes at rs740026 by linking the two data points for each twin pair by a straight line: Figure 3A shows genotype similarities between MZ twins, while in Figure 3B, similarities between DZ twin pairs are shown. Note that linkers between DZ twin pairs with different genotypes at the SNP site (i.e., DZ 1 ≠ DZ 2) are not plotted. The expression difference between a pair of twins can be visually quantified by the slope of the straight line: a steeper line reflects a more dissimilar expression level between the twins. In the case of AXIN2, it is apparent that expression differences between DZ pairs tend to be larger than between MZ pairs. This is especially true for the AA genotype group, which shows a larger variance in expression between individuals.

Figure 3.


Dissection of genetic and nongenetic effects of evQTL using twins data. (A) The evQTL between AXIN2 and rs740026. The expression data points from pairs of MZ twins are linked. (B) Same as A except that DZ twins are linked. (C) CDFs of RMD between twins classified into four groups, namely MZ-S, DZ-S, MZ-L, and DZ-L (see main text for definitions). The double arrow highlights the highly significant discrepancy in RMD distribution between MZ-L and DZ-L (K–S test, P < 0.01). Insert shows the same CDFs of RMD recomputed after randomly shuffling identities of corresponding MZ and DZ pairs. (D) Same as C except that data are randomly sampled from non-evQTL genes.

For each evSNP and its associated genes in LCLs, we computed the RMD in gene expression between all pairs of MZ or DZ twins, as long as the genotypes of two individuals of the pair of twins were both identical to each other and homozygous at the SNP site. By definition, each evSNP allele is associated with either larger (L) or smaller (S) variance in gene expression. Thus, the RMD values (for evSNPs and associated genes) were separated according to whether homozygous genotypes defined by evSNPs were associated with L or S variances in gene expression. The cumulative distribution functions (CDFs) of these RMD values were plotted (Figure 3C). The curves were based on the RMD values calculated between all possible twin pairs for all evSNPs and genes and classified into four groups: MZ-S, MZ-L, DZ-S, and DZ-L. The MZ-S and DZ-S groups included pairs whose genotypes showed a small (S) amount of variance, while the MZ-L and DZ-L groups included pairs whose genotypes were associated with large (L) variances. In the end, the four groups (MZ-S, MZ-L, DZ-S, and DZ-L) contained 3629, 2548, 3825, and 2520 RMD values, respectively. Detailed statistics for the distributions of RMD values in each of these four groups are provided (Table S2). We found that CDF curves for the large-variance groups (MZ-L and DZ-L) were shifted toward the right compared to those for small-variance groups (MZ-S and DZ-S) [Kolmogorov–Smirnov (K–S) test, all P < $10^{-5}$]. This indicated that the distribution of RMD between twin pairs (either MZ or DZ) in the large-variance groups was significantly different from that of the small-variance groups, with a larger RMD median for the large-variance groups. This difference (in RMD distribution between L and S groups) remained even when we randomly assigned the identities of MZ and DZ pairs (see insert in Figure 3C). Together, these results suggested that the increased discrepancy in gene expression between twin pairs (shown as a larger median RMD) contributed to the elevated variability in expression, which is true for both MZ and DZ twins. Because MZ twins are genetically identical, the increased RMD between MZ pairs was likely due to an increased sensitivity of gene expression to environmental factors.
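The comparison of RMD distributions rests on the two-sample K–S statistic, the largest vertical distance between two empirical CDFs. A minimal Python sketch of that statistic is given below; it omits the P-value calculation, and the function name and RMD values are hypothetical.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: max |ECDF_a(v) - ECDF_b(v)| over observed values."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for v in sorted(set(a) | set(b)):
        ecdf_a = bisect_right(a, v) / len(a)  # fraction of sample_a values <= v
        ecdf_b = bisect_right(b, v) / len(b)  # fraction of sample_b values <= v
        largest_gap = max(largest_gap, abs(ecdf_a - ecdf_b))
    return largest_gap

# Hypothetical RMD values for a small-variance and a large-variance group.
rmd_small = [0.01, 0.02, 0.03, 0.04]
rmd_large = [0.03, 0.04, 0.05, 0.06]
d_stat = ks_statistic(rmd_small, rmd_large)
print(d_stat)
```

A CDF curve shifted to the right, as described for the L groups, shows up as a large gap between the two empirical CDFs.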

More importantly, we found a significant discrepancy in distribution of RMD between MZ and DZ: DZ groups tended to have larger RMD values than MZ groups. This trend applied to both L and S groups, but was more obvious in the L group (all K–S tests, P < 0.01) (Figure 3C). These results suggested that the different genetic backgrounds resulted in a larger difference in gene expression between DZ twin pairs, which is more pronounced than that observed between MZ twin pairs.

For comparison, we randomly selected the same number of genes and cis-SNPs and conducted the same analysis of RMD distribution. There was little difference between CDFs of RMD in these non-evQTL genes with regard to MZ or DZ twins, larger or smaller variance groups, or before or after shuffling of the twin identities. CDFs of all groups were more similar to each other (K–S test, all P > 0.025, except between MZ-S and DZ-L, P = 2.9e-4; Table S2, Figure 3D). That is to say, the influence of genetic and/or environmental effects on variable expression was not detected at the genomic level for all genes, but was limited to evQTL regions.

Finally, we repeated the CDF analyses using the RMD values computed from the Box–Cox normalized log2-transformed expression data, as well as using the absolute difference (instead of RMD) in gene expression. In both cases, our findings were highly similar to those obtained above (Figure S3), which supports the robustness of the results presented above.

Possible evQTL replicated by using RNA-seq data and SNPs of the 1000 Genomes Project

We obtained genotype data for fully sequenced samples of European ancestry (CEU) from the phase 1 release of the 1000 Genomes Project (1000 Genomes Project Consortium 2012), along with short reads from RNA-seq experiments in LCLs for these same individuals (n = 43) (Montgomery et al. 2010). After mapping the short reads, we estimated the expression level in fragments per kilobase of exon per million fragments (FPKM) (Trapnell et al. 2013) for all genes. For the same evQTL gene–SNP pairs detected in LCLs (i.e., those included in Table S1), we plotted the relationships between genotype and FPKM value for each. Many evQTL relationships could be recognized by visual inspection (examples are presented in Figure S4), though none were statistically significant due to the small sample size. It is noteworthy that the algorithm for computing FPKM models the dispersion in a transcript’s fragment count with a negative binomial distribution (Trapnell et al. 2013), which may introduce a relationship between the mean and variance of the count. The relationship should be taken into account in FPKM-based evQTL analyses.

Partially linked SNPs contribute to variable gene expression

Recent theoretical work showed that the within-genotype variance of a quantitative trait varies when a nonadditive genetic interaction or epistasis is present (Pare et al. 2010; Struchalin et al. 2010). Alternatively, differences in the variance of a quantitative trait may result from the interaction between genetic variants that are additively associated with the mean of the trait. To test this, we employed a two-step procedure to identify SNPs partially associated (or interacting) with evSNPs through an incomplete haplotype structure (Materials and Methods). In an ideal scenario (Figure 4A), the genotype heterozygosity of the partially linked SNPs is large among individuals whose evSNP genotype is associated with larger expression variability (L group), while the genotype heterozygosity is small or equals zero among individuals whose evSNP genotype is associated with smaller expression variability (S group). If the interacting SNP is associated with the mean level of gene expression, then the association between the evSNP genotype and greater expression variability is likely due to the partial linkage between the evSNP and the interacting SNP.

Figure 4.

Schematic and example of an interacting SNP that contributes to the creation of an evQTL. (A) L indicates the group of individuals with the evSNP genotype (C/C) associated with large variance in gene expression, while S indicates the group with the evSNP genotype (A/A) associated with small variance. The interacting SNP shows large genotype heterogeneity in the L group and small or no genotype heterogeneity in the S group. (B) Real example of evSNP rs742090 and interacting SNP rs3799378 at BTN3A2. Individuals with the rs742090-CC genotype are further broken down by rs3799378 into three subgenotype groups, which are associated with different mean gene expression levels (shadowed panel).

Given these considerations, we performed a genome-wide search to identify a set of candidate interacting SNPs for each evQTL SNP and then used simple linear regression analysis to evaluate whether the potential interacting SNPs are significantly associated with gene expression among L-group individuals (Materials and Methods). For the 99 evQTL in LCLs, we identified 56 with at least one interacting SNP (Table S3). Among these interacting SNPs, 54 are located within the cis-region of the evSNPs with which they interact. Figure 4B presents one such relationship between evSNP rs742090 and interacting SNP rs3799378, both at BTN3A2. Individuals with a CC genotype for evSNP rs742090 were further sorted by rs3799378 genotypes. Clearly, the expression level of BTN3A2 in individuals with the rs742090-CC genotype is significantly influenced by rs3799378 genotypes. The increased variability in gene expression shown in individuals with the rs742090-CC genotype is caused by the heterogeneity of rs3799378 genotypes. These results suggest that local haplotype structure between SNPs contributed to the creation of evQTL.
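The logic of this screen can be sketched in a few lines. The sketch below is not the procedure from Materials and Methods: it assumes genotypes coded as 0, 1, or 2, measures genotype heterogeneity as one minus the sum of squared genotype frequencies (an assumed definition), and fits an ordinary least-squares regression of expression on the candidate SNP within the L group. All names and the simulated data are hypothetical.

```python
import numpy as np
from scipy.stats import linregress


def genotype_heterogeneity(genotypes):
    """1 - sum of squared genotype frequencies (assumed definition)."""
    _, counts = np.unique(np.asarray(genotypes), return_counts=True)
    freqs = counts / counts.sum()
    return float(1.0 - np.sum(freqs ** 2))


def screen_candidate(ev_geno, cand_geno, expr, l_genotype, s_genotype):
    """Compare candidate-SNP heterogeneity in L vs. S and regress within L."""
    ev_geno, cand_geno, expr = map(np.asarray, (ev_geno, cand_geno, expr))
    in_l = ev_geno == l_genotype
    in_s = ev_geno == s_genotype
    fit = linregress(cand_geno[in_l], expr[in_l])
    return {
        "het_L": genotype_heterogeneity(cand_geno[in_l]),
        "het_S": genotype_heterogeneity(cand_geno[in_s]),
        "slope": fit.slope,
        "pvalue": fit.pvalue,
    }


rng = np.random.default_rng(1)
n = 150
# Toy data: evSNP genotype 2 defines the L group, genotype 0 the S group.
ev_geno = np.repeat([2, 0], n)
# The candidate SNP segregates only among L-group individuals.
cand_geno = np.concatenate([rng.choice(3, size=n), np.zeros(n, dtype=int)])
# Purely additive effect of the candidate SNP on mean expression.
expr = 5.0 + 1.0 * cand_geno + rng.normal(0.0, 0.3, 2 * n)

result = screen_candidate(ev_geno, cand_geno, expr, l_genotype=2, s_genotype=0)
print(result)
```

In this toy example the candidate SNP is heterogeneous only in the L group and shifts mean expression there, so the L group shows the larger expression variance without any epistatic term in the model.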

Linking evQTL with complex disease phenotypes

Several studies have utilized eQTL data to interpret the discoveries of association studies of complex traits (Emilsson et al. 2008; Nica et al. 2010; Nicolae et al. 2010). In this same vein, we identified evQTL associated with complex traits from the catalog of published genome-wide association studies (GWAS) (http://www.genome.gov/gwastudies/). From the results of these GWAS, we identified 61 reported genes as evQTL genes (Table S4). In four cases, the exact same SNP was found to be both an evSNP and a marker SNP associated with risk or susceptibility to the complex trait (Table 1). Intriguingly, the “T” allele of rs8070463, associated with smaller expression variability of TBKBP1, is a reported risk allele in multiple sclerosis (Patsopoulos et al. 2011), while the “C” allele for this same SNP, associated with larger expression variability, is linked with risk for ankylosing spondylitis (Evans et al. 2011).

Table 1. SNPs associated with gene expression variability and human complex traits.

Gene (evSNP) | Tissue | GWAS complex trait | Risk allele | Reference
PAX8 (rs11123170) | LCL, fat, skin | Renal function-related traits (BUN) | rs11123170-G (L) | Okada et al. (2012)
WDR41 (rs163030) | LCL, fat, skin | Caudate nucleus volume | rs163030-A (L) | Stein et al. (2011)
HCG22 (rs2517532) | LCL | Hypothyroidism | rs2517532-G (S) | Eriksson et al. (2012)
TBKBP1 (rs8070463) | LCL | Multiple sclerosis | rs8070463-T (S) | Patsopoulos et al. (2011)
TBKBP1 (rs8070463) | LCL | Ankylosing spondylitis | rs8070463-C (L) | Evans et al. (2011)

L and S indicate that individuals carrying the homozygous genotype of the risk allele have large and small variance in gene expression, respectively. GWAS, genome-wide association studies; LCL, lymphoblastoid cell line.

Discussion

There is empirical evidence across several species that the variance among phenotypes is genotype dependent (Ansel et al. 2008; Wolc et al. 2009; Hill and Mulder 2010; Jimenez-Gomez et al. 2011). Understanding genetic control of phenotypic variability has become increasingly important in evolutionary biology, human medicine, the agricultural industry and other branches of biological science (Gibson 2009; Yang et al. 2012). Despite the importance, few research programs focus on genetic variants associated with trait variance, while studies of trait averages abound. Recently, a powerful statistical framework based on the dglm model has been developed for studying phenotypic variability of complex traits (Ronnegard and Valdar 2011). Given that gene expression is a complex trait with highly variable and heritable patterns (Stranger et al. 2005; Williams et al. 2007; Montgomery and Dermitzakis 2011), we have previously adapted the dglm method to investigate genetic variants controlling expression variability (Hulse and Cai 2013).

In this study, we further investigated the relative contributions of genetic and nongenetic (environmental) factors to expression variability and the roles of these factors in the formation of evQTL. We started by exploring basic gene expression statistics measured in the TwinsUK cohort. For all genes, expression level dispersions were highly similar in and between both MZ and DZ twins. No correlations with expression variability were detected when compared between individuals, between single cells, or relative to the average mRNA decay rate, highlighting the marked discrepancies in variability measured at population and molecular levels. Further results showed that the discordance in expression between each pair of DZ twins was more pronounced than that between MZ twins, implying that the increased amount of genetic variation between DZ twins influences expression variability. Next, we systematically identified cis-acting evQTL in three tissues of the TwinsUK cohort. Twin data greatly facilitated the validation of detected evQTL and revealed overall robust signals that would otherwise not be appreciable in studies of nontwin design. Focusing on the detected evQTL, we showed that the discordance in expression between DZ pairs was larger than that between MZ pairs, and further showed that the discordance in expression between MZ pairs whose genotypes were associated with large expression variability was significantly larger than that between MZ pairs whose genotypes were associated with small expression variability. It is intriguing to find that the phenotypic discordance remained even in the absence of genetic variation between MZ twins. This might be explained by incomplete penetrance of mutations, which is frequent in isogenic model organisms in homogeneous environments (Horvitz and Sulston 1980; Gartner 1990). 
The discordance might also have an epigenetic basis: for example, DNA methylation, which can be influenced by environmental factors such as diet and lifestyle, is known to affect gene expression (Badano and Katsanis 2002; Baranzini et al. 2010). Lastly, much to our surprise, we found that more than half of evQTL could be explained by a conceptually simple scenario in which the evSNP was partially associated with a nearby SNP that influenced gene expression both additively and independently. We suspect there should be many different modes of nonepistatic interaction between two or more genetic variants, such as the mode of partial linkage we have described here, giving rise to genotype-dependent expression variances. That is to say, the majority of phenotypic variability across individuals might be explained without invoking epistasis (Hill et al. 2008; Powell et al. 2013).

In light of our new findings, several related considerations are discussed below.

Methodological considerations for studying phenotypic variability

The procedure we used for identifying evQTL (Materials and Methods and Hulse and Cai 2013) consisted of three steps. First, the F–K test was applied to test for the heterogeneity of variances of gene expression between different genotypes and identify corresponding SNPs. Next, the dglm method was applied to the selected SNPs. The significant results of the dglm test were then subjected to permutation tests to reduce the influence of outliers in the data. This procedure is less likely to be susceptible to issues related to multiple testing and outliers in input data, though a formal assessment of its statistical power remains to be done.
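As a rough illustration of the first and third steps, the sketch below applies the Fligner–Killeen (F–K) test to simulated expression values grouped by genotype and then derives an empirical P-value by permuting genotype labels. The dglm step is omitted, and the permutation scheme shown (shuffling genotypes and recomputing the F–K statistic) is an assumption made for illustration; it is not claimed to match the procedure in Materials and Methods. Function names and data are hypothetical.

```python
import numpy as np
from scipy.stats import fligner


def fk_test(expr, geno):
    """Fligner-Killeen test for unequal variances across genotype groups."""
    groups = [expr[geno == g] for g in np.unique(geno)]
    return fligner(*groups)


def permutation_pvalue(expr, geno, n_perm=200, seed=0):
    """Empirical P-value from shuffling genotype labels (assumed scheme)."""
    rng = np.random.default_rng(seed)
    observed = fk_test(expr, geno).statistic
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(geno)
        if fk_test(expr, shuffled).statistic >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


rng = np.random.default_rng(2)
geno = np.repeat([0, 1, 2], 100)
# Toy evQTL: same mean in every genotype group, but increasing spread.
sd_by_genotype = np.array([0.5, 1.0, 2.0])
expr = rng.normal(10.0, sd_by_genotype[geno])

fk = fk_test(expr, geno)
perm_p = permutation_pvalue(expr, geno)
print(f"F-K statistic = {fk.statistic:.2f}, P = {fk.pvalue:.2e}")
print(f"permutation P = {perm_p:.4f}")
```

With 200 permutations the smallest attainable empirical P-value is 1/201, so a real analysis would need many more permutations to resolve small P-values.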

Statistical methods, including Levene’s test (Pare et al. 2010; Struchalin et al. 2010), squared residual value linear modeling (SVLM) (Struchalin et al. 2012), and the methods by Yang et al. (2011), have been applied in studying phenotypic variability (see review in Geiler-Samerotte et al. 2013). As a fully parametric approach, the dglm method (Ronnegard and Valdar 2011) has several advantages. For example, it accounts for the uncertainty of fitted parameters for both the mean and the variance aspects of the model and also allows fitting of covariates (Ronnegard and Valdar 2012); it is also highly flexible, allowing for any response distribution from the exponential family (Smyth 2002) (such as binomial, Poisson, or gamma) to be modeled (see section 2 in the Supporting Theory of Ronnegard and Valdar 2011 for an example of modeling gamma-distributed traits).

Given the flexibility of the dglm method, we acknowledge that the results of our evQTL analysis are likely to be dependent on how the dglm analysis was set up. For this study, we adopted the Gaussian error distribution and link function because no significant departure from normality was found in the expression data. The effect of different methods of normalization on statistical interpretation of gene expression remains subject to careful scrutiny (Bolker et al. 2009; Qin et al. 2012; Geiler-Samerotte et al. 2013). For example, normalizations may perturb the covariance structure of input data or change the scale of the resulting data. Thus, the impacts of different methods of data transformation and normalization should be carefully considered in future studies involving evQTL analysis. Finally, we acknowledge that the dglm analysis described in this article may be influenced by the scale effect (e.g., mean–variance relationship). It is not uncommon for trait variance to change with trait mean, often causing trait skewness. If this occurs, any SNP associated with a large increase in mean expression would also be associated with an increase in variability (Ronnegard and Valdar 2011). Analyses studying a specific phenotype and/or with a more narrowly targeted focus than that of the broad-based study described in this article should employ a more conservative approach in which QTL associated strictly with variance (i.e., those affecting only variability and not the mean) are identified, using the procedure proposed by Ronnegard and Valdar (2011).

Additive vs. epistatic effect of genotypes on phenotypic variation in a population

Quantitative geneticists partition the genetic effect on phenotypic variation between individuals into additive, dominance, and epistatic components. The additive component describes the variance associated with the independent contributions of alleles, while dominance describes the variance contributed by interactions between alleles at the same locus, and epistasis refers to the contribution of interactions between alleles at different loci. For most complex traits, quantitative genetic theory (Hill et al. 2008; Crow 2010) suggests that epistasis is unlikely to contribute substantially to the between-individual variation. That is to say, most of the variation in a population will be due to the additive effects of specific allelic combinations. Yet this assertion is not without controversy. The results of empirical linkage mapping and association studies suggest that epistasis seems to explain considerable variation in complex trait characteristics within natural populations (Carlborg and Haley 2004; Zuk et al. 2012).

Our results showed that >50% of evQTL can be explained by a partial linkage (i.e., partial association between haplotypes) of the evQTL SNP and another SNP nearby. Our interacting SNP analysis only considered a simplistic scenario of the association. There are many other possible ways of partial associations in which SNPs interact. For example, consider the genotyped SNP “A/a” and the causative expression QTL “Q/q,” with only three haplotypes segregating in the population: AQ, aQ, and aq (as would occur if the novel “q” allele arose on the “a” haplotype). Then the “a” SNP allele would be associated with a changed trait mean and a higher trait variance as the expression QTL segregates within that genotype. If we could take all possible partial associations into account, we would anticipate that even more evQTL could be explained by the effect of partial linkage, rather than epistasis. We therefore conclude that much variance in a quantitative trait may be explained by partial linkage between local genetic variants, each additively associated with the trait. Our view is supported by the results of recent studies. Powell et al. (2013) conducted a gene expression study using blood samples from 862 individuals from nuclear families containing MZ or DZ twin pairs, using both pedigree and genotype information. They found that the genetic architecture of gene expression is predominantly additive, with a minority of transcripts displaying nonadditive effects. Hill et al. (2008) evaluated the evidence from empirical studies of genetic variance components and found that additive variance typically accounts for over half and often close to 100% of the total genetic variance.
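The three-haplotype example (AQ, aQ, and aq) can be checked with a small simulation. The sketch below is illustrative only: the haplotype frequencies, effect size, and noise level are arbitrary assumptions, and expression depends additively on the number of “q” alleles with no interaction term.

```python
import numpy as np


def simulate_three_haplotypes(n=6000, freqs=(0.4, 0.3, 0.3),
                              beta=1.0, noise_sd=0.5, seed=0):
    """Draw diploid individuals from haplotypes 0=AQ, 1=aQ, 2=aq."""
    rng = np.random.default_rng(seed)
    haps = rng.choice(3, size=(n, 2), p=freqs)
    a_count = np.sum(haps >= 1, axis=1)   # copies of the "a" SNP allele
    q_count = np.sum(haps == 2, axis=1)   # copies of the "q" QTL allele
    # Strictly additive model: no epistatic or dominance term.
    expr = beta * q_count + rng.normal(0.0, noise_sd, n)
    return a_count, q_count, expr


def summarize_by_snp_genotype(a_count, expr):
    """Mean and variance of expression for SNP genotypes AA, Aa, aa."""
    return {g: (float(expr[a_count == g].mean()),
                float(expr[a_count == g].var(ddof=1)))
            for g in (0, 1, 2)}


a_count, q_count, expr = simulate_three_haplotypes()
summary = summarize_by_snp_genotype(a_count, expr)
for g, label in zip((0, 1, 2), ("AA", "Aa", "aa")):
    mean, var = summary[g]
    print(f"{label}: mean = {mean:.2f}, variance = {var:.2f}")
```

Because “q” occurs only on the “a” background, the QTL segregates only among carriers of “a”: AA individuals show just the residual variance, while aa individuals show both a shifted mean and a larger variance, even though the QTL acts purely additively.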

Detecting evQTL as a shortcut for detecting epistasis?

Detection of the variance of a quantitative trait in genetic association studies is thought to increase knowledge about the interaction between genetic variants. More specifically, detecting variability QTL (e.g., evQTL) is considered to be a shortcut for detecting genetic interactions (Ronnegard and Valdar 2011, 2012). So far, many methods for detecting genetic interactions are based on testing for different variances of phenotype between genotypes, with the underlying assumption that the variance of a quantitative trait is likely to differ under the influence of epistasis (Pare et al. 2010; Ronnegard and Valdar 2011). However, our new discovery that evQTL are formed due to the partial haplotype association between SNPs refutes this assumption. As stated above, more than half (and probably many more) of the evQTL identified could be explained by partial linkage between SNPs with additive effects. Both additive and epistatic effects can result in increased phenotypic variation (as schematically illustrated in Figure S5). Merely detecting the variance of a quantitative trait cannot in itself distinguish between the additive and epistatic effects; thus, no specific conclusions can be made. The relationship between partially associated SNPs, each additively associated with phenotypic variation, needs to be integrated more carefully in the study of phenotypic variability. Thus, the variance of a quantitative trait should not serve as a hallmark of genetic interaction or epistasis.

Phenotypic variability and implications in complex traits and diseases

High-throughput sequencing and genotyping technologies have spurred an increasing number of studies detecting genotype–phenotype relationships and mapping in complex, polygenic traits and human diseases (Hindorff et al. 2009). The remarkable success of GWAS is accompanied by the issue of the “missing heritability” (Manolio et al. 2009), namely the fact that the trait-associated SNPs identified through GWAS often account for only a small proportion of the observed correlations in phenotype between relatives. The reason behind this issue has been thought to be that additional genetic factors remain to be found, and that the presence of epistasis is a particular cause for concern (Carlborg and Haley 2004; Moore and Williams 2009; Ueki and Cordell 2012). In reality, if the effect of one locus is altered or masked by effects at another locus, power to detect the first locus is likely to be reduced and elucidation of the joint effects at the two loci will be hindered by their interaction. Consequently, a large amount of research has been devoted to the detection and investigation of epistatic interactions; a number of methods for detecting the interaction between SNPs have been proposed (Pare et al. 2010; Struchalin et al. 2010; Shang et al. 2011; Daye et al. 2012; Struchalin et al. 2012), yet there has been much confusion in the literature over definitions and interpretations of epistasis (Cordell 2002).

This study, along with other studies (Hill et al. 2008; Powell et al. 2013), has pointed out that a detailed investigation of local haplotype structure between SNPs at the same locus is necessary to reveal their combined influences on phenotypes of complex traits. For example, we have identified a list of evSNPs that are also associated with human complex traits (see Table 1). Further investigations of partial linkage between SNPs that may influence these traits should be performed. The same should also be done for the FTO (fat mass and obesity associated) gene locus, whose genotype is associated with phenotypic variability of body mass index (Yang et al. 2012).

Finally, we point out that an interaction detected via statistical models is different from the biological interaction (Phillips 1998; Cordell 2002; Wang et al. 2010). The lack of direct correspondence between statistical and biological interactions makes it difficult to make strong inferences concerning biological mechanisms based on interaction terms from a statistical model (Ueki and Cordell 2012). Therefore, detection of statistical interaction merely provides a good starting point for a more focused investigation of the joint involvement of the relevant factors, which can perhaps be better addressed through other types of experimental data. Our findings suggest that there is a lot that can be done at the statistical level to prioritize those loci that are most likely to produce significant experimental results.

Conclusions

In conclusion, we used evQTL as a statistical model system for studying phenotypic variability and dissected the genetic and nongenetic effects by using the twin data. Our findings concerning evQTL offer new insights into the relative contributions of genetic and environmental factors to the formation of evQTL. Dissecting the genetic components underlying phenotypic variability into additive and epistatic effects allowed the dominant role of the additive effect to be revealed.

Supplementary Material

Supporting Information

Acknowledgments

We thank the TwinsUK study for producing the datasets, Ken Livak for sharing the single-cell expression data, and the Whole Systems Genomics Initiative of Texas A&M University for providing computational resources. The TwinsUK study was funded by the Wellcome Trust and the European Community’s Seventh Framework Programme (FP7/2007–2013). The study also received support from the National Institute for Health Research (NIHR)’s Clinical Research Facility at Guy’s and St. Thomas’ National Health Service (NHS) Foundation Trust and NIHR’s Biomedical Research Centre based at Guy’s and St. Thomas’ NHS Foundation Trust and King’s College London. SNP genotyping was performed by the Wellcome Trust Sanger Institute and the National Eye Institute via National Institutes of Health/Center for Inherited Disease Research (CIDR).

Footnotes

Communicating editor: E. Petretto

Literature Cited

  1. Acar M., Mettetal J. T., van Oudenaarden A., 2008. Stochastic switching as a survival strategy in fluctuating environments. Nat. Genet. 40: 471–475.
  2. Andrew T., Hart D. J., Snieder H., de Lange M., Spector T. D., et al., 2001. Are twins and singletons comparable? A study of disease-related and lifestyle characteristics in adult women. Twin Res. 4: 464–477.
  3. Ansel J., Bottin H., Rodriguez-Beltran C., Damon C., Nagarajan M., et al., 2008. Cell-to-cell stochastic variation in gene expression is a complex genetic trait. PLoS Genet. 4: e1000049.
  4. Badano J. L., Katsanis N., 2002. Beyond Mendel: an evolving view of human genetic disease transmission. Nat. Rev. Genet. 3: 779–789.
  5. Bahar R., Hartmann C. H., Rodriguez K. A., Denny A. D., Busuttil R. A., et al., 2006. Increased cell-to-cell variation in gene expression in ageing mouse heart. Nature 441: 1011–1014.
  6. Baranzini S. E., Mudge J., van Velkinburgh J. C., Khankhanian P., Khrebtukova I., et al., 2010. Genome, epigenome and RNA sequences of monozygotic twins discordant for multiple sclerosis. Nature 464: 1351–1356.
  7. Barrett J. C., Fry B., Maller J., Daly M. J., 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21: 263–265.
  8. Bolker B. M., Brooks M. E., Clark C. J., Geange S. W., Poulsen J. R., et al., 2009. Generalized linear mixed models: a practical guide for ecology and evolution. Trends Ecol. Evol. 24: 127–135.
  9. Carlborg O., Haley C. S., 2004. Epistasis: Too often neglected in complex trait studies? Nat. Rev. Genet. 5: 618–625.
  10. Cheverud J. M., Routman E. J., 1995. Epistasis and its contribution to genetic variance components. Genetics 139: 1455–1461.
  11. Choy E., Yelensky R., Bonakdar S., Plenge R. M., Saxena R., et al., 2008. Genetic analysis of human traits in vitro: drug response and gene expression in lymphoblastoid cell lines. PLoS Genet. 4: e1000287.
  12. Cordell H. J., 2002. Epistasis: what it means, what it doesn’t mean, and statistical methods to detect it in humans. Hum. Mol. Genet. 11: 2463–2468.
  13. Crow J. F., 2010. On epistasis: why it is unimportant in polygenic directional selection. Philos. Trans. R. Soc. Lond. B Biol. Sci. 365: 1241–1244.
  14. Daye Z. J., Chen J., Li H., 2012. High-dimensional heteroscedastic regression with an application to eQTL data analysis. Biometrics 68: 316–326.
  15. Emilsson V., Thorleifsson G., Zhang B., Leonardson A. S., Zink F., et al., 2008. Genetics of gene expression and its effect on disease. Nature 452: 423–428.
  16. Eriksson N., Tung J. Y., Kiefer A. K., Hinds D. A., Francke U., et al., 2012. Novel associations for hypothyroidism include known autoimmune risk loci. PLoS ONE 7: e34442.
  17. Evans D. M., Spencer C. C. A., Pointon J. J., Su Z., Harvey D., et al., 2011. Interaction between ERAP1 and HLA-B27 in ankylosing spondylitis implicates peptide handling in the mechanism for HLA-B27 in disease susceptibility. Nat. Genet. 43: 761–767.
  18. Fligner M. A., Killeen T. J., 1976. Distribution-free 2-sample tests for scale. J. Am. Stat. Assoc. 71: 210–213.
  19. Fraser H. B., Schadt E. E., 2010. The quantitative genetics of phenotypic robustness. PLoS ONE 5: e8635.
  20. Gartner K., 1990. A third component causing random variability beside environment and genotype. A reason for the limited success of a 30 year long effort to standardize laboratory animals? Lab. Anim. 24: 71–77.
  21. Geiler-Samerotte K., Bauer C., Li S., Ziv N., Gresham D., et al., 2013. The details in the distributions: why and how to study phenotypic variability. Curr. Opin. Biotechnol. 24: 752–759.
  22. Gibson G., 2009. Decanalization and the origin of complex disease. Nat. Rev. Genet. 10: 134–140.
  23. Grundberg E., Small K. S., Hedman A. K., Nica A. C., Buil A., et al., 2012. Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat. Genet. 44: 1084–1089.
  24. Hallgrímsson B., Hall B. K., 2005. Variation. Academic Press, Amsterdam.
  25. Hill W. G., Zhang X. S., 2004. Effects on phenotypic variability of directional selection arising through genetic differences in residual variability. Genet. Res. 83: 121–132.
  26. Hill W. G., Mulder H. A., 2010. Genetic analysis of environmental variation. Genet. Res. 92: 381–395.
  27. Hill W. G., Goddard M. E., Visscher P. M., 2008. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet. 4: e1000008.
  28. Hindorff L. A., Sethupathy P., Junkins H. A., Ramos E. M., Mehta J. P., et al., 2009. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA 106: 9362–9367.
  29. Horvitz H. R., Sulston J. E., 1980. Isolation and genetic characterization of cell-lineage mutants of the nematode Caenorhabditis elegans. Genetics 96: 435–454.
  30. Hulse A. M., Cai J. J., 2013. Genetic variants contribute to gene expression variability in humans. Genetics 193: 95–108.
  31. Jimenez-Gomez J. M., Corwin J. A., Joseph B., Maloof J. N., Kliebenstein D. J., 2011. Genomic analysis of QTLs and genes altering natural variation in stochastic noise. PLoS Genet. 7: e1002295.
  32. Kaern M., Elston T. C., Blake W. J., Collins J. J., 2005. Stochasticity in gene expression: from theories to phenotypes. Nat. Rev. Genet. 6: 451–464.
  33. Kemkemer R., Schrank S., Vogel W., Gruler H., Kaufmann D., 2002. Increased noise as an effect of haploinsufficiency of the tumor-suppressor gene neurofibromatosis type 1 in vitro. Proc. Natl. Acad. Sci. USA 99: 13783–13788.
  34. Livak K. J., Wills Q. F., Tipping A. J., Datta K., Mittal R., et al., 2013. Methods for qPCR gene expression profiling applied to 1440 lymphoblastoid single cells. Methods 59: 71–79.
  35. Maheshri N., O’Shea E. K., 2007. Living with noisy genes: how cells function reliably with inherent variability in gene expression. Annu. Rev. Biophys. Biomol. Struct. 36: 413–434.
  36. Manolio T. A., Collins F. S., Cox N. J., Goldstein D. B., Hindorff L. A., et al., 2009. Finding the missing heritability of complex diseases. Nature 461: 747–753.
  37. Moayyeri A., Hammond C. J., Hart D. J., Spector T. D., 2013. The UK Adult Twin Registry (TwinsUK Resource). Twin Res. Hum. Genet. 16: 144–149.
  38. Montgomery S. B., Dermitzakis E. T., 2011. From expression QTLs to personalized transcriptomics. Nat. Rev. Genet. 12: 277–282.
  39. Montgomery S. B., Sammeth M., Gutierrez-Arcelus M., Lach R. P., Ingle C., et al., 2010. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 464: 773–777.
  40. Moore J. H., Williams S. M., 2009. Epistasis and its implications for personal genetics. Am. J. Hum. Genet. 85: 309–320.
  41. Nica A. C., Montgomery S. B., Dimas A. S., Stranger B. E., Beazley C., et al., 2010. Candidate causal regulatory effects by integration of expression QTLs with complex trait genetic associations. PLoS Genet. 6: e1000895.
  42. Nica A. C., Parts L., Glass D., Nisbet J., Barrett A., et al., 2011. The architecture of gene regulatory variation across multiple human tissues: the MuTHER study. PLoS Genet. 7: e1002003.
  43. Nicolae D. L., Gamazon E., Zhang W., Duan S., Dolan M. E., et al., 2010. Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet. 6: e1000888.
  44. Okada Y., Sim X., Go M. J., Wu J. Y., Gu D. F., et al., 2012. Meta-analysis identifies multiple loci associated with kidney function-related traits in east Asian populations. Nat. Genet. 44: 904.
  45. Pai A. A., Cain C. E., Mizrahi-Man O., De Leon S., Lewellen N., et al., 2012. The contribution of RNA decay quantitative trait loci to inter-individual variation in steady-state gene expression levels. PLoS Genet. 8: e1003000.
  46. Pare G., Cook N. R., Ridker P. M., Chasman D. I., 2010. On the use of variance per genotype as a tool to identify quantitative trait interaction effects: a report from the Women’s Genome Health Study. PLoS Genet. 6: e1000981.
  47. Patsopoulos N. A., Bayer Pharma MS Genetics Working Group; Steering Committees of Studies Evaluating IFNβ-1b and a CCR1-Antagonist; ANZgene Consortium; GeneMSA, et al., 2011. Genome-wide meta-analysis identifies novel multiple sclerosis susceptibility loci. Ann. Neurol. 70: 897–912.
  48. Perry G. M. L., Nehrke K. W., Bushinsky D. A., Reid R., Lewandowski K. L., et al., 2012. Sex modifies genetic effects on residual variance in urinary calcium excretion in rat (Rattus norvegicus). Genetics 191: 1003–1013.
  49. Phillips P. C., 1998. The language of gene interaction. Genetics 149: 1167–1171.
  50. Pickrell J. K., Marioni J. C., Pai A. A., Degner J. F., Engelhardt B. E., et al., 2010. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464: 768–772.
  51. Powell J. E., Henders A. K., McRae A. F., Kim J., Hemani G., et al., 2013. Congruence of additive and non-additive effects on gene expression estimated from pedigree and SNP data. PLoS Genet. 9: e1003502.
  52. Qin S., Kim J., Arafat D., Gibson G., 2012. Effect of normalization on statistical and biological interpretation of gene expression profiles. Front. Genet. 3: 160.
  53. Queitsch C., Sangster T. A., Lindquist S., 2002. Hsp90 as a capacitor of phenotypic variation. Nature 417: 618–624.
  54. Ramasamy A., Trabzuni D., Gibbs J. R., Dillman A., Hernandez D. G., et al., 2013. Resolving the polymorphism-in-probe problem is critical for correct interpretation of expression QTL studies. Nucleic Acids Res. 41: e88.
  55. Ronnegard L., Valdar W., 2011. Detecting major genetic loci controlling phenotypic variability in experimental crosses. Genetics 188: 435–447.
  56. Ronnegard L., Valdar W., 2012. Recent developments in statistical methods for detecting genetic loci affecting phenotypic variability. BMC Genet. 13: 63.
  57. Shang J., Zhang J., Sun Y., Liu D., Ye D., et al., 2011. Performance analysis of novel methods for detecting epistasis. BMC Bioinformatics 12: 475.
  58. Shen X., Pettersson M., Ronnegard L., Carlborg O., 2012. Inheritance beyond plain heritability: variance-controlling genes in Arabidopsis thaliana. PLoS Genet. 8: e1002839.
  59. Smyth G. K., 2002. An efficient algorithm for REML in heteroscedastic regression. J. Comput. Graph. Statist. 11: 836–847.
  60. Spector T. D., Williams F. M. K., 2006. The UK Adult Twin Registry (TwinsUK). Twin Res. Hum. Genet. 9: 899–906.
  61. Stein J. L., Hibar D. P., Madsen S. K., Khamis M., McMahon K. L., et al., 2011. Discovery and replication of dopamine-related gene effects on caudate volume in young and elderly populations (N=1198) using genome-wide search. Mol. Psychiatry 16: 927–937.
  62. Stranger B. E., Forrest M. S., Clark A. G., Minichiello M. J., Deutsch S., et al., 2005. Genome-wide associations of gene expression variation in humans. PLoS Genet. 1: e78.
  63. Stranger B. E., Nica A. C., Forrest M. S., Dimas A., Bird C. P., et al. , 2007.  Population genomics of human gene expression. Nat. Genet. 39: 1217–1224. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Struchalin M. V., Dehghan A., Witteman J. C., van Duijn C., Aulchenko Y. S., 2010.  Variance heterogeneity analysis for detection of potentially interacting genetic loci: method and its limitations. BMC Genet. 11: 92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Struchalin M. V., Amin N., Eilers P. H., van Duijn C. M., Aulchenko Y. S., 2012.  An R package “VariABEL” for genome-wide searching of potentially interacting loci by testing genotypic variance heterogeneity. BMC Genet. 13: 4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Teo Y. Y., Inouye M., Small K. S., Gwilliam R., Deloukas P., et al. , 2007.  A genotype calling algorithm for the Illumina BeadArray platform. Bioinformatics 23: 2741–2746. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. The 1000 Genomes Project Consortium, 2012.  An integrated map of genetic variation from 1,092 human genomes. Nature 491: 56–65. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Trapnell C., Hendrickson D. G., Sauvageau M., Goff L., Rinn J. L., et al. , 2013.  Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat. Biotechnol. 31: 46–53. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Ueki M., Cordell H. J., 2012.  Improved statistics for genome-wide interaction analysis. PLoS Genet. 8: e1002625. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Verbyla, A. P., and G. K. Smyth, 1998 Double generalized linear models: approximate residual maximum likelihood and diagnostics, pp. 1–15 in Research Report. Department of Statistics, University of Adelaide, Adelaide, Australia. [Google Scholar]
  71. Wagner, G., 1995 Adaptation and the modular design of organisms, pp. 315–328 in Advances in Artificial Life, edited by F. Morán, A. Moreno, J. Merelo, and P. Chacón. Springer-Verlag, Berlin. [Google Scholar]
  72. Wagner G. P., Altenberg L., 1996. Complex adaptations and the evolution of evolvability. Evolution 50: 967–976.
  73. Wang X., Elston R. C., Zhu X., 2010. The meaning of interaction. Hum. Hered. 70: 269–277.
  74. Wang Z., Zhang J., 2011. Impact of gene expression noise on organismal fitness and the efficacy of natural selection. Proc. Natl. Acad. Sci. USA 108: E67–E76.
  75. Williams R. B., Chan E. K., Cowley M. J., Little P. F., 2007. The influence of genetic variation on gene expression. Genome Res. 17: 1707–1716.
  76. Wolc A., White I. M., Avendano S., Hill W. G., 2009. Genetic variability in residual variation of body weight and conformation scores in broiler chickens. Poult. Sci. 88: 1156–1161.
  77. Yang J., Loos R. J., Powell J. E., Medland S. E., Speliotes E. K., et al., 2012. FTO genotype is associated with phenotypic variability of body mass index. Nature 490: 267–272.
  78. Yang Y., Christensen O. F., Sorensen D., 2011. Use of genomic models to study genetic control of environmental variance. Genet. Res. 11: 1–14.
  79. Zhang Z., Qian W., Zhang J., 2009. Positive selection for elevated gene expression noise in yeast. Mol. Syst. Biol. 5: 299.
  80. Zuk O., Hechter E., Sunyaev S. R., Lander E. S., 2012. The mystery of missing heritability: Genetic interactions create phantom heritability. Proc. Natl. Acad. Sci. USA 109: 1193–1198.

Articles from Genetics are provided here courtesy of Oxford University Press