Abstract
The genetics of phenotypic responses to changing environments remains elusive. Using whole-genome quantitative gene expression as a model, here we study how the genetic architecture of regulatory variation in gene expression changed in a population of fully sequenced inbred Drosophila melanogaster strains when flies developed in different environments (25 °C and 18 °C). We find a substantial fraction of the transcriptome exhibited genotype by environment interaction, implicating environmentally plastic genetic architecture of gene expression. Genetic variance in expression increases at 18 °C relative to 25 °C for most genes that have a change in genetic variance. Although the majority of expression quantitative trait loci (eQTLs) for the gene expression traits in the two environments are shared and have similar effects, analysis of the environment-specific eQTLs reveals enrichment of binding sites for two transcription factors. Finally, although genotype by environment interaction in gene expression could potentially disrupt genetic networks, the co-expression networks are highly conserved across environments. Genes with higher network connectivity are under stronger stabilizing selection, suggesting that stabilizing selection on expression plays an important role in promoting network robustness.
Subject terms: Evolutionary genetics, Gene expression, Genomics, Quantitative trait
Huang et al. show that developing under different temperatures changes the genetic architecture of regulatory variation in Drosophila melanogaster gene expression yet the co-expression network remains robust. Data suggest that stabilizing selection on gene expression may promote co-expression network robustness.
Introduction
Organisms living in fluctuating environments or entering novel environments must possess mechanisms to cope with environmental changes. One such mechanism is to change expressed phenotypes in response to different environments, a phenomenon called phenotypic plasticity1. Alternatively, organisms may develop homeostatic mechanisms to cushion the effect of environmental fluctuations without changing their phenotypes. The maintenance of homeostasis is important for organismal fitness as it protects organisms from detrimental effects. The definition of environment can be broad, ranging from cell types, tissues, and physiological states, to diseases, external stimuli, and climate; all of which are known to cause plastic changes or lack thereof in certain phenotypes.
In addition to environmental factors, phenotypes can also respond to genetic perturbations in a plastic or homeostatic manner, which characterizes the potential of an organism to express phenotypes when genes mutate. In a population of genetically diverse individuals, the extent of genetic variation of a phenotype measures the overall sensitivity of individuals to mutations segregating in the population.
Importantly, the state of plasticity or homeostasis, with respect to either genetic or environmental variation, is not necessarily static and can be modified by both genetic and environmental factors. A classic example is the heat shock protein system, particularly Hsp90, whose expression is environmentally plastic and increases under thermal stress, but buffers phenotypic changes induced by mutations to maintain homeostasis2, a process termed canalization3. The opposite of canalization, decanalization, describes the change from a homeostatic state to a plastic one, which allows phenotypic expression of genetic and/or environmental variation4,5. The dynamics of genetic variation (variance across different genotypes) and environmental variation (variance across different environments) may be controlled by different mechanisms. For example, although the histone variant H2A.Z is a capacitor for environmental variation6, its presence in the yeast genome does not increase robustness to mutations7.
Change in genetic variation across environments is one of the many forms of genotype by environment interaction (G×E). G×E can be interpreted equivalently either as variable genetic architecture across environments or as variable environmental plasticity across genotypes, depending on what factor is chosen as the context. G×E has important implications in quantitative trait variation and evolution. It is important for maintenance of genetic variation8. It is pervasive in plants and animals and influences domestication9 and genetic improvement10,11. G×E is also of paramount importance to realize personalized medicine such as individualized drug therapy12.
Gene expression is a unique class of quantitative traits that are under genetic control and that exhibit both plasticity and homeostasis13. Because of the sheer number of gene expression traits and their biological function annotations, gene expression can serve as an important model for quantitative traits.
In this study, we sought to understand G×E for gene expression by exposing the sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel (DGRP) to a low-temperature treatment. Temperature is one of the most wide-ranging environmental factors an organism can experience and must manage, and previous studies have shown that there is genetic variation in plasticity of fitness in response to low temperature in Drosophila melanogaster14. We measured whole-genome gene expression and combined it with a previous data set that quantified gene expression of the same lines reared at standard ambient temperature15. This experimental design, whereby the same standing genetic variation is subjected to different thermal environments, enabled us to address three fundamental questions. First, how does genetic variance of gene expression change when the environment changes? Second, is environmental plasticity of gene expression heritable; or equivalently, does heritable regulatory variation change in response to environmental change? If so, what are the locations and effects of the quantitative trait loci (QTLs) that exhibit such variability and response to environments? And finally, how does the response of regulatory genetic variation in gene expression change the architecture of co-expression networks?
Results
Identification of transcriptional units by RNA sequencing and estimation of gene expression by tiling microarrays
We used a two-stage procedure (Fig. 1a) to measure whole-genome gene expression profiles for adult flies independently raised at 18 and 25 °C for 185 DGRP lines16. In the first stage, we defined regions in the genome that expressed detectable RNA levels by sequencing pooled polyadenylated RNA from all DGRP lines for each of the two sexes, separately for each of the two temperatures (Supplementary Data 1 and Supplementary Fig. 1). After alignment of RNA sequence reads to the reference transcriptome and genome, followed by transcript model reconstruction, we merged known and newly discovered transcript models from all conditions to obtain 21,873 gene models with nonoverlapping constitutive exons (Fig. 1a).
In the second stage, we used genome tiling arrays to estimate gene expression for the DGRP lines with two replicates per line, sex, and temperature (Fig. 1b). After removing probes that overlapped with common non-reference alleles, we were able to estimate expression for 20691 genes, among which about 24% (n = 4943) were unannotated novel transcribed regions (NTRs). Our subsequent global analyses did not differentiate between annotated genes and NTRs, except when performing gene set enrichment analyses, for which annotations were needed. We filtered out six samples (none involved both replicates of the same condition) that were outliers based on scaled expression within each sex and temperature (Supplementary Fig. 2) and removed undesired batch effects using surrogate variable analysis17 on normal quantile-transformed expression within each sex. The adjustment appeared to be effective because known batches such as array scan date were effectively captured by the derived surrogate variables (Supplementary Fig. 3). Subsequent analyses began with these adjusted expression values.
Canalization and decanalization of genetic variance in gene expression
Because we collected the data at 18 and 25 °C separately, the effects of temperature and batch were confounded. Therefore, we cannot make inferences on the effect of temperature on the mean change of gene expression, which has been extensively investigated in other studies18,19. To characterize the patterns of genetic variance in gene expression in the two thermal environments, we first partitioned the variance in gene expression into the between-line genetic component ($\sigma^2_G$) and the remaining within-line microenvironmental component ($\sigma^2_E$). In females, there were 4912 genes with significant genetic variance at 18 °C, and 3002 at 25 °C at a false discovery rate (FDR) = 0.05 (Supplementary Data 2), among which 2505 were shared between both environments. In males, there were 5315 and 4278 genes with significant genetic variance at 18 and 25 °C, respectively, including 3339 in common between the two environments (Supplementary Data 2).
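The partitioning into between-line and within-line variance can be illustrated with a minimal sketch. It assumes a balanced design and uses the classical one-way ANOVA (method-of-moments) estimator rather than the mixed models fitted by the authors, and it ignores the Wolbachia fixed effect. The function name `variance_components` and the toy data are hypothetical.

```python
from statistics import mean

def variance_components(expr_by_line):
    """Method-of-moments estimates for a balanced one-way random-effects design.

    expr_by_line maps a line identifier to its replicate expression values.
    Returns (sigma2_g, sigma2_e): between-line (genetic) and within-line
    (microenvironmental) variance. This is a simplified stand-in for a
    mixed-model fit, not the authors' procedure.
    """
    groups = list(expr_by_line.values())
    n_rep = len(groups[0])
    if n_rep < 2 or any(len(g) != n_rep for g in groups):
        raise ValueError("balanced design with at least 2 replicates required")
    n_lines = len(groups)
    grand = mean(v for g in groups for v in g)
    line_means = [mean(g) for g in groups]
    # Mean squares between lines and within lines.
    ms_between = n_rep * sum((m - grand) ** 2 for m in line_means) / (n_lines - 1)
    ms_within = sum(
        (v - m) ** 2 for g, m in zip(groups, line_means) for v in g
    ) / (n_lines * (n_rep - 1))
    sigma2_e = ms_within
    # E[MS_between] = sigma2_e + n_rep * sigma2_g; negative estimates are set to 0.
    sigma2_g = max((ms_between - ms_within) / n_rep, 0.0)
    return sigma2_g, sigma2_e

# Toy data: three inbred lines, two replicates each (hypothetical values).
toy = {"line_A": [1.0, 1.2], "line_B": [2.0, 2.2], "line_C": [3.0, 2.8]}
sigma2_g, sigma2_e = variance_components(toy)
print(round(sigma2_g, 4), round(sigma2_e, 4))
```

Running the estimator separately on the 18 °C and 25 °C data for one gene would give the two genetic variances whose ratio underlies the canalization and decanalization comparison.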
The marked difference in the numbers of genes with significant genetic variance at 18 and 25 °C in both sexes was intriguing. This could be due to either an overall decrease in environmental variance or an increase in genetic variance at 18 °C. We therefore tested for variance heterogeneity for the genetic and environmental components. In both sexes, the microenvironmental variance ($\sigma^2_E$) was relatively stable across the two temperatures for the majority of genes (Fig. 2a–d and Supplementary Data 2). In contrast, the difference in the genetic variance ($\sigma^2_G$) was far more pronounced (Fig. 2a–d and Supplementary Data 2). This pattern of variance heterogeneity was almost identical when we did not adjust for infection status with the bacterial symbiont Wolbachia, which is present in approximately half of the lines (Supplementary Fig. 4). This is consistent with the previous observation that Wolbachia infection does not substantially impact gene expression15,20.
We define genetic decanalization as increased genetic variance in gene expression, and genetic canalization as decreased genetic variance in an environment relative to an arbitrary baseline (Fig. 2e). Our tests for variance heterogeneity at the two temperatures revealed both genetic decanalization and canalization for gene expression relative to 25 °C when flies developed at 18 °C, depending on specific genes considered (Fig. 2). Interestingly, genetic decanalization at 18 °C relative to 25 °C was more prevalent than canalization. There were 149 genes in females that had significantly different genetic variance at 18 than 25 °C by at least twofold. Among these genes 141 were genetically decanalized and only 8 were canalized (Supplementary Data 2). The same was true in males, where 264 genes were decanalized and 15 were canalized at 18 °C. We found little evidence for preferential genetic canalization or decanalization of the expression of genes involved in particular functions. Using gene set enrichment analysis (GSEA), only three broad gene ontology (GO) terms were found to be significantly (FDR = 0.05) enriched for genes whose genetic variance was decanalized or canalized at 18 °C, including the structural constituent of chitin-based cuticle that was enriched for decanalized genes in females and odorant binding and DNA-binding transcription factor activity enriched for canalized genes in males (Supplementary Data 3 and Supplementary Fig. 5).
Response of regulatory genetic variation in gene expression to environmental change
The change in genetic variance upon exposure to environmental change is a special form of G×E. To understand the genetic basis of G×E in gene expression, we asked whether there was variation in gene expression that could be attributed to G×E. There are several equivalent ways to describe this phenomenon, each with a different perspective. The first is a largely statistical description, which can be graphically illustrated by reaction norms. In the reaction norm representation, the presence of G×E causes otherwise parallel lines depicting environmental plasticity to cross (Fig. 3a). G×E may or may not be accompanied by an environmental effect when averaged across individuals. Importantly, with the same phenotypic scale across environments, a change in genetic variance will cause a statistical presentation of G×E (Fig. 3b) though the reverse is not necessarily true. Alternatively, G×E is equivalent to environmentally responsive differences between genotypes (Fig. 3c). If environmental plasticity is genetically variable, its variation between DGRP lines characterizes the degree to which the plasticity is heritable (Fig. 3a). Finally, if an environment is able to modify the allelic effects of QTLs controlling the phenotype21–23, significant G×E indicates that the genetic architecture of the phenotype is environmentally responsive.
To identify genes that showed significant G×E, we pooled data from both environments and partitioned the variance in gene expression into components due to a common genetic effect shared by both temperatures ($\sigma^2_G$) and due to G×E ($\sigma^2_{GE}$). Remarkably, among the 5248 genes in females and 6327 genes in males that had at least some significant genetic component ($\sigma^2_G$ or $\sigma^2_{GE}$), 424 (8%) and 619 (10%), respectively, had significant G×E at an FDR = 0.05 (Supplementary Data 2 and Fig. 3d, e). The genetic architecture of regulatory variation of these genes was thus variable or environmentally responsive between the two thermal environments. Equivalently, the environmental plasticity of these genes in response to low temperature was therefore heritable. Of these genes, 66 and 110 were also significant for variance heterogeneity between the two environments in females and males, respectively. GSEA revealed little evidence of G×E or the lack of it concentrating in particular biological functions, as only a very limited number of broad GO terms were enriched for higher or lower degrees of G×E. GO terms enriched (FDR = 0.05) for G×E include tricarboxylic acid cycle and translation initiation, which are enriched for greater G×E in males; and protein serine/threonine kinase activity, which is enriched for less G×E in males (Supplementary Data 4).
Mapping response of regulatory genetic variation to environmental change
We have previously shown that we can map expression quantitative trait loci (eQTLs) for a substantial fraction of genes that are genetically variable in the same population15. To understand the environmental response of the genetic architecture of gene expression, we mapped eQTLs at both 18 and 25 °C among 1891697 common (minor allele frequency >0.05) variants and compared the locations and effects of the eQTLs. In females, there were 793 and 511 genes with at least one mapped eQTL (FDR = 0.05), constituting ~16% and 17% of the genetically variable genes, at 18 and 25 °C, respectively. In males, we mapped eQTLs (FDR = 0.05) for 1086 (20%) and 808 (19%) genes at 18 and 25 °C, respectively. To refine the eQTL-gene association models and account for linkage disequilibrium between DNA variants, we performed forward model selection. This procedure sequentially added variants meeting the FDR thresholds in the order of their association with gene expression, conditional on existing variants in the model, until no variant could be added at $P < 1 \times 10^{-5}$. It resulted in between 1 and 5 eQTLs for each gene, with the majority (74%) of genes containing only one eQTL (Supplementary Data 5).
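The logic of forward selection conditional on variants already in the model can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the stopping rule here is a hypothetical threshold on the partial R² (`min_partial_r2`) in place of the P-value threshold described above, the cap `max_variants` is also hypothetical, and all names and toy data are invented for the example.

```python
def _center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _residualize(v, basis):
    """Remove from v its projection onto each (mutually orthogonal) basis vector."""
    out = list(v)
    for b in basis:
        coef = _dot(out, b) / _dot(b, b)
        out = [x - coef * y for x, y in zip(out, b)]
    return out

def forward_select(y, genotypes, min_partial_r2, max_variants=5):
    """Greedily add the variant with the largest partial R^2 given the current model.

    y: expression per line; genotypes: dict of variant id -> genotype codes.
    min_partial_r2 and max_variants are hypothetical stand-ins for the
    significance-based stopping rule in the text.
    """
    basis, selected = [], []
    resid_y = _center(y)
    while len(selected) < max_variants:
        yy = _dot(resid_y, resid_y)
        if yy < 1e-12:  # nothing left to explain
            break
        best_id, best_r2, best_vec = None, 0.0, None
        for vid, g in genotypes.items():
            if vid in selected:
                continue
            rg = _residualize(_center(g), basis)  # condition on selected variants
            gg = _dot(rg, rg)
            if gg < 1e-12:  # collinear with the current model
                continue
            r2 = _dot(resid_y, rg) ** 2 / (gg * yy)
            if r2 > best_r2:
                best_id, best_r2, best_vec = vid, r2, rg
        if best_id is None or best_r2 < min_partial_r2:
            break
        selected.append(best_id)
        basis.append(best_vec)
        resid_y = _residualize(resid_y, [best_vec])
    return selected

# Toy data: v2 is in linkage disequilibrium with v1; v3 has an independent effect.
genotypes = {
    "v1": [0, 0, 0, 0, 2, 2, 2, 2],
    "v2": [0, 0, 0, 2, 2, 2, 2, 2],
    "v3": [0, 2, 0, 2, 0, 2, 0, 2],
}
y = [0.1, 0.9, -0.1, 1.1, 1.9, 3.1, 2.1, 2.9]
selected = forward_select(y, genotypes, min_partial_r2=0.5)
print(selected)
```

In the toy data, v2 is associated with expression marginally but adds little once v1 is in the model, which is the situation that conditional selection is meant to resolve.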
Using the mapped eQTLs and their estimated effects and locations, we made four comparisons in order to understand the response of the regulatory genetic variation to environmental change in the Drosophila transcriptome. First, we asked if there were shared or environment-specific eQTLs, regardless of the signs and magnitudes of effects. In females, there were 2407 and 497 genes that had environment-specific genetic variance at 18 and 25 °C, respectively, among which 222 (9%) and 41 (8%) contained mapped eQTLs (Fig. 4a). In males, 1976 and 939 genes were genetically variable only at 18 and 25 °C, respectively; 250 (13%) and 116 (12%) of which contained mapped eQTLs (Fig. 4b). These environment-specific eQTLs represent regulatory genetic variation that responds (“inactive” versus “active”) to the specific environmental change. In addition, of the 2505 genetically variable genes at both 18 and 25 °C in females, 571 and 470 genes had mapped eQTLs, respectively (Fig. 4a). Among the 329 genes with at least one eQTL in both environments (sharing of the dark purple boxes in Fig. 4a), 303 (92%) had at least one shared eQTL. A similar result was obtained for males, where 459 (93%) out of 494 genes with eQTLs in both environments shared common eQTLs (Fig. 4b). These results indicate that when genes contained eQTLs in both environments, the majority of them shared eQTLs.
Second, we compared the estimated single-variant effects of eQTLs at 18 and 25 °C among the genes that were genetically variable at both temperatures. Of the 1181 eQTL-gene pairs in females, 294 (25%) were specific to 25 °C, 436 (37%) were specific to 18 °C, and 451 (38%) were common to both (Fig. 4c). Of the 1740 eQTL-gene pairs in males, 412 (24%) were specific to 25 °C, 630 (36%) were specific to 18 °C, and 698 (40%) were common to both (Fig. 4d). However, these classifications depended on significance thresholds. To enable a more quantitative comparison, we estimated single-variant genetic effects regardless of their statistical significance as long as it was significant in at least one environment. As expected, for eQTLs that were shared by both environments, their effects were large and highly similar (Fig. 4c, d). On the other hand, eQTLs that were specific to either environment had much larger effects in the environment where they were detected (Fig. 4c, d). Those eQTLs whose effects were different between environments also contributed to the environmental response of the regulatory variation. Interestingly, almost all eQTLs had effects of the same sign in both environments, suggesting that G×E or the environmental response of regulatory genetic variation for the majority of genes was likely a result of change in the magnitudes of effects rather than signs.
Third, we asked if using the estimated eQTL effects in one environment could predict gene expression in the other. Conserved regulatory genetic variation would lead to better prediction accuracy. We predicted gene expression at 18 °C for each gene using mapped eQTLs and their effects at 25 °C. As expected, the prediction accuracy correlated well with G×E. Genes with higher G×E, which also tended to not share eQTLs in the two environments, were poorly predicted across environments (Fig. 4e, f).
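A minimal sketch of cross-environment prediction for a single gene with a single eQTL is given below. It assumes that the effect is estimated by simple linear regression at 25 °C and that accuracy is summarized as the Pearson correlation between predicted and observed expression at 18 °C; both choices are assumptions for illustration, as are the helper names and toy values.

```python
from statistics import mean

def fit_simple(x, y):
    """Least-squares intercept and slope for y ~ x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

# Hypothetical genotype codes for six inbred lines and expression of one gene.
genotype = [0, 0, 0, 2, 2, 2]
expr_25 = [1.0, 1.2, 0.8, 3.0, 3.1, 2.9]
expr_18_conserved = [1.1, 0.9, 1.0, 2.8, 3.2, 3.0]  # same effect at 18 °C
expr_18_gxe = [2.0, 1.8, 2.2, 2.1, 1.9, 2.0]        # effect absent at 18 °C

# Estimate the eQTL effect at 25 °C, then predict expression at 18 °C.
intercept, slope = fit_simple(genotype, expr_25)
predicted_18 = [intercept + slope * g for g in genotype]

r_conserved = pearson(predicted_18, expr_18_conserved)
r_gxe = pearson(predicted_18, expr_18_gxe)
print(round(r_conserved, 3), round(r_gxe, 3))
```

The gene whose eQTL effect is conserved is predicted well, whereas the gene whose effect disappears at 18 °C is not, mirroring the relationship between G×E and prediction accuracy.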
Finally, to probe the nature of the regulatory genetic effects on plasticity, we compared the degree of overlap between the eQTLs of different classes (shared or environment-specific) with known transcription factor binding sites in the Drosophila genome. In both females and males, eQTLs tended to overlap with transcription factor binding sites in general (Fig. 4g, 25 °C only vs. random), consistent with their localization to the proximity of transcription start and end sites15. Two developmentally important transcription factors, chinmo and eve, were significantly (FDR = 0.05) depleted among eQTLs specific to 25 °C in females (Fig. 4g). This led to the over-representation of eQTLs common to both environments (FDR = 0.10) or specific to 18 °C (FDR = 0.05 for chinmo and 0.10 for eve) overlapping the binding sites of these two transcription factors (Fig. 4g). Therefore, chinmo and eve may be involved in the environment-specific regulation of gene expression in females.
Robust co-expression networks in the presence of environmentally responsive regulatory genetic variation
It has been well documented that genes form co-expression networks in which genes executing similar functions have correlated expression levels, either across different tissues of the same individuals, or across individuals in a population24. Importantly, perturbations to co-expression networks can lead to expressed phenotypes such as diseases25,26. In a population sample, if there was no genetic variation in the plasticity of gene expression, the co-expression networks constructed based on correlations between genes would remain the same even if plasticity was widespread but constant across individuals. In contrast, G×E or plasticity of regulatory variation may lead to a change in the network structure.
To examine the robustness of co-expression networks in the presence of heritable transcriptome plasticity, we identified co-expression modules using weighted correlation network analysis (WGCNA) in both temperatures and compared the module memberships (Fig. 5a, b). We observed strong preservation of network structures. For example, in females, the largest module at 25 °C included 386 genes enriched for basic biological processes such as neurogenesis, mitotic spindle organization, and translation, among others (Supplementary Data 6). Of these genes, 90% (n = 346) were also in the same module at 18 °C (Fig. 5a). Using a permutation-based approach27, several measures of network preservation were found to be highly significant (Supplementary Data 7).
However, given that only ~10% of genes showed G×E or plasticity of regulatory variation, a high degree of preservation was expected, especially for genes that are required to maintain basic organismal functions. To test the trivial explanation that the network structure preservation is expected given the observed extent of G×E in the transcriptome, we employed a simulation-based approach to derive the expected distribution of changes in correlation under the null hypothesis that the plastic changes in the regulatory variation of genes were independent from each other (Fig. 5c, d). We found evidence that the changes in the correlation among genes were smaller than if all genes were responding to the environment change independently. Between the genes with significant G×E, the change in their correlation with each other was significantly smaller than expected (P < 0.001 based on 1000 simulations), more so in males than in females (Fig. 5c, d). This result indicates that mechanisms exist to ensure that plasticity of regulatory variation in gene expression remains coordinated between genes in the face of environmental perturbations, thus preserving the co-expression networks.
If the environmental response of regulatory genetic variation is not independent but coordinated, genes that are highly correlated with a large number of genes (“hub” genes) may be of particular importance in preserving the co-expression networks. We postulated that expression of these genes may be under stronger constraints. We tested this hypothesis by relating the connectivity of genes with the strength of stabilizing selection estimated in a previous study28. We defined the connectivity of a gene as the mean of the absolute correlation of the gene with all other genes, and the strength of stabilizing selection as the ratio of mutational variance (Vm) over standing genetic variation (Vg). Remarkably, there was a highly significant positive correlation (Spearman r = 0.29 and $P = 1.79 \times 10^{-39}$ in females and r = 0.27 and $P = 1.23 \times 10^{-44}$ in males) between a gene’s network connectivity and the strength of stabilizing selection in both sexes (Fig. 6a, b), indicating that the robustness of co-expression networks is under stabilizing selection. The correlation was slightly stronger for genes without G×E (Fig. 6a, b). Though conceivable and suspected in many cases, few studies have experimentally observed this result.
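The connectivity measure defined above (mean absolute correlation of a gene with all other genes) and its rank correlation with Vm/Vg can be sketched as follows. Pearson correlation between genes is assumed for the connectivity step, and the expression values and Vm/Vg ratios are hypothetical toy numbers.

```python
from statistics import mean

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

def ranks(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def spearman(a, b):
    return pearson(ranks(a), ranks(b))

def connectivity(expr):
    """Mean absolute correlation of each gene with all other genes."""
    genes = list(expr)
    return {
        g: mean(abs(pearson(expr[g], expr[h])) for h in genes if h != g)
        for g in genes
    }

# Toy expression matrix: gene -> expression across four lines.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, -1.0, -1.0, 1.0],
}
conn = connectivity(expr)

# Hypothetical Vm/Vg ratios, used only to show the rank-correlation step.
vm_over_vg = {"g1": 0.30, "g2": 0.20, "g3": 0.25, "g4": 0.05}
genes = list(expr)
rho = spearman([conn[g] for g in genes], [vm_over_vg[g] for g in genes])
```

Here g1 to g3 are perfectly correlated with one another in absolute value and uncorrelated with g4, so g4 has zero connectivity and the three hub-like genes share a connectivity of 2/3.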
Discussion
Using a simple but powerful design, we provided a comprehensive characterization of the response of the regulatory genetic variation of the Drosophila transcriptome to environmental change. Specifically, an inbred line reference population enabled us to use the same genotypes for the alternative treatments and obtain biological replicates. This was instrumental in partitioning the variance in gene expression and providing a global characterization of the extent of genetic canalization or decanalization and G×E. Although we found many more genes genetically decanalized at 18 °C relative to the baseline at 25 °C, this depended on the choice of the baseline environment as well as the environmental history of this population that had shaped its genetic variation. Nevertheless, the genetic variance of hundreds of genes changed between the two thermal environments, clearly indicating that even mild environmental fluctuations can be a potent agent in exposing or masking genetic variation in quantitative trait phenotypes. In addition, ~8–10% of genetically variable genes exhibited G×E in just two environments. These genes may be subject to differential selection when they experience heterogeneous environments, contributing to the maintenance of genetic variation8. Because only two environments were considered in this study, the extent of G×E may be much more widespread than can be detected and all inferences drawn from this study should be interpreted in the specific context of the two temperatures.
We compared the list of genes exhibiting G×E in the present study with genes that showed evidence of adaptive divergence or G×E between tropical and temperate geographical locations. Among the 619 genes significant for G×E in males in this study, 166 (Supplementary Data 8) were previously reported to be differentially expressed between flies of tropical and temperate origins29,30 or had G×E interaction18. However, more formal integration across studies with proper experimental design and statistical inference is needed to understand the role of canalization and decanalization and G×E in general in adaptive evolution, which remains controversial31.
We mapped eQTLs at 18 and 25 °C and compared their locations and effects. While there were many environment-specific or plasticity eQTLs, when eQTLs could be mapped in both environments, the majority of eQTLs were in fact shared between the two environments with similar effects (Fig. 4a–d). Cross-environment prediction of gene expression using mapped eQTLs was poor in the presence of G×E, a caveat that must be considered when predicting gene expression based on mapped eQTLs in reference populations32. We found two transcription factors (chinmo and eve) whose binding sites were enriched among eQTLs detected at 18 °C relative to eQTLs specific to 25 °C, suggesting that they may be involved in regulating the plasticity of gene expression when flies were exposed to 18 °C. The specific mechanisms by which these cis regulatory elements change how flies respond to different temperatures require further investigation. Differential binding by the developmentally dynamic chinmo and eve transcription factors may modify the development of sensory neurons such that flies sense environmental temperature differently33.
We observed strong preservation of the co-expression network structure between the two environments. A trivial explanation for this observation could be that since the majority of genes did not show significant G×E, the correlation among these genes was preserved. We developed a simulation-based test to specifically test whether the robustness of co-expression networks was expected under the null hypothesis that the plastic responses to low temperature were independent for each of the significant genes. This hypothesis was rejected, indicating that the plastic responses by the genes were somewhat coordinated. This would be consistent with a model where the responses of many genes were secondary to a smaller number of primary first responders. Our identification of two transcription factors in females whose binding sites were enriched in eQTLs at 18 °C (common between 25 and 18 °C or specific to 18 °C) was also consistent with this model. A robust co-expression network involving many genes would also promote polygenicity of organismal traits or fitness if we consider the expression of genes as mediators of mutational effects on organismal traits. This would be consistent with the omnigenic model of complex quantitative traits34. Indeed, adaptation to novel thermal environments has been found to be highly polygenic35. However, further experiments are needed to define a working model.
Importantly, we found that genes with higher network connectivity were also under stronger stabilizing selection (Fig. 6). Furthermore, larger and more conserved modules were enriched for genes involved in basic biological processes (Fig. 5a, b and Supplementary Data 6). Previous theoretical work has suggested that stabilizing selection is important in promoting the evolution of network robustness36. Our result is one of the few37 that provide empirical support for the role of stabilizing selection in evolving network robustness.
Methods
Sample preparation
For the low-temperature treatment, all DGRP strains were reared on cornmeal-molasses-agar medium at 18 °C with 60–75% relative humidity and a 12-h light/dark cycle. For each DGRP line, we collected 25 mated female or 40 mated male 3–5-day-old flies to constitute one biological replicate for each sex. Collection was performed between 1 and 3 p.m. consistently to account for circadian rhythm in gene expression and the flies were immediately frozen in liquid nitrogen before they were sorted. We collected two biological replicates per sex for each of 185 DGRP lines.
RNA sequencing and analysis
We sequenced pooled polyadenylated RNA of flies reared at 18 °C exactly as previously described15. RNA-Seq data for flies reared at 25 °C were downloaded from GEO (GSE67505) and data from the present study were deposited to SRA (PRJNA615927). Although all analyses were identical, we re-analyzed the 25 °C data together with the new 18 °C data for consistency and improvement in sensitivity by merging transcript reconstructions across environments. Briefly, RNA-Seq reads were aligned to the reference transcriptome (FlyBase release 5.57) and the genome (BDGP5) using tophat238 (v2.0.13) for each sex and temperature separately. A summary of alignments including statistics of input, successful alignments, and reads overlapping various genomic features is provided in Table S1. Alignments were assembled using cufflinks39 (v2.2.1) into transcript models, which were merged across multiple conditions. To enable subsequent expression profiling using microarrays, we removed segments of exons that were either specific to only a subset of splice isoforms or overlapped from either strand by multiple genes to arrive at a set of nonoverlapping constitutive exons for each gene (Fig. 1).
Microarray data acquisition and processing
We acquired and processed microarray data as previously described15. The 25 °C data were downloaded from ArrayExpress (E-MTAB-3216) and analyzed together with the new 18 °C data (E-MTAB-8953). We removed probes that mapped to multiple genomic locations, overlapped with variants whose non-reference allele frequency in the 185 DGRP lines exceeded 0.05, or did not entirely fall within the constitutive exons as defined above. Probe hybridization intensity was corrected for background hybridization40 and quantile normalized41 within each sex but across temperature, which was motivated by strong effects of sex on gene expression15 but relatively minor temperature effects as observed in the RNA-Seq data (Supplementary Fig. 1), and an attempt to partially account for batch effects. Probe expression was summarized into gene expression by median polish, which is robust to rare outliers. Finally, within each sex, the expression for each gene was normal quantile-transformed with mean equal to the median and standard deviation equal to the median absolute deviation multiplied by a factor of 1.4824. It is important to note that because the 18 and 25 °C data were collected at different times, batch is completely confounded with temperature, and therefore we could not make any inference about the overall effect of temperature on gene expression. Nonetheless, because the arrays were randomized within each temperature, we could still make inferences on G×E after the temperature effect and other latent batch effects were accounted for. We removed unwanted heterogeneity in the gene expression matrix by regressing out top surrogate variables while explicitly retaining temperature and Wolbachia effects17. The numbers of surrogate variables were determined using the “num.sv” function in the “sva” package (v3.34) in R. The adjusted gene expression matrices were used for subsequent analyses for females and males separately.
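The per-gene normal quantile transformation described above can be sketched as follows. The text specifies the target mean (the median) and standard deviation (1.4824 times the median absolute deviation); the mapping from ranks to probabilities, here (rank − 0.5)/n, is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist, median

def normal_quantile_transform(values, mad_factor=1.4824):
    """Map values to quantiles of a normal distribution centered at the median
    with standard deviation mad_factor * MAD, preserving rank order.

    The (rank - 0.5) / n plotting position is an assumed convention.
    """
    n = len(values)
    center = median(values)
    scale = mad_factor * median(abs(v - center) for v in values)
    if scale <= 0:
        raise ValueError("median absolute deviation is zero")
    target = NormalDist(mu=center, sigma=scale)

    # Average ranks (1-based) so that tied values map to the same quantile.
    order = sorted(range(n), key=lambda i: values[i])
    rank = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            rank[order[k]] = (i + j) / 2 + 1
        i = j + 1

    return [target.inv_cdf((r - 0.5) / n) for r in rank]

# Toy expression values for one gene, including one extreme outlier.
values = [5.0, 1.0, 3.0, 2.0, 100.0]
transformed = normal_quantile_transform(values)
print([round(t, 2) for t in transformed])
```

Because only ranks are used, the outlying value is pulled toward the bulk of the distribution while the ordering of samples is unchanged.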
Quantitative genetic analysis of gene expression
Within each temperature and sex, we partitioned the observed phenotypic variance into genetic ($\sigma_G^2$) and microenvironmental ($\sigma_e^2$) variances using a mixed model implemented in the “lme4” R package (v1.1–10), with Wolbachia infection status as a fixed effect. We then combined data from the two temperatures and partitioned the observed phenotypic variance into genetic and environmental components using the different models described below, each of which allowed us to make specific inferences. First, we tested variance homogeneity by likelihood ratio tests comparing models with either a single variance parameter or separate variance parameters for the two environments. The models were fitted with the “nlme” R package (v3.1–124) and included temperature, Wolbachia, and their interaction as fixed effects. Second, we partitioned the observed phenotypic variance across environments into genetic ($\sigma_G^2$), G×E interaction ($\sigma_{GE}^2$), and microenvironmental ($\sigma_e^2$) variances, using the same fixed effects as above. We defined a G×E index as $\sigma_{GE}^2 / (\sigma_G^2 + \sigma_{GE}^2)$, which measured the proportion of total genetic variance due to G×E interaction. The FDR was controlled using the Benjamini–Hochberg procedure when calling statistical significance. We compared genes significant for G×E with genes that showed evidence of adaptive divergence or G×E between tropical and temperate geographical locations. These included genes that were differentially expressed between African and European populations29; genes that were differentially expressed between Panama City and Maine populations at 21 and 29 °C30; and genes that showed G×E in temperate and tropical Australian populations raised at 18 and 30 °C18.
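As a rough illustration of two quantities used in this section, the sketch below computes a G×E index as the share of total genetic variance attributable to G×E interaction, and applies the Benjamini–Hochberg adjustment to a list of P values. The function names and input values are hypothetical, and the variance components are assumed to have been estimated beforehand by a mixed model; this is not the authors' implementation.

```python
def gxe_index(var_g, var_ge):
    """Proportion of total genetic variance (var_g + var_ge) due to G x E."""
    total = var_g + var_ge
    if total <= 0:
        return float("nan")  # index undefined without genetic variance
    return var_ge / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical variance components for one gene.
index = gxe_index(var_g=3.0, var_ge=1.0)  # 0.25
# Hypothetical per-gene P values from tests of the G x E variance.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```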
Gene set enrichment analysis (GSEA)
We performed two different quantitative GSEA analyses. The first was a measure of canalization and decanalization and was computed as . We considered only genes with a significant genetic variance at 18 °C, at 25 °C, or at both. The second was a measure of G×E defined as 2G×E − 1, where G×E is the G×E index, for genes with significant genetic variance ($\sigma_G^2$) and/or G×E variance ($\sigma_{GE}^2$). These scores were designed to range between −1 and 1 and replaced the correlation scores in GSEA, the general form of which takes a ranked list of scores as input. Subsequent steps followed exactly those described in the original GSEA publication42. We limited analyses to GO terms with at least 20 genes among the entire set of genes with at least one GO annotation. For the GO enrichment analysis of genes in Module 1 of the female WGCNA (v1.69) modules, significance was tested using a hypergeometric test with the genes in the WGCNA input as the background set. P values were adjusted for multiple testing using the Benjamini–Hochberg procedure.
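The two simplest computations in this section can be sketched as follows: rescaling a G×E index from the interval [0, 1] to a score in [−1, 1], and a one-sided hypergeometric test for over-representation of a GO term in a module. This is a minimal illustration with hypothetical function names and counts, not the authors' code.

```python
from math import comb


def gxe_gsea_score(gxe_index):
    """Rescale a G x E index in [0, 1] to a GSEA input score in [-1, 1]."""
    return 2 * gxe_index - 1


def hypergeom_enrichment_p(n_background, n_annotated, n_module, n_overlap):
    """Upper-tail hypergeometric P value, P(X >= n_overlap).

    n_background: genes in the background set
    n_annotated:  background genes annotated with the GO term
    n_module:     genes in the module being tested
    n_overlap:    module genes annotated with the GO term
    """
    total = comb(n_background, n_module)
    upper = min(n_annotated, n_module)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


score = gxe_gsea_score(0.25)  # -0.5
# Hypothetical counts: 10 background genes, 4 annotated, module of 3, 2 overlap.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)  # 40/120
```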
eQTL mapping
We mapped eQTLs as previously described15. Briefly, we obtained BLUP estimates of gene expression for each line at 18 °C and 25 °C separately, adjusting for Wolbachia infection, inversions, and major genotypic principal components. eQTLs were mapped using a t test implemented in PLINK (v1.07). To estimate the empirical FDR, the phenotypic data were permuted 100 times, retaining the association between genes, and eQTLs were mapped following the same procedure. At each P value threshold, the empirical FDR was estimated as the average number of significant eQTLs across the permutations (the number expected under the null hypothesis) divided by the observed number. Significant eQTLs were further filtered by a forward selection procedure as previously described15. In each iteration of the forward selection, all candidate eQTLs were tested conditional on the eQTLs already in the model, and the one with the smallest P value was added; iterations continued until no eQTL could be added at $P < 10^{-5}$. When matching eQTLs between environments, we required an eQTL to be retained by model selection in at least one environment. For example, an eQTL retained by model selection at 18 °C may not be retained at 25 °C, but overlap is still assessed as long as it was initially mapped at 25 °C before model selection.
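The permutation-based FDR estimate described above can be sketched in a few lines of Python. The function name, the use of "less than or equal to" at the threshold, and the cap at 1 are assumptions for illustration; the P values are made up.

```python
def empirical_fdr(observed_p, permuted_p_sets, threshold):
    """Estimate the FDR at a P value threshold from permutations.

    observed_p:      P values from the real data
    permuted_p_sets: one list of P values per permutation
    threshold:       P value cutoff for calling an eQTL significant
    """
    n_observed = sum(p <= threshold for p in observed_p)
    if n_observed == 0:
        return float("nan")  # no discoveries, FDR undefined
    # Average number of significant tests per permuted data set.
    null_counts = [sum(p <= threshold for p in perm) for perm in permuted_p_sets]
    mean_null = sum(null_counts) / len(permuted_p_sets)
    return min(1.0, mean_null / n_observed)


observed = [1e-8, 1e-6, 1e-3, 0.2]         # hypothetical observed P values
permuted = [[1e-7, 0.5], [0.3, 0.4]]       # two hypothetical permutations
fdr = empirical_fdr(observed, permuted, threshold=1e-5)  # 0.5 / 2 = 0.25
```

Evaluating the function over a grid of thresholds gives the FDR curve from which a significance cutoff can be chosen.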
Co-expression network analysis
We identified modules in gene expression using the WGCNA (v1.69) R package43 and tested network preservation naively using the NetRep (v1.2.1) R package27. To account for the known cross-environment correlation of gene expression and to quantitatively compare the preservation of correlation of expression among genes, we employed a simulation-based approach. To test the null hypothesis that the plastic responses of different genes at 18 °C were independent of each other, we simulated for each gene its expression at 18 °C given the correlation between 18 and 25 °C. This was achieved by first scaling the expression at 25 °C such that $Z_1 = (X - \mu_1)/\sigma_1$, where $X$ was the expression at 25 °C and $\mu_1$ and $\sigma_1^2$ were its mean and variance. Then the expression at 18 °C was simulated by the following formula: $Y = \mu_2 + \sigma_2(\rho Z_1 + \sqrt{1 - \rho^2}\,Z_2)$, where $\mu_2$ and $\sigma_2^2$ were the mean and variance of expression at 18 °C, $\rho$ was the correlation coefficient between expression in the two environments, and $Z_2$ was a random number drawn from the standard normal distribution. Thus $Y$ was a simulated gene expression that preserved the mean and variance at 18 °C as well as the correlation and G×E between the two environments. We then computed the pairwise correlation matrices at both temperatures and their difference. This procedure was repeated 1000 times. A significant deviation from the null hypothesis was signified by decreased variance of the distribution of the correlation coefficient changes (Δr).
We defined the connectivity of a gene as the mean absolute correlation of the gene with all other genes. The strength of stabilizing selection was obtained from a previous study28, in which the mutational variance (Vm) of gene expression was estimated in a set of mutation accumulation lines while the standing genetic variation (Vg) was estimated from a subset of DGRP lines. That study also used whole bodies of flies of the same age but a different array platform, and was thus independent of the present one. The strength of stabilizing selection was defined as the ratio of Vm over Vg.
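The per-gene simulation under the null hypothesis can be sketched with the standard construction for a pair of correlated normal variables. The sketch below is an assumption-laden illustration rather than the authors' code: the function names are hypothetical, `sd2` denotes the standard deviation (not the variance) of expression at 18 °C, and the population standard deviation is used for scaling.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate_null_expression(x25, mu2, sd2, rho, rng):
    """Simulate expression at 18 C for one gene from its 25 C values.

    x25: observed expression across lines at 25 C
    mu2: target mean at 18 C
    sd2: target standard deviation at 18 C
    rho: cross-environment correlation for this gene
    rng: random.Random instance supplying independent normal noise
    """
    mu1, sd1 = mean(x25), pstdev(x25)
    z1 = [(x - mu1) / sd1 for x in x25]  # standardized 25 C expression
    noise_weight = sqrt(1 - rho ** 2)
    return [mu2 + sd2 * (rho * z + noise_weight * rng.gauss(0, 1)) for z in z1]


rng = random.Random(1)
x25 = [rng.gauss(5.0, 2.0) for _ in range(5000)]  # hypothetical 25 C data
y18 = simulate_null_expression(x25, mu2=0.0, sd2=1.0, rho=0.6, rng=rng)
```

Because the noise term is drawn independently for every gene, any correlation among simulated 18 °C profiles arises only through the 25 °C data, which is what the null hypothesis of independent plastic responses requires.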
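Connectivity and the strength of stabilizing selection, as defined above, reduce to short calculations. The following Python sketch uses hypothetical gene names and values and assumes that Vm and Vg estimates are already available for each gene.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def connectivity(expression):
    """Mean absolute correlation of each gene with all other genes.

    expression: dict mapping gene name to a list of per-line values
    """
    genes = list(expression)
    result = {}
    for g in genes:
        others = [abs(pearson(expression[g], expression[h])) for h in genes if h != g]
        result[g] = mean(others)
    return result


def stabilizing_selection_strength(vm, vg):
    """Ratio of mutational variance (Vm) to standing genetic variance (Vg)."""
    return vm / vg


# Hypothetical expression of four genes across four lines.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, -1.0, -1.0, 1.0],
}
conn = connectivity(expr)
```

In this toy example geneA, geneB, and geneC are perfectly correlated or anticorrelated with each other and uncorrelated with geneD, so the first three have connectivity 2/3 and geneD has connectivity 0.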
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
This research was supported by National Institutes of Health grants R01 GM45146 and R01 AG043490 to T.F.C.M. and R.R.H.A., and by Michigan State University and MSU AgBioResearch to W.H.
Author contributions
Conceived and designed the experiment: T.F.C.M. and R.R.H.A.; performed the experiment: R.F.L. and M.A.C.; analyzed data: W.H.; wrote the manuscript: W.H. and T.F.C.M. with input from all authors. All authors read and approved the manuscript.
Data availability
All data have been deposited into public repositories, including the RNA-Seq data (GSE67505 in GEO for 25 °C, PRJNA615927 in SRA for 18 °C), and the tiling microarray data in ArrayExpress (E-MTAB-3216 and E-MTAB-8953). All derivative data used to generate figures and tables are available in the GitHub repository https://github.com/qgg-lab/dgrp-plasticity-eqtl.
Code availability
All code used for the analyses and for generating the figures and tables is available in the GitHub repository https://github.com/qgg-lab/dgrp-plasticity-eqtl.
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Wen Huang, Email: huangw53@msu.edu.
Trudy F. C. Mackay, Email: tmackay@clemson.edu
Supplementary information
Supplementary information is available for this paper at 10.1038/s41467-020-19131-y.
References
1. Price TD, Qvarnström A, Irwin DE. The role of phenotypic plasticity in driving genetic evolution. Proc. R. Soc. Lond. Ser. B Biol. Sci. 2003;270:1433–1440. doi: 10.1098/rspb.2003.2372.
2. Rutherford SL, Lindquist S. Hsp90 as a capacitor for morphological evolution. Nature. 1998;396:336–342. doi: 10.1038/24550.
3. Waddington CH. Canalization of development and the inheritance of acquired characters. Nature. 1942;150:563–565. doi: 10.1038/150563a0.
4. Gibson G, Wagner G. Canalization in evolutionary genetics: a stabilizing theory? BioEssays. 2000;22:372–380. doi: 10.1002/(SICI)1521-1878(200004)22:4<372::AID-BIES7>3.0.CO;2-J.
5. Geiler-Samerotte K, Sartori FMO, Siegal ML. Decanalizing thinking on genetic canalization. Semin. Cell Dev. Biol. 2019;88:54–66. doi: 10.1016/j.semcdb.2018.05.008.
6. Levy SF, Siegal ML. Network hubs buffer environmental variation in Saccharomyces cerevisiae. PLoS Biol. 2008;6:e264. doi: 10.1371/journal.pbio.0060264.
7. Richardson JB, Uppendahl LD, Traficante MK, Levy SF, Siegal ML. Histone variant HTZ1 shows extensive epistasis with, but does not increase robustness to, new mutations. PLoS Genet. 2013;9:e1003733. doi: 10.1371/journal.pgen.1003733.
8. Gillespie JH, Turelli M. Genotype-environment interactions and the maintenance of polygenic variation. Genetics. 1989;121:129–138. doi: 10.1093/genetics/121.1.129.
9. Doust AN, et al. Beyond the single gene: How epistasis and gene-by-environment effects influence crop domestication. Proc. Natl Acad. Sci. U.S.A. 2014;111:6178–6183. doi: 10.1073/pnas.1308940110.
10. Des Marais DL, Hernandez KM, Juenger TE. Genotype-by-Environment Interaction and Plasticity: Exploring Genomic Responses of Plants to the Abiotic Environment. Annu. Rev. Ecol. Evol. Syst. 2013;44:5–29. doi: 10.1146/annurev-ecolsys-110512-135806.
11. Rauw WM, Gomez-Raya L. Genotype by environment interaction and breeding for robustness in livestock. Front. Genet. 2015;6:310. doi: 10.3389/fgene.2015.00310.
12. Eichelbaum M, Ingelman-Sundberg M, Evans WE. Pharmacogenomics and individualized drug therapy. Annu. Rev. Med. 2006;57:119–137. doi: 10.1146/annurev.med.56.082103.104724.
13. López-Maury L, Marguerat S, Bähler J. Tuning gene expression to changing environments: from rapid responses to evolutionary adaptation. Nat. Rev. Genet. 2008;9:583–593. doi: 10.1038/nrg2398.
14. Fry JD, Nuzhdin SV, Pasyukova EG, Mackay TFC. QTL mapping of genotype–environment interaction for fitness in Drosophila melanogaster. Genet. Res. 1998;71:133–141. doi: 10.1017/S0016672398003176.
15. Huang W, et al. Genetic basis of transcriptome diversity in Drosophila melanogaster. Proc. Natl Acad. Sci. 2015;112:E6010–E6019. doi: 10.1073/pnas.1519159112.
16. Mackay TFC, et al. The Drosophila melanogaster genetic reference panel. Nature. 2012;482:173–178. doi: 10.1038/nature10811.
17. Leek JT, Storey JD. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet. 2007;3:1724–1735. doi: 10.1371/journal.pgen.0030161.
18. Levine MT, Eckert ML, Begun DJ. Whole-genome expression plasticity across tropical and temperate Drosophila melanogaster populations from Eastern Australia. Mol. Biol. Evol. 2011;28:249–256. doi: 10.1093/molbev/msq197.
19. Chen J, Nolte V, Schlötterer C. Temperature-related reaction norms of gene expression: regulatory architecture and functional implications. Mol. Biol. Evol. 2015;32:2393–2402. doi: 10.1093/molbev/msv120.
20. Everett LJ, et al. Gene expression networks in the Drosophila genetic reference panel. Genome Res. 2020;30:485–496. doi: 10.1101/gr.257592.119.
21. Barreiro LB, et al. Deciphering the genetic architecture of variation in the immune response to Mycobacterium tuberculosis infection. Proc. Natl Acad. Sci. 2011;109:1204–1209. doi: 10.1073/pnas.1115761109.
22. Fairfax BP, et al. Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression. Science. 2014;343:1246949. doi: 10.1126/science.1246949.
23. Lee MN, et al. Common genetic variants modulate pathogen-sensing responses in human dendritic cells. Science. 2014;343:1246980. doi: 10.1126/science.1246980.
24. Ardlie KG, et al. The genotype-tissue expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science. 2015;348:648–660. doi: 10.1126/science.1262110.
25. de la Fuente A. From ‘differential expression’ to ‘differential networking’ - identification of dysfunctional regulatory networks in diseases. Trends Genet. 2010;26:326–333. doi: 10.1016/j.tig.2010.05.001.
26. Lea A, et al. Genetic and environmental perturbations lead to regulatory decoherence. Elife. 2019;8:e40538. doi: 10.7554/eLife.40538.
27. Ritchie SC, et al. A scalable permutation approach reveals replication and preservation patterns of network modules in large datasets. Cell Syst. 2016;3:71–82. doi: 10.1016/j.cels.2016.06.012.
28. Huang W, et al. Spontaneous mutations and the origin and maintenance of quantitative genetic variation. Elife. 2016;5:e14625. doi: 10.7554/eLife.14625.
29. Hutter S, Saminadin-Peter SS, Stephan W, Parsch J. Gene expression variation in African and European populations of Drosophila melanogaster. Genome Biol. 2008;9:R12. doi: 10.1186/gb-2008-9-1-r12.
30. Zhao L, Wit J, Svetec N, Begun DJ. Parallel gene expression differences between low and high latitude populations of Drosophila melanogaster and D. simulans. PLoS Genet. 2015;11:e1005184. doi: 10.1371/journal.pgen.1005184.
31. Partridge L, Barton NH. Evolving evolvability. Nature. 2000;407:457–458. doi: 10.1038/35035173.
32. Gamazon ER, et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 2015;47:1091–1098. doi: 10.1038/ng.3367.
33. Alpert MH, et al. A circuit encoding absolute cold temperature in Drosophila. Curr. Biol. 2020. doi: 10.1016/j.cub.2020.04.038.
34. Boyle EA, Li YI, Pritchard JK. An expanded view of complex traits: from polygenic to omnigenic. Cell. 2017;169:1177–1186. doi: 10.1016/j.cell.2017.05.038.
35. Barghi N, et al. Genetic redundancy fuels polygenic adaptation in Drosophila. PLoS Biol. 2019;17:e3000128. doi: 10.1371/journal.pbio.3000128.
36. Espinosa-Soto C. Selection for distinct gene expression properties favours the evolution of mutational robustness in gene regulatory networks. J. Evol. Biol. 2016;29:2321–2333. doi: 10.1111/jeb.12959.
37. Wagner A. Robustness against mutations in genetic networks of yeast. Nat. Genet. 2000;24:355–361. doi: 10.1038/74174.
38. Kim D, et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14:R36. doi: 10.1186/gb-2013-14-4-r36.
39. Trapnell C, et al. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat. Biotechnol. 2013;31:46–53. doi: 10.1038/nbt.2450.
40. Wu Z, Irizarry RA, Gentleman R, Martinez-Murillo F, Spencer F. A model-based background adjustment for oligonucleotide expression arrays. J. Am. Stat. Assoc. 2004;99:909–917. doi: 10.1198/016214504000000683.
41. Bolstad BM, Irizarry RA, Åstrand M, Speed TP. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003;19:185–193. doi: 10.1093/bioinformatics/19.2.185.
42. Subramanian A, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. U. S. A. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
43. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 2008;9:559. doi: 10.1186/1471-2105-9-559.