Abstract
Genetic changes affecting gene expression contribute to phenotypic divergence; thus, understanding how regulatory networks controlling gene expression change over time is critical for understanding evolution. Prior studies of expression differences within and between species have identified properties of regulatory divergence, but technical and biological differences among these studies make it difficult to assess the generality of these properties or to understand how regulatory changes accumulate with divergence time. Here, we address these issues by comparing gene expression among strains and species of Drosophila with a range of divergence times and use F1 hybrids to examine inheritance patterns and disentangle cis- and trans-regulatory changes. We find that the fixation of compensatory changes has caused the regulation of gene expression to diverge more rapidly than gene expression itself. Specifically, we observed that the proportion of genes with evidence of cis-regulatory divergence has increased more rapidly with divergence time than the proportion of genes with evidence of expression differences. Surprisingly, the amount of expression divergence explained by cis-regulatory changes did not increase steadily with divergence time, as was previously proposed. Rather, one species (Drosophila sechellia) showed an excess of cis-regulatory divergence that we argue most likely resulted from positive selection in this lineage. Taken together, this work reveals not only the rate at which gene expression evolves, but also the molecular and evolutionary mechanisms responsible for this evolution.
Understanding the relationship between tempo (the rate at which a trait evolves) and mode (the manner in which a trait evolves) is essential for understanding the evolutionary process (Simpson 1944). This is true not only for organismal phenotypes, but also for the molecular phenotypes that produce organismal traits. Gene expression is one such molecular phenotype (Gordon and Ruvinsky 2012); it is essential for organismal form, fitness, and function, and frequently varies within and between species. Comparative studies using genomic surveys of gene expression in yeast (Busby et al. 2011), Drosophila (Rifkin et al. 2003), and mammalian species (Brawand et al. 2011) with a range of divergence times have provided insight into the tempo of gene expression evolution, but the mode and its relationship to tempo remain less well understood.
Elucidating the mode of gene expression evolution includes identifying the types of regulatory changes that have evolved as well as how interactions among divergent regulatory alleles affect gene expression. F1 hybrids, in which divergent regulatory alleles interact in the same cellular environment, can be used to investigate these issues. Allele-specific expression in F1 hybrids separates the effects of cis- and trans-regulatory changes affecting a gene’s expression by providing a readout of relative cis-regulatory activity in a common trans-regulatory environment (Cowles et al. 2002). Expression differences between genotypes not attributed to cis-regulatory changes are inferred to be caused by trans-regulatory divergence (Wittkopp et al. 2004). In addition, the net effects of interactions among divergent regulatory alleles are revealed by comparing levels of total expression in F1 hybrids to parental genotypes.
This approach was initially used to separate cis- and trans-regulatory effects of divergence affecting expression of dozens of genes. These studies suggested that (1) cis-regulatory changes are more common than trans-regulatory changes between species (Wittkopp et al. 2004); (2) genes with cis- and trans-acting changes favoring expression of opposite alleles are more likely than other types of changes to cause misexpression in F1 hybrids (Landry et al. 2005); (3) environmental factors modulate relative cis-regulatory activity (de Meaux et al. 2006); (4) cis-regulatory variation is abundant in natural populations (Osada et al. 2006; Genissel et al. 2007; Campbell et al. 2008; Gruber and Long 2009); and (5) the amount of expression divergence attributable to cis-acting changes is greater between than within species (Wittkopp et al. 2008).
More recently, microarrays and RNA-seq have been used to extend these analyses to the genomic scale (Wang et al. 2008; Graze et al. 2009; Tirosh et al. 2009; Zhang and Borevitz 2009; Fontanillas et al. 2010; McManus et al. 2010; He et al. 2012; Shi et al. 2012; Coolon and Wittkopp 2013; Levy et al. 2013; Schaefke et al. 2013). In some cases, relationships seen in the smaller scale studies were replicated. For example, cis- and trans-regulatory changes with effects in opposite directions were overrepresented among misexpressed genes (Tirosh et al. 2009; McManus et al. 2010; Schaefke et al. 2013) and cis-regulatory changes explained more of the expression differences between than within species (Tirosh et al. 2009; Emerson et al. 2010). Other observations, such as the relative proportion of genes with evidence of cis- and/or trans-regulatory changes, were much more variable among studies. Finally, novel patterns, such as the relationship between dominance and cis/trans-regulatory changes (Lemos et al. 2008; McManus et al. 2010) and the frequency of compensatory cis- and trans-regulatory variants (Tirosh et al. 2009; Goncalves et al. 2012; Shi et al. 2012), were identified.
Despite this growing collection of case studies examining the types of changes responsible for expression differences within and/or between particular pairs of species, the use of different organisms (flies, yeast, plants, and mice), techniques (pyrosequencing, microarrays, RNA-seq), and analysis methods (linear models, exact tests, and Bayesian approaches) among these studies precludes the type of meta-analysis needed to determine how the mode of regulatory evolution changes with divergence time and to robustly assess the generality of relationships reported in previous studies. To address these issues, we examined the tempo and mode of regulatory evolution in concert using strains and species of Drosophila with a range of divergence times.
Results
Experimental overview
mRNA abundance was compared among (1) African and non-African strains of Drosophila melanogaster (mel-mel), which have been geographically isolated for ∼10,000 yr and show evidence of behavioral isolation (David and Capy 1988; Lachaise et al. 1988; Wu et al. 1995; Hollocher et al. 1997) and expression divergence (Hutter et al. 2008); (2) D. simulans and D. sechellia (sim-sech), which diverged ∼250,000 yr ago (Garrigan et al. 2012); and (3) D. melanogaster and D. simulans (mel-sim), which diverged ∼2.5 million yr ago (Fig. 1A; Cutter 2008). For each of these genotypes, we derived a strain-specific genome sequence and used RNA-seq to measure mRNA abundance (hereafter referred to as expression) in a pool of 20 adult female flies. Reciprocal crosses were performed for each of the three pairs of genotypes (mel-mel, sim-sech, and mel-sim), and RNA-seq was used to measure both total and allele-specific expression in pools of 20 female F1 hybrids from each cross (Fig. 1B). Sequence divergence observed in transcribed regions of these strains correlated with published estimates of divergence time (Fig. 1C) as well as the number of RNA-seq reads informative for allele-specific expression (Fig. 1D). Gene-specific and allele-specific read counts were used to investigate regulatory evolution as shown in Supplemental Figure S1.
Quantifying gene expression levels
For each comparison (mel-mel, sim-sech, and mel-sim), RNA-seq reads from the two strains or species and their F1 hybrids were aligned to the relevant genomes and mapped to specific genes. Differences in sequencing depth among libraries (Supplemental Table S1) were eliminated by using random sampling without replacement to produce a data set with the same number of mapped reads for each sample. After excluding genes with fewer than 20 mapped reads in any sample (Supplemental Table S2), 7587 genes were deemed suitable for comparing total expression levels between all pairs of genotypes and their F1 hybrids (Data set 1), which is 83% of the genes classified as expressed in D. melanogaster adult females by modENCODE (Graveley et al. 2011).
Measures of relative gene expression derived from these mapped and normalized RNA-seq data correlated well with estimates of relative gene expression derived from independent pyrosequencing experiments (Supplemental Fig. S2A; Ahmadian et al. 2000). Genome-wide, expression levels between F1 hybrids from reciprocal crosses were also highly correlated (Fig. 2A; Supplemental Fig. S3). Despite this similarity, Fisher’s exact tests (FETs) with a false discovery rate (FDR) of 0.05 identified significant expression differences between reciprocal hybrids for 26%–49% of individual genes (Fig. 2B). Most of these significant expression differences were small in magnitude (median expression difference = 1.20- to 1.25-fold) (Supplemental Fig. S4), however, they reflect the sensitivity of the Fisher’s exact test for detecting differences in relative expression from RNA-seq data when read counts are high. These differences in expression between hybrids from reciprocal crosses provide a conservative baseline for expression differences detected in the mel-mel, sim-sech, and mel-sim comparisons because they include variance from technical and biological replication as well as parent-of-origin effects.
Evolution of expression differences
To determine the tempo of regulatory divergence, we compared total expression levels in the mel-mel, sim-sech, and mel-sim comparisons for the set of 7587 genes in Data set 1 described above. First, we analyzed overall expression divergence (1 − Spearman’s ρ, see Methods) and found that it increased consistently and significantly with divergence time (Fig. 2A; Supplemental Fig. S5). We then used FETs to compare expression levels for individual genes and determine whether the increased overall expression divergence resulted from more genes with divergent expression or more divergent expression of similar numbers of genes. Surprisingly, we found that the proportion of genes with significant expression differences did not increase consistently with divergence time (Fig. 2B), suggesting that increasing magnitudes of expression differences rather than increasing numbers of genes with divergent expression drive the overall increase in expression differences with divergence time observed.
We also examined the evolutionary trajectories of individual genes by assigning each of the 7587 genes in Data set 1 to one of nine classes depending on whether its expression difference increased, decreased, or remained similar between mel-mel and sim-sech and between sim-sech and mel-sim. Expression differences less than 1.25-fold were considered similar for this analysis to minimize the impact of small but statistically significant expression differences (Supplemental Fig. S6). Despite observing that expression differences increased with divergence time on a genomic scale (Fig. 2A), this pattern was only seen for 2% of individual genes (Fig. 2C, class I). Expression differences of similar magnitude in all three comparisons were much more common (43% of all genes examined) and tended to be small in magnitude (median expression difference = 1.18-fold) (Fig. 2C, class II). The remaining 55% of genes fell into one of seven categories in which two of the three comparisons showed similar expression differences (Fig. 2C, class III). Interestingly, nearly half (45%) of such genes showed similar expression differences in mel-mel and mel-sim but larger or smaller expression differences in the sim-sech comparison (Fig. 2C, IIIc and IIId), which has an intermediate divergence time.
Evolution of regulatory incompatibilities
Divergence of the regulatory networks controlling gene expression can cause misexpression in F1 hybrids that can contribute to speciation (Meiklejohn et al. 2003; Michalak and Noor 2004; Ranz et al. 2004; Haerty and Singh 2006; Moehring et al. 2007; Maheshwari and Barbash 2012). This can occur, for example, when proteins and/or DNA with sequence-specific interactions coevolve such that divergent alleles of the interacting molecules do not function properly together in F1 hybrids. To determine the rate at which misexpression resulting from such regulatory incompatibilities evolves, we compared expression levels in mel-mel, sim-sech, and mel-sim F1 hybrids to expression levels in the corresponding parental genotypes. We found that overall expression differences between parents and F1 hybrids increased with divergence time, most dramatically in the mel-sim comparisons (Fig. 2A; Supplemental Fig. S7). A similar increase was seen in the proportion of genes showing misexpression in F1 hybrids (Fig. 2B). The much more extensive misexpression seen in mel-sim F1 hybrids compared with mel-mel or sim-sech F1 hybrids is consistent with mel-sim F1 hybrid females having morphological defects that cause sterility (Dickinson et al. 1984) and mel-mel and sim-sech F1 hybrid females being completely fertile (Lachaise et al. 1986; Hollocher et al. 1997).
To further investigate the inheritance of gene expression levels and how inheritance patterns change over evolutionary time, we considered each gene separately and classified its expression in F1 hybrids as dominant, additive, misexpressed (i.e., over- or under-dominant), or similar (Supplemental Fig. S8). To minimize the impact of small but statistically significant expression differences on this analysis (Supplemental Fig. S9), we considered expression similar between genotypes if the expression difference was less than 1.25-fold. In the mel-mel F1 hybrids, we found that 7% of genes showed additivity, 14% showed misexpression, and 43% showed dominant inheritance. The remaining 36% of genes showed similar expression in both strains of D. melanogaster and in their F1 hybrids. The proportions of genes with additive and dominant inheritance decreased consistently with divergence time, whereas the proportion of genes showing misexpression increased dramatically with divergence time (Fig. 2D).
Using allele-specific RNA-seq reads to study regulatory evolution
Differences in gene expression can be caused by changes in cis- and/or trans-regulation. Understanding the relative contribution of these two types of changes is critical for understanding the mode of regulatory evolution (Gordon and Ruvinsky 2012). To separate the effects of cis- and trans-regulatory divergence, we analyzed allele-specific expression in F1 hybrids and contrasted it with comparable measures of total expression differences between parental genotypes derived from allele-specific reads in “mixed parental” samples. These mixed parental samples were constructed in silico by combining equal numbers of mapped RNA-seq reads from each parental genotype and subjected to the same bioinformatic analysis as the reads from F1 hybrids. Expression differences between alleles in F1 hybrids were attributed to cis-regulatory differences, and differences in relative expression between parental genotypes that were not explained by differences in cis-regulatory activity were attributed to trans-regulatory divergence (Wittkopp et al. 2004).
For each F1 hybrid and mixed parent sample, RNA-seq reads that aligned perfectly and uniquely to one parental genome but not the other were considered allele-specific. Genes with low confidence allele assignments (see Supplemental Material), fewer than 20 total allele-specific reads, or expression consistent with genomic imprinting in any comparison were excluded from analysis (Supplemental Table S3). For each of the remaining 4851 genes, differences in the number of allele-specific reads among comparisons were eliminated by using hypergeometric sampling to produce a data set with the same number of allele-specific reads in all comparisons (Data set 2). Measures of relative total expression derived from allele-specific reads in the mixed parental samples were strongly correlated with measures of relative total expression derived from the full RNA-seq data set (Supplemental Fig. S10) and pyrosequencing (Supplemental Fig. S2B). Relative allele-specific expression in F1 hybrids also showed a strong correlation between the RNA-seq and pyrosequencing data (Supplemental Fig. S2C) and was similar in F1 hybrids from reciprocal crosses (Supplemental Fig. S11). In the analyses described below, hybrids from reciprocal crosses were considered separately, with results from one hybrid for each comparison presented in the main text and results from the other hybrid presented in the Supplemental Material. With few exceptions (noted below), results were similar between reciprocal hybrids.
Evolution of cis- and trans-regulation
To determine the rate of cis-regulatory divergence and compare it with the rate of total expression divergence for the same genes, we contrasted overall differences in relative allelic abundance between the F1 hybrid and mixed parental samples for the 4851 genes deemed suitable for measuring allele-specific expression (Data set 2). Compared with the 7587 genes discussed above (Data set 1), this set of genes showed more similar levels of overall expression differences among the three comparisons (Fig. 3A; Supplemental Figs. S12A, S13A–C), resulting from increased expression divergence in mel-mel and sim-sech in Data set 2 relative to Data set 1 (Supplemental Fig. S14). Despite this similarity in total expression differences among comparisons, we found that cis-regulatory differences were greater between than within species, with similar differences in relative cis-regulatory activity observed in sim-sech and mel-sim (Fig. 3A; Supplemental Figs. S12A, S13D–I). Comparing the proportions of genes with statistically significant differences in total expression and cis-regulatory activity showed a similar pattern, except that the proportion of genes with evidence of a cis-regulatory difference increased consistently and significantly with divergence time (Fig. 3B; Supplemental Fig. S12B). This suggests that the greater overall cis-regulatory divergence observed in the sim-sech comparison for these 4851 genes results from large differences in relative cis-regulatory activity for some genes rather than an excess of genes with divergent cis-regulatory activity.
We also compared the evolutionary trajectories of individual genes for total expression differences (Fig. 3C) and relative cis-regulatory activity (Fig. 3D; Supplemental Fig. S15) by dividing the 4851 genes in Data set 2 into the same nine classes described above for Data set 1 (Fig. 2C). Compared with total expression, we found that more genes showed consistent and small (median = 1.16-fold) differences in relative cis-regulatory activity in all three comparisons (Fig. 3C, II, 3D, II). We also observed more genes with unique differences in cis-regulatory activity in sim-sech (Fig. 3D, IIIc,d) and mel-sim (Fig. 3D, IIIe,f) that were greater in these comparisons than the other two comparisons. In other words, genes with a similar difference in cis-regulatory activity in mel-mel and mel-sim but not sim-sech were more likely to show increased than decreased divergence in sim-sech relative to the other two comparisons. Such asymmetry was much less pronounced for levels of total expression (Fig. 3C), suggesting that trans-acting changes have compensated for differences in cis-regulatory activity in many cases.
Differences between divergent cis-regulatory activity and total gene expression are caused by the divergence of trans-regulatory factors. We found that significantly more genes showed evidence of trans-regulatory differences in the mel-mel and mel-sim comparisons than in the sim-sech comparison (Fig. 3B; Supplemental Fig. S12B). This suggests that cis-regulatory divergence accounts for a larger proportion of overall expression divergence in sim-sech than in mel-mel or mel-sim. Consistent with this inference, a regression analysis showed that cis-regulatory differences explained more of the expression differences between D. simulans and D. sechellia than between either of the other two pairs of genotypes (Supplemental Fig. S16).
As overall sequence divergence increases, the number of loci with variation affecting expression of each gene is also expected to increase. Consistent with this expectation, we found that the proportion of genes with regulatory changes showing evidence of both cis- and trans-regulatory changes increased with divergence time, although the increase between the mel-mel and sim-sech comparisons was only statistically significant for one of the two hybrids (Fig. 3E; Supplemental Fig. S12C). For the majority of these genes, the cis- and trans-regulatory changes favored expression of alternative alleles (Fig. 3F; Supplemental Fig. S12D), suggesting that stabilizing selection has favored regulatory mutations that reduce expression differences. As described above, this type of developmental systems drift (True and Haag 2001) is thought to cause misexpression in F1 hybrids (Michalak and Noor 2004; Ranz et al. 2004; Landry et al. 2005; McManus et al. 2010; Barrière et al. 2012; Maheshwari and Barbash 2012). The frequency of genes with compensatory cis- and trans-regulatory changes did not increase steadily with divergence time; however, cis- and trans-regulatory changes favoring expression of opposite alleles were observed least often in the sim-sech comparison (Fig. 3F; Supplemental Fig. S12D). Contrary to prior studies (Landry et al. 2005; Tirosh et al. 2009; McManus et al. 2010), we found that genes affected by cis- and trans-regulatory changes with opposing effects on total expression levels were not more likely to show misexpression in F1 hybrids (Fig. 3G; Supplemental Fig. S12E).
To determine how the relative effects of cis- and trans-regulatory changes vary with divergence time, we calculated the percentage of total regulatory divergence attributable to cis-regulatory changes for each gene. This value is referred to as “percent cis” (% cis), and prior studies of flies (Wittkopp et al. 2008; McManus et al. 2010) and yeast (Emerson et al. 2010) found it to be larger between than within species. We also found that % cis was larger between than within species; however, in contrast to prior predictions (Wittkopp et al. 2008), % cis did not increase systematically with divergence time. Rather, it was largest for the sim-sech comparison with intermediate divergence time (Fig. 4A; Supplemental Fig. S17A). A correlation between % cis and total expression divergence for individual genes was previously reported between D. melanogaster and D. sechellia (McManus et al. 2010), but we did not observe this pattern for any of the three comparisons (Supplemental Fig. S17B–G). Finally, two prior studies (Lemos et al. 2008; McManus et al. 2010) reported that % cis was higher for genes showing additive than nonadditive (i.e., dominant, over-dominant, or under-dominant) inheritance. We observed this relationship only for the comparison of D. simulans and D. sechellia in one hybrid (Fig. 4B; Supplemental Fig. S17H), suggesting that it is also not a general feature of regulatory evolution.
Discussion
Researchers have been comparing genomic patterns of expression divergence among species for over a decade using microarrays, but sequence divergence between microarray probes and RNA samples often complicates comparisons among species and differences in normalization and statistical analyses can complicate comparisons among studies. Here, we use RNA-seq data to determine the tempo and mode of regulatory evolution among four divergent strains and species of Drosophila. This technique is better suited for interspecific comparisons than microarrays because it uses full sequence information instead of hybridization signals to determine gene expression levels, allowing more direct comparisons among species and studies.
RNA-seq was also recently used to compare expression levels in six different tissues among nine mammalian species and a bird (Brawand et al. 2011). Using Spearman’s rank correlation coefficient ρ to compare overall expression differences in each pair of species, this study showed that expression similarity decreased quickly over shorter divergence times and then slowed. Patterns of expression divergence were strikingly similar among RNA samples from brain (cerebral cortex or whole brain without cerebellum), cerebellum, heart, kidney, and liver, with accelerated expression divergence in RNA samples from testes (Brawand et al. 2011). By combining our data with data from three previous studies (McManus et al. 2010; Meisel et al. 2012; Suvorov et al. 2013), we found that expression divergence among Drosophila species showed a similar pattern to that of mammals, but on a different timescale (Fig. 5A,B). The Drosophila data showed greater expression divergence (lower values of ρ) than the mammalian data, which could be due to differences in tissue size among Drosophila species given that whole bodies rather than single tissues were used to generate these data. RNA-seq has also been used to compare expression divergence among four species of yeast (Busby et al. 2011), but it is difficult to compare the tempo in yeast to that of Drosophila and mammals because only three divergence time points were sampled (Fig. 5C).
For each gene, interspecific expression differences can be caused by cis- and/or trans-regulatory changes. When F1 hybrids can be made between species, measures of allele-specific expression can be used to disentangle the net effects of these two types of changes (Wittkopp et al. 2004). Such analyses have been reported for closely related pairs of strains or species in yeast (Tirosh et al. 2009; Emerson et al. 2010; Schaefke et al. 2013), flies (Graze et al. 2009; McManus et al. 2010; Coolon et al. 2012), plants (Shi et al. 2012; Bell et al. 2013), fishes (Murata et al. 2012; Shen et al. 2012), and mice (Goncalves et al. 2012). To the best of our knowledge, this is the first genomic study collecting data on cis- and trans-regulatory divergence for more than one pair of genotypes. As such, it provided unprecedented insight into the rate at which cis- and trans-regulatory changes evolve and allowed us to better assess the generality of relationships reported in other studies.
Compensatory cis- and trans-regulatory changes are common
We found that the number of genes with evidence for cis-regulatory divergence increased linearly with divergence time, but the number of genes with differences in total expression did not (Fig. 3B; Supplemental Fig. S12D). This suggests that trans-regulatory factors might often compensate for cis-regulatory differences at the level of total gene expression, either by fixing compensatory trans-regulatory variants or by feedback mechanisms affecting availability or activity of trans-acting factors (McManus et al. 2014). Consistent with this interpretation, cis- and trans-acting changes affecting expression of the same gene had opposite effects on expression levels 79%, 73%, and 87% of the time in the mel-mel, sim-sech, and mel-sim comparisons, respectively (Fig. 3F). The exponential accumulation of genes that are misexpressed in F1 hybrids (Fig. 2D) is also consistent with compensatory changes playing an important role in maintaining gene expression levels over evolutionary time (Landry et al. 2005). Such compensation can result from stabilizing selection acting to maintain similar expression levels in the face of new mutations, and has been seen not only in flies, but also in yeast (Tirosh et al. 2009), mice (Goncalves et al. 2012), and plants (Shi et al. 2012).
Compensation for cis-regulatory divergence resulting from the fixation of trans-acting changes could evolve by fixing cis-acting mutations first and then compensating trans-acting mutations, or vice versa. We favor the latter model because trans-acting mutations appear to arise more frequently than cis-acting mutations for individual genes (Gruber et al. 2012) and most trans-acting mutations that compensate for cis-regulatory divergence of one gene are expected to have deleterious pleiotropic effects on expression of other genes (Wray et al. 2003; Carroll 2008; Stern and Orgogozo 2008). Goncalves et al. (2012) favored a similar explanation for the extensive compensatory cis- and trans-regulatory changes they observed between strains of mice. An example of such trans-regulatory divergence subsequently compensated for by cis-regulatory changes has been described in yeast (Kuo et al. 2010). Regardless of which type of regulatory mutation is usually fixed first, it is clear that the regulatory networks controlling gene expression evolve more rapidly than the output from these networks.
Relative impact of selection and drift on regulatory evolution
A common goal for comparative studies of gene expression is identifying the selective and nonselective forces responsible for patterns of divergence and conservation, but this is not straightforward (Gilad et al. 2006b; Fay and Wittkopp 2008; Emerson et al. 2010). Without the biological replication needed to make statistically robust inferences based on alternative evolutionary models (e.g., Rifkin et al. 2003; Fay and Wittkopp 2008; Bedford and Hartl 2009; Brawand et al. 2011), we can only make speculative statements about the evolutionary processes responsible for each of the nine different trajectories of expression divergence we observed (Figs. 2C, 3C,D). For example, genes with similar (and typically small) expression differences in all three comparisons (class II in Figs. 2C and 3C,D) may either have low mutation-drift variance or be subject primarily to stabilizing selection. This is the most abundant class of genes for both total expression and cis-regulatory activity with 43% and 34% of genes showing this pattern for total expression in Data sets 1 and 2, respectively, and 48% of genes showing this pattern for differences in cis-regulatory activity in Data set 2. This is consistent with prior work suggesting that stabilizing selection has had a larger impact on the evolution of gene expression than genetic drift (Hsieh et al. 2003; Rifkin et al. 2003; Lemos et al. 2005; Gilad et al. 2006a; Xing et al. 2007; Kalinka et al. 2010). Indeed, <2.2% of genes in each comparison showed the increasing differences in total expression and/or cis-regulatory activity with divergence time (class I in Figs. 2C, 3C,D, and Supplemental Fig. S15) that are expected when expression evolves primarily due to genetic drift (Khaitovich et al. 2004; Gilad et al. 2006a). The remaining genes fell into one of seven categories consistent with variable selection pressures among lineages (class III in Figs. 2C and 3C,D).
Lineage-specific regulatory changes in D. sechellia
Gene-specific patterns of total expression divergence consistent with lineage-specific selection were more abundant in sim-sech than mel-mel or mel-sim for both Data sets 1 and 2 despite the sim-sech comparison having an intermediate divergence time (Figs. 2C, 3C). This is consistent with D. sechellia being an island endemic species with a small effective population size that has evolved many novel phenotypes relative to D. melanogaster and D. simulans (Orgogozo and Stern 2009), including adaptation to a new host plant (Jones 2005). As a consequence of this evolutionary history, D. sechellia might have fixed more deleterious mutations than the other two species by drift as well as more adaptive substitutions by positive selection. We observed an apparent excess of cis-regulatory divergence between D. simulans and D. sechellia (Figs. 3A, 4A; Supplemental Fig. S16) that we believe is more likely to result from positive selection than drift because (1) trans-acting variation contributes more than cis-acting variation to polymorphic expression within species (Lemos et al. 2008; Wittkopp et al. 2008; Emerson et al. 2010), suggesting that drift is more likely to fix trans-acting than cis-acting variants; (2) cis- and trans-regulatory changes affecting expression of the same gene were most likely to act in the same direction in the sim-sech comparison (Fig. 3F), which is consistent with positive, directional selection; and (3) simulation studies have shown that cis-regulatory divergence is more likely to be driven by natural selection than trans-regulatory divergence (Emerson et al. 2010). These results emphasize the importance of considering not only divergence time, but also the demographic and ecological history of individual species when studying the tempo and mode of evolution.
Methods
Fly strains, rearing, and collections
Four Drosophila genotypes were used for this study: the D. melanogaster North American zhr strain [full genotype: XYS.YL.Df(1)Zhr] (Sawamura and Yamamoto 1993; Ferree and Barbash 2009), the D. melanogaster Zimbabwean isofemale strain z30 (Begun and Aquadro 1993; Wu et al. 1995), the sequenced D. sechellia strain (droSec1 [14021-0428.25]), and an isofemale strain of D. simulans (Tsimbazaza) that mates well with D. melanogaster (Hollocher et al. 2000). All flies were reared on cornmeal medium using a 16:8 light:dark cycle at 20°C. Just prior to the start of the experiment, all strains were subjected to 10 generations of sibling pair matings to reduce genome-wide heterozygosity, followed by three generations of population expansion to generate the quantity of flies needed for crosses. For each cross between strains of D. melanogaster, 10 vials were set up with three female and three male flies each. For each interspecific cross, 30 vials were set up with three female and three male flies each. Virgin female progeny were allowed to mate from the time of eclosion to 3 d post-eclosion, then males and females were separated and females aged to 7–10 d post-eclosion. All flies were collected between 9 and 10 am to minimize the effects of circadian rhythm and snap-frozen in liquid nitrogen.
Sample preparation and sequencing
For each genotype analyzed, a pool of 20 female flies was used for total RNA extraction with TRIzol reagent according to manufacturer instructions (Invitrogen). This incorporates variation from biological replication into a single sample. Prior work has shown that expression for most genes is similar among replicate pools constructed in this way (Wittkopp et al. 2004, 2008; Coolon et al. 2012). Genomic DNA (gDNA) was extracted from a separate pool of 20 flies for each genotype using the DNeasy Blood & Tissue Kit (Qiagen). Illumina sequencing libraries for RNA-seq were prepared as previously reported (McManus et al. 2010; Coolon et al. 2012). Briefly, 10 µg of total RNA from each sample was treated with DNase I (Invitrogen) followed by poly(A)+ selection using Dynal magnetic beads (Invitrogen). Poly(A)+ RNA was fragmented using RNA fragmentation reagent (Ambion) before cDNA synthesis. Double-stranded cDNA was produced using random hexamers and SuperScript II reverse transcriptase (Invitrogen). cDNA was run on a 2% agarose gel and the region corresponding to ∼300-bp fragments was extracted. The size-selected double-stranded cDNA extracted from this gel slice was used in the Paired-End Genomic DNA Library Preparation Kit (Illumina) according to manufacturer’s recommendations. For the gDNA sequencing libraries, 10 µg of gDNA was used with the Paired-End Genomic DNA Library Preparation Kit (Illumina), following manufacturer’s recommendations. Each cDNA and gDNA library was subjected to a full lane of paired-end sequencing on an Illumina Genome Analyzer IIx using 76 cycles. On average, 24 million 76-bp, paired-end sequence reads were generated from each sequencing library (Supplemental Table S1). The zhr gDNA sample was also sequenced from a single end on an additional lane for 76 cycles per read. Images were analyzed using the Firecrest and Bustard modules to generate sequence and quality scores for each read.
Resequencing, genome assembly, and sequence divergence
Using the gDNA sequences, we constructed a strain-specific genome sequence for each genotype as described in the Supplemental Material. To determine percent sequence divergence in each comparison (mel-mel, sim-sec, mel-sim), we created reverse chain files to liftOver coordinates from D. melanogaster dm3 space to each of the other strain or species genomic space (zhr, z30, Tsimbazaza, droSec1) using the chainSwap utility from the UCSC Genome Browser (Kent et al. 2002). Using these chain files, we converted the dm3 genomic coordinates for each exon used for quantification in this study into their respective strain- or species-specific genomic coordinates. Using these coordinates, sequences for each exon were extracted from each strain- or species-specific genome. These sequences were aligned in pairs using Fast Statistical Alignment (FSA) (Bradley et al. 2009), and the number of divergent sites per gene was determined using custom perl scripts (pairwise_aln_FSA.pl, compare_pairwise.pl, seq_div_from_set.pl). Strain-specific genomes and chain files are provided in Supplemental File 1, and all custom perl scripts are included in Supplemental File 2.
Mapping sequencing reads to genes and alleles
We built a bioinformatics pipeline to measure total and allele-specific expression from Illumina sequencing outputs similar to those reported previously (McManus et al. 2010; Coolon et al. 2012). This pipeline, as well as the pyrosequencing methods used to validate measures of total and allele-specific expression derived from this pipeline, is described in the Supplemental Material.
Normalizing RNA-seq read counts among comparisons
Different numbers of sequence reads were recovered for each of the 10 cDNA libraries sequenced. These differences in read counts caused the Fisher’s exact tests used to identify significant changes in gene expression between pairs of genotypes to have differences in power among the mel-mel, sim-sech, and mel-sim comparisons. To equalize power in all three comparisons, we considered exactly 12,704,991 mapped reads from each RNA-seq data set by down-sampling mapped reads randomly without replacement in all but the D. sechellia data set, which already had exactly 12,704,991 mapped reads (Supplemental Table S2). A similar down-sampling strategy was recently used to investigate the power of different bioinformatic tools for identifying expression differences (Rapaport et al. 2013). We then excluded genes with fewer than 20 reads in any of the RNA-seq data sets, resulting in the same 7587 “expressed” genes being analyzed in each comparison (Supplemental Table S2). Simulations confirmed that a larger data set down-sampled in this way has the same power to detect significant expression differences with a Fisher’s exact test as a data set originally collected at the smaller sample size (data not shown). The exact data analyzed are provided in Supplemental Material as Data set 1.
Comparing total expression among genotypes
Spearman’s correlation coefficients (ρ) were used to measure overall expression differences between pairs of genotypes, following Brawand et al. (2011) and Meisel et al. (2012). Unlike Pearson’s r, Spearman’s ρ makes no assumptions about normality, linearity, or homoscedasticity. It is also less sensitive to outliers. Bootstrapping was used to test for statistically significant differences in ρ between mel-mel and sim-sech and between sim-sech and mel-sim by sampling with replacement 7587 gene-specific read counts from the observed 7587 genes 10,000 times using R, calculating ρ in each case, and determining the 2.5% and 97.5% percentiles. Significant differences were inferred when these 95% quantiles did not overlap.
We also tested for significant differences in expression level of individual genes by comparing the number of reads mapping to the focal gene to the number of reads mapping to the other 7586 genes between parental types, between reciprocal hybrids, and between each hybrid and parent using Fisher’s exact tests with a null hypothesis of equal expression in both samples. This test was used instead of other methods for detecting differential expression because it recovers a similar proportion of true positives with fewer false positives without requiring replicates (Tarazona et al. 2011). Fisher’s exact tests were also used to test for significant differences in the proportion of genes with significant differences between mel-mel and sim-sech and between sim-sech and mel-sim.
Inferring the mode of inheritance
To determine the mode of inheritance for each gene in each comparison, we followed the logic outlined in Gibson et al. (2004) and used previously for RNA-seq data in McManus et al. (2010). Using a 1.25-fold expression difference cutoff and total expression levels in the F1 hybrids and corresponding parental genotypes, we classified each gene as either “similar,” “additive,” “parent 1 dominant,” “parent 2 dominant,” “under-dominant,” or “over-dominant.” Dominant inheritance was inferred when total expression in the F1 hybrid was similar to expression in one of the parental genotypes but different from the other parental genotype. Such genes were classified as either “parent 1 dominant” or “parent 2 dominant” depending on which parent the F1 hybrid resembled. Additive inheritance was inferred when F1 hybrid expression was different from, and intermediate to, both parents; and misexpression was inferred when the total expression in the F1 hybrid was different from both parental genotypes and greater than (over-dominant) or less than (under-dominant) the more extreme parental expression level. Genes with similar expression in both parents and F1 hybrids were classified as similar. Fisher’s exact tests were used to test for significant differences in the proportion of genes in each category between mel-mel and sim-sech and between sim-sech and mel-sim.
Normalizing allele-specific RNA-seq read counts among comparisons
To equalize power when testing for cis-regulatory divergence in mel-mel, sim-sech, and mel-sim, as well as when comparing tests for cis-regulatory and total expression divergence, we created a second data set with the same number of allele-specific reads for each gene in all comparisons. This data set was constructed by (1) combining the equal numbers of mapped reads for each genotype used in the first data set to make a “mixed parental” sample for each comparison (e.g., reads from zhr and z30 were combined for the mel-mel comparison); (2) counting allele-specific reads (i.e., reads that mapped perfectly and uniquely to only one of the parental genomes) in all mixed parental and F1 hybrid samples; and (3) equalizing allele-specific read counts for each gene in all mixed parental and hybrid samples by identifying the sample with the fewest allele-specific reads for that gene and using hypergeometric sampling of the observed allele-specific read counts to randomly reduce the number of allele-specific reads considered in each of the other samples. Simulations confirmed that this down-sampling approach produced data sets with the same power to detect significant expression differences with Fisher’s exact tests as data sets originally collected at the smaller sample sizes (data not shown), and a similar method was recently used for allele-specific RNA-seq data from humans (Lappalainen et al. 2013).
Prior to analysis, genes with low confidence allele-assignments in the mel-mel, sim-sech, or mel-sim comparisons, defined as having >10% of the mapped reads from one parent aligned solely to the genome of the other parent, were excluded. Genes with less than 20 total allele-specific reads (allele 1 + allele 2 < 20) in any mixed parental or hybrid sample were also excluded from all comparisons; this threshold was based on prior theoretical and empirical work (Fontanillas et al. 2010; McManus et al. 2010). Finally, nine more genes were excluded because they showed significant differences in relative allelic expression between reciprocal hybrids using Fisher’s exact tests with a null hypothesis of equal expression and an FDR of 0.05. Such differences in relative allelic expression can result from parent-of-origin effects such as mitochondrial inheritance or genomic imprinting; imprinting seems rarely, if ever, responsible for this pattern of expression in Drosophila, however (Wittkopp et al. 2006, 2008; Coolon et al. 2012). After applying these filters, 4851 genes were deemed suitable for allele-specific analysis in all comparisons, with most of the genes excluded from this data set because they had too few allele-specific reads in the mel-mel comparison (Supplemental Table S3).
Mitochondrial genes were excluded from our allele-specific data set; however, allele assignments for F1 hybrid reads that mapped to mitochondrial genes were used as one metric to evaluate the reliability of our bioinformatic allele assignments. In the absence of sequencing and allele-assignment errors, all of these reads should map to the maternal allele. We found that 99.5% and 99.8% of reads from mitochondrial genes mapped to the maternal allele in F1 hybrids between D. simulans and D. sechellia and between D. melanogaster and D. simulans, respectively (Supplemental Table S4). Additional validation of allele assignments is described in the main text.
The exact data analyzed are provided in the Supplemental Material as Data set 2.
Evaluating cis- and trans-regulatory changes
Spearman’s ρ was used to measure cis-regulatory divergence on a genomic scale in the mel-mel, sim-sech, and mel-sim comparisons by assessing the correlation between allele 1 and allele 2 read counts from F1 hybrids. It was also used to repeat the analysis of overall expression divergence in each comparison using the mixed parental samples. To test for statistically significant differences in ρ between mel-mel and sim-sech and between sim-sech and mel-sim, we used bootstrapping. Specifically, we sampled with replacement 4851 gene-specific read counts from the observed 4851 genes 10,000 times using R, calculated ρ in each case, and determined the 2.5% and 97.5% percentiles. Significant differences were inferred when these 95% quantiles did not overlap.
Binomial exact tests with a null hypothesis of equal expression were used to identify significant expression differences between genotypes in the mixed parental pools as well as significant differences in relative allelic expression in the F1 hybrid samples that indicate differences in relative cis-regulatory activity. An FDR of 5% was used to determine statistical significance despite the fact that the P-values produced by binomial exact tests when the null hypothesis is true are not uniformly distributed as assumed by the FDR correction for multiple tests (Skelly et al. 2011). This is because our simulations showed that the violation of this assumption had no effect on the number of genes called significant in this study (Supplemental Material). To test for the unequal allelic abundance between mixed parental and F1 hybrid samples that would indicate trans-regulatory divergence, we performed Fisher’s exact tests with a null hypothesis of equal expression by comparing read counts from genotype 1 and genotype 2 in the mixed parental sample to allele 1 and allele 2 in the corresponding F1 hybrid samples. Each gene in each comparison was classified as “conserved,” “all cis,” “all trans,” “cis + trans,” “cis × trans,” “compensatory,” or “ambiguous” based on the results of the Fisher’s and binomial exact tests using the criteria described in Supplemental Table S5. These same classifications were used previously in Landry et al. (2005) and (McManus et al. 2010). Fisher’s exact tests were also used to test for significant differences in the proportion of genes with significant differences between mel-mel and sim-sech and between sim-sech and mel-sim.
For each gene in each comparison, the total expression difference was calculated as log2(genotype 1 read count/genotype 2 read count) from the mixed parental sample, and the cis-regulatory difference (“cis”) was calculated as log2(allele 1 read count/allele 2 read count) from each of the F1 hybrid samples. The trans-regulatory difference (“trans”) for each gene in each comparison was calculated as the difference between the total expression and cis-regulatory differences: log2(genotype 1 read count/genotype 2 read count) − log2(allele 1 read count/allele 2 read count). % cis (proportion of total regulatory divergence attributable to cis-regulatory changes) was then calculated as [|cis|/(|cis| + |trans|)] × 100.
Scripts and software used
All statistical analyses, down-sampling, and simulations were performed in R (version 2.12.2 or version 3.0.1, CRAN) (R Development Core Team 2008). This code includes the use of fisher.test for Fisher’s exact tests, binom.test for binomial exact tests, corr.test for Spearman’s ρ, sample to randomly down-sample mapped reads and simulate mapped read counts from a multivariate distribution, rhyper to randomly down-sample allele-specific read counts, rbinom to simulate allele-specific read counts. Custom perl and R scripts used in this work are included in Supplemental File 2.
Data access
The sequencing data from this study have been submitted to the NCBI Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra) under accession numbers SRA052065 and SRP023274.
Acknowledgments
We thank Sebastian Zoellner and the University of Michigan LSA High Performance Computing for computational resources, Hope Hollocher, Chung-I Wu, and the Bloomington and UCSD stock centers for Drosophila strains, Laura Sligar for pyrosequencing assistance, Bing Yang for computational assistance, Sebastian Zoellner, J.J. Emerson, and Manolis Dermitzakis for statistical advice, and Brian Metzger, Richard Lusk, Fabien Duveau, and Alisha John for comments on the manuscript. Funding for this work was provided by the National Institutes of Health (5F32GM089009-02 to J.D.C. and 5R01GM095296 to B.R.G.), the National Science Foundation (NSF 0903629 to K.R.S. and MCB-1021398 to P.J.W.), and the Alfred P. Sloan Research Foundation (fellowship to P.J.W.). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Institutes of Health, National Science Foundation, or Sloan Foundation.
Author contributions: P.J.W., J.D.C., C.J.M., and B.R.G. conceived and designed the project. J.D.C. performed the fly rearing, crosses, collection of biological samples, and RNA extractions. C.J.M. and J.D.C. prepared mRNA sequencing libraries. C.J.M. extracted gDNA and prepared sequencing libraries. C.J.M. and B.R.G. performed Illumina sequencing, base calling, and initial quality control filtering. C.J.M. assembled strain- and comparison-specific genomes, with input from J.D.C. and K.R.S. C.J.M. aligned RNA-seq reads to assembled genomes and identified orthologous genomic regions. K.R.S. wrote and implemented scripts for quantifying allele-specific expression, total expression, and sequence divergence, with input from J.D.C., C.J.M., and P.J.W. J.D.C. designed and performed pyrosequencing experiments. J.D.C. performed all statistical analyses and prepared all figures, with input from K.R.S. and P.J.W. The paper was written by P.J.W. and J.D.C. with input from all authors.
Footnotes
[Supplemental material is available for this article.]
Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.163014.113.
Freely available online through the Genome Research Open Access option.
References
- Ahmadian A, Lundeberg J, Nyrén P, Uhlén M, Ronaghi M 2000. Analysis of the p53 tumor suppressor gene by pyrosequencing. Biotechniques 28: 140–147 [DOI] [PubMed] [Google Scholar]
- Barrière A, Gordon KL, Ruvinsky I 2012. Coevolution within and between regulatory loci can preserve promoter function despite evolutionary rate acceleration. PLoS Genet 8: e1002961. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bedford T, Hartl DL 2009. Optimization of gene expression by natural selection. Proc Natl Acad Sci 106: 1133–1138 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Begun DJ, Aquadro CF 1993. African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365: 548–550 [DOI] [PubMed] [Google Scholar]
- Bell GDM, Kane NC, Rieseberg LH, Adams KL 2013. RNA-seq analysis of allele-specific expression, hybrid effects, and regulatory divergence in hybrids compared with their parents from natural populations. Genome Biol Evol 5: 1309–1323 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bradley RK, Roberts A, Smoot M, Juvekar S, Do J, Dewey C, Holmes I, Pachter L 2009. Fast statistical alignment. PLoS Comput Biol 5: e1000392. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brawand D, Soumillon M, Necsulea A, Julien P, Csárdi G, Harrigan P, Weier M, Liechti A, Aximu-Petri A, Kircher M, et al. 2011. The evolution of gene expression levels in mammalian organs. Nature 478: 343–348 [DOI] [PubMed] [Google Scholar]
- Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12: 635. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Campbell CD, Kirby A, Nemesh J, Daly MJ, Hirschhorn JN 2008. A survey of allelic imbalance in F1 mice. Genome Res 18: 555–563 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Carroll SB 2008. Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell 134: 25–36 [DOI] [PubMed] [Google Scholar]
- Coolon JD, Wittkopp PJ 2013. cis-and trans-regulation in Drosophila interspecific hybrids. In Polyploid and hybrid genomics (ed. Chen ZJ, Birchler JA), pp. 37–57. Wiley, Oxford, UK [DOI] [PMC free article] [PubMed] [Google Scholar]
- Coolon JD, Stevenson KR, McManus CJ, Graveley BR, Wittkopp PJ 2012. Genomic imprinting absent in Drosophila melanogaster adult females. Cell Rep 2: 69–75 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cowles CR, Hirschhorn JN, Altshuler D, Lander ES 2002. Detection of regulatory variation in mouse genes. Nat Genet 32: 432–437 [DOI] [PubMed] [Google Scholar]
- Cutter AD 2008. Divergence times in Caenorhabditis and Drosophila inferred from direct estimates of the neutral mutation rate. Mol Biol Evol 25: 778–786 [DOI] [PubMed] [Google Scholar]
- David JR, Capy P 1988. Genetic variation of Drosophila melanogaster natural populations. Trends Genet 4: 106–111 [DOI] [PubMed] [Google Scholar]
- de Meaux JJ, Pop AA, Mitchell-Olds TT 2006. cis-regulatory evolution of chalcone-synthase expression in the genus Arabidopsis. Genetics 174: 2181–2202 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dickinson WJ, Rowan RG, Brennan MD 1984. Regulatory gene evolution: adaptive differences in expression of alcohol dehydrogenase in Drosophila melanogaster and Drosophila simulans. Heredity (Edinb) 52: 215–225 [DOI] [PubMed] [Google Scholar]
- Emerson JJ, Hsieh L-C, Sung H-M, Wang T-Y, Huang C-J, Lu HH-S, Lu M-YJ, Wu S-H, Li W-H 2010. Natural selection on cis and trans regulation in yeasts. Genome Res 20: 826–836 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fay JC, Wittkopp PJ 2008. Evaluating the role of natural selection in the evolution of gene regulation. Heredity (Edinb) 100: 191–199 [DOI] [PubMed] [Google Scholar]
- Ferree PM, Barbash DA 2009. Species-specific heterochromatin prevents mitotic chromosome segregation to cause hybrid lethality in Drosophila. PLoS Biol 7: e1000234. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fontanillas P, Landry CR, Wittkopp PJ, Russ C, Gruber JD, Nusbaum C, Hartl DL 2010. Key considerations for measuring allelic expression on a genomic scale using high-throughput sequencing. Mol Ecol (Suppl 1) 19: 212–227 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Garrigan D, Kingan SB, Geneva AJ, Andolfatto P, Clark AG, Thornton KR, Presgraves DC 2012. Genome sequencing reveals complex speciation in the Drosophila simulans clade. Genome Res 22: 1499–1511 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Genissel A, McIntyre LM, Wayne ML, Nuzhdin SV 2007. cis and trans regulatory effects contribute to natural variation in transcriptome of Drosophila melanogaster. Mol Biol Evol 25: 101–110 [DOI] [PubMed] [Google Scholar]
- Gibson G, Riley-Berger R, Harshman L, Kopp A, Vacha S, Nuzhdin S, Wayne M 2004. Extensive sex-specific nonadditivity of gene expression in Drosophila melanogaster. Genetics 167: 1791–1799 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gilad Y, Oshlack A, Rifkin SA 2006a. Natural selection on gene expression. Trends Genet 22: 456–461 [DOI] [PubMed] [Google Scholar]
- Gilad Y, Oshlack A, Smyth GK, Speed TP, White KP 2006b. Expression profiling in primates reveals a rapid evolution of human transcription factors. Nature 440: 242–245 [DOI] [PubMed] [Google Scholar]
- Goncalves A, Leigh-Brown S, Thybert D, Stefflova K, Turro E, Flicek P, Brazma A, Odom DT, Marioni JC 2012. Extensive compensatory cis-trans regulation in the evolution of mouse gene expression. Genome Res 22: 2376–2384 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gordon KL, Ruvinsky I 2012. Tempo and mode in evolution of transcriptional regulation. PLoS Genet 8: e1002432. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Graveley BR, Brooks AN, Carlson JW, Duff MO, Landolin JM, Yang L, Artieri CG, van Baren MJ, Boley N, Booth BW, et al. 2011. The developmental transcriptome of Drosophila melanogaster. Nature 471: 473–479 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Graze RM, McIntyre LM, Main BJ, Wayne ML, Nuzhdin SV 2009. Regulatory divergence in Drosophila melanogaster and D. simulans, a genomewide analysis of allele-specific expression. Genetics 183: 547–561 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gruber JDJ, Long ADA 2009. cis-regulatory variation is typically polyallelic in Drosophila. Genetics 181: 661–670 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gruber JD, Vogel K, Kalay G, Wittkopp PJ 2012. Contrasting properties of gene-specific regulatory, coding, and copy number mutations in Saccharomyces cerevisiae: frequency, effects, and dominance. PLoS Genet 8: e1002497. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Haerty W, Singh RS 2006. Gene regulation divergence is a major contributor to the evolution of Dobzhansky-Muller incompatibilities between species of Drosophila. Mol Biol Evol 23: 1707–1714 [DOI] [PubMed] [Google Scholar]
- He F, Zhang X, Hu J, Turck F, Dong X, Goebel U, Borevitz J, de Meaux J 2012. Genome-wide analysis of cis-regulatory divergence between species in the Arabidopsis genus. Mol Biol Evol 29: 3385–3395 [DOI] [PubMed] [Google Scholar]
- Hollocher H, Ting CT, Wu ML, Wu CI 1997. Incipient speciation by sexual isolation in Drosophila melanogaster: extensive genetic divergence without reinforcement. Genetics 147: 1191–1201 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hollocher H, Agopian K, Waterbury J, O’Neill RW, Davis AW 2000. Characterization of defects in adult germline development and oogenesis of sterile and rescued female hybrids in crosses between Drosophila simulans and Drosophila melanogaster. J Exp Zool 288: 205–218 [DOI] [PubMed] [Google Scholar]
- Hsieh W-P, Chu T-M, Wolfinger RD, Gibson G 2003. Mixed-model reanalysis of primate data suggests tissue and species biases in oligonucleotide-based gene expression profiles. Genetics 165: 747–757 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hutter S, Saminadin-Peter SS, Stephan W, Parsch J 2008. Gene expression variation in African and European populations of Drosophila melanogaster. Genome Biol 9: R12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jones CD 2005. The genetics of adaptation in Drosophila sechellia. Genetica 123: 137–145 [DOI] [PubMed] [Google Scholar]
- Kalinka AT, Varga KM, Gerrard DT, Preibisch S, Corcoran DL, Jarrells J, Ohler U, Bergman CM, Tomancak P 2010. Gene expression divergence recapitulates the developmental hourglass model. Nature 468: 811–814 [DOI] [PubMed] [Google Scholar]
- Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES 2003. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423: 241–254 [DOI] [PubMed] [Google Scholar]
- Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D 2002. The human genome browser at UCSC. Genome Res 12: 996–1006 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Khaitovich P, Weiss G, Lachmann M, Hellmann I, Enard W, Muetzel B, Wirkner U, Ansorge W, Pääbo S 2004. A neutral model of transcriptome evolution. PLoS Biol 2: E132. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kuo D, Licon K, Bandyopadhyay S, Chuang R, Luo C, Catalana J, Ravasi T, Tan K, Ideker T 2010. Coevolution within a transcriptional network by compensatory trans and cis mutations. Genome Res 20: 1672–1678 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lachaise D, David JR, Lemeunier F, Tsacas L 1986. The reproductive relationships of Drosophila sechellia with D. mauritiana, D. simulans, and D. melanogaster from the Afrotropical region. Evolution 40: 262–271 [DOI] [PubMed] [Google Scholar]
- Lachaise D, Cariou ML, David JR, Lemeunier F, Tsacas L, Ashburner M 1988. Historical biogeography of the Drosophila melanogaster species subgroup. Evol Biol 22: 159–225 [Google Scholar]
- Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL 2005. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics 171: 1813–1822 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lappalainen T, Sammeth M, Friedländer MR, ‘t Hoen PAC, Monlong J, Rivas MA, Gonzàlez-Porta M, Kurbatova N, Griebel T, Ferreira PG, et al. 2013. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501: 506–511 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lemos B, Meiklejohn CD, Cáceres M, Hartl DL 2005. Rates of divergence in gene expression profiles of primates, mice, and flies: stabilizing selection and variability among functional categories. Evolution 59: 126–137 [PubMed] [Google Scholar]
- Lemos B, Araripe LO, Fontanillas P, Hartl DL 2008. Dominance and the evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl Acad Sci 105: 14471–14476 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Levy AA, Tirosh I, Reikhav S, Bloch Y 2013. Yeast hybrids and polyploids as models in evolutionary studies. In Polyploid and hybrid genomics (ed. Chen ZJ, Birchler JA), pp. 1–14. Wiley, Oxford, UK [Google Scholar]
- Maheshwari S, Barbash DA 2012. cis-by-trans regulatory divergence causes the asymmetric lethal effects of an ancestral hybrid incompatibility gene. PLoS Genet 8: e1002597. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res 20: 816–825 [DOI] [PMC free article] [PubMed] [Google Scholar]
- McManus CJ, May GE, Spealman P, Shteyman A 2014. Ribosome profiling reveals post-transcriptional buffering of divergent gene expression in yeast. Genome Res 24: 422–430 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meiklejohn CD, Parsch J, Ranz JM, Hartl DL 2003. Rapid evolution of male-biased gene expression in Drosophila. Proc Natl Acad Sci 100: 9894–9899 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meisel RP, Malone JH, Clark AG 2012. Faster-X evolution of gene expression in Drosophila. PLoS Genet 8: e1003013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Michalak P, Noor MAF 2004. Association of misexpression with sterility in hybrids of Drosophila simulans and D. mauritiana. J Mol Evol 59: 277–282 [DOI] [PubMed] [Google Scholar]
- Moehring AJ, Teeter KC, Noor MAF 2007. Genome-wide patterns of expression in Drosophila pure species and hybrid males. II. Examination of multiple-species hybridizations, platforms, and life cycle stages. Mol Biol Evol 24: 137–145 [DOI] [PubMed] [Google Scholar]
- Murata Y, Oda S, Mitani H 2012. Allelic expression changes in medaka (Oryzias latipes) hybrids between inbred strains derived from genetically distant populations. PLoS ONE 7: e36875. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Obbard DJ, Maclennan J, Kim K-W, Rambaut A, O’Grady PM, Jiggins FM 2012. Estimating divergence dates and substitution rates in the Drosophila phylogeny. Mol Biol Evol 29: 3459–3473 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orgogozo V, Stern DL 2009. How different are recently diverged species? More than 150 phenotypic differences have been reported for the D. melanogaster species subgroup. Fly (Austin) 3: 117. [DOI] [PubMed] [Google Scholar]
- Osada N, Kohn MH, Wu C-I 2006. Genomic inferences of the cis-regulatory nucleotide polymorphisms underlying gene expression differences between Drosophila melanogaster mating races. Mol Biol Evol 23: 1585–1591 [DOI] [PubMed] [Google Scholar]
- R Development Core Team. 2008. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org
- Ranz JM, Namgyal K, Gibson G, Hartl DL 2004. Anomalies in the expression profile of interspecific hybrids of Drosophila melanogaster and Drosophila simulans. Genome Res 14: 373–379 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rapaport F, Khanin R, Liang Y, Pirun M, Krek A, Zumbo P, Mason CE, Socci ND, Betel D 2013. Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data. Genome Biol 14: R95. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rifkin SA, Kim J, White KP 2003. Evolution of gene expression in the Drosophila melanogaster subgroup. Nat Genet 33: 138–144 [DOI] [PubMed] [Google Scholar]
- Sawamura K, Yamamoto MT 1993. Cytogenetical localization of Zygotic hybrid rescue (Zhr), a Drosophila melanogaster gene that rescues interspecific hybrids from embryonic lethality. Mol Gen Genet 239: 441–449 [DOI] [PubMed] [Google Scholar]
- Schaefke B, Emerson JJ, Wang T-Y, Lu M-YJ, Hsieh L-C, Li W-H 2013. Inheritance of gene expression level and selective constraints on trans- and cis-regulatory changes in yeast. Mol Biol Evol 30: 2121–2133 [DOI] [PubMed] [Google Scholar]
- Shen Y, Catchen J, Garcia T, Amores A, Beldorth I, Wagner J, Zhang Z, Postlethwait J, Warren W, Schartl M, et al. 2012. Identification of transcriptome SNPs between Xiphophorus lines and species for assessing allele specific gene expression within F1 interspecies hybrids. Comp Biochem Physiol C Toxicol Pharmacol 155: 102–108 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shi X, Ng DW-K, Zhang C, Comai L, Ye W, Chen ZJ 2012. cis- and trans-regulatory divergence between progenitor species determines gene-expression novelty in Arabidopsis allopolyploids. Nat Commun 3: 950. [DOI] [PubMed] [Google Scholar]
- Simpson GG. 1944. Tempo and mode in evolution. Columbia University Press, New York. [Google Scholar]
- Skelly DA, Johansson M, Madeoy J, Wakefield J, Akey JM 2011. A powerful and flexible statistical framework for testing hypotheses of allele-specific gene expression from RNA-seq data. Genome Res 21: 1728–1737 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stern DL, Orgogozo V 2008. The loci of evolution: how predictable is genetic evolution? Evolution 62: 2155–2177 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Suvorov A, Nolte V, Pandey RV, Franssen SU, Futschik A, Schlötterer C 2013. Intra-specific regulatory variation in Drosophila pseudoobscura. PLoS ONE 8: e83547. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tarazona S, García-Alcalde F, Dopazo J, Ferrer A, Conesa A 2011. Differential expression in RNA-seq: a matter of depth. Genome Res 21: 2213–2223 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tirosh I, Reikhav S, Levy AA, Barkai N 2009. A yeast hybrid provides insight into the evolution of gene expression regulation. Science 324: 659–662 [DOI] [PubMed] [Google Scholar]
- True JR, Haag ES 2001. Developmental system drift and flexibility in evolutionary trajectories. Evol Dev 3: 109–119 [DOI] [PubMed] [Google Scholar]
- Wang H-Y, Fu Y, McPeek MS, Lu X, Nuzhdin S, Xu A, Lu J, Wu M-L, Wu C-I 2008. Complex genetic interactions underlying expression differences between Drosophila races: analysis of chromosome substitutions. Proc Natl Acad Sci 105: 6362–6367 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wittkopp PJ, Haerum BK, Clark AG 2004. Evolutionary changes in cis and trans gene regulation. Nature 430: 85–88 [DOI] [PubMed] [Google Scholar]
- Wittkopp PJ, Haerum BK, Clark AG 2006. Parent-of-origin effects on mRNA expression in Drosophila melanogaster not caused by genomic imprinting. Genetics 173: 1817–1821 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wittkopp PJ, Haerum BK, Clark AG 2008. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet 40: 346–350 [DOI] [PubMed] [Google Scholar]
- Wray GA, Hahn MW, Abouheif E, Balhoff JP, Pizer M, Rockman MV, Romano LA 2003. The evolution of transcriptional regulation in eukaryotes. Mol Biol Evol 20: 1377–1419 [DOI] [PubMed] [Google Scholar]
- Wu CI, Hollocher H, Begun DJ, Aquadro CF, Xu Y, Wu ML 1995. Sexual isolation in Drosophila melanogaster: a possible case of incipient speciation. Proc Natl Acad Sci 92: 2519–2523 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xing Y, Ouyang Z, Kapur K, Scott MP, Wong WH 2007. Assessing the conservation of mammalian gene expression using high-density exon arrays. Mol Biol Evol 24: 1283–1285 [DOI] [PubMed] [Google Scholar]
- Zhang X, Borevitz JO 2009. Global analysis of allele-specific expression in Arabidopsis thaliana. Genetics 182: 943–954 [DOI] [PMC free article] [PubMed] [Google Scholar]