Abstract
Drosophila mauritiana is an Indian Ocean island endemic species that diverged from its two sister species, Drosophila simulans and Drosophila sechellia, approximately 240,000 years ago. Multiple forms of incomplete reproductive isolation have evolved among these species, including sexual, gametic, ecological, and intrinsic postzygotic barriers, with crosses among all three species conforming to Haldane’s rule: F1 hybrid males are sterile and F1 hybrid females are fertile. Extensive genetic resources and the fertility of hybrid females have made D. mauritiana, in particular, an important model for speciation genetics. Analyses between D. mauritiana and both of its siblings have shown that the X chromosome makes a disproportionate contribution to hybrid male sterility. But why the X plays a special role in the evolution of hybrid sterility in these, and other, species remains an unsolved problem. To complement functional genetic analyses, we have investigated the population genomics of D. mauritiana, giving special attention to differences between the X and the autosomes. We present a de novo genome assembly of D. mauritiana annotated with RNAseq data and a whole-genome analysis of polymorphism and divergence from ten individuals. Our analyses show that, relative to the autosomes, the X chromosome has reduced nucleotide diversity but elevated nucleotide divergence; an excess of recurrent adaptive evolution at its protein-coding genes; an excess of recent, strong selective sweeps; and a large excess of satellite DNA. Interestingly, one of two centimorgan-scale selective sweeps on the D. mauritiana X chromosome spans a region containing two sex-ratio meiotic drive elements and a high concentration of satellite DNA. Furthermore, genes with roles in reproduction and chromosome biology are enriched among genes that have histories of recurrent adaptive protein evolution. Together, these genome-wide analyses suggest that genetic conflict and frequent positive natural selection on the X chromosome have shaped the molecular evolutionary history of D. mauritiana, refining our understanding of the possible causes of the large X-effect in speciation.
Keywords: adaptation, Drosophila, genome, satellite DNA, selective sweep, X chromosome
Introduction
During the past 30 years, the three species of the Drosophila simulans clade—D. simulans, Drosophila sechellia, and Drosophila mauritiana—have emerged as important models in evolutionary genetics, owing mostly to their phylogenetic proximity to Drosophila melanogaster. The common ancestor of the D. simulans clade species originated on Madagascar approximately 3 Ma, splitting from that of D. melanogaster, and later gave rise to D. sechellia on the Seychelles archipelago and to D. mauritiana on Mauritius nearly simultaneously, approximately 240 ka (Lachaise et al. 1986; Ballard 2000a; Kliman et al. 2000; Dean and Ballard 2004; Kopp et al. 2006; McDermott and Kliman 2008; Garrigan et al. 2012). Drosophila mauritiana is an island endemic species, and D. simulans, despite having a recently established cosmopolitan distribution, has never been collected on Mauritius (David et al. 1989; Legrand et al. 2011). Nevertheless, although D. mauritiana and D. simulans have had largely allopatric histories, there is evidence for limited interisland migration and natural hybridization. Multiple mitochondrial haplotypes have introgressed from D. simulans into D. mauritiana (Ballard 2000b; Nunes et al. 2010), and evidence for interspecific introgression, probably from D. simulans into D. mauritiana, is scattered over all three major chromosomes (Garrigan et al. 2012). However, genomic introgressions between the D. simulans and D. mauritiana are underrepresented on the X chromosome (Garrigan et al. 2012), suggesting that X-linked chromosomal segments are less exchangeable between species than those on the autosomes.
The paucity of interspecific introgressions on the X chromosome may be attributable to selection against hybrid incompatibilities that cause intrinsic postzygotic isolation. Crosses between D. mauritiana and its two sister species follow Haldane’s rule: all crosses yield sterile F1 hybrid (XY) males but fertile F1 hybrid (XX) females. Genetic analyses involving D. mauritiana show that the X chromosome has a disproportionately large effect on hybrid male sterility (HMS; Coyne 1985, 1992; Coyne and Charlesworth 1989). This so-called large X-effect has two proximate causes: recessive HMS factors on the X chromosome are fully expressed in hemizygous hybrid males (Turelli and Orr 1995, 2000); and between D. mauritiana and its siblings species, the density of HMS factors on the X chromosome is 2–4 times greater than on the autosomes (True, Weir, et al. 1996; Tao et al. 2003; Masly and Presgraves 2007). The large X-effect holds across a range of taxa, including other Drosophila (Moehring et al. 2006), mammals (Good et al. 2008), fish (Kitano et al. 2009), butterflies (Naisbit et al. 2002), and birds (Saetre et al. 2001). Determining the evolutionary cause(s) of the large X-effect is thus one of the major unsolved problems of speciation genetics (Charlesworth et al. 1987; Coyne and Orr 1989; Coyne 1992; Presgraves 2008b). There are three leading candidate explanations. First, X–autosome differences in transcriptional regulation exist in the male germline that may be especially prone to disruption by hybrid incompatibilities (Lifschytz and Lindsley 1971; Wu and Davis 1993; Hense et al. 2007; Meiklejohn et al. 2011). Second, the X chromosome is susceptible to the invasion and spread of selfish genetic elements that may, as incidental byproducts, contribute to sterility in species hybrids (Frank 1991; Hurst and Pomiankowski 1991; Tao et al. 2001; Presgraves 2008a, 2008b). Third, the X chromosome may experience a faster rate of adaptive evolution than the autosomes which could, in turn, contribute to the evolution of more X-linked hybrid incompatibilities (Charlesworth et al. 1987; Coyne and Orr 1989; Presgraves 2008b; Hvilsom et al. 2012; Llopart 2012; Meisel et al. 2012; Meisel and Connallon 2013; Kousathanas et al. 2014). These explanations are not mutually exclusive and, while evidence for each exists (Phadnis and Orr 2009; Campbell et al. 2013), their relative importance is still unknown.
To complement functional genetic analyses of speciation and the large X-effect, we have performed population genomics analyses of polymorphism and divergence for ten genomes sampled from D. mauritiana. Our analyses show that the D. mauritiana X chromosome bears more signatures of recent, strong, and recurrent positive natural selection than the autosomes. Genes and gene regions with signatures of positive selection are enriched for functions in gametogenesis, chromosome biology, and satellite DNA, implying that genetic conflict over transmission contributes to molecular evolution in D. mauritiana.
Materials and Methods
Fly Inbred Lines, Library Preparation, and Sequencing
We sampled nine wild-type isofemale lines of D. mauritiana (collected in 2006 and kindly donated by Maria Ramos–Womack) and the inbred laboratory strain, mau12 (14021-0241.60). Wild-type lines were subjected to single-pair sibling mating for a minimum of nine generations. Genomic DNA extraction and library preparation were performed for pooled females as previously described (Garrigan et al. 2012). However, additional sequences for the mau12 line were collected from a large-insert, paired-end Nextera library (∼3,000 bp) and a TrueSeq paired-end RNA library (∼260 bp). For the large-insert library, we pooled females and purified genomic DNA from whole flies using the DNeasy Blood and Tissue kit (Qiagen, Valencia, CA). DNA concentration was determined with the Qubit Flourometer (Life Technologies, Grand Island, NY) and quality was assessed using the Agilent Bioanalyzer (Agilent, Santa Clara, CA). Mate pair libraries were generated using the Nextera Mate Pair protocol (Illumina, San Diego, CA). Briefly, 4 μg of genomic DNA was fragmented per manufacturer’s protocol followed by strand displacement, agarose gel size selection of 5-kb target size, and circularization. The 5-kb fragments were sheared using a Covaris S2 (Covaris Inc., Woburn, MA) to 300- to 1,000-bp size fragments for end repair, A-tailing, and indexed adaptor ligation. The amplified libraries were purified by AMPure (Beckman Coulter, Brea, CA) purification and hybridized to an Illumina paired-end flow cell for cluster amplification using the cBot (Illumina) at a concentration of 8 pmol per lane.
For RNAseq, males and females were reared on standard cornmeal food at room temperature. 50% of males and 50% of females were allowed to freely mate for 3 days, the rest were maintained as virgins. Adults were aged 4–5 days then flash frozen in liquid nitrogen and stored at −80 °C. Total RNA was isolated from whole tissue using the RNeasy Plus Kit (Qiagen) per manufacturer’s recommendations. RNA concentration was determined with the NanoDrop 1000 spectrophotometer (NanoDrop, Wilmington, DE) and RNA quality was assessed with the Agilent Bioanalyzer (Agilent). The TruSeq RNA Sample Preparation Kit V2 (Illumina) was used for next generation sequencing library construction per manufacturer’s protocols. Briefly, polyA mRNA was purified from approximately 100 ng total RNA with oligo-dT magnetic beads and fragmented. First-strand cDNA synthesis was performed with random hexamer priming followed by second-strand cDNA synthesis. End repair and 3′ adenylation was performed on the double-stranded cDNA. Illumina-indexed adaptors were ligated to both ends of the cDNA, purified by gel electrophoresis, and amplified with polymerase chain reaction primers specific to the adaptor sequences to generate amplicons of approximately 200–500 bp in size. The amplified libraries were purified by AMPure (Beckman Coulter) purification and hybridized to an Illumina paired end flow cell for cluster amplification using the cBot (Illumina) at a concentration of 8 pmol per lane.
Each of the three additional mau12 libraries (large-insert, male RNAseq, and female RNAseq) were bar-coded and multiplexed on the equivalent of one-half of a single flow cell. Paired-end (2×100 bp) sequencing was performed per manufacturer’s recommendations. The raw data were demultiplexed using configurebcl2fastq.pl version 1.8.3. Low complexity reads and vector contamination were removed using sequence cleaner (seqclean) and the National Center for Biotechnology Information (NCBI) univec database, respectively. The FASTX toolkit (fastq_quality_trimmer) was applied to remove bases with Phred quality scores below Q = 13 from the end of each read.
De Novo Genome Assembly
Both the short- and long-insert libraries were used for de novo assembly of the mau12 genome. Raw sequence reads were first filtered by discarding reads with 1) more than 25% of bases called as ambiguous, 2) more than 40% of bases masked as low sequence complexity, and 3) containing either vector or adapter sequences. The ABYSS genome assembler software was then used to assemble reads from both libraries simultaneously (Simpson et al. 2009) with a k-mer size of 64 bp. Statistics describing the mau12 de novo genome assembly can be found in supplementary table S1, Supplementary Material online.
The resulting contigs were ordered first by identifying contigs containing the flanking sequence from the P [lac-w+]-element insertion sites (Araripe et al. 2006). A total of 69 contigs could be ordered using the P-element insertion sites, which represents nearly 39 Mb of sequence. Additional contigs were ordered by assuming synteny with D. simulans. The NUCMER program (Kurtz et al. 2004) was used to align all mau12 contigs to the reference sequence for the D. simulans w501 strain (Hu et al. 2013). Contigs were then ordered by placing uniquely mapping contigs in the same order as the w501 genomic sequence with the highest sequence identity. There was no resulting conflict between placing contigs in this manner with the order inferred from the P-element insertion sites (data not shown). Finally, syntenic pseudochromosomes were constructed by inferring regions of overlap between adjacent contigs and filling gaps with IUPAC ambiguity characters, such that the total chromosome length of the mau12 draft is the same as the w501 chromosomes (supplementary table S2, Supplementary Material online). Finally, unincorporated contigs were blasted to the NCBI database. Unincorporated contigs with bacterial or viral homology were excluded, whereas those with homology to insects or those with no known homology were retained for further analysis.
Short Read Alignment
The BWA-MEM alignment algorithm (Li 2013) was used to map sequence reads from each of the ten lines of D. mauritiana and two lines of D. simulans, the MD063 strain from Madagascar (Garrigan et al. 2012) and the w501 strain (Hu et al. 2013). Additionally, short reads from the D. melanogaster reference strain y; cn bw; sp were aligned to the mau12 de novo draft genome (see supplementary table S3, Supplementary Material online, for SRA accession numbers). Reads that did not properly pair in mapping or that had a Phred-scaled mapping score less than 30 were discarded. Additionally, to increase the accuracy of variant calling, all duplicate sequences were removed and reads were realigned around flanking indels using the SAMTOOLS software (Li et al. 2009). The total number of reads mapping to the mau12 reference genome per library, after quality filtering, is provided in supplementary table S3, Supplementary Material online. Finally, the proportion of the mau12 reference genome covered by the short read assembly and the median read depth across the major chromosome arms are given in supplementary table S4, Supplementary Material online.
Transcriptome Assembly and Annotation of Genome Features
To characterize the transcriptome of mau12, we separately mapped reads from paired-end RNA libraries of males and females to the mau12 syntenic pseudochromosomes using BOWTIE2 (Langmead and Salzberg 2012). Transcripts were assembled using CUFFLINKS and the male and female GTF files were merged using CUFFMERGE (Trapnell et al. 2010). The largest open reading frame (ORF) was determined for each transcript using a custom Perl script, assuming canonical start and stop codons. A genome annotation file in GFF3 format (http://www.sequenceontology.org/gff3.shtml, last accessed September 12, 2014) was generated for the following fields using the GTF and ORF data for each transcript: exon, intron, coding sequences (CDS), 5′-UTR, and 3′-UTR.
Genome Scans of Polymorphism and Divergence
Divergence of the mau12 assembly with the reference sequences of both D. simulans w501 and D. melanogaster was measured as a Hamming distance in 10-kb sliding windows along whole-chromosome alignments created with the MAUVE software (Darling et al. 2010). Only windows with greater than 10% of the total number of sites being aligned were considered for analysis. Polymorphism among the ten lines of D. mauritiana was summarized by measuring unbiased nucleotide diversity (π). Estimates of π in each 10-kb window were corrected for unequal sample sizes by multiplying by a factor of n/(n − 1) (Nei 1987). Similarly, in each window, Tajima’s D statistic was calculated as a measure of the site frequency spectrum (SFS) (Tajima 1989). Because the ten lines of D. mauritiana were inbred for at least nine generations, we also measured linkage disequilibrium in 10-kb windows using the unweighted average pairwise r2 statistic ZnS (Kelly 1997). Singleton polymorphic sites were excluded from the calculation of ZnS. All analyses of polymorphism, the SFS, and linkage disequilibrium were performed using the POPBAM software with default quality filtering settings (Garrigan 2013). Finally, the calculation of π, Tajima’s D, and ZnS assumed haploid genotypes, thereby ignoring any residual heterozygosity that remains after inbreeding.
Polymorphism and Divergence Analysis of Sequence Classes
We analyzed polymorphism and divergence for the following sequence classes: CDS, 5′-UTR, 3′-UTR, intron, exon, bases 8–30 of introns shorter than 70 bp (the median intron length in the mau12 genome). We analyzed the longest transcript from each of the annotated genes, a total of 11,348 transcripts. From the BAM file described above we generated a VCF file (Danecek et al. 2011) using SAMTOOLS mpileup and BCFTOOLS view, removing reads with mapping quality less than 20 and without performing indel calling (Li et al. 2009). For estimates of π, we used VCFTOOLS (Danecek et al. 2011) to generate estimates of average pairwise distance using the “window-pi” function. The VCF file generated to calculate π contained only the ten D. mauritiana samples (the “-s” option in BCFTOOLS view). To calculate π for CDS and UTR sequences that spanned multiple fragments, we concatenated the VCF files and then renumbered the position field starting with position “1” through the total length of all concatenated fragments. To calculate all other summary statistics, we generated FASTA sequences from the VCF file using vcfutils.pl vcf2fq (part of the SAMTOOLS package) and the “seq” and “cutN” utilities of the “sequence tool kit” (SEQTK v1.0-r31, https://github.com/lh3/seqtk, last accessed September 12, 2014). For CDS and UTR sequences that spanned multiple fragments, we concatenated FASTA sequences together into a single file and reverse complemented CDS as necessary. The average Jukes–Cantor-corrected pairwise distance (Nei and Kumar 2000) between D. mauritiana and D. melanogaster was calculated from these FASTA alignments using a custom Perl script. In addition, we calculated average dN and dS (Yang and Nielsen 2000) and pN and pS using a custom Perl script calling CODEML (Yang 2007). Similar results were obtained using YN00, which is an approximate method for calculating dN and dS (Yang and Nielsen 2000).
McDonald–Kreitman (MK) tests were used to test the neutral model of protein evolution (McDonald and Kreitman 1991) and estimate the proportion of adaptive amino acid substitutions (Smith and Eyre-Walker 2002). Two different variants of the MK test were performed: 1) A pooled, unpolarized MK test between D. mauritiana and D. melanogaster, and 2) a lineage-specific MK test to identify D. mauritiana-specific mutations where D. simulans w501 was used as the near-outgroup and D. melanogaster as a far-outgroup. Tests were performed with custom Perl scripts using the method of Begun et al. (2007). The proportion of divergent sites driven by positive selection (α) was calculated from counts of polymorphic and fixed synonymous and nonsynonymous changes (Fay et al. 2002), after excluding singletons. A polarized and unpolarized statistic was estimated as described for MK tests. A separate measure, α*, was calculated using the short intron sequence class as our neutral standard. Again, we counted the total polymorphic sites and fixed differences, excluding singletons. Finally, 95% confidence intervals around α estimates were calculated by performing 1,000 bootstrap resampling replicates.
Tests for Recent Selective Sweeps
Recent selective sweeps are expected to generate aberrant SFS surrounding the targets of selection. We used the parametric test described by Nielsen and colleagues, implemented in the program SweepFinder (Nielsen et al. 2005), to suggest candidate regions experiencing recent selective sweeps by identifying genomic windows with an unusual SFS. We first used POPBAM to identify all variable sites in the alignment described above. We then focused our analyses on the D. mauritiana lineage by filtering for sites where a derived state was present in at least one D. mauritiana sample, but absent in all samples from D. melanogaster and D. simulans. We calculated unfolded frequency spectra for 10-kb nonoverlapping windows and used those windows with at least 5,000 aligned sites as input for SweepFinder, which determines a maximum composite likelihood ratio for each window by contrasting the likelihood of a complete selective sweep at the location to the null hypothesis of no sweep using the observed SFS in a window and the chromosome-wide SFS. We calculated P values of the maximum composite likelihood ratios reported by SweepFinder using 9 degrees of freedom. Finally, contiguous tracts of windows were grouped into putative sweep regions using a simple hidden Markov model with two possible states, “sweep” and “nonsweep.” The emission probabilities are derived directly from the SweepFinder P values and the transition probabilities are derived from the product of the marginal frequencies of significant and nonsignificant SweepFinder windows.
Analysis of Repetitive Sequence
The repeat content of each chromosome arm was analyzed with RepeatMasker (Smit et al. 2010). To analyze repeat-enriched regions of each chromosome, a dot plot was generated in Geneious (Kearse et al. 2012) and repeat blocks were identified by eye. Chromosome coordinates were estimated as the proportional distance along the dot plot axis. Our repeat analysis is based on the proportion of masked sequence, rather than the absolute length of masked sequence, and is robust to error in estimating these coordinates. In addition, we separately characterized the D. simulans 359-bp satellite, which was absent from the Repbase database used by RepeatMasker (Jurka et al. 2005). The 359-bp satellite sequences were ascertained by a BLAST search using the D. melanogaster canonical 359 satellite sequence (Hsieh and Brutlag 1979) against the D. simulans reference assembly (Hu et al. 2013); the sequence defined by coordinates X:1,218,055–1,218,405 was used as the query in the BLAST against our mau12 reference sequence.
Results
De Novo Genome Assembly and Annotation
Our de novo assembly of the reference D. mauritiana white (w; here after mau12) has a total length of more than 124 Mb. The longest of our 2,759 scaffolds is nearly 3 Mb (supplementary table S1, Supplementary Material online). More than 50% of the assembly is contained in scaffolds ≥542 kb. We assembled chromosome-level sequences for the five major arms, using between 59 and 151 scaffolds per arm. Each chromosome-level draft sequence is ≥80% of the length of the corresponding D. melanogaster reference sequence. For the ten D. mauritiana lines, we obtained 72- to 130-fold sequence coverage spanning 98.8–99.5% of the de novo mau12 assembly (supplementary table S4, Supplementary Material online). To annotate the de novo assembly, we generated more than 38 (28) million pairs of 100-bp RNAseq reads from the whole bodies of mau12 females (males), after quality filtering. In total, 86% (80%) of reads from females (males) were mapped to the mau12 pseudochromosome reference sequences using BOWTIE2. Despite the greater number of reads generated from females, CUFFLINKS predicts 15,890 transcripts (from 12,071 genes) in males, compared with 12,488 transcripts (from 9,762 genes) in females. Merging the transcript annotations of the sexes yields a total of 16,261 predicted transcripts from 11,356 genes. Note that the number of genes in the merged annotation is lower than the male annotation because overlapping or closely neighboring transcripts may be combined into a single gene model.
Genome Diversity and Divergence
We surveyed nucleotide polymorphism and divergence from D. melanogaster in 10-kb windows for ten D. mauritiana genomes (fig. 1). Compared with the autosomes, the X chromosome has significantly elevated divergence (dX/dA = 1.100), but reduced polymorphism (πX/πA = 0.649; table 1, Mann–Whitney U tests, PMWU < 2.2 × 10−16). These contrasting X/A ratios for polymorphism and divergence cannot be explained by a standard neutral model, assuming no selection, an equal breeding sex ratio, and constant effective population size (Ne), such that Ne,X/Ne,A = ¾. Furthermore, the observed πX/πA ratio of nucleotide diversity is difficult to reconcile with an extreme founder event (Pool and Nielsen 2008), for which there is no evidence in the recent history of D. mauritiana. The X/A ratios of polymorphism and divergence are however consistent with a model involving selection. First, under a model of nearly neutral evolution, slightly deleterious substitutions can accumulate faster on the X than on the autosomes. Assuming new mutations have scaled selection coefficients of Nes −3 (or weaker), and assuming Ne,X/Ne,A 0.65 (as observed), then the X chromosome is expected to experience a higher rate of substitution than the autosomes, regardless of dominance (Vicoso and Charlesworth 2009). Second, under a model of adaptive evolution, beneficial mutations can accumulate on the X more quickly than on the autosomes (Charlesworth et al. 1987; Vicoso and Charlesworth 2009). Assuming new beneficial mutations tend to be recessive, then the X/A ratios of polymorphism and divergence could be consistent with a model of recurrent hitchhiking in which selective sweeps on the X chromosome are more frequent, stronger, and/or more often involve new beneficial mutations rather than standing genetic variation (Begun and Whitley 2000; Orr and Betancourt 2001; Betancourt et al. 2004).
Fig. 1.—
Scans of population genetics statistics across the five major chromosomal arms of the D. mauritiana genome. The four statistics were calculated in nonoverlapping 10-kb windows. Each column of plots represents scans from a single chromosome arm. The top row of plots (blue points) shows scans of nucleotide diversity (π). The second row of plots (red points) shows the distribution of a likelihood ratio test statistic that measures the deviation of the local allele frequency spectrum (LR). The third row of plots (grey points) shows the measure of linkage disequilibrium (ZnS) across all arms. Finally, the bottom row of plots (green points) shows the scan of average sequence divergence between the ten D. mauritiana samples and a single D. melanogaster genome sequence.
Table 1.
Nucleotide Diversity in Drosophila mauritiana (π), Divergence between D. mauritiana and Drosophila melanogaster (d, Jukes–Cantor Corrected Average Pairwise Divergence), and Tajima’s D (TD) for Different Functional Classes of Site
Sequence Class | πX | πA | πX/πA | dX | dA | dX/dA | TDX | TDA | TDX/TDA |
---|---|---|---|---|---|---|---|---|---|
Genome | 0.008 | 0.013 | 0.649* | 0.045 | 0.041 | 1.100* | −0.53 | −0.33 | 1.58* |
SI8-30 | 0.012 | 0.022 | 0.567* | 0.075 | 0.076 | 0.979* | −0.51 | −0.55 | 0.93 |
Synonymous | 0.011 | 0.015 | 0.757 | 0.109 | 0.102 | 1.065* | −0.90 | −0.97 | 0.93* |
Introns | 0.007 | 0.011 | 0.653* | 0.033 | 0.032 | 1.040* | −0.78 | −0.81 | 0.96* |
UTR | 0.005 | 0.008 | 0.603* | 0.034 | 0.034 | 1.026* | −0.95 | −1.02 | 0.93* |
Nonsynonymous | 0.001 | 0.002 | 0.868* | 0.014 | 0.011 | 1.260* | −0.93 | −1.07 | 0.87* |
Note.—The values given in the table are averages over all autosomal sites and all X-linked sites. The ratio of the X-linked value to that of the autosomal value is also provided. Asterisk indicates a significant departure from expectations based on selective neutrality and equal breeding sex ratio (Mann–Whitney U test, P < 0.05)
Heterogeneity in levels of polymorphism and divergence also exists within chromosome arms, in part due to local selective sweeps (see below), and in part due to the qualitatively different patterns of diversity and divergence in centromere- and telomere-proximal regions. Euchromatic regions that are centromere- and telomere-proximal have reduced rates of crossing over (True, Mercer, et al. 1996), reduced polymorphism, and elevated divergence (fig. 1; Begun et al. 2007; Langley et al. 2012; Mackay et al. 2012). These patterns imply that recurrent selective sweeps and/or background selection, and their consequent Hill–Robertson effects, impact the genomic distribution of polymorphism and divergence in the D. mauritiana genome, especially in regions with reduced rates of crossing over (Hill and Robertson 1966; Felsenstein 1974; Maynard-Smith and Haigh 1974; Begun and Aquadro 1992; Charlesworth et al. 1993; Hey and Kliman 2002; Langley et al. 2012; Mackay et al. 2012).
Selective Constraint in the Genome
To characterize how natural selection has shaped sequence evolution in the D. mauritiana genome, we compared divergence and polymorphism across five sequence classes (table 1 and supplementary fig. S1, Supplementary Material online). Synonymous sites show the highest divergence, followed by bases 8–30 of short introns (i.e., introns shorter than the median intron length of 70 bp, hereafter SI8-30), suggesting that these are the two least constrained sequence classes. Within D. mauritiana, SI8–30 sites have the highest per-site estimate of π and the least negative Tajima’s D, a summary of the SFS (table 1). These findings suggest that SI8–30 sites have experienced the least selective constraint within the recent history of the species (Parsch et al. 2010). Divergence, polymorphism, and Tajima’s D at nonsynonymous sites are lower than those at synonymous and SI8–30 sites (table 1), consistent with strong functional constraints on protein-coding changes. Divergence, polymorphism, and Tajima’s D at introns and UTRs are intermediate (table 1), consistent with weaker functional constraints at these sites (Andolfatto 2005; Begun et al. 2007; Langley et al. 2012; Mackay et al. 2012).
Different Patterns of Molecular Evolution on the X Chromosome and Autosomes
Sequence divergence between D. mauritiana and D. melanogaster is higher on the X chromosome than on the autosomes for all sequence classes (PMWU ≤ 0.023) but one: SI8–30 site divergence on the X is significantly lower than on the autosomes (PMWU = 0.0035; table 1). Assuming SI8–30 sites are the least constrained class, the slower X evolution observed at these sites suggests either that these positions are weakly constrained, with more effective purifying selection on the X, or that the rate of mutation is lower on the X chromosome. Tajima’s D at SI8–30 sites is comparable between the X and autosomes (−0.51 and −0.55, respectively, PMWU = 0.0783), giving no indication of a chromosome-wide difference in the efficacy of purifying selection. Instead, these findings are consistent with a male germline mutation rate that is 1.14-fold higher than that in the female germline (CI95 = 0.91-1.42) (Miyata et al. 1987), as seen in some previous population and experimental genetic analyses in other Drosophila species (Bachtrog 2008; Keightley et al. 2009).
Several lines of evidence indicate more frequent positive selection on the X chromosome relative to the autosomes. First, assuming that SI8–30 sites are indeed the least constrained class (Parsch et al. 2010), then the faster X evolution observed for all other sequence classes implies more frequent positive selection at these X-linked sites. Second, for all noncoding sequence classes, polymorphism on the X chromosome is significantly reduced relative to the autosomes (PMWU < 2 × 10−16), even after accounting for the chromosome difference in effective size (Nx/NA = ¾). As the SFS is comparable between X and autosomes for each sequence class (table 1; the same holds for preferred, unpreferred, and neutral synonymous polymorphisms; not shown), the reduced sequence polymorphism on the X is not readily attributable to more effective purifying selection. A strong alternative is that the reduced diversity on the X reflects more frequent hitchhiking events caused by the fixation of new beneficial mutations (Begun and Whitley 2000). Taken together, the data support a model in which the X has a slightly lower mutation rate but nevertheless experiences a higher substitution rate at most functional sequence classes owing to a greater efficacy of positive natural selection (Charlesworth et al. 1987; Kirkpatrick and Hall 2004).
Adaptive Evolution of Protein-Coding Genes
We used MK tests (McDonald and Kreitman 1991) to evaluate the neutral mutation–drift null hypothesis for more than 11,000 protein-coding genes. We performed unpolarized MK tests, with D. melanogaster as the outgroup species, and polarized MK tests, to characterize lineage-specific evolution in D. mauritiana, with D. melanogaster as the far-outgroup and D. simulans as an additional near-outgroup. The polarized MK tests thus involve polymorphisms and fixed differences accumulated only in the D. mauritiana lineage. At the P < 0.05 level, 91 (417) genes have excess nonsynonymous substitutions for the polarized (unpolarized) MK tests, consistent with recurrent positive selection, whereas 26 (267) genes have excess nonsynonymous polymorphisms, consistent with segregating deleterious alleles or some forms of balancing selection. At the P < 0.01 level, 23 (144) genes have excess nonsynonymous substitutions for the polarized (unpolarized) MK tests, and 5 (89) have excess nonsynonymous polymorphisms. For the unpolarized MK tests, the X chromosome has a significant 1.6-fold excess of positively selected genes relative to autosomes at the P < 0.01 level using Fisher’s exact test (PFET = 0.020), but not at the P < 0.05 level (PFET = 0.536). For the polarized tests, the X shows a 5.1- and 2.2-fold excess of positively selected genes on the X chromosome at the P < 0.01 and 0.05 levels (PFET ≤ 0.001; supplementary table S5, Supplementary Material online). The X and autosomes do not differ in the frequency of genes with excess nonsynonymous polymorphism (PFET ≥ 0.550). For the unpolarized MK data, the estimate of the proportion of amino acid substitutions fixed by positive selection (α) is calculated using synonymous mutations as a neutral reference class, following Fay et al. (2002) and excluding singletons. The mean value of α for X-linked genes (0.240, CI95 = 0.199–0.282) is less than that for autosomal genes (0.359, CI95 = 0.344–0.373; table 2; but see below). For the polarized MK data, the opposite appears true: α for X-linked genes (0.362, CI95 = 0.305–0.418) is significantly greater than that for autosomal genes (0.211, CI95 = 0.181–0.240; table 3). These analyses show that X-linked genes are more likely to have individually significant histories of recurrent positive selection relative to autosomal genes and, at least for the polarized data, a greater fraction of nonsynonymous substitutions is beneficial on the X.
Table 2.
Mean and 95% Confidence Intervals for the Estimated Proportion of Nonsynonymous Fixations on the X Chromosome (αX) and Autosomes (αA) That Are Driven by Positive Selection, Assuming Synonymous Sites Are Selectively Neutral
αX |
αA |
|||
---|---|---|---|---|
Mean | CI95 | Mean | CI95 | |
Polarized | 0.362 | 0.305–0.418 | 0.211 | 0.181–0.240 |
Unpolarized | 0.240 | 0.199–0.282 | 0.359 | 0.344–0.373 |
Table 3.
Mean and 95% Confidence Intervals for the Estimated Proportion of Fixations on the X Chromosome (α*X) and Autosomes (α*A) Driven by Positive Selection for a Variety of Sequence Classes, Assuming Short Introns Are Selectively Neutral
Sequence |
α*X |
α*A |
||
---|---|---|---|---|
Class | Mean | CI95 | Mean | CI95 |
Synonymous | 0.372 | 0.319 to 0.418 | 0.277 | 0.258 to 0.295 |
Nonsynonymous | 0.583 | 0.541 to 0.572 | 0.559 | 0.546 to 0.370 |
Intron | −0.132 | −0.203 to −0.063 | −0.055 | −0.080 to −0.030 |
UTR | 0.236 | 0.187 to 0.282 | 0.241 | 0.221 to 0.259 |
We tested for Gene Ontology (GO) category enrichment among genes having histories of recurrent adaptive protein evolution (Eden et al. 2009). Over the long D. mauritiana–D. melanogaster history of divergence (unpolarized MK data), the top genes function in reproduction, the nuclear pore complex (including the hybrid incompatibility gene, Nup160; Tang and Presgraves 2009), and the regulation of satellite DNA (supplementary table S6, Supplementary Material online). While not detected by formal GO analyses, we noted that 6/14 genes with roles in kinetochore function show evidence of recurrent positive selection, consistent with models of genetic conflict over centromeric drive (Henikoff et al. 2001; Malik and Henikoff 2001; Przewloka et al. 2007). During the recent history of the D. mauritiana lineage (polarized MK data), the top genes function in the female germline (e.g., female germ-line cyst formation), mRNA catabolism (e.g., nuclear-transcribed mRNA catabolic process, nonsense-mediated decay), chromosome biology, and the regulation of satellite DNAs (e.g., dodeca-satellite-binding protein and topoisomerase 2; supplementary table S7, Supplementary Material online).
We also estimated the proportion of substitutions attributable to positive selection using SI8–30 sites as a putatively neutral reference class (hereafter, α*), instead of synonymous sites (α), for the unpolarized data (table 3). On the X chromosome, nonsynonymous (0.583, CI95 = 0.541–0.619), synonymous (0.372, CI95 = 0.319–0.418), and UTR substitutions (0.236, CI95 = 0.187–0.282) all show evidence for substantial adaptive evolution, whereas intronic sequences do not (−0.132, CI95 = −0.203 to −0.063). Similarly, on the autosomes, nonsynonymous substitutions have the highest α* (0.559, CI95 = 0.546–0.572), followed by synonymous (0.277, CI95 = 0.258–0.295) and UTR (0.241, CI95 = 0.221–0.259) substitutions, whereas intronic substitutions show no evidence for adaptive evolution ( − 0.055, CI95 = −0.080 to −0.030). These analyses provide evidence for adaptive evolution at UTR sequences (Andolfatto 2005) and, surprisingly, at synonymous sites. The latter finding has three implications. First, if true, then α for unpolarized nonsynonymous substitutions (α = 0.240 and 0.359 for X and autosomes, respectively; see above) are likely to be underestimates (α* = 0.583 and 0.559 for X and autosomes, respectively). Second, the X and the autosomes show comparable α* estimates among all sequence classes except synonymous substitutions, for which α* is significantly higher on the X (0.372 vs. 0.277). This finding helps to explain why α for nonsynonymous substitutions on the X appeared lower than on the autosomes in the unpolarized MK data (see above). Third, the higher α* for synonymous sites on the X chromosome implies a greater efficacy of weak positive selection than on the autosomes. However, as X-linked UTRs do not show an elevated α* relative to autosomes, the elevated α* at X-linked synonymous sites would seem to be attributable to the biology of biased codon usage rather than a generally greater efficacy of selection on the X. Interestingly, X-linked genes maintain higher codon usage bias for preferred synonymous codons in Drosophila (Singh et al. 2005, 2008). The present results suggest that synonymous sites genome-wide may be adapting to new or shifting optimal codon preferences (Comeron and Kreitman 2002; DuMont et al. 2004). We cannot, however, exclude the possibility that the signal of adaptive evolution (or, part of it) results from a historical change in effective population size or functional constraints (Eyre-Walker 2002).
Recent Selective Sweeps
The SweepFinder analyses of 10,576 ten-kilobase windows from ten D. mauritiana strains identify 152 windows (1.4%) with significantly aberrant SFS relative to their chromosomal background at the P < 0.05 level (fig. 1). Significant windows are overrepresented on the X chromosome (61) relative to the autosomes (91; χ2 = 54.68, df = 1, P = 1.4 × 10−13). To characterize the distribution of polymorphism in these anomalous regions, we compared summaries of polymorphism (π), linkage disequilibrium (ZnS), and the SFS (using Tajima’s D) in the 152 aberrant windows with randomly selected windows using a simple randomization test. We performed 1,000 replicates randomly selecting 152 windows without replacement, divided among chromosome arms to match the distribution of aberrant windows. Anomalous windows have significantly lower π and significantly higher ZnS (P < 0.05). Although Tajima’s D is not different, the variance in Tajima’s D is significantly larger among anomalous windows (supplementary tables S8–S11, Supplementary Material online). For inferred sweep regions that are ≥20 kb, eight large sweeps have affected approximately 4.1% of the total map length of chromosome X; four large sweeps affect approximately 1.7% of the length of chromosome 2L, four affect 0.8% of 2R, five affect 0.4% of 3L, and no sweeps ≥20 kb are detected on chromosome 3R (table 4). Finally, the presence of these putative sweeps alone cannot account for the reduced πX/πA ratio (table 1); when all putative sweep regions on the X are excluded, πX/πA = 0.655 for the remaining X-linked sites.
Table 4.
Recently Swept Regions of the Drosophila mauritiana Genome That Are Larger Than 20 kb
Chromosome | Coordinates | Length (kb) | cM |
---|---|---|---|
2L | 1,445,000–1,468,000 | 23 | 0.095 |
2L | 20,687,000–20,943,000 | 256 | 0.405 |
2L | 22,406,000–22,635,000 | 229 | 0.362 |
2L | 22,837,000–23,030,000 | 193 | 0.305 |
2R | 1,390,000–1,576,000 | 186 | 0.294 |
2R | 1,704,000–1,760,000 | 56 | 0.088 |
2R | 3,064,000–3,087,000 | 23 | 0.143 |
2R | 8,774,000–8,798,000 | 24 | 0.119 |
3L | 7,000–128,000 | 121 | 0.399 |
3L | 22,379,000–22,493,000 | 114 | 0.007 |
3L | 22,687,000–22,879,000 | 192 | 0.012 |
3L | 23,184,000–23,313,000 | 129 | 0.007 |
3L | 23,562,000–23,610,000 | 48 | 0.003 |
3L | 23,712,000–23,751,000 | 39 | 0.002 |
X | 41,000–120,000 | 79 | 0.072 |
X | 436,000–457,000 | 21 | 0.019 |
X | 7,127,000–7,160,000 | 33 | 0.220 |
X | 8,500,000–9,050,000 | 550 | 3.008 |
X | 9,351,000–9,375,000 | 24 | 0.130 |
X | 14,629,000–14,661,000 | 32 | 0.107 |
X | 16,134,000–16,274,000 | 140 | 1.108 |
X | 19,996,000–20,227,000 | 231 | 0.679 |
Note.—The sweep coordinates given are from the mau12 genome assembly. The estimated size of the putative sweep region is provided in physical distance (kb) and centimorgans (True, Mercer, et al. 1996).
The majority of putative selective sweeps on the autosomes occur in low-recombination centromere-proximal regions (fig. 1). However, several of the putative selective sweeps on the X chromosome affect unusually large physical and recombination distances in the middle of the chromosome (table 4). The largest putative selective sweep occurs between positions X:8,500,000–9,050,000 (fig. 2). Interestingly, this 3-cM sweep region is enriched for satellite DNAs (figs. 2 and 4 and see the following subsection) and spans the locations of the Winters sex-ratio genes, MDox and Dox, which have been shown to be part of a cryptic male meiotic drive system in D. simulans (Tao et al. 2007; Kingan et al. 2010). To assess the potential contribution of the Winters sex-ratio genes to this selective sweep, we assayed the presence of both Dox and MDox in 26 D. mauritiana strains, including six strains used for whole-genome sequencing (supplementary methods, Supplementary Material online). We find that Dox is present in only 23% (6/26) of these strains (supplementary fig. S2, Supplementary Material online) and therefore could not have caused the large selective sweep. (The reference strain, mau12, lacks the Dox gene.) MDox, however, appears fixed in D. mauritiana (26/26; supplementary fig. S3, Supplementary Material online). Thus, MDox remains a potential cause of the selective sweep.
Fig. 2.—
Nucleotide diversity (π) across a large selective sweep on the X chromosome of D. mauritiana. The red rectangle in the inset above the graph delimits the region of reduced polymorphism on the X chromosome. Unfilled circles plot nucleotide diversity in 1-kb windows between positions X: 8,500,000–9,050,000. The red bars below the physical position of 1-kb windows represent the scaffolds constituting this region in the de novo assembly. Below the scaffolds are blue triangles that depict the gene models for a representative group of genes in the putative sweep regions (the names are from the homologous annotations in the D. melanogaster genome). The yellow squares mark the positions of 359-like satellite DNAs. Below the gene models are orange-filled triangles that show the positions of the markers used by True, Mercer, et al. (1996) and the observed intervening recombination distances. Finally, the green triangles give the locations of the two Winters sex-ratio meiotic drive genes in the D. simulans genome.
Fig. 4.—
Repeat content differs on the X and autosomes. In the top row, dot plots of the X chromosome and a representative autosomal arm (2R) show an enrichment of repetitive sequence on the X, which is organized into blocks of high repeat density. The bottom left panel shows the Percent repetitive sequence for the X chromosome compared with the average across the major autosomal arms. Finally, the bottom right panel shows the proportion of the different repeat classes in the highlighted repeat blocks on chromosomes X and 2R.
A second large sweep in a high-recombination region occurs sweep between positions 16,134,000 and 16,274,000 on the X chromosome (fig. 1). For this putative sweep, we estimate the population-scaled selection intensity (γ = 2Nes). We assume that nucleotide diversity before the sweep was π0 = 0.0082 (the average for the X chromosome) and that the local per-base crossing over rate is 7.91 × 10−6 (True, Mercer, et al. 1996). We then predict nucleotide diversity at recombination distance c, using the equation
(1) |
(Barton 2000). We fit the model to the data using a simple least-squares approach. The best-fitting model occurs when γ = 7.05 × 105 (fig. 3). In this candidate sweep region, there is a 20-kb stretch of near-complete homozygosity in D. mauritiana. From our RNAseq data, we have annotated four genes in this 20-kb region and identified their homologs in D. melanogaster: wupA, CG32553, CG43133, and ari-1. Of these four genes, only CG32553 and CG43133 have nonsynonymous substitutions exclusively in the D. mauritiana sample (supplementary table S12, Supplementary Material online). Neither CG32553 nor CG43133 has any known function or phenotype in D. melanogaster and their products do not possess homology with any known proteins or conserved domains, although, in D. melanogaster, CG32533 is expressed exclusively in the eye and CG43133 in the eye-antennal disc. Nolte et al. (2013) raised the possibility that the HMS locus Odysseus (OdsH) is located within this sweep region; however, in our assembly, OdsH maps to a distal region 200 kb away.
Fig. 3.—
A canonical selective sweep on the D. mauritiana X chromosome spans more than 140 kb. The red rectangle in the inset above the graph delimits the region of reduced polymorphism on the X chromosome. Unfilled circles plot nucleotide diversity (π) in 1-kb windows and the red line plots the expected π under a simple hard sweep model (see text). The red bars below the physical position of the 1-kb windows represent the scaffolds in the de novo assembly. Below the scaffolds are blue triangles showing the gene models. A 20-kb region of near-complete homozygosity contains four genes: wupA, CG32553, CG43133, and ari-1.
Finally, there is another large putative selective sweep on chromosome X:7,127,000–7,160,000 (table 4 and fig. 1). In this 33-kb region, there is again a 20-kb interval of nearly complete homozygosity among the ten D. mauritiana strains. The RNAseq data indicate that six transcripts map to this 20-kb interval (data not shown), each with a homolog in D. melanogaster: Hira, NELF-B, CG12155, Pdp, Rab39, and Tom40. Interestingly, expression of the Hira gene is known to be associated with Wolbachia-induced cytoplasmic incompatibility in male D. melanogaster (Zheng et al. 2011).
Genomic Distribution of Repetitive DNAs
We characterized the genomic distribution of five classes of repetitive DNA in the de novo assembly of the reference strain mau12: long terminal repeat (LTR) and non-LTR retrotransposons, DNA transposons, satellite DNAs, and microsatellites. These analyses disproportionately survey the euchromatic portions of the genome, because heterochromatic regions can be underrepresented in the assembly. We find that the densities of all classes of repetitive DNA are significantly heterogeneous among chromosome arms (χ2 = 72.30, df = 3, P < 10−5; supplementary table S13, Supplementary Material online). Curiously, DNA transposons and both classes of retroelements are underrepresented on chromosome arm 3R and overrepresented on 2R. The other classes of repetitive DNAs are all overrepresented exclusively on the X chromosome; Although significant heterogeneity does exist among autosomal arms, all four arms show a dearth of repetitive DNAs relative to the X. The X chromosome has an approximately 1.4-fold excess of microsatellites (0.035% of the sequence) and low complexity (0.005%) repeats relative to the autosomes (0.02% and 0.003%, respectively; see also Bachtrog et al. 1999). And, strikingly, the X chromosome has a 7.5-fold excess of satellite DNA (P < 10−5; 0.5%) relative to the autosomes (0.04%; see also Gallach 2014). Although satellite DNAs are typically organized as large blocks in pericentric heterochromatin, those detected here correspond to satellite DNA islands in the euchromatin (Kuhn et al. 2012). As satellite DNAs are repetitive and organized into arrays longer than can be spanned by short-read sequences, our analyses probably detect only the edges of satellite DNA islands and thus underestimate the total amount of satellite DNA.
Dot plots highlight the heterogeneity in repetitive DNA densities along chromosome arms, as well as the qualitative difference in repetitive DNA densities between the X and autosomes (fig. 4). On chromosome arm 2R, the pericentromeric proximal region is enriched for LTR retroelements, non-LTR retroelements, and DNA transposons. On the X, there are large megabase-scale regions in the middle of the chromosome arm with high densities of satellite DNAs, microsatellites, and low complexity sequences. The two most conspicuous concentrations of repetitive DNAs occur between coordinates X:8,235,022–9,591,378 (block X.1) and X:11625913–12594739 (block X.2). The coordinates of the first region coincide with those of the distal large 550-kb sweep region on the X (fig. 2 and table 4). Slightly more than half of the satellite DNA occurring in this region have homology with the 359-bp satellite (supplementary table S14, Supplementary Material online). Although there is evidence that the sex-ratio meiotic drive genes are not involved in this large selective sweep, the possibility remains that selfish satellite DNAs may play a role in eliminating variation over this large genomic region.
Discussion
Our survey of diversity and divergence from ten high-quality D. mauritiana genomes shows that the X chromosome bears more signatures of recent and recurrent positive natural selection than do the autosomes. The observed low X/A ratio of diversity cannot be easily explained by a recent population bottleneck (cf. Pool and Nielsen 2008). Our genome-wide analyses confirm that D. mauritiana harbors a surprisingly high level of nucleotide diversity for an island endemic species. Furthermore, the genome-wide SFS shows a mild excess of low frequency polymorphisms, a pattern that is more consistent with weak population expansion and/or functional constraints. In our data, there is no clear evidence for a recent recovery from a severe population bottleneck (see also Kliman et al. 2000). Instead, despite having what must be a smaller census population size than, say, the cosmopolitan species D. melanogaster, D. mauritiana maintains more diversity overall and, hence, a relatively large effective population size. Finally, the low X/A ratio of diversity cannot be attributed to greater functional constraints on the X, because the SFS is comparable between the X and the autosomes for all sequence classes (table 1), and the X shows higher divergence from D. melanogaster than do the autosomes. The disparity between X/A diversity versus divergence thus implies that positive natural selection acts more frequently on the X chromosome than the autosomes.
Our genome-wide scan for anomalous SFS confirms that the X chromosome of D. mauritiana has experienced more recent selective sweeps than have the autosomes. In general, these X chromosome sweeps are larger and produce more aberrant SFS than those on the autosomes (fig. 1 and table 4). The excess of recent sweeps on the X cannot be attributed to adaptation from standing genetic variation, which causes difficult-to-detect “soft” sweeps (Hermisson and Pennings 2005; Teshima et al. 2006), and is expected to progress more slowly on the X chromosome than the autosomes (Orr and Betancourt 2001). Instead, these findings suggest that the X has experienced more classic “hard” selective sweeps resulting from new beneficial mutations. A higher rate of adaptive evolution on the X is expected when new beneficial mutations are, on average, at least partially recessive (Charlesworth et al. 1987; Vicoso and Charlesworth 2009). This faster X model of adaptive molecular evolution may explain some of the excess divergence (Charlesworth et al. 1987) and reduced diversity on the X chromosome (Betancourt et al. 2004).
The relative abundance of hitchhiking effects on the X chromosome may be attributable to selfish genetic elements. Sex chromosomes are susceptible to the invasion and spread of multilocus meiotic drive elements (Hurst and Pomiankowski 1991), and these can cause patterns indistinguishable from classic selective sweeps (e.g., Derome et al. 2008). The distal 3-cM sweep valley on the D. mauritiana chromosome X spans the MDox locus, which may be the original distorter element of the Winters sex-ratio system (Tao et al. 2007; Kingan et al. 2010). Although additional data clearly show that the Dox locus cannot be the target of selection in this case, the progenitor MDox locus remains a viable explanation for this sweep. However, this large distal sweep also spans a particularly high density of 359-like satellite DNAs, which may also spread by manipulating transmission (Ferree and Prasad 2012; Gallach 2014). In primates, selective sweep regions on the X chromosome are also enriched for ampliconic genes (Nam et al. 2014), which may similarly be involved in conflict over transmission. Finally, our genome-wide MK analyses show that genes with individually significant evidence for recurrent adaptive protein evolution are overrepresented on the X chromosome and, separately, tend to be enriched for functions in reproductive biology and/or chromosome biology, suggestive of frequent genetic conflict (e.g., satellite DNA-binding proteins, nuclear pore proteins, kinetochore proteins).
Ours is the second population genomics analysis of D. mauritiana. The first used a “Pool-seq” strategy, sequencing 107 isofemale lines pooled to approximately 110-fold depth overall (Nolte et al. 2013). This shallow coverage of a deep sample is complementary to our deep coverage of a phased, shallow sample. Indeed, we detect the same faster X divergence, the same prominent selective sweeps on the X chromosome, and comparable sets of genes as targets of past recurrent adaptation. There are however large differences in our estimates of basic summary statistics. Despite a smaller sample, our estimates of π are 36% and 44% higher for the X and the autosomes, respectively. Conversely, our estimates of Tajima’s D are 3.7- and 5.2-fold smaller than those of Nolte et al. (2013) for the X and the autosomes, respectively (e.g., Tajima’s D = −1.79 vs. −0.33 for the autosomes; table 1). As noted by Nolte et al., the Pool-seq approach is biased toward an excess of rare variants, as sequencing errors are not readily distinguishable. Similarly, Lynch et al. (2014) estimated that the sequencing depth for Pool-seq must be 5-10× the sample size to infer low frequency alleles accurately. These comparisons suggest that Pool-seq can be useful for estimating net divergence and characterizing gross patterns of polymorphism, but not for the high-quality single nucleotide polymorphism calling necessary for accurate parameter estimation (Cutler and Jensen 2010).
The availability of population genomic data in D. mauritiana has two impacts on speciation genetics. First, for the past decade, evidence for faster X evolution in Drosophila has been equivocal, making it a doubtful contributor to the large X-effect for HMS (Presgraves 2008b). However, the present analyses, and others (Begun et al. 2007; Langley et al. 2012; Llopart 2012; Mackay et al. 2012; Meisel et al. 2012), firmly establish the fact of faster X evolution, although its biological basis remains to be determined (Meisel and Connallon 2013). Our analyses, and others’ (Begun et al. 2007; Nolte et al. 2013), are also consistent with the notion that genetic conflict is a major driver of molecular evolution, especially on the X chromosome (Frank 1991; Hurst and Pomiankowski 1991; Meiklejohn and Tao 2010). Second, the discovery that the D. mauritiana allele of Odysseus causes male sterility in a D. simulans genetic background (Ting et al. 1998) was a major step toward understanding the molecular biology of HMS. But identifying a panel of X-linked HMS factors is necessary to draw broad conclusions about the molecular basis of the large X-effect. Population genomics resources in D. mauritiana and D. simulans (Begun et al. 2007; Rogers et al. 2014) will empower genetic mapping efforts and accelerate identification of more genes involved in HMS and other species differences.
Supplementary Material
Supplementary methods, figures S1–S3, and tables S1–S16 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).
Acknowledgments
The authors thank Amanda Larracuente for discussion and comments. This work was made possible by a National Institutes of Health grant R01-ODO1054801 to D.G. and D.C.P.
Literature Cited
- Andolfatto P. Adaptive evolution of non-coding DNA in Drosophila. Nature. 2005;437:1149–1152. doi: 10.1038/nature04107. [DOI] [PubMed] [Google Scholar]
- Araripe LO, Eckstrand N, Hartl DL, Tao Y. Flanking regions of P-elements inserted in the 3rd chromosome of Drosophila mauritiana. Dros Inf Serv. 2006;89:54. [Google Scholar]
- Bachtrog D. Evidence for male-driven evolution in Drosophila. Mol Biol Evol. 2008;25:617–619. doi: 10.1093/molbev/msn020. [DOI] [PubMed] [Google Scholar]
- Bachtrog D, Weiss S, Zangerl B, Brem G, Schlotterer C. Distribution of dinucleotide microsatellites in the Drosophila melanogaster genome. Mol Biol Evol. 1999;16:602–610. doi: 10.1093/oxfordjournals.molbev.a026142. [DOI] [PubMed] [Google Scholar]
- Ballard JW. Comparative genomics of mitochondrial DNA in members of the Drosophila melanogaster subgroup. J Mol Evol. 2000a;51:48–63. doi: 10.1007/s002390010066. [DOI] [PubMed] [Google Scholar]
- Ballard JW. When one is not enough: introgression of mitochondrial DNA in Drosophila. Mol Biol Evol. 2000b;17:1126–1130. doi: 10.1093/oxfordjournals.molbev.a026394. [DOI] [PubMed] [Google Scholar]
- Barton NH. Genetic hitchhiking. Philos Trans R Soc Lond B Biol Sci. 2000;355:1553–1562. doi: 10.1098/rstb.2000.0716. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Begun DJ, Aquadro CF. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature. 1992;356:519–520. doi: 10.1038/356519a0. [DOI] [PubMed] [Google Scholar]
- Begun DJ, et al. Population genomics: whole-genome analysis of polymorphism and divergence in Drosophila simulans. PLoS Biol. 2007;5:e310. doi: 10.1371/journal.pbio.0050310. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Begun DJ, Whitley P. Reduced X-linked nucleotide polymorphism in Drosophila simulans. Proc Natl Acad Sci U S A. 2000;97:5960–5965. doi: 10.1073/pnas.97.11.5960. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Betancourt AJ, Kim Y, Orr HA. A pseudohitchhiking model of X vs. autosomal diversity. Genetics. 2004;168:2261–2269. doi: 10.1534/genetics.104.030999. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Campbell P, Good JM, Nachman MW. Meiotic sex chromosome inactivation is disrupted in sterile hybrid male house mice. Genetics. 2013;193:819–828. doi: 10.1534/genetics.112.148635. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Charlesworth B, Coyne JA, Barton NH. The relative rates of evolution of sex chromosomes and autosomes. Am Nat. 1987;130:113–146. [Google Scholar]
- Charlesworth B, Morgan MT, Charlesworth D. The effect of deleterious mutations on neutral molecular evolution. Genetics. 1993;134:1289–1303. doi: 10.1093/genetics/134.4.1289. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Comeron JM, Kreitman M. Population, evolutionary and genomic consequences of interference selection. Genetics. 2002;161:389–410. doi: 10.1093/genetics/161.1.389. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Coyne JA. The genetic basis of Haldane’s rule. Nature. 1985;314:736–738. doi: 10.1038/314736a0. [DOI] [PubMed] [Google Scholar]
- Coyne JA. Genetics and speciation. Nature. 1992;355:511–515. doi: 10.1038/355511a0. [DOI] [PubMed] [Google Scholar]
- Coyne JA, Charlesworth B. Genetic analysis of X-linked sterility in hybrids between three sibling species of Drosophila. Heredity. 1989;62:97–106. doi: 10.1038/hdy.1989.13. [DOI] [PubMed] [Google Scholar]
- Coyne JA, Orr HA. Two rules of speciation. In: Otte D, Endler J, editors. Speciation and its consequences. Sunderland (MA): Sinauer Associates; 1989. pp. 180–207. [Google Scholar]
- Cutler DJ, Jensen JD. To pool, or not to pool? Genetics. 2010;186:41–43. doi: 10.1534/genetics.110.121012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Danecek P, et al. The variant call format and VCFtools. Bioinformatics. 2011;27:2156–2158. doi: 10.1093/bioinformatics/btr330. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Darling AE, Mau B, Perna NT. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5:e11147. doi: 10.1371/journal.pone.0011147. [DOI] [PMC free article] [PubMed] [Google Scholar]
- David JR, McEvey SF, Solignac M, Tsacas L. Drosophila communities on Mauritius and the ecological niche of Drosophila mauritiana (Diptera, Drosophilidae) Rev Zool Afr. 1989;103:107–116. [Google Scholar]
- Dean MD, Ballard JWO. Linking phylogenetics with population genetics to reconstruct the geographic origin of a species. Mol Phylogenet Evol. 2004;32:998–1009. doi: 10.1016/j.ympev.2004.03.013. [DOI] [PubMed] [Google Scholar]
- Derome N, Baudry E, Ogereau D, Veuille M, Montchamp-Moreau C. Selective sweeps in a two-locus model for sex-ratio meiotic drive in Drosophila simulans. Mol Biol Evol. 2008;25:409–416. doi: 10.1093/molbev/msm269. [DOI] [PubMed] [Google Scholar]
- DuMont VB, Fay JC, Calabrese PP, Aquadro CF. DNA variability and divergence at the notch locus in Drosophila melanogaster and D. simulans: a case of accelerated synonymous site divergence. Genetics. 2004;167:171–185. doi: 10.1534/genetics.167.1.171. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z. GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics. 2009;10:48. doi: 10.1186/1471-2105-10-48. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Eyre-Walker A. Changing effective population size and the McDonald-Kreitman test. Genetics. 2002;162:2017–2024. doi: 10.1093/genetics/162.4.2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fay JC, Wyckoff GJ, Wu CI. Testing the neutral theory of molecular evolution with genomic data from Drosophila. Nature. 2002;415:1024–1026. doi: 10.1038/4151024a. [DOI] [PubMed] [Google Scholar]
- Felsenstein J. Evolutionary advantage of recombination. Genetics. 1974;78:737–756. doi: 10.1093/genetics/78.2.737. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ferree PM, Prasad S. How can satellite DNA divergence cause reproductive isolation? Let us count the chromosomal ways. Genet Res Int. 2012;2012:430136. doi: 10.1155/2012/430136. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Frank SA. Haldane’s rule—a defense of the meiotic drive theory. Evolution. 1991;45:1714–1717. doi: 10.1111/j.1558-5646.1991.tb02678.x. [DOI] [PubMed] [Google Scholar]
- Gallach M. Recurrent turnover of chromosome-specific satellites in Drosophila. Genome Biol Evol. 2014;6:1279–1286. doi: 10.1093/gbe/evu104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Garrigan D. POPBAM: tools for evolutionary analysis of short read sequence alignments. Evol Bioinform. 2013;9:343–353. doi: 10.4137/EBO.S12751. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Garrigan D, et al. Genome sequencing reveals complex speciation in the Drosophila simulans clade. Genome Res. 2012;22:1499–1511. doi: 10.1101/gr.130922.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Good JM, Dean MD, Nachman MW. A complex genetic basis to X-linked hybrid male sterility between two species of house mice. Genetics. 2008;179:2213–2228. doi: 10.1534/genetics.107.085340. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Henikoff S, Ahmad K, Malik HS. The centromere paradox: stable inheritance with rapidly evolving DNA. Science. 2001;293:1098–1102. doi: 10.1126/science.1062939. [DOI] [PubMed] [Google Scholar]
- Hense W, Baines JF, Parsch J. X chromosome inactivation during Drosophila spermatogenesis. PLoS Biol. 2007;5:e273. doi: 10.1371/journal.pbio.0050273. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hermisson J, Pennings PS. Soft sweeps: molecular population genetics of adaptation from standing genetic variation. Genetics. 2005;169:2335–2352. doi: 10.1534/genetics.104.036947. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hey J, Kliman RM. Interactions between natural selection, recombination and gene density in the genes of Drosophila. Genetics. 2002;160:595–608. doi: 10.1093/genetics/160.2.595. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hill WG, Robertson A. Effect of linkage on limits to artificial selection. Genet Res. 1966;8:269–294. [PubMed] [Google Scholar]
- Hsieh T, Brutlag D. Sequence and sequence variation within the 1.688 g/cm3 satellite DNA of Drosophila melanogaster. J Mol Biol. 1979;135:465–481. doi: 10.1016/0022-2836(79)90447-9. [DOI] [PubMed] [Google Scholar]
- Hu TT, Eisen MB, Thornton KR, Andolfatto P. A second-generation assembly of the Drosophila simulans genome provides new insights into patterns of lineage-specific divergence. Genome Res. 2013;23:89–98. doi: 10.1101/gr.141689.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hurst LD, Pomiankowski A. Causes of sex ratio bias may account for unisexual sterility in hybrids: a new explanation of Haldane’s rule and related phenomena. Genetics. 1991;128:841–858. doi: 10.1093/genetics/128.4.841. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hvilsom C, et al. Extensive X-linked adaptive evolution in central chimpanzees. Proc Natl Acad Sci U S A. 2012;109:2054–2059. doi: 10.1073/pnas.1106877109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jurka J, et al. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005;110:462–467. doi: 10.1159/000084979. [DOI] [PubMed] [Google Scholar]
- Kearse M, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28:1647–1649. doi: 10.1093/bioinformatics/bts199. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Keightley PD, et al. Analysis of the genome sequences of three Drosophila melanogaster spontaneous mutation accumulation lines. Genome Res. 2009;19:1195–1201. doi: 10.1101/gr.091231.109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kelly JK. A test of neutrality based on interlocus associations. Genetics. 1997;146:1197–1206. doi: 10.1093/genetics/146.3.1197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kingan S, Garrigan D, Hartl D. Recurrent selection on the Winters sex-ratio genes in Drosophila simulans. Genetics. 2010;184:253–265. doi: 10.1534/genetics.109.109587. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kirkpatrick M, Hall DW. Male-biased mutation, sex linkage, and the rate of adaptive evolution. Evolution. 2004;58:437–440. [PubMed] [Google Scholar]
- Kitano J, et al. A role for a neo-sex chromosome in stickleback speciation. Nature. 2009;461:1079–1083. doi: 10.1038/nature08441. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kliman RM, et al. The population genetics of the origin and divergence of the Drosophila simulans complex species. Genetics. 2000;156:1913–1931. doi: 10.1093/genetics/156.4.1913. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kopp A, Frank AK, Barmina O. Interspecific divergence, intrachromosomal recombination, and phylogenetic utility of Y-chromosomal genes in Drosophila. Mol Phylogenet Evol. 2006;38:731–741. doi: 10.1016/j.ympev.2005.10.006. [DOI] [PubMed] [Google Scholar]
- Kousathanas A, Halligan DL, Keightley PD. Faster-X adaptive protein evolution in house mice. Genetics. 2014;196:1131–1143. doi: 10.1534/genetics.113.158246. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kuhn GC, Kuttler H, Moreira-Filho O, Heslop-Harrison JS. The 1.688 repetitive DNA of Drosophila: concerted evolution at different genomic scales and association with genes. Mol Biol Evol. 2012;29:7–11. doi: 10.1093/molbev/msr173. [DOI] [PubMed] [Google Scholar]
- Kurtz S, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:R12. doi: 10.1186/gb-2004-5-2-r12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lachaise D, David JR, Lemeunier F, Tsacas L, Ashburner M. The reproductive relationships of Drosophila sechellia with Drosophila mauritiana, Drosophila simulans and Drosophila melanogaster from the Afrotropical region. Evolution. 1986;40:262–271. doi: 10.1111/j.1558-5646.1986.tb00468.x. [DOI] [PubMed] [Google Scholar]
- Langley CH, et al. Genomic variation in natural populations of Drosophila melanogaster. Genetics. 2012;192:533–598. doi: 10.1534/genetics.112.142018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Legrand D, Chenel T, Campagne C, Lachaise D, Cariou ML. Inter-island divergence within Drosophila mauritiana, a species of the D. simulans complex: past history and/or speciation in progress? Mol Ecol. 2011;20:2787–2804. doi: 10.1111/j.1365-294X.2011.05127.x. [DOI] [PubMed] [Google Scholar]
- Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. 2013 ArXiv e-prints. [Google Scholar]
- Li H, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lifschytz E, Lindsley DL. The role of X-chromosome inactivation during spermatogenesis. Proc Natl Acad Sci U S A. 1971;69:182–186. doi: 10.1073/pnas.69.1.182. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Llopart A. The rapid evolution of X-linked male-biased gene expression and the large-X effect in Drosophila yakuba, D. santomea, and their hybrids. Mol Biol Evol. 2012;29:3873–3886. doi: 10.1093/molbev/mss190. [DOI] [PubMed] [Google Scholar]
- Lynch M, Bost D, Wilson S, Maruki T, Harrison S. Population-genetic inference from pooled-sequencing data. Genome Biol Evol. 2014;6:1210–1218. doi: 10.1093/gbe/evu085. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mackay TFC, et al. The Drosophila melanogaster genetic reference panel. Nature. 2012;482:173–178. doi: 10.1038/nature10811. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Malik HS, Henikoff S. Adaptive evolution of Cid, a centromere-specific histone in Drosophila. Genetics. 2001;157:1293–1298. doi: 10.1093/genetics/157.3.1293. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Masly JP, Presgraves DC. High-resolution genome-wide dissection of the two rules of speciation in Drosophila. PLoS Biol. 2007;5:e243. doi: 10.1371/journal.pbio.0050243. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Maynard-Smith J, Haigh J. Hitch-hiking effect of a favorable gene. Genet Res. 1974;23:23–35. [PubMed] [Google Scholar]
- McDermott SR, Kliman RM. Estimation of isolation times of the island species in the Drosophila simulans complex from multilocus DNA sequence data. PLoS One. 2008;3:e2442. doi: 10.1371/journal.pone.0002442. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McDonald J, Kreitman M. Adaptive protein evolution at the Adh locus in Drosophila. Nature. 1991;351:652–654. doi: 10.1038/351652a0. [DOI] [PubMed] [Google Scholar]
- Meiklejohn C, Tao Y. Genetic conflict and sex chromosome evolution. Trends Ecol Evol. 2010;25:215–223. doi: 10.1016/j.tree.2009.10.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meiklejohn CD, Landeen EL, Cook JM, Kingan SB, Presgraves DC. Sex chromosome-specific regulation in the Drosophila male germline but little evidence for chromosomal dosage compensation or meiotic inactivation. PLoS Biol. 2011;9:e1001126. doi: 10.1371/journal.pbio.1001126. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meisel RP, Connallon T. The faster-X effect: integrating theory and data. Trends Genet. 2013;29:537–544. doi: 10.1016/j.tig.2013.05.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Meisel RP, Malone JH, Clark AG. Faster-X evolution of gene expression in Drosophila. PLoS Genet. 2012;8:e1003013. doi: 10.1371/journal.pgen.1003013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Miyata T, Hayashida H, Kuma K, Mitsuyasu K, Yasunaga T. Male-driven molecular evolution: a model and nucleotide sequence analysis. Cold Spring Harb Symp Quant Biol. 1987;52:863–867. doi: 10.1101/sqb.1987.052.01.094. [DOI] [PubMed] [Google Scholar]
- Moehring AJ, Llopart A, Elwyn S, Coyne JA, Mackay TF. The genetic basis of postzygotic reproductive isolation between Drosophila santomea and D. yakuba due to hybrid male sterility. Genetics. 2006;173:225–233. doi: 10.1534/genetics.105.052985. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Naisbit RE, Jiggins CD, Linares M, Salazar C, Mallet J. Hybrid sterility, Haldane’s rule and speciation in Heliconius cydno and H. melpomene. Genetics. 2002;161:1517–1526. doi: 10.1093/genetics/161.4.1517. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nam K, et al. Strong selective sweeps associated with ampliconic regions in great ape X chromosomes. 2014 arXiv:1402.5790. [Google Scholar]
- Nei M. Molecular evolutionary genetics. New York: Columbia University Press; 1987. [Google Scholar]
- Nei M, Kumar S. Molecular evolution and phylogenetics. New York: Oxford University Press; 2000. [Google Scholar]
- Nielsen R, et al. Genomic scans for selective sweeps using SNP data. Genome Res. 2005;15:1566–1575. doi: 10.1101/gr.4252305. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nolte V, Pandey RV, Kofler R, Schlotterer C. Genome-wide patterns of natural variation reveal strong selective sweeps and ongoing genomic conflict in Drosophila mauritiana. Genome Res. 2013;23:99–110. doi: 10.1101/gr.139873.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nunes MDS, Wengel POT, Kreissl M, Schlötterer C. Multiple hybridization events between Drosophila simulans and Drosophila mauritiana are supported by mtDNA introgression. Mol Ecol. 2010;19:4695–4707. doi: 10.1111/j.1365-294X.2010.04838.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orr HA, Betancourt AJ. Haldane’s sieve and adaptation from the standing genetic variation. Genetics. 2001;157:875–884. doi: 10.1093/genetics/157.2.875. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Parsch J, Novozhilov S, Saminadin-Peter SS, Wong KM, Andolfatto P. On the utility of short intron sequences as a reference for the detection of positive and negative selection in Drosophila. Mol Biol Evol. 2010;27:1226–1234. doi: 10.1093/molbev/msq046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Phadnis N, Orr HA. A single gene causes both male sterility and segregation distortion in Drosophila hybrids. Science. 2009;323:376–379. doi: 10.1126/science.1163934. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pool JE, Nielsen R. The impact of founder events on chromosomal variability in multiply mating species. Mol Biol Evol. 2008;25:1728–1736. doi: 10.1093/molbev/msn124. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Presgraves DC. Drive and sperm: evolution and genetics of male meiotic drive. In: Birkhead TR, Hosken DJ, Pitnick S, editors. Sperm biology: an evolutionary perspective. Elsevier Press; 2008a. pp. 471–506. [Google Scholar]
- Presgraves DC. Sex chromosomes and speciation in Drosophila. Trends Genet. 2008b;24:336–343. doi: 10.1016/j.tig.2008.04.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Przewloka MR, et al. Molecular analysis of core kinetochore composition and assembly in Drosophila melanogaster. PLoS One. 2007;2:e478. doi: 10.1371/journal.pone.0000478. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rogers RL, et al. Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans. Mol Biol Evol. 2014;31:1750–1766. doi: 10.1093/molbev/msu124. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Saetre GP, et al. Speciation, introgressive hybridization and nonlinear rate of molecular evolution in flycatchers. Mol Ecol. 2001;10:737–749. doi: 10.1046/j.1365-294x.2001.01208.x. [DOI] [PubMed] [Google Scholar]
- Simpson JT, et al. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009;19:1117–1123. doi: 10.1101/gr.089532.108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Singh ND, Davis JC, Petrov DA. Codon bias and noncoding GC content correlate negatively with recombination rate on the Drosophila X chromosome. J Mol Evol. 2005;61:315–324. doi: 10.1007/s00239-004-0287-1. [DOI] [PubMed] [Google Scholar]
- Singh ND, Larracuente AM, Clark AG. Contrasting the efficacy of selection on the X and autosomes in Drosophila. Mol Biol Evol. 2008;25:454–467. doi: 10.1093/molbev/msm275. [DOI] [PubMed] [Google Scholar]
- Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 2010 [cited 2014 Sep 12]. Available from: http://www.repeatmasker.org. [Google Scholar]
- Smith N, Eyre-Walker A. Adaptive protein evolution in Drosophila. Nature. 2002;415:1022–1024. doi: 10.1038/4151022a. [DOI] [PubMed] [Google Scholar]
- Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123:585–595. doi: 10.1093/genetics/123.3.585. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tang S, Presgraves DC. Evolution of the Drosophila nuclear pore complex results in multiple hybrid incompatibilities. Science. 2009;323:779–782. doi: 10.1126/science.1169123. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tao Y, Chen S, Hartl DL, Laurie CC. Genetic dissection of hybrid incompatibilities between Drosophila simulans and D. mauritiana. I. Differential accumulation of hybrid male sterility effects on the X and autosomes. Genetics. 2003;164:1383–1397. doi: 10.1093/genetics/164.4.1383. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tao Y, et al. A sex-ratio meiotic drive system in Drosophila simulans. II: an X-linked distorter. PLoS Biol. 2007;5:e293. doi: 10.1371/journal.pbio.0050293. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tao Y, Hartl DL, Laurie CC. Sex-ratio segregation distortion associated with reproductive isolation in Drosophila. Proc Natl Acad Sci U S A. 2001;98:13183–13188. doi: 10.1073/pnas.231478798. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Teshima KM, Coop G, Przeworski M. How reliable are empirical genomic scans for selective sweeps? Genome Res. 2006;16:702–712. doi: 10.1101/gr.5105206. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ting C-T, Tsaur S-C, Wu M-L, Wu C-I. A rapidly evolving homeobox at the site of a hybrid sterility gene. Science. 1998;282:1501–1504. doi: 10.1126/science.282.5393.1501. [DOI] [PubMed] [Google Scholar]
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28:511–515. doi: 10.1038/nbt.1621. [DOI] [PMC free article] [PubMed] [Google Scholar]
- True JR, Mercer JM, Laurie CC. Differences in crossover frequency and distribution among three sibling species of Drosophila. Genetics. 1996;142:507–523. doi: 10.1093/genetics/142.2.507. [DOI] [PMC free article] [PubMed] [Google Scholar]
- True JR, Weir BS, Laurie CC. A genome-wide survey of hybrid incompatibility factors by the introgression of marked segments of Drosophila mauritiana chromosomes into Drosophila simulans. Genetics. 1996;142:819–837. doi: 10.1093/genetics/142.3.819. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Turelli M, Orr HA. The dominance theory of Haldane’s rule. Genetics. 1995;140:389–402. doi: 10.1093/genetics/140.1.389. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Turelli M, Orr HA. Dominance, epistasis and the genetics of postzygotic isolation. Genetics. 2000;154:1663–1679. doi: 10.1093/genetics/154.4.1663. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Vicoso B, Charlesworth B. Effective population size and the faster-X effect: an extended model. Evolution. 2009;63:2413–2426. doi: 10.1111/j.1558-5646.2009.00719.x. [DOI] [PubMed] [Google Scholar]
- Wu CI, Davis AW. Evolution of postmating reproductive isolation: the composite nature of Haldane’s rule and its genetic bases. Am Nat. 1993;142:187–212. doi: 10.1086/285534. [DOI] [PubMed] [Google Scholar]
- Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088. [DOI] [PubMed] [Google Scholar]
- Yang Z, Nielsen R. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol Biol Evol. 2000;17:32–43. doi: 10.1093/oxfordjournals.molbev.a026236. [DOI] [PubMed] [Google Scholar]
- Zheng Y, Ren P-P, Wang J-L, Wang Y-F. Wolbachia-induced cytoplasmic incompatibility is associated with decreased Hira expression in male Drosophila. PLoS One. 2011;6:e19512. doi: 10.1371/journal.pone.0019512. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.