eLife. 2018 Dec 13;7:e35468. doi: 10.7554/eLife.35468

Gene flow mediates the role of sex chromosome meiotic drive during complex speciation

Colin D Meiklejohn 1, Emily L Landeen 2, Kathleen E Gordon 1, Thomas Rzatkiewicz 2, Sarah B Kingan 2,§, Anthony J Geneva 2,#, Jeffrey P Vedanayagam 2, Christina A Muirhead 2, Daniel Garrigan 2,**, David L Stern 3, Daven C Presgraves 2
Editors: Molly Przeworski4, Diethard Tautz5
PMCID: PMC6292695  PMID: 30543325

Abstract

During speciation, sex chromosomes often accumulate interspecific genetic incompatibilities faster than the rest of the genome. The drive theory posits that sex chromosomes are susceptible to recurrent bouts of meiotic drive and suppression, causing the evolutionary build-up of divergent cryptic sex-linked drive systems and, incidentally, genetic incompatibilities. To assess the role of drive during speciation, we combine high-resolution genetic mapping of X-linked hybrid male sterility with population genomics analyses of divergence and recent gene flow between the fruitfly species, Drosophila mauritiana and D. simulans. Our findings reveal a high density of genetic incompatibilities and a corresponding dearth of gene flow on the X chromosome. Surprisingly, we find that a known drive element recently migrated between species and, rather than contributing to interspecific divergence, caused a strong reduction in local sequence divergence, undermining the evolution of hybrid sterility. Gene flow can therefore mediate the effects of selfish genetic elements during speciation.

Research organism: Other

Introduction

Speciation involves the evolution of reproductive incompatibilities between diverging populations, including prezygotic incompatibilities that prevent the formation of hybrids and postzygotic incompatibilities that render hybrids sterile or inviable. Two patterns characterizing speciation implicate a special role for sex chromosomes in the evolution of postzygotic incompatibilities: Haldane’s rule, the observation that hybrids of the heterogametic sex preferentially suffer sterility and inviability (Haldane, 1922; Wu and Davis, 1993; Orr, 1997; Laurie, 1997; Price and Bouvier, 2002; Presgraves, 2002; Coyne and Orr, 2004); and the large X-effect, the observation that the X chromosome has a disproportionately large effect on hybrid sterility (Coyne and Orr, 1989; Coyne, 1992a; Presgraves, 2008). These patterns hold across a wide range of taxa, including female heterogametic (ZW) birds and Lepidoptera and male heterogametic (XY) plants, Drosophila, and mammals (Coyne and Orr, 1989; Coyne and Orr, 2004). We now know that these ‘two rules of speciation’ (Coyne and Orr, 1989) are, in part, attributable to the rapid evolution of genetic factors that cause interspecific hybrid sterility on the X chromosome relative to the autosomes (Tao and Hartl, 2003; Moehring et al., 2007; Masly and Presgraves, 2007; Presgraves, 2008; Good et al., 2008). The relatively rapid accumulation of X-linked hybrid sterility factors is associated with reduced interspecific gene flow at X-linked versus autosomal loci (reviewed in Presgraves, 2018). Overall, these patterns show that, for many taxa with heteromorphic sex chromosomes, the X chromosome plays a large and fundamental role in speciation.

Given the taxonomic breadth of Haldane’s rule, the large X-effect, and reduced interspecific gene flow on the X, understanding why the X chromosome accumulates hybrid incompatibilities faster than the rest of the genome is imperative. At least five explanations have been proposed: faster X evolution (Charlesworth et al., 1987), gene traffic (Moyle et al., 2010), disrupted sex chromosome regulation in the germline (Lifschytz and Lindsley, 1972), the evolutionary origination of incompatibilities in parapatry (Höllinger and Hermisson, 2017), and meiotic drive (Hurst and Pomiankowski, 1991; Frank, 1991). Here, we focus on the potential role of meiotic drive. The drive theory posits that sex chromosomes are more susceptible than autosomes to invasion by selfish meiotic drive (sensu lato) elements (Hurst and Pomiankowski, 1991; Frank, 1991). Sex-linked drive compromises fertility and distorts sex ratios, which leads to evolutionary arms races between drivers, unlinked suppressors, and linked enhancers (Lindholm et al., 2016; Presgraves, 2008; Meiklejohn and Tao, 2010). These arms races can contribute to the evolution of hybrid male sterility, in at least two ways. Normally-suppressed drive elements might be aberrantly expressed in the naive genetic backgrounds of species hybrids, causing sterility rather than sex ratio distortion (Hurst and Pomiankowski, 1991; Frank, 1991). Alternatively, recurrent bouts of invasion, spread, and coevolution among drive, suppressor, and enhancer loci might cause interspecific divergence at these loci that incidentally cause hybrid sterility and map disproportionately to sex chromosomes (Presgraves, 2008; Meiklejohn and Tao, 2010).

Multiple lines of evidence support the plausibility of the drive theory. First, theoretical considerations and empirical evidence suggest that both active and suppressed sex chromosome meiotic drive systems are widespread in natural populations (Jaenike, 2001). Indeed, in one species, Drosophila simulans, three cryptic (normally suppressed) sex-ratio drive systems (Winters, Durham, and Paris) have been identified, involving distinct sets of X-linked drive loci and autosomal and/or Y-linked suppressors (Tao et al., 2001; Tao et al., 2007a; Tao et al., 2007b; Helleu et al., 2016). Second, loci involved in cryptic sex-ratio systems co-localize with hybrid male sterility loci in genetic mapping experiments (Tao et al., 2001; Zhang et al., 2015; Orr and Irving, 2005). Third, at least one of the two X-linked hybrid sterility genes identified to date also causes meiotic drive (Phadnis and Orr, 2009). These discoveries confirm that recurrent bouts of drive and suppression have occurred and that cryptic drive genes can cause hybrid sterility. While these findings put the plausibility of the drive hypothesis beyond doubt, the question of its generality remains: what fraction of X-linked hybrid sterility factors evolved as a consequence of drive? We can furthermore ask whether, and how often, drive can impede the evolution of hybrid incompatibilities. The drive hypothesis assumes, for instance, that populations evolve in strict allopatry (simple speciation) and/or that drive elements require particular population-specific genetic backgrounds for their activity. But for populations that diverge with some level of gene flow (complex speciation), drive elements can in principle migrate between species, thereby reducing divergence and potentially undermining the evolution of hybrid sterility (Macaya-Sanz et al., 2011; Crespi and Nosil, 2013; Seehausen et al., 2014).

Here, we investigate the special role of sex chromosomes in speciation with genetic mapping and population genomic analyses between Drosophila mauritiana and D. simulans. The human commensal species, D. simulans, originated on Madagascar, diverging from the sub-Saharan African species, D. melanogaster, ~3 Mya (Lachaise et al., 1988; Dean and Ballard, 2004; Baudry et al., 2006; Kopp, 2006; Ballard, 2004). The island-endemic species, D. mauritiana, originated on the Indian Ocean island of Mauritius, diverging from D. simulans ~240 kya (Kliman et al., 2000; McDermott and Kliman, 2008; Garrigan et al., 2012). The two species are now isolated by geography—D. simulans has never been collected on Mauritius (David et al., 1989)—and by multiple incomplete reproductive incompatibilities, including asymmetric premating isolation (Coyne, 1992b), postmating-prezygotic isolation (Price, 1997), and intrinsic postzygotic isolation (F1 hybrid males are sterile, F1 hybrid females are fertile; Lachaise et al., 1986). Despite geographic and reproductive isolation, there is clear evidence for historical gene flow between the two species (Solignac and Monnerot, 1986; Solignac et al., 1986; Garrigan et al., 2012; Ballard, 2000a; Ballard, 2000b; Satta et al., 1988; Satta and Takahata, 1990). The X chromosome shows both an excess of factors causing hybrid male sterility (True et al., 1996b; Tao et al., 2003) and, correspondingly, a dearth of historical interspecific introgression (Garrigan et al., 2012). The rapid accumulation of X-linked hybrid male sterility factors may have contributed to reduced X-linked gene flow, limiting exchangeability at sterility factors and genetically linked loci (Muirhead and Presgraves, 2016).

To begin to assess the role of drive in the evolution of X-linked hybrid male sterility between these two species, we performed genetic mapping experiments using genotype-by-sequencing of advanced-generation recombinant X-linked introgressions from D. mauritiana in an otherwise pure D. simulans genetic background. In parallel, we performed population genomic analyses between D. mauritiana and D. simulans to study the chromosomal distributions of interspecific divergence and gene flow. These analyses lead to two discoveries regarding the role of meiotic drive in speciation. First, we find evidence for modest X-linked segregation distortion in hybrids, supporting the hypothesis that cryptic sex-ratio systems are common. Second, we show that a now-cryptic X-linked sex-ratio drive system recently introgressed between species and likely caused large selective sweeps in both species. As a result, this X-linked region shows greatly reduced interspecific sequence divergence and an associated lack of hybrid male sterility factors. Contra the drive hypothesis, in this instance, gene flow at a meiotic drive locus may have prevented or undermined the evolution of X-linked hybrid male sterility. These findings suggest that the effects of selfish genetic elements on interspecific divergence and the accumulation of incompatibilities depend on their opportunity to migrate between species during complex speciation.

Results

Mapping X-linked hybrid male sterility

Multiple intervals on the X chromosome cause male sterility when introduced from D. mauritiana into D. simulans (True et al., 1996b; Maside et al., 1998). The number and identities of the causal factors, how they disrupt spermatogenesis, and the evolutionary forces that drove their interspecific divergence are unknown. We therefore generated a high-resolution genetic map of X-linked hybrid male sterility between the two species, with the ultimate aim of identifying a panel of sterility factors. We first introgressed eight X-linked D. mauritiana segments that together tile across ~85% of the euchromatic length of the X chromosome into a D. simulans genetic background (Figure 1A,B; Table 1). Each introgressed segment was marked by two co-dominant P element insertions bearing mini-white transgenes (P[w+]; True et al., 1996a) that serve as visible genetic markers. We introgressed these ‘2P’ segments into the D. simulans wXD1 genetic background through >40 generations of repeated backcrossing (Figure 1A). Our ability to generate these introgression genotypes confirms that the distal 85% of the D. mauritiana X euchromatin carries no dominant factors that cause female sterility or lethality in a D. simulans genetic background (True et al., 1996b; Tao et al., 2003). All eight 2P introgression genotypes are, however, completely male-sterile, indicating that each of the introgressed regions contains one or more hybrid male sterility factors. Two pairs of introgression genotypes carry largely overlapping introgressed D. mauritiana segments and were combined for further analyses (2P-5a/b and 2P-6a/b, respectively; Figure 1B, Table 1).

Figure 1. Crosses used to introgress eight regions of the D. mauritiana X chromosome into a D. simulans genome.

(A) D. mauritiana ‘2P’ lines were constructed by combining pairs of P-element insertions containing the miniwhite transgene (P[w+]; red triangles) distributed across the X chromosome. The P[w+] inserts are semi-dominant visible eye-color markers that permit discrimination of individuals carrying 0, 1 or 2P[w+]. X-linked segments from D. mauritiana were introgressed into a D. simulans genetic background by backcrossing 2P[w+] hybrid females to D. simulans wXD1 males for over 40 generations. Each introgression line was then bottlenecked through a single female to eliminate segregating variation in the recombination breakpoints flanking the 2P[w+] interval. (B) Cytological map of the D. melanogaster X chromosome, indicating the locations of P[w+] and pBac[eYFP] transgene insertions. The extent of regions introgressed from D. mauritiana into D. simulans (e.g. 2P-1) are labeled above the map. Two pairs of introgression genotypes (2P-5a/b and 2P-6a/b) mostly overlap; the regions included in 2P-5b/2P-6b but not 2P-5a/2P-6a are indicated by dashed lines. (C) Meiotic mapping of sterility factors. 2P[w+] females were crossed to D. simulans strains carrying an X-linked pBac[eYFP] transgene (yellow triangles) that was used as an additional visible marker to score recombinant chromosomes. Recombinant X chromosomes with both pBac[eYFP] and a single P[w+] were chosen and assayed for male fertility. Recombinant chromosomes were generated using pBac[eYFP] markers both proximal and distal to each 2P introgression.

Figure 1—source data 1. Source data for Figure 1—figure supplement 1, Figure 4—figure supplement 1.
DOI: 10.7554/eLife.35468.004
Figure 1—source data 2. Source data for Figure 1—figure supplement 1, Figure 4—figure supplement 1.
DOI: 10.7554/eLife.35468.005
Figure 1—source data 3. Source data for Figure 1—figure supplement 1.
DOI: 10.7554/eLife.35468.006

Figure 1—figure supplement 1. Distribution of fertility (number of progeny) among all males carrying recombinant 1P-YFP X chromosomes, and average number of progeny among all 1P-YFP genotypes.

Colored bars and arrow below indicate individual male and mean fertility for D. simulans wXD1, respectively. The mean fertility of 10 replicate D. mauritiana w12 males with D. simulans wXD1 females is 197.2 offspring.

Table 1. Locations and lengths of 2P intervals.

2P interval Left P[w+]* Right P[w+]* Length (Mbp)
2P-1 993419 4498520 3.51
2P-3 6192555 9126133 2.93
2P-4 9126133 11189873 2.06
2P-5a 11189873 13324017 2.13
2P-5b 11189873 13903934 2.71
2P-6a 13903934 17492084 3.59
2P-6b 13324017 17492084 4.17
2P-7 17492084 18660037 1.17

*coordinate position in the assembled D. simulans w501 genome.

To determine the genetic basis of male sterility within each 2P interval, we generated recombinant introgressions using D. simulans strains carrying pBac[eYFP] visible markers (Stern et al., 2017) (Figure 1C). These crosses capture unique recombination events between P[w+] and pBac[eYFP] markers, allowing recombinant D. mauritiana introgressions (hereafter called 1P-YFP) to be propagated indefinitely through females without recombination via selection for the 1P-YFP genotype. From these 1P-YFP females, an unlimited number of replicate males carrying identical 1P-YFP recombinant introgressions can be generated, assayed for male fertility, and archived for genotyping (Figure 1C; see below). We assayed male fertility in at least 10 individual males from each of 617 recombinant 1P-YFP genotypes (Table 2; see Materials and methods), and used the mean number of offspring across replicate males as the measure of fertility for each 1P-YFP genotype. Across 1P-YFP genotypes, the mean number of offspring ranged from 0 to 215 progeny; 238 genotypes (38.6%) were completely male-sterile, producing no offspring, and an additional 62 (10%) produced fewer than five offspring per male (Figure 1—figure supplement 1). Of the remaining 1P-YFP genotypes, 231 (37.4%) had intermediate fertility, and 86 (13.9%) had fertility indistinguishable from pure D. simulans controls (t-test, p>0.01).

Table 2. Fertility and sex ratio phenotypes for 1P-YFP recombinant genotypes.

2P interval           N tested  N sterile*  N sub-fertile  N fertile  Mean fertility  Proportion fertile  Mean SR
2P-1                  171       48          20             103        72.2            0.60                0.43
2P-3                  97        12          21             64         67.4            0.66                0.45
2P-4                  77        17          9              51         71.9            0.66                0.45
2P-5a/b               92        23          16             53         68.2            0.58                0.51
2P-6a/b               97        69          10             18         73.8            0.19                0.44
2P-7                  83        69          6              8          136.5           0.10                0.47
all 1P-YFP genotypes  617       238         82             297        81.7            0.48                0.45

*genotypes where no male produced any offspring.

genotypes where at least two males produced at least five offspring.

We determined high-resolution genotypes of 1P-YFP recombinant introgressions using multiplexed whole-genome sequencing (Andolfatto et al., 2011). After quality filtering, we obtained high-confidence genome-wide genotype information for 439 1P-YFP recombinant introgressions (Figure 2). No genotype showed evidence for any autosomal D. mauritiana alleles, confirming that the introgression scheme isolated X-linked D. mauritiana segments in a pure D. simulans autosomal genetic background (Figure 2—figure supplement 1). Recombinant 1P-YFP introgressions on the X chromosome ranged in size from 0.219 to 6.32 Mbp, with a mean length of 1.97 Mbp (Table 3). Figure 2 shows the distribution of D. mauritiana introgression segments and their corresponding sterility phenotypes. Three large regions on the D. mauritiana X chromosome can be introgressed into D. simulans without strong negative effects on male fertility, indicating an absence of major hybrid male sterility factors in these regions (Figure 2). Conversely, we delineated four small regions (<700 kb) that consistently and strongly reduced male fertility: 90% of replicate males with introgressions spanning these regions produce fewer than five offspring. Quantitative trait locus (QTL) analyses confirmed the existence of genetic variation among introgression genotypes that significantly affects male fertility (Figure 3, Figure 3—figure supplement 1). At least five QTL peaks are significant at p<0.01 (permutation test). Most regions containing D. mauritiana alleles reduce the average number of progeny to <15. Two QTL peaks (2.5 cM and 29.3 cM; Figure 3) appear to show higher fertility associated with the D. mauritiana allele than the D. simulans allele, but this is attributable to D. mauritiana sterility factors located at 12.6 cM and 17.5 cM and the negative linkage disequilibrium that is generated across a 2P interval by our meiotic mapping approach (Figure 1C).

Figure 2. High-resolution genetic map of X-linked hybrid male sterility.

Colored horizontal bars indicate the extent of introgressed D. mauritiana alleles for each recombinant 1P-YFP X chromosome. The color of each introgression indicates the mean fertility of 10 replicate males carrying that 1P-YFP X chromosome. The three shaded areas indicate fertile regions within which D. mauritiana introgressions do not cause sterility, whereas the four red arrows indicate small candidate sterility regions. The blue arrowhead indicates the location of the Dox/MDox meiotic drive loci. Lines in the lower panel indicate the average number of offspring and average proportion of sterile males (defined as producing fewer than five offspring) for all 1P-YFP genotypes that carry D. mauritiana alleles at each genotyped SNP.

Figure 2—source data 1. Source data for Figure 2, Figure 2—figure supplement 1, Figure 4.
DOI: 10.7554/eLife.35468.011

Figure 2—figure supplement 1. SNP locations and inferred ancestry for five recombinant 1P-YFP genotypes.

Red ticks indicate D. simulans alleles (par1), blue ticks indicate D. mauritiana alleles (par2), and the red (blue) shaded regions indicate the location of inferred D. simulans (D. mauritiana) ancestry.

Table 3. Distribution of 1P-YFP recombinant introgression lengths.

2P interval  Sequenced  Min size (bp)  Mean size (bp)  Max size (bp)
2P-1         129        295,225        2,617,833       6,322,871
2P-3         73         306,052        1,636,944       3,818,569
2P-4         55         226,018        1,482,659       2,917,578
2P-5         61         365,004        1,627,632       3,276,930
2P-6         55         692,350        2,400,499       4,764,204
2P-7         66         218,722        1,412,108       2,502,552
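
As a consistency check, the overall mean introgression length reported in the text can be recovered from Table 3 by weighting each interval's mean by its number of sequenced genotypes. The following is a minimal Python sketch and not the authors' analysis code; the names `TABLE3` and `pooled_mean_bp` are illustrative, and sizes are assumed to be in base pairs.

```python
# Table 3 rows: interval -> (number of sequenced genotypes, mean size in bp).
# Values are copied from the table; the units (bp) are an assumption.
TABLE3 = {
    "2P-1": (129, 2_617_833),
    "2P-3": (73, 1_636_944),
    "2P-4": (55, 1_482_659),
    "2P-5": (61, 1_627_632),
    "2P-6": (55, 2_400_499),
    "2P-7": (66, 1_412_108),
}


def pooled_mean_bp(table):
    """Return (total genotypes, mean length weighted by genotypes per interval)."""
    total_n = sum(n for n, _ in table.values())
    total_bp = sum(n * mean_bp for n, mean_bp in table.values())
    return total_n, total_bp / total_n


n_genotypes, mean_bp = pooled_mean_bp(TABLE3)
print(n_genotypes, f"{mean_bp / 1e6:.2f} Mbp")
```

The weighted mean agrees with the 439 sequenced genotypes and the mean length of about 1.97 Mbp given in the text.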

Figure 3. QTL analysis of male fertility.

Mean offspring counts for each genotype were transformed as log10(N + 1). The top plot shows lod scores for a two-part model that treats completely sterile genotypes as one class, and tests for quantitative effects on fertility among non-sterile genotypes. The solid and dotted gray lines indicate 5% and 1% significance thresholds, respectively, determined from 10,000 permutations. The bottom plot shows the estimated effects of D. simulans and D. mauritiana alleles at QTL placed every 1 cM (bounding lines indicate 95% confidence intervals).

DOI: 10.7554/eLife.35468.015
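
The permutation thresholds mentioned in the Figure 3 caption can be illustrated with a deliberately simplified sketch. It is not the two-part QTL model used for the figure: the test statistic here is an absolute difference in mean log10(N + 1) fertility between allele classes rather than a lod score, and the offspring counts and marker genotypes are hypothetical toy data. The logic it shares with the caption is that phenotypes are shuffled across genotypes, the maximum statistic over all markers is recorded for each shuffle, and the threshold is an upper quantile of that null distribution.

```python
import math
import random


def log_fertility(n_offspring):
    """Transform a mean offspring count as log10(N + 1)."""
    return math.log10(n_offspring + 1)


def marker_effect(phenotypes, genotypes):
    """Absolute difference in mean phenotype between allele classes (1 vs 0)."""
    carriers = [p for p, g in zip(phenotypes, genotypes) if g == 1]
    others = [p for p, g in zip(phenotypes, genotypes) if g == 0]
    if not carriers or not others:
        return 0.0
    return abs(sum(carriers) / len(carriers) - sum(others) / len(others))


def max_effect(phenotypes, markers):
    """Largest marker effect across all markers (the scan-wide statistic)."""
    return max(marker_effect(phenotypes, m) for m in markers)


def permutation_threshold(phenotypes, markers, n_perm=1000, alpha=0.05, seed=0):
    """Upper (1 - alpha) quantile of the scan-wide statistic under shuffling."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        null.append(max_effect(shuffled, markers))
    null.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return null[index]


# Hypothetical toy data: 12 genotypes, 3 markers (1 = D. mauritiana allele).
offspring = [0, 0, 1, 2, 0, 3, 80, 95, 70, 110, 60, 88]
markers = [
    [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # tracks the low-fertility genotypes
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
]
phenotypes = [log_fertility(n) for n in offspring]
observed = max_effect(phenotypes, markers)
threshold = permutation_threshold(phenotypes, markers)
print(f"observed={observed:.2f} threshold={threshold:.2f}")
```

Taking the maximum over markers within each permutation is what makes the threshold scan-wide rather than per-marker.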

Figure 3—figure supplement 1. Alternate QTL models of male fertility.

Lod scores are shown for models where offspring counts for each genotype were modeled as a normally distributed variable (normal), log10(N + 1) offspring counts were modeled as a normally distributed variable (normal (log)), or offspring counts were modeled in two parts, treating completely sterile genotypes as one class and testing for quantitative effects on fertility among non-sterile genotypes. Horizontal lines indicate 1% significance thresholds determined from 10,000 permutations.

Figure 3—figure supplement 2. QTL analysis of male fertility incorporating introgression length as a covariate.

Lod scores are shown for analyses where log10(N + 1) offspring counts were treated as a normally distributed variable, without and with introgression length in base-pairs as a covariate. Horizontal lines indicate 1% significance thresholds determined from 10,000 permutations.

Sex ratio distortion revealed through experimental introgression

Among fertile 1P-YFP males, progeny sex ratios were skewed toward a slight excess of sons: the mean proportion of daughters was 0.45, and 86% of fertile 1P-YFP genotypes (260/303) produced fewer than 50% daughters (Figure 4). These skewed sex ratios are at least partially attributable to effects of the sim wXD1 genetic background, as a similar male bias was observed among progeny of control sim wXD1 males (mean proportion females = 0.46, n = 35 sires, t-test vs. null hypothesis of 0.5, p=0.005). We observe a significant positive correlation between fertility and progeny sex-ratio among both sim wXD1 and introgression genotypes (ρ = 0.44, p=0.009; ρ = 0.21, p=0.0002, respectively); males that sire fewer progeny sire a lower proportion of daughters (Figure 4—figure supplement 1). However, there is some evidence that introgressed D. mauritiana alleles modify this modest male bias: across all fertile introgression genotypes, there is a significant negative correlation between the length of the introgressed D. mauritiana segment and the proportion of female progeny produced by that genotype (ρ = −0.31, p<0.0001, Figure 4—figure supplement 2). This effect seems to be independent of the effects of introgressed alleles on fertility as the partial correlation between progeny sex-ratio and introgression length remains unchanged after taking into account the effect of fertility (ρ = −0.31, p<0.0001; Figure 4—figure supplement 2). One interpretation of these results is that the Y chromosome of sim wXD1 causes weak segregation distortion, and the intensity of distortion is modified by X-linked alleles at multiple loci from D. mauritiana.

Figure 4. High-resolution map of progeny sex ratios among fertile 1P-YFP introgression male genotypes.

Colored horizontal bars indicate the extent of introgressed D. mauritiana alleles for each fertile recombinant 1P-YFP X chromosome. The color of each introgression indicates the sex-ratio of progeny from replicate males carrying that 1P-YFP X chromosome. The line below indicates the average progeny sex-ratio for all 1P-YFP genotypes that carry D. mauritiana alleles at each genotyped SNP.

Figure 4—figure supplement 1. Relationship between progeny number and sex-ratio.

The top panel shows number of progeny and the percentage of daughters for all recombinant 1P-YFP males that produced any offspring and 40 control D. simulans wXD1 males. The bottom panel shows mean number of progeny and mean progeny sex-ratio for all recombinant 1P-YFP genotypes. In all cases, there is a significant positive correlation between fertility and progeny sex-ratio (1P-YFP males: ρ = 0.12, p<0.0001; wXD1 males: ρ = 0.44, p=0.009; 1P-YFP genotypes: ρ = 0.21, p=0.0002).

Figure 4—figure supplement 2. Relationship between introgression length, fertility, and sex-ratio.

Partial correlation coefficients among these three variables (length, fertility: ρ = 0.03, p=0.67; fertility, sex-ratio: ρ = 0.16, p=0.02; length, sex-ratio: ρ = −0.31, p<0.0001). Trendline corresponds to linear regression of progeny sex-ratio on introgression length: sex-ratio = 0.487 − 0.022 × length.

Figure 4—figure supplement 3. QTL analysis of progeny sex ratio associated with introgression genotypes.

Top panel includes all males that produced any offspring; bottom panel includes only males that sired more than four offspring and genotypes with at least three males that sired more than four offspring. Grey lines indicate results using all genotypes that met the above criterion; black lines indicate results excluding a single outlier genotype. Solid and dotted lines indicate 5% and 1% significance thresholds determined from 10,000 random permutations, respectively.

Although the majority of fertile 1P-YFP genotypes sired male-biased progeny, introgressions that included the distal end of the 2P-5 region sired female-biased progeny (Figure 4). QTL analysis of progeny sex ratio confirms a significant peak in the distal portion of 2P-5 (Figure 4—figure supplement 3). The estimated effect of this QTL on progeny sex ratios is 54.6% daughters for the mauritiana allele versus 42.5% daughters for the simulans allele. These results are consistent with the existence of a cryptic (normally-suppressed) X-linked drive allele in D. mauritiana that is released in a D. simulans genetic background, as the D. mauritiana w12 strain used to generate the 2P introgressions produces slightly male-biased progeny sex-ratios using the same fertility assay (one male paired with three D. simulans wXD1 females, n = 10 sires, mean sex-ratio = 0.47, t-test vs. D. simulans wXD1, p=0.4). This region of the X chromosome does not contain any previously mapped meiotic drive loci in D. simulans (Montchamp-Moreau et al., 2006; Tao et al., 2007a; Helleu et al., 2016), suggesting that our experiments have uncovered a novel cryptic drive locus and provide the first evidence of cryptic X-chromosome drive in D. mauritiana.

Population genomics of speciation history

The high density of hybrid male sterility factors and the presence of cryptic drive systems on the X chromosome are expected to influence patterns of gene flow between D. mauritiana and D. simulans. We therefore analyzed whole-genome variation within and between 10 D. mauritiana strains from Mauritius (Garrigan et al., 2014) and 20 D. simulans strains, including nine from Madagascar, ten from Kenya, and one from North America (Rogers et al., 2014; Hu et al., 2013). These data allow us to characterize differentiation and identify genomic regions with aberrant genealogical histories consistent with recent interspecific introgression. The analyses reported here complement earlier studies that characterized interspecific divergence (Garrigan et al., 2012), polymorphism within D. mauritiana (Garrigan et al., 2014; Nolte et al., 2013), and polymorphism within D. simulans (Begun et al., 2007; Rogers et al., 2014). Below we present genome-wide population genetic analyses using non-overlapping 10-kb windows (unless otherwise stated; see Materials and methods).

Polymorphism

Our genome-wide analyses provide multiple indicators that the island-endemic D. mauritiana has a smaller effective population size than D. simulans (Table 4), consistent with previous multi-locus analyses (Hey and Kliman, 1993; Kliman et al., 2000). Compared to D. simulans, total polymorphism (Nei and Li, 1979) in D. mauritiana is 32% lower on the X chromosome and 19% lower on the autosomes (Figure 5—figure supplement 1). The X/autosome ratio of polymorphism is thus lower in D. mauritiana (0.656) than in D. simulans (0.778) and lower than the 3/4 expected for a random mating population with a 1:1 sex ratio (Garrigan et al., 2014). A substantial fraction of extant polymorphisms in both species arose in their common ancestor, reflecting the large effective population sizes of both species and relatively recent species split time (see Materials and methods). Compared to D. simulans, however, D. mauritiana has retained 74.4% as many ancestral polymorphisms and accumulated just 46.3% as many derived polymorphisms. The site frequency spectra (Tajima, 1989) in D. mauritiana are less skewed toward rare variants than in D. simulans, and average linkage disequilibrium (Kelly, 1997) is twofold higher. Overall, these findings show that, relative to D. simulans, D. mauritiana has lower nucleotide diversity; retained fewer ancestral SNPs; accumulated fewer derived SNPs; a less negatively skewed site frequency spectrum; and greater linkage disequilibrium—all patterns consistent with a historically smaller effective population size in D. mauritiana than in D. simulans.

Table 4. Population genomics summary statistics.

Inference               Statistic*                    D. simulans  D. mauritiana  P-value
Polymorphism            median πX                     0.0119       0.0076         <0.0001
Polymorphism            median πA                     0.0152       0.0116         <0.0001
Polymorphism            SNPs with inferred ancestry†  4,324,740    2,181,959      <0.0001§
Polymorphism            % ancestral SNPs              14.6         21.6           <0.0001#
Polymorphism            % derived SNPs                85.3         78.3
Site frequency spectra  median Tajima's DX            −1.218       −0.536         <0.0001‡
Site frequency spectra  median Tajima's DA            −1.127       −0.359         <0.0001‡
Linkage disequilibrium  median Zns, X                 0.056        0.122          <0.0001‡
Linkage disequilibrium  median Zns, A                 0.058        0.129          <0.0001‡

*Summary statistics estimated from 10-kb non-overlapping windows.

†SNPs were inferred as ancestral or derived using parsimony, with D. melanogaster as an outgroup (see Materials and methods).

‡P-value for Mann-Whitney U-test.

§P-value for χ2-test.

#P-value from Fisher's exact test.
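
The relative numbers of ancestral and derived polymorphisms quoted in the Polymorphism section can be approximately recovered from Table 4 by multiplying each species' SNP count by its percentage of ancestral or derived SNPs. The sketch below is illustrative only (the names `TABLE4`, `snp_count` and `mau_relative_to_sim` are not from the paper), and because the table percentages are rounded, the result matches the text only to about two significant figures.

```python
# Values copied from Table 4; percentages are rounded in the table.
TABLE4 = {
    "D. simulans": {"snps": 4_324_740, "pct_ancestral": 14.6, "pct_derived": 85.3},
    "D. mauritiana": {"snps": 2_181_959, "pct_ancestral": 21.6, "pct_derived": 78.3},
}


def snp_count(species, kind):
    """Approximate number of SNPs of a given kind ('ancestral' or 'derived')."""
    row = TABLE4[species]
    return row["snps"] * row[f"pct_{kind}"] / 100


def mau_relative_to_sim(kind):
    """D. mauritiana SNP count of this kind as a fraction of the D. simulans count."""
    return snp_count("D. mauritiana", kind) / snp_count("D. simulans", kind)


ancestral_ratio = mau_relative_to_sim("ancestral")
derived_ratio = mau_relative_to_sim("derived")
print(f"ancestral: {ancestral_ratio:.2f}, derived: {derived_ratio:.2f}")
```

Both ratios fall close to the 74.4% and 46.3% given in the text, with the small difference for ancestral SNPs attributable to rounding in the table.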

Divergence and differentiation

Net divergence levels between species are comparable to diversity levels within species. The median number of pairwise differences per site (DXY) between the two species, estimated in non-overlapping 10-kb windows, is 0.010 for the X chromosome and 0.013 for the autosomes. However, as the X chromosome has lower levels of polymorphism within species, the median net divergence (DA) between species is 0.0007 for the X (mean DA = 0.0007) and −0.0005 (mean DA = −0.0006) for the autosomes (a negative value of DA on the autosomes occurs because, on average, levels of within-species polymorphism exceed levels of between-species divergence). DA is significantly greater on the X chromosome than the autosomes (p<0.0001 for both medians and means). Allele frequency differentiation is also higher for the X chromosome (median FST = 0.378) than the autosomes (median FST = 0.279; Mann-Whitney U-test, p<0.0001). These FST estimates imply that, for X-linked and autosomal loci, the mean times to coalescence for two gene copies sampled from the different species are 2.2- and 1.8-fold deeper than the mean coalescence times for two gene copies within-species, respectively (Slatkin, 1993).
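
The fold-differences in coalescence time can be reproduced from the median FST values under one explicit assumption: that for two populations FST = (t_b − t_w)/(t_b + t_w), where t_b and t_w are the mean coalescence times for two gene copies sampled between and within species. Rearranging gives t_b/t_w = (1 + FST)/(1 − FST). The sketch below encodes only that rearrangement; the function name is illustrative and the two-population form is an assumption about how the cited relation was applied.

```python
def coalescence_ratio(fst):
    """Between/within mean coalescence time ratio implied by F_ST.

    Assumes the two-population relation F_ST = (t_b - t_w) / (t_b + t_w),
    which rearranges to t_b / t_w = (1 + F_ST) / (1 - F_ST).
    """
    if not 0 <= fst < 1:
        raise ValueError("F_ST must lie in [0, 1)")
    return (1 + fst) / (1 - fst)


ratio_x = coalescence_ratio(0.378)  # median F_ST, X chromosome
ratio_a = coalescence_ratio(0.279)  # median F_ST, autosomes
print(f"X: {ratio_x:.1f}-fold, autosomes: {ratio_a:.1f}-fold")
```

Under this assumption the X-linked and autosomal values round to the 2.2- and 1.8-fold figures stated above.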

Recent interspecific gene flow and introgression

Gene flow between D. mauritiana and D. simulans has been rare during their speciation history, with an apparent recent increase (Garrigan et al., 2012). To identify genomic regions that have introgressed between species in the recent past, we used the Gmin statistic, the ratio of the minimum pairwise sequence distance between species to the average pairwise distance between species (min[DXY]/D̄XY; Geneva et al., 2015). As populations diverge without gene flow, all loci in the genome gradually approach reciprocal monophyly, leaving just one ancestral lineage from each population available for coalescence in the ancestral population. Consequently, the minimum distance (numerator) equals the mean pairwise distance (denominator), causing Gmin→1 with zero variance. Conversely, Gmin is small when the minimum distance is small relative to the mean pairwise distance. Gmin is therefore sensitive to genealogical configurations resulting from recent gene flow, particularly when introgressed haplotypes segregate at low to intermediate population frequency in at least one of the populations (Geneva et al., 2015). Importantly, Gmin distinguishes genealogies produced by introgression from those produced by incomplete lineage sorting. Between D. mauritiana and D. simulans, we find that median Gmin (±median absolute deviation) estimated for 10-kb windows across the major chromosome arms ranges from 0.761 ± 0.0537 for 3L to 0.785 ± 0.0531 for the X (Figure 5; Kruskal-Wallis test, p<0.0001). As 95% of Gmin values are <0.85, reciprocal monophyly for 10-kb windows is rare.
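
To make the definition of Gmin concrete, the following minimal sketch computes it for a single window from two sets of aligned haplotypes. It is an illustration under simplifying assumptions (equal-length sequences, no missing data, distance measured as the proportion of differing sites), not the implementation used for the analyses; the function names and the toy haplotypes are hypothetical.

```python
from itertools import product


def pairwise_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def gmin(pop1, pop2):
    """Minimum between-population distance divided by the mean distance."""
    distances = [pairwise_distance(a, b) for a, b in product(pop1, pop2)]
    mean_distance = sum(distances) / len(distances)
    if mean_distance == 0:
        raise ValueError("no between-population differences in this window")
    return min(distances) / mean_distance


# Toy window with only fixed differences: every between-species pair is
# equally distant, so the minimum equals the mean.
sim = ["AAAAAAAAAA", "AAAAAAAAAA"]
mau_isolated = ["AAAAATTTTT", "AAAAATTTTT"]

# Same window, but one mauritiana haplotype closely resembles simulans,
# as expected after recent introgression.
mau_introgressed = ["AAAAATTTTT", "AAAAATTTTT", "AAAAAAAAAT"]

gmin_isolated = gmin(sim, mau_isolated)
gmin_introgressed = gmin(sim, mau_introgressed)
print(gmin_isolated, round(gmin_introgressed, 3))
```

In the toy example a single low-frequency haplotype pulls the minimum distance far below the mean, which is the configuration the statistic is designed to detect.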

Figure 5. Identification of introgressed regions by Gmin.

Grey (black) dots indicate Gmin values calculated using 5-kb (10-kb) windows; light blue (dark blue) dots indicate 5-kb (10-kb) windows with significant Gmin values. As with 10-kb windows, 5-kb windows with significant Gmin values are 4-fold underrepresented on the X chromosome: 14 of 3603 5-kb windows on the X chromosome (0.39%) have significant Gmin values versus 266 of 17,065 5-kb windows on the autosomes (1.56%; Fisher’s exact test p<0.0001).

Figure 5—source data 1. Source data for Figure 5—figure supplements 1 and 2.
DOI: 10.7554/eLife.35468.025
Figure 5—source data 2. Source data for Figure 5.
DOI: 10.7554/eLife.35468.026

Figure 5—figure supplement 1. Population genomic scans for polymorphism, divergence, and introgression in 10-kb windows.

The rows of panels show: nucleotide diversity for a sample of 10 inbred strains of D. mauritiana (πmau, green dots) and 20 inbred strains of D. simulans (πsim, purple dots); nucleotide divergence scaled by within-species polymorphism (blue dots); and Gmin (red dots), the ratio of the minimum number of nucleotide differences per site between D. mauritiana and D. simulans to the average number of differences per site, a summary statistic that is sensitive to introgression. Panels correspond to each major chromosome arm, with genome coordinates on the x-axis.
Figure 5—figure supplement 2. Polymorphism and Gmin.

Within both D. simulans and D. mauritiana there is a significant negative correlation between polymorphism (π) and Gmin P-value (Spearman's ρ = 0.22 and 0.38, respectively, p<0.0001), indicating that windows with higher polymorphism are more likely to have low Gmin values, although this correlation is driven by the large majority of non-significant windows. However, 10-kb windows with significant Gmin values have lower levels of polymorphism in D. simulans than non-significant windows, while significant windows have higher levels of polymorphism in D. mauritiana than non-significant windows (Wilcoxon rank test p<0.0001 within both species). One interpretation of this pattern is that windows with significant Gmin values have levels of polymorphism similar to that in the other species, which is consistent with these windows carrying lineages derived from the other species.

To identify 10-kb outlier windows that have genealogical histories inconsistent with strict allopatric divergence, we used a Monte Carlo simulation procedure that assumes a constant species divergence time across all 10-kb intervals, separately for the X and the autosomes (see Materials and methods). In total, 196 of the 10,443 10-kb windows (1.9%) have a more recent common ancestry between D. mauritiana and D. simulans than expected under a strict allopatric divergence model, as indicated by significantly low values of Gmin (P ≤ 0.001, corresponding to a genome-wide false discovery rate of 5%). As Gmin is a ratio, significantly small Gmin values could result from unusually small numerators (minimum DXY) or unusually large denominators (D̄XY). We find that 10-kb windows with significant Gmin values have smaller median minimum DXY (0.0056 in introgression windows versus 0.0094 genome-wide, PMWU <0.0001) as well as smaller median D̄XY (0.0110 in introgression windows versus 0.0124 genome-wide, PMWU <0.0001), indicating that the significant Gmin values are due to unusually small minimum DXY values. The smaller D̄XY of windows with significant Gmin reflects the contribution of the introgressed, low-distance haplotypes to the overall average pairwise distance between species.

Introgression windows are 4.4-fold underrepresented on the X chromosome: only nine of 1842 10-kb windows on the X chromosome (0.49%) have significant Gmin values versus 187 of 8601 10-kb windows on the autosomes (2.17%; Fisher’s exact test p<0.0001). However, not all 10-kb introgression windows are independent: 169 of the 196 significant 10-kb windows (86.2%) can be arrayed into contiguous (or nearly contiguous) genomic regions (see Materials and methods). As a result, we infer 27 small (10-kb) introgressions and 21 larger introgressions ranging in size from 20 kb to 280 kb (Supplementary file 1). Of these 48 total introgressions, only one is on the X chromosome and 47 are on autosomes (χ2-test, p=0.0124). The lengths of these introgressed haplotypes depend on their time spent in the receiving population and on the local recombination rate. First, recombination has eroded introgression sizes over time, with longer, presumably younger, introgressions having smaller average Gmin values (Spearman ρ = −0.6293, p<0.0001) and smaller minimum DXY values (ρ = −0.3677, p=0.0101). Second, local recombination rate has been an important factor in determining introgression lengths, with relatively long introgressions tending to reside in chromosomal environments with low rates of crossing over (ρ = −0.366, p=0.0105).
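The window counts reported above are enough to recompute the fold difference and to check the order of magnitude of the Fisher's exact test. The sketch below uses only the Python standard library; the two-sided test is a generic textbook implementation (summing hypergeometric probabilities no larger than that of the observed table), which is an assumption about the test variant rather than a description of the software the authors used.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_prob(x: int) -> float:
        # Hypergeometric log-probability of x in the top-left cell.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(total, col1)

    observed = log_prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p = sum(
        exp(lp)
        for lp in (log_prob(x) for x in range(lowest, highest + 1))
        if lp <= observed + 1e-9  # tables at least as extreme as observed
    )
    return min(1.0, p)


# Counts of significant Gmin windows reported in the text (10-kb windows).
x_sig, x_total = 9, 1842
auto_sig, auto_total = 187, 8601

fold = (auto_sig / auto_total) / (x_sig / x_total)
p_value = fisher_exact_two_sided(
    x_sig, x_total - x_sig, auto_sig, auto_total - auto_sig
)
print(f"fold underrepresentation on the X: {fold:.2f}")
print(f"p < 0.0001: {p_value < 1e-4}")
```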

To complement our distance-based Gmin analyses, we also used a genealogy-based four-population (ABBA-BABA) test, summarized by Patterson’s D-statistic (Green et al., 2010; Durand et al., 2011), to evaluate the distribution of shared derived variants between D. mauritiana and D. simulans. Assuming a (((D. sechellia, D. simulans), D. mauritiana), D. melanogaster) tree topology, the null expectation is that a history involving zero gene flow should result in approximately equal numbers of ABBA and BABA nucleotide site configurations via lineage sorting, where A and B correspond to ancestral and derived states, respectively (Green et al., 2010; Durand et al., 2011). Instead, we find that D = 0.0812 (s.e. = 0.0033; block jackknife with 1 Mb blocks) across the genome, indicating a significant excess of shared derived sites between D. simulans and D. mauritiana compared to D. sechellia and D. mauritiana. These findings provide complementary support for a history of interspecific gene flow between D. mauritiana and D. simulans.
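The logic of the ABBA-BABA test can be sketched with the allele-frequency form of Patterson's D (Durand et al., 2011), taking P1 = D. sechellia, P2 = D. simulans, P3 = D. mauritiana, and an outgroup fixed for the ancestral allele. The input layout, function name, and toy frequencies below are hypothetical. The use of frequency-weighted counts is an assumption, and the z-score simply divides the reported D by its reported jackknife standard error.

```python
def patterson_d(sites):
    """Patterson's D from per-site derived-allele frequencies.

    Each site is a tuple (p1, p2, p3) of derived-allele frequencies in
    P1, P2, and P3; the outgroup is assumed fixed for the ancestral allele.
    Returns (D, ABBA, BABA).
    """
    abba = sum((1 - p1) * p2 * p3 for p1, p2, p3 in sites)  # P2 and P3 share
    baba = sum(p1 * (1 - p2) * p3 for p1, p2, p3 in sites)  # P1 and P3 share
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba), abba, baba


# Toy data: three ABBA sites, one BABA site, one polymorphic site.
toy_sites = [(0, 1, 1), (0, 1, 1), (0, 1, 1), (1, 0, 1), (0.5, 0.5, 1.0)]
d_toy, abba_toy, baba_toy = patterson_d(toy_sites)
print(f"toy D = {d_toy:.3f} (ABBA = {abba_toy}, BABA = {baba_toy})")

# Genome-wide values reported in the text.
d_genome, se_genome = 0.0812, 0.0033
z_genome = d_genome / se_genome
print(f"genome-wide z = {z_genome:.1f}")
```

A positive D in this orientation indicates more derived alleles shared between P2 and P3 than between P1 and P3, which is the direction of the excess described above.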

Interspecific introgression of the cryptic Winters sex-ratio drive system

The single introgression detected on the X chromosome corresponds to a ~130-kb region that comprises eight protein-coding genes plus the Winters sex-ratio meiotic drive genes, Distorter on the X (Dox) and its progenitor gene, Mother of Dox (MDox) (Tao et al., 2007a) (Figure 6). The median Gmin value across this 130-kb region is 0.333, a ~2.4-fold reduction relative to background Gmin on the X chromosome (PMWU <0.0001). The most extreme 10-kb window within the 130-kb region has a minimum DXY value (=0.00087) that is 92% smaller than the X chromosome-wide D̄XY, implying that introgression occurred in the recent past. The 130-kb region is also an outlier with respect to Patterson’s D statistic: we observe 90.2 (72%) ABBA sites versus just 35.2 (28%) BABA sites in the region (D = 0.4382), whereas a significantly different configuration of ABBA and BABA sites occurs on the X chromosome outside the 130-kb region (9774.6 [55%] and 7911.1 [45%], respectively; D = 0.1054; χ2-test, p=0.00027). The elevated value of D within the 130-kb region indicates a significant excess of derived nucleotide variants shared between D. simulans and D. mauritiana compared to genomic background levels. Given the evidence from both distance- and genealogy-based analyses, we conclude that this 130-kb haplotype has a history of recent gene flow between species. In D. simulans, when unsuppressed, MDox and Dox cause biased transmission of the X chromosome during spermatogenesis, with male carriers siring more than 80% daughters (Tao et al., 2007a). These drivers are suppressed by an autosomal gene, Not much yin (Nmy), a retrotransposed copy of Dox that is a source of endogenous siRNAs that silence both MDox and Dox (Tao et al., 2007b). In non-African D. simulans populations, Dox, MDox, and Nmy are nearly fixed, although haplotypes lacking functional copies of the genes segregate at low frequencies (Kingan et al., 2010). All three loci have histories consistent with selective sweeps in multiple populations of D. simulans due to the presumed transmission advantage at MDox and Dox and the associated selective advantages of suppressing drive and restoring equal sex ratios at Nmy (Kingan et al., 2010). We estimated the probability that a random X-linked 130-kb introgression might include Dox and MDox by chance by permuting the location of a 130-kb segment on the X chromosome. Out of 100,000 such random permutations, 356 included Dox and MDox (p=0.004). We hypothesize that the signature of recent introgression at these sex-ratio distorters is not coincidental, but rather that introgression was mediated by their biased transmission through males.
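The segment-placement permutation described above can be sketched as follows. The chromosome length and the coordinates of the two drive loci are hypothetical placeholders (the text does not give them here), as is the assumption that segment start positions are drawn uniformly along the chromosome. The example only shows the mechanics of the procedure and how the Monte Carlo estimate relates to the closed-form expectation.

```python
import random


def segment_capture_probability(chrom_len, seg_len, loci, n_perm=100_000, seed=1):
    """Fraction of random segment placements that contain every locus."""
    rng = random.Random(seed)
    first, last = min(loci), max(loci)
    hits = 0
    for _ in range(n_perm):
        start = rng.randrange(0, chrom_len - seg_len + 1)
        if start <= first and last < start + seg_len:
            hits += 1
    return hits / n_perm


# Hypothetical values, for illustration only.
CHROM_LEN = 21_000_000           # assumed X chromosome length (bp)
SEG_LEN = 130_000                # introgression length from the text
LOCI = (9_500_000, 9_560_000)    # assumed positions of MDox and Dox

estimate = segment_capture_probability(CHROM_LEN, SEG_LEN, LOCI)

# Closed-form expectation under uniform placement: the number of start
# positions that cover both loci, divided by all allowed start positions.
span = max(LOCI) - min(LOCI)
expected = (SEG_LEN - span) / (CHROM_LEN - SEG_LEN + 1)
print(f"Monte Carlo: {estimate:.5f}, analytic: {expected:.5f}")
```

With these placeholder coordinates the probability is a few tenths of a percent, the same order as the 356 of 100,000 placements reported above.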

Figure 6. Natural introgression of the MDox-Dox region of the X chromosome.

(A) Gmin values for 10-kb windows in the region containing MDox and Dox. Blue lines indicate windows with significantly low Gmin values. Inset box indicates the 90-kb region shown in panel B. (B) DNA polymorphism tables: the top table corresponds to the MDox region, and the bottom corresponds to the Dox region. Within the tables, yellow squares denote the derived nucleotide state, and blue squares indicate the ancestral state. The top 20 rows of each table correspond to the D. simulans samples, and the bottom 10 rows correspond to the D. mauritiana samples. The genome map between the polymorphism tables shows gene models for the region (orange boxes) and the locations of the MDox and Dox genes (green triangles). Regions highlighted in red are 10-kb windows with significantly low Gmin values. (C) Maximum likelihood phylogenetic trees for the MDox and Dox regions. Green circles and red triangles denote D. mauritiana and D. simulans samples, respectively.

Figure 6—source data 1. Source data for Figure 6.
DOI: 10.7554/eLife.35468.028

Maximum-likelihood phylogenetic trees for the 130-kb MDox-Dox region show reduced diversity within D. mauritiana and reduced divergence between the two species (Figure 6). Among the 10 D. mauritiana sequences, nucleotide diversity is just 24% (π = 0.0018) of background diversity levels on the X chromosome, corresponding to a massive selective sweep in the D. mauritiana genome (PMWU <0.0001; see also Nolte et al., 2013; Garrigan et al., 2014). The distribution of variability among haplotypes in the D. simulans samples is consistent with a parallel, albeit incomplete, selective sweep (Figure 6).

To determine if the MDox and/or Dox drive elements are associated with introgression between species and the selective sweeps within each species, we determined MDox and Dox presence/absence status for each line using diagnostic restriction digests (see Materials and methods). In contrast to previous work showing that MDox and Dox are nearly fixed among D. simulans samples collected outside of Africa (Kingan et al., 2010), we find that the drivers are at lower frequency among our 19 African samples (9 Madagascar, 10 Kenya): five have MDox (26%), five have Dox (26%), and only one has both genes (5%; NS33; Supplementary file 2). Despite these low frequencies, MDox and Dox are overrepresented among the haplotypes shared between species: 6 of the 7 shared haplotypes have MDox and/or Dox (Fisher’s Exact PFET = 0.0018), and 2 of the 7 possess both drivers (PFET = 0.0158; n = 19 African samples, plus the reference strain, D. simulans w501, which has both). In D. mauritiana, all 10 lines have MDox, but only two have Dox (Figure 6; Supplementary file 2). RT-PCR shows that MDox is expressed in testes from both species (see Materials and methods), confirming its potential activity. These findings provide support for the hypothesis that segregation distortion mediated by Dox and (transcriptionally active) MDox genes was responsible for introgression and the parallel sweeps at this locus.

Notably, the large MDox-Dox introgression and its associated sweep co-localize with one of the three regions of the X chromosome that, in our mapping experiments, fails to cause male sterility when introgressed from D. mauritiana into D. simulans (Figure 2). These observations suggest that a driving haplotype moved between species and swept to high frequency in D. simulans and fixation in D. mauritiana, thereby reducing local sequence divergence between species. This discovery has two implications. First, the MDox-Dox region is the only locus on the X chromosome to have recently escaped from its linked hybrid incompatibility factors and introgressed between species. Second, by sweeping to high frequency or fixation, the MDox-Dox drive element region reduced local divergence between species and, incidentally, undermined the accumulation of genetic incompatibilities that might cause hybrid male sterility.

Discussion

Our combined genetic and population genomics analysis of hybrid male sterility and gene flow between D. mauritiana and D. simulans yields three findings. First, we confirm the rapid accumulation of X-linked hybrid male sterility between these species and map four major sterility factors to small (<700 kb) intervals (Figure 2). Second, we find that very recent natural introgression has occurred between these species, albeit almost exclusively on the autosomes, consistent with a large X-effect on gene flow (Supplementary file 1). Third, we discover new roles for meiotic drive during the history of speciation between these species. Some drive seems to be associated with functional divergence between species: one region of the D. mauritiana X chromosome appears to cause segregation distortion in a D. simulans genetic background. In contrast, the well-characterized X-linked Winters sex ratio distorters, MDox and Dox, have clearly migrated between species, reducing local interspecific divergence. Together, these two observations suggest that genetic conflict may both promote and undermine the special role of sex chromosomes in speciation.

Genetic basis of X-linked hybrid male sterility

Our genetic analyses were initiated by introgression of six different regions of the D. mauritiana X chromosome into a pure D. simulans genetic background. All six regions cause complete hybrid male sterility and therefore carry at least one, or a combination of, D. mauritiana allele(s) that disrupt spermatogenesis due to incompatibilities with X-linked, Y-linked, or autosomal D. simulans alleles. Only three large (>2 Mb) regions of the D. mauritiana X are readily exchangeable between species, permitting male fertility in a D. simulans genome. Thus, after only ~250,000 years, sufficient X-linked hybrid male sterility has accumulated to render most of the D. mauritiana X chromosome male-sterile on a D. simulans genetic background (True et al., 1996b). Most of the D. mauritiana X chromosome is male-sterile in a D. sechellia genome as well (Masly and Presgraves, 2007). The combination of such extensive reproductive isolation with such modest genetic divergence makes this species group an ideal system to study the genetic basis of speciation.

We were able to define four small regions (<700 kb), each sufficient to cause complete male sterility (Figure 2), suggesting that these may contain single, strong sterility factors. We also find a large region spanning most of 2P-6 from which we were unable to recover fertile 1P-YFP recombinants. We infer that 2P-6 contains a minimum of two strong sterility regions, one tightly linked to each of the flanking P-elements (Figure 3). While our 2P mapping scheme is designed to facilitate the identification of male sterility factors, the 2P-6 interval highlights one of its limitations: in regions like 2P-6, for which strong sterility factors are very close to both flanking P-elements, we cannot determine how many additional sterility factors might localize to the middle of the interval. The present experiments therefore provide only a minimum estimate of the total number of hybrid male sterility factors on the X chromosome. We tentatively conclude that, within the fraction of the D. mauritiana X chromosome investigated, there are at least six genetically separable regions, each individually sufficient to cause virtually complete male sterility. It is worth noting that these experimental approaches detect relatively large-effect sterility factors under a single set of laboratory conditions. There are likely many hybrid male sterility factors of smaller effect, generally neglected in the lab but easily detected by selection in natural populations and thus able to affect the probability of migration at linked loci.

Genomic signatures of complex speciation with gene flow

The two species studied here are allopatric: D. simulans has never been reported on Mauritius, and D. mauritiana has never been found anywhere other than Mauritius (David et al., 1989; Legrand et al., 2011). D. mauritiana appears to have originated from a D. simulans-like ancestor, probably from Madagascar, that migrated and established a population on Mauritius (Hey and Kliman, 1993; Kliman et al., 2000). Our characterization of genome-wide variation within and between D. mauritiana and D. simulans confirms a coalescent history that reaches considerably deeper into the past than the inferred species split time of ~250,000 years (Hey and Kliman, 1993; Kliman et al., 2000). Nested within this largely shared coalescent history, many functional differences have evolved between the two species, including extreme ones that mediate large-effect hybrid incompatibilities. The signatures of gene flow found in the genomes of these species imply recurrent bouts of migration and interbreeding. To introgress between species, immigrating foreign haplotypes must escape their locally disfavored chromosomal backgrounds by recombination before being eliminated by selection against linked incompatibilities and locally maladaptive alleles (Petry, 1983; Bengtsson, 1985; Barton and Bengtsson, 1986). Conditional on escape, the lengths of foreign haplotypes will be subject to gradual erosion by recombination with the resident genetic background.

Here, and in previous work (Garrigan et al., 2012), we detect evidence consistent with weak migration: 2–5% of the genome shows evidence of introgression between D. simulans and D. mauritiana during their recent history. Our population genomic analysis identified 48 segregating foreign haplotypes. We find evidence that the genomic locations and lengths of introgressed foreign haplotypes have been shaped by selection and by recombination in the receiving population. First, selection has likely affected the genomic distribution of foreign haplotypes: only one of the 48 introgressions occurs on the X chromosome. The opportunity for foreign haplotypes on the X chromosome to escape linked incompatibilities via recombination is more constrained than on the autosomes, as the X has a higher density of incompatible alleles, and hemizygous selection eliminates foreign X-linked haplotypes more quickly (Muirhead and Presgraves, 2016). Second, we find that the lengths of introgressed haplotypes depend on local recombination rates: introgressions tend to be longer in chromosomal regions with relatively lower recombination rates. Third, after foreign haplotypes escaped locally deleterious chromosomal backgrounds, recombination eroded their lengths over time: recently introgressed, and hence less diverged, haplotypes tend to be longer. It is worth noting here that the 10-kb windows used for our Gmin scan for foreign haplotypes almost certainly fail to identify very small and/or old introgressions. However, similar results are obtained from Gmin scans using 10-kb and 5-kb windows (Figure 5).

Meiotic drive and complex speciation

The original drive theory posits that hybrid incompatibilities accumulate as incidental by-products of recurrent bouts of meiotic drive and suppression (Hurst and Pomiankowski, 1991; Frank, 1991). Our mapping experiments provide no direct evidence in support of this theory in D. mauritiana and D. simulans, as no hybrid male sterility loci co-localized with sex-ratio loci. Direct genetic evidence that sex-ratio distortion is responsible for the evolution of hybrid male sterility is, however, inherently difficult to obtain, as sterile males produce no offspring, preventing detection of biased sex ratios. Indeed, the dual role of Ovd in hybrid male sterility and sex-ratio distortion in D. pseudoobscura was only detectable because males recover low levels of fertility as they age (Orr and Irving, 2005). Although weakly fertile males (producing fewer than five offspring) were removed from the sex-ratio analyses presented here, these males show no evidence for systematically biased sex ratios (Figure 4—figure supplement 1).

Our genetic mapping experiments have, however, provided new evidence for the accumulation of cryptic sex-ratio drive systems. We mapped a small region of the D. mauritiana X that, when introgressed into a naive D. simulans genetic background, causes modest segregation distortion resulting in female-biased progeny sex ratios (Figure 4). As the D. mauritiana X-drive locus does not map to the location of any of the three cryptic drivers known from D. simulans, we infer that it may be a new, previously undiscovered drive system in D. mauritiana.

Across D. simulans and D. mauritiana, four cryptic drive systems have been identified so far: two X-drive systems in D. simulans (Paris and Durham); one X-drive system in D. mauritiana (see above); and one X-drive system found in both species (Winters; see below). We regard this as a minimum for several reasons. First, weak segregation distortion that may be powerful in natural populations can go undetected in laboratory experiments. Second, cryptic drive systems may not be fixed within species, and our genetic mapping experiments have only surveyed genotypes derived from one strain each of D. mauritiana and D. simulans. Third, no study has yet comprehensively assayed D. simulans material introgressed into a D. mauritiana genetic background. Finally, some cryptic drive alleles might go to fixation and then simply degenerate because, once fixed (or suppressed), a driver is in a race: either suffer mutational decay or acquire a mutation that confers a new bout of drive. These considerations—and the discovery of multiple alternative cryptic drive systems in closely related species—imply that sex chromosome drive is not infrequent during the history of species divergence (Jaenike, 2001).

We have found that the Winters sex-ratio drivers, MDox and Dox, have migrated between these two species. The two drivers are suppressed by the autosomal suppressor, Nmy, which is present in both D. simulans and D. mauritiana (Tao et al., 2007a). The general absence of drive in wild-type genotypes of either species raises one of two possibilities. Either Nmy has evolved quickly to suppress the newly introgressed MDox and Dox alleles or, alternatively, a suppressing allele of Nmy also introgressed between species. We are unable to distinguish these possibilities with the present data, as Nmy resides in a chromosomal region dense with complex repetitive sequences that are refractory to genome assembly using short-read data.

The discovery that the MDox and Dox drivers have moved between species highlights an implicit assumption of the drive theory of the large X-effect—namely, that species evolve in strict allopatry. With gene flow, drive elements (and other selfish genes) have the opportunity to jump species boundaries and undermine divergence in a process analogous to adaptive introgression (Seehausen et al., 2014; Crespi and Nosil, 2013). The t-haplotype has, for instance, introgressed between sub-species of house mouse, Mus musculus (Macaya-Sanz et al., 2011). Between D. mauritiana and D. simulans, the Gmin statistic and the genealogies associated with the MDox-Dox introgressed haplotype (Figure 6) are agnostic on the direction of introgression. Nonetheless, the finding that a drive element crossed a species boundary has important implications for the drive theory explanation of Haldane’s rule and the large X-effect. For MDox and Dox to introgress between species, three things must be true: (1) neither MDox nor Dox alleles from the donor species caused male sterility in the recipient species; (2) no X-linked hybrid male sterility factors were so tightly linked to MDox and Dox as to prevent their eventual escape by recombination into the recipient species genetic background; and (3) any sterility factors located within the introgressed region of the recipient X will have been replaced by foreign alleles. Together, these inferences suggest that a selfish drive system was able to invade a new species by not causing male sterility and, for one X-linked region, may have impeded or undone the evolution of hybrid male sterility.

Materials and methods

Key resources table.

Reagent type (species) or resource | Designation | Source or reference | Identifiers and additional information
Genetic reagent (Drosophila mauritiana) | mau w[12] | Drosophila species stock center; NCBI SRA | 14021–0241.60; SRX684364; SRX135546
Genetic reagent (Drosophila simulans) | sim w[XD1] | this paper | SRR8247551; obtained from J. Coyne
Genetic reagent (Drosophila mauritiana) | 2P-1 | this paper | w[12], P{w[+]=Neneh2}, P{w[+]=4R1}
Genetic reagent (Drosophila mauritiana) | 2P-3 | this paper | w[12], P{w[+]=Ophelia1}, P{w[+]=4J1}
Genetic reagent (Drosophila mauritiana) | 2P-4 | this paper | w[12], P{w[+]=4J1}, P{w[+]=2A1}
Genetic reagent (Drosophila mauritiana) | 2P-5a | this paper | w[12], P{w[+]=2A1}, P{w[+]=ILEA1}
Genetic reagent (Drosophila mauritiana) | 2P-5b | this paper | w[12], P{w[+]=2A1}, P{w[+]=2G3}
Genetic reagent (Drosophila mauritiana) | 2P-6a | this paper | w[12], P{w[+]=2G3}, P{w[+]=A1}
Genetic reagent (Drosophila mauritiana) | 2P-6b | this paper | w[12], P{w[+]=ILEA1}, P{w[+]=A1}
Genetic reagent (Drosophila mauritiana) | 2P-7 | this paper | w[12], P{w[+]=A1}, P{w[+]=3L1}
Genetic reagent (Drosophila simulans) | YFP[175.2] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[356.5] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[377.31] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[52.4] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[277.1] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[926.3] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[16.3] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[360.1] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[433.1] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[19.1] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[21.4] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Genetic reagent (Drosophila simulans) | YFP[458.6] | PMID:28280212 | pBac{3XP3::EYFP-attP}
Sequence-based reagent | Dox_F_1 | this paper | CGAAATGAGACGCTTCTGTG
Sequence-based reagent | Dox_R_1 | this paper | AACCGATACCGTCGTAGTTGAC
Sequence-based reagent | MDox_F_1 | this paper | CCCATTTTGTCCAAGGTCAC
Sequence-based reagent | MDox_R_2 | this paper | AGTTCCGGTCAAAGTGGTTG
Sequence-based reagent | RpS28b_F_1 | this paper | TGGACAAACCAGTTGTGTGG
Sequence-based reagent | RpS28b_R_1 | this paper | AGGAACTCGACCTTCACCTG
Strain (Drosophila simulans) | sim w[501] | PMID:22936249 | 14021–0251.011
Strain (Drosophila simulans) | md06 | NCBI SRA | SRX497551
Strain (Drosophila simulans) | md15 | NCBI SRA | SRX497574
Strain (Drosophila simulans) | md63 | NCBI SRA | SRX497553
Strain (Drosophila simulans) | md73 | NCBI SRA | SRX497563
Strain (Drosophila simulans) | md105 | NCBI SRA | SRX497558
Strain (Drosophila simulans) | md199 | NCBI SRA | SRX497559
Strain (Drosophila simulans) | md221 | NCBI SRA | SRX495510
Strain (Drosophila simulans) | md233 | NCBI SRA | SRX495507
Strain (Drosophila simulans) | md251 | NCBI SRA | SRX497557
Strain (Drosophila simulans) | ns05 | NCBI SRA | SRX497560
Strain (Drosophila simulans) | ns33 | NCBI SRA | SRX497575
Strain (Drosophila simulans) | ns39 | NCBI SRA | SRX497562
Strain (Drosophila simulans) | ns40 | NCBI SRA | SRX497556
Strain (Drosophila simulans) | ns50 | NCBI SRA | SRX497571
Strain (Drosophila simulans) | ns67 | NCBI SRA | SRX497565
Strain (Drosophila simulans) | ns78 | NCBI SRA | SRX497573
Strain (Drosophila simulans) | ns79 | NCBI SRA | SRX497576
Strain (Drosophila simulans) | ns113 | NCBI SRA | SRX497572
Strain (Drosophila simulans) | ns137 | NCBI SRA | SRX497561
Strain (Drosophila mauritiana) | r12 | NCBI SRA | SRX135546
Strain (Drosophila mauritiana) | r23 | NCBI SRA | SRX688576
Strain (Drosophila mauritiana) | r31 | NCBI SRA | SRX688581
Strain (Drosophila mauritiana) | r32 | NCBI SRA | SRX688583
Strain (Drosophila mauritiana) | r39 | NCBI SRA | SRX688588
Strain (Drosophila mauritiana) | r41 | NCBI SRA | SRX688609
Strain (Drosophila mauritiana) | r44 | NCBI SRA | SRX688610
Strain (Drosophila mauritiana) | r56 | NCBI SRA | SRX688612
Strain (Drosophila mauritiana) | r61 | NCBI SRA | SRX688710
Strain (Drosophila mauritiana) | r8 | NCBI SRA | SRX688712

Drosophila husbandry and genetics

All Drosophila crosses and phenotyping were done in parallel in two locations, using standard cornmeal media (Rochester, NY) or minimal cornmeal media (Bloomington, IN) at room temperature (23–25°C). We constructed D. mauritiana ‘2P’ lines that carry pairs of X-linked P-element insertions that contain the mini-white transgene (P[w+]) (True et al., 1996a), which serve as semi-dominant visible genetic eye-color markers and allow us to distinguish individuals carrying 0, 1, or 2 P[w+] insertions. These ‘2P’ regions were then introgressed into the D. simulans wXD1 genetic background through more than 40 generations of repeated backcrossing while following the two P[w+] insertions (Figure 1A). Each 2P introgression line was then bottlenecked through a single female to eliminate segregating variation in the recombination breakpoints flanking the 2P[w+] interval.

We performed meiotic mapping to ascertain the genetic basis of male sterility within each 2P introgression by generating recombinant 1P introgression genotypes (Figure 1B). 2P[w+] females were crossed to D. simulans strains carrying an X-linked pBac[eYFP] transgene (Stern et al., 2017) that served as an additional visible marker. Progeny from this cross were scored for recombinant X chromosomes carrying both pBac[eYFP] and a single P[w+] (1P-YFP). Recombinant 1P-YFP chromosomes were generated using pBac[eYFP] markers both proximal and distal to each 2P introgression. Virgin 1P-YFP females were individually crossed to D. simulans wXD1 males to initiate 1P-YFP strains. Each 1P-YFP X chromosome was then assayed for male fertility. At least 10 individual 1P-YFP males of each genotype were collected 1–2 days post-eclosion and aged 3–5 days, then placed singly in a vial with three virgin D. simulans wXD1 females. After 7 days, both the male and females were discarded, and all offspring emerging from the vial were counted. Additional 1P-YFP males were archived for DNA extraction.

Progeny sex ratios were calculated as the number of female offspring/total number of offspring (% female). Males that sired fewer than five offspring were excluded from sex ratio analyses, as were genotypes with fewer than three males that sired more than four offspring. This resulted in 2538 males and 303 recombinant 1P-YFP chromosomes that were used to estimate progeny sex ratios; 210 recombinant 1P-YFP genotypes had both progeny sex ratio and sequence data.
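The two exclusion rules above translate directly into a small filtering routine. This is a minimal sketch under stated assumptions: the record layout (one tuple per assayed male) and the function name are hypothetical, and averaging per-male ratios within a genotype is an assumption, since the text does not say whether ratios were averaged across males or pooled.

```python
from collections import defaultdict

MIN_OFFSPRING = 5  # males siring fewer than five offspring are excluded
MIN_MALES = 3      # genotypes need at least three qualifying males


def genotype_sex_ratios(records):
    """Mean proportion of female progeny per genotype.

    `records` is an iterable of (genotype, n_female, n_male) tuples,
    one per assayed male (a hypothetical input layout).
    """
    per_genotype = defaultdict(list)
    for genotype, n_female, n_male in records:
        total = n_female + n_male
        if total >= MIN_OFFSPRING:
            per_genotype[genotype].append(n_female / total)
    return {
        genotype: sum(ratios) / len(ratios)
        for genotype, ratios in per_genotype.items()
        if len(ratios) >= MIN_MALES
    }


# Toy data: genotype A keeps three of four males; genotype B has only two.
toy_records = [
    ("A", 30, 20), ("A", 28, 22), ("A", 2, 1), ("A", 25, 25),
    ("B", 10, 10), ("B", 12, 8),
]
toy_ratios = genotype_sex_ratios(toy_records)
print(toy_ratios)
```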

Genotyping recombinant chromosomes by sequencing

We determined the fine-scale genetic architecture of hybrid male sterility within each introgressed region by genotyping recombinant 1P-YFP X chromosomes using multiplexed whole-genome sequencing. DNA extraction and library construction followed published methods for high-throughput sequence analysis of a large number of recombinant genotypes (Andolfatto et al., 2011; Peluffo et al., 2015). Sequence reads were mapped to the reference genome sequence of the D. mauritiana stock used for mapping (mau w12) (Garrigan et al., 2012), the genome sequence of sim wXD1, and the D. simulans pBac[eYFP] strains (Stern et al., 2017). Ancestry from each parent species was determined by a Hidden Markov Model (HMM) (Pinero et al., 2017; Andolfatto et al., 2011).

Genotype data and ancestry assignments were inspected for all recombinant 1P-YFP introgression genotypes. Genotypes were excluded if there was no segment on the X chromosome identified by the HMM that had either a posterior probability of D. mauritiana parentage >0.95 or a posterior probability of D. simulans parentage <0.05. Genotypes with segments that had either a posterior probability of D. mauritiana parentage >0.95 or a posterior probability of D. simulans parentage <0.05 in a region that was not within the parental 2P region (i.e. came from a different 2P introgression) were inferred to have resulted either from mislabeling or contamination of DNA samples and were excluded from further analyses. 112 genotypes had insufficient sequence data to identify introgressions using the criteria above (or the introgression was too small to be identified). 16 genotypes showed evidence for D. mauritiana alleles that did not fall within the parental 2P interval. Across the 439 genotypes with sufficiently high-quality sequence data for ancestry assignment, we recovered 64,373 X-linked markers. A subset of 2835 non-redundant markers were retained that delimit the extent of each 1P-YFP D. mauritiana segment. No genotype showed evidence for any autosomal D. mauritiana alleles (see Figure 2—figure supplement 1 for exemplars), confirming that our introgression scheme isolated X-linked D. mauritiana segments in a pure D. simulans autosomal genome.

Quantitative trait locus analysis

QTL analyses were done in the R/qtl package version 1.36-6. Phenotype means (fertility and progeny sex-ratio) for each introgression genotype and the 2835 non-redundant markers were used as the input data. Mean male fertility was transformed as log10(N + 1). Because of the large proportion of completely sterile introgression genotypes (Figure 1—figure supplement 1), a two-part model (Broman et al., 2003) was used to analyze fertility; sex-ratio was analyzed assuming a normal distribution. Significance thresholds were determined using 10,000 permutations of the data.
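
To illustrate how a permutation-based significance threshold works, the sketch below applies the log10(N + 1) transformation and then derives a genome-wide threshold from shuffled phenotypes. It is deliberately simplified and is not the R/qtl two-part model: the test statistic (largest absolute difference in mean phenotype between genotype classes across markers), the function names, and the 0/1 genotype coding are assumptions chosen only to show the logic.

```python
import math
import random


def log_fertility(mean_offspring):
    """Transform mean offspring counts as log10(N + 1)."""
    return [math.log10(n + 1) for n in mean_offspring]


def max_marker_stat(genotypes, phenotypes):
    """Largest absolute difference in mean phenotype between the two
    genotype classes (0 = D. simulans, 1 = D. mauritiana) over all markers."""
    best = 0.0
    for m in range(len(genotypes[0])):
        mau = [p for g, p in zip(genotypes, phenotypes) if g[m] == 1]
        sim = [p for g, p in zip(genotypes, phenotypes) if g[m] == 0]
        if mau and sim:
            best = max(best, abs(sum(mau) / len(mau) - sum(sim) / len(sim)))
    return best


def permutation_threshold(genotypes, phenotypes, n_perm=1000, alpha=0.05, seed=1):
    """Threshold: the (1 - alpha) quantile of the maximum statistic
    obtained when phenotypes are shuffled among genotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(max_marker_stat(genotypes, shuffled))
    null.sort()
    index = min(n_perm - 1, int(math.ceil((1 - alpha) * n_perm)) - 1)
    return null[index]
```

Taking the maximum over markers within each permutation is what makes the threshold genome-wide: it controls the chance that any marker exceeds it under the null.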

Samples and short read alignment

We used genome sequence data from 10 lines of D. mauritiana, including nine inbred wild isolates and the genome reference strain, mau w12; 20 lines of D. simulans, including 10 inbred wild isolates from Kenya, nine wild isolates from Madagascar, and the reference strain, sim w501; and the reference strain of D. melanogaster. The D. mauritiana and D. simulans sequence data were reported previously (Garrigan et al., 2012; Garrigan et al., 2014; Rogers et al., 2014). SRA accessions for genome sequences are included in the key resources file. The D. simulans w501 and D. melanogaster genome assemblies are available on Flybase (www.flybase.org). We performed short read alignment against the D. mauritiana genome assembly (version 2) using the ‘aln/sampe’ functions of the BWA short read aligner and default settings (Li and Durbin, 2009). Reads flanking indels were realigned using the SAMTOOLS software (Li et al., 2009). Individual BAM files were merged and sorted with SAMTOOLS.

Polymorphism and divergence analyses

Both within- and between-population summary statistics were estimated in 10-kb windows using the software package POPBAM (Garrigan, 2013). The within population summary statistics include: unbiased nucleotide diversity π (Nei, 1987); the summary of the folded site frequency spectrum Tajima’s D (Tajima, 1989); and the unweighted average pairwise value of the r2 measure of linkage disequilibrium, ZnS, excluding singletons (Kelly, 1997). The between population summary statistics include: two measures of nucleotide divergence between populations, DXY, and net divergence, DA (Nei, 1987); the ratio of the minimum between-population nucleotide distance to the average, Gmin (Geneva et al., 2015); and the fixation index, FST (Wright, 1951). From a total of 11,083 scanned 10-kb windows, we only analyzed windows for which at least 50% of aligned sites passed the default quality filters (minimum read coverage 3, minimum rms mapping quality 25, minimum SNP quality 25, minimum map quality 13, minimum base quality 13) in POPBAM, which resulted in a final alignment for 10,443 scanned 10-kb windows. POPBAM output was formatted for use in the R statistical computing environment using the package, POPBAMTools (Geneva, 2014). All statistics and data visualization were done in R (R Development Core Team, 2013).
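
For readers unfamiliar with these statistics, the sketch below computes π, DXY, and DA for one window from aligned sequences. It is an illustration only and not how POPBAM operates on BAM files; it assumes complete, equal-length haplotype strings with no missing data, and the function names are hypothetical.

```python
from itertools import combinations, product


def per_site_differences(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def nucleotide_diversity(pop):
    """pi: mean per-site differences over all pairs within a population."""
    pairs = list(combinations(pop, 2))
    return sum(per_site_differences(a, b) for a, b in pairs) / len(pairs)


def d_xy(pop1, pop2):
    """D_XY: mean per-site differences over all between-population pairs."""
    pairs = list(product(pop1, pop2))
    return sum(per_site_differences(a, b) for a, b in pairs) / len(pairs)


def d_a(pop1, pop2):
    """Net divergence: D_XY minus the average within-population diversity."""
    within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    return d_xy(pop1, pop2) - within
```

For example, with `pop1 = ["AAAA", "AAAT"]` and `pop2 = ["TTAA", "TTAT"]`, each population has π = 0.25, DXY = 0.625, and DA = 0.375.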

Identification of introgressed regions

We used the Gmin statistic (Geneva et al., 2015) to scan the genome for haplotypes that have recent common ancestry between D. simulans and D. mauritiana. Gmin is defined as the ratio of the minimum number of nucleotide differences per aligned site between sequences from different populations to the average number of nucleotide differences per aligned site between populations. The Gmin statistic was calculated in 10-kb intervals across each major chromosome arm using the same quality filtering criteria used for all other summary statistics. From these values, we estimated the probability of the observed Gmin under a model of allopatric divergence, conditioned on the divergence time. For each 10-kb genomic interval, the significance of the observed Gmin value was tested via Monte Carlo coalescent simulation of that 10-kb window with two populations diverging in allopatry with all mutations assumed to be neutral. Simulations were performed using msmove (Geneva, 2017), which is based on the coalescent simulation software ms (Hudson, 2002), modified to track and report the presence of introgressed genealogies. The arguments of msmove are identical to those of ms and for all simulations we used the following command (msmove 30 10000 -t θ -r ρ 10001 -I 2 10 20 -ej 0.61 1 2). We assumed a population divergence time of 1.21 × 2Nsim generations before the present, in which Nsim is the current estimated effective population size of D. simulans (Garrigan et al., 2012); because ms measures time in units of 4N generations, this corresponds to 0.605, rounded to 0.61 in the -ej argument. In the simulations, the observed local value of DXY was used to determine the neutral population mutation rate (θ) for that 10-kb interval. To account for uncertainty in local population recombination rate, for each simulated replicate, a rate was drawn from a normally distributed prior (truncated at zero) with the mean estimated from genetically determined crossover frequencies (True et al., 1996a) for that window, and variance equal to the variance of crossover estimates for the entire chromosome arm.
The empirical crossover rate estimates were converted from cM to ρ (the population crossover rate, 4Nsimc) by assuming Nsim ≈ 10⁶. The effective population sizes of both species were assumed to be equal and constant. For each 10-kb interval, 10⁵ simulated replicates were generated and the probability of the observed Gmin value was estimated from the simulated cumulative density. To identify putatively introgressed haplotypes, we used a significance threshold of p≤0.001 from the simulations, which yields a proportion of null tests of 0.982 and a false discovery rate of 5%. To infer the full length of any putative introgressions >10-kb, we identified runs of contiguous (or semi-contiguous) 10-kb windows with significant Gmin values (p≤0.001). We also assessed the distribution of shared derived variants using the four-population test, summarized by Patterson’s D statistic (Green et al., 2010). Variants were generated using POPBAM default parameters and used to calculate Patterson’s D across chromosome arms using custom Perl scripts. For D statistic calculations, we assumed the tree structure (((D. sechellia, D. simulans), D. mauritiana), D. melanogaster) for (((P1,P2),P3),O), and used the population frequencies of SNPs to compute probabilistic contributions of individual sites to counts of ‘ABBA’ and ‘BABA’ site types (Green et al., 2010; Durand et al., 2011). Finally, we estimated maximum likelihood phylogenies for each of the putative introgression intervals using RAxML v. 8.1.1 (Stamatakis, 2014).
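
The definition of Gmin and the simulation-based test can be made concrete with a short sketch. This is not the POPBAM or msmove implementation; it assumes complete, equal-length haplotype strings for one window and a precomputed list of Gmin values from no-gene-flow simulations, and both function names are hypothetical.

```python
from itertools import product


def g_min(pop1, pop2):
    """Ratio of the minimum to the mean between-population distance,
    each measured as differences per aligned site."""
    dists = [
        sum(x != y for x, y in zip(a, b)) / len(a)
        for a, b in product(pop1, pop2)
    ]
    mean = sum(dists) / len(dists)
    return min(dists) / mean if mean > 0 else float("nan")


def lower_tail_p(observed, null_values):
    """Fraction of allopatric (no gene flow) simulations whose Gmin is
    as small as or smaller than the observed value."""
    return sum(v <= observed for v in null_values) / len(null_values)
```

The test is one-sided because recent introgression places one between-population pair much closer together than the average pair, which pulls Gmin toward zero.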

Genotyping the Winters sex ratio genes

We extracted genomic DNA from single male flies using the Qiagen DNeasy Blood and Tissue Kit. The meiotic drive genes of the Winters sex ratio system (Tao et al., 2007a), Dox and MDox, were PCR-amplified as previously described (Kingan et al., 2010). To assay the presence or absence of the Dox and MDox gene insertions, the amplicons for the Dox and MDox regions were digested with the StyI and StuI restriction enzymes (NEB), respectively. The digests were run on a 1% agarose gel stained with EtBr and the band size was estimated using the GeneRuler 1 kb plus ladder (Thermo Scientific). For both genes, only haplotypes containing the gene insertions have restriction sites as confirmed by samples with known genotypes (Kingan et al., 2010).

Quantitative PCR for Dox/MDox expression in fly testes

We assayed expression of the Dox and MDox genes in testes from D. simulans strain MD63 and D. mauritiana strain mau w12 using quantitative PCR. Total RNA was extracted from the dissected testes of 5–10 day old flies using the Nucleospin RNA XS kit (Macherey-Nagel, Germany), and cDNA was synthesized with poly dT oligos and random hexamers using Superscript III RT cDNA synthesis kit (Invitrogen, CA). qPCR assays were performed on a BioRad Real-time PCR machine using the cycling conditions: 95°C for 3 min; 40 cycles of 95°C for 10 s, 58°C for 30 s, and 72°C for 30 s. The primer sequences used for qPCR are provided in Supplementary file 3.

Acknowledgements

The authors thank Brian Calvi for generously providing laboratory space and Shelby Biel, Ally Shambaugh, and Amanda Meiklejohn for assistance with fertility assays, Cara Brand for calculation of D. mauritiana recombination rates, and Peter Andolfatto and Kevin Thornton for sharing D. simulans genome sequence data.

Appendix 1

Performance of Gmin - detecting introgression from population genomic data

Using the Gmin statistic (Geneva et al., 2015), we report three findings in the main text regarding historical gene flow between Drosophila simulans and D. mauritiana:

  1. 1.9% of 10-kb windows show evidence for recent introgression between these species;

  2. recent introgression is significantly underrepresented on the X chromosome relative to the autosomes; and

  3. the lone X-linked region identified as recently introgressed between species contains the previously characterized meiotic drive loci, Dox and MDox.

In this appendix, we present analyses, simulations, and arguments that support these inferences.

To test the power of Gmin to detect introgression between D. simulans and D. mauritiana, we used msmove simulations similar to those described in the Methods, assuming a population size Ne of 1,000,000 and 10 generations per year. We simulated divergence followed by gene flow at three times in the past: 400, 4000 and 40,000 years ago. Levels of simulated gene flow were tuned to approximate those observed genome-wide in our data (ms migration probability of 0.008, corresponding to migration occurring in 2–4% of 10-kb windows). Each simulation modeled divergence and gene flow within a 10-kb segment using empirical estimates of population mutation and recombination rate parameters estimated from each 10-kb window in our data. Gmin was then calculated for each simulated window.

We then used the same procedure described in the Methods to evaluate whether the simulated value of Gmin for a given window was an outlier by performing 10,000 msmove simulations without gene flow and comparing the Gmin value from the gene flow simulation to the distribution of Gmin values from the strictly allopatric simulations. Windows were deemed to be Gmin outliers if they fell in the lowest 0.001 quantile of the non-gene flow simulated distribution. We repeated these steps 100 times for each 10-kb window at each of the three gene flow time points. The power of Gmin was determined by measuring the concordance between windows identified as outliers by our procedure and windows that actually contained a simulated gene flow event.
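
The concordance measures can be sketched as follows. The definitions are inferred from Appendix 1, table 1, where the true and false positive rates sum to roughly 100% and are therefore both expressed as fractions of the windows called significant; the function name and the use of window identifiers are assumptions.

```python
def detection_summary(outlier_windows, migration_windows):
    """Compare windows flagged as Gmin outliers with windows that
    actually contain a simulated gene flow event."""
    outliers = set(outlier_windows)
    truth = set(migration_windows)
    hits = outliers & truth
    nan = float("nan")
    return {
        # share of flagged windows that truly experienced gene flow
        "true_positive_rate": len(hits) / len(outliers) if outliers else nan,
        # share of flagged windows with no simulated gene flow
        "false_positive_rate": len(outliers - truth) / len(outliers) if outliers else nan,
        # share of simulated gene flow events that were flagged
        "events_detected": len(hits) / len(truth) if truth else nan,
    }
```

Under these definitions a scan can have a high true positive rate and still detect few events, which is the pattern reported for older gene flow.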

Gmin identifies recent introgression

The properties and behavior of the Gmin statistic have been explored in several previous publications that used coalescent simulations to explore a range of mutation, recombination, and migration parameters. These analyses determined that Gmin statistical power is robust to variation in recombination and mutation rates (Geneva et al., 2015; Rosenzweig et al., 2016; Schrider et al., 2018). For no set of mutation or recombination parameters considered does Gmin produce an unacceptably high rate of false positives. Gmin power is however dependent on the timing of introgression (Geneva et al., 2015; Rosenzweig et al., 2016; Schrider et al., 2018), as shown by simulations that assume levels of gene flow comparable to those observed in our data (Table 1). We find that Gmin detects 86%, 41%, and 2% of simulated gene flow events that occurred 400 years, 4,000 years, and 40,000 years ago, respectively (Table 1). While the false positive rate is higher for simulated gene flow that occurred 40,000 years ago, the total number of 10-kb windows with significant Gmin values is very small; consequently, the total number of false positives is very small as well. We therefore conclude that Gmin may be unreliable for older introgressions but identifies younger introgressions with high confidence.

Appendix 1—table 1. Gmin and power to detect simulated introgression on the X chromosome and autosomes.

Numbers in parentheses indicate the standard deviation from 100 replicate simulations

| Statistic | 400 ybp, A | 400 ybp, X | 4000 ybp, A | 4000 ybp, X | 40,000 ybp, A | 40,000 ybp, X |
|---|---|---|---|---|---|---|
| Windows with migration (#) | 202.12 (18) | 31.06 (7.2) | 250.65 (16) | 66.48 (10) | 281.15 (20) | 65.22 (8.7) |
| Windows with migration (%) | 2.4% (0.21) | 1.7% (0.4) | 3% (0.19) | 3.7% (0.56) | 3.4% (0.24) | 3.6% (0.48) |
| Significant Gmin windows (#) | 179.08 (17) | 28.5 (6.4) | 111.44 (9.8) | 27.02 (4.7) | 15.87 (4.4) | 2.4 (1.3) |
| Significant Gmin windows (%) | 2.2% (0.2) | 1.6% (0.35) | 1.3% (0.12) | 1.5% (0.26) | 0.19% (0.052) | 0.13% (0.07) |
| True positive rate | 96% (1.5) | 95% (3.9) | 94% (2.4) | 93% (5.1) | 45% (11) | 30% (34) |
| False positive rate | 3.7% (1.5) | 4.8% (3.9) | 5.7% (2.4) | 6.8% (5.1) | 55% (11) | 70% (34) |
| Migration events detected | 85% (3.2) | 88% (7.5) | 42% (2.9) | 38% (5.1) | 2.6% (1) | 1.2% (1.4) |

Comparing introgression on the X versus autosomes

X-linked loci have (generally) smaller effective population sizes than autosomal loci and hence lower levels of nucleotide polymorphism. We tested if the systematically lower observed polymorphism on the X can explain its lower levels of introgression detected by Gmin using coalescent simulations, resampling, and additional statistical analyses of our empirical results.

First, we note that Gmin statistical significance for each 10-kb window was determined by Monte Carlo simulation of neutral genealogies derived from two populations diverging in allopatry (no gene flow). As these simulations used estimates of θ and ρ drawn from each 10-kb window, they necessarily incorporate systematic differences between the X and autosomes in these parameters when generating P-values. Second, the simulations presented in Table 1 show no significant difference in power (true positive rate or proportion of migration events detected) between X-linked and autosomal windows. Third, using a resampling approach, we generated 10,000 ‘X-matched autosome’ datasets, each drawn randomly from autosomal 10-kb windows, that closely matched the distribution of polymorphism among true X-linked windows and tallied the number of significant Gmin windows (Figure 1). For 10,000 ‘X-matched autosome’ datasets matching X-linked polymorphism in D. simulans, no dataset had as few or fewer significant Gmin windows than the actual X-linked data (p<0.0001); for 10,000 ‘X-matched autosome’ datasets matching polymorphism in D. mauritiana, only one dataset had as few significant Gmin windows as the actual X-linked data (p=0.0001). These findings suggest that the observed paucity of introgressions on the X chromosome cannot be explained simply by its lower levels of polymorphism.
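
The resampling logic can be illustrated with a short sketch. The matching rule used here (drawing, for each X-linked window, a random autosomal window from the same polymorphism bin) is an assumption made for illustration, since the exact matching procedure is not spelled out above; the function names and bin edges are likewise hypothetical.

```python
import bisect
import random


def x_matched_counts(x_pi, auto_pi, auto_significant, bin_edges,
                     n_datasets=10000, seed=1):
    """Count significant windows in resampled autosomal data sets.

    Each X-linked window is replaced by a random autosomal window from
    the same polymorphism bin, so every resampled data set mirrors the
    X-linked distribution of polymorphism. X-linked windows whose bin
    contains no autosomal window are skipped.
    """
    rng = random.Random(seed)
    bins = {}
    for pi, significant in zip(auto_pi, auto_significant):
        bins.setdefault(bisect.bisect_right(bin_edges, pi), []).append(significant)
    x_bins = [bisect.bisect_right(bin_edges, pi) for pi in x_pi]
    counts = []
    for _ in range(n_datasets):
        counts.append(sum(rng.choice(bins[b]) for b in x_bins if b in bins))
    return counts


def empirical_p(counts, observed):
    """Proportion of resampled data sets with as few or fewer
    significant windows than observed on the X chromosome."""
    return sum(c <= observed for c in counts) / len(counts)
```

If lower polymorphism alone explained the paucity of significant X-linked windows, the observed X-linked count would fall well within the resampled distribution rather than in its extreme lower tail.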

Our data do reveal negative correlations between the Gmin P-value and polymorphism within D. simulans (Spearman's ρ = −0.22, p<0.0001) and within D. mauritiana (ρ = −0.38, p<0.0001). Importantly, this correlation is driven by variation in P-values among the large majority of non-significant windows (see Figure 5—figure supplement 2). However, significant Gmin windows on average do have different levels of polymorphism than non-significant ones. In D. simulans, significant Gmin windows have less polymorphism than non-significant windows, whereas, in D. mauritiana, significant Gmin windows have more polymorphism than non-significant windows (Figure 5—figure supplement 2; Wilcoxon test p<0.0001 for both species). The observation that significant Gmin windows have elevated polymorphism in D. mauritiana may reflect the direction of gene flow: the presence of foreign alleles will tend to elevate diversity in the receiving (D. mauritiana) population but not the donor population (D. simulans).

Patterson’s D statistic may not be appropriate for X-autosome comparisons

In the main text, we report that Patterson’s D for the genome (D = 0.0812, combining all chromosomes) is smaller than that for the X chromosome (D = 0.1054, excluding the 130-kb Dox region). For all autosomes combined, D = 0.077, yielding an X/A ratio of D = 1.361. Superficially, these values could imply the possibility of more introgression on the X than the autosomes and would therefore seem to contradict the Gmin results, which suggest the opposite. We suggest however that Patterson’s D statistic may be inappropriate for simple X versus autosome comparisons and that discrepancies between Patterson’s D and scans for introgression are not unique to our study.

In the case of constant population size, the expected value of D is inversely related to Ne (Green et al., 2010; Durand et al., 2011). As a result, under most circumstances, a larger value of D is expected for the X chromosome even if all else, including the degree of introgression, is constant. To illustrate the point, we calculated expected D using standard assumptions for our Drosophila species and obtained values similar to those estimated from the data. The simplifying assumptions for all calculations are:

  • constant Ne = 1,000,000 for all species

  • 10 generations per year

  • a three-species polytomy of D. mauritiana, D. simulans, D. sechellia

  • a speciation time = 250,000 years (2,500,000 gens) in the past

  • a single pulse of gene flow occurring 50,000 years (500,000 gens) in the past

  • introgression probability, f = 0.05

Using these assumptions and Equation 5 from (Durand et al., 2011) yields E[D]=0.072. Taking this value as a plausible autosomal expectation for D, we considered three different Ne values for the X, while holding all other parameters constant. Table 2 provides expectations for D on the X chromosome and X/A ratios of D.

Appendix 1—table 2. X chromosome, and X/A ratio, for expectation of Patterson’s D.

| X/A ratio of Ne | Rationale | E[D] | X/A ratio of D |
|---|---|---|---|
| 0.75 | 1:1 sex ratio, random mating, etc. | 0.094 | 1.309 |
| 0.656 | Observed X/A nucleotide diversity in D. mauritiana | 0.106 | 1.479 |
| 0.778 | Observed X/A nucleotide diversity in D. simulans | 0.091 | 1.265 |

For all three cases, the X/A ratio of E[D] is greater than one and comparable to the ratio estimated from our data (X/A ratio = 1.361) despite no difference in assumptions about the amount or timing of gene flow between the X and autosomes.
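
The dependence of E[D] on Ne can be checked numerically. The sketch below is a transcription of the single-pulse admixture expectation (Equation 5 of Durand et al., 2011) and should be verified against that paper before reuse; the function and parameter names are hypothetical. Under the polytomy assumption listed above, the two split times are set equal.

```python
def expected_patterson_d(n_e, t_p3, t_p2, t_gf, f):
    """Expected Patterson's D after a single pulse of admixture.

    n_e  : constant effective population size
    t_p3 : generations since P3 split from the ancestor of P1 and P2
    t_p2 : generations since P1 and P2 split (equal to t_p3 for a polytomy)
    t_gf : generations since the pulse of gene flow
    f    : introgression probability
    """
    no_coal = 1.0 - 1.0 / (2.0 * n_e)  # per-generation chance of no coalescence
    signal = 3.0 * f * (t_p3 - t_gf)
    noise = (4.0 * n_e * (1.0 - f) * no_coal ** (t_p3 - t_p2)
             + 4.0 * n_e * f * no_coal ** (t_p3 - t_gf))
    return signal / (signal + noise)


# Autosomal expectation under the assumptions listed above.
d_autosome = expected_patterson_d(1_000_000, 2_500_000, 2_500_000, 500_000, 0.05)

# X-linked expectations for the three X/A ratios of Ne considered.
for ratio in (0.75, 0.656, 0.778):
    d_x = expected_patterson_d(ratio * 1_000_000, 2_500_000, 2_500_000, 500_000, 0.05)
    print(ratio, round(d_x, 3), round(d_x / d_autosome, 3))
```

With these inputs the autosomal value is approximately 0.072, and the X-linked values agree with the tabulated expectations to within rounding. Only the noise term scales with Ne, which is why a smaller X-linked Ne raises E[D] without any change in the amount or timing of gene flow.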

Notably, discrepancies between Patterson’s D and focused introgression scans are not unique to our study. Green et al. (2010) developed the D statistic and estimated D between Neanderthal and non-African humans for all 23 chromosomes (their Supplementary Table S47). Between Asians and Neanderthals, D is 2.3-fold higher for the X chromosome than the mean of the 22 autosomes (Table S47), which would seem to imply a greater rate of introgression on the X. Later work by the same group (Sankararaman et al., 2014; Sankararaman et al., 2016) scanned genomes for introgression using relative sequence distances and haplotype length as criteria and found a significant dearth of introgression on the X relative to the autosomes (X/A introgression ~20%). Thus, paralleling our results, Patterson’s D between Asians and Neanderthals implies excess gene flow on the X, whereas the genomic scan implies the opposite. As expectations for Patterson’s D on the X versus the autosomes are confounded by effective population size, it seems imprudent to draw strong conclusions about relative gene flow from the D statistic alone.

Appendix 1—figure 1. Resampled autosomal 10-kb windows matching X-chromosome polymorphism.

(A) Distributions of polymorphism within 10-kb windows for the X chromosome and autosomes in D. simulans and D. mauritiana. (B) Exemplar resampled autosomal data sets matching X-chromosome polymorphism for D. simulans and D. mauritiana. (C) Distribution of the number of resampled windows with significant Gmin values across 10,000 replicate resampled data sets. Vertical dotted lines indicate the observed number of significant X-linked windows in each species.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Colin D Meiklejohn, Email: cmeiklejohn2@unl.edu.

Daven C Presgraves, Email: daven.presgraves@rochester.edu.

Molly Przeworski, Columbia University, United States.

Diethard Tautz, Max-Planck Institute for Evolutionary Biology, Germany.

Funding Information

This paper was supported by the following grants:

  • National Institute of General Medical Sciences 1R01OD010548-01A1 to Daniel Garrigan, Daven Presgraves.

  • University of Nebraska-Lincoln to Colin Meiklejohn.

  • University of Rochester to Daniel Garrigan, Daven Presgraves.

  • National Science Foundation DEB-0839348 to Colin Meiklejohn.

  • National Institute of General Medical Sciences 1R01GM123194-01A1 to Colin Meiklejohn, Daven Presgraves.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Resources, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Conceptualization, Supervision, Investigation, Writing—review and editing.

Investigation, Writing—review and editing.

Investigation.

Conceptualization, Software, Formal analysis, Supervision, Investigation, Visualization, Writing—original draft, Writing—review and editing.

Software, Formal analysis, Methodology, Writing—review and editing.

Data curation, Formal analysis, Writing—review and editing.

Formal analysis.

Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Resources, Software, Formal analysis, Investigation, Methodology, Writing—review and editing.

Conceptualization, Resources, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Additional files

Supplementary file 1. Gmin scan identifies forty-eight interspecific introgressions.
elife-35468-supp1.xlsx (47.5KB, xlsx)
DOI: 10.7554/eLife.35468.029
Supplementary file 2. Genotype of samples at the Dox and MDox genes.
elife-35468-supp2.xlsx (41.4KB, xlsx)
DOI: 10.7554/eLife.35468.030
Supplementary file 3. Primers used in RT-PCR to assay expression of MDox, Dox, and a control gene (RpS28b).
elife-35468-supp3.xlsx (30.8KB, xlsx)
DOI: 10.7554/eLife.35468.031
Transparent reporting form
DOI: 10.7554/eLife.35468.032

Data availability

Sequence data are available via the NCBI Sequence Read Archive (accession number: SRR8247551). Phenotype data have been submitted to Dryad (DOI: https://doi.org/10.5061/dryad.4qn4s47).

The following datasets were generated:

Colin Meiklejohn, Daven Presgraves, David L Stern. 2018. Sequence data from Gene flow mediates the role of sex chromosome meiotic drive during complex speciation. NCBI Sequence Read Archive. SRR8247551

Meiklejohn CD, Landeen EL, Presgraves DC. 2018. Gene flow mediates the role of sex chromosome meiotic drive during complex speciation. Dryad Digital Repository.

The following previously published datasets were used:

Garrigan D, Kingan SB, Geneva AJ, Vedanayagam JP, Presgraves DC. 2014. Drosophila mauritiana genome sequencing. NCBI BioProject. PRJNA158675

Rogers RL, Cridland JM, Shao L, Hu TT, Andolfatto P, Thornton KR. 2015. Tandem Duplications and the Limits of Natural Selection in Drosophila yakuba and Drosophila simulans. NCBI Sequence Read Archive. SRP040290

Rogers RL, Cridland JM, Shao L, Hu TT, Andolfatto P, Thornton KR. 2015. Tandem Duplications and the Limits of Natural Selection in Drosophila yakuba and Drosophila simulans. NCBI Sequence Read Archive. SRP029453

References

  1. Andolfatto P, Davison D, Erezyilmaz D, Hu TT, Mast J, Sunayama-Morita T, Stern DL. Multiplexed shotgun genotyping for rapid and efficient genetic mapping. Genome Research. 2011;21:610–617. doi: 10.1101/gr.115402.110.
  2. Ballard JW. Comparative genomics of mitochondrial DNA in members of the Drosophila melanogaster subgroup. Journal of Molecular Evolution. 2000a;51:48–63. doi: 10.1007/s002390010066.
  3. Ballard JW. When one is not enough: introgression of mitochondrial DNA in Drosophila. Molecular Biology and Evolution. 2000b;17:1126–1130. doi: 10.1093/oxfordjournals.molbev.a026394.
  4. Ballard JW. Sequential evolution of a symbiont inferred from the host: Wolbachia and Drosophila simulans. Molecular Biology and Evolution. 2004;21:428–442. doi: 10.1093/molbev/msh028.
  5. Barton N, Bengtsson BO. The barrier to genetic exchange between hybridising populations. Heredity. 1986;57:357–376. doi: 10.1038/hdy.1986.135.
  6. Baudry E, Derome N, Huet M, Veuille M. Contrasted polymorphism patterns in a large sample of populations from the evolutionary genetics model Drosophila simulans. Genetics. 2006;173:759–767. doi: 10.1534/genetics.105.046250.
  7. Begun DJ, Holloway AK, Stevens K, Hillier LW, Poh YP, Hahn MW, Nista PM, Jones CD, Kern AD, Dewey CN, Pachter L, Myers E, Langley CH. Population genomics: whole-genome analysis of polymorphism and divergence in Drosophila simulans. PLOS Biology. 2007;5:e310. doi: 10.1371/journal.pbio.0050310.
  8. Bengtsson BO. The flow of genes through a genetic barrier. In: Greenwood P. J, Harvey P. H, Slatkin M, editors. Evolution: Essays in Honour of John Maynard Smith. Cambridge: Cambridge University Press; 1985.
  9. Broman KW, Wu H, Sen S, Churchill GA. R/qtl: QTL mapping in experimental crosses. Bioinformatics. 2003;19:889–890. doi: 10.1093/bioinformatics/btg112.
  10. Charlesworth B, Coyne JA, Barton NH. The relative rates of evolution of sex chromosomes and autosomes. The American Naturalist. 1987;130:113–146. doi: 10.1086/284701.
  11. Coyne JA, Orr HA. Two rules of speciation. In: Otte D, Endler J, editors. Speciation and Its Consequences. Sunderland, MA: Sinauer Associates; 1989.
  12. Coyne JA. Genetics and speciation. Nature. 1992a;355:511–515. doi: 10.1038/355511a0.
  13. Coyne JA. Genetics of sexual isolation in females of the Drosophila simulans species complex. Genetical Research. 1992b;60:25–31. doi: 10.1017/S0016672300030639.
  14. Coyne JA, Orr HA. Speciation. Sunderland, Massachusetts: Sinauer; 2004.
  15. Crespi B, Nosil P. Conflictual speciation: species formation via genomic conflict. Trends in Ecology & Evolution. 2013;28:48–57. doi: 10.1016/j.tree.2012.08.015.
  16. David J, Mcevey SF, Solignac M, Tsacas L. Drosophila communities on Mauritius and ecological niche of D. mauritiana (Diptera, Drosophilidae). Journal of African Zoology. 1989;103:107–116.
  17. Dean MD, Ballard JW. Linking phylogenetics with population genetics to reconstruct the geographic origin of a species. Molecular Phylogenetics and Evolution. 2004;32:998–1009. doi: 10.1016/j.ympev.2004.03.013.
  18. Durand EY, Patterson N, Reich D, Slatkin M. Testing for ancient admixture between closely related populations. Molecular Biology and Evolution. 2011;28:2239–2252. doi: 10.1093/molbev/msr048.
  19. Frank SA. Divergence of meiotic drive-suppression systems as an explanation for sex-biased hybrid sterility and inviability. Evolution; International Journal of Organic Evolution. 1991;45:262–267. doi: 10.1111/j.1558-5646.1991.tb04401.x.
  20. Garrigan D, Kingan SB, Geneva AJ, Andolfatto P, Clark AG, Thornton KR, Presgraves DC. Genome sequencing reveals complex speciation in the Drosophila simulans clade. Genome Research. 2012;22:1499–1511. doi: 10.1101/gr.130922.111.
  21. Garrigan D. POPBAM: tools for evolutionary analysis of short read sequence alignments. Evolutionary Bioinformatics. 2013;9:EBO.S12751. doi: 10.4137/EBO.S12751.
  22. Garrigan D, Kingan SB, Geneva AJ, Vedanayagam JP, Presgraves DC. Genome diversity and divergence in Drosophila mauritiana: multiple signatures of faster X evolution. Genome Biology and Evolution. 2014;6:2444–2458. doi: 10.1093/gbe/evu198.
  23. Geneva AJ. POPBAMTools. GitHub. 8129124. 2014. https://github.com/geneva/POPBAMTools
  24. Geneva AJ, Muirhead CA, Kingan SB, Garrigan D. A new method to scan genomes for introgression in a secondary contact model. PLOS ONE. 2015;10:e0118621. doi: 10.1371/journal.pone.0118621.
  25. Geneva AJ. msmove. GitHub. dab4d4d. 2017. https://github.com/geneva/msmove
  26. Good JM, Dean MD, Nachman MW. A complex genetic basis to X-linked hybrid male sterility between two species of house mice. Genetics. 2008;179:2213–2228. doi: 10.1534/genetics.107.085340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH, Hansen NF, Durand EY, Malaspinas AS, Jensen JD, Marques-Bonet T, Alkan C, Prüfer K, Meyer M, Burbano HA, Good JM, Schultz R, Aximu-Petri A, Butthof A, Höber B, Höffner B, Siegemund M, Weihmann A, Nusbaum C, Lander ES, Russ C, Novod N, Affourtit J, Egholm M, Verna C, Rudan P, Brajkovic D, Kucan Ž, Gušic I, Doronichev VB, Golovanova LV, Lalueza-Fox C, de la Rasilla M, Fortea J, Rosas A, Schmitz RW, Johnson PLF, Eichler EE, Falush D, Birney E, Mullikin JC, Slatkin M, Nielsen R, Kelso J, Lachmann M, Reich D, Pääbo S. A draft sequence of the Neandertal genome. Science. 2010;328:710–722. doi: 10.1126/science.1188021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Haldane JBS. Sex ratio and unisexual sterility in hybrid animals. Journal of Genetics. 1922;12:101–109. doi: 10.1007/BF02983075. [DOI] [Google Scholar]
  29. Helleu Q, Gérard PR, Dubruille R, Ogereau D, Prud'homme B, Loppin B, Montchamp-Moreau C. Rapid evolution of a Y-chromosome heterochromatin protein underlies sex chromosome meiotic drive. PNAS. 2016;113:4110–4115. doi: 10.1073/pnas.1519332113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Hey J, Kliman RM. Population genetics and phylogenetics of DNA sequence variation at multiple loci within the Drosophila melanogaster species complex. Molecular Biology and Evolution. 1993;10:804–822. doi: 10.1093/oxfordjournals.molbev.a040044. [DOI] [PubMed] [Google Scholar]
  31. Höllinger I, Hermisson J. Bounds to parapatric speciation: A Dobzhansky-Muller incompatibility model involving autosomes, X chromosomes, and mitochondria. Evolution. 2017;71:1366–1380. doi: 10.1111/evo.13223. [DOI] [PubMed] [Google Scholar]
  32. Hu TT, Eisen MB, Thornton KR, Andolfatto P. A second-generation assembly of the Drosophila simulans genome provides new insights into patterns of lineage-specific divergence. Genome Research. 2013;23:89–98. doi: 10.1101/gr.141689.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Hudson RR, Coyne JA. Mathematical consequences of the genealogical species concept. Evolution. 2002;56:1557–1565. doi: 10.1111/j.0014-3820.2002.tb01467.x. [DOI] [PubMed] [Google Scholar]
  34. Hudson RR. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics. 2002;18:337–338. doi: 10.1093/bioinformatics/18.2.337. [DOI] [PubMed] [Google Scholar]
  35. Hurst LD, Pomiankowski A. Causes of sex ratio bias may account for unisexual sterility in hybrids: a new explanation of Haldane's rule and related phenomena. Genetics. 1991;128:841–858. doi: 10.1093/genetics/128.4.841. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Jaenike J. Sex chromosome meiotic drive. Annual Review of Ecology and Systematics. 2001;32:25–49. doi: 10.1146/annurev.ecolsys.32.081501.113958. [DOI] [Google Scholar]
  37. Kelly JK. A test of neutrality based on interlocus associations. Genetics. 1997;146:1197–1206. doi: 10.1093/genetics/146.3.1197. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Kingan SB, Garrigan D, Hartl DL. Recurrent selection on the Winters sex-ratio genes in Drosophila simulans. Genetics. 2010;184:253–265. doi: 10.1534/genetics.109.109587. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Kliman RM, Andolfatto P, Coyne JA, Depaulis F, Kreitman M, Berry AJ, McCarter J, Wakeley J, Hey J. The population genetics of the origin and divergence of the Drosophila simulans complex species. Genetics. 2000;156:1913–1931. doi: 10.1093/genetics/156.4.1913. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Kopp A. Basal relationships in the Drosophila melanogaster species group. Molecular Phylogenetics and Evolution. 2006;39:787–798. doi: 10.1016/j.ympev.2006.01.029. [DOI] [PubMed] [Google Scholar]
  41. Lachaise D, David JR, Lemeunier F, Tsacas L, Ashburner M. The reproductive relationships of Drosophila sechellia with D. mauritiana, D. simulans, and D. melanogaster from the Afrotropical region. Evolution. 1986;40:262–271. doi: 10.1111/j.1558-5646.1986.tb00468.x. [DOI] [PubMed] [Google Scholar]
  42. Lachaise D, Cariou M-L, David JR, Lemeunier F, Tsacas L, Ashburner M. Historical biogeography of the Drosophila melanogaster species subgroup. Evolutionary Biology. 1988;22:159–225. doi: 10.1007/978-1-4613-0931-4_4. [DOI] [Google Scholar]
  43. Laurie CC. The weaker sex is heterogametic: 75 years of Haldane's rule. Genetics. 1997;147:937–951. doi: 10.1093/genetics/147.3.937. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Legrand D, Chenel T, Campagne C, Lachaise D, Cariou ML. Inter-island divergence within Drosophila mauritiana, a species of the D. simulans complex: Past history and/or speciation in progress? Molecular Ecology. 2011;20:2787–2804. doi: 10.1111/j.1365-294X.2011.05127.x. [DOI] [PubMed] [Google Scholar]
  45. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Lifschytz E, Lindsley DL. The role of X-Chromosome inactivation during spermatogenesis. PNAS. 1972;69:182–186. doi: 10.1073/pnas.69.1.182. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Lindholm AK, Dyer KA, Firman RC, Fishman L, Forstmeier W, Holman L, Johannesson H, Knief U, Kokko H, Larracuente AM, Manser A, Montchamp-Moreau C, Petrosyan VG, Pomiankowski A, Presgraves DC, Safronova LD, Sutter A, Unckless RL, Verspoor RL, Wedell N, Wilkinson GS, Price TAR. The Ecology and Evolutionary Dynamics of Meiotic Drive. Trends in Ecology & Evolution. 2016;31:315–326. doi: 10.1016/j.tree.2016.02.001. [DOI] [PubMed] [Google Scholar]
  49. Macaya-Sanz D, Suter L, Joseph J, Barbará T, Alba N, González-Martínez SC, Widmer A, Lexer C. Genetic analysis of post-mating reproductive barriers in hybridizing European Populus species. Heredity. 2011;107:478–486. doi: 10.1038/hdy.2011.35. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Maside XR, Barral JP, Naveira HF. Hidden effects of X chromosome introgressions on spermatogenesis in Drosophila simulans x D. mauritiana hybrids unveiled by interactions among minor genetic factors. Genetics. 1998;150:745–754. doi: 10.1093/genetics/150.2.745. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Masly JP, Presgraves DC. High-resolution genome-wide dissection of the two rules of speciation in Drosophila. PLOS Biology. 2007;5:e243. doi: 10.1371/journal.pbio.0050243. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. McDermott SR, Kliman RM. Estimation of isolation times of the island species in the Drosophila simulans complex from multilocus DNA sequence data. PLOS ONE. 2008;3:e2442. doi: 10.1371/journal.pone.0002442. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Meiklejohn CD, Tao Y. Genetic conflict and sex chromosome evolution. Trends in Ecology & Evolution. 2010;25:215–223. doi: 10.1016/j.tree.2009.10.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Moehring AJ, Teeter KC, Noor MA. Genome-wide patterns of expression in Drosophila pure species and hybrid males. II. Examination of multiple-species hybridizations, platforms, and life cycle stages. Molecular Biology and Evolution. 2007;24:137–145. doi: 10.1093/molbev/msl142. [DOI] [PubMed] [Google Scholar]
  55. Montchamp-Moreau C, Ogereau D, Chaminade N, Colard A, Aulard S. Organization of the sex-ratio meiotic drive region in Drosophila simulans. Genetics. 2006;174:1365–1371. doi: 10.1534/genetics.105.051755. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Moyle LC, Muir CD, Han MV, Hahn MW. The contribution of gene movement to the "two rules of speciation". Evolution. 2010;64:1541–1557. doi: 10.1111/j.1558-5646.2010.00990.x. [DOI] [PubMed] [Google Scholar]
  57. Muirhead CA, Presgraves DC. Hybrid Incompatibilities, Local Adaptation, and the Genomic Distribution of Natural Introgression between Species. The American Naturalist. 2016;187:249–261. doi: 10.1086/684583. [DOI] [PubMed] [Google Scholar]
  58. Nei M, Li WH. Mathematical model for studying genetic variation in terms of restriction endonucleases. PNAS. 1979;76:5269–5273. doi: 10.1073/pnas.76.10.5269. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Nei M. Molecular Evolutionary Genetics. New York: Columbia University Press; 1987. [Google Scholar]
  60. Nolte V, Pandey RV, Kofler R, Schlötterer C. Genome-wide patterns of natural variation reveal strong selective sweeps and ongoing genomic conflict in Drosophila mauritiana. Genome Research. 2013;23:99–110. doi: 10.1101/gr.139873.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Orr HA. Haldane's rule. Annual Review of Ecology and Systematics. 1997;28:195–218. doi: 10.1146/annurev.ecolsys.28.1.195. [DOI] [PubMed] [Google Scholar]
  62. Orr HA, Irving S. Segregation distortion in hybrids between the Bogota and USA subspecies of Drosophila pseudoobscura. Genetics. 2005;169:671–682. doi: 10.1534/genetics.104.033274. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Peluffo AE, Nuez I, Debat V, Savisaar R, Stern DL, Orgogozo V. A Major Locus Controls a Genital Shape Difference Involved in Reproductive Isolation Between Drosophila yakuba and Drosophila santomea. G3: Genes|Genomes|Genetics. 2015;5:2893–2901. doi: 10.1534/g3.115.023481. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Petry D. The effect on neutral gene flow of selection at a linked locus. Theoretical Population Biology. 1983;23:300–313. doi: 10.1016/0040-5809(83)90020-5. [DOI] [PubMed] [Google Scholar]
  65. Phadnis N, Orr HA. A single gene causes both male sterility and segregation distortion in Drosophila hybrids. Science. 2009;323:376–379. doi: 10.1126/science.1163934. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Pinero G, Reilly P, Stern D, Hu T, Parsons L. MSG: Multiplexed Shotgun Genotyping. b2dcddb. GitHub. 2017. https://github.com/YourePrettyGood/msg
  67. Presgraves DC. Patterns of postzygotic isolation in Lepidoptera. Evolution. 2002;56:1168–1183. doi: 10.1111/j.0014-3820.2002.tb01430.x. [DOI] [PubMed] [Google Scholar]
  68. Presgraves DC. Sex chromosomes and speciation in Drosophila. Trends in Genetics. 2008;24:336–343. doi: 10.1016/j.tig.2008.04.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Presgraves DC. Evaluating genomic signatures of "the large X-effect" during complex speciation. Molecular Ecology. 2018;27:3822–3830. doi: 10.1111/mec.14777. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Price CS. Conspecific sperm precedence in Drosophila. Nature. 1997;388:663–666. doi: 10.1038/41753. [DOI] [PubMed] [Google Scholar]
  71. Price TD, Bouvier MM. The evolution of F1 postzygotic incompatibilities in birds. Evolution. 2002;56:2083–2089. doi: 10.1111/j.0014-3820.2002.tb00133.x. [DOI] [PubMed] [Google Scholar]
  72. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2013. [Google Scholar]
  73. Rogers RL, Cridland JM, Shao L, Hu TT, Andolfatto P, Thornton KR. Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans. Molecular Biology and Evolution. 2014;31:1750–1766. doi: 10.1093/molbev/msu124. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Rosenzweig BK, Pease JB, Besansky NJ, Hahn MW. Powerful methods for detecting introgressed regions from population genomic data. Molecular Ecology. 2016;25:2387–2397. doi: 10.1111/mec.13610. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Sankararaman S, Mallick S, Dannemann M, Prüfer K, Kelso J, Pääbo S, Patterson N, Reich D. The genomic landscape of Neanderthal ancestry in present-day humans. Nature. 2014;507:354–357. doi: 10.1038/nature12961. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Sankararaman S, Mallick S, Patterson N, Reich D. The Combined Landscape of Denisovan and Neanderthal Ancestry in Present-Day Humans. Current Biology. 2016;26:1241–1247. doi: 10.1016/j.cub.2016.03.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Satta Y, Toyohara N, Ohtaka C, Tatsuno Y, Watanabe TK, Matsuura ET, Chigusa SI, Takahata N. Dubious maternal inheritance of mitochondrial DNA in D. simulans and evolution of D. mauritiana. Genetical Research. 1988;52:1–6. doi: 10.1017/S0016672300027245. [DOI] [Google Scholar]
  78. Satta Y, Takahata N. Evolution of Drosophila mitochondrial DNA and the history of the melanogaster subgroup. PNAS. 1990;87:9558–9562. doi: 10.1073/pnas.87.24.9558. [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Schrider DR, Ayroles J, Matute DR, Kern AD. Supervised machine learning reveals introgressed loci in the genomes of Drosophila simulans and D. sechellia. PLOS Genetics. 2018;14:e1007341. doi: 10.1371/journal.pgen.1007341. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Seehausen O, Butlin RK, Keller I, Wagner CE, Boughman JW, Hohenlohe PA, Peichel CL, Saetre GP, Bank C, Brännström A, Brelsford A, Clarkson CS, Eroukhmanoff F, Feder JL, Fischer MC, Foote AD, Franchini P, Jiggins CD, Jones FC, Lindholm AK, Lucek K, Maan ME, Marques DA, Martin SH, Matthews B, Meier JI, Möst M, Nachman MW, Nonaka E, Rennison DJ, Schwarzer J, Watson ET, Westram AM, Widmer A. Genomics and the origin of species. Nature Reviews Genetics. 2014;15:176–192. doi: 10.1038/nrg3644. [DOI] [PubMed] [Google Scholar]
  81. Slatkin M. Isolation by distance in equilibrium and non-equilibrium populations. Evolution. 1993;47:264–279. doi: 10.1111/j.1558-5646.1993.tb01215.x. [DOI] [PubMed] [Google Scholar]
  82. Solignac M, Monnerot M, Mounolou JC. Mitochondrial DNA evolution in the melanogaster species subgroup of Drosophila. Journal of Molecular Evolution. 1986;23:31–40. doi: 10.1007/BF02100996. [DOI] [PubMed] [Google Scholar]
  83. Solignac M, Monnerot M. Race formation, speciation, and introgression within Drosophila simulans, D. mauritiana, and D. sechellia inferred from mitochondrial DNA analysis. Evolution. 1986;40:531–539. doi: 10.1111/j.1558-5646.1986.tb00505.x. [DOI] [PubMed] [Google Scholar]
  84. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Stern DL, Crocker J, Ding Y, Frankel N, Kappes G, Kim E, Kuzmickas R, Lemire A, Mast JD, Picard S. Genetic and Transgenic Reagents for Drosophila simulans, D. mauritiana, D. yakuba, D. santomea, and D. virilis. G3: Genes|Genomes|Genetics. 2017;7:1339–1347. doi: 10.1534/g3.116.038885. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123:585–595. doi: 10.1093/genetics/123.3.585. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Tao Y, Hartl DL, Laurie CC. Sex-ratio segregation distortion associated with reproductive isolation in Drosophila. PNAS. 2001;98:13183–13188. doi: 10.1073/pnas.231478798. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Tao Y, Chen S, Hartl DL, Laurie CC. Genetic dissection of hybrid incompatibilities between Drosophila simulans and D. mauritiana. I. Differential accumulation of hybrid male sterility effects on the X and autosomes. Genetics. 2003;164:1383–1397. doi: 10.1093/genetics/164.4.1383. [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Tao Y, Hartl DL. Genetic dissection of hybrid incompatibilities between Drosophila simulans and D. mauritiana. III. Heterogeneous accumulation of hybrid incompatibilities, degree of dominance, and implications for Haldane's rule. Evolution. 2003;57:2580–2589. doi: 10.1111/j.0014-3820.2003.tb01501.x. [DOI] [PubMed] [Google Scholar]
  90. Tao Y, Araripe L, Kingan SB, Ke Y, Xiao H, Hartl DL. A sex-ratio meiotic drive system in Drosophila simulans. II: an X-linked distorter. PLOS Biology. 2007a;5:e293. doi: 10.1371/journal.pbio.0050293. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Tao Y, Masly JP, Araripe L, Ke Y, Hartl DL. A sex-ratio meiotic drive system in Drosophila simulans. I: an autosomal suppressor. PLOS Biology. 2007b;5:e292. doi: 10.1371/journal.pbio.0050292. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. True JR, Mercer JM, Laurie CC. Differences in crossover frequency and distribution among three sibling species of Drosophila. Genetics. 1996a;142:507–523. doi: 10.1093/genetics/142.2.507. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. True JR, Weir BS, Laurie CC. A genome-wide survey of hybrid incompatibility factors by the introgression of marked segments of Drosophila mauritiana chromosomes into Drosophila simulans. Genetics. 1996b;142:819–837. doi: 10.1093/genetics/142.3.819. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Turissini DA, Matute DR. Fine scale mapping of genomic introgressions within the Drosophila yakuba clade. PLOS Genetics. 2017;13:e1006971. doi: 10.1371/journal.pgen.1006971. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Wright S. The genetical structure of populations. Annals of Eugenics. 1951;15:323–354. doi: 10.1111/j.1469-1809.1949.tb02451.x. [DOI] [PubMed] [Google Scholar]
  96. Wu CI, Davis AW. Evolution of postmating reproductive isolation: the composite nature of Haldane's rule and its genetic bases. The American Naturalist. 1993;142:187–212. doi: 10.1086/285534. [DOI] [PubMed] [Google Scholar]
  97. Zhang L, Sun T, Woldesellassie F, Xiao H, Tao Y. Sex ratio meiotic drive as a plausible evolutionary mechanism for hybrid male sterility. PLOS Genetics. 2015;11:e1005073. doi: 10.1371/journal.pgen.1005073. [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision letter

Editor: Molly Przeworski

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Gene flow mediates the role of sex chromosome meiotic drive during complex speciation" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Diethard Tautz as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission. As you will see from the comments below, all three reviewers appreciated the impressive amount of work and thought the findings were interesting and novel. However, they had some concerns, notably that (1) Gmin is a relatively untested method and its biases could potentially confound interpretation of the results. In that regard, they suggested either evaluating its performance by simulation or using it alongside standard methods such as the D statistic and checking that the conclusions remain unchanged. (2) The reviewers pointed out that both the use of Gmin and the mapping analysis are insufficiently detailed (or not described at all); please make sure to add this information in your revision. Other suggestions are included below, in individual comments.
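
For readers unfamiliar with the statistic under discussion, the following is a minimal sketch of Gmin for a single genomic window, assuming the definition given in Geneva et al. (2015): the minimum number of pairwise differences between haplotypes from the two populations, divided by the mean number of between-population differences. The function names and the toy haplotypes are hypothetical and are not taken from POPBAM or from the authors' pipeline.

```python
from itertools import product


def pairwise_differences(hap_a: str, hap_b: str) -> int:
    """Count the sites at which two aligned haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must be aligned to the same length")
    return sum(x != y for x, y in zip(hap_a, hap_b))


def gmin(pop1, pop2):
    """Ratio of the minimum to the mean between-population distance.

    Returns None when the mean distance is zero (ratio undefined).
    """
    # Only between-population pairs enter the statistic.
    dists = [pairwise_differences(a, b) for a, b in product(pop1, pop2)]
    if not dists:
        raise ValueError("both populations need at least one haplotype")
    mean_d = sum(dists) / len(dists)
    if mean_d == 0:
        return None
    return min(dists) / mean_d


# Toy window (hypothetical data): two haplotypes from one species and
# either two or three from the other.
sim = ["AAAAAAAAAA", "AAAAAAAAAC"]
mau_no_migrant = ["GGGGAAAAAA", "GGGGAAAAAA"]
mau_with_migrant = mau_no_migrant + ["AAAAAAAAAA"]  # one shared haplotype

if __name__ == "__main__":
    print(gmin(sim, mau_no_migrant))
    print(gmin(sim, mau_with_migrant))
```

Under this definition the ratio falls toward 0 when one between-population pair is much more similar than the rest, and it stays near 1 when all between-population distances are alike. The second property is the arithmetic behind the concern that a segment shared by every sampled haplotype would not stand out.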

Reviewer #1:

Meiklejohn et al. use a clever design to map to moderate resolution several male-sterile hybrid incompatibilities between D. mauritiana and D. simulans. They also find several large regions that are tolerant to introgression on the X, including a known drive element that has undergone introgression between species in natural populations. The experimental approach is novel and the findings interesting, but I have some concerns outlined below:

1a) Analyses of gene flow. The observed pattern of introgression around a drive locus is very interesting, but I had several concerns about the analysis and would like to see a more comprehensive analysis of the gene flow signal reported. First, I was unsure whether Gmin is the most sensitive and relevant statistic for this test, in part because it seems that it will only detect introgressing segments that are not fixed. If true, this seems like it could contribute to the frequency difference in introgressed segments observed on the X and autosomes. Second, it would be helpful to get a sense of the expected performance of Gmin under a number of conditions such as recombination and mutation rate variation, particularly given the observation that regions identified have systematically smaller numerators and denominators. I was also unsure how the setup of the simulation procedure could influence the results.

1b) Winters sex ratio drive system results. I was wondering if the authors could expand on these observations and analyses. First, I was wondering if the moderate Gmin reduction here suggests that this introgression might be old (?) but the fact that it is identified by Gmin and the authors' other analyses suggests that it is not fixed. Or is a ~2-fold reduction in Gmin expected for very recent introgression? It would also be informative in interpreting the results if the authors added permutations on what the probability of overlap of a 130 kb introgressed region with a known drive system on the X is by chance; I expect it would be low. Furthermore, I was confused about the argument that these regions had experienced selective sweeps in both species. I believe for simulans this is in reference to previous work, but evidence for mauritiana appears to be in reference to lower Dxy within mauritiana in this region. I would recommend a more formal analysis of this or tempering the language somewhat.

2) Possible limitations of the mapping design and analysis. If I am understanding the setup correctly, one possible limitation is that there may be many incompatible segments within the introgressed segments, but only a subset of recombination events that combine the white eye and fluorescent markers can be detected. Thus, the authors rely on the QTL mapping to try to localize these incompatible regions, but there may be several or many within each region. To me this is suggested by the reported relationship between introgressed region length and fertility, as well as qualitatively by the high background in the mapping results. In order to tease this apart, the authors could add mixture proportion of the X as a covariate in their mapping analyses to understand how much individual regions explain sterility/fertility above the length of the introgressed segment. I think either conclusion is quite interesting. I do think dealing with the variation in mixture proportion between lines is necessary because it appears quite substantial from Figure 2 and thus could be viewed as substantial population structure in the mapping population.

3) Lack of important detail in analyses. The manuscript lacks important detail necessary to understand and interpret the results. I noticed this particularly in the description of mapping of fertility/sterility and in the Monte Carlo simulations. For mapping, it appears that rQTL was used but I did not find details of this in the Methods (approach, phenotype distribution assumed, covariates, permutation approach, etc.). For the Monte Carlo simulations, which are key to interpreting the Gmin results, I did not see details on how the simulations were done. i.e. what program was used or was an in-house program used, how large were the segments simulated, how was the appropriateness of null demographic models evaluated, among many other details.

It would be very interesting to add some simulations of expected dynamics of drive loci in this hybridization scenario or make more explicit reference to the findings of previous modeling if this is outside of the scope.
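
The chance-overlap calculation requested in comment 1b can be sketched as follows. This is an illustration only, assuming a single segment of fixed length placed uniformly at random along a chromosome; the chromosome length and locus coordinates used in the example are placeholders, not the actual X chromosome length or the coordinates of the drive locus, and only the 130 kb segment length comes from the text above. Function names are hypothetical.

```python
import random


def exact_overlap_probability(chrom_len, seg_len, locus_start, locus_end):
    """Probability that a randomly placed segment overlaps a locus.

    Intervals are half-open: the segment is [start, start + seg_len) and
    the locus is [locus_start, locus_end). Start positions are integers
    drawn uniformly from 0..chrom_len - seg_len.
    """
    max_start = chrom_len - seg_len
    if max_start < 0:
        raise ValueError("segment is longer than the chromosome")
    lo = max(0, locus_start - seg_len + 1)  # first start that overlaps
    hi = min(max_start, locus_end - 1)      # last start that overlaps
    return max(0, hi - lo + 1) / (max_start + 1)


def simulated_overlap_probability(chrom_len, seg_len, locus_start, locus_end,
                                  n_perm=100_000, seed=1):
    """Monte Carlo estimate of the same probability by random placement."""
    rng = random.Random(seed)
    max_start = chrom_len - seg_len
    hits = 0
    for _ in range(n_perm):
        start = rng.randint(0, max_start)  # inclusive on both ends
        if start < locus_end and start + seg_len > locus_start:
            hits += 1
    return hits / n_perm


if __name__ == "__main__":
    # Placeholder coordinates (assumptions): 20 Mb chromosome, 10 kb locus.
    args = (20_000_000, 130_000, 9_000_000, 9_010_000)
    print(exact_overlap_probability(*args))
    print(simulated_overlap_probability(*args))
```

A fuller permutation scheme would also need to respect masked or unmappable regions and the window grid used in the scan; the uniform placement here ignores both.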

Reviewer #2:

This manuscript by Meiklejohn and colleagues provides a comprehensive body of work involving genetic crosses and genomic analyses to address fundamental questions regarding the role of meiotic drive in speciation. In particular, this study brings together introgression analysis of the X chromosome and population genomic analyses between Drosophila simulans and D. mauritiana. This is a staggering amount of work, the methods and analyses appear sound, the conclusions are both robust and novel, and the manuscript is presented in a clear and concise manner.

The authors first use a 2P-based introgression system to move tiled sections of the X chromosome from D. mauritiana into D. simulans through >40 backcross generations, which allows them to coarsely map hybrid male sterility loci. They then use a 1P-YFP based method to perform high resolution mapping of these loci. Among the fertile introgressions generated during this study, they uncovered a cryptic drive system in D. mauritiana. This drive system is novel and does not correspond to any of the three known systems in D. simulans. Among the sterile introgressions, Meiklejohn and colleagues detect four small regions that cause hybrid male sterility. None of these sterile introgressions appear to show signs of meiotic drive. Together, the authors conclude that there is ample evidence for cryptic drive between species, but none of the sterile introgressions show any association with drive elements.

1) Here, it is probably worth pointing out that if a history of drive led to the evolution of any of the hybrid male sterile loci, this would be undetectable. Sterile introgression males, by definition, produce few or no kids and would not allow robust detection of biased sex ratios. Such males may, however, sometimes prove to be very weakly fertile. Progeny data from such very weakly fertile males (< five kids), however, are discarded in the current analysis. Along these lines, it may be worth separately describing the progeny sex ratios in very weakly fertile males generated in otherwise sterile introgressions.

The population genomic analyses uncover 47 genomic regions of introgression between these species located on autosomes, and only one on the X chromosome. This X-linked region corresponds to a very small region that includes genes involved in a known D. simulans drive system (Dox/ MDox). This region of low divergence is associated with a known drive system but not with any of the hybrid male sterile genes. This suggests that gene flow across the two species, perhaps mediated by the selfish spread of this driver, prevented the evolution of hybrid incompatibility genes in this region of the X-chromosome. These results indicate that selfish drive systems may not only promote the evolution of hybrid sterility between species as has been shown in other studies but may also act against the evolution of reproductive isolation in certain cases.

2) Here, it is worth noting that the X chromosomes of most strains in D. simulans or D. mauritiana do not distort within species. If the Winters system has indeed invaded across species, then this may suggest that its suppressors of drive may also have accompanied the driving locus. In this context, it may be worth noting whether the region on 3R that corresponds to the known suppressor Nmy also show signs of introgression.

Reviewer #3:

This manuscript has three large components. First, Meiklejohn et al. map the genetic basis of hybrid sterility in a recently diverged species pair of Drosophila. They use high resolution mapping to identify the genetic basis of hybrid sterility in simulans/mauritiana hybrid males (carrying the mauritiana X chromosome). Sometimes the term high resolution mapping gets overhyped; not in this case, these authors really mean it. They used multiple mapping methods to validate their results (Figure 1, Figure 2 and Figure 3). They find six (!) separable elements involved in sterility. The second portion of the manuscript uses the experimental introgressions (one of the approaches the authors used to map the genetic basis of hybrid sterility) and finds that some introgressions lead to cryptic X-chromosome drive in D. mauritiana. These results are important because there are very few known drive systems for which the molecular basis is known. These are important experiments and they try to elucidate the reasons why X-chromosomes are commonly involved in isolation. Moreover, we know little about how meiotic drive systems affect genome divergence (but see below). These two components of the manuscript are nothing short of stunning and frankly it's some of the best evolutionary genetics I have read in the last 12 months. I have no major comments for this section.

The third component of the manuscript is a population genomics approach to identify regions from the X-chromosome that have crossed species boundaries. The authors use the Gmin metric to detect introgressions in the X-chromosome. This metric is slightly problematic though. Gmin was originally proposed by Geneva et al. (2015). After studying the paper, I found it has no real estimates of sensitivity of the method; the closest it gets is a comparison between Gmin and Fst. The latter is not a proper metric to detect gene flow and its shortcomings have been discussed at length (e.g., Noor and Bennett, 2010; Guerrero and Hahn, 2017 among many others). A related issue is that the method implemented in POPBAM uses windows to assess the existence of gene flow. This is a limited approach to infer the presence of small introgressions: if the X-linked alleles are strongly selected against, one would expect smaller haplotypes, which means windows in the X-chromosome have less power than windows in the autosomes to detect introgression. The authors also use this window-based approach to calculate the size of the haplotype around Dox and infer that its size is 130kb (the upper panel of Figure 6 is not very informative). Then the authors infer selective sweeps in this region. This 'haplotype' shows reduced polymorphism and lower interspecific divergence than the rest of the genome. This result is interesting, but I am left to wonder how Gmin performs in instances of low polymorphism (i.e., what is the power of the metric). Schrider et al. (bioRxiv, Figure 1) provide an estimate of the performance of Gmin (but compared to their own method, so not very useful here) and conclude that Gmin can lead to false positives.

Overall, the second half of the manuscript is weaker than the first, but I think that if the authors can demonstrate that Gmin performs well to detect interspecific gene flow with high sensitivity and specificity, I would be convinced of the results. I would suggest two possible solutions here. The authors could (1) simulate genomes with different levels of introgression and determine whether the metric performs well with the level of divergence observed between simulans and mauritiana (Geneva et al., 2015 has some forward simulations that really do not address this issue) or (2) use an additional method to validate their results with POPBAM.

Two additional notes on this topic. First, I am a little surprised there is no mention of the possibility of incomplete lineage sorting (ILS) and its possible involvement with meiotic drive. For example, if the haplotype around Dox and MDox is truly 130kb, ILS is a truly unlikely explanation. If the signal is caused by smaller segments of shared ancestry that get collated into a large window, then it is less likely. (This comment is related to my concern about the use of Gmin to detect the size of an introgression.)

An additional suggestion for the authors. There is a serious amount of important work in this manuscript. I had to read the piece over half a dozen times and the connection between sterility, drive, and gene exchange never crystallized. A couple of statements summarizing the results and stating the connection between sections would solve this issue and make this piece much more enjoyable.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Gene flow mediates the role of sex chromosome meiotic drive during complex speciation" for further consideration at eLife. Your revised article has been favorably evaluated by Diethard Tautz (Senior Editor), a Reviewing Editor, and two of the original three reviewers.

Everyone appreciated your revisions, but the reviewers had a couple of additional suggestions that we would like you to address before publication, most of which involve slight revisions to the text or clarifications to be added. We leave it to your discretion if you want to take up suggestion #1 of reviewer 1, or further discuss the D statistic results and the concordance with other lines of evidence in the text.

Reviewer #1:

The authors made a number of the suggested changes from the first round of review and I believe that the paper is much improved. I had a couple of additional questions/concerns regarding the added analyses and details included in this version of the manuscript.

1) I appreciate the authors adding D-statistic analyses in addition to Gmin analyses. One question arising from these analyses was that the estimate of Patterson's D for the X chromosome was not lower than the rest of the genome, and this seems to conflict with other results. This made me concerned that the observations about fewer Gmin-identified windows on the X were due to power differences or some other issue and not due to lower introgression. Since D is not a direct measure of mixture proportion, something like an F4-ratio could be used and I would be reassured if this resulted in lower estimates of introgression on the X. It would also be a helpful reality check if it could be shown that mixture proportions are estimated to be lower in regions with mapped hybrid sterility loci from this study (and possibly higher in regions estimated to be permeable to introgression based on mapping).

2) Data quality questions. I was concerned about what seemed like ad hoc data quality evaluation in the fifth paragraph of the Materials and methods section. It seems like visual inspection could be replaced with a formal threshold (e.g. n kb with high posterior probability mauritiana ancestry, or a particular coverage threshold). How was evidence of contamination evaluated and how many lines did this impact? It would also be helpful to know how many genotypes were excluded due to these criteria.

I also became concerned about reference bias issues and signals of introgression in reading the added details of the POPBAM analyses. The minimum coverage threshold of 3 reads seemed much too low to me based on experience with this kind of data. Low coverage can exacerbate issues with reference bias and, given that mapping was to the mauritiana reference rather than an outgroup, could potentially impact inferences of gene flow. I do not have an intuition about how this could impact Gmin analyses but have found it to have an impact on D-statistic type analyses.

Comment on response to reviewers: In response to my previous comments about Gmin the authors note "There is little evidence for recent parallel, hard selective sweeps from population genomic data for these species to date[…]" I am not sure that the population genetic observations relating to non-admixed models the authors detail here are directly relevant, as the dynamics after admixture can be quite different particularly with selection (both negative and positive). As before, I would prefer analyses of local ancestry that were sensitive to fixed regions as I think it would give more insight into the history of admixture, but do not think this impacts the main results of the paper which remain exciting.

Reviewer #3:

The points related to the genetic mapping have been addressed.

I still have a quibble regarding the detection of introgression. The authors use Gmin to detect introgressions and admit the metric is best suited to detect introgression in instances of recent (I'd argue very recent) introgression. In this new version they add calculations of the D-statistic on genomic windows to obtain an independent confirmation of their results.

I have reservations about a few statements in the manuscript though.

Since Gmin is dependent on dmin, its power will depend on the amount of polymorphism in a window. Since the magnitude of variation is different between autosomes and X-chromosomes (subsection “Population genomics of speciation history”), I am not sure Gmin is a good metric to compare the magnitude of introgression between X chromosomes and autosomes (as stated in the sixth paragraph).

The newly added analysis, D calculated on 10kb genomic-windows suffers from similar issues as Gmin. Simon et al., (2014) describes the statistical properties of the metric in detail.

I lean to think that differences in π or dmin can fully explain the autosome/X ratio in number of introgressed windows of 47:1 but I think it is worth including the caveat.

I am convinced the DOX alleles have crossed species boundaries, but I still think the language of the manuscript needs a little bit of clean up.

eLife. 2018 Dec 13;7:e35468. doi: 10.7554/eLife.35468.049

Author response


Reviewer #1:

Meiklejohn et al. use a clever design to map to moderate resolution several male-sterile hybrid incompatibilities between D. mauritiana and D. simulans. They also find several large regions that are tolerant to introgression on the X, including a known drive element that has undergone introgression between species in natural populations. The experimental approach is novel and the findings interesting, but I have some concerns outlined below:

1a) Analyses of gene flow. The observed pattern of introgression around a drive locus is very interesting, but I had several concerns about the analysis and would like to see a more comprehensive analysis of the gene flow signal reported. First, I was unsure whether Gmin is the most sensitive and relevant statistic for this test, in part because it seems that it will only detect introgressing segments that are not fixed.

To clarify, Gmin can detect genomic segments of historical introgression in which the introgressed haplotype is (1) segregating in both populations, or (2) fixed in one population and segregating in the other. For example, our ~130-kb X-linked introgression is fixed in D. mauritiana but segregating in D. simulans.

The only introgressions that cannot be detected are those fixed in both species. We have empirical evidence that such reciprocal monophyly (at least for 10-kb windows) is rare: under reciprocal monophyly, expected Gmin → 1; but there are no windows where Gmin = 1, and 95% of Gmin values are <0.85 (see Figure 5). Furthermore, we argue below that the expected number of introgressions fixed in both species by drift should be negligible and those fixed in both species by positive selection should be detectable (due to information at flanking sites; see below).
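To make the behavior described above concrete, the following is a minimal sketch of a Gmin-style calculation. It assumes the definition given in Geneva et al. (2015), in which Gmin for a window is the minimum between-population pairwise distance divided by the mean between-population pairwise distance. The function names and toy haplotypes are hypothetical and are not taken from POPBAM or from the data analyzed here.

```python
from itertools import product

def hamming(a, b):
    """Count the sites at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def gmin(pop_x, pop_y):
    """Minimum between-population distance divided by the mean
    between-population distance (assumed definition of Gmin)."""
    dists = [hamming(a, b) for a, b in product(pop_x, pop_y)]
    mean_d = sum(dists) / len(dists)
    if mean_d == 0:
        raise ValueError("no between-population differences in this window")
    return min(dists) / mean_d

# Hypothetical toy window: two haplotypes sampled per population.
pop_x = ["AAAAAAAA", "AAAAAAAT"]

# Isolation-like case: every between-population pair is about equally distant.
isolated_y = ["CCCCAAAA", "CCCCAAAT"]

# Introgression-like case: one haplotype in Y closely resembles population X,
# so the minimum distance drops well below the mean.
introgressed_y = ["CCCCAAAA", "AAAAAAGA"]

gmin_isolated = gmin(pop_x, isolated_y)
gmin_introgressed = gmin(pop_x, introgressed_y)
print(gmin_isolated, gmin_introgressed)
```

In this toy case the ratio stays close to 1 when all between-population pairs are similarly diverged and falls when a single haplotype is shared across populations, which is the contrast the response relies on.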

If true, this seems like it could contribute to the frequency difference in introgressed segments observed on the X and autosomes. Second, it would be helpful to get a sense of the expected performance of Gmin under a number of conditions such as recombination and mutation rate variation, particularly given the observation that regions identified have systematically smaller numerators and denominators. I was also unsure how the setup of the simulation procedure could influence the results.

The publication describing Gmin included analyses of the statistic's performance over a range of both recombination and mutation rates (Geneva et al., 2015). These analyses showed that Gmin is largely insensitive to variation in these parameters.

We suggest that our inability to detect introgressed segments that are fixed in both species is unlikely to account for the observed deficit of X-linked introgressions. We consider the fixation of introgressions under (1) the standard neutral model (SNM) and (2) a standard selective sweep (SSM) model.

Under SNM assumptions, shorter transit times for neutral X-linked alleles destined for fixation are expected based on the smaller effective population size of the X chromosome. As we are interested in haplotypes that become fixed in both populations, the quantity of interest is the expected time to reciprocal monophyly (i.e., coalescence within each of the two populations). For an autosomal locus, Hudson and Coyne (2002) showed that the probability of reciprocal monophyly reaches 0.95 only after 8.7Ne generations; this is >7-fold older than the inferred split time for our species. (Note that the Hudson-Coyne calculations assume a random partitioning of an ancestral population of size N into two descendant populations both also of size N; this history is different from the introgression of a haplotype from one population into another via gene flow. Nevertheless, the theory clearly predicts longer conditional times to fixation than expected for a single population, i.e., >4N generations.) For neutral introgressions, then, genealogies that are reciprocally monophyletic constitute a negligible fraction of sites on the X chromosome and the autosomes.

Under a model of parallel selective sweeps, reciprocal monophyly can in principle occur much faster than the neutral case. However, we believe the parallel sweeps model will have limited impact in reducing the efficacy of Gmin, for three reasons:

1) Parallel, hard selective sweeps are required to fix the introgressed haplotype in both species. There is little evidence for recent parallel, hard selective sweeps from population genomic data for these species to date.

2) The complete parallel sweeps will only produce reciprocal monophyly for the sequences corresponding to the overlap between the two fixed haplotypes. Most sweeps result in relatively short tracts that are identical-by-descent (IBD) within species, and the length of the overlap of the two IBD tracts will be smaller still.

3) Even under conditions that produce reciprocal monophyly, we expect to be able to detect introgression using Gmin at flanking sites. The reason is that flanking sites will enter the receiving population with the introgression; hitchhike to non-trivial population frequencies; but then get recombined away from the beneficial introgressed mutation(s). (Note that this is the same process that causes elevated LD and excess rare variants at flanking sites in the wake of a classic hard sweep.) While the beneficial mutation goes to fixation, flanking sites that “escape” the sweep will segregate as polymorphic introgressed haplotypes and will therefore be detectable by Gmin.

Finally, we note that an earlier study that identified candidate introgressed regions using only a single sequence from D. simulans and D. mauritiana also found a deficit of X-linked introgressions (Garrigan et al., 2012).

1b) Winters sex ratio drive system results. I was wondering if the authors could expand on these observations and analyses. First, I was wondering if the moderate Gmin reduction here suggests that this introgression might be old (?) but the fact that it is identified by Gmin and the authors' other analyses suggests that it is not fixed. Or is a ~2-fold reduction in Gmin expected for very recent introgression?

We predict no simple relationship between the age of an introgressed segment and quantitative reduction in Gmin, as this statistic is determined by both the similarity of between-species haplotypes and by the frequency of the introgressed haplotype in both populations through the average DXY. Two considerations suggest that the introgression at Dox-MDox occurred recently: first, at the Dox locus we find no fixed differences within the introgressed haplotype between D. simulans and D. mauritiana. Second, Gmin has limited power to detect older introgressions (see more details on this below).

It would also be informative in interpreting the results if the authors added permutations on what the probability of overlap of a 130 kb introgressed region with a known drive system on the X is by chance; I expect it would be low.

We have approached this question in two ways. First, at a cutoff of p < 0.001 (FDR = 0.05) we find that nine of 1,842 10-kb windows on the X chromosome have a significant Gmin value. In 10,000 random permutations of the data, no permutation resulted in a significant Gmin value for both 10-kb windows containing Dox and MDox, suggesting the probability of observing this result by chance is less than 0.0001. Second, these nine significant 10-kb windows are all located within a single 130-kb region on the X chromosome; hence our inference that they together represent a single introgression event. MDox and Dox are physically separated by five 10-kb windows, so that there are seven 130-kb segments that include both loci. As there are 1,830 10-kb windows on the X chromosome, the probability that a randomly placed 130-kb window includes both MDox and Dox is 7/1830 = 0.004. Both of these approaches suggest that the probability of overlap of a 130-kb introgressed region with a known drive system on the X by chance is indeed low. We have included this second estimate in the revised manuscript (Discussion section).
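The second calculation can be checked with a short sketch. One reading that reconciles the 1,842 and 1,830 figures, offered here as an interpretation and not as something the response states, is that a 130-kb segment spans 13 windows and can therefore start at 1,842 - 13 + 1 = 1,830 positions. The window index assigned to Dox below is hypothetical, and the simulation is a simple random-placement check, not the permutation test of Gmin values described in the first approach.

```python
import random

N_WINDOWS = 1842   # 10-kb windows on the X chromosome (from the text)
SEGMENT = 13       # a 130-kb segment spans thirteen 10-kb windows
GAP = 5            # windows separating the Dox and MDox windows (from the text)

# Windows from the Dox window to the MDox window, inclusive.
span = GAP + 2
# Distinct start positions for a 13-window segment (assumed interpretation).
placements = N_WINDOWS - SEGMENT + 1
# Start positions for which the segment covers the whole Dox-to-MDox span.
covering = SEGMENT - span + 1
p_analytic = covering / placements

def simulate(n_trials, dox_index=900, seed=1):
    """Place segments uniformly at random and count how often both
    loci are covered. dox_index is a hypothetical window position."""
    rng = random.Random(seed)
    mdox_index = dox_index + GAP + 1
    hits = 0
    for _ in range(n_trials):
        start = rng.randrange(placements)
        end = start + SEGMENT - 1
        if start <= dox_index and mdox_index <= end:
            hits += 1
    return hits / n_trials

print(placements, covering, round(p_analytic, 4))
print(simulate(200_000))
```

Under these assumptions the counting argument reproduces the seven covering segments and the 7/1830 figure quoted above.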

There is a second known X-linked drive system in D. simulans, known as the Paris system. This system comprises two loci separated by ~160 kb that are both required for segregation distortion and thus could not be included within a 130-kb introgressed segment.

Furthermore, I was confused about the argument that these regions had experienced selective sweeps in both species. I believe for simulans this is in reference to previous work, but evidence for mauritiana appears to be in reference to lower Dxy within mauritiana in this region. I would recommend a more formal analysis of this or tempering the language somewhat.

Two previously published studies have presented formal analyses indicating a very strong, recent, hard selective sweep at the Dox/MDox locus in D. mauritiana (Nolte et al., 2013; Garrigan et al., 2014), and a third study documented evidence of recent positive selection at Dox and MDox in D. simulans (Kingan, Garrigan and Hartl, 2010). Garrigan et al., 2014 used the same genome sequences for D. mauritiana analyzed here, while Kingan et al., 2010 detected positive selection in a set of D. simulans isolates that are distinct from the sequences analyzed in this manuscript. We have updated the text of the manuscript to more clearly highlight this past work (Results section).

2) Possible limitations of the mapping design and analysis. If I am understanding the setup correctly, one possible limitation is that there may be many incompatible segments within the introgressed segments, but only a subset of recombination events that combine the white eye and fluorescent markers can be detected.

The mapping approach we used is able to detect all recombination events between the two P[w+] markers. For each 2P interval, we used eYFP markers located outside the two P[w+] markers, and we identified recombination events via the loss of one P[w+] marker (through changes in eye color from red or dark orange to light orange) and the gain of fluorescence. The utility of the eYFP marker was to trap specific recombination breakpoints and allow repeated measurement of fertility among males that all carry the same recombinant segment.

However, our approach is limited in its ability to identify the effects of hybrid sterility loci located in the middle of each 2P segment (perhaps this was the reviewer's point). As each recombinant genotype carries one P[w+] marker, sterility factors closely linked to the P[w+] may mask the effects of additional sterility factors within the 2P segment further from the P[w+]. This is most clearly seen for the interval 2P-6, where we detect complete sterility associated with both the left and right P[w+] markers, and thus cannot determine whether additional sterility factors reside in the middle of the interval. This limitation is discussed in the manuscript (Materials and methods section).

Thus, the authors rely on the QTL mapping to try to localize these incompatible regions, but there may be several or many within each region. To me this is suggested by the reported relationship between introgressed region length and fertility, as well as qualitatively by the high background in the mapping results. In order to tease this apart, the authors could add mixture proportion of the X as a covariate in their mapping analyses to understand how much individual regions explain sterility/fertility above the length of the introgressed segment. I think either conclusion is quite interesting. I do think dealing with the variation in mixture proportion between lines is necessary because it appears quite substantial from Figure 2 and thus could be viewed as substantial population structure in the mapping population.

It is true that the length of D. mauritiana segments varies extensively between recombinant introgression genotypes, although it is not clear to us how this could be considered equivalent to population structure. We have re-examined the relationship between introgression length and fertility and conclude that there is less evidence for weak sterilizing factors distributed across the X chromosome than we first suspected. The previously reported negative correlation between introgression length and fertility (Pearson's r = -0.22, p<0.0001; Spearman's r = -0.31, p<0.0001) is almost entirely attributable to long, completely sterile introgressions. When we include only the 264 genotypes with a mean fertility >0, the correlation between length and fertility is greatly reduced (r = -0.13, P = 0.04; r = -0.14, P = 0.02). When we include only the 210 genotypes with sufficient fertility to be included in our sex-ratio analyses, this correlation disappears entirely (r = -0.05, P = 0.49; r = -0.03, P = 0.75). We therefore conclude that the effect of introgression length is largely the result of long introgressions being more likely to carry large-effect D. mauritiana alleles that significantly reduce male fertility.
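The logic of this re-analysis, in which a negative correlation is driven by completely sterile genotypes and weakens once they are excluded, can be illustrated with a small sketch. The data below are invented for illustration only and do not reproduce the values reported above; the helper functions are plain-Python stand-ins for whatever statistical software was actually used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical genotypes: introgression length (arbitrary units) and mean
# fertility. The four longest introgressions are completely sterile.
length = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
fertility = [50, 42, 55, 47, 52, 45, 0, 0, 0, 0]

# Same analysis restricted to genotypes with mean fertility > 0.
fertile = [(l, f) for l, f in zip(length, fertility) if f > 0]
fertile_length = [l for l, _ in fertile]
fertile_fertility = [f for _, f in fertile]

print(pearson(length, fertility), spearman(length, fertility))
print(pearson(fertile_length, fertile_fertility),
      spearman(fertile_length, fertile_fertility))
```

In this toy example the correlation is strongly negative across all genotypes and close to zero among the fertile subset, mirroring the qualitative pattern the response describes.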

This conclusion is supported by the analyses including introgression length suggested by the reviewer. For some of the QTL models we have implemented, it is not possible to use introgression length as a covariate in the R/qtl package. However, for the models where it is possible, we have added analyses that include introgression length as a covariate (see Figure 3—figure supplement 2). The results are largely consistent with analyses that do not include this covariate; we conclude that identification of major sterility regions via QTL analysis is not seriously confounded by the quantitative effects of introgression length on male fertility. The revised Results section now refers only to partial correlation analyses between introgression length, fertility, and progeny sex-ratio.

3) Lack of important detail in analyses. The manuscript lacks important detail necessary to understand and interpret the results. I noticed this particularly in the description of mapping of fertility/sterility and in the Monte Carlo simulations. For mapping, it appears that rQTL was used but I did not find details of this in the Methods (approach, phenotype distribution assumed, covariates, permutation approach, etc). For the Monte Carlo simulations, which are key to interpreting the Gmin results, I did not see details on how the simulations were done. i.e. what program was used or was an in-house program used, how large were the segments simulated, how was the appropriateness of null demographic models evaluated, among many other details.

We have updated the Materials and methods section to include these details that were previously omitted (QTL analysis and the Monte Carlo simulations).

It would be very interesting to add some simulations of expected dynamics of drive loci in this hybridization scenario or make more explicit reference to the findings of previous modeling if this is outside of the scope.

We agree that this is an interesting question, but it is beyond the scope of our (already lengthy) manuscript. Unfortunately, while theory exists for adaptive introgression in the context of linked genetic incompatibilities (Uecker et al., 2015), we’re unaware of comparable treatments of selfish introgression of drive loci in the context of linked genetic incompatibilities.

Reviewer #2:

This manuscript by Meiklejohn and colleagues provides a comprehensive body of work involving genetic crosses and genomic analyses to address fundamental questions regarding the role of meiotic drive in speciation. In particular, this study brings together introgression analysis of the X chromosome and population genomic analyses between Drosophila simulans and D. mauritiana. This is a staggering amount of work, the methods and analyses appear sound, the conclusions are both robust and novel, and the manuscript is presented in a clear and concise manner.

The authors first use a 2P-based introgression system to move tiled sections of the X chromosome from D. mauritiana into D. simulans through >40 backcross generations, which allows them to coarsely map hybrid male sterility loci. They then use a 1P-YFP based method to perform high resolution mapping of these loci. Among the fertile introgressions generated during this study, they uncovered a cryptic drive system in D. mauritiana. This drive system is novel and does not correspond to any of the three known systems in D. simulans. Among the sterile introgressions, Meiklejohn and colleagues detect four small regions that cause hybrid male sterility. None of these sterile introgressions appear to show signs of meiotic drive. Together, the authors conclude that there is ample evidence for cryptic drive between species, but none of the sterile introgressions show any association with drive elements.

1) Here, it is probably worth pointing out that if a history of drive led to the evolution of any of the hybrid male sterile loci, this would be undetectable. Sterile introgression males, by definition, produce few or no kids and would not allow robust detection of biased sex ratios. Such males may, however, sometimes prove to be very weakly fertile. Progeny data from such very weakly fertile males (< five kids), however, are discarded in the current analysis. Along these lines, it may be worth separately describing the progeny sex ratios in very weakly fertile males generated in otherwise sterile introgressions.

This important point has now been highlighted in the Discussion section. The progeny sex ratios from all males, including sub-fertile males, are now presented in Figure 4—figure supplement 1 and Figure 4—figure supplement 2. In short, we detect no evidence for systematically female-biased progeny sex-ratios among sub-fertile males.

The population genomic analyses uncover 47 genomic regions of introgression between these species located on autosomes, and only one on the X chromosome. This X-linked region corresponds to a very small region that includes genes involved in a known D. simulans drive system (Dox/MDox). This region of low divergence is associated with a known drive system but not with any of the hybrid male sterile genes. This suggests that gene flow across the two species, perhaps mediated by the selfish spread of this driver, prevented the evolution of hybrid incompatibility genes in this region of the X-chromosome. These results indicate that selfish drive systems may not only promote the evolution of hybrid sterility between species as has been shown in other studies but may also act against the evolution of reproductive isolation in certain cases.

2) Here, it is worth noting that the X chromosomes of most strains in D. simulans or D. mauritiana do not distort within species. If the Winters system has indeed invaded across species, then this may suggest that its suppressors of drive may also have accompanied the driving locus. In this context, it may be worth noting whether the region on 3R that corresponds to the known suppressor Nmy also shows signs of introgression.

Unfortunately, the region on 3R containing Nmy is surrounded by complex repeat sequences that preclude its assembly using short-read sequence data. As a consequence, the Nmy locus is missing from our assembly and alignments, and we cannot at the moment address whether Nmy has co-introgressed with MDox/Dox. This is however an interesting point we intend to return to in future work.

Reviewer #3:

This manuscript has three large components. First Meiklejohn et al. map the genetic basis of hybrid sterility in a recently diverged species pair of Drosophila. They use high resolution mapping to identify the genetic basis of hybrid sterility in simulans/mauritiana hybrid males (carrying the mauritiana X chromosome). Sometimes the term high resolution mapping gets overhyped; not in this case, these authors really mean it. They used multiple mapping methods to validate their results (Figure 1, Figure 2 and Figure 3). They find six (!) separable elements involved in sterility. The second portion of the manuscript uses the experimental introgressions (one of the approaches the authors used to map the genetic basis of hybrid sterility) and finds that some introgressions lead to cryptic X-chromosome drive in D. mauritiana. These results are important because there are very few known drive systems for which the molecular basis is known. These are important experiments and they try to elucidate the reasons why X-chromosomes are commonly involved in isolation. Moreover, we know little about how meiotic drive systems affect genome divergence (but see below). These two components of the manuscript are nothing short of stunning and frankly it's some of the best evolutionary genetics I have read in the last 12 months. I have no major comments for this section.

We appreciate these generous comments from the reviewer.

The third component of the manuscript is a population genomics approach to identify regions from the X-chromosome that have crossed species boundaries. The authors use the Gmin metric to detect introgressions in the X-chromosome. This metric is slightly problematic though. Gmin was originally proposed by Geneva et al., (2015). After studying the paper, I found it has no real estimates of sensitivity of the method; the closest it gets is a comparison between Gmin and Fst. The latter is not a proper metric to detect gene flow and its shortcomings have been discussed at length (e.g., Noor and Bennett 2010; Guerrero and Hahn, 2017 among many others). A related issue is that the method implemented in POPBAM uses windows to assess the existence of gene flow. This is a limited approach to infer the presence of small introgressions: if the X-linked alleles are strongly selected against, one would expect smaller haplotypes which means windows in the X-chromosome have less power than windows in the autosomes to detect introgression. The authors also use this window-based approach to calculate the size of the haplotype around Dox and infer that its size is 130kb (the upper panel of Figure 6 is not very informative). Then the authors infer selective sweeps in this region. This 'haplotype' shows reduced polymorphism and lower interspecific divergence than the rest of the genome. This result is interesting, but I am left to wonder how does Gmin perform in instances of low polymorphism (i.e., what is the power of the metric). Schrider et al., (bioRxiv, Figure 1) provides an estimate of the performance of Gmin (but compared to their own method, so not very useful here) and conclude that Gmin can lead to false positives.

We agree that a plausible cause of the dearth of X-linked introgressions is that X-linked foreign alleles are more strongly selected against, leading to smaller introgressions on the X that are more difficult to detect (see Materials and methods section). However, we consider this to be an interesting biological phenomenon, rather than a limitation of the Gmin statistic per se.

Simulations presented both in Geneva et al., 2015 (https://doi.org/10.1371/journal.pone.0118621) and Schrider et al., 2018 (https://doi.org/10.1371/journal.pgen.1007341) indicate that Gmin has the greatest power to detect recent introgressions. Our interpretation of Figure 1 in Schrider et al., 2018 is not that Gmin has a high rate of false positives, but rather that it has a high rate of false negatives for all but the most recent of introgressions. We therefore conclude that our analysis has likely missed older introgressions, but that we can be reasonably confident about the introgressions it did identify.

The relationship between polymorphism and the probability of identifying introgression by Gmin is complex. However, two considerations suggest that the reduced polymorphism at the MDox/Dox locus should not lead to a spurious inference of introgression. First, simulations presented in Geneva et al., 2015 indicate that Gmin's false negative rate (proportion of truly introgressed segments missed by Gmin; 1 – sensitivity) increases with decreasing polymorphism, while the false positive rate (proportion of segments identified by Gmin that did not truly introgress; 1 – specificity) is insensitive to levels of polymorphism. This suggests that we should expect a greater rate of false negatives in regions with low polymorphism, and that the identification of introgressed MDox/Dox alleles is conservative.

Second, there is a significant negative correlation between the Gmin P-value and polymorphism within both D. simulans (Spearman's r=-0.22, p<0.0001) and D. mauritiana (r=-0.38, p<0.0001), indicating that windows with higher polymorphism tend to have lower Gmin values, although this correlation is largely driven by the large majority of non-significant windows (Figure 5—figure supplement 2). However, 10-kb windows with significant Gmin values have lower levels of polymorphism in D. simulans than non-significant windows, while significant windows have higher levels of polymorphism in D. mauritiana than non-significant windows (Figure 5—figure supplement 2). One interpretation of this pattern is that windows identified as significant by Gmin have levels of polymorphism similar to that in the other species (D. simulans harbors more polymorphism than D. mauritiana), which we think is consistent with these windows carrying lineages derived from the other species. Altogether, these considerations suggest that our inference of introgression at MDox/Dox should be robust to the low levels of polymorphism at this locus.

Overall, the second half of the manuscript is weaker than the first, but I think that if the authors can demonstrate that Gmin performs well to detect interspecific gene flow with high sensitivity and specificity, I would be convinced of the results. I would suggest two possible solutions here. The authors could (1) simulate genomes with different levels of introgression and determine whether the metric performs well with the level of divergence observed between simulans and mauritiana (Geneva et al., 2015 has some forward simulations that really do not address this issue) or (2) the authors use an additional method to validate their results with POPBAM.

As mentioned above, the properties and behavior of Gmin have been explored in at least three publications, Geneva et al., 2015, Schrider et al., 2018, and Rosenzweig et al., 2016. Geneva et al., 2015 studied its behavior, sensitivity, and specificity using coalescent simulations that explore a range of migration, mutation and recombination parameters. Rosenzweig et al., 2016 similarly investigated the power of Gmin for varying simulated levels of migration and migration time. Schrider et al., 2018 compared the performance of Gmin with their method that integrates multiple sequence features and summary statistics, including Gmin. As mentioned above, Schrider et al., 2018 showed that their method has increased power (fewer false negatives) relative to Gmin, but not that Gmin suffers from a high rate of false positives. Finally, we controlled the genome-wide false discovery rate to 5%, corresponding to P-values <0.001. Thus, we feel that our implementation of Gmin to detect regions of introgression has been adequately documented to justify these analyses.
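The response does not say which procedure was used to control the false discovery rate. As a hedged illustration of how an FDR level can translate into a P-value cutoff, the sketch below assumes the Benjamini-Hochberg step-up procedure; both the choice of procedure and the toy P-values are assumptions made for this example.

```python
def bh_threshold(pvalues, fdr=0.05):
    """Return the largest P-value rejected by the Benjamini-Hochberg
    procedure at the given FDR level, or None if nothing is rejected."""
    m = len(pvalues)
    threshold = None
    for rank, p in enumerate(sorted(pvalues), start=1):
        # Step-up rule: keep the largest p with p <= fdr * rank / m.
        if p <= fdr * rank / m:
            threshold = p
    return threshold

# Hypothetical per-window P-values: a few strong signals among many nulls.
toy_pvalues = [0.0001, 0.0004, 0.0007, 0.03, 0.2, 0.5, 0.8, 0.9, 0.95, 0.99]

cutoff = bh_threshold(toy_pvalues)
significant = [p for p in toy_pvalues if cutoff is not None and p <= cutoff]
print(cutoff, significant)
```

Under this assumed procedure, the effective cutoff depends on the number of tests and on how many small P-values are present, which is why a genome-wide FDR of 5% can correspond to a much smaller per-window P-value.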

Nonetheless, to show that our inference that the MDox/Dox region introgressed between species is robust to the method used to identify introgression, we have implemented the four-population (ABBA-BABA) test, summarized by Patterson's D statistic. Within the 130-kb MDox/Dox region we find that a large excess of derived sites is shared between D. simulans and D. mauritiana to the exclusion of D. sechellia relative to both the rest of the X chromosome and to the autosomes. We conclude that this approach also supports the recent movement of alleles at this locus between D. simulans and D. mauritiana. These results are presented in the revised manuscript (Discussion section).
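For readers unfamiliar with the four-population test, the following is a minimal sketch of Patterson's D computed from derived-allele frequencies, using the standard ABBA and BABA site-pattern expectations. The population labels P1, P2, P3 and the outgroup are left generic because the exact assignment of species to these positions is not spelled out in this response; the toy frequencies are hypothetical.

```python
def patterson_d(sites):
    """Patterson's D from per-site derived-allele frequencies.

    sites: iterable of (p1, p2, p3, p_out) tuples, one per site.
    A positive D indicates excess allele sharing between P2 and P3,
    a negative D excess sharing between P1 and P3.
    """
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, p_out in sites:
        abba += (1 - p1) * p2 * p3 * (1 - p_out)
        baba += p1 * (1 - p2) * p3 * (1 - p_out)
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)

# Hypothetical fixed-difference sites (frequencies of 0 or 1).
abba_site = (0, 1, 1, 0)   # derived allele shared by P2 and P3
baba_site = (1, 0, 1, 0)   # derived allele shared by P1 and P3

# Equal counts of both patterns, as expected without gene flow.
d_balanced = patterson_d([abba_site, baba_site])

# Three ABBA sites to one BABA site: excess sharing between P2 and P3.
d_excess = patterson_d([abba_site] * 3 + [baba_site])

print(d_balanced, d_excess)
```

A windowed analysis would apply the same function to the sites falling in each genomic window; as reviewer #1 notes elsewhere, D summarizes asymmetry in allele sharing and is not itself an estimate of mixture proportion.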

Two additional notes on this topic. First, I am a little surprised there is no mention of the possibility of incomplete lineage sorting (ILS) and its possible involvement with meiotic drive. For example, if the haplotype around Dox and MDox is truly 130kb, ILS is a truly unlikely explanation. If the signal is caused by smaller segments of shared ancestry that get collated into a large window, then it is less likely. (This comment is related to my concern about the use of Gmin to detect the size of an introgression.)

We never seriously entertained the possibility that ILS could explain the signal of introgression at MDox/Dox, for multiple reasons including those outlined by the reviewer. There are many ancestral polymorphisms segregating in these two species (see Table 4), consistent with incomplete lineage sorting (ILS). But these ancestral polymorphisms segregate as SNPs interspersed with new, derived SNPs that have accumulated since the species diverged. They do not persist as long, intact ancestral haplotypes. Therefore, ILS does not produce significantly reduced Gmin estimates for 10-kb windows. The inferred introgressed region comprises 9 nearly contiguous 10-kb windows with Gmin P-values ranging from <0.0001 to a maximum of 0.0007 (Supplementary file 1). These are windows that show significantly low sequence distances between subsets of haplotypes in the two species that cannot be accommodated by an isolation model (and hence ILS). Put another way, the region has a dearth of derived, species-specific SNPs in these haplotypes. Finally, we note that a history of introgression is supported by an additional four-population test (Patterson’s D; see above).
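To make the window-based logic above concrete, here is a minimal sketch of the Gmin statistic for a single window, taking Gmin as the ratio of the minimum to the mean between-species pairwise sequence distance (the definition in Geneva et al., 2015). The function names and the toy haplotypes are hypothetical, and the sketch does not reproduce the POPBAM implementation or the simulation-based P-values used in the manuscript.

```python
from itertools import product


def hamming(x, y):
    """Number of differing sites between two aligned haplotypes."""
    return sum(a != b for a, b in zip(x, y))


def gmin(pop1, pop2):
    """Minimum between-population distance divided by the mean distance."""
    dists = [hamming(x, y) for x, y in product(pop1, pop2)]
    mean_d = sum(dists) / len(dists)
    # If all haplotypes are identical the ratio is undefined.
    return min(dists) / mean_d if mean_d > 0 else float("nan")


# Hypothetical 10-site window. The third haplotype in pop_b is close to
# pop_a, mimicking a recently introgressed haplotype.
pop_a = ["AAAAAAAAAA", "AAAAAAAAAT"]
pop_b = ["TTTTTAAAAA", "TTTTAAAAAA", "AAAAAAAATT"]

g_isolation = gmin(pop_a, pop_b[:2])  # without the introgressed haplotype
g_introgressed = gmin(pop_a, pop_b)   # with it
```

In this toy window, including the near-identical haplotype pulls the minimum distance far below the mean, so Gmin drops sharply. Scattered shared ancestral SNPs, by contrast, would leave every pairwise distance in the window similar and Gmin near one.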

An additional suggestion for the authors. There is serious amount of important work in this manuscript. I had to read the piece over half a dozen times and the connection between sterility, drive, and gene exchange never crystallized. A couple of statements summarizing the results and stating the connection between sections would solve this issue and make this piece much more enjoyable.

We appreciate this suggestion, and we have tried to highlight these connections in the revised manuscript (Abstract and Results section).

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Reviewer #1:

Thank you for resubmitting your work entitled "Gene flow mediates the role of sex chromosome meiotic drive during complex speciation" for further consideration at eLife. Your revised article has been favorably evaluated by Diethard Tautz (Senior Editor), a Reviewing Editor, and two of the original three reviewers.

Everyone appreciated your revisions, but the reviewers had a couple of additional suggestions that we would like you to address before publication, most of which involve slight revisions to the text or clarifications to be added. We leave it to your discretion if you want to take up suggestion #1 of reviewer 1, or further discuss the D statistic results and the concordance with other lines of evidence in the text.

We appreciate the reviewers' time and care in considering our revisions. In the interests of brevity and readability, we have added an appendix to our manuscript that describes additional simulations and calculations that support our inferences of introgression gleaned from Gmin and the D statistic. Below we address the reviewer comments, making reference to the Appendix where appropriate.

The authors made a number of the suggested changes from the first round of review and I believe that the paper is much improved. I had a couple of additional questions/concerns regarding the added analyses and details included in this version of the manuscript.

1) I appreciate the authors adding D-statistic analyses in addition to Gmin analyses. One question arising from these analyses was that the estimate of Patterson's D for the X chromosome was not lower than the rest of the genome, and this seems to conflict with other results. This made me concerned that the observations about fewer Gmin identified windows on the X were due to power differences or some other issue and not due to lower introgression. Since D is not a direct measure of mixture proportion, something like an F4-ratio could be used and I would be reassured if this resulted in lower estimates of introgression on the X. It would also be a helpful reality check if it could be shown that mixture proportions are estimated to be lower in regions with mapped hybrid sterility loci from this study (and possibly higher in regions estimated to be permeable to introgression based on mapping).

The value of Patterson's D statistic is influenced by effective population size, and thus, even in the absence of chromosomal differences in introgression, the X chromosome is expected to show greater values of D than the autosomes. We demonstrate this property in the Appendix and refer to previously published studies using human and Neanderthal genome sequences that also show a deficit of X-linked introgression by distance methods (analogous to Gmin) but higher values of D on the X chromosome than the autosomes.

2) Data quality questions. I was concerned about what seemed like ad hoc data quality evaluation in the fifth paragraph of the Materials and methods section. It seems like visual inspection could be replaced with a formal threshold (e.g. n kb with high posterior probability mauritiana ancestry, or a particular coverage threshold). How was evidence of contamination evaluated and how many lines did this impact? It would also be helpful to know how many genotypes were excluded due to these criteria.

The perception of an ad hoc approach was due to an overly simplified description of the actual data filtering process. We have revised the manuscript to include more specific details regarding this data filtering step. The revised Materials and methods section now reads:

"Genotype data and ancestry assignments were inspected for all recombinant 1P-YFP introgression genotypes. Genotypes were excluded if there was no segment on the X chromosome identified by the HMM that had either a posterior probability of D. mauritiana parentage > 0.95 or a posterior probability of D. simulans parentage < 0.05. Genotypes with segments that had either a posterior probability of D. mauritiana parentage >0.95 or a posterior probability of D. simulans parentage < 0.05 in a region that was not within the parental 2P region (i.e. came from a different 2P introgression) were inferred to have resulted either from mislabeling or contamination of DNA samples and were excluded from further analyses. 112 genotypes had insufficient sequence data to identify introgressions using the criteria above (or the introgression was too small to be identified). 16 genotypes showed evidence for D. mauritiana alleles that did not fall within the parental 2P interval."

We have included data files and figures that visualize ancestry assignments for all genotypes in our data submission to Dryad, including those that were excluded from further analyses.
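The exclusion criteria quoted above can be sketched as a small classification function. This is an illustrative sketch only: the `Segment` record, the field names, the category labels, and the coordinates in the example are hypothetical, and treating a segment as outside the parental 2P interval when any part of it extends beyond the interval is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: int    # segment start coordinate on the X chromosome
    end: int      # segment end coordinate
    p_mau: float  # HMM posterior probability of D. mauritiana parentage
    p_sim: float  # HMM posterior probability of D. simulans parentage


def classify_genotype(segments, parent_start, parent_end):
    """Apply the two exclusion rules from the revised Materials and methods."""
    # A segment is confidently D. mauritiana if either threshold is met.
    confident = [s for s in segments if s.p_mau > 0.95 or s.p_sim < 0.05]
    if not confident:
        # Insufficient data, or the introgression is too small to identify.
        return "exclude_no_introgression_identified"
    if any(s.start < parent_start or s.end > parent_end for s in confident):
        # D. mauritiana alleles outside the parental 2P interval suggest
        # mislabeling or contamination.
        return "exclude_outside_parental_interval"
    return "keep"


# Hypothetical parental 2P interval and genotypes.
interval = (1_000_000, 2_000_000)
ok = [Segment(1_200_000, 1_500_000, 0.99, 0.01)]
none_found = [Segment(1_200_000, 1_500_000, 0.60, 0.40)]
misplaced = [Segment(5_000_000, 5_400_000, 0.98, 0.02)]

results = [classify_genotype(g, *interval) for g in (ok, none_found, misplaced)]
```

Counting the genotypes in each category would yield tallies of the kind reported in the quoted text (112 and 16 genotypes in the two exclusion classes).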

I also became concerned about reference bias issues and signals of introgression in reading the added details of the POPBAM analyses. The minimum coverage threshold of 3 reads seemed much too low to me based on experience with this kind of data. Low coverage can exacerbate issues with reference bias and given that mapping was to the mauritiana reference rather than an outgroup, could potentially impact inferences of gene flow. I do not have an intuition about how this could impact Gmin analyses but have found it to have an impact on D-statistic type analyses.

We do not expect that this is a problem for our analyses. Reference bias should cause divergent D. simulans sequences that fail to map to the D. mauritiana genome assembly to be excluded from our analyses; missing these divergent alleles will decrease average DXY and thus increase Gmin and reduce our power. Thus, our choice of parameters should be conservative with respect to identifying introgressions.
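The direction of the effect argued above can be illustrated with a short worked calculation. The numbers are hypothetical, and the calculation assumes that the minimum between-species distance in the window is unchanged when divergent reads fail to map.

```latex
% Gmin is the minimum between-species distance over the mean distance.
G_{min} = \frac{\min(d_{XY})}{\bar{d}_{XY}}

% Hypothetical window: min(d_XY) = 4, mean d_XY = 5 with all alleles mapped.
G_{min} = \frac{4}{5} = 0.8

% If reference bias removes the most divergent alleles, the mean falls
% (say to 4.5) while the minimum stays at 4, so Gmin rises.
G_{min} = \frac{4}{4.5} \approx 0.89
```

Because introgression is inferred from unusually low Gmin, a bias that raises Gmin makes windows less likely to be called introgressed, which is the sense in which the parameter choice is conservative.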

Comment on response to reviewers: In response to my previous comments about Gmin the authors note "There is little evidence for recent parallel, hard selective sweeps from population genomic data for these species to date[…]" I am not sure that the population genetic observations relating to non-admixed models the authors detail here are directly relevant, as the dynamics after admixture can be quite different particularly with selection (both negative and positive). As before, I would prefer analyses of local ancestry that were sensitive to fixed regions as I think it would give more insight into the history of admixture, but do not think this impacts the main results of the paper which remain exciting.

We feel that these issues, while interesting, are beyond the scope of the present manuscript.

Reviewer #3:

The points related to the genetic mapping have been addressed.

I still have a quibble regarding the detection of introgression. The authors use Gmin to detect introgressions and admit the metric is best suited to detect introgression in instances of recent (I'd argue very recent) introgression. In this new version they add calculations of the D-statistic on genomic windows to obtain an independent confirmation of their results.

I have reservations about a few statements in the manuscript though.

Since Gmin is dependent on dmin, its power will depend on the amount of polymorphism in a window. Since the magnitude of variation is different between autosomes and X-chromosomes (subsection “Population genomics of speciation history”), I am not sure Gmin is a good metric to compare the magnitude of introgression between X chromosomes and autosomes (as stated in the sixth paragraph).

The newly added analysis, D calculated on 10-kb genomic windows, suffers from similar issues as Gmin. Simon et al. (2014) describe the statistical properties of the metric in detail.

I lean to think that differences in π or dmin can fully explain the autosome/X ratio in number of introgressed windows of 47:1 but I think it is worth including the caveat.

I am convinced the DOX alleles have crossed species boundaries, but I still think the language of the manuscript needs a little bit of clean up.

In the Appendix, we present calculations and simulations that indicate that the power of Gmin is not significantly different for the X chromosome and the autosomes, and that the lower polymorphism of the X chromosome is unlikely to explain the observed deficit of X-linked introgression. We concur that, unlike Gmin, the D statistic is inappropriate for comparisons between the X and autosomes, and we outline this reasoning in the Appendix. Nonetheless, we feel the D statistic has value for demonstrating a history of past introgression between these two species, and for confirming that the Dox/MDox region has a genealogical history that is distinct from the rest of the X chromosome.

Associated Data


    Data Citations

    1. Colin Meiklejohn, Daven Presgraves, David L Stern. 2018. Sequence data from Gene flow mediates the role of sex chromosome meiotic drive during complex speciation. NCBI Sequence Read Archive. SRR8247551
    2. Meiklejohn CD, Landeen EL, Presgraves DC. 2018. Gene flow mediates the role of sex chromosome meiotic drive during complex speciation. Dryad Digital Repository.
    3. Garrigan D, Kingan SB, Geneva AJ, Vedanayagam JP, Presgraves DC. 2014. Drosophila mauritiana genome sequencing. NCBI BioProject. PRJNA158675
    4. Rogers RL, Cridland JM, Shao L, Hu TT, Andolfatto P, Thornton KR. 2015. Tandem Duplications and the Limits of Natural Selection in Drosophila yakuba and Drosophila simulans. NCBI Sequence Read Archive. SRP040290
    5. Rogers RL, Cridland JM, Shao L, Hu TT, Andolfatto P, Thornton KR. 2015. Tandem Duplications and the Limits of Natural Selection in Drosophila yakuba and Drosophila simulans. NCBI Sequence Read Archive. SRP029453

    Supplementary Materials

    Figure 1—source data 1. Source data for Figure 1—figure supplement 1, Figure 4—figure supplement 1.
    DOI: 10.7554/eLife.35468.004
    Figure 1—source data 2. Source data for Figure 1—figure supplement 1, Figure 4—figure supplement 1.
    DOI: 10.7554/eLife.35468.005
    Figure 1—source data 3. Source data for Figure 1—figure supplement 1.
    DOI: 10.7554/eLife.35468.006
    Figure 2—source data 1. Source data for Figure 2, Figure 2—figure supplement 1, Figure 4.
    DOI: 10.7554/eLife.35468.011
    DOI: 10.7554/eLife.35468.015
    Figure 5—source data 1. Source data for Figure 5—figure supplements 1 and 2.
    DOI: 10.7554/eLife.35468.025
    Figure 5—source data 2. Source data for Figure 5.
    DOI: 10.7554/eLife.35468.026
    Figure 6—source data 1. Source data for Figure 6.
    DOI: 10.7554/eLife.35468.028
    Supplementary file 1. Gmin scan identifies forty-eight interspecific introgressions.
    elife-35468-supp1.xlsx (47.5KB, xlsx)
    DOI: 10.7554/eLife.35468.029
    Supplementary file 2. Genotype of samples at the Dox and MDox genes.
    elife-35468-supp2.xlsx (41.4KB, xlsx)
    DOI: 10.7554/eLife.35468.030
    Supplementary file 3. Primers used in RT-PCR to assay expression of MDox, Dox, and a control gene (RpS28b).
    elife-35468-supp3.xlsx (30.8KB, xlsx)
    DOI: 10.7554/eLife.35468.031
    Transparent reporting form
    DOI: 10.7554/eLife.35468.032

    Data Availability Statement

    Sequence data are available via the NCBI Sequence Read Archive (accession number: SRR8247551). Phenotype data have been submitted to Dryad (DOI: https://doi.org/10.5061/dryad.4qn4s47).

    The following datasets were generated:

    Colin Meiklejohn, Daven Presgraves, David L Stern. 2018. Sequence data from Gene flow mediates the role of sex chromosome meiotic drive during complex speciation. NCBI Sequence Read Archive. SRR8247551

    Meiklejohn CD, Landeen EL, Presgraves DC. 2018. Gene flow mediates the role of sex chromosome meiotic drive during complex speciation. Dryad Digital Repository.

    The following previously published datasets were used:

    Garrigan D, Kingan SB, Geneva AJ, Vedanayagam JP, Presgraves DC. 2014. Drosophila mauritiana genome sequencing. NCBI BioProject. PRJNA158675

    Rogers RL, Cridland JM, Shao L, Hu TT, Andolfatto P, Thornton KR. 2015. Tandem Duplications and the Limits of Natural Selection in Drosophila yakuba and Drosophila simulans. NCBI Sequence Read Archive. SRP040290

    Rogers RL, Cridland JM, Shao L, Hu TT, Andolfatto P, Thornton KR. 2015. Tandem Duplications and the Limits of Natural Selection in Drosophila yakuba and Drosophila simulans. NCBI Sequence Read Archive. SRP029453


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
