Author manuscript; available in PMC: 2014 Aug 1.
Published in final edited form as: Evolution. 2013 Apr 19;67(8):2376–2384. doi: 10.1111/evo.12118

More accurate phylogenies inferred from low-recombination regions in the presence of incomplete lineage sorting

James B Pease 1,3, Matthew W Hahn 1,2
PMCID: PMC3929462  NIHMSID: NIHMS551512  PMID: 23888858

Abstract

When speciation events occur in rapid succession, incomplete lineage sorting (ILS) can cause disagreement among individual gene trees. The probability that ILS affects a given locus is directly related to its effective population size (Ne), which in turn is proportional to the recombination rate if there is strong selection across the genome. Based on these expectations, we hypothesized that low-recombination regions of the genome, as well as sex chromosomes and non-recombining chromosomes, should exhibit lower levels of ILS. We tested this hypothesis in phylogenomic datasets from primates, the Drosophila melanogaster clade, and the D. simulans clade. In all three cases, regions of the genome with low or no recombination showed significantly stronger support for the putative species tree, although results from the X chromosome differed among clades. Our results suggest that recurrent selection is acting in these low-recombination regions, such that current levels of diversity also reflect past decreases in the effective population size at these same loci. The results also demonstrate how considering the genomic context of a gene tree can assist in more accurate determination of the true species phylogeny, especially in cases where a whole-genome phylogeny appears to be an unresolvable polytomy.

Keywords: phylogenomics, mitochondria, sex chromosomes, primates, Drosophila

Introduction

In molecular phylogenetics, the ultimate goal is the reconstruction of the evolutionary history of a group of species from molecular sequence data (Felsenstein 2004; Edwards 2009). Such species trees are often inferred from one or more gene trees, each of which describes the evolutionary relationships among homologous loci in the sampled species. However, individual gene trees may differ from the true species tree for a number of reasons, including long-branch attraction, hybridization, horizontal gene transfer, or duplication and loss of undetected paralogs (reviewed in Maddison 1997; Degnan and Rosenberg 2009). In addition, when three or more species have a relatively short time between speciation events, incomplete lineage sorting (ILS) is a common cause of discordant gene trees (Degnan and Salter 2005; Degnan and Rosenberg 2009; Hobolth et al. 2011).

In the most general sense, ILS is the failure of two lineages to coalesce within a population, instead having their most recent common ancestor (MRCA) in an ancestral population (Hudson 1983b; Tajima 1983). In order for ILS to affect gene trees among three or more taxa, multiple lineages must be maintained between speciation events without coalescing, instead coalescing in a pre-speciation ancestral population (Figure 1A, dashed line). Since multiple lineages are maintained in the internode population, the possibility exists that lineages will coalesce first with an outgroup lineage instead of with a lineage leading to its sister taxon. Because ILS depends on the maintenance of polymorphisms between speciation events, it is inversely proportional to τ/2Ne: the relative time between speciation events (τ; i.e., the length of the internode branch) and the effective population size (Ne) of the internode population. This implies that larger population sizes or shorter relative times between speciation events both increase the probability of having ILS at a locus (Pamilo and Nei 1988). Since τ is a function of the species phylogeny, it is constant across all regions of the genome. The probability of ILS occurring in a given genomic region, however, is directly proportional to Ne in that region, as Ne determines the time to the MRCA for lineages in a population (Figure 1A vs. 1B).
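
The dependence of ILS on τ/2Ne can be illustrated with a small simulation. The sketch below is not part of the original analysis: it assumes a three-taxon species tree, a constant-size diploid internode population, and time measured in generations. The function names and parameters are illustrative only.

```python
import math
import random


def simulate_concordance(tau, ne, n_loci=100_000, seed=1):
    """Fraction of simulated loci whose gene tree matches the species tree.

    tau: internode length in generations (assumed unit).
    ne:  diploid effective population size of the internode population.
    """
    rng = random.Random(seed)
    concordant = 0
    for _ in range(n_loci):
        # Waiting time for the two sister lineages to coalesce in the
        # internode population: exponential with mean 2 * Ne generations.
        t = rng.expovariate(1.0 / (2.0 * ne))
        if t < tau:
            concordant += 1  # coalesced before reaching the deeper ancestor
        elif rng.random() < 1.0 / 3.0:
            # ILS: three lineages enter the ancestral population and one of
            # the three possible pairs coalesces first, chosen at random.
            concordant += 1
    return concordant / n_loci


def expected_concordance(tau, ne):
    """Analytical expectation: 1 - (2/3) * exp(-tau / (2 * Ne))."""
    return 1.0 - (2.0 / 3.0) * math.exp(-tau / (2.0 * ne))
```

Holding `tau` fixed and lowering `ne` raises the concordant fraction, which is the qualitative contrast drawn between panels A and B of Figure 1.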

Figure 1.

Species phylogeny (solid outline) in relation to gene trees (thin internal lines). Concordant gene trees have the same topology as the species tree (solid black line). When the effective population size (Ne) is large (A) and the internode time (shaded area) is short, lineages may not coalesce in the internode ancestral population and could lead to a discordant gene tree (dotted grey line). When Ne is reduced (B) lower levels of polymorphism are maintained in the internode population, and lineages are more likely to go extinct (x), decreasing the likelihood of sampling incongruent gene trees.

There are multiple reasons why Ne may vary from locus to locus. Under neutrality, the time to the MRCA varies stochastically among loci, depending solely on the coalescent process (Hudson 1990). In the presence of strong selection — either positive selection fixing new mutations or negative selection removing new mutations — Ne is consistently lowered at linked sites (Maynard Smith and Haigh 1974; Charlesworth et al. 1993). While strong natural selection can occur at any position in the genome, the effects of this selection are magnified in regions with low recombination (i.e., low rates of crossing-over), as a much larger region is affected by any selected variant (Kaplan et al. 1989; Charlesworth 2012). Therefore, among a set of randomly chosen loci, the action of strong selection should cause Ne to be negatively correlated with the recombination rate. Indeed, studies in multiple organisms, including Drosophila melanogaster (Begun and Aquadro 1992; Haddrill et al. 2007; Campos et al. 2012; Langley et al. 2012) and Homo sapiens (Nachman et al. 1998; Stajich and Hahn 2005; Cai et al. 2009; McVicker et al. 2009) have shown that low-recombination regions (LRRs) experience a reduction in levels of nucleotide diversity relative to divergence, an indication that Ne is reduced due to linked selection. In the absence of strong selection, Ne is not expected to vary with recombination in a consistent manner (Hudson 1983a).

Taken together, the expected relationships between Ne and both ILS and recombination suggest that regions of lower recombination will show lower levels of ILS (expressed as a lower proportion of incorrect gene trees). This hypothesis is supported by previous results comparing the degree of ILS within regions expected to experience strong selection (coding sequences) to regions not experiencing strong selection (non-coding sequences; Hobolth et al. 2011; Scally et al. 2012), and among regions differing in the rate of recombination (Hobolth et al. 2011; Prufer et al. 2012). In addition to LRRs, we also expect to sample less ILS (i.e., fewer gene trees supporting incorrect species relationships) on sex chromosomes due to lower Ne. This implies that regions on sex chromosomes should have a lower level of ILS than autosomes, and furthermore that LRR regions on the X chromosome—as well as the entire Y chromosome—should have the lowest ILS among all nuclear loci. Finally, the mitochondrial genome has been shown to be generally non-recombining and to have relatively low Ne in both humans and Drosophila (Meiklejohn et al. 2007; Galtier et al. 2009; Piganeau and Eyre-Walker 2009; Rand 2011). Even though the entire mitochondrial genome represents only a single locus, and therefore a single realization of the coalescent process, it should also be enriched for support of the true species phylogeny. However, mitochondrial introgression is common in many lineages and may affect its agreement with the species phylogeny.

Based on the above considerations, we hypothesized that LRRs should on average exhibit lower levels of ILS compared to other loci. We tested this hypothesis in three genome-wide phylogenetic datasets with varying degrees of ILS. First, we examined the human-chimpanzee-gorilla clade (HCG), which shows a strong majority (70%) of nucleotides in protein-coding regions supporting humans and chimpanzees as sister taxa with gorilla as the outgroup—denoted (HC)G*—and equal support for the two alternative trees (Scally et al. 2012). (Note: ‘*’ denotes our “putative species tree” for each clade.) Second, we considered the subgroup comprised of Drosophila melanogaster, D. erecta and D. yakuba (the “Dmel clade”), which shows a plurality—but not a majority—of protein-coding nucleotides (44.7%) supporting D. erecta and D. yakuba as sister taxa, denoted (EY)M* (Pollard et al. 2006). Again, an approximately equal number of sites support the two alternative topologies (Pollard et al. 2006). Finally, we considered the subgroup composed of D. simulans, D. sechellia, and D. mauritiana (the “Dsim clade”), which is characterized by frequent hybridization events and an extremely short interval between the speciation events (Nunes et al. 2010; Legrand et al. 2011). D. sechellia and D. mauritiana are island species in the Seychelles and Mauritius archipelagos, respectively, and are inferred to have split from an ancestral population on Madagascar, where D. simulans is still prevalent. A recent analysis determined the species relationship as (simulans, sechellia)mauritiana [(SC)M*] from an alignment of all autosomes (Garrigan et al. 2012). However, after accounting for recurrent hybridizations among the species, this paper proposed that the Dsim clade represented a true (or “hard”) polytomy, with each possible topology represented approximately equally.

For each of the three clades, our analysis found increased support for a single phylogeny with decreasing recombination rate. We also found support for reduced ILS on the X chromosome in HCG, but in the Dmel and Dsim clades, the X chromosome appears to have a more complex history, possibly due to its role in species isolation. The mitochondrial genomes in all three datasets also supported the nuclear LRR-predicted species phylogenies. Therefore, in clades where ILS-discordant gene trees are a possibility, an analysis that considers the recombination environment at each locus can provide added context for gene tree discordance that may indicate the true species phylogeny.

Materials and Methods

HCG gene trees and recombination rates

For the HCG clade, we used a dataset reporting the proportion of sites that support alternative topologies derived from a CoalHMM analysis of n=1,998 regions of a five-way genome alignment, including orangutan and macaque as outgroups (Scally et al. 2012). Any regions overlapping another region (these all occurred on chromosome 1) were excluded, leaving n=1,961 regions. Each region contained 10⁶ sites, but these sites could be spread over highly variable lengths of the genome. Therefore, only the n=1,885 regions where 1) the coordinates of the sampled genomic section were <2 Mb apart, and 2) the region did not span the centromere, were retained for the final dataset. The overall recombination rate for each region was calculated as the average of all point estimates of recombination within the region (International HapMap Consortium et al. 2007). Each region contained between 598 and 3,482 (mean = 1,622) point estimates of recombination.
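
The two retention criteria and the per-region averaging can be sketched as follows. This is a minimal illustration rather than the scripts used in the study: the function names, the coordinate convention, and the reading of "span the centromere" as any overlap with the centromere interval are assumptions.

```python
from statistics import mean

MAX_SPAN_BP = 2_000_000  # regions spanning 2 Mb or more are dropped


def keep_region(start, end, centromere_start, centromere_end):
    """True if a region spans < 2 Mb and does not overlap the centromere."""
    short_enough = (end - start) < MAX_SPAN_BP
    # Assumption: "spanning" is interpreted as any overlap with the centromere.
    spans_centromere = start < centromere_end and end > centromere_start
    return short_enough and not spans_centromere


def region_recombination_rate(point_estimates):
    """Average of all recombination point estimates (cM/Mb) in one region."""
    return mean(point_estimates)
```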

Dmel gene trees and recombination rates

For the Dmel clade, we used a dataset comprised of rooted ML gene trees from alignments of n=8,838 protein-coding regions in D. melanogaster, D. yakuba, D. erecta, and outgroup D. ananassae (Pollard et al. 2006). We used recombination rates from genome-wide estimates for each protein-coding gene in D. melanogaster (Langley et al. 2012). Only genes with available recombination rates (n=6,811) were retained in the dataset. We further restricted our analysis to n=5,949 genes that appear on the same chromosome in both D. melanogaster and D. yakuba (McQuilton et al. 2012). We also required that no gene overlap another gene by more than 10% of its total length, leaving n=5,532 genes. For chromosome 4 (the non-recombining “dot” chromosome), data from the 37 ML gene trees in the original dataset was used. All 37 loci were consistently on chromosome 4 in both melanogaster and yakuba.
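
The overlap criterion can be expressed as a simple filter. The sketch below is one possible reading and is not taken from the original scripts: it assumes half-open coordinates and drops any gene that another gene on the same chromosome covers by more than 10% of the gene's own length. The data layout and function names are hypothetical.

```python
def overlap_fraction(gene, other):
    """Fraction of `gene` (start, end) that is covered by `other` (start, end)."""
    overlap = max(0, min(gene[1], other[1]) - max(gene[0], other[0]))
    return overlap / (gene[1] - gene[0])


def drop_overlapping_genes(genes, max_fraction=0.10):
    """genes: dict mapping name -> (chrom, start, end).

    Returns the names of genes that no other gene on the same chromosome
    overlaps by more than `max_fraction` of the gene's own length.
    """
    kept = []
    for name, (chrom, start, end) in genes.items():
        too_much = any(
            other_chrom == chrom
            and overlap_fraction((start, end), (o_start, o_end)) > max_fraction
            for other, (other_chrom, o_start, o_end) in genes.items()
            if other != name
        )
        if not too_much:
            kept.append(name)
    return kept
```

The pairwise comparison is quadratic in the number of genes; for a genome-scale gene set, sorting by coordinate and comparing only neighbors would be the more practical design.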

Dsim gene trees and recombination rates

For the Dsim clade, we used a dataset comprised of polarized SNP counts for n=23,773 non-overlapping 5 kb aligned genomic regions of D. simulans, D. sechellia, D. mauritiana, and outgroup D. melanogaster (Garrigan et al. 2012). To avoid errors resulting from small numbers of SNPs, only regions with ≥ 20 SNPs (n=21,636) were included, following Garrigan et al. (2012). Phylogenies for these 5 kb regions were determined using the 4sp software package (http://kimura.biology.rochester.edu/software/4sp/4sp.tar.gz, Garrigan et al. 2012).

Recombination rates were estimated from the D. melanogaster recombination rates for homologous genes (Langley et al. 2012). We calculated the recombination rate for each 5 kb region as $r = \sum_i p_i r_i / \sum_i p_i$. In this equation, $p_i$ is the proportion of sites in a given 5 kb region overlapped by gene $i$, and $r_i$ is the recombination rate for that gene; the product of these is summed over all genes overlapping the given region and then divided by the total proportion of genic sites in the region. For 5 kb regions without any genes in them, if there was a single gene located within 1 kb we estimated the recombination rate as equal to that of the gene. Recombination rates for n=9,848 windows were determined, and all other regions were excluded.
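
The weighted average described above is straightforward to compute. The following sketch assumes each window comes with a list of (proportion overlapped, gene recombination rate) pairs; the function name and input layout are illustrative, and the fallback for gene-free windows with a single gene within 1 kb is not shown.

```python
def window_recombination_rate(overlaps):
    """Overlap-weighted recombination rate for one 5 kb window.

    overlaps: list of (p_i, r_i) pairs, where p_i is the proportion of the
    window overlapped by gene i and r_i is that gene's recombination rate.
    Returns None when the window contains no genic sites.
    """
    total_p = sum(p for p, _ in overlaps)
    if total_p == 0:
        return None
    return sum(p * r for p, r in overlaps) / total_p
```

For example, a window overlapped 20% by a gene with rate 1.0 and 60% by a gene with rate 3.0 receives (0.2 × 1.0 + 0.6 × 3.0) / 0.8 = 2.5.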

Using the methods outlined in Garrigan et al. (2012), we calculated the global and local log-likelihood of introgression between species using 4sp. A standard likelihood ratio test was applied to each 5 kb window and any window with likelihood-ratio test P-value <0.029 was considered putatively introgressed. After exclusion of introgressed regions, n=9,130 regions remained in the final dataset.

Gene density

Since genic regions have reduced ILS, we need to control for the gene density of each region when considering the effect of recombination rate. For the HCG clade, we determined the number of genic positions in each genomic region by counting the number of nucleotides in each 1 Mb aligned region that fell within a coding region in the human hg19 genome assembly (Fujita et al. 2011). The proportion of genic sites in each region ranged from 0% (n=47) to 100% (n = 1), with a mean of 47%. In the Dmel clade, only coding regions were used, so no correction was needed. For the Dsim clade, we determined the number of genic positions in each genomic region by comparison with all protein-coding gene coordinates in the D. melanogaster genome (v5.31) via FlyBase (McQuilton et al. 2012).
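
Counting genic positions amounts to measuring how much of a window is covered by the union of gene intervals. The sketch below is an assumption-laden illustration: it treats a window as one contiguous, half-open coordinate range and merges overlapping gene intervals so that no position is counted twice. Names and data layout are hypothetical.

```python
def genic_proportion(window_start, window_end, gene_intervals):
    """Proportion of [window_start, window_end) covered by at least one gene.

    gene_intervals: iterable of (start, end) pairs, half-open coordinates.
    """
    # Clip each gene to the window and discard genes that do not touch it.
    clipped = sorted(
        (max(s, window_start), min(e, window_end))
        for s, e in gene_intervals
        if s < window_end and e > window_start
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Start a new merged block, closing the previous one if any.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)  # extend the current merged block
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / (window_end - window_start)
```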

Mitochondrial gene trees

Mitochondrial genomes for human (NC_012920.1), chimpanzee (NC_001643.1), gorilla (NC_001645.1), and outgroup orangutan (NC_001646.1) were aligned using CLUSTALW 2.1 (Larkin et al. 2007). Mitochondrial genomes of D. melanogaster (NC_001709.1), D. erecta (BK006335.1), D. yakuba (NC_001322.1), and outgroup D. ananassae (BK006336.1) were aligned by the same method. Mitochondrial genomes of D. simulans (NC_005781.1), D. mauritiana (NC_005779.1), D. sechellia (NC_005780.1), and outgroup D. melanogaster were separately aligned by the same method. We inferred an ML whole-mitochondria phylogeny from these nucleotide alignments with RAxML using the GTRGAMMA substitution model (Rokas 2011).

Statistical analyses

Autocorrelation, linear regression, logistic regression, Spearman's rank correlation, and Student's t-test were performed using the acf, lm, glm, cor, and t.test functions of R (http://www.R-project.org). All other analyses were performed with custom Python scripts querying MySQL databases.

Results

Low-Recombination Regions

HCG autosomal aligned regions (n=1,838) showed an average of 69.5% of sites with support for the (HC)G* phylogeny, similar to the proportion of sites reported for protein-coding regions (Figure 1C, Scally et al. 2012). However, those autosomal regions in the lowest quintile of recombination rates (LRR20) showed significantly increased support (72.8%) for (HC)G* (Figure 2A, P = 4.9×10⁻¹⁴, t-test). Mean support for (HC)G* in the LRR20 regions is also significantly higher than in the next quintile (LRR40, P = 0.016, t-test). The mean recombination rate in LRR20 regions is 0.62 cM/Mb (±0.009 SE), while the mean recombination rate in LRR40 is 1.06 cM/Mb (±0.005 SE), reflecting a real difference in recombination rates for LRR20. As can also be seen in Figure 2A, there is an almost linear decrease in support for the (HC)G* topology with increasing recombination rate (ρ = −0.24, P < 2.2×10⁻¹⁶, Spearman's rank correlation), with each of the two alternative trees showing equal increases in frequency (data not shown). We also found strong first-order autocorrelation coefficients for each chromosome: in effect, neighboring windows showed highly similar results. A conservative reanalysis of every fourth window still showed both a significant relationship between recombination rate and ILS (ρ = −0.23, P = 2.3×10⁻⁷) and a significant difference in support for LRR20 regions (P = 2.2×10⁻⁴, t-test).
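
The quintile comparison and rank correlation reported here were run in R; the sketch below shows the same two operations in plain Python for illustration only. The function names are hypothetical, and the quintile boundary is taken as the lowest fifth of regions after sorting by recombination rate.

```python
def rank(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def lowest_quintile(regions):
    """regions: list of (recombination_rate, support). Returns the LRR20 subset."""
    ordered = sorted(regions, key=lambda r: r[0])
    return ordered[: len(ordered) // 5]
```

Subsampling every fourth window, as in the conservative reanalysis, corresponds to slicing a position-ordered list with `regions[::4]` before calling these functions.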

Figure 2.

Regions with low-recombination rates show increased support for the putative species tree in each clade compared to the mean support (dashed line). As the recombination rate increases (dotted line), the support declines and ILS increases. This pattern is consistent in the HCG clade (A), Dmel clade (B), and Dsim clade (C).

Similarly, in the Dmel clade, the (EY)M* topology is supported by 58.3% of all autosomal genes (n=4,685), increasing to 60.9% of autosomal genes in LRR20 supporting this tree (Figure 2B; χ²=3.19, d.f.=1, P = 0.07). Lower recombination rates generally correlate with increased support for (EY)M*, though non-significantly (β = −0.016, P = 0.29, logistic regression). Neither of the alternative trees shows this decrease in support with increasing recombination (β = 0.017 and 0.005, respectively). Of autosomal genes with an estimated recombination rate of zero, 64.7% (101/156) support (EY)M*. Both the HCG and Dmel datasets match our prediction that LRRs will exhibit lower levels of ILS and therefore more strongly support the putative species tree.
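
A χ² test with one degree of freedom on gene-tree counts can be sketched as below. The text does not state how the contingency table was constructed, so this example assumes a 2×2 table (LRR20 genes versus all remaining genes, supporting versus not supporting the putative species tree) and a Pearson statistic without continuity correction. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 d.f., no continuity correction) for [[a, b], [c, d]].

    Assumed layout: rows are gene classes (e.g. LRR20 vs. other), columns are
    counts of trees supporting vs. not supporting the putative species tree.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 d.f. the upper-tail probability equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```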

Previous analysis in the Dsim clade was unable to identify a bifurcating species tree, with all three possible trees having essentially equal support after controlling for introgression (Garrigan et al. 2012). Among non-introgressed, autosomal 5 kb windows sampled (n=7,725), the support for the topology (SC)M* is 35.1%. However, among LRR20 regions the support increases significantly to 37.2% (Figure 2C; χ²=4.27, d.f.=1, P = 0.039), and in LRR10 to 40.0% (χ²=4.38, d.f.=1, P = 0.036). Both alternative topologies had decreased support in LRR20 regions. More generally, support for (SC)M* inversely correlates with recombination (β = −0.023, P = 0.07, logistic regression). As in Dmel, support for neither alternative species tree shows a negative relationship with recombination (β = 0.016 for S(CM) and β = 0.007 for (SM)C). We found no significant autocorrelation among windows in the Dsim dataset, and therefore did not correct for any non-independence among windows. Assuming that much of the phylogenetic incongruence in the Dsim clade is due to ILS, the regions of lowest recombination indicate that (SC)M* is the true species relationship, where D. mauritiana diverged first from a D. simulans-sechellia ancestral population. This result also agrees with the initial autosomal ML tree in Garrigan et al. (2012).

Because ILS is lower in genic regions (Hobolth et al. 2007; Prufer et al. 2012; Scally et al. 2012), a confounding variable may be the genic content of each genomic region in the HCG and Dsim clades because the data were calculated in equal-length windows including both coding and non-coding sequences. However, the proportion of genic sites in a genomic region (p) does not correlate significantly with the recombination rate in either the HCG (R²=0.054, linear regression) or Dsim clades (R²=0.0014). Even when only values of p between 0.1 and 0.9 are considered, we find no significant correlation (HCG: R²=0.084; Dsim: R²=0.0010). Therefore, we find that the proportion of genic positions in each window (i.e., gene density) does not appear to be a confounding variable in our results concerning the relationship between recombination rate and ILS.

Choosing the size of genomic regions used to study ILS is a balance between having sufficient sequence data to produce confident gene trees, and making the regions small enough to capture heterogeneity within the genome. We considered the possibility that the 1 Mb regions in the HCG dataset are too large to capture the finer-scale dynamics of recombination and ILS. To test the potential effects of estimating ILS in larger aligned regions, we used the recombination and phylogenetic data from the Dsim 5 kb regions to calculate average recombination and average support for (SC)M* in 1 Mb regions across the complete genome. Among autosomal 1 Mb regions (n = 83), those regions with lowest recombination tended to show stronger support for (SC)M* (ρ = −0.244, P = 0.029, Spearman's rank correlation). This suggests that the relationship between recombination and ILS appears to be observable and consistent even when relatively large genomic regions are sampled. Although supportive of results using 1 Mb windows, we should also note that the HCG dataset measures the proportion of sites that support the putative species tree within each window, rather than estimating a single tree. Although using the proportion of sites within a window does not adequately control for associations among sites, it may be preferred to simply assigning each window to a single topology. Indeed, because all windows show a majority of sites supporting (HC)G*, the proportion of sites more informatively describes the uncertainty present in the dataset.
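
Pooling the 5 kb windows into 1 Mb regions is a simple binning step. The sketch below is illustrative and makes several assumptions: each window is represented by its chromosome, start coordinate, recombination rate, and a flag for whether it supports (SC)M*; windows are assigned to a bin by integer division of the start coordinate; and averages are unweighted.

```python
from collections import defaultdict

BIN_SIZE = 1_000_000  # 1 Mb regions


def aggregate_windows(windows):
    """Pool 5 kb windows into 1 Mb bins.

    windows: iterable of (chrom, start, recombination_rate, supports_tree).
    Returns {(chrom, bin_index): (mean_rate, fraction_supporting)}.
    """
    bins = defaultdict(list)
    for chrom, start, rate, supports in windows:
        bins[(chrom, start // BIN_SIZE)].append((rate, supports))
    summary = {}
    for key, items in bins.items():
        mean_rate = sum(r for r, _ in items) / len(items)
        fraction = sum(1 for _, s in items if s) / len(items)
        summary[key] = (mean_rate, fraction)
    return summary
```

The per-bin pairs returned here are the kind of values to which a rank correlation between average recombination and average support would then be applied.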

X-Chromosomes

In the HCG clade, 84.3% of the sites on the X chromosome predict the (HC)G* tree, compared to 69.6% of autosomal regions (Figure 3A; P = 1.2×10⁻²⁶, t-test; see also Scally et al. 2012). Furthermore, 86.2% of LRR20 regions on the X chromosome provide support for (HC)G* (P = 0.022, t-test). Support is therefore maximized in this clade when using LRRs on the X chromosome, which should (all other things being equal) have the lowest Ne in the nuclear genome, aside from the Y-chromosome (Y-linked sequences are not available for many of the species considered here).

Figure 3.

X-linked regions in the HCG alignment (A) show markedly increased support for the putative species tree over the autosomal average (dashed line). However, X-linked genes in the Dmel clade (B) and X-linked regions in the Dsim clade (C) show decreased support relative to autosomes.

In the Dmel clade, only 49.0% of genes on the X chromosome (n=846) predict (EY)M*, while this tree is supported by 58.3% of autosomal genes (n=4,686, Figure 3B; χ²=124.8, d.f.=1, P < 2.2×10⁻¹⁶). Among LRR20 genes on the X chromosome, 50.8% support (EY)M* (χ²=0.24, d.f.=1, P = 0.6). Similarly, in the Dsim clade, only 31.3% of X-linked regions (n=1,402) provide support for (SC)M* versus 42.4% for (SM)C (Figure 3C). Autosomal regions (n=7,728) support (SC)M* at 35.1% and (SM)C at 36.2%. However, among LRR10 regions on the X chromosome, support increases to 44.2% for (SC)M* and drops to 29.2% for (SM)C. While the X chromosome in Dsim overall indicates (SM)C, LRRs among X-linked loci shift toward the autosomal LRR prediction of (SC)M* (χ²=11.7, d.f.=1, P = 6.2×10⁻⁴).

Mitochondria and the Dot Chromosome

Maximum-likelihood trees for the HCG and Dmel mitochondrial alignments both returned the putative species tree. Drosophila chromosome 4 (the “dot” chromosome) does not show any evidence of crossing-over but does have gene conversion (Comeron et al. 2012). Of the 37 trees constructed for genes on chromosome 4, 25 support (EY)M* (67%), 7 support (EM)Y (19%), and 5 support (MY)E (14%). Thus, for the HCG and Dmel mitochondria and the Drosophila dot chromosome—as in the recombining portion of the nuclear genome examined here—there is strong support for a single tree [(HC)G* and (EY)M*, respectively].

For the Dsim clade, the mitochondrial genome strongly supports the (SC)M* tree. Therefore, even in the clade with the most ambiguous results from the nuclear genome, the mitochondrial genome shows clear support for a single species tree. Our results here differ from the mitochondrial analysis in Garrigan et al. (2012) because the D. mauritiana mitochondrial genome used in that study was a haplotype introgressed recently from D. simulans. Our trees are derived from the reference genome.

Discussion

One of the most studied sets of species relationships is the one between humans, chimpanzees, and gorillas (HCG). The first sequence-based HCG species trees were inferred from single genes (e.g., Ferris et al. 1981; Brown et al. 1982; Hixson and Brown 1986; Koop et al. 1986; Saitou and Nei 1986). With single-gene species tree inference, the assumption is that the sampled gene's evolutionary history matches speciation events, yielding a gene tree that is concordant (topologically identical) with the species tree. These early studies were often equivocal about the relationships within the HCG clade, but all agreed that this uncertainty was likely due to a relatively short interval between speciation events. With the exponential rise in sequencing, HCG phylogenetic datasets expanded to 5 genes (Koop et al. 1989), 25 genes (Takahata et al. 1995), 45 genes (Satta et al. 2000), 53 homologous regions of ~25 kb each (Chen and Li 2001; Chen et al. 2001), 129 homologous regions (Osada and Wu 2005), and an 18.3 Mb concatenated alignment (Patterson et al. 2006). These datasets continued to confirm that the interval between speciation events was relatively short in the HCG clade, but generally favored the divergence of gorilla from a human-chimpanzee ancestral population, a tree denoted (HC)G. Most recently, with the completion of the gorilla genome, 70.1% of nucleotides in protein-coding regions were found to support (HC)G, with 15% support each for (HG)C and (CG)H (Scally et al. 2012).

As with the HCG clade, many phylogenetic datasets have grown in size over the past decade, often leading to a concomitant rise in the evidence for discordance among individual gene trees. While much of this discordance can be due to artifacts arising from tree inference methods or simply to a lack of resolution, incomplete lineage sorting in internode populations can lead to gene trees that contradict the true species tree for biological reasons. Therefore, even in cases where whole-genome data are obtained, multiple topologies may be supported by hundreds of different loci each (e.g., White et al. 2009). In these cases, no further gene trees can be collected, and we must therefore turn to other pieces of evidence to help in inferring the species tree from the individual gene trees.

In this study, we tested the hypothesis that specific genomic regions have reduced ILS, and as a result are more likely to give a topology matching the species tree. Specifically, we asked whether low-recombination regions, the X chromosome, and/or the mitochondrial genome showed evidence for reduced ILS due to reductions in Ne. To address these questions we used three clades: one with a strong consensus species tree (HCG), one with a moderate consensus tree (Dmel), and one previously declared a hard polytomy (Dsim). In the HCG and Dmel datasets, the gene trees inferred from LRRs showed increased support for the putative consensus tree. In the Dsim clade, the LRRs showed relatively stronger support for a single species tree, one that the mitochondrial genome also supports.

The use of LRRs to predict the species phylogeny in part assumes that recombination rates are conserved across all species in the clade. Without some conservation of recombination rates across the clade, genes in the ancestral populations experiencing ILS would experience recombination rates that may not be predicted by current recombination rates; we would therefore expect no association between recombination measured in extant taxa and reduced Ne in ancestral populations. While still a relatively open question, it appears that recombination rates among closely related species are generally conserved across large genomic loci (Dumont and Payseur 2008, 2011; Smukowski and Noor 2011). Between humans and chimpanzees, a very high correlation in inferred recombination rates is seen between orthologous 1 Mb intervals, even though finer-scale variation in the recombination rate exists within these intervals (Auton et al. 2012). Because there is undoubtedly some variation in recombination rates among lineages, especially in the Dsim clade (Cattani and Presgraves 2012), the relationships we see between recombination and reduced ILS may reflect decreased power due to noise in this predictor variable. It is also highly likely that the method used here would not work for ILS that has occurred much further back in time, such as at the base of the metazoans (Rokas et al. 2005), as we cannot accurately estimate the recombination rates in these ancient genomes.

We also expected that the X chromosome would have reduced ILS, since it has a lower Ne compared to autosomes. While data from the primate X chromosome supports this hypothesis, the Drosophila X appears to have a more complex history. One factor may be the outsized role of the X chromosome in reproductive isolation—the so-called large X-effect (Coyne and Orr 2004). It has been demonstrated that hybrid incompatibilities accumulate on the X chromosome faster than on the autosomes in comparisons between D. mauritiana and D. sechellia (Masly and Presgraves 2007; McNabney 2012), between D. mauritiana and D. simulans (Davis and Wu 1996; Nunes et al. 2010), and between D. melanogaster and D. simulans (Presgraves 2003; Cattani and Presgraves 2012). In addition, fewer loci on the X chromosome show evidence for introgression among the species in the Dsim clade (Garrigan et al. 2012), further supporting its role in reproductive isolation. However, why this involvement in hybrid incompatibility manifests itself as a lower proportion of trees reflecting the species phylogeny is unclear. In fact, it has previously been proposed that loci involved in incompatibility should more faithfully reflect the species tree (Ting et al. 2000), although such genes would have to be involved only in the more recent speciation event in order to do so. One possibility is that the most common gene tree in both Drosophila clades is the outcome of massive hybridization after the initial speciation events, with only the X chromosome retaining the original, bifurcating species tree. Unfortunately, we do not currently have the ability to test this hypothesis.

Our results are consistent with the hypothesis that the levels of reduced heterozygosity seen in regions of low recombination really do indicate lower Ne, and not just decreased mutation rates (e.g., Hellmann et al. 2003). Many different pieces of evidence support the role of strong selection in reducing Ne, especially in LRRs, including a decreased efficacy of natural selection on weakly selected variants (Betancourt et al. 2002; Hey and Kliman 2002; Betancourt et al. 2009; Gossmann et al. 2011). To our knowledge, however, this is some of the first evidence from topological concordance that is consistent with reduced Ne in LRRs. Our results imply not only that strong selection is reducing Ne in these regions currently, but also that recurrent selection has been reducing Ne since at least the time when the internode populations existed. In fact, our expectation about the relationship between Ne and ILS was predicated on the assumption that Ne was also reduced in LRRs in the ancestral internode populations. This recurrent selection may be either negative selection against deleterious mutation or positive selection on advantageous mutations (or a mixture of the two), and may differ qualitatively between the clades considered here (cf. Hahn 2008). Previous studies examining the Dmel and HCG datasets found mixed effects of measures of selection on protein-coding genes in predicting ILS. Pollard et al. (2006) found no effect of dN/dS on the probability of discordance, while Scally et al. (2012) found that genes with lower dN/dS showed less discordance. This latter relationship suggests that variation in dN/dS when it is less than 1 is more indicative of the strength of negative selection than the incidence of positive selection.

We can also use these data to make rough calculations about the variation in Ne among genomic regions: if the probability of sampling a gene tree that matches the species tree is $1 - \tfrac{2}{3}e^{-\tau/(2N_e)}$ (Hudson 1983b), then variation in the observed proportion of concordant gene trees among regions with different recombination rates should be due in part to variation in 2Ne. For the HCG dataset, setting τ=1 for simplicity, our results imply an approximately 20% reduction in Ne going from LRR100 to LRR20. This value is strikingly similar to the estimated average reduction in nucleotide diversity caused by linked selection in the human-chimpanzee ancestral population (19–26% on the autosomes; McVicker et al. 2009) and may therefore accurately reflect variation in Ne across the genome.
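The rough calculation in the preceding paragraph can be illustrated with a minimal Python sketch. It inverts the concordance probability (one minus two-thirds times exp(-τ/2Ne); Hudson 1983b) to obtain the Ne implied by an observed proportion of concordant gene trees, and then compares two regions. The function names are hypothetical, and the two concordance proportions in the usage lines are illustrative placeholders, not values from this study.

```python
import math


def concordance_probability(tau, ne):
    """P(gene tree matches species tree) = 1 - (2/3) * exp(-tau / (2 * Ne))."""
    return 1.0 - (2.0 / 3.0) * math.exp(-tau / (2.0 * ne))


def implied_ne(p_concordant, tau=1.0):
    """Invert the formula above: Ne = -tau / (2 * ln(3 * (1 - P) / 2)).

    Only defined for 1/3 < P < 1, the range the formula can produce
    for a finite, positive Ne.
    """
    if not (1.0 / 3.0 < p_concordant < 1.0):
        raise ValueError("P must lie strictly between 1/3 and 1")
    return -tau / (2.0 * math.log(1.5 * (1.0 - p_concordant)))


def relative_ne_reduction(p_reference, p_region, tau=1.0):
    """Fractional reduction in Ne of a region relative to a reference set.

    tau cancels in the ratio, so the result does not depend on its value
    as long as both regions share the same internode length.
    """
    return 1.0 - implied_ne(p_region, tau) / implied_ne(p_reference, tau)


# Hypothetical proportions of concordant trees (placeholders only):
p_all_loci = 0.70   # e.g., all loci
p_low_recomb = 0.75  # e.g., lowest-recombination subset
print(f"Implied reduction in Ne: {relative_ne_reduction(p_all_loci, p_low_recomb):.1%}")
```

Because τ cancels when two regions are compared, setting τ=1 is a convenience rather than an assumption about the actual internode length; the comparison does assume that both regions share the same τ.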

The finding that LRRs more often indicate the true species tree could be used to boost the inference of species relationships from whole-genome data in two ways. Quantitatively, gene trees could be weighted by recombination rates in order to estimate the species tree more accurately. Qualitatively, if regions of low recombination (LRRs, sex chromosomes) and no recombination (mitochondria, microchromosomes) all support a particular phylogeny more strongly than the genomic average, this would provide additional support for that species phylogeny. These pieces of data could provide crucial contextual information for resolving species trees when ILS is prevalent in a clade. The future of phylogenetics inevitably lies in the inference of phylogenies from genome-wide sequence data. Therefore, an important challenge is to develop phylogenetic models that can not only handle the heterogeneous process of molecular evolution across the genome, but also utilize these differences in order to enhance inferences of the species phylogeny. This implies that some of the most crucial developments in phylogenetic models will not necessarily be improvements in sequence alignment or gene tree inference, but rather in models that contextually interpret genomic heterogeneity in sequence evolution.
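As one possible illustration of the quantitative suggestion above, the following Python sketch tallies gene tree topologies with weights that decrease as the local recombination rate increases. The inverse-rate weighting, the `pseudo_rate` offset, the function name, and the example topologies and rates are all assumptions made for illustration; they are not a method used or evaluated in this study.

```python
from collections import defaultdict


def weighted_topology_support(gene_trees, pseudo_rate=1.0):
    """Return the weighted share of support for each topology.

    gene_trees: iterable of (topology_label, recombination_rate) pairs.
    Each tree contributes weight 1 / (rate + pseudo_rate), so loci in
    low-recombination regions count more. pseudo_rate (an assumed
    smoothing constant) keeps non-recombining loci at a finite weight.
    """
    totals = defaultdict(float)
    for topology, rate in gene_trees:
        if rate < 0:
            raise ValueError("recombination rate must be non-negative")
        totals[topology] += 1.0 / (rate + pseudo_rate)
    if not totals:
        return {}
    total_weight = sum(totals.values())
    return {topology: w / total_weight for topology, w in totals.items()}


# Hypothetical input: topology label and local recombination rate (cM/Mb).
example_trees = [
    ("((H,C),G)", 0.0),
    ("((H,G),C)", 3.0),
    ("((H,C),G)", 1.0),
]
for topology, share in weighted_topology_support(example_trees).items():
    print(topology, round(share, 3))
```

Other weighting functions (for example, weights derived from an explicit model of Ne as a function of recombination rate) would fit the same interface; the choice here is only the simplest one that is monotonic in the rate.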

Acknowledgements

We thank Daniel Pollard and Grace Yuh Chwen Lee for providing data for the Dmel clade, and Daniel Garrigan for providing Dsim data. L. Kubatko and two anonymous reviewers also made very helpful suggestions that improved the manuscript. This research was supported by the Indiana University Genetics, Molecular and Cellular Sciences Training Grant (T32-GM007757) and a fellowship from the Alfred P. Sloan Foundation to M.W.H.

LITERATURE CITED

  1. Auton A, Fledel-Alon A, Pfeifer S, Venn O, Segurel L, Street T, Leffler EM, Bowden R, Aneas I, Broxholme J, Humburg P, Iqbal Z, Lunter G, Maller J, Hernandez RD, Melton C, Venkat A, Nobrega MA, Bontrop R, Myers S, Donnelly P, Przeworski M, McVean G. A fine-scale chimpanzee genetic map from population sequencing. Science. 2012;336:193–198. doi: 10.1126/science.1216872.
  2. Begun DJ, Aquadro CF. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature. 1992;356:519–520. doi: 10.1038/356519a0.
  3. Betancourt AJ, Presgraves DC, Swanson WJ. A test for faster X evolution in Drosophila. Mol. Biol. Evol. 2002;19:1816–1819. doi: 10.1093/oxfordjournals.molbev.a004006.
  4. Betancourt AJ, Welch JJ, Charlesworth B. Reduced effectiveness of selection caused by a lack of recombination. Curr. Biol. 2009;19:655–660. doi: 10.1016/j.cub.2009.02.039.
  5. Brown WM, Prager EM, Wang A, Wilson AC. Mitochondrial DNA sequences of primates: tempo and mode of evolution. J. Mol. Evol. 1982;18:225–239. doi: 10.1007/BF01734101.
  6. Cai JJ, Macpherson JM, Sella G, Petrov DA. Pervasive hitchhiking at coding and regulatory sites in humans. PLoS Genet. 2009;5:e1000336. doi: 10.1371/journal.pgen.1000336.
  7. Campos JL, Charlesworth B, Haddrill PR. Molecular evolution in nonrecombining regions of the Drosophila melanogaster genome. Genome Biol. Evol. 2012;4:278–288. doi: 10.1093/gbe/evs010.
  8. Cattani MV, Presgraves DC. Incompatibility between X chromosome factor and pericentric heterochromatic region causes lethality in hybrids between Drosophila melanogaster and its sibling species. Genetics. 2012;191:549–559. doi: 10.1534/genetics.112.139683.
  9. Charlesworth B. The effects of deleterious mutations on evolution at linked sites. Genetics. 2012;190:5–22. doi: 10.1534/genetics.111.134288.
  10. Charlesworth D, Morgan MT, Charlesworth B. Mutation Accumulation in Finite Outbreeding and Inbreeding Populations. Genet. Res. 1993;61:39–56.
  11. Chen FC, Li WH. Genomic divergences between humans and other hominoids and the effective population size of the common ancestor of humans and chimpanzees. Am. J. Hum. Genet. 2001;68:444–456. doi: 10.1086/318206.
  12. Chen FC, Vallender EJ, Wang H, Tzeng CS, Li WH. Genomic divergence between human and chimpanzee estimated from large-scale alignments of genomic sequences. J. Hered. 2001;92:481–489. doi: 10.1093/jhered/92.6.481.
  13. Comeron JM, Ratnappan R, Bailin S. The Many Landscapes of Recombination in Drosophila melanogaster. PLoS Genet. 2012;8:e1002905. doi: 10.1371/journal.pgen.1002905.
  14. Coyne JA, Orr HA. Speciation. Sinauer Associates; Sunderland, Mass: 2004.
  15. Davis AW, Wu CI. The broom of the sorcerer's apprentice: the fine structure of a chromosomal region causing reproductive isolation between two sibling species of Drosophila. Genetics. 1996;143:1287–1298. doi: 10.1093/genetics/143.3.1287.
  16. Degnan JH, Rosenberg NA. Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol. Evol. 2009;24:332–340. doi: 10.1016/j.tree.2009.01.009.
  17. Degnan JH, Salter LA. Gene tree distributions under the coalescent process. Evolution. 2005;59:24–37.
  18. Dumont BL, Payseur BA. Evolution of the genomic rate of recombination in mammals. Evolution. 2008;62:276–294. doi: 10.1111/j.1558-5646.2007.00278.x.
  19. Dumont BL, Payseur BA. Genetic analysis of genome-scale recombination rate evolution in house mice. PLoS Genet. 2011;7:e1002116. doi: 10.1371/journal.pgen.1002116.
  20. Edwards SV. Is a new and general theory of molecular systematics emerging? Evolution. 2009;63:1–19. doi: 10.1111/j.1558-5646.2008.00549.x.
  21. Felsenstein J. Inferring phylogenies. Sinauer Associates; Sunderland, Mass: 2004.
  22. Ferris SD, Wilson AC, Brown WM. Evolutionary tree for apes and humans based on cleavage maps of mitochondrial DNA. Proc. Natl. Acad. Sci. U. S. A. 1981;78:2432–2436. doi: 10.1073/pnas.78.4.2432.
  23. Fujita PA, Rhead B, Zweig AS, Hinrichs AS, Karolchik D, Cline MS, Goldman M, Barber GP, Clawson H, Coelho A, Diekhans M, Dreszer TR, Giardine BM, Harte RA, Hillman-Jackson J, Hsu F, Kirkup V, Kuhn RM, Learned K, Li CH, Meyer LR, Pohl A, Raney BJ, Rosenbloom KR, Smith KE, Haussler D, Kent WJ. The UCSC Genome Browser database: update 2011. Nucleic Acids Res. 2011;39:D876–882. doi: 10.1093/nar/gkq963.
  24. Galtier N, Nabholz B, Glemin S, Hurst GD. Mitochondrial DNA as a marker of molecular diversity: a reappraisal. Mol. Ecol. 2009;18:4541–4550. doi: 10.1111/j.1365-294X.2009.04380.x.
  25. Garrigan D, Kingan SB, Geneva AJ, Andolfatto P, Clark AG, Thornton KR, Presgraves DC. Genome sequencing reveals complex speciation in the Drosophila simulans clade. Genome Res. 2012;22:1499–1511. doi: 10.1101/gr.130922.111.
  26. Gossmann TI, Woolfit M, Eyre-Walker A. Quantifying the variation in the effective population size within a genome. Genetics. 2011;189:1389–1402. doi: 10.1534/genetics.111.132654.
  27. Haddrill PR, Halligan DL, Tomaras D, Charlesworth B. Reduced efficacy of selection in regions of the Drosophila genome that lack crossing over. Genome Biol. 2007;8:R18. doi: 10.1186/gb-2007-8-2-r18.
  28. Hahn MW. Toward a selection theory of molecular evolution. Evolution. 2008;62:255–265. doi: 10.1111/j.1558-5646.2007.00308.x.
  29. Hellmann I, Ebersberger I, Ptak SE, Paabo S, Przeworski M. A neutral explanation for the correlation of diversity with recombination rates in humans. Am. J. Hum. Genet. 2003;72:1527–1535. doi: 10.1086/375657.
  30. Hey J, Kliman RM. Interactions between natural selection, recombination and gene density in the genes of Drosophila. Genetics. 2002;160:595–608. doi: 10.1093/genetics/160.2.595.
  31. Hixson JE, Brown WM. A comparison of the small ribosomal RNA genes from the mitochondrial DNA of the great apes and humans: sequence, structure, evolution, and phylogenetic implications. Mol. Biol. Evol. 1986;3:1–18. doi: 10.1093/oxfordjournals.molbev.a040379.
  32. Hobolth A, Christensen OF, Mailund T, Schierup MH. Genomic relationships and speciation times of human, chimpanzee, and gorilla inferred from a coalescent hidden Markov model. PLoS Genet. 2007;3:e7. doi: 10.1371/journal.pgen.0030007.
  33. Hobolth A, Dutheil JY, Hawks J, Schierup MH, Mailund T. Incomplete lineage sorting patterns among human, chimpanzee, and orangutan suggest recent orangutan speciation and widespread selection. Genome Res. 2011;21:349–356. doi: 10.1101/gr.114751.110.
  34. Hudson RR. Properties of a neutral allele model with intragenic recombination. Theor. Popul. Biol. 1983a;23:183–201. doi: 10.1016/0040-5809(83)90013-8.
  35. Hudson RR. Testing the constant-rate neutral allele model with protein sequence data. Evolution. 1983b;37:203–217. doi: 10.1111/j.1558-5646.1983.tb05528.x.
  36. Hudson RR. Gene genealogies and the coalescent process. In: Futuyma D, Antonovics J, editors. Oxford Surveys in Evolutionary Biology. Oxford University Press; 1990. pp. 1–44.
  37. International HapMap Consortium. Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, Belmont JW, Boudreau A, Hardenbol P, Leal SM, Pasternak S, Wheeler DA, Willis TD, Yu F, Yang H, Zeng C, Gao Y, Hu H, Hu W, Li C, Lin W, Liu S, Pan H, Tang X, Wang J, Wang W, Yu J, Zhang B, Zhang Q, Zhao H, Zhao H, Zhou J, Gabriel SB, Barry R, Blumenstiel B, Camargo A, Defelice M, Faggart M, Goyette M, Gupta S, Moore J, Nguyen H, Onofrio RC, Parkin M, Roy J, Stahl E, Winchester E, Ziaugra L, Altshuler D, Shen Y, Yao Z, Huang W, Chu X, He Y, Jin L, Liu Y, Shen Y, Sun W, Wang H, Wang Y, Wang Y, Xiong X, Xu L, Waye MM, Tsui SK, Xue H, Wong JT, Galver LM, Fan JB, Gunderson K, Murray SS, Oliphant AR, Chee MS, Montpetit A, Chagnon F, Ferretti V, Leboeuf M, Olivier JF, Phillips MS, Roumy S, Sallee C, Verner A, Hudson TJ, Kwok PY, Cai D, Koboldt DC, Miller RD, Pawlikowska L, Taillon-Miller P, Xiao M, Tsui LC, Mak W, Song YQ, Tam PK, Nakamura Y, Kawaguchi T, Kitamoto T, Morizono T, Nagashima A, Ohnishi Y, Sekine A, Tanaka T, Tsunoda T, Deloukas P, Bird CP, Delgado M, Dermitzakis ET, Gwilliam R, Hunt S, Morrison J, Powell D, Stranger BE, Whittaker P, Bentley DR, Daly MJ, de Bakker PI, Barrett J, Chretien YR, Maller J, McCarroll S, Patterson N, Pe'er I, Price A, Purcell S, Richter DJ, Sabeti P, Saxena R, Schaffner SF, Sham PC, Varilly P, Altshuler D, Stein LD, Krishnan L, Smith AV, Tello-Ruiz MK, Thorisson GA, Chakravarti A, Chen PE, Cutler DJ, Kashuk CS, Lin S, Abecasis GR, Guan W, Li Y, Munro HM, Qin ZS, Thomas DJ, McVean G, Auton A, Bottolo L, Cardin N, Eyheramendy S, Freeman C, Marchini J, Myers S, Spencer C, Stephens M, Donnelly P, Cardon LR, Clarke G, Evans DM, Morris AP, Weir BS, Tsunoda T, Mullikin JC, Sherry ST, Feolo M, Skol A, Zhang H, Zeng C, Zhao H, Matsuda I, Fukushima Y, Macer DR, Suda E, Rotimi CN, Adebamowo CA, Ajayi I, Aniagwu T, Marshall PA, Nkwodimmah C, Royal CD, Leppert MF, Dixon M, Peiffer A, Qiu R, Kent A, Kato K, Niikawa N, Adewole IF, Knoppers BM, Foster MW, Clayton EW, Watkin J, Gibbs RA, Belmont JW, Muzny D, Nazareth L, Sodergren E, Weinstock GM, Wheeler DA, Yakub I, Gabriel SB, Onofrio RC, Richter DJ, Ziaugra L, Birren BW, Daly MJ, Altshuler D, Wilson RK, Fulton LL, Rogers J, Burton J, Carter NP, Clee CM, Griffiths M, Jones MC, McLay K, Plumb RW, Ross MT, Sims SK, Willey DL, Chen Z, Han H, Kang L, Godbout M, Wallenburg JC, L'Archeveque P, Bellemare G, Saeki K, Wang H, An D, Fu H, Li Q, Wang Z, Wang R, Holden AL, Brooks LD, McEwen JE, Guyer MS, Wang VO, Peterson JL, Shi M, Spiegel J, Sung LM, Zacharia LF, Collins FS, Kennedy K, Jamieson R, Stewart J. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007;449:851–861. doi: 10.1038/nature06258.
  38. Kaplan NL, Hudson RR, Langley CH. The “hitchhiking effect” revisited. Genetics. 1989;123:887–899. doi: 10.1093/genetics/123.4.887.
  39. Koop BF, Goodman M, Xu P, Chan K, Slightom JL. Primate η-globin DNA sequences and man's place among the great apes. Nature. 1986;319:234–238. doi: 10.1038/319234a0.
  40. Koop BF, Tagle DA, Goodman M, Slightom JL. A molecular view of primate phylogeny and important systematic and evolutionary questions. Mol. Biol. Evol. 1989;6:580–612. doi: 10.1093/oxfordjournals.molbev.a040574.
  41. Langley CH, Stevens K, Cardeno C, Lee YC, Schrider DR, Pool JE, Langley SA, Suarez C, Corbett-Detig RB, Kolaczkowski B, Fang S, Nista PM, Holloway AK, Kern AD, Dewey CN, Song YS, Hahn MW, Begun DJ. Genomic variation in natural populations of Drosophila melanogaster. Genetics. 2012;192:533–598. doi: 10.1534/genetics.112.142018.
  42. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23:2947–2948. doi: 10.1093/bioinformatics/btm404.
  43. Legrand D, Chenel T, Campagne C, Lachaise D, Cariou ML. Inter-island divergence within Drosophila mauritiana, a species of the D. simulans complex: Past history and/or speciation in progress? Mol. Ecol. 2011;20:2787–2804. doi: 10.1111/j.1365-294X.2011.05127.x.
  44. Maddison WP. Gene trees in species trees. Syst. Biol. 1997;46:523–536.
  45. Masly JP, Presgraves DC. High-resolution genome-wide dissection of the two rules of speciation in Drosophila. PLoS Biol. 2007;5:e243. doi: 10.1371/journal.pbio.0050243.
  46. Maynard Smith J, Haigh J. Hitch-hiking effect of a favorable gene. Genet. Res. 1974;23:23–35.
  47. McNabney DR. The genetic basis of behavioral isolation between Drosophila mauritiana and D. sechellia. Evolution. 2012;66:2182–2190. doi: 10.1111/j.1558-5646.2012.01600.x.
  48. McQuilton P, St Pierre SE, Thurmond J, FlyBase Consortium. FlyBase 101 – the basics of navigating FlyBase. Nucleic Acids Res. 2012;40:D706–714. doi: 10.1093/nar/gkr1030.
  49. McVicker G, Gordon D, Davis C, Green P. Widespread genomic signatures of natural selection in hominid evolution. PLoS Genet. 2009;5:e1000471. doi: 10.1371/journal.pgen.1000471.
  50. Meiklejohn CD, Montooth KL, Rand DM. Positive and negative selection on the mitochondrial genome. Trends Genet. 2007;23:259–263. doi: 10.1016/j.tig.2007.03.008.
  51. Nachman MW, Bauer VL, Crowell SL, Aquadro CF. DNA variability and recombination rates at X-linked loci in humans. Genetics. 1998;150:1133–1141. doi: 10.1093/genetics/150.3.1133.
  52. Nunes MD, Wengel PO, Kreissl M, Schlotterer C. Multiple hybridization events between Drosophila simulans and Drosophila mauritiana are supported by mtDNA introgression. Mol. Ecol. 2010;19:4695–4707. doi: 10.1111/j.1365-294X.2010.04838.x.
  53. Osada N, Wu CI. Inferring the mode of speciation from genomic data: a study of the great apes. Genetics. 2005;169:259–264. doi: 10.1534/genetics.104.029231.
  54. Pamilo P, Nei M. Relationships between gene trees and species trees. Mol. Biol. Evol. 1988;5:568–583. doi: 10.1093/oxfordjournals.molbev.a040517.
  55. Patterson N, Richter DJ, Gnerre S, Lander ES, Reich D. Genetic evidence for complex speciation of humans and chimpanzees. Nature. 2006;441:1103–1108. doi: 10.1038/nature04789.
  56. Piganeau G, Eyre-Walker A. Evidence for variation in the effective population size of animal mitochondrial DNA. PLoS ONE. 2009;4:e4396. doi: 10.1371/journal.pone.0004396.
  57. Pollard DA, Iyer VN, Moses AM, Eisen MB. Widespread discordance of gene trees with species tree in Drosophila: evidence for incomplete lineage sorting. PLoS Genet. 2006;2:e173. doi: 10.1371/journal.pgen.0020173.
  58. Presgraves DC. A fine-scale genetic analysis of hybrid incompatibilities in Drosophila. Genetics. 2003;163:955–972. doi: 10.1093/genetics/163.3.955.
  59. Prufer K, Munch K, Hellmann I, Akagi K, Miller JR, Walenz B, Koren S, Sutton G, Kodira C, Winer R, Knight JR, Mullikin JC, Meader SJ, Ponting CP, Lunter G, Higashino S, Hobolth A, Dutheil J, Karakoc E, Alkan C, Sajjadian S, Catacchio CR, Ventura M, Marques-Bonet T, Eichler EE, Andre C, Atencia R, Mugisha L, Junhold J, Patterson N, Siebauer M, Good JM, Fischer A, Ptak SE, Lachmann M, Symer DE, Mailund T, Schierup MH, Andres AM, Kelso J, Paabo S. The bonobo genome compared with the chimpanzee and human genomes. Nature. 2012;486:527–531. doi: 10.1038/nature11128.
  60. Rand DM. Population genetics of the cytoplasm and the units of selection on mitochondrial DNA in Drosophila melanogaster. Genetica. 2011;139:685–697. doi: 10.1007/s10709-011-9576-y.
  61. Rokas A. Phylogenetic analysis of protein sequence data using the Randomized Axelerated Maximum Likelihood (RAXML) Program. Curr. Protoc. Mol. Biol. 2011;Chapter 19:Unit 19.11. doi: 10.1002/0471142727.mb1911s96.
  62. Rokas A, Kruger D, Carroll SB. Animal evolution and the molecular signature of radiations compressed in time. Science. 2005;310:1933–1938. doi: 10.1126/science.1116759.
  63. Saitou N, Nei M. The number of nucleotides required to determine the branching order of three species, with special reference to the human-chimpanzee-gorilla divergence. J. Mol. Evol. 1986;24:189–204. doi: 10.1007/BF02099966.
  64. Satta Y, Klein J, Takahata N. DNA archives and our nearest relative: the trichotomy problem revisited. Mol. Phylogenet. Evol. 2000;14:259–275. doi: 10.1006/mpev.2000.0704.
  65. Scally A, Dutheil JY, Hillier LW, Jordan GE, Goodhead I, Herrero J, Hobolth A, Lappalainen T, Mailund T, Marques-Bonet T, McCarthy S, Montgomery SH, Schwalie PC, Tang YA, Ward MC, Xue Y, Yngvadottir B, Alkan C, Andersen LN, Ayub Q, Ball EV, Beal K, Bradley BJ, Chen Y, Clee CM, Fitzgerald S, Graves TA, Gu Y, Heath P, Heger A, Karakoc E, Kolb-Kokocinski A, Laird GK, Lunter G, Meader S, Mort M, Mullikin JC, Munch K, O'Connor TD, Phillips AD, Prado-Martinez J, Rogers AS, Sajjadian S, Schmidt D, Shaw K, Simpson JT, Stenson PD, Turner DJ, Vigilant L, Vilella AJ, Whitener W, Zhu B, Cooper DN, de Jong P, Dermitzakis ET, Eichler EE, Flicek P, Goldman N, Mundy NI, Ning Z, Odom DT, Ponting CP, Quail MA, Ryder OA, Searle SM, Warren WC, Wilson RK, Schierup MH, Rogers J, Tyler-Smith C, Durbin R. Insights into hominid evolution from the gorilla genome sequence. Nature. 2012;483:169–175. doi: 10.1038/nature10842.
  66. Smukowski CS, Noor MA. Recombination rate variation in closely related species. Heredity. 2011;107:496–508. doi: 10.1038/hdy.2011.44.
  67. Stajich JE, Hahn MW. Disentangling the effects of demography and selection in human history. Mol. Biol. Evol. 2005;22:63–73. doi: 10.1093/molbev/msh252.
  68. Tajima F. Evolutionary relationship of DNA sequences in finite populations. Genetics. 1983;105:437–460. doi: 10.1093/genetics/105.2.437.
  69. Takahata N, Satta Y, Klein J. Divergence time and population size in the lineage leading to modern humans. Theor. Popul. Biol. 1995;48:198–221. doi: 10.1006/tpbi.1995.1026.
  70. Ting CT, Tsaur SC, Wu CI. The phylogeny of closely related species as revealed by the genealogy of a speciation gene, Odysseus. Proc. Natl. Acad. Sci. U. S. A. 2000;97:5313–5316. doi: 10.1073/pnas.090541597.
  71. White MA, Ane C, Dewey CN, Larget BR, Payseur BA. Fine-scale phylogenetic discordance across the house mouse genome. PLoS Genet. 2009;5:e1000729. doi: 10.1371/journal.pgen.1000729.
