Abstract
The Dobzhansky-Muller model of speciation posits that defects in hybrids between species are the result of negative epistatic interactions between alleles that arose in independent genetic backgrounds. Tests of one important prediction from this model, that incompatibilities “snowball”, have relied on comparisons of the number of incompatibilities between closely related pairs of species separated by different divergence times. How incompatibilities accumulate along phylogenies, however, remains poorly understood. We extend the Dobzhansky-Muller model to multi-species clades to describe the mathematical relationship between tree topology and the number of shared incompatibilities among related pairs of species. We use these results to develop a statistical test that distinguishes between the snowball and alternative incompatibility accumulation models, including non-epistatic and multi-locus incompatibility models, in a phylogenetic context. We further demonstrate that patterns of incompatibility sharing across species pairs can be used to estimate the relative frequencies of different types of incompatibilities, including derived-derived vs. derived-ancestral incompatibilities. Our results and statistical methods should motivate comparative genetic mapping of hybrid incompatibilities to evaluate competing models of speciation.
Keywords: Dobzhansky-Muller incompatibilities, phylogenetic comparison, reproductive isolation, speciation
INTRODUCTION
The genetics of reproductive isolation provides a powerful framework for understanding the mechanisms of speciation (Dobzhansky 1937). Though other types of reproductive isolation have been investigated, most studies of speciation from a genetic perspective have targeted hybrid inviability and hybrid sterility (intrinsic postzygotic isolation; Coyne and Orr 2004; Wolf et al. 2010; Butlin et al. 2012). Hybrid dysfunction has been attributed to a large number of genomic regions across a wide variety of species (True et al. 1996; Gadau et al. 1999; Presgraves 2003; Slotman et al. 2004; Good et al. 2008; Moyle and Nakazato 2008; White et al. 2011; Burkart-Waco et al. 2012). In a growing number of cases, specific genes have been identified (Sawamura and Yamamoto 1997; Ting et al. 1998; Barbash et al. 2003; Presgraves et al. 2003; Brideau et al. 2006; Masly et al. 2006; Bomblies et al. 2007; Lee et al. 2008; Bayes and Malik 2009; Bikard et al. 2009; Ferree and Barbash 2009; Mihola et al. 2009; Phadnis and Orr 2009; Tang and Presgraves 2009). This remarkable success motivates continued development of theoretical models that generate testable predictions about the genetics of speciation.
Chief among genetic models of postzygotic isolation is one proposed by Bateson (1909), Dobzhansky (1937) and Muller (1940, 1942). This “Dobzhansky-Muller model” supposes that hybrid sterility and inviability arise from negative epistatic interactions between alleles at different loci, dubbed Dobzhansky-Muller incompatibilities [DMIs] (Turelli and Orr 2000). Incompatibility alleles arise and fix in the distinct genetic backgrounds of incipient species where they do not decrease viability or fertility. When hybridization occurs, genotypic combinations that have not been tested by natural selection form and reduce fitness. By attributing hybrid inviability and hybrid sterility to epistatic interactions between loci, the model avoids forcing any lineage to pass through an unfit intermediate at a single locus. Under the Dobzhansky-Muller model, hybrid incompatibilities result from either substitutions along different lineages (derived-derived) or from substitutions along a single lineage (derived-ancestral).
While recent theoretical work on the accumulation of hybrid incompatibilities has focused on the Dobzhansky-Muller model, there are several alternatives that do not invoke epistasis. Theory suggests such models are unlikely to produce strong reproductive isolation because they require populations to temporarily transition through unfit intermediates (Gavrilets 2003). These models have nevertheless been proposed to explain empirical results. Reproductive isolation resulting from the fixation of mildly underdominant chromosomal rearrangements has been explored as one such non-epistatic model (Kirkpatrick and Barton 2006; Coyne and Orr 2004; Walsh 1982). Experimental work in the genus Saccharomyces has led to a model of reproductive isolation involving sequence divergence and mismatch repair without epistasis (Liti 2006; Greig et al. 2002). Since these non-epistatic models invoke the progressive accumulation of mutations at single loci, hybrid incompatibilities between divergent species should accumulate linearly with time (Gourbière and Mallet 2010). This prediction has rarely been tested empirically.
Catalyzed by Orr (1995), the Dobzhansky-Muller model became the focus of mathematical theory that generated many useful insights into incompatibility evolution (Orr 1996; Gavrilets 1997; Barton 2001; Turelli and Orr 2000; Orr and Turelli 2001; Turelli et al. 2001; Kondrashov et al. 2002; Gavrilets 2004; Mendelson et al. 2004; Welch 2004; Palmer and Feldman 2009; Unckless and Orr 2009; Fierst and Hansen 2010; Bank et al. 2012; Livingstone et al. 2012). One noteworthy prediction is the “snowball effect”: the expected number of hybrid incompatibilities increases faster than linearly with time since divergence. To test this prediction, incompatibilities are counted in at least two species pairs separated by different divergence times. This approach has recently uncovered evidence for the snowball effect in Drosophila (Matute et al. 2010) and some isolation traits in Solanum (Moyle and Nakazato 2010), although other isolation traits show patterns consistent with alternative modes of evolution (Moyle and Nakazato 2010).
The potential power of comparing incompatibilities among species pairs within a clade raises the prospect of embedding the Dobzhansky-Muller model within a phylogenetic comparative framework (Moyle and Payseur 2009). The temporal dynamics of Dobzhansky-Muller incompatibility (DMI) evolution can be reconstructed by scoring incompatibilities as shared or unshared among several pairs of species with known phylogenies (Cattani and Presgraves 2009). As a result, predictions from the Dobzhansky-Muller model can be tested in a phylogenetic context. Despite widespread acceptance of the Dobzhansky-Muller model as an explanation for the evolution of postzygotic isolation and empirical success in mapping incompatibilities across a range of species, the way in which incompatibilities accumulate along phylogenies remains largely unknown.
Theory describing how incompatibilities evolve along phylogenies would also enable estimation of the relative contributions of different incompatibility classes. For example, mathematical descriptions of the Dobzhansky-Muller model have assumed that derived-derived incompatibilities have the same probability of occurring as derived-ancestral incompatibilities (Orr 1995; Turelli and Orr 2000; Orr and Turelli 2001). This has led to the prediction that derived alleles are three times more likely to participate in incompatibilities than ancestral alleles (Orr 1995). However, substitutions at a single locus may influence the rate of evolution at epistatically linked loci (Presgraves and Stephan 2007; Schlosser and Wagner 2008; Tang and Presgraves 2009). Co-evolution of incompatible loci would result in departures from the expected proportion of derived-derived versus derived-ancestral incompatibilities.
Here, we describe how the numbers of incompatibilities shared between species pairs depend on phylogenetic history. We use these theoretical results to develop statistical tests that use the number of shared and unique incompatibilities counted between species pairs to compare models of incompatibility evolution and to estimate their parameters. We show that this comparative genetic mapping framework is a powerful approach for understanding the evolution of reproductive isolation.
METHODS AND RESULTS
Model Preliminaries
Mathematical models of DMI accumulation predict the number of incompatibilities between two lineages as a function of divergence time (Orr 1995; Orr and Turelli 2001; Welch 2004). These models consider two populations slowly diverging in allopatry from a common ancestor whereupon they independently accumulate and fix new mutations. The mathematical treatment of the model involves a number of simplifications:
New substitutions in the two populations are at independent sites and do not overlap.
Substitutions appear in these populations according to a Poisson process and fix instantly.
The mean number of substitutions separating two nodes is proportional to the branch length, or evolutionary time, separating them. The proportionality constant is denoted by k. This is also the rate of the Poisson process.
Any pair of substitutions evolving in different lineages has a small probability, p, of interacting to cause a hybrid incompatibility.
As the number of divergent sites, K, increases, the possible number of epistatic interactions between n loci increases combinatorially as $\binom{K}{n}$ (Orr 1995).
When only epistatic interactions between two loci are considered, the number of incompatibilities, I, as a function of time since divergence has expected value and variance (Orr and Turelli 2001): $E[I] = \frac{p\,E[K]^2}{2}$ and $\mathrm{Var}[I] = \frac{p\,E[K]^2}{2}\left(1 + 2p\,E[K]\right)$, where E[K] equals 2kt, the expected number of substitutions separating two species at time t. Similar mathematical treatment can be used to find expected values and variances in the case of epistatic interactions between more than two loci. The expected number of DMIs for interactions between n loci (see Appendix for details) is
$$E[I_n] = \frac{p_n\,E[K]^n}{n!} \qquad (1)$$
where In is the number of n-locus incompatibilities and pn is the probability that an n-locus interaction between substitutions results in a hybrid incompatibility. This relationship can also be employed to determine the variance in the number of n-locus DMIs,
| (2) |
It is important to remember that these values are expectations and variances for incompatibilities involving exactly n loci only. If higher-order interactions are thought to contribute, a model for the total number of DMIs will need to quantify the contributions of different multi-locus interactions to the total number of incompatibilities (e.g., the relative frequencies of two- and three-locus incompatibilities).
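As a numerical illustration of the snowball, the following minimal sketch evaluates the n-locus expectation, written here as p_n E[K]^n / n! (the Poisson expectation of p_n times the number of n-subsets of K substitutions), together with the two-locus variance in the form that agrees with Table 1. The function names and the parameter values (k = 50, p = 1e-4) are hypothetical and chosen only for illustration.

```python
from math import factorial

def expected_dmis(p_n, mean_k, n=2):
    """Expected number of n-locus incompatibilities when K ~ Poisson(mean_k).

    Uses E[C(K, n)] = mean_k**n / n! for a Poisson-distributed K.
    """
    return p_n * mean_k ** n / factorial(n)

def two_locus_variance(p, mean_k):
    """Variance of the two-locus count: (p * m**2 / 2) * (1 + 2 * p * m)."""
    return expected_dmis(p, mean_k, 2) * (1 + 2 * p * mean_k)

# Hypothetical values: k substitutions per unit time, p per-pair probability.
k, p = 50.0, 1e-4
for t in (1.0, 2.0, 4.0):
    m = 2 * k * t  # E[K] = 2kt for two lineages diverging for time t
    print(t, expected_dmis(p, m), two_locus_variance(p, m))
```

Doubling the divergence time multiplies the two-locus expectation by four and the three-locus expectation by eight, which is the faster-than-linear growth that the snowball prediction refers to.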
Incompatibilities on a Three-Species Tree
Our goal is to understand predictions of the Dobzhansky-Muller model for the evolution of DMIs in a multi-species clade, including predictions about the number and phyletic distribution of incompatibilities among hybrids between different species pairs. We consider the simplest case with phylogenetic structure where three species have diverged from a recent common ancestor. Assumptions from the previous mathematical treatment (Orr 1995; Orr and Turelli 2001), including the gradual divergence of populations in allopatry and the instantaneous fixation of mutations, continue to apply.
On this simple tree (see Figure 1) we label the two most recently diverged sister taxa Species A and Species B; and we label the third taxon Species C. We call the time of the event where A and B diverge Td and define t2 to be the elapsed time from Td to present. We define t1 to be the elapsed time from the initial divergence at the root to Td. We assume that the substitution rate per genome per unit time, k, is the same across every branch in the tree.
Figure 1.
A three-species tree on which hybrid incompatibilities are considered. Td marks the time when Species A and B diverge, t1 marks the time from the initial divergence of Species C to Td, and t2 marks the time from Td to present. The number of substitutions from each branch, as referenced in the text, KL, KR, Ka, Kb, and Kc, are also labeled on their respective branches. Note that Kt = KL + KR.
Pairs of taxa from these three lineages yield three classes of hybrids: AB, AC, and BC. When considering each of these hybrid classes individually, the expected number of DMIs follows the behavior described by Turelli and Orr (2001). That is, each of the two taxa forming a hybrid can be considered separately as two populations that have diverged from a common ancestor. The total time since divergence of the taxa that form the AB hybrid is t2 and the total time since divergence of the taxa that form the AC and BC hybrids is t1 + t2. The expected values and variances for the number of incompatibilities can be calculated as described in previous work (Turelli and Orr 2001; Table 1). However, the numbers of DMIs between different hybrid classes are not independent, but depend on the topology of the phylogeny. To understand covariation between the numbers of DMIs, we must first describe how incompatibilities are shared between different species pairs.
Table 1.
Expectation and variance values for the two-locus model; k is the substitution rate per genome per unit time, t1 is the time from the root to Td, t2 is the time from Td to the present, and p is the probability of an incompatibility.
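| | Expectation | Variance |
|---|---|---|
| IAB | $2p(kt_2)^2$ | $2p(kt_2)^2(1+4pkt_2)$ |
| IAC | $2p(kt_1+kt_2)^2$ | $2p(kt_1+kt_2)^2(1+4p(kt_1+kt_2))$ |
| IBC | $2p(kt_1+kt_2)^2$ | $2p(kt_1+kt_2)^2(1+4p(kt_1+kt_2))$ |
| IsharedA | $\frac{p}{2}(kt_2)^2$ | $\frac{p}{2}(kt_2)^2(1+2pkt_2)$ |
| IsharedB | $\frac{p}{2}(kt_2)^2$ | $\frac{p}{2}(kt_2)^2(1+2pkt_2)$ |
| IsharedC | $\frac{p}{2}(2kt_1+kt_2)^2$ | $\frac{p}{2}(2kt_1+kt_2)^2(1+2p(2kt_1+kt_2))$ |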
Shared Incompatibilities
We call an incompatibility shared between different hybrids if it involves the same interaction between the same set of loci in each of the hybrids. With three species, there are three possible pairwise comparisons: AB with AC, AB with BC, and AC with BC. We assume the same alleles that cause an incompatibility in one hybrid are sufficient to cause the same incompatibility in another hybrid. As a consequence of our modeling assumptions, we disregard incompatibilities that are the result of convergence since new incompatibilities must arise from independently occurring substitutions.
Thus, shared incompatibilities can only exist among species pairs if they share some of the same substitutions. This leads to a distinction in how derived-derived and derived-ancestral incompatibilities are shared between hybrids (Figure 2). Notice in our example that there is no way for a DMI to be shared among all three species pairs. Since derived-derived incompatibilities involve interactions between substitutions that have occurred in different lineages, they can only be shared between species pairs when there is shared evolutionary history among all parental species. In contrast, because derived-ancestral incompatibilities involve substitutions that have occurred on a single lineage, they can be shared if at least one parent from each pair shares evolutionary history. Thus, sharing of derived-derived incompatibilities is more restrictive.
Figure 2.
Diagram of derived-derived (left) and derived-ancestral (right) interactions involving the last substitution in Species C. Open circles on the tree indicate where substitutions have occurred on a lineage. Closed circles on the tree indicate the corresponding ancestral allele on other lineages. Solid arrows represent interactions that would lead to shared incompatibilities between the AC and BC hybrids while dashed arrows represent interactions unique to a single hybrid.
Consider the shared incompatibilities between the AC and BC hybrids in our three-species example. Interactions between substitutions that arose before Td, labeled Kt = KL + KR in Figure 1, can result in shared incompatibilities that are both derived-derived and derived-ancestral since all parental species of AC and BC share the same history before Td. Substitutions that arose after Td on Species C’s lineage, Kc, can participate in shared derived-ancestral incompatibilities with other substitutions in Kc, and shared derived-derived incompatibilities by interacting with substitutions in KL and KR. In contrast, substitutions that arose after Td in the lineages of Species A and B, Ka and Kb respectively, will not result in any shared DMIs.
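The sharing rules described above can be made concrete by enumerating pairs of branches. The sketch below is an illustrative encoding, not part of the original derivation: the labels "a", "b", "c", "L" and "R" follow the substitution counts in Figure 1, and we assume that "L" is the internal branch ancestral to A and B while "R" is the part of the C lineage before Td. The names `LINEAGES`, `hybrid_pairs` and `shared_pairs` are hypothetical.

```python
from itertools import combinations_with_replacement

# Branches on the path from the root to each species (assumed labeling):
# "L" = internal branch ancestral to A and B, "R" = C lineage before Td,
# "a", "b", "c" = branches after Td.
LINEAGES = {"A": ("L", "a"), "B": ("L", "b"), "C": ("R", "c")}

def hybrid_pairs(sp1, sp2):
    """Map each pair of branches that can interact in the sp1 x sp2 hybrid
    to "DD" (derived-derived) or "DA" (derived-ancestral)."""
    # Branches common to both species carry substitutions fixed in both
    # parents, so they cannot contribute to an incompatibility between them.
    side1 = set(LINEAGES[sp1]) - set(LINEAGES[sp2])
    side2 = set(LINEAGES[sp2]) - set(LINEAGES[sp1])
    result = {}
    for x, y in combinations_with_replacement(sorted(side1 | side2), 2):
        same_side = ({x, y} <= side1) or ({x, y} <= side2)
        result[(x, y)] = "DA" if same_side else "DD"
    return result

def shared_pairs(h1, h2):
    """Branch pairs whose interactions are present in both hybrids."""
    first, second = hybrid_pairs(*h1), hybrid_pairs(*h2)
    return {pair: kind for pair, kind in first.items() if pair in second}

print(shared_pairs("AB", "AC"))
print(shared_pairs("AC", "BC"))
```

Under this encoding the AB and AC hybrids share only interactions among substitutions on the "a" branch, whereas the AC and BC hybrids share every interaction among the "L", "R" and "c" branches, in line with the reasoning in the text.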
The total number of shared two-locus incompatibilities between AC and BC, IsharedC, is then
$$I_{\mathrm{sharedC}} = I_{K_t \times K_t} + I_{K_c \times K_c} + I_{K_c \times K_t} \qquad (3)$$
where IKt × Kt represents the incompatibilities from substitutions before Td, IKc × Kc represents the derived-ancestral incompatibilities from substitutions on the C branch after Td, and IKc × Kt represents the interactions from substitutions on the C lineage after Td with all substitutions that arose before Td.
We can calculate the expected value of IsharedC by using information about the expected number of substitutions on each branch. Let Kt be the total number of substitutions between the two lineages before Td; there are $\binom{K_t}{2}$ possible two-locus incompatibilities involving only the Kt substitutions. Let Kc be the number of substitutions on the C lineage after Td; there are $\binom{K_c}{2}$ possible two-locus derived-ancestral incompatibilities involving only the Kc substitutions. All interactions between the Kt and Kc substitutions are also shared; there are (Kt)(Kc) such shared interactions. Given the number of substitutions, the expected value of IsharedC is then
$$E[I_{\mathrm{sharedC}} \mid K] = p\left[\binom{K_t}{2} + \binom{K_c}{2} + K_t K_c\right] = p\binom{K_t + K_c}{2} \qquad (4)$$
This simplification shows that the number of shared interactions is equivalent to the number of combinations between substitutions shared by both pairs of parents. Under the assumption that the substitution rate k is the same across every branch of the tree,
$$E[K_t] = 2kt_1, \qquad E[K_c] = kt_2 \qquad (5)$$
we can calculate E[IsharedC] using Eq. (1), because Kt + Kc is Poisson distributed with mean E[Kt]+E[Kc]=k(2t1+t2)
$$E[I_{\mathrm{sharedC}}] = \frac{p}{2}\,k^2(2t_1+t_2)^2 \qquad (6)$$
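The expectation of IsharedC can be checked numerically without simulation. The sketch below averages p times the number of pairs among Kt + Kc substitutions over the two Poisson distributions and compares the result with the closed form p(k(2t1 + t2))^2/2 implied by the mean k(2t1 + t2) stated above. The helper names and the parameter values are hypothetical, and the Poisson sums are truncated once the tail is negligible.

```python
from math import comb, exp

def poisson_pmf_table(mean, tol=1e-15):
    """Return P(K = 0), P(K = 1), ... until the upper tail is negligible."""
    probs = [exp(-mean)]
    j = 0
    while j < mean or probs[-1] > tol:
        j += 1
        probs.append(probs[-1] * mean / j)
    return probs

def expected_shared_c(p, k, t1, t2):
    """E[I_sharedC] obtained by averaging p * C(Kt + Kc, 2) over Kt and Kc."""
    pmf_t = poisson_pmf_table(2 * k * t1)  # Kt: both lineages before Td
    pmf_c = poisson_pmf_table(k * t2)      # Kc: the C lineage after Td
    total = 0.0
    for kt, wt in enumerate(pmf_t):
        for kc, wc in enumerate(pmf_c):
            total += wt * wc * p * comb(kt + kc, 2)
    return total

# Hypothetical parameter values for illustration only.
p, k, t1, t2 = 0.01, 5.0, 1.0, 2.0
numeric = expected_shared_c(p, k, t1, t2)
closed_form = p * (k * (2 * t1 + t2)) ** 2 / 2
print(numeric, closed_form)
```

The two printed values should agree up to floating-point and truncation error, because Kt + Kc is itself Poisson distributed.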
We now consider the incompatibilities shared between the AB and AC hybrids. While interactions between KL and KR can exist in the AC hybrid, they are absent in the AB hybrid, since the substitutions before divergence are fixed in both parental species. The only shared incompatibilities are derived-ancestral incompatibilities arising from Ka. By the same reasoning, the incompatibilities shared between the AB and BC hybrids involve only derived-ancestral interactions arising from Kb substitutions. The total number of possible two-locus incompatibilities shared between the AB and AC hybrids, IsharedA, and between the AB and BC hybrids, IsharedB, is then
$$I_{\mathrm{sharedA}} = I_{K_a \times K_a}, \qquad I_{\mathrm{sharedB}} = I_{K_b \times K_b} \qquad (7)$$
where IKa × Ka and IKb × Kb are the number of derived-ancestral incompatibilities that have arisen from substitutions after Td in the A and B lineages respectively.
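The expected values of IsharedA and IsharedB given information on the number of substitutions in each branch, K, are then
$$E[I_{\mathrm{sharedA}} \mid K] = p\binom{K_a}{2}, \qquad E[I_{\mathrm{sharedB}} \mid K] = p\binom{K_b}{2}$$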
Again, under the assumption that the substitution rate, k, is the same across every branch of the tree
$$E[K_a] = E[K_b] = kt_2 \qquad (8)$$
We then calculate E[IsharedA] and E[IsharedB] using Eq. (1), substituting from Eq. (8),
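$$E[I_{\mathrm{sharedA}}] = E[I_{\mathrm{sharedB}}] = \frac{p}{2}(kt_2)^2 \qquad (9)$$
We now consider the variance of these shared incompatibilities, following Eq. (2). We do so below for the variance of IsharedC. Recall from Eq. (4) that the number of shared interactions is equal to the number of pairs among the Kt + Kc substitutions shared by both pairs of parents; applying Eq. (2),
$$\mathrm{Var}[I_{\mathrm{sharedC}}] = \frac{p\,E[K_t+K_c]^2}{2}\left(1 + 2p\,E[K_t+K_c]\right)$$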
Using the independence of Kt and Kc and substituting from Eq. (5), the variance of IsharedC is
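$$\mathrm{Var}[I_{\mathrm{sharedC}}] = \frac{p}{2}\,k^2(2t_1+t_2)^2\left(1 + 2pk(2t_1+t_2)\right) \qquad (10)$$
The calculations for IsharedA and IsharedB follow the same steps, and the results are presented in Table 1.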
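The six measures differ only in the expected number of substitutions that can contribute to them, so their means and variances can be tabulated with one helper. The sketch below is an illustration under the equal-rate, two-locus assumptions of this section; the function names and parameter values are hypothetical, and the variance uses the form (p m^2/2)(1 + 2pm) that reproduces the IAB, IAC and IBC rows of Table 1.

```python
def two_locus_moments(mean_k, p):
    """Mean and variance of a two-locus count built from mean_k expected substitutions."""
    mean = p * mean_k ** 2 / 2
    return mean, mean * (1 + 2 * p * mean_k)

def three_species_table(p, k, t1, t2):
    """Mean and variance of the six measures on the tree of Figure 1."""
    # Expected number of substitutions that can contribute to each measure.
    substitutions = {
        "I_AB": 2 * k * t2,
        "I_AC": 2 * k * (t1 + t2),
        "I_BC": 2 * k * (t1 + t2),
        "I_sharedA": k * t2,
        "I_sharedB": k * t2,
        "I_sharedC": k * (2 * t1 + t2),
    }
    return {name: two_locus_moments(m, p) for name, m in substitutions.items()}

# Hypothetical parameter values for illustration only.
table = three_species_table(p=0.01, k=5.0, t1=1.0, t2=2.0)
for name, (mean, var) in table.items():
    print(f"{name:10s} mean={mean:.3f} var={var:.3f}")
```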
Covariance between Incompatibilities
From the pairwise comparison of three species, we have derived six different measures of the number of DMIs between three species: the incompatibilities in each of the three species pairs and the incompatibilities shared among the three species pairs. Next, we consider the covariance between each of these six measures of DMIs. Recall that each interaction between a set of substitutions can be modeled as a Bernoulli trial with probability p of becoming an incompatibility. While this assumption is unlikely to be true across the genome, empirical data to guide the development of a more sophisticated alternative is absent and previous theoretical treatments have maintained this assumption (Orr 1995; Turelli and Orr 2001; Welch 2004). Given the substitutions on each branch, the formation of one incompatibility is then assumed to be independent from another. The relationship between different measures of the number of incompatibilities is then dependent only on which incompatibilities are shared. We partition the incompatibilities of each measure into the independent components that can be shared between measures.
| (11) |
This partitioning follows the logic from the calculation of the shared incompatibilities. The measures of shared incompatibilities involve interactions as described in Eqs. (3) and (7).
These relationships allow us to calculate the covariance given the number of substitutions. The calculation of this sum is demonstrated below for the covariance between IsharedC and IAC. We begin by partitioning the covariance following Eq. (11),
| (12a) |
Notice that the second term, the covariance of IsharedC with IKa × Ka, is zero as there are no substitutions that participate in both sets of incompatibilities. Since the variance of IsharedC was calculated earlier, this leaves only the calculation of the third term, which can be determined by conditioning on K.
We need only consider the second term, as the expected covariance given K for the first term is zero.
We calculate this using the definition of covariance, then expand and simplify with Eq. (1)
| (12b) |
Substituting Eq. (12b) and the variance of IsharedC from Eq. (10) into Eq. (12a) yields the total covariance. We can also substitute from Eqs. (5) and (8) to arrive at the covariance in terms of k, p, t1 and t2.
| (13) |
The 14 other covariances, out of 15 unique covariances from the 6 measures of incompatibility, are provided in Table 2 (derivations are shown in supplementary material).
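The conditioning argument used for the covariance between IsharedC and IAC can be evaluated numerically. The sketch below is an illustration, not the derivation in the supplementary material. It assumes, as described above, that given the substitution counts every incompatibility counted in IsharedC is also counted in IAC and that all other interactions are independent Bernoulli trials, so the conditional covariance equals the conditional variance of IsharedC. The helper names and parameter values are hypothetical.

```python
from math import comb, exp

def poisson_pmf_table(mean, tol=1e-15):
    """Return P(K = 0), P(K = 1), ... until the upper tail is negligible."""
    probs = [exp(-mean)]
    j = 0
    while j < mean or probs[-1] > tol:
        j += 1
        probs.append(probs[-1] * mean / j)
    return probs

def cov_shared_c_ac(p, k, t1, t2):
    """Cov(I_sharedC, I_AC) via the law of total covariance, conditioning on K."""
    pmf_s = poisson_pmf_table(k * (2 * t1 + t2))  # S = Kt + Kc
    pmf_a = poisson_pmf_table(k * t2)             # Ka
    e_cond_cov = e_x = e_y = e_xy = 0.0
    for s, ws in enumerate(pmf_s):
        x = p * comb(s, 2)               # E[I_sharedC | K]
        e_cond_cov += ws * (1 - p) * x   # Var(I_sharedC | K), binomial variance
        for a, wa in enumerate(pmf_a):
            w = ws * wa
            y = p * comb(s + a, 2)       # E[I_AC | K]
            e_x += w * x
            e_y += w * y
            e_xy += w * x * y
    # E[Cov(. | K)] + Cov(E[. | K], E[. | K])
    return e_cond_cov + (e_xy - e_x * e_y)

# Hypothetical parameter values for illustration only.
print(cov_shared_c_ac(p=0.01, k=5.0, t1=1.0, t2=2.0))
```

The same pattern, with different sets of contributing substitutions, could be used to check the other non-zero covariances.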
Table 2.
Covariance values for the two-locus model.
| | Covariance |
|---|---|
| IAB, IAC | |
| IAB, IBC | |
| IAB, IsharedA | |
| IAB, IsharedB | |
| IAC, IsharedA | |
| IBC, IsharedB | |
| IAC, IsharedC | |
| IBC, IsharedC | |
| IAC, IBC | |
| IAB, IsharedC | 0 |
| IAC, IsharedB | 0 |
| IBC, IsharedA | 0 |
| IsharedA, IsharedB | 0 |
| IsharedA, IsharedC | 0 |
| IsharedB, IsharedC | 0 |
Alternative Models on a Simple Tree
The treatment of the two-locus Dobzhansky-Muller model described above can be modified to include additional scenarios that describe the evolution of incompatibilities. We consider scenarios that involve different probabilities of derived-derived and derived-ancestral incompatibilities, three-locus interactions, and incompatibility accumulation that is linear with time.
Different Probabilities of Derived-Derived and Derived-Ancestral Incompatibilities
The number of incompatibilities does not depend on where substitutions fall on the two lineages in the two-locus Dobzhansky-Muller model (Muller 1942). Though interactions of alleles that arise in the same lineage are distinguished from interactions of alleles that arise from different lineages, it has been assumed that both types of interactions, derived-ancestral and derived-derived, are equally likely to produce an incompatibility (Orr 1995; Orr and Turelli 2001; Welch 2004). By relaxing this assumption, we may be able to recapture variation from population processes neglected by assumptions in the mathematical formulation of the Dobzhansky-Muller model. This would mean that the number of incompatibilities is no longer independent of where substitutions occur with respect to the lineages.
Recall that the classical mathematical treatment of the Dobzhansky-Muller model assumes the instantaneous and independent fixation of substitutions. If fixation is less than instantaneous, alleles that arise sequentially on a single lineage have a chance of co-existing in the same genome at the same time. By coexisting in a single lineage, the allelic combination would be subject to selection, precluding them from participating in a DMI and reducing the overall probability of derived-ancestral incompatibilities. Alternatively, substitutions on a single lineage may co-evolve such that a substitution at one incompatibility-forming allele favors substitutions at interacting alleles. This type of dependence would create an imbalance of substitutions on one lineage and manifest as a greater probability of derived-ancestral incompatibilities.
We define two new parameters: let pa be the probability that an untested derived-ancestral combination leads to an incompatibility, and pd be the probability that an untested derived-derived combination leads to an incompatibility. Departing from our three-taxon case momentarily, we reconsider the number of two-locus DMIs in two divergent populations with these new parameters. Let KL be the number of substitutions on one branch of the tree and KR be the number of substitutions on the other. Since interactions between substitutions from the same branch lead to derived-ancestral incompatibilities while interactions between substitutions from different branches lead to derived-derived incompatibilities, we have
$$E[I \mid K] = p_a\left[\binom{K_L}{2} + \binom{K_R}{2}\right] + p_d\,K_L K_R \qquad (14)$$
Since we have maintained the assumption of equal rates of substitution on all branches,
$$E[K_L] = E[K_R] = kt \qquad (15)$$
and the expected number of incompatibilities is
$$E[I] = (p_a + p_d)(kt)^2 \qquad (16)$$
It is clear from Eq. (16) that it is impossible to estimate pa independently from pd given only an estimate of kt and a measure of the number of incompatibilities between two species. Because the probabilities occur together as a sum, it is only possible to estimate (pa+pd). We show below this is not the case when more than two species are considered and that separate estimates of pa and pd can be made.
Returning to our example of three species on a known tree, we apply Eq. (16) to derive the expected number of incompatibilities in the AB, AC, and BC hybrids by simply substituting the correct divergence times.
$$E[I_{AB}] = (p_a+p_d)(kt_2)^2, \qquad E[I_{AC}] = E[I_{BC}] = (p_a+p_d)(kt_1+kt_2)^2 \qquad (17)$$
Recall that hybrids of the AB and AC species pairs, as well as hybrids of the AB and BC species pairs, share only their derived-ancestral incompatibilities, IsharedA and IsharedB respectively. Following Eq. (9), the expected number of shared incompatibilities is
$$E[I_{\mathrm{sharedA}}] = E[I_{\mathrm{sharedB}}] = \frac{p_a}{2}(kt_2)^2$$
However, shared incompatibilities between hybrids of the AC and BC species pairs, IsharedC, involve both derived-derived and derived-ancestral incompatibilities. To consider these, let KL and KR now represent the number of substitutions on their respective branches as in Figure 1. Following Eq. (3), we partition IKt in IsharedC into IKL and IKR.
E[IKc × Kc] consists of (pa/2) E[Kc]^2 derived-ancestral incompatibilities, and we have already solved for IKt × Kt in Eq. (14), leaving us only to partition IKc × Kt. Kc and KL are substitutions from branches on opposite sides of the root while Kc and KR are on branches from the same side, making the incompatibilities that form from the former derived-derived and those from the latter derived-ancestral:
$$E[I_{K_c \times K_t}] = p_d\,E[K_c]\,E[K_L] + p_a\,E[K_c]\,E[K_R]$$
Combining this with our previous results and substituting from Eqs. (5) and (15) results in
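$$E[I_{\mathrm{sharedC}}] = \frac{p_a}{2}(kt_2)^2 + (p_a+p_d)(kt_1)^2 + (p_a+p_d)\,k^2 t_1 t_2 \qquad (18)$$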
The expected values for all six incompatibility measures are listed in Table 3. Similar reasoning and methods can be used to calculate the variance and covariances between the numbers of incompatibilities for this model. The results of these calculations are presented in Table 4.
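To illustrate why a third species separates pa from pd, the sketch below compares two parameter settings with the same sum pa + pd and then inverts the expectations of IAB and IsharedA. This is a simple method-of-moments illustration under the equal-rate assumption, not the statistical procedure developed in this paper. It uses E[IAB] = (pa + pd)(kt2)^2 from Table 3 and E[IsharedA] = (pa/2)(kt2)^2, which follows from the statement that AB and AC share only derived-ancestral incompatibilities. The function names and numerical values are hypothetical.

```python
def expected_counts(pa, pd, k, t2):
    """Expected I_AB and I_sharedA when derived-ancestral and derived-derived
    interactions have probabilities pa and pd."""
    e_ab = (pa + pd) * (k * t2) ** 2
    e_shared_a = pa * (k * t2) ** 2 / 2
    return e_ab, e_shared_a

def moment_estimates(i_ab, i_shared_a, k, t2):
    """Solve the two expectations for pa and pd, treating observed counts as means."""
    scale = (k * t2) ** 2
    pa_hat = 2 * i_shared_a / scale
    pd_hat = i_ab / scale - pa_hat
    return pa_hat, pd_hat

# Two settings with the same pa + pd give the same E[I_AB] ...
print(expected_counts(0.01, 0.03, k=5.0, t2=2.0))
print(expected_counts(0.03, 0.01, k=5.0, t2=2.0))
# ... but different E[I_sharedA], which is what separates pa from pd.
print(moment_estimates(4.0, 0.5, k=5.0, t2=2.0))
```

With noisy counts the estimate of pd from this inversion can be negative, so it is best read as a demonstration of identifiability rather than as a recommended estimator.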
Table 3.
Expectation and variance values for the Dobzhansky-Muller model where the probability of derived-ancestral incompatibilities, pa, is different from the probability of derived-derived incompatibilities, pd.
| | Expectation | Variance |
|---|---|---|
| IAB | $(p_a+p_d)(kt_2)^2$ | $(p_a+p_d)(kt_2)^2(1+2kt_2(p_a+p_d))$ |
| IAC | $(p_a+p_d)(kt_1+kt_2)^2$ | $(p_a+p_d)(kt_1+kt_2)^2(1+2(kt_1+kt_2)(p_a+p_d))$ |
| IBC | $(p_a+p_d)(kt_1+kt_2)^2$ | $(p_a+p_d)(kt_1+kt_2)^2(1+2(kt_1+kt_2)(p_a+p_d))$ |
| IsharedA | | |
| IsharedB | | |
| IsharedC | | |
Table 4.
Covariance values for the pa ≠ pd Dobzhansky-Muller model.
| | Covariance |
|---|---|
| IAB, IAC | |
| IAB, IBC | |
| IAB, IsharedA | |
| IAB, IsharedB | |
| IAC, IsharedA | |
| IBC, IsharedB | |
| IAC, IsharedC | |
| IBC, IsharedC | |
| IAC, IBC | |
| IAB, IsharedC | 0 |
| IAC, IsharedB | 0 |
| IBC, IsharedA | 0 |
| IsharedA, IsharedB | 0 |
| IsharedA, IsharedC | 0 |
| IsharedB, IsharedC | 0 |
Notice that when pa = pd = p, this model reduces to the formulation presented in Section 2. The observation that predicted quantities change when pa ≠ pd raises the possibility of estimating the fractions of derived-derived and derived-ancestral incompatibilities. If the assumption that substitutions are independent and accumulate proportional to branch length holds, then the ratio of these parameters represents the ratio of derived-derived and derived-ancestral incompatibilities.
Three-Locus Incompatibilities
Most of the developments in both the theoretical and empirical understanding of reproductive isolation under a Dobzhansky-Muller framework have focused on epistasis between two loci (Turelli and Orr 2001; Mendelson et al. 2004; Turelli and Moyle 2007; Gourbière and Mallet 2010; Moyle and Nakazato 2010; Matute et al. 2010). This focus is likely explained by the challenge of mapping higher-order epistatic incompatibilities in crosses with practical sample sizes. There is, however, no theoretical reason to believe DMIs responsible for reproductive isolation are mainly the result of two-locus interactions. DMIs involving a greater number of loci may actually be more common because a greater number of participating loci allow a greater number of unconstrained evolutionary paths (Cabot et al. 1994; Orr 1995).
We expand the Dobzhansky-Muller model to incorporate three-locus incompatibilities by modeling the total number of DMIs as the result of both interactions between two loci and interactions among three loci. Let untested allele pairs and triples from these substitutions have probabilities p2 and p3, respectively, of resulting in an incompatibility. Note that both types of incompatibilities arise from the same set of substitutions. The expected numbers of incompatibilities and shared incompatibilities follow from Eq. (1) as a sum of the n = 2 and n = 3 terms, because of the linearity of the expectation operator.
For example, the expected number of incompatibilities between species A and species B, IAB, from our three species example can be calculated as
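$$I_{AB} = I_{AB(2)} + I_{AB(3)} \qquad (19)$$
where IAB(2) are the two-locus incompatibilities between species A and B, and IAB(3) are the three-locus incompatibilities between species A and B. Given information on the number of substitutions, K, the expectation is
$$E[I_{AB} \mid K] = p_2\binom{K_a+K_b}{2} + p_3\binom{K_a+K_b}{3}$$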
Applying Eq. (1) and then substituting for E[Ka] and E[Kb],
$$E[I_{AB}] = 2p_2(kt_2)^2 + \frac{4}{3}\,p_3(kt_2)^3 \qquad (20)$$
The calculations of the variances and covariances for the number of incompatibilities in this model are more involved because of statistical interaction between the n = 2 and n = 3 terms. These values can still be computed with the same techniques used in the previous sections, but we do not reproduce them here. The results are presented, along with the expected values for this model, in Tables 5 and 6.
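The combined expectation can be sketched in a few lines. The example below sums the n = 2 and n = 3 terms of the form p_n m^n / n!, where m is the expected number of substitutions separating the two species, and also reports the value of p3 at which the two classes would contribute equally. The function names and parameter values are hypothetical and serve only as an illustration.

```python
from math import factorial

def expected_total_dmis(p2, p3, mean_k):
    """E[I] = p2 * m**2 / 2! + p3 * m**3 / 3! for m expected substitutions."""
    two_locus = p2 * mean_k ** 2 / factorial(2)
    three_locus = p3 * mean_k ** 3 / factorial(3)
    return two_locus + three_locus

def p3_for_equal_contribution(p2, mean_k):
    """Value of p3 at which both classes contribute equally at mean_k substitutions."""
    return 3 * p2 / mean_k

# Hypothetical parameter values for illustration only.
k, t2 = 5.0, 2.0
m_ab = 2 * k * t2  # expected substitutions separating A and B
print(expected_total_dmis(p2=0.01, p3=0.001, mean_k=m_ab))
print(p3_for_equal_contribution(p2=0.01, mean_k=m_ab))
```

Because the number of triples grows with the cube of the number of substitutions, the value of p3 that balances the two classes shrinks in proportion to 1/m as divergence increases.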
Table 5.
Expectation and variance values for the three-locus model with probability p3 for three-locus incompatibilities and probability p2 for two-locus incompatibilities.
| | Expectation | Variance |
|---|---|---|
| IAB | | |
| IAC | | |
| IBC | | |
| IsharedA | | |
| IsharedB | | |
| IsharedC | | |
Table 6.
Covariance values for the three-locus model.
| | Covariance |
|---|---|
| IAB, IAC | |
| IAB, IBC | |
| IAB, IsharedA | |
| IAB, IsharedB | |
| IAC, IsharedA | |
| IBC, IsharedB | |
| IAC, IsharedC | |
| IBC, IsharedC | |
| IAC, IBC | |
| IAB, IsharedC | 0 |
| IAC, IsharedB | 0 |
| IBC, IsharedA | 0 |
| IsharedA, IsharedB | 0 |
| IsharedA, IsharedC | 0 |
| IsharedB, IsharedC | 0 |
Notice that unlike the model where derived-ancestral incompatibilities are allowed to occur with a different probability than derived-derived incompatibilities, the two parameters in this model, p2 and p3, are not directly comparable. There are many more untested allele combinations that are mathematically possible and even if two-locus and three-locus interactions contributed equally to the total number of incompatibilities, p2 would be orders of magnitude greater than p3.
Non-Epistatic Models: Linear Incompatibility Accumulation
To model the linear accumulation of hybrid incompatibilities among three species, we begin with the same framework developed for the two-locus epistatic model. Instead of interacting with other substitutions, however, each new substitution is assumed to have a small probability of causing an incompatibility on its own. This probability, which we denote p1, represents the chance that a single substitution in one lineage causes an incompatibility in a hybrid with any other lineage. Although p1 appears similar to the probability p in the two-locus model, the two are not analogous: p1 parameterizes a single-locus model that involves no interaction.
The number of incompatibilities predicted in a hybrid of two species by the single-locus model has expectation and variance
$$E[I] = \mathrm{Var}[I] = 2 p_1 k t \tag{21}$$
where t is the amount of time since the divergence of the two species. Consider now the behavior of the number of incompatibilities in the three-species example. As in the two-locus interaction model, the number of incompatibilities in each hybrid considered individually follows the expectation and variance described above. However, the manner in which incompatibilities are shared is much simpler. Since each incompatibility involves only a single allele, incompatibilities are shared as long as they exist in both hybrids. For example, the incompatibilities shared between the AB and AC hybrids, IsharedA, include all incompatibilities that arose from Ka substitutions. The set of these sharing relationships for the single-locus model is summarized below; contrast this with the two-locus Dobzhansky-Muller relationships presented in Eqs. (11).
| (22) |
Using the relationships in Eqs. (21) and (22), the expectation, variance, and covariances of the number of incompatibilities can be calculated for the single-locus model; results are presented in Tables 7 and 8.
Table 7.
Expectation and variance values for the single-locus model of incompatibility accumulation with probability p1 of incompatibility formation for each new substitution.
| | Expectation | Variance |
|---|---|---|
| IAB | 2p1kt2 | 2p1kt2 |
| IAC | 2p1(kt1 + kt2) | 2p1(kt1 + kt2) |
| IBC | 2p1(kt1 + kt2) | 2p1(kt1 + kt2) |
| IsharedA | p1kt2 | p1kt2 |
| IsharedB | p1kt2 | p1kt2 |
| IsharedC | p1(2kt1 + kt2) | p1(2kt1 + kt2) |
Table 8.
Covariance values for the single-locus model.
| | Covariance |
|---|---|
| IAB, IAC | p1kt2 |
| IAB, IBC | p1kt2 |
| IAB, IsharedA | p1kt2 |
| IAB, IsharedB | p1kt2 |
| IAC, IsharedA | p1kt2 |
| IBC, IsharedB | p1kt2 |
| IAC, IsharedC | p1(2kt1 + kt2) |
| IBC, IsharedC | p1(2kt1 + kt2) |
| IAC, IBC | p1(2kt1 + kt2) |
| IAB, IsharedC | 0 |
| IAC, IsharedB | 0 |
| IBC, IsharedA | 0 |
| IsharedA, IsharedB | 0 |
| IsharedA, IsharedC | 0 |
| IsharedB, IsharedC | 0 |
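The single-locus covariance structure can be assembled from three independent incompatibility counts. The sketch below is illustrative only. It assumes the tree ((A,B),C), with kt2 expected substitutions on each of the A and B branches and 2kt1 + kt2 expected substitutions separating C from the A-B ancestor, and it treats each independent count as Poisson (variance equal to mean). The names `INCIDENCE` and `single_locus_moments` and the numerical inputs are hypothetical, not part of the original analysis.

```python
import numpy as np

# Rows: IAB, IAC, IBC, IsharedA, IsharedB, IsharedC.
# Columns: independent incompatibility counts (hypothetical labels):
#   a = arising on the branch to A,
#   b = arising on the branch to B,
#   c = arising on the path separating C from the A-B ancestor.
INCIDENCE = np.array([
    [1, 1, 0],  # IAB
    [1, 0, 1],  # IAC
    [0, 1, 1],  # IBC
    [1, 0, 0],  # IsharedA (shared by the AB and AC hybrids)
    [0, 1, 0],  # IsharedB (shared by the AB and BC hybrids)
    [0, 0, 1],  # IsharedC (shared by the AC and BC hybrids)
], dtype=float)


def single_locus_moments(p1, kt1, kt2):
    """Return (mean vector, covariance matrix) of the six measures."""
    # Expected count of each independent component; under the Poisson
    # assumption these are also the component variances.
    comp = p1 * np.array([kt2, kt2, 2.0 * kt1 + kt2])
    mean = INCIDENCE @ comp
    cov = INCIDENCE @ np.diag(comp) @ INCIDENCE.T
    return mean, cov


# Illustrative values only.
mean, cov = single_locus_moments(p1=1e-3, kt1=2000.0, kt2=8000.0)
print(mean)
print(cov)
```

Under these assumptions the off-diagonal entries agree with the nonzero covariances listed in Table 8, and pairs of measures that share no branch have zero covariance.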
Inference Methods and Simulation
Maximum-Likelihood Estimators and the AIC
We have described how hybrid incompatibilities are distributed across a three-taxon phylogeny under different models of incompatibility accumulation. We use these properties to develop a method that compares the goodness-of-fit between models and estimates model parameters. We continue to focus on the three species example with six observations of incompatibilities; however, it is mathematically trivial to extend this method to a larger number of related species.
We first consider the probability distribution of a vector of observable incompatibility measures. A set of incompatibility measures, e.g. I = {IAB, IAC, IBC, IsharedA, IsharedB, IsharedC}, is the result of a single instance of the incompatibility accumulation process. The set of observations from a genetic mapping study can be thought of as a single draw from a distribution of possible observations. This distribution is a function of the nature of hybrid incompatibility formation. Recall that the number of substitutions between species has been modeled as Poisson and the number of incompatibilities given the number of substitutions has been modeled as binomial; when a hybrid from a single pair of species is considered, the distribution of the number of DMIs has been shown to be well approximated by a Gaussian (Orr and Turelli 2001). We apply this idea and approximate the distribution of the vector of incompatibility measures, I, with a multivariate Gaussian.
We have mathematically formulated the expectations and covariances of incompatibility measures for a selection of different models (results summarized in Tables 1 through 8). By constructing a vector mean and covariance matrix from these results, the probability distribution function of possible incompatibility observations can be defined for any given model. Since the calculated expectations and covariances for each model are themselves functions of the model parameters, the probability distribution function can be written explicitly as a function of the model parameters.
From this explicit probability distribution function, we define a likelihood function for a vector of observed incompatibilities and a set of model parameters. For a given model with six observations, we have
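$$L(\theta \mid x) = (2\pi)^{-3}\, \lvert\Sigma(\theta)\rvert^{-1/2} \exp\!\left(-\tfrac{1}{2}\,\bigl(x-\mu(\theta)\bigr)^{\mathsf T}\, \Sigma(\theta)^{-1} \bigl(x-\mu(\theta)\bigr)\right) \tag{23}$$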
where x is a vector of observed incompatibilities, μ is the vector of expected incompatibilities, Σ is the covariance matrix, ∣Σ∣ is the determinant of the covariance matrix, and θ is a set of parameters in the model.
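As an illustration of how such a likelihood can be evaluated, the following sketch computes the log of a multivariate Gaussian density for an observed vector. It is a generic implementation, not the authors' code; the function name `gaussian_log_likelihood` is hypothetical, and the mean vector and covariance matrix are assumed to have been built from the model parameters beforehand.

```python
import numpy as np


def gaussian_log_likelihood(x, mu, sigma):
    """Log density of a multivariate Gaussian with mean mu and covariance sigma."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    diff = x - mu
    # slogdet avoids overflow or underflow in the determinant.
    sign, logdet = np.linalg.slogdet(sigma)
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Solve a linear system instead of forming the explicit inverse.
    quad = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (x.size * np.log(2.0 * np.pi) + logdet + quad)
```

Working on the log scale leaves the maximizing parameters unchanged and is numerically safer than evaluating the density directly.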
We then find the set of parameters that maximizes this likelihood function for an observed vector of incompatibilities. This is the maximum-likelihood estimator for the model parameters,
$$\hat{\theta} = \underset{\theta}{\arg\max}\; L(\theta \mid x) \tag{24}$$
The relative goodness of fit for each model can be compared by applying the Akaike Information Criterion (Akaike 1974), taking into account that the single-locus and two-locus models each have a single parameter, p1 and p respectively, while the pa ≠ pd and the three-locus models each have two parameters, pa, pd and p2, p3 respectively.
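The comparison can be sketched as follows, using the standard definition AIC = 2k − 2 ln L, where k is the number of free parameters and L is the maximized likelihood. The helper names and the log-likelihood values in the usage example are hypothetical placeholders, not results from this study.

```python
def aic(max_log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a preferred model."""
    return 2.0 * n_params - 2.0 * max_log_likelihood


def preferred_model(fits):
    """fits maps a model name to (maximized log-likelihood, number of parameters)."""
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Made-up log-likelihoods, for illustration only.
example_fits = {
    "single-locus": (-30.0, 1),
    "two-locus": (-25.0, 1),
    "three-locus": (-24.5, 2),
}
best, scores = preferred_model(example_fits)
print(best, scores)
```

In this made-up example the three-locus model attains a slightly higher likelihood than the two-locus model, but not by enough to offset the penalty for its extra parameter.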
Calculating the likelihood for the single-locus model is complicated by the rank of the covariance matrix. Based on how we have modeled single-locus incompatibilities to accumulate, there are three independent measures of incompatibility (see Eq. (22)). As a result, the dimensionality of the covariance matrix exceeds the rank, making the matrix singular and the likelihood ill-defined. To remedy this, we add to the single-locus model a small, bounded parameter, δ, that we call the error term. We envision an empirical measurement error of each observation that scales with the theoretical variance. The error term captures this variance and conveniently increases the rank of the covariance matrix,
| (25) |
where σ², σ, and δ correspond to the respective variances, covariances, and error factor.
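The rank problem and one way an error term can resolve it are illustrated below. This sketch is an assumption-laden example: it builds the single-locus covariance matrix from three independent Poisson components on a ((A,B),C) tree, and it inflates each variance by a factor (1 + δ), which is only one plausible reading of an error that "scales with the theoretical variance". The exact form of Eq. (25) may differ, and all numerical values are illustrative.

```python
import numpy as np

# Rows: IAB, IAC, IBC, IsharedA, IsharedB, IsharedC; columns: the three
# independent incompatibility counts assumed under the single-locus model.
M = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float)

# Illustrative component variances: p1 * (kt2, kt2, 2*kt1 + kt2).
comp = 1e-3 * np.array([8000.0, 8000.0, 12000.0])
sigma = M @ np.diag(comp) @ M.T

# Six measures but only three independent components: the matrix is singular.
rank_before = np.linalg.matrix_rank(sigma)

# Assumed error term: inflate every variance (diagonal entry) by (1 + delta).
delta = 1e-2
sigma_delta = sigma + delta * np.diag(np.diag(sigma))
rank_after = np.linalg.matrix_rank(sigma_delta)

print(rank_before, rank_after)
```

Adding a positive diagonal matrix to a positive semidefinite matrix yields a positive definite matrix, so the determinant and inverse required by the Gaussian likelihood are defined for any δ > 0.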
To summarize, given a particular set of incompatibility observations, the maximum-likelihood estimators are calculated for each model under consideration and the model that yields the lowest AIC is preferred. We have not rigorously justified the approximation that the distribution of a vector of incompatibilities should be a multivariate Gaussian, but the effectiveness of the inference procedure in distinguishing between different models of incompatibility accumulation can be determined by its application to simulated data.
Simulation and Results
To test our inference method, we ran simulations of hybrid incompatibility accumulation on three-species trees under the single-locus, two-locus, three-locus, and pa ≠ pd models we described earlier. We simulated the number of substitutions on each branch by drawing from a Poisson with a mean proportional to the branch length and substitution rate, kt. From these numbers of substitutions, we calculated the number of potential interactions based on the model of incompatibility accumulation being simulated. The number of incompatibilities for each branch was then drawn from a binomial, using the number of potential interactions and the model’s probability (or probabilities) that an interaction becomes an incompatibility as parameters. The maximum-likelihood estimators and AIC values for each of the models were then calculated from the results of each simulation to measure the method’s ability to correctly identify the underlying simulated model.
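For the single-locus model, the simulation procedure described above might look like the following sketch. It assumes the tree ((A,B),C), with expected substitution counts kt2 on the A and B branches, kt1 on the internal branch, and k(t1 + t2) on the C branch; in the single-locus case each substitution is itself the unit that may become an incompatibility. The function name and parameter values are hypothetical, and epistatic models would additionally require counting potential interactions between substitutions.

```python
import numpy as np


def simulate_single_locus(p1, kt1, kt2, rng):
    """Draw one vector (IAB, IAC, IBC, IsharedA, IsharedB, IsharedC)."""
    # Substitutions per branch: Poisson with mean rate * branch length.
    k_a = rng.poisson(kt2)
    k_b = rng.poisson(kt2)
    k_internal = rng.poisson(kt1)      # branch from the root to the A-B ancestor
    k_c = rng.poisson(kt1 + kt2)       # branch from the root to C

    # Each substitution becomes an incompatibility with probability p1.
    i_a = rng.binomial(k_a, p1)
    i_b = rng.binomial(k_b, p1)
    # Incompatibilities separating C from both A and B.
    i_c = rng.binomial(k_internal, p1) + rng.binomial(k_c, p1)

    return np.array([i_a + i_b, i_a + i_c, i_b + i_c, i_a, i_b, i_c])


rng = np.random.default_rng(1)
draws = np.array([simulate_single_locus(1e-3, 2000.0, 8000.0, rng)
                  for _ in range(2000)])
print(draws.mean(axis=0))
```

With these illustrative values, the sample means should lie close to 2p1kt2 = 16 for IAB and 2p1(kt1 + kt2) = 20 for IAC and IBC.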
Since power depends on the total number of incompatibilities simulated, we divided the simulations into 4 groups, corresponding to 20, 50, 100, and 150 expected incompatibilities between the two most diverged species by adjusting the probability that an interaction led to an incompatibility. For each of these groups, we ran 2000 simulations on 15 trees of differing internal branch length while holding the longest branch length constant at k(t1+t2) = 10000. For simulations of models with more than one parameter, the three-locus and pa ≠ pd models, the total number of expected incompatibilities was held constant while the contribution from each component was randomly varied. The same constraint was applied to p2 and p3 for the three-locus model.
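As a worked example of this calibration, the sketch below converts a target number of expected incompatibilities between the two most diverged species into model probabilities. It assumes that the expected number of substitutions separating those species is 2k(t1 + t2) = 20000, that the single-locus expectation is p1 times that number, and that the two-locus expectation follows the n = 2 form p E[K]²/2. The function name `calibrate` is hypothetical, and the resulting values are not necessarily those used in the original simulations.

```python
def calibrate(target, kt_total=10000.0):
    """Return (p1, p) giving `target` expected incompatibilities.

    kt_total is k(t1 + t2), the expected substitutions on one lineage.
    """
    expected_k = 2.0 * kt_total              # substitutions separating the pair
    p1 = target / expected_k                 # single-locus: E[I] = p1 * E[K]
    p = 2.0 * target / expected_k ** 2       # two-locus:    E[I] = p * E[K]^2 / 2
    return p1, p


for target in (20, 50, 100, 150):
    p1, p = calibrate(target)
    print(target, p1, p)
```

For a target of 20 incompatibilities, these assumptions give p1 = 0.001 and p = 1 × 10⁻⁷, which illustrates why probabilities from models of different order are not directly comparable.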
Our estimation of θ is complicated by the fact that the covariance matrix, Σ, itself depends on the parameters being estimated, that is, Σ = Σ(θ). We therefore calculated our maximum-likelihood estimators numerically with an optimization routine, the bounded BFGS algorithm as implemented in the Python programming language’s SciPy library. The error term, δ, for calculation of the single-locus model’s maximum likelihood, was bounded to [1 × 10⁻⁴, 1]. The error term is bounded so that it does not dominate the parameter estimates when the model fits poorly. One concern with adding a bounded parameter is that it will penalize the AIC by adding an additional parameter to the model without allowing the parameter to fully contribute to increasing the likelihood. However, we will see that this is not a concern, since the single-locus model is overwhelmingly supported when the underlying simulation is single-locus.
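A bounded fit of this kind can be sketched with SciPy's `minimize` and the `L-BFGS-B` method, which accepts box constraints. The example below fits the single-locus model only and rests on several assumptions that are not taken from the original study: the ((A,B),C) tree, Poisson component variances, an error term that inflates each variance by (1 + δ), optimization over log p1 for numerical scaling, and the chosen starting values. All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

# Rows: IAB, IAC, IBC, IsharedA, IsharedB, IsharedC.
M = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1],
              [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)


def neg_log_likelihood(theta, x, kt1, kt2):
    """Negative Gaussian log-likelihood; theta = (log p1, delta)."""
    log_p1, delta = theta
    comp = np.exp(log_p1) * np.array([kt2, kt2, 2.0 * kt1 + kt2])
    mu = M @ comp
    sigma = M @ np.diag(comp) @ M.T
    # Assumed error term: variances inflated by (1 + delta).
    sigma = sigma + delta * np.diag(np.diag(sigma))
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    quad = diff @ np.linalg.solve(sigma, diff)
    return 0.5 * (x.size * np.log(2.0 * np.pi) + logdet + quad)


def fit_single_locus(x, kt1, kt2):
    """Return (p1 estimate, delta estimate, maximized log-likelihood)."""
    x = np.asarray(x, dtype=float)
    # Moment-based start for p1 from IAB, guarded against a zero count.
    p1_start = max(x[0] / (2.0 * kt2), 1e-6)
    result = minimize(
        neg_log_likelihood,
        x0=np.array([np.log(p1_start), 1e-2]),
        args=(x, kt1, kt2),
        method="L-BFGS-B",
        bounds=[(np.log(1e-9), 0.0), (1e-4, 1.0)],  # delta bounded as in the text
    )
    return float(np.exp(result.x[0])), float(result.x[1]), float(-result.fun)


p1_hat, delta_hat, loglik = fit_single_locus([16, 20, 20, 8, 8, 12], 2000.0, 8000.0)
print(p1_hat, delta_hat, loglik)
```

Because the covariance matrix shrinks along with p1, the Gaussian likelihood does not peak exactly at the value that matches the observed means, so the estimate need not equal the moment-based starting value.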
The first simulation result we consider is the relative fit of the single-locus and two-locus models when the simulation is single-locus. Figure 3(top) shows the AIC values calculated for the single-locus model to be uniformly lower than the AIC values calculated for the two-locus model. Comparisons of the single-locus AIC with other model AICs show a similar pattern; that is, when the underlying mode of simulated incompatibility accumulation is linear, the single-locus model is always favored. In contrast, two-locus AIC values are not always lower than the single-locus AIC when the underlying mode of incompatibility simulation involves two-locus epistasis (Figure 3(middle)). Nevertheless, the two-locus model is correctly favored over the single-locus model in greater than 80% of simulations even with the longest internal branch length and only 20 expected incompatibilities between the most divergent species (Figure 3(bottom)). These results suggest high power to detect the acceleration in incompatibility accumulation predicted by the Dobzhansky-Muller model. In both comparisons, the distinction between the AIC values increases with a greater number of expected incompatibilities and with greater divergence time between the most closely related species.
Figure 3.
Results from simulations of single-locus and two-locus incompatibility accumulation. Plots are divided horizontally into four columns based on model parameters that correspond to, from left to right, 20, 50, 100, and 150 expected incompatibilities between the most diverged species. Simulations on fifteen three-species trees of differing times since most recent divergence, Td, are represented in each individual plot. Boxes show median and quartile values while whiskers show range. Highlighted values correspond to AIC calculated for a single-locus model while those in black correspond to AIC calculated for a two-locus model. (top) AIC values from a simulation of single-locus incompatibility accumulation. (middle) AIC values from a simulation of two-locus incompatibility accumulation. (bottom) Percentages of two-locus simulations where each model is favored by having lower AIC. (In single-locus simulations, the single-locus model is always favored.)
Differences in AIC values between simulations of Dobzhansky-Muller-type models (two-locus, three-locus, and pa ≠ pd) are much smaller, making the distinction between these models less clear (Figure 4). Nevertheless, Dobzhansky-Muller-type models are consistently favored over the single-locus model in these simulations. As we might expect, the underlying model of simulation is correctly favored more frequently when a greater number of total incompatibilities is simulated. Unlike the comparison between the single-locus and two-locus models, shorter relative internal branch lengths do not necessarily result in a clearer distinction between the different Dobzhansky-Muller-type models. While the two-locus model is still most easily identified with decreasing relative internal branch length, the power to correctly identify the other two Dobzhansky-Muller-type models is not monotonic with relative internal branch length.
Figure 4.
Comparison of favored models for Dobzhansky-Muller-type model simulations. Plots are divided horizontally into four columns based on model parameters that correspond to, from left to right, 20, 50, 100, and 150 expected incompatibilities between the most diverged species. Simulations on fifteen three-species trees of differing times since most recent divergence, Td, are represented in each individual plot. Each bar shows the percentage of simulations where each model had the lowest calculated AIC among compared models. (top) Comparison of favored models from simulations of the two-locus model. (middle) Comparison of favored model from simulations of the pa ≠ pd model. (bottom) Comparison of favored models from simulations of the three-locus model.
We further investigated the nested Dobzhansky-Muller-type models, the pa ≠ pd and three-locus models, to determine the power of the inference method to identify more complex models under various parameter values. For these two models, we ran 5000 simulations on each of the 15 trees of varying internal branch length. The percentage of incompatibilities attributable to any single parameter was randomly assigned, by randomly selecting the ratio of pa to pd or p2 to p3 respectively, while the total number of expected incompatibilities was fixed for the 4 groups (20, 50, 100, and 150 expected incompatibilities). The results were binned into ten fractions based on the percent that each parameter contributed to the total number of incompatibilities. Figure 4 is a representative plot showing the percentage of simulations where the more complex model is favored over simpler Dobzhansky-Muller-type models. There is no simple relationship between power and relative internal branch length. As expected, the pa ≠ pd model is more likely to be favored when there is a greater imbalance between derived-derived and derived-ancestral incompatibilities. Similarly, the three-locus model is more likely to be favored when three-locus interactions account for a greater fraction of the total number of incompatibilities. These results indicate that evidence for complex models of DMI accumulation can be inferred under the right conditions.
DISCUSSION
The comparison of patterns of reproductive isolation across species has generated tremendous insights into the speciation process (Haldane 1922; Coyne and Orr 1989, 1997; Sasa et al. 1998; Presgraves 2002; Lijtmaer et al. 2003; Price and Bouvier 2003; Bolnick and Near 2005). The benefits of the comparative approach extend to genetic mapping of isolation phenotypes (Moyle and Payseur 2009). Our results demonstrate that different modes of incompatibility evolution leave contrasting patterns of incompatibility sharing across species pairs, in a manner that depends on the phylogeny. This finding has important implications for genetic studies of speciation. First, genetic models of speciation can be directly tested using comparative mapping data by counting the number of shared and unique incompatibilities between each lineage. Though the Dobzhansky-Muller model enjoys considerable support from genetic mapping studies, its evolutionary predictions (e.g., Orr 1995) have only recently been tested, with mixed results (Matute et al. 2010; Moyle and Nakazato 2010). Second, our results situate the evolving genetic architecture of reproductive isolation in its phylogenetic context. As with any trait, the genetic changes that generate isolation evolve along a tree, and statistical inferences must take this fact into account. Finally, we see that clades of species with different phylogenetic structures (but similar ages) are expected to show distinct patterns. Our results provide guidance on choosing species groups for reconstructing the evolution of reproductive barriers from a genetic perspective.
Assumptions and Challenges
Two major categories of assumptions underlie the approach described here, those associated with our mathematical extension of the Dobzhansky-Muller model and those associated with our phylogenetic test. Much of the power from our comparative framework comes from modeling shared incompatibilities; this extension of the Dobzhansky-Muller model assumes that the same allelic combinations in different species pairs result in the same incompatibilities. Our extended model also assumes that all of the incompatibilities measured from different species pairs are distributed as a multivariate normal. The validity of this latter assumption should improve with larger numbers of incompatibilities.
Our phylogenetic test assumes that relative divergence time and tree topology estimates are accurate. Relative divergence times can be estimated from substitution rates. For example, the number of synonymous substitutions per synonymous site, Ks, has been used as a proxy for divergence time in recent empirical studies of incompatibility accumulation (Matute et al. 2010; Moyle and Nakazato 2010). Obtaining an accurate phylogeny can be complicated by gene flow and incomplete lineage sorting, which can cause the phylogeny to vary across the genome, especially in closely related species (Maddison 1997; Pollard et al. 2006; White et al. 2009). Phylogenies at incompatibility loci are the most relevant for the evolution of reproductive isolation, but phylogenies from randomly chosen loci may often be the only ones available. The maximum-likelihood method used in our test also implicitly weights each of the incompatibility measurements equally. This assumption, that each measure is equally accurate, is most reasonable when species pairs have similar divergence times, so that deviations from the expected values are equally distributed among the measurements.
Our method currently assumes that all incompatibilities are counted, but this will rarely be true in practice. Sample size constraints and error in genetic mapping generally prohibit the identification of all loci that affect genetically complex traits such as hybrid inviability and hybrid sterility. This problem is exacerbated for epistatic loci, where genetic identification requires the comparison of multi-locus genotypes, further partitioning the sample and requiring large numbers of individuals to find both incompatibility partners. A reduced count of total incompatibilities should not bias the results, as long as incompatibilities that arise on different parts of the phylogeny are equally detectable. Similarly, if we assume that incompatibilities are independent, finding one of the multiple loci that comprise an incompatibility should not introduce a bias. Nevertheless, the effects on method performance when only a subset of incompatibility loci is detected should be examined.
A further challenge comes from the designation of an incompatibility as shared. Because genetic mapping often associates large chromosomal regions with the phenotype, positional overlaps are not always shared genetic changes. Although the ideal solution is to identify or finely map the causative mutations, approaches that statistically evaluate the evidence for shared loci are available (Li et al. 2005; Broman et al. 2012). Extending these methods to treat sets of epistatic loci would be helpful for studies of the genetics of speciation.
Studying the Genetics of Speciation in its Phylogenetic Context
Our simulations show that the Dobzhansky-Muller model and a non-epistatic alternative in which isolation mutations accumulate linearly with time can be reliably distinguished with comparative mapping data. The distinction is clearest in phylogenies with substantial external branch lengths. This is because incompatibilities accumulate stochastically and there is greater uncertainty in the number of incompatibilities when few exist between the most recently diverged species.
Our approach reveals additional characteristics of incompatibility evolution that have long been of interest in speciation genetics. Coevolution of incompatibility substitutions (Tang and Presgraves 2009) – so that one substitution affects the fate of the other – increases the proportion of incompatibilities that arise along one lineage (derived-ancestral incompatibilities). Alternatively, when incompatibility substitutions are evolutionarily independent, a higher fraction of derived-derived incompatibilities is predicted because derived alleles can be incompatible with both derived and ancestral alleles (while ancestral alleles can only be incompatible with derived alleles) (Orr 1995). Identifying both partner loci and reconstructing the phylogenetic placement of substitutions is a powerful approach for distinguishing the two incompatibility types (Cattani and Presgraves 2009), but this is a daunting task for many species. Our method allows estimation of the frequencies of derived-derived and derived-ancestral incompatibilities across the genome. Furthermore, whether incompatibilities mostly involve pairs of substitutions or instead feature higher-order interactions indicates the complexity of the genetic networks disrupted in hybrids. Incompatibilities featuring changes at more than two loci are well known (Cabot et al. 1994; Davis et al. 1994; Palopoli and Wu 1994; Perez and Wu 1995; Orr and Irving 2001; Phadnis and Orr 2009; Phadnis 2011) and predicted to be common (Orr 1995), but their overall contribution to hybrid dysfunction compared to pairwise interactions has not been quantified in the same hybrids. These genetic characterizations of the speciation process would be difficult to draw from individual mapping studies, but are possible with comparative mapping data.
The shape of the tree also affects the ability to differentiate between these flavors of the Dobzhansky-Muller model. Phylogenies with greater external branch length offer higher power to find unequal evolutionary rates for ancestral-derived and derived-derived incompatibilities. In contrast, the ability to detect a significant contribution from three-locus incompatibilities is greatest for phylogenies with intermediate values of internal branch length. This reflects a balance between measuring the number of incompatibilities involving substitutions on the internal branch, which improves with longer internal branch lengths, and measuring the number of incompatibilities between the most recently diverged species, which improves with longer external branch lengths. An exception to these patterns is observed when one species pair has very recently diverged; in this case, the pa ≠ pd model is favored in all simulations of complex Dobzhansky-Muller type models. This is likely due to the greater freedom of pa in the pa ≠ pd model when t2 is small. Comparing the two-parameter models (Table 3 and Table 4), we see that IsharedA and IsharedB depend only on a single parameter, pa, in the pa ≠ pd model whereas all six measures involve both parameters in the three-locus model.
In summary, our results suggest guidelines for selecting species groups for comparative mapping of reproductive isolation. If the primary goal is to distinguish epistatic from non-epistatic models, clades with long internal branches should be avoided. When attempting to gauge the contribution of three-locus incompatibilities or to compare the frequencies of derived-derived and derived-ancestral incompatibilities, groups of species with variable divergence times should be favored. Finally, the addition of data from mapping studies of taxa on the same phylogeny will undoubtedly improve the power to distinguish between models, as it will increase the total number of incompatibilities measured.
Extensions to our approach are worthy of consideration. The method of analysis we employed, incorporating shared incompatibilities and the expected covariance between different species pairs, can be generalized to more complex models. While we have mathematically described only the incorporation of phylogenetic information from three species, the calculation for additional species pairs is straightforward. The primary barrier to the addition of more species in the analysis is the difficulty in collecting incompatibility data from more species pairs. This is apparent when we consider the addition of a fourth species to the phylogeny. For the complete set of incompatibilities among the species of a four-species phylogeny, a genetic analysis of the isolating traits in six hybrids is necessary, double the number required for a phylogeny with only three species. Nevertheless, we anticipate that the insights into speciation uniquely provided by comparative genetic mapping will motivate the increased application of this framework.
Supplementary Material
Acknowledgements
RJW’s participation in this research was supported by a UW NIH Genetics graduate training Grant, the Advanced Opportunity Fellowship through SciMed Graduate Research Scholars at UW, and an NLM training grant to UW in Computation and Informatics in Biology and Medicine NLM 5T15LM007359. BAP’s participation was supported by NSF grant DEB 0918000.
Appendix.
Below we derive Equation (1), the expected number of DMIs when considering interactions between n loci as a function of the expected number of substitutions separating two lineages.
where (x)n is the falling factorial, (x)n = x(x-1)…(x-n+1). Using the definition of the expectation and the fact that K is Poisson distributed, let λ = E[K]
| (A1) |
Recognizing the summation as over the probability mass function of a Poisson distribution, the expected number of DMIs is then
We derive below Equation (2), the variance in the number of DMIs when considering interactions between n loci as a function of the expected number of substitutions between two lineages. We begin with the law of total variance, conditioning on the total number of substitutions between the lineages, K,
Since the number of incompatibilities given the number of substitutions is binomially distributed, each interaction is a Bernoulli trial with probability pn of becoming an incompatibility. The variance in the number of incompatibilities given K is then the product of pn(1-pn) and the number of interactions.
Using Eq. (A1) to simplify the left hand term and recognizing the expected number of incompatibilities given K is the product of pn and the number of substitutions,
Using the definition of variance on the right hand term and Eq. (A1) again
The product of two falling factorials can be expressed with connection coefficients as,
Applying Eq. (A1) and reindexing the sum yields
For n = 2,
For n = 3,
For n = 4,
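The mean and variance described in this appendix can be checked numerically. The sketch below assumes that K is Poisson with mean λ and that, given K, the number of incompatibilities is binomial with C(K, n) trials and success probability p. The closed forms in `moments_closed_form` are a reconstruction under those assumptions, obtained from the Poisson factorial moments E[(K)n] = λⁿ and the product formula for falling factorials; they are not copied from the printed equations and may be arranged differently from them. All function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass, computed on the log scale for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def moments_by_summation(p, lam, n, kmax=400):
    """Mean and variance of I via the laws of total expectation and variance."""
    e_c = 0.0   # E[C(K, n)]
    e_c2 = 0.0  # E[C(K, n)^2]
    for k in range(kmax):
        w = poisson_pmf(k, lam)
        c = math.comb(k, n)  # zero whenever k < n
        e_c += w * c
        e_c2 += w * c * c
    mean = p * e_c
    # E[Var(I | K)] + Var(E[I | K])
    var = p * (1.0 - p) * e_c + p * p * (e_c2 - e_c ** 2)
    return mean, var


def moments_closed_form(p, lam, n):
    """Reconstructed closed forms (an assumption, see the lead-in)."""
    mean = p * lam ** n / math.factorial(n)
    extra = sum(math.comb(n, j) ** 2 * math.factorial(j) * lam ** (2 * n - j)
                for j in range(1, n))
    var = mean + p * p * extra / math.factorial(n) ** 2
    return mean, var


for n in (2, 3, 4):
    print(n, moments_by_summation(0.01, 20.0, n), moments_closed_form(0.01, 20.0, n))
```

For n = 2 the reconstructed variance reduces to pλ²/2 + p²λ³, so the variance exceeds the mean by a term that grows with the cube of the expected number of substitutions.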
LITERATURE CITED
- Akaike H. New look at statistical-model identification. IEEE Trans. Autom. Control. 1974;AC19:716–723. [Google Scholar]
- Bank C, Buerger R, Hermisson J. The limits to parapatric speciation: Dobzhansky-Muller incompatibilities in a continent-island Model. Genetics. 2012;191:845–U345. doi: 10.1534/genetics.111.137513. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Barbash DA, Ashburner M. A novel system of fertility rescue in Drosophila hybrids reveals a link between hybrid lethality and female sterility. Genetics. 2003;163:217–226. doi: 10.1093/genetics/163.1.217. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bayes JJ, Malik HS. Altered heterochromatin binding by a hybrid sterility protein in Drosophila sibling species. Science. 2009;326:1538–1541. doi: 10.1126/science.1181756. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bikard D, Patel D, Metté CL, Giorgi V, Camilleri C, Bennett MJ, Loudet O. Divergent evolution of duplicate genes leads to genetic incompatibilities within A. thaliana. Science. 2009;323:623–626. doi: 10.1126/science.1165917. [DOI] [PubMed] [Google Scholar]
- Bolnick DI, Near TJ. Tempo of hybrid inviability in centrarchid fishes (Teleostei: Centrarchidae) Evolution. 2005;59:1754–1767. [PubMed] [Google Scholar]
- Bomblies K, Lempe J, Epple P, Warthmann N, Lanz C, Dangl JL, Weigel D. Autoimmune response as a mechanism for a Dobzhansky-Muller-type incompatibility syndrome in plants. PLoS Biol. 2007;5:e236. doi: 10.1371/journal.pbio.0050236. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brideau NJ, Flores HA, Wang J, Maheshwari S, Wang X, Barbash DA. Two Dobzhansky-Muller genes interact to cause hybrid lethality in Drosophila. Science. 2006;314:1292–1295. doi: 10.1126/science.1133953. [DOI] [PubMed] [Google Scholar]
- Broman KW, Kim S, Sen S, Ane C, Payseur BA. Mapping quantitative trait loci onto a phylogenetic tree. Genetics. 2012;192:267–U549. doi: 10.1534/genetics.112.142448. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Burkart-Waco D, Josefsson C, Dilkes B, Kozloff N, Torjek O, Meyer R, Altmann T, Comai L. Hybrid incompatibility in Arabidopsis is determined by a multiple-locus genetic network. Plant Physiol. 2012;158:801–812. doi: 10.1104/pp.111.188706. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Butlin R, Debelle A, Kerth C, Snook RR, Beukeboom LW, Cajas RFC, Diao W, Maan ME, Paolucci S, Weissing FJ, van de Zande L, Hoikkala A, Geuverink E, Jennings J, Kankare M, Knott KE, Tyukmaeva VI, Zoumadakis C, Ritchie MG, Barker D, Immonen E, Kirkpatrick M, Noor M, Macias Garcia C, Schmitt T, Schilthuizen M. What do we need to know about speciation? Trends Ecol. Evol. 2012;27:27–39. doi: 10.1016/j.tree.2011.09.002. [DOI] [PubMed] [Google Scholar]
- Cabot EL, Davis AW, Johnson NA, Wu CI. Genetics of reproductive isolation in the Drosophila simulans clade: complex epistasis underlying hybrid male sterility. Genetics. 1994;137:175. doi: 10.1093/genetics/137.1.175. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cattani MV, Presgraves DC. Genetics and lineage-specific evolution of a lethal hybrid incompatibility between Drosophila mauritiana and its sibling species. Genetics. 2009;181:1545–1555. doi: 10.1534/genetics.108.098392. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Coyne JA, Orr HA. Patterns of speciation in Drosophila. Evolution. 1989;43:362–381. doi: 10.1111/j.1558-5646.1989.tb04233.x. [DOI] [PubMed] [Google Scholar]
- Coyne JA, Orr HA. “Patterns of speciation in Drosophila” revisited. Evolution. 1997;51:295–303. doi: 10.1111/j.1558-5646.1997.tb02412.x. [DOI] [PubMed] [Google Scholar]
- Coyne JA, Orr HA. Speciation. 1st ed Sinauer Associates, Inc.; 2004. [Google Scholar]
- Davis A, Noonburg E, Wu C. Evidence for complex genic interactions between conspecific chromosomes underlying hybrid female sterility in the Drosophila-Simulans clade. Genetics. 1994;137:191–199. doi: 10.1093/genetics/137.1.191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dobzhansky TG. Genetics and the origin of species. Columbia University Press; 1937.
- Ferree PM, Barbash DA. Species-specific heterochromatin prevents mitotic chromosome segregation to cause hybrid lethality in Drosophila. PLoS Biol. 2009;7:e1000234. doi: 10.1371/journal.pbio.1000234.
- Fierst JL, Hansen TF. Genetic architecture and postzygotic reproductive isolation: evolution of Bateson-Dobzhansky-Muller incompatibilities in a polygenic model. Evolution. 2010;64:675–693. doi: 10.1111/j.1558-5646.2009.00861.x.
- Gadau J, Page RE, Jr, Werren JH. Mapping of hybrid incompatibility loci in Nasonia. Genetics. 1999;153:1731–1741. doi: 10.1093/genetics/153.4.1731.
- Gavrilets S. Hybrid zones with Dobzhansky-type epistatic selection. Evolution. 1997;51:1027–1035. doi: 10.1111/j.1558-5646.1997.tb03949.x.
- Gavrilets S. Perspective: Models of speciation: What have we learned in 40 years? Evolution. 2003;57:2197–2215. doi: 10.1111/j.0014-3820.2003.tb00233.x.
- Gavrilets S. Fitness landscapes and the origin of species. Princeton University Press; 2004.
- Good JM, Dean MD, Nachman MW. A complex genetic basis to X-linked hybrid male sterility between two species of house mice. Genetics. 2008;179:2213–2228. doi: 10.1534/genetics.107.085340.
- Gourbiere S, Mallet J. Are species real? The shape of the species boundary with exponential failure, reinforcement, and the “missing snowball”. Evolution. 2010;64:1–24. doi: 10.1111/j.1558-5646.2009.00844.x.
- Greig D, Borts RH, Louis EJ, Travisano M. Epistasis and hybrid sterility in Saccharomyces. Proceedings of the Royal Society B: Biological Sciences. 2002;269:1167–1171. doi: 10.1098/rspb.2002.1989.
- Haldane JBS. Sex ratio and unisexual sterility in hybrid animals. Journal of Genetics. 1922;12.
- Kirkpatrick M, Barton N. Chromosome inversions, local adaptation and speciation. Genetics. 2006;173:419–434. doi: 10.1534/genetics.105.047985.
- Kondrashov AS, Sunyaev S, Kondrashov FA. Dobzhansky-Muller incompatibilities in protein evolution. Proc. Natl. Acad. Sci. U. S. A. 2002;99:14878–14883. doi: 10.1073/pnas.232565499.
- Lee H-Y, Chou J-Y, Cheong L, Chang N-H, Yang S-Y, Leu J-Y. Incompatibility of nuclear and mitochondrial genomes causes hybrid sterility between two yeast species. Cell. 2008;135:1065–1073. doi: 10.1016/j.cell.2008.10.047.
- Li RH, Lyons MA, Wittenburg H, Paigen B, Churchill GA. Combining data from multiple inbred line crosses improves the power and resolution of quantitative trait loci mapping. Genetics. 2005;169:1699–1709. doi: 10.1534/genetics.104.033993.
- Lijtmaer DA, Mahler B, Tubaro PL. Hybridization and postzygotic isolation patterns in pigeons and doves. Evolution. 2003;57:1411–1418. doi: 10.1111/j.0014-3820.2003.tb00348.x.
- Liti G, Barton DBH, Louis EJ. Sequence diversity, reproductive isolation and species concepts in Saccharomyces. Genetics. 2006;174:839–850. doi: 10.1534/genetics.106.062166.
- Livingstone K, Olofsson P, Cochran G, Dagilis A, MacPherson K, Seitz KA. A stochastic model for the development of Bateson-Dobzhansky-Muller incompatibilities that incorporates protein interaction networks. Math. Biosci. 2012;238:49–53. doi: 10.1016/j.mbs.2012.03.006.
- Maddison WP. Gene trees in species trees. Syst. Biol. 1997;46:523–536.
- Masly JP, Jones CD, Noor MAF, Locke J, Orr HA. Gene transposition as a cause of hybrid sterility in Drosophila. Science. 2006;313:1448–1450. doi: 10.1126/science.1128721.
- Matute DR, Butler IA, Turissini DA, Coyne JA. A test of the snowball theory for the rate of evolution of hybrid incompatibilities. Science. 2010;329:1518–1521. doi: 10.1126/science.1193440.
- Mendelson TC, Inouye BD, Rausher MD. Quantifying patterns in the evolution of reproductive isolation. Evolution. 2004;58:1424–1433. doi: 10.1111/j.0014-3820.2004.tb01724.x.
- Mihola O, Trachtulec Z, Vlcek C, Schimenti JC, Forejt J. A mouse speciation gene encodes a meiotic histone H3 methyltransferase. Science. 2009;323:373–375. doi: 10.1126/science.1163601.
- Moyle LC, Nakazato T. Comparative genetics of hybrid incompatibility: sterility in two Solanum species crosses. Genetics. 2008;179:1437–1453. doi: 10.1534/genetics.107.083618.
- Moyle LC, Nakazato T. Hybrid incompatibility “snowballs” between Solanum species. Science. 2010;329:1521–1523. doi: 10.1126/science.1193063.
- Moyle LC, Payseur BA. Reproductive isolation grows on trees. Trends Ecol. Evol. 2009;24:591–598. doi: 10.1016/j.tree.2009.05.010.
- Muller HJ. Bearing of the Drosophila work on systematics. The new systematics. 1940:185–268.
- Muller HJ. Isolating mechanisms, evolution and temperature. Biol. Symp. 1942:71–125.
- Orr HA. The population genetics of speciation: the evolution of hybrid incompatibilities. Genetics. 1995;139:1805. doi: 10.1093/genetics/139.4.1805.
- Orr HA. Dobzhansky, Bateson, and the genetics of speciation. Genetics. 1996;144:1331–1335. doi: 10.1093/genetics/144.4.1331.
- Orr HA, Irving S. Complex epistasis and the genetic basis of hybrid sterility in the Drosophila pseudoobscura Bogota-USA hybridization. Genetics. 2001;158:1089–1100. doi: 10.1093/genetics/158.3.1089.
- Orr HA, Turelli M. The evolution of postzygotic isolation: accumulating Dobzhansky-Muller incompatibilities. Evolution. 2001;55:1085–1094. doi: 10.1111/j.0014-3820.2001.tb00628.x.
- Palmer ME, Feldman MW. Dynamics of hybrid incompatibility in gene networks in a constant environment. Evolution. 2009;63:418–431. doi: 10.1111/j.1558-5646.2008.00577.x.
- Palopoli M, Wu C. Genetics of hybrid male sterility between Drosophila sibling species – A complex web of epistasis is revealed in interspecific studies. Genetics. 1994;138:329–341. doi: 10.1093/genetics/138.2.329.
- Perez D, Wu C. Further characterization of the Odysseus locus in hybrid sterility in Drosophila – One gene is not enough. Genetics. 1995;140:201–206. doi: 10.1093/genetics/140.1.201.
- Phadnis N. Genetic architecture of male sterility and segregation distortion in Drosophila pseudoobscura Bogota-USA hybrids. Genetics. 2011;189:1001–1009. doi: 10.1534/genetics.111.132324.
- Phadnis N, Orr HA. A single gene causes both male sterility and segregation distortion in Drosophila hybrids. Science. 2009;323:376–379. doi: 10.1126/science.1163934.
- Pollard DA, Iyer VN, Moses AM, Eisen MB. Widespread discordance of gene trees with species tree in Drosophila: Evidence for incomplete lineage sorting. PLoS Genet. 2006;2:1634–1647. doi: 10.1371/journal.pgen.0020173.
- Presgraves DC. Patterns of postzygotic isolation in Lepidoptera. Evolution. 2002;56:1168–1183. doi: 10.1111/j.0014-3820.2002.tb01430.x.
- Presgraves DC. A fine-scale genetic analysis of hybrid incompatibilities in Drosophila. Genetics. 2003;163:955–972. doi: 10.1093/genetics/163.3.955.
- Presgraves DC, Balagopalan L, Abmayr SM, Orr HA. Adaptive evolution drives divergence of a hybrid inviability gene between two species of Drosophila. Nature. 2003;423:715–719. doi: 10.1038/nature01679.
- Presgraves DC, Stephan W. Pervasive adaptive evolution among interactors of the Drosophila hybrid inviability gene, Nup96. Mol. Biol. Evol. 2007;24:306–314. doi: 10.1093/molbev/msl157.
- Price TD, Bouvier MM. The evolution of F1 postzygotic incompatibilities in birds. Evolution. 2002;56:2083–2089.
- Sasa MM, Chippindale PT, Johnson NA. Patterns of postzygotic isolation in frogs. Evolution. 1998;52:1811–1820. doi: 10.1111/j.1558-5646.1998.tb02258.x.
- Sawamura K, Yamamoto M-T. Characterization of a reproductive isolation gene, zygotic hybrid rescue, of Drosophila melanogaster by using minichromosomes. Heredity. 1997;79:97–103.
- Schlosser G, Wagner GP. A simple model of co-evolutionary dynamics caused by epistatic selection. J. Theor. Biol. 2008;250:48–65. doi: 10.1016/j.jtbi.2007.08.033.
- Slotman M, della Torre A, Powell JR. The genetics of inviability and male sterility in hybrids between Anopheles gambiae and An. arabiensis. Genetics. 2004;167:275–287. doi: 10.1534/genetics.167.1.275.
- Tang S, Presgraves DC. Evolution of the Drosophila nuclear pore complex results in multiple hybrid incompatibilities. Science. 2009;323:779–782. doi: 10.1126/science.1169123.
- Ting C-T, Tsaur S-C, Wu M-L, Wu C-I. A rapidly evolving homeobox at the site of a hybrid sterility gene. Science. 1998;282:1501–1504. doi: 10.1126/science.282.5393.1501.
- True JR, Weir BS, Laurie CC. A genome-wide survey of hybrid incompatibility factors by the introgression of marked segments of Drosophila mauritiana chromosomes into Drosophila simulans. Genetics. 1996;142:819–837. doi: 10.1093/genetics/142.3.819.
- Turelli M, Barton NH, Coyne JA. Theory and speciation. Trends Ecol. Evol. 2001;16:330–343. doi: 10.1016/s0169-5347(01)02177-2.
- Turelli M, Moyle LC. Asymmetric postmating isolation: Darwin’s corollary to Haldane’s rule. Genetics. 2007;176:1059–1088. doi: 10.1534/genetics.106.065979.
- Turelli M, Orr HA. Dominance, epistasis and the genetics of postzygotic isolation. Genetics. 2000;154:1663–1679. doi: 10.1093/genetics/154.4.1663.
- Unckless RL, Orr HA. Dobzhansky-Muller incompatibilities and adaptation to a shared environment. Heredity. 2009;102:214–217. doi: 10.1038/hdy.2008.129.
- Walsh JB. Rate of accumulation of reproductive isolation by chromosome rearrangements. The American Naturalist. 1982;120:510–532.
- Welch JJ. Accumulating Dobzhansky-Muller incompatibilities: reconciling theory and data. Evolution. 2004;58:1145–1156. doi: 10.1111/j.0014-3820.2004.tb01695.x.
- White MA, Ane C, Dewey CN, Larget BR, Payseur BA. Fine-scale phylogenetic discordance across the house mouse genome. PLoS Genet. 2009;5:e1000729. doi: 10.1371/journal.pgen.1000729.
- White MA, Steffy B, Wiltshire T, Payseur BA. Genetic dissection of a key reproductive barrier between nascent species of house mice. Genetics. 2011;189:289–304. doi: 10.1534/genetics.111.129171.
- Wolf JBW, Lindell J, Backstrom N. Speciation genetics: current status and evolving approaches. Philos. Trans. R. Soc. B-Biol. Sci. 2010;365:1717–1733. doi: 10.1098/rstb.2010.0023.