Systematic Biology. 2017 Mar 27;66(6):950–963. doi: 10.1093/sysbio/syx028

Inference of Evolutionary Jumps in Large Phylogenies using Lévy Processes

Pablo Duchen 1, Christoph Leuenberger 1,2, Sándor M Szilágyi 3,4,5, Luke Harmon 6, Jonathan Eastman 6, Manuel Schweizer 7, Daniel Wegmann 1,*
PMCID: PMC5790141  PMID: 28204787

Abstract

Although it is now widely accepted that the rate of phenotypic evolution may not necessarily be constant across large phylogenies, the frequency and phylogenetic position of periods of rapid evolution remain unclear. In his highly influential view of evolution, G. G. Simpson supposed that such evolutionary jumps occur when organisms transition into so-called new adaptive zones, for instance after dispersal into a new geographic area, after rapid climatic changes, or following the appearance of an evolutionary novelty. Only recently have large, accurate, and well-calibrated phylogenies become available that allow testing this hypothesis directly, yet inferring evolutionary jumps remains computationally very challenging. Here, we develop a computationally highly efficient algorithm to accurately infer the rate and strength of evolutionary jumps as well as their phylogenetic location. Following previous work we model evolutionary jumps as a compound process, but introduce a novel approach to sample jump configurations that does not require matrix inversions and thus naturally scales to large trees. We then make use of this development to infer evolutionary jumps in Anolis lizards and Loriini parrots, where we find a strong signal for such jumps at the base of clades that transitioned into new adaptive zones, just as postulated by Simpson’s hypothesis. [Evolutionary jump; Lévy process; phenotypic evolution; punctuated equilibrium; quantitative traits.]


A key goal of evolutionary biology is to understand the mechanisms by which the phenotypic diversity seen today evolved. Our understanding of these mechanisms is improving rapidly with the advent of increasingly powerful sequencing approaches. For instance, the accumulation of molecular data has led to the resolution of phylogenetic trees encompassing entire orders. Further, methods to reliably identify substitutions that likely resulted from selection, and to accurately place them on a phylogeny have been developed. In contrast, methods to infer events of rapid evolution from phenotypic data have lagged and are mostly restricted to inferring independent evolutionary rates for different clades.

In general, quantitative studies of the evolution of phenotypic/quantitative traits date back just a few decades. A first attempt was by Edwards et al. (1964) and Cavalli-Sforza and Edwards (1967), who modeled quantitative traits stochastically as “Brownian motion” (BM). However, given the current wealth of molecular data available, a more realistic goal is to only aim at inferring the rates at which quantitative traits evolve, while assuming the underlying phylogeny to be known. This has been successfully done using a BM model in multiple taxa. Freckleton et al. (2002), for instance, used a BM model on a given phylogeny to test if traits showed phylogenetic signal. More recently, Brawand et al. (2011) modeled gene expression evolution as BM and rejected evolution at a constant rate for several genes.

Several extensions to a basic BM model have been proposed. Butler and King (2004) were the first to implement Ornstein–Uhlenbeck (OU) processes with multiple evolutionary optima, as initially described by Hansen (1997), and recently used to describe the evolution of gene expression (e.g., Bedford and Hartl 2009; Rohlfs et al. 2013). Other extensions to BM allow evolutionary rates to change over time. O’Meara et al. (2006), for instance, contrasted maximum likelihood (ML) estimates of evolutionary rates under BM and showed that major clades of angiosperms vastly differ in their rate of genome size evolution. More recently, Eastman et al. (2011) developed a Bayesian method to jointly infer evolutionary rates in different clades and found evidence for multiple rate shifts in body size evolution in emydid turtles. Shortly after, Slater et al. (2012) introduced an extension to incompletely sampled phylogenies and trait data using Approximate Bayesian Computation. However, they found no evidence for an elevated rate of body size evolution in pinnipeds in comparison to terrestrial carnivores, despite considerable power. This suggests that the larger body size found in pinnipeds may be the result of rapid evolutionary changes early in the clade, rather than a change in the rate itself, and hence that models of occasional “evolutionary jumps” may often more accurately explain the evolution of quantitative traits.

According to Simpson (1944), such evolutionary jumps are triggered by shifts of lineages into different adaptive zones, either by dispersal into new geographic areas, the appearance of evolutionary novelties, key innovations, the extinction of lineages leaving niches empty, or by rapid changes in the environment (climatic or ecological). Additionally, the existence of “ecological opportunities” (Losos 2010) might also trigger such jumps. While OU processes have been proposed to model the dynamics of adaptive landscapes (e.g., Ingram and Mahler 2013; Uyeda and Harmon 2014), a promising alternative is to model this type of evolution as a compound process (or Lévy process) consisting of a continuous background process and a discrete jump process. The first implementation of such a model assumed that jumps only occurred at speciation events (Bokma 2008), but Landis et al. (2013) recently described Lévy processes in a much more general way and showed that while the likelihood functions of most of these models are intractable, inference is possible under a Bayesian framework. For instance, when modeling the evolution of quantitative traits as a Poisson compound process, in which traits are assumed to evolve under BM with occasional jumps that occur as a Poisson process on the tree, the likelihood can be calculated analytically when conditioning on a jump configuration (a placement of jumps on the tree). Under the assumption that jump effects are normally distributed, a jump configuration can be seen as simply stretching the branches of the tree on which they occur, and the likelihood is then given by a multivariate normal distribution with the variance–covariance matrix resulting from the stretched tree. The numerical integration is then limited to sampling jump configurations, which is readily done using Markov Chain Monte Carlo (MCMC).

Unfortunately, two computational challenges prohibit the application of this approach to larger trees. First, the space of jump configurations grows exponentially with tree size, leading to very long MCMC chains. Second, the evaluation of the likelihood requires the computation of the inverse of the variance–covariance matrix, which is computationally very demanding since its cost grows rapidly (cubically for standard inversion algorithms) with tree size (Tung Ho and Ané 2014). Here, we address these computational issues using an empirical Bayes approach in which we first infer the hierarchical parameters of the Brownian and Poisson processes using ML, and then fix those when inferring posterior probabilities on jump locations. This approach allows us to run MCMC chains with fixed hierarchical parameters, for which we find a computationally highly efficient approach that does not require matrix inversions. As a result, this approach readily scales up to very large phylogenies.

We then demonstrate the power and accuracy of our approach with extensive simulations and find that our approach hardly misses any jumps with a meaningful strength. We then illustrate the usefulness of our approach by identifying evolutionary jumps in Anolis lizards and Loriini parrots, two well-studied groups for which morphological data is available. We identify few but important evolutionary jumps in both groups, suggesting such periods of rapid evolutionary change to be rare but crucial in shaping the morphological diversity observed today.

Theory

The Null Hypothesis: Brownian Motion

We first consider a Brownian motion (BM) process on a rooted phylogenetic tree $\mathcal{T}$, where time is measured in the unit of the branch lengths. The process starts at the root with value $\mu$ (root state) and then proceeds with variance $s_0^2$ along the branches. The values of the BM process, as observed at the $L$ leaves, give rise to the random vector

$$x = (x_1, \ldots, x_L).$$

Let us fix the notation: The lengths of the (inner and outer) branches of $\mathcal{T}$ are called $\tau_1, \ldots, \tau_B$, where $B$ is the number of branches. For two leaves $i, j$ we denote by $\tau_{ij}$ the length of their common branch in $\mathcal{T}$ as measured from the root. Now, under the assumption of a pure BM, and defining $T_0 = (\tau_{ij})$, the values $x$ at the leaves have the multivariate normal distribution

$$x \sim \mathcal{N}(\mu \mathbf{1}, s_0^2 T_0)$$

or written more conveniently:

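$$x = \mu \mathbf{1} + \epsilon \tag{1}$$

with $\epsilon \sim \mathcal{N}(0, s_0^2 T_0)$. Since $T_0$ is positive definite and symmetric, it has a symmetric and positive definite square root $T_0^{1/2}$, i.e., $T_0 = T_0^{1/2} T_0^{1/2}$. Multiplying both sides of (1) with $T_0^{-1/2}$ we get the homoskedastic model

$$x_0 = \mu v_0 + \epsilon_0,$$

where $x_0 = T_0^{-1/2} x$, $v_0 = T_0^{-1/2} \mathbf{1}$, and $\epsilon_0 = T_0^{-1/2} \epsilon \sim \mathcal{N}(0, s_0^2 I)$. For this we have the usual OLS estimators (see e.g., Davidson and MacKinnon 2004, ch. 3.2)

$$\hat\mu = (v_0' v_0)^{-1} v_0' x_0 = \frac{\mathbf{1}' T_0^{-1} x}{\mathbf{1}' T_0^{-1} \mathbf{1}}$$

and

$$\hat s_0^2 = \frac{1}{L-1}\, x_0' \left(I - v_0 (v_0' v_0)^{-1} v_0'\right) x_0 = \frac{1}{L-1} \left( x' T_0^{-1} x - \frac{(\mathbf{1}' T_0^{-1} x)^2}{\mathbf{1}' T_0^{-1} \mathbf{1}} \right).$$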
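The two estimators above can be evaluated without forming a matrix square root, since only $T_0^{-1} x$ and $T_0^{-1} \mathbf{1}$ are needed. The following is a minimal sketch in plain Python, not the authors' implementation; the helper names (`solve`, `bm_estimates`) and the three-leaf example tree are hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve a @ y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def bm_estimates(t0, x):
    """Return (mu_hat, s0sq_hat) for leaf values x and shared-path matrix t0."""
    n_leaves = len(x)
    tinv_x = solve(t0, x)                  # T0^{-1} x
    tinv_1 = solve(t0, [1.0] * n_leaves)   # T0^{-1} 1
    one_x = sum(tinv_x)                    # 1' T0^{-1} x
    one_one = sum(tinv_1)                  # 1' T0^{-1} 1
    x_x = sum(a * b for a, b in zip(x, tinv_x))  # x' T0^{-1} x
    mu_hat = one_x / one_one
    s0sq_hat = (x_x - one_x ** 2 / one_one) / (n_leaves - 1)
    return mu_hat, s0sq_hat


# Hypothetical tree ((A,B),C): every leaf is at distance 1 from the root,
# and A and B share a path of length 0.6.
T0 = [[1.0, 0.6, 0.0],
      [0.6, 1.0, 0.0],
      [0.0, 0.0, 1.0]]
x = [1.2, 0.8, -0.5]
mu_hat, s0sq_hat = bm_estimates(T0, x)
```

For a star tree ($T_0 = I$) these expressions reduce to the ordinary sample mean and sample variance, which is a convenient sanity check.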

Lévy Process

We now extend the BM model by super-imposing an independent Poissonian jump process with rate $\lambda$. The jumps shall be normally distributed with zero mean and variance $s_1^2$. The (unobservable) random vector

$$\nu = (\nu_1, \ldots, \nu_B)$$

counts the number of Poisson events (jumps) on each of the $B$ branches. By assumption,

$$P(\nu_b = n_b) = e^{-\lambda \tau_b} \frac{(\lambda \tau_b)^{n_b}}{n_b!}, \quad n_b = 0, 1, 2, \ldots$$

For a multi-index $n = (n_1, \ldots, n_B)$, we have

$$P(\nu = n) = \prod_{b=1}^{B} e^{-\lambda \tau_b} \frac{(\lambda \tau_b)^{n_b}}{n_b!}. \tag{2}$$

Recall that for two leaves $i, j$ we denote by $\tau_{ij}$ the length of their common branch in $\mathcal{T}$ as measured from the root. In particular, $\tau_{ii}$ is the distance (sum of branch lengths) of the leaf $i$ from the root.

We denote by $n_{ij}$ for two leaves $i, j$ the number of Poisson events along the common branch of length $\tau_{ij}$. Conditional on $\nu = n$, the random vector $x$ is multivariate normal with mean $\mu \mathbf{1}$ and the $L \times L$ variance–covariance matrix $\Sigma(n) = (\sigma_{ij}(n))$ where

$$\sigma_{ij}(n) = \tau_{ij} s_0^2 + n_{ij} s_1^2, \quad 1 \le i, j \le L.$$
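To make the construction of $\Sigma(n)$ concrete, the sketch below builds it from a list of branches. It is an illustration under assumptions of ours, not the authors' code: each branch is represented as a hypothetical `(length, set_of_descendant_leaves)` pair, so that $\tau_{ij}$ and $n_{ij}$ are sums over the branches ancestral to both leaves.

```python
def shared_sum(branches, values, n_leaves):
    """Matrix whose (i, j) entry sums values[b] over branches ancestral to both i and j."""
    m = [[0.0] * n_leaves for _ in range(n_leaves)]
    for (_, leaves), v in zip(branches, values):
        for i in leaves:
            for j in leaves:
                m[i][j] += v
    return m


def levy_covariance(branches, jumps, s0sq, s1sq, n_leaves):
    """Sigma(n) with entries tau_ij * s0^2 + n_ij * s1^2."""
    tau = shared_sum(branches, [length for length, _ in branches], n_leaves)
    nij = shared_sum(branches, jumps, n_leaves)
    return [[tau[i][j] * s0sq + nij[i][j] * s1sq for j in range(n_leaves)]
            for i in range(n_leaves)]


# Hypothetical tree ((A,B),C) with leaves indexed 0, 1, 2.
branches = [
    (0.6, {0, 1}),  # internal branch leading to the ancestor of A and B
    (0.4, {0}),     # terminal branch of A
    (0.4, {1}),     # terminal branch of B
    (1.0, {2}),     # terminal branch of C
]
jumps = [1, 0, 0, 0]  # one jump on the internal branch
sigma = levy_covariance(branches, jumps, s0sq=1.0, s1sq=4.0, n_leaves=3)
```

In this example the single jump raises the variances of A and B and their covariance by $s_1^2 = 4$, exactly as if the internal branch had been stretched by $s_1^2 / s_0^2$ time units, while leaf C is unaffected.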

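The conditional density of $x$ given $\nu = n$ is

$$\phi(x \mid n) = \frac{1}{\sqrt{(2\pi)^L \det \Sigma(n)}} \exp\left( -\frac{1}{2} (x - \mu \mathbf{1})' \Sigma^{-1}(n) (x - \mu \mathbf{1}) \right). \tag{3}$$

The likelihood of $x$ given the four parameters $\mu$ (root state), $s_0^2$ (Brownian motion) and $\lambda, s_1^2$ (Poissonian jump process) is the mixture distribution

$$f(x \mid \mu, s_0^2, \lambda, s_1^2) = \sum_{n_1 = 0}^{\infty} \cdots \sum_{n_B = 0}^{\infty} P(\nu = n)\, \phi(x \mid n), \tag{4}$$

where we used expressions (2) and (3). It is not hard to show that

$$\mathrm{E}(x_i) = \mu \quad \text{and} \quad \operatorname{Cov}(x_i, x_j) = \tau_{ij} (s_0^2 + \lambda s_1^2).$$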
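One way to see the covariance result, sketched here under the model assumptions stated above, is the law of total covariance. Write $\nu_{ij}$ for the (random) number of jumps on the path shared by leaves $i$ and $j$; as a sum of independent Poisson counts it is Poisson with mean $\lambda \tau_{ij}$. Since the conditional mean $\mu \mathbf{1}$ does not depend on $\nu$, the second term vanishes:

```latex
\begin{align*}
\operatorname{Cov}(x_i, x_j)
  &= \mathrm{E}\big[\operatorname{Cov}(x_i, x_j \mid \nu)\big]
   + \operatorname{Cov}\big(\mathrm{E}[x_i \mid \nu],\, \mathrm{E}[x_j \mid \nu]\big) \\
  &= \mathrm{E}\big[\tau_{ij} s_0^2 + \nu_{ij} s_1^2\big] + 0 \\
  &= \tau_{ij} s_0^2 + \lambda \tau_{ij} s_1^2
   = \tau_{ij} \left(s_0^2 + \lambda s_1^2\right).
\end{align*}
```

The first two moments of the Lévy model therefore coincide with those of a pure BM with rate $s_0^2 + \lambda s_1^2$; the two models differ only in higher moments.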

Inference under the Lévy Process

Here we develop a computationally efficient approach to maximize the likelihood function given in equation (4). Although the infinite sums in (4) prohibit an analytical solution, they are readily evaluated using numerical approaches. Landis et al. (2013), for instance, proposed to use an MCMC approach to integrate over jump configurations. Unfortunately, however, such a solution does not scale to large trees, because the calculation of the conditional density values in (3) involves the computation of the inverse of $\Sigma(n)$ and its determinant, which are computationally very demanding.

We propose to address this problem by introducing an algorithm to calculate these quantities efficiently under this model. Specifically, and as we show in Appendix 1, both the inverse and determinant can be determined cheaply from a previous solution to a case that differs in the presence of a single jump on the tree. Although this algorithm can readily be incorporated into the MCMC approach proposed by Landis et al. (2013), we propose an alternative hierarchical Bayes approach that makes even better use of it and leads to a computationally highly efficient inference approach to obtain point estimates of the parameters $\mu$, $s_0^2$, $\lambda$, and $s_1^2$, as well as posterior probabilities on the location of evolutionary jumps, as we describe in the following.

Monte Carlo EM algorithm.

We obtain ML estimates of the parameters $\mu$, $s_0^2$, and $\lambda$ by means of a Monte Carlo version of the classical Expectation Maximization (EM) algorithm, in which we treat the random variable $\nu$ as missing (unobserved) data. Although this approach does not allow us to find the ML estimate of the factor $\alpha = s_1^2 / s_0^2$, we discuss below how this can be achieved using a simple grid search.

Recall that each iteration of the EM algorithm consists of an expectation (E) and a maximization (M) step. Let us denote the old parameters determined in the previous M-step by $\tilde\theta = (\tilde\mu, \tilde s_0^2, \tilde\lambda, \alpha_0)$, and the new parameters with respect to which the $Q$-function has to be maximized in the next M-step by $\theta = (\mu, s_0^2, \lambda, \alpha_0)$, where $\alpha_0$ is a fixed value for $\alpha$. The two steps of the EM algorithm are then as follows:

Monte Carlo E-step. In this step, we stochastically simulate $K$ vectors $n_1, \ldots, n_K$ according to the multi-Poisson distribution $P(\nu = n \mid \tilde\lambda)$. For this we use an MCMC scheme that fully exploits the fast computation of inverses discussed above (see Appendix 2 for details).

Determine the weights

$$\pi_k = \phi(x \mid n_k, \tilde\mu, \tilde s_0^2, \alpha_0) = c_k \exp\left( -\frac{1}{2 \tilde s_0^2} (x - \tilde\mu \mathbf{1})' T^{-1}(n_k, \alpha_0) (x - \tilde\mu \mathbf{1}) \right),$$

with

$$c_k = (2\pi \tilde s_0^2)^{-L/2} \left(\det T(n_k, \alpha_0)\right)^{-1/2},$$

where $T(n, \alpha) = (\tau_{ij} + \alpha n_{ij})$, so that $\Sigma(n) = s_0^2\, T(n, \alpha)$ in (3).

In the M-step we have to maximize the function

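$$Q(\theta \mid \tilde\theta) = \mathrm{E}\left[\log P(x, n \mid \theta) \mid x, \tilde\theta\right] = \sum_{n} P(n \mid x, \tilde\theta) \log P(x, n \mid \theta) \tag{5}$$

with respect to the parameters $\theta$ where

$$P(x, n \mid \theta) = \phi(x \mid n, \mu, s_0^2, \alpha_0)\, P(\nu = n \mid \lambda).$$

From Bayes’ theorem we have

$$P(n \mid x, \tilde\theta) = \frac{\phi(x \mid n, \tilde\mu, \tilde s_0^2, \alpha_0)\, P(\nu = n \mid \tilde\lambda)}{P(x \mid \tilde\theta)}. \tag{6}$$

Thus, according to our Monte Carlo scheme and up to a factor that does not depend on $\theta$, the infinite sum in (5) can be approximated by

$$Q(\theta) \approx \sum_{k=1}^{K} \pi_k \log\left[\phi(x \mid n_k)\, P(\nu = n_k)\right], \tag{7}$$

where $P(\nu = n_k)$ and $\phi(x \mid n_k)$ are given by (2) and (3), respectively.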

M-step. In this step, we seek the parameters $\tilde\theta$ which maximize the sum in (7) and which will serve as “old” parameters in the next E-step. We have

$$P(\nu = n_k) = e^{-\lambda \tau} \lambda^{|n_k|} c(n_k),$$

where $\tau = \sum_b \tau_b$ is the total length of the tree $\mathcal{T}$, $|n_k|$ denotes the sum of the components of $n_k$, and $c(n_k)$ is a factor that does not depend on any of the parameters $\theta$. From this it is easy to see that

$$\tilde\lambda = \frac{\sum_k \pi_k |n_k|}{\tau \sum_k \pi_k}, \tag{8}$$

independently of the values of the other three parameters. Since we assume the value of $\alpha$ to be fixed, we can also give explicit expressions for the values of $\mu$ and $s_0^2$ which maximize $Q$. First, determine the matrix

$$S = \sum_{k=1}^{K} \pi_k T^{-1}(n_k, \alpha_0). \tag{9}$$

Standard calculus shows that

$$\tilde\mu = \frac{\mathbf{1}' S x}{\mathbf{1}' S \mathbf{1}} \tag{10}$$

and

$$\tilde s_0^2 = \frac{1}{L \sum_k \pi_k} (x - \tilde\mu \mathbf{1})' S (x - \tilde\mu \mathbf{1}) = \frac{1}{L \sum_k \pi_k} \left( x' S x - \frac{(\mathbf{1}' S x)^2}{\mathbf{1}' S \mathbf{1}} \right). \tag{11}$$
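The closed-form updates (8) to (11) can be sketched as a single function. This is an illustration only, not the authors' C++ implementation: the function name `m_step` and its arguments are hypothetical, and the inverse matrices $T^{-1}(n_k, \alpha_0)$ are assumed to have been computed already (the paper obtains them through the fast updates of Appendix 1).

```python
def m_step(weights, jump_vectors, t_inverses, x, tree_length):
    """Return (lambda, mu, s0^2) maximizing the weighted sum in (7).

    weights      -- pi_k, one per sampled jump vector
    jump_vectors -- n_k, each a list of per-branch jump counts
    t_inverses   -- T^{-1}(n_k, alpha_0), each an L x L list of lists (assumed given)
    x            -- trait values at the L leaves
    tree_length  -- tau, the sum of all branch lengths
    """
    n_leaves = len(x)
    w_sum = sum(weights)

    # Equation (8): weighted mean number of jumps per unit of tree length.
    lam = sum(w * sum(n) for w, n in zip(weights, jump_vectors)) / (tree_length * w_sum)

    # Equation (9): weighted sum of the inverse matrices.
    s = [[sum(w * t[i][j] for w, t in zip(weights, t_inverses))
          for j in range(n_leaves)] for i in range(n_leaves)]

    one_s_x = sum(s[i][j] * x[j] for i in range(n_leaves) for j in range(n_leaves))
    one_s_one = sum(s[i][j] for i in range(n_leaves) for j in range(n_leaves))
    x_s_x = sum(x[i] * s[i][j] * x[j] for i in range(n_leaves) for j in range(n_leaves))

    mu = one_s_x / one_s_one                                        # equation (10)
    s0sq = (x_s_x - one_s_x ** 2 / one_s_one) / (n_leaves * w_sum)  # equation (11)
    return lam, mu, s0sq


# Toy input: two sampled configurations on a hypothetical two-leaf, two-branch tree.
identity = [[1.0, 0.0], [0.0, 1.0]]
lam, mu, s0sq = m_step(
    weights=[1.0, 3.0],
    jump_vectors=[[1, 0], [0, 0]],
    t_inverses=[identity, identity],
    x=[1.0, 3.0],
    tree_length=2.0,
)
```

Because the weights $\pi_k$ are density values, they can become extremely small for larger trees; rescaling them by a common factor leaves all three updates unchanged, since each is a ratio in the weights.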

We note that the EM algorithm can be implemented without the Monte Carlo part if we impose a condition $|n| \le N$ on the likelihood, i.e., if we suppose a priori that there have been only $N$ or fewer Poisson events on the tree $\mathcal{T}$. In that case, the sum in (5) is over all $n$ such that $|n| \le N$ (see Appendix 3).

Estimating factor $\alpha$.

The Monte Carlo EM algorithm proposed above, while computationally highly efficient, does not allow for the estimation of the factor $\alpha$. We thus use a numerical search to iteratively approach the ML estimate of $\alpha$. Specifically, we start at a value Inline graphic and then iteratively increase that value such that Inline graphic until the likelihood decreases. The algorithm then turns back by setting Inline graphic and proceeds again until the likelihood decreases. With every switch, the step size gets smaller and the estimate is guaranteed to get closer to the true MLE value, as we found the likelihood surface to have a single peak (Figure S1 available as Supplementary Material on Dryad at http://dx.doi.org/10.5061/dryad.170rb). In each step, we use the Monte Carlo EM algorithm described above to calculate the likelihood at the ML estimates of all other parameters conditioned on that $\alpha$ value. In all applications we set Inline graphic and the initial Inline graphic and found estimates to be accurate within five switches, corresponding to about 15 values tested.
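The turn-back search described above can be sketched as follows. The exact update rule, starting value, and step sizes used by the authors are not recoverable from this text, so everything specific here is an assumption: an additive step, a shrink factor of 0.5 at every switch, and a toy single-peaked function standing in for the (stochastic) profile likelihood returned by the Monte Carlo EM.

```python
def peak_search(loglik, start, step, shrink=0.5, max_switches=5, max_evals=1000):
    """Walk along alpha, reversing direction and shrinking the step after each drop.

    loglik is any callable returning the log-likelihood at a given alpha; in the
    paper this would be the Monte Carlo EM run at fixed alpha (hypothetical here).
    """
    current, current_ll = start, loglik(start)
    best, best_ll = current, current_ll
    direction, switches, evaluations = 1.0, 0, 1
    while switches < max_switches and evaluations < max_evals:
        candidate = current + direction * step
        candidate_ll = loglik(candidate)
        evaluations += 1
        if candidate_ll <= current_ll:
            # The likelihood stopped improving: turn back with a smaller step.
            direction = -direction
            step *= shrink
            switches += 1
        if candidate_ll > best_ll:
            best, best_ll = candidate, candidate_ll
        current, current_ll = candidate, candidate_ll
    return best, best_ll, evaluations


# Toy single-peaked surface with its maximum at alpha = 3 (an assumption for illustration).
best_alpha, best_ll, n_evals = peak_search(lambda a: -(a - 3.0) ** 2, start=0.0, step=1.0)
```

Such a search relies on the surface having a single peak, as the authors report; because each likelihood evaluation is itself a stochastic estimate, a real implementation may also need to guard against spurious reversals caused by Monte Carlo noise.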

Identifying jump locations.

To infer the location of jumps on a phylogenetic tree we implement an empirical Bayes approach. As is commonly done in such a setting, we assume the ML estimates $\hat\theta = (\hat\mu, \hat s_0^2, \hat\lambda, \hat\alpha)$ obtained using our Monte Carlo EM scheme are accurate and thus known constants when inferring jump locations. Under this assumption, the MCMC approach introduced above can also be used to sample configurations of jumps $n$ from the probability distribution $P(n \mid x, \hat\theta)$. This allows us to numerically infer for each branch $b$ the posterior probabilities of $\nu_b = 0$ and $\nu_b > 0$, and thus to identify branches for which there is convincing evidence for an evolutionary jump.
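Given a set of sampled jump configurations, the per-branch posterior probability of at least one jump is simply the fraction of samples with a nonzero count on that branch. A minimal sketch, assuming the samples are stored as a list of per-branch count vectors (the function name and toy data are hypothetical):

```python
def jump_posteriors(samples):
    """Fraction of sampled jump vectors with at least one jump, per branch."""
    n_samples = len(samples)
    n_branches = len(samples[0])
    return [sum(1 for n in samples if n[b] > 0) / n_samples
            for b in range(n_branches)]


# Four hypothetical MCMC samples on a tree with three branches.
samples = [
    [0, 1, 0],
    [0, 2, 0],
    [1, 0, 0],
    [0, 1, 0],
]
posteriors = jump_posteriors(samples)
```

The posterior probability of no jump on a branch is one minus the corresponding entry.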

Implementation.

—We implemented the algorithm introduced here in C++ and optimized the code for speed. A user-friendly program to apply it to data is available at our lab website (http://www.unifr.ch/biology/research/wegmann/).

Simulations

Convergence

Convergence of the MCMC.

We assessed the convergence of MCMC chains by comparing parameter estimates between two independent and parallel chain runs until 10,000 jump vectors $n$ were sampled. We ran a total of 100 such chain pairs for each of two different starting locations and discarded the first 100 such vectors as burn-in. We also compared two different values to thin the chains: either we sampled every 10th or every 5000th step. Here we define an MCMC step as one proposed update for each branch (hence one step consists of as many iterations as there are branches in the tree).

Regardless of the starting values, convergence was reached rather fast but with some variation across parameters (Figure S2 available as Supplementary Material on Dryad). The parameter to converge fastest was Inline graphic, for which the difference in estimates was below 0.01 within 2000 sampled jump vectors for 90% of all chain pairs. Similarly small differences for Inline graphic and Inline graphic were only reached after sampling about 4000 jump vectors (Figure S2 available as Supplementary Material on Dryad). Interestingly, a larger thinning did not improve convergence, suggesting that the variance in estimates is dominated by variation in the jump vectors sampled, but not by autocorrelation along the chain. This was further confirmed when we repeated the same experiment for much larger trees with 1000 leaves and sampling every second step to reflect the larger number of iterations per step, in which case convergence was observed at the same rate (Figure S3 available as Supplementary Material on Dryad). For subsequent analyses, we sampled a total of 5000 jump vectors and used a thinning of 10 and two for trees with 100 and 1000 leaves, respectively.

We next assessed the convergence of the MCMC for the inference of jumps on trees by assessing the difference in posterior probabilities between independent chains for trees with 100 and 1000 leaves (Figures S4 and S5 available as Supplementary Material on Dryad). We again ran 100 chain pairs, fixed the thinning to 10 or two for trees with 100 or 1000 leaves, respectively, and discarded the first 100 jump vectors as burn-in. Although we found convergence to be reached within less than 2000 iterations for branches with very low (Inline graphic) and very high (Inline graphic) posterior probabilities, more iterations were required for branches with intermediate posterior probabilities. We found that sampling 5000 jump vectors gave very consistent results also for inferring the location of jumps.

Convergence of the EM for parameter inference.

To test if the stochastic EM algorithm converges with the MCMC settings determined above, we ran the EM for a wide range of parameter values for up to 100 iterations. Since the EM algorithm is stochastic, it does not converge onto a single value unless an infinitely large sample of $n$ vectors is used. We thus first inspected the resulting patterns visually and found that parameter estimates stabilized after only a few iterations, usually between 10 and 20, regardless of tree size (Figures S6 and S7 available as Supplementary Material on Dryad).

We then implemented two different measurements to assess convergence more formally: the first is a test statistic assessing the presence of a trend in the parameter estimates, and the second is quantifying the number of slope changes in the individual parameter updates (see Appendix 4).

Power to Reject Brownian Motion

To assess the power of our approach to identify Lévy processes and to estimate associated parameters, we ran our EM algorithm on data simulated with jumps on trees of 100 leaves, each simulated using a birth–death model (Stadler 2011) and scaled to a total length of 1. We generated 100 such simulations for many combinations of number of jumps and Inline graphic values but fixed Inline graphic and Inline graphic because changing these parameters does not affect the inference. We then inferred the ML estimates for all parameters under both the null model (Brownian motion) and under the alternative Lévy model.

Using either a likelihood ratio test (LRT) or the Akaike Information Criterion (AIC) resulted in generally substantial power to reject the null model over a large range of jumps simulated and for many different values of Inline graphic (Fig. 1). Unsurprisingly, power was much lower if simulated jumps were on the order of the change of the Brownian background process or lower. Here we simulated trees of length 1, and thus the average length of each of the Inline graphic branches on a tree with 100 leaves was roughly 0.005. Hence with Inline graphic, the strength of half of the evolutionary jumps is expected to be smaller than or equal to the effect of the background process on an average branch. However, with Inline graphic, the power to reject the null model was Inline graphic if multiple jumps were present on the tree.

Figure 1.

Power to reject the null model (Brownian motion) using a) a likelihood ratio test (LRT) or b) the Akaike information criterion (AIC), as a function of the number of simulated jumps Inline graphic and the jump strengths Inline graphic.

Interestingly, we also found our approach to regularly fail to reject the null model if the number of jumps was very large, i.e., on the order of the number of branches (50 jumps correspond to a jump on every 4th branch). In such situations, the large variance in traits observed under the Lévy model is also perfectly explained by a pure BM model with larger variance $s_0^2$ (see below). Surprisingly, we did not observe this effect for larger trees of 1,000 leaves at the same proportion of jumps (Figure S8 available as Supplementary Material on Dryad), suggesting that the fraction of branches with jumps required for the model to reduce to pure Brownian motion is larger for larger trees.

In summary, these results show that our method has considerable power to detect Lévy processes as long as jumps are meaningfully strong and there are not too many jumps, in which case the Lévy and BM models become indistinguishable from each other.

Accuracy in Inferring Lévy Parameters

For the cases in which the Lévy model was preferred we next evaluated the power of our approach to infer the associated parameters, starting with the jump strength Inline graphic and trees with 100 leaves. We found that our approach infers Inline graphic quite accurately over the whole range, but we observed a slight overestimation for lower Inline graphic values. This is a direct result of the low power to reject the Brownian model at these lower jump strengths: simulations that happened to result in larger jumps rejected the Brownian model more easily and are thus overrepresented here. But the inferred values for Inline graphic were rarely further from the true value than a factor of 2 if multiple jumps were present (Fig. 2a), whereas it was unsurprisingly much harder to accurately infer the jump strength in case of a single jump.

Figure 2.

Accuracy in inferring Lévy parameters. Each boxplot represents the distribution of inferred values across 100 replicates simulated as described in the text for different combinations of jump strengths Inline graphic and number of simulated jumps Inline graphic. a) Accuracy in inferring factor $\alpha$. The true $\alpha$ values used in the simulations are indicated with red dashed lines. b) Top row: distributions of inferred jump rate $\lambda$. Connected red open circles represent the true values. The numbers printed below the boxplots indicate the percentage of simulations for which the Brownian model was rejected and which are hence included here. Bottom row: distributions of inferred Brownian background rates $s_0^2$ for simulations in which the Brownian null model was rejected (black) or not rejected (blue).

We next evaluated the accuracy of our approach in inferring the jump rate $\lambda$, again limited to the simulations in which a Lévy model was preferred. As shown in Figure 2b, our method inferred this parameter very accurately over a large range of jumps simulated and for all values of Inline graphic, with generally higher accuracy with higher Inline graphic values.

We then finally evaluated the accuracy in inferring the Brownian background rate $s_0^2$ (Figure 2b) and found it to be very accurately inferred whenever the Brownian model was rejected. Interestingly, however, $s_0^2$ was overestimated whenever the Brownian model could not be rejected but jumps were simulated. This illustrates that under certain conditions a Lévy model is indistinguishable from a model of pure Brownian motion with an elevated rate. This is particularly true in the case of weak jumps (small Inline graphic) or if jumps are very common on the tree.

As expected given the larger amount of information, the hierarchical parameters are estimated much more accurately on larger trees. To illustrate this we repeated these analyses for trees with 1,000 leaves which resulted in much tighter confidence intervals for all parameters (Figure S9 available as Supplementary Material on Dryad).

Jump Location

We finally tested the power of our method to infer the location of jumps on the tree. For this we simulated trees with 100 leaves and trait data affected by 20 jumps randomly placed on each tree for different jump strengths Inline graphic while fixing Inline graphic. In each case, we then assumed the Lévy parameters to be known and used our MCMC approach to calculate the posterior probability on there being at least one jump for each branch.

We found our method to have a very low false positive rate in identifying jumps in that a posterior probability for jumps Inline graphic was never obtained for branches on which we did not simulate any jumps (Fig. 3a), and 90% of all such branches resulted in a posterior probability for jumps below 0.2 even for the weakest jump strengths simulated (Inline graphic).

Figure 3.

Power to detect individual jumps. a) Cumulative distribution of jump posterior probabilities on all branches with (solid lines) and without (dashed lines) simulated jumps. Notice that branches with jumps have posterior probabilities that accumulate at 1, whereas branches without jumps accumulate at 0. b) Distribution of the absolute strengths of individual jumps that were either detected (white) or not detected (gray). c) Same as (a) but for trees with 1000 leaves. d) Same as (b) but for trees with 1000 leaves.

The power to infer true jumps (true positives) was also high, especially for jumps of meaningful strength. For data simulated with Inline graphic, for instance, 90% of all branches on which jumps were simulated resulted in a posterior probability Inline graphic, and 75% even in a posterior probability Inline graphic. The few branches with jumps for which we did not obtain decisive posterior probabilities in favor of jumps all contained jumps that were weak (Fig. 3b). Such jumps are expected even for large Inline graphic values since individual jump strengths are assumed to be normally distributed around zero.

A similar pattern was observed when simulating data with smaller Inline graphic, but even in the case of Inline graphic we obtain posterior probabilities in favor of jumps Inline graphic for more than one third of the branches on which jumps were simulated (Fig. 3a). At such small Inline graphic values for a tree of length 1, about 40% of all jumps are expected to have a strength smaller than 10 times the effect of the Brownian process on the same branch. But we note that the difficulty in placing weak jumps did not affect the power to infer the jump rate Inline graphic, which was inferred quite accurately even at such low Inline graphic values (Fig. 2).

All these findings were confirmed with simulations conducted for trees with 1000 leaves (Fig. 3c and d) if scaling Inline graphic appropriately (since branches are 10 times smaller, the same power is obtained with 10 times smaller Inline graphic values).

Run Times

The simulations performed here illustrate the computational efficiency of our algorithm. For a tree with 100 leaves, a single iteration of the EM required about 10 seconds on a single core of a standard desktop computer. Given that the EM converged after about 15 iterations on average, the algorithm required about 3 minutes to find the MLE of all model parameters for a fixed Inline graphic value. For a tree with 1000 leaves, the EM also took on average 15 iterations to converge, but a single EM iteration required about eight minutes, resulting in a run time of 2 hours per Inline graphic value. The number of Inline graphic values to test directly translates into the estimation accuracy, but we found that very accurate estimates of Inline graphic were obtained with our peak-finder algorithm after only 15 values. The total run time was thus 45 minutes and 30 hours for a tree with 100 and 1000 leaves, respectively. The inference of jump locations then requires just a single run of the MCMC algorithm, and hence takes as long as a single EM iteration (10 seconds and 8 minutes, respectively).

We note that this almost quadratic increase in computational costs between a tree with 100 and 1000 leaves is expected given that the number of branches increases quadratically with the number of leaves. We can thus speculate that the inference of evolutionary jumps on a tree with 10000 leaves will require about 10 hours per EM iteration and thus about 6 days for a single Inline graphic value. To speed up the inference for trees this large we thus recommend not to use our peak-finder algorithm but rather to run a grid search over Inline graphic that is readily parallelized on a computer cluster. It might further be beneficial to obtain initial estimates from a subset of the tree to restrict this search range and to initialize the EM algorithm with already appropriate values. All features required for such runs are readily available in our implementation.

Applications

Quantum Evolution in Anoles

There have been a few direct tests of Simpsonian jumps between adaptive zones using empirical data (Uyeda et al. 2011). Here, we analyze “evolution by jumps” in the adaptive radiation of anoles, lizards that have adaptively radiated in the Caribbean and South America (Losos 2009). Following previous work, we focused on anoles on the four islands of the Greater Antilles, as they provide a unique opportunity for testing Simpson’s theory of adaptive zones for two reasons. First, there have been repeated dispersal events among islands in the Greater Antilles (Losos et al. 1998; Mahler et al. 2010). These dispersal events represent geographic opportunities, where anole lineages reach a new island and are no longer sympatric with the former set of competitors (Mahler et al. 2010). Second, most anole species can be classified into ecomorphs, habitat specialists that have evolved repeatedly on the four islands of the Greater Antilles (Losos et al. 1998). Transitions between ecomorph categories represent the evolution of key characters in anole lineages that allow them to invade novel habitats (see Losos (2009) for a review).

Anoles have thus repeatedly experienced two conditions under which Simpson expected evolutionary jumps to be observed: dispersal into new geographic areas and the appearance of evolutionary novelties. Importantly, both ecomorph origins and transitions among islands are replicated in the phylogeny of anoles, but are still rare enough that we can estimate the position of transitions on the phylogenetic tree with some confidence (Schluter 1995; Huelsenbeck et al. 2003).

With this background in mind, we tested if a model with evolutionary jumps fits the evolution of body size in anoles better than pure Brownian motion, and if jumps correspond with either of the two factors postulated by Simpson: evolution of key characters and/or geographic dispersal. To address this question, we made use of a recent time-calibrated phylogeny of 170 Anolis lizards (Thomas et al. 2009) and analyzed snout-to-vent length (SVL), a standard phenotypic measurement of body size in lizards. This trait is broadly correlated with habitat partitioning in Greater Antillean anoles and represents the primary axis of ecologically driven evolutionary divergence in lizards (Schoener 1970; Beuttell and Losos 1999; Losos 2009). We made use of the sex-specific data of SVL from Thomas et al. (2009) and inferred evolutionary parameters independently for females and males, but excluded five species that lacked information on SVL for one or both sexes (Anolis darlingtoni, A. guamuhaya, A. loveridgei, A. oporinus, and A. polyrhachis).

We found that the Lévy jump model is preferred over a strict BM model in females, but not in males (Table 1). Evolutionary jumps indicating rapid body size evolution (Fig. 4) were found precisely at the base of the clade comprising the ecomorph “crown giants” (Thomas et al. 2009), in which females exhibit particularly large body sizes. The large sexual size dimorphism of this group (Harmon et al. 2005) likely also explains why the BM model fits the evolution of male body sizes well. In addition to the clade of crown giants, we also identified evolutionary jumps at the base of the clade consisting of the species A. barbatus, A. porcus, and A. chamaeleonides. These species, which are known as “false chameleons” and are part of the former genus Chamaeleolis, have been called the “most bizarre West Indian lizards” (Leal and Losos 2000).

Table 1.

Inferred Lévy parameters for Anolis and Loriini, along with the log-likelihood (ℓ) obtained under the Lévy and BM models and the p-value of a likelihood-ratio test (LRT) contrasting these.

                  Anolis                    Loriini
                  logSVLf      logSVLm      log Wgt      PC1          PC2          PC3
μ                 3.93         4.18         5.33         –0.062       0.051        0.19
Inline graphic    5.06         10.34        44.40        7.68         7.35         5.80
α                 0.11                                   0.16         0.093
λ                 11.27                                  11.95        7.38
ℓ (BM)            5.03         –15.19       –85.54       –65.00       –60.63       –6.22
ℓ (Lévy)          26.61        –13.89       –85.51       –31.99       –24.72       –4.97
LRT p             4.3×10^-10   0.28         0.97         4.7×10^-15   2.3×10^-16   0.29
Preferred model   Lévy         BM           BM           Lévy         Lévy         BM
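
As a rough check on how the LRT p-values in Table 1 relate to the two log-likelihood rows, the following minimal sketch recomputes them. It assumes (this is not stated in this section) that the test statistic 2(ℓ_Lévy − ℓ_BM) is compared against a chi-square distribution with 2 degrees of freedom, for which the upper tail probability is simply exp(−x/2). The function name and dictionary layout are illustrative only. Under that assumption the results come out close to the tabulated values, with small differences of the size one would expect from rounding of the log-likelihoods.

```python
import math

# Log-likelihoods (BM, Lévy) copied from Table 1.
log_likelihoods = {
    "Anolis logSVLf": (5.03, 26.61),
    "Anolis logSVLm": (-15.19, -13.89),
    "Loriini log Wgt": (-85.54, -85.51),
    "Loriini PC1": (-65.00, -31.99),
    "Loriini PC2": (-60.63, -24.72),
    "Loriini PC3": (-6.22, -4.97),
}

def lrt_p_value(loglik_null, loglik_alt):
    """LRT p-value, ASSUMING a chi-square reference distribution with 2 df.

    For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    """
    statistic = max(0.0, 2.0 * (loglik_alt - loglik_null))
    return math.exp(-statistic / 2.0)

p_values = {name: lrt_p_value(bm, levy)
            for name, (bm, levy) in log_likelihoods.items()}

for name, p in p_values.items():
    print(f"{name}: p = {p:.2g}")
```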

Figure. 4.

Inferred jumps for female body-size evolution on the anoles’ tree. The trait measured was the snout-to-vent length (SVL). Branches are colored according to their inferred jump posterior probability (black to red scale going from posterior probability 0 to 1, respectively). Tips are colored according to ecomorphs as defined by Thomas et al. (2009).

Our analyses support two main conclusions. First, evolutionary change in female anoles is not well described by a uniform random walk. A better description of anole evolution combines a uniform component of change with rapid jumps in trait values that punctuate it. Second, these jumps in body size correspond closely to ecological transitions to novel ecomorphs. The evolution of this trait is thus consistent with Simpson’s description of evolutionary jumps associated with the entry into new adaptive zones. The fact that we did not find such jumps at the base of clades of other ecomorphs suggests that body size did not contribute strongly to the ecological transitions of those clades. However, evolutionary jumps might well be found at the base of those clades when focusing on more relevant traits.

Nectarivory Evolution in Loriini

The Australasian lories belong to the tribe Loriini (Joseph et al. 2012) and are extremely species rich (Schweizer et al. 2011). Their digestive tract is highly adapted to a nectarivorous diet (Güntert 2012), and Schweizer et al. (2014) have shown quantitatively that a switch in diet to nectarivory might be considered an evolutionary novelty that created an ecological opportunity for species proliferation through allopatric partitioning of the same new niche. Using the methodology developed above we tested if the evolution of the morphology of the digestive tract in parrots as a whole is better characterized by a model of evolutionary jumps or by Brownian motion. For this we made use of data from Schweizer et al. (2014) to generate a time-calibrated phylogeny of 78 parrot species using BEAST (Drummond and Rambaut 2007), implementing a secondary calibration point from Schweizer et al. (2011) for the initial split within parrots. The following 13 measurements of the morphology of the digestive tract were used: the length of intestine, length of esophagus, extension of esophagus glands, length of intermediate zone, length of proventriculus, gizzard height, gizzard width, gizzard depth, maximum gizzard height at main muscles, gizzard thickness at main muscles, gizzard lumen width including koilin layer, gizzard width at the caudoventral thin muscle, maximum gizzard height at the thin muscle, and the maximum gizzard lumen at the thin muscle. Since many of the morphological characters of the digestive tract considered are highly correlated both with body size and among themselves, we first regressed out body mass (Wgt) from each of the other morphological traits and then summarized the residuals of all traits using the first three principal components (PCA; see also Revell 2009).
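
The size correction and summary step described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: it uses ordinary least squares and a plain PCA, whereas Revell (2009) also discusses phylogenetically informed variants, and this section does not say which was applied. The function name, the use of log-transformed inputs, and the simulated values are all assumptions.

```python
import numpy as np

def size_corrected_pcs(traits, log_mass, n_components=3):
    """Regress each trait on body mass, then run a PCA on the residuals.

    traits   : (n_species, n_traits) array of trait values (assumed log scale)
    log_mass : (n_species,) array of body mass (assumed log scale)
    Returns the PC scores and the proportion of variance of each retained PC.
    """
    design = np.column_stack([np.ones_like(log_mass), log_mass])
    coef, *_ = np.linalg.lstsq(design, traits, rcond=None)
    residuals = traits - design @ coef            # size-corrected traits
    centered = residuals - residuals.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T       # projections onto the PCs
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]

# Synthetic stand-in for 78 species and 13 traits (values are made up).
rng = np.random.default_rng(0)
log_mass = rng.normal(5.0, 1.0, size=78)
slopes = rng.uniform(0.2, 0.5, size=13)
traits = log_mass[:, None] * slopes + rng.normal(0.0, 0.3, size=(78, 13))

scores, explained = size_corrected_pcs(traits, log_mass)
print(scores.shape, np.round(explained, 3))
```

Because the residuals of a regression with an intercept are orthogonal to the predictor, the resulting PC scores are uncorrelated with body mass by construction.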

We found that the evolution of the first two PC axes of the morphology of the digestive tract was much better explained by a model of evolutionary jumps (Inline graphic in both cases) with relatively high rates of jumps (Table 1). Overall, the jumps for PC1 identified with strongest support are on branches basal to clades of nectarivorous species, particularly at the base of the highly specialized nectar-feeding Loriini, but also at the base of the genus Loriculus (Fig. 5). As postulated by Simpson, the niche shift to nectarivory, especially in Loriini, involved a period of rapid evolution reflecting adaptations to feed effectively on nectar (and pollen) (Schweizer et al. 2014). Although the jumps within the Neotropical parrots are difficult to interpret in biological terms, the shift along the branch leading to Psittrichas fulgidus might be explained by its gizzard morphology, which is similar to that of the Loriini and probably reflects an adaptation to its reportedly mainly frugivorous diet (Schweizer et al. 2014). Some special structures in the digestive tract of the genus Nestor have been described in Güntert (2012).

Figure. 5.

Evolutionary jumps in the morphology of the digestive tract in parrots. Results for PC1 are shown on the left phylogeny, and results for PC2 on the right phylogeny. Branches are colored according to their inferred jump posterior probability (black to red scale going from posterior probability 0 to 1, respectively). Species names are colored according to their diet: nectarivorous (blue) and nonnectarivorous (green).

Discussion

Although many traits appear to evolve at relatively constant rates over long time periods and across many taxa, some traits seem to undergo periods of rather rapid evolution (see Arnold 2014). Simpson (1944) postulated that such evolutionary jumps are triggered by a change in selection pressure after lineages transitioned into different adaptive zones, for instance by dispersing into new geographic areas, after the appearance of evolutionary novelties or key innovations, or after rapid climatic or ecological changes of the environment. The appearance of well-calibrated phylogenies along with recent statistical developments now allows such models to be tested on a wide variety of data.

Bokma (2008), for instance, proposed to model evolutionary jumps as a compound process of a continuous background process and a discrete jump process. Recently, Landis et al. (2013) introduced a general framework to infer parameters of such Lévy processes under a Bayesian framework by means of Markov Chain Monte Carlo (MCMC). Unfortunately this approach, while elegant, requires the calculation of the inverse of the variance–covariance matrix describing the correlations between traits as a function of the phylogenetic tree and the jump process, which is computationally prohibitive for large trees.

Here we introduce a computationally highly efficient variant of this approach that naturally scales to large trees. The basis of our approach is an MCMC algorithm in which we can update the inverse of the above mentioned variance–covariance matrix directly without inversion when sampling jump configurations with fixed hierarchical parameters (root state, Brownian rate, jump strength and jump rate). To make use of this development for inference we propose a two-step approach in which the MCMC algorithm is embedded into an Expectation–Maximization (EM) approach to obtain maximum likelihood (ML) estimates of the hierarchical parameters while integrating over jump configurations. In a second step, the location of jumps can then be inferred under an empirical Bayes framework in which the hierarchical parameters are fixed to their ML estimate and the developed MCMC algorithm is used to obtain for each branch the posterior probability that a jump occurred at this location.

There are also other methods that deal with the burden of calculating inverses and determinants of variance–covariance matrices. For instance, Freckleton (2012) applied the results of Felsenstein (1973, 1985) to calculate the likelihood of a BM model in linear time. FitzJohn (2012) also proposed a fast algorithm to calculate BM and OU likelihoods using Gaussian elimination, but this is not applicable to non-Gaussian traits. Tung Ho and Ané (2014) proposed a new method that efficiently calculates likelihoods by avoiding the calculation of the inverse and determinant of the variance–covariance matrix. Their method requires that this matrix belongs to a class of generalized 3-point structured matrices. Our method, which applies an iterative scheme, differs from the others in that the inverse and determinant of the variance–covariance matrix have to be calculated only once when obtaining the likelihoods, which results in rather fast calculation times.

We demonstrated the applicability of our approach by identifying evolutionary jumps for body size evolution in Anolis lizards and the evolution of gut morphology in Australasian lories of the tribe Loriini. We found strong support for evolutionary jumps in both systems, which provides direct support for Simpson’s quantum evolutionary hypothesis of adaptive zones. Among the anoles, for instance, we identified evolutionary jumps on the basal lineage leading to crown giants, a group of lizards that transitioned into a novel niche for hunting: the crowns of large tropical trees. Similarly, we identified jumps at the base of clades of lories that transitioned to nectarivory, an evolutionary novelty that triggered rapid changes in morphology of the digestive system and promoted significant lineage diversification, which was probably mainly non-adaptive after the basal diet shift through allopatric partitioning of the same niche (cf. Schweizer et al. 2014).

These results also show that the distinction between “gradual” and “punctuated” models of evolution is a false dichotomy; instead, evolution has a gradual component that may be frequently punctuated by periods of rapid change (Levinton 2001). We further note that in both cases studied here a single jump at the basis of clades is sufficient to explain their trait data, suggesting that the period of rapid evolution was limited to a single branch and that the background rate remained constant. We suggest that future work should follow Simpson’s lead and focus on the factors that promote these pulses of evolutionary change.

Although we model evolutionary jumps as instantaneous, we want to be clear that we are not invoking actual instantaneous evolutionary change (e.g., “hopeful monsters”) (Goldschmidt 1940; Charlesworth et al. 1982). Typical microevolutionary processes of selection and drift can cause change that would appear to be instantaneous when viewed over the timescale of macroevolution. Our model is also distinct from punctuated equilibrium, which requires evolutionary jumps to occur only at speciation events (Eldredge and Gould 1972). The punctuated changes in our model occur along branches in the tree and are not necessarily associated with speciation events. In fact, for the case of anoles, two lines of evidence argue against punctuated equilibrium: first, most speciation events in the tree are not associated with jumps; and second, we know from detailed microevolutionary studies that anole body size can evolve rapidly in response to selection even in the absence of speciation (e.g., Losos et al. 2006).

Supplementary Material

Supplementary material including Figures can be found in the Dryad data repository at http://dx.doi.org/10.5061/dryad.170rb.

Funding

This study was supported by Swiss National Foundation grant [31003A_149920 to D.W.]. The work of S. M. Szilágyi was supported by the János Bolyai Fellowship Program of the Hungarian Academy of Sciences.

Acknowledgments

We are grateful for the helpful comments Folmer Bokma made in his signed review.

Appendix 1

Efficient Calculation of Inverses and Determinants

For a symmetric non-singular matrix Inline graphic and a (column) vector Inline graphic, we have

$$(A \pm a a')^{-1} = A^{-1} \mp \frac{1}{1 \pm a' A^{-1} a}\,(A^{-1}a)(A^{-1}a)' \tag{1}$$

(see Izenman (2008), p. 47) and

$$\det(A \pm a a') = \det A \,\bigl(1 \pm a' A^{-1} a\bigr) \tag{2}$$

(see Anderson 2003, Corollary A.3.1). These formulae have recently been shown to speed up the calculation of the likelihood function under Brownian motion models (Tung Ho and Ané 2014). Here we use them to develop a fast algorithm applicable to Lévy processes.
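
Formulae (1) and (2) are the standard rank-one update identities (often called the Sherman–Morrison formula and the matrix determinant lemma). The following minimal sketch checks them numerically on a random symmetric positive-definite matrix; the helper name `rank_one_update` and the test matrix are illustrative choices, not part of the original implementation.

```python
import numpy as np

def rank_one_update(A_inv, det_A, a, sign=1.0):
    """Inverse and determinant of A + sign * a a', given those of A.

    Implements formulae (1) and (2) for a symmetric matrix A;
    sign is +1.0 or -1.0.
    """
    v = A_inv @ a                       # A^{-1} a
    denom = 1.0 + sign * (a @ v)        # 1 +/- a' A^{-1} a
    new_inv = A_inv - sign * np.outer(v, v) / denom
    new_det = det_A * denom
    return new_inv, new_det

rng = np.random.default_rng(1)
size = 5
M = rng.normal(size=(size, size))
A = M @ M.T + size * np.eye(size)       # symmetric and positive definite
a = 0.3 * rng.normal(size=size)         # small enough that 1 - a'A^{-1}a > 0

A_inv, det_A = np.linalg.inv(A), np.linalg.det(A)
for sign in (1.0, -1.0):
    new_inv, new_det = rank_one_update(A_inv, det_A, a, sign)
    direct = A + sign * np.outer(a, a)
    print(sign,
          np.allclose(new_inv, np.linalg.inv(direct)),
          np.isclose(new_det, np.linalg.det(direct)))
```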

Let us first fix some notation: For each branch Inline graphic, we define the Inline graphic incidence matrix Inline graphic by setting Inline graphic if the branch Inline graphic is common to the pair of leaves Inline graphic, and Inline graphic otherwise. Clearly,

$$n_{ij} = \sum_{b=1}^{B} n_b\, I^{b}_{ij}.$$

In the following we replace the parameter Inline graphic with the positive factor Inline graphic given by Inline graphic. Observe that

$$\Sigma(n) = s_0^2\, T(n,\alpha) \quad\text{and}\quad \det\Sigma(n) = s_0^{2L}\, \det T(n,\alpha),$$

where

$$T(n,\alpha) = T_0 + \alpha \sum_{b=1}^{B} n_b\, I^{b}$$

and Inline graphic. Finally, we introduce for Inline graphic the (column) vectors Inline graphic, each one with Inline graphic components. The Inline graphic-th component Inline graphic is equal to Inline graphic if leaf Inline graphic is subordinate to branch Inline graphic (i.e., the path from the root Inline graphic to node Inline graphic contains branch Inline graphic). Otherwise, if leaf Inline graphic is not subordinate to branch Inline graphic, then Inline graphic. It is easy to see that Inline graphic and thus

$$T(n,\alpha) = T_0 + \alpha \sum_{b=1}^{B} n_b\, u_b (u_b)'. \tag{3}$$
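
To make the notation concrete, the sketch below builds the indicator vectors for a hypothetical three-leaf tree ((A,B),C) and checks that the outer product of each vector with itself reproduces the incidence matrix of the corresponding branch, which is the step that turns the sum over incidence matrices into the rank-one form (3). The tree, branch labels and jump counts are made up for illustration.

```python
import numpy as np

leaves = ["A", "B", "C"]
# Hypothetical tree ((A,B),C): each branch is listed with the leaves below it.
branches = {
    "to_A": {"A"},
    "to_B": {"B"},
    "to_AB": {"A", "B"},   # internal branch leading to the ancestor of A and B
    "to_C": {"C"},
}

# u_b has a 1 for every leaf that is subordinate to branch b, else 0.
U = np.array([[1.0 if leaf in below else 0.0 for leaf in leaves]
              for below in branches.values()])

def incidence_matrix(below):
    """I^b_ij = 1 if branch b lies on the root paths of both leaves i and j."""
    return np.array([[1.0 if (i in below and j in below) else 0.0
                      for j in leaves] for i in leaves])

for u_b, below in zip(U, branches.values()):
    assert np.array_equal(np.outer(u_b, u_b), incidence_matrix(below))

# Shared jump counts n_ij for an example configuration n = (0, 1, 2, 0).
n = np.array([0, 1, 2, 0])
shared_jumps = sum(n_b * np.outer(u_b, u_b) for n_b, u_b in zip(n, U))
print(shared_jumps)
```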

We can now apply formulae (1) and (2) to obtain the following iterative scheme for the computation of Inline graphic and Inline graphic:

First, determine Inline graphic and Inline graphic. Then, for each term with Inline graphic in the sum (3), update Inline graphic to Inline graphic etc. as follows: Let

$$r_b = 1 + \alpha\, n_b\, (u_b)'\, T_{b-1}^{-1} u_b$$

and calculate

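$$T_b^{-1} = T_{b-1}^{-1} - \frac{\alpha\, n_b}{r_b}\,\bigl(T_{b-1}^{-1}u_b\bigr)\bigl(T_{b-1}^{-1}u_b\bigr)'; \qquad \det T_b = r_b \det T_{b-1}. \tag{4}$$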

When all non-zero terms in (3) have been considered, we arrive at Inline graphic and Inline graphic. Observe that in this scheme, the only matrix inverse that ever has to be determined is Inline graphic. The number of non-zero Inline graphic will frequently be small compared to Inline graphic and so will be the number of iterations (4).
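
The iterative scheme can be sketched in a few lines. The example below is a minimal illustration rather than the authors' implementation: the function name is hypothetical, and the starting matrix is taken to be the matrix of shared branch lengths on a made-up three-leaf tree, which is an assumption about the definition of the starting matrix. The result is compared against a direct inversion.

```python
import numpy as np

def inverse_and_det_by_updates(T0, U, n, alpha):
    """Inverse and determinant of T0 + alpha * sum_b n_b u_b u_b'.

    U holds the vectors u_b as rows; n holds the jump counts n_b.
    Only T0 is inverted explicitly; every branch with n_b > 0 then
    contributes one rank-one update as in (4).
    """
    T_inv = np.linalg.inv(T0)
    det_T = np.linalg.det(T0)
    for u_b, n_b in zip(U, n):
        if n_b == 0:
            continue                      # zero terms leave T unchanged
        v = T_inv @ u_b
        r_b = 1.0 + alpha * n_b * (u_b @ v)
        T_inv = T_inv - (alpha * n_b / r_b) * np.outer(v, v)
        det_T = r_b * det_T
    return T_inv, det_T

# Hypothetical tree ((A,B),C) with branch lengths tau (assumed values).
U = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]], dtype=float)
tau = np.array([1.0, 1.0, 1.0, 2.0])
T0 = (U.T * tau) @ U                      # assumed: shared branch lengths
n = np.array([0, 1, 2, 0])
alpha = 0.8

T_inv, det_T = inverse_and_det_by_updates(T0, U, n, alpha)
T_direct = T0 + alpha * (U.T * n) @ U
print(np.allclose(T_inv, np.linalg.inv(T_direct)),
      np.isclose(det_T, np.linalg.det(T_direct)))
```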

Appendix 2

Simulating Inline graphic with MCMC

Here we describe how to sample the states Inline graphic from the probability distribution Inline graphic using the Metropolis scheme. (To unburden the notation in the description of the MCMC algorithm, we drop the tilde overscript on the parameters.) At each state we will need the inverse matrix Inline graphic of Inline graphic given by (3). Start the chain e.g., at Inline graphic and with Inline graphic.

  • 1.

    Let Inline graphic denote the current state of the Markov chain and Inline graphic the inverse matrix of Inline graphic. Choose an index Inline graphic with equal probability (or with a probability proportional to Inline graphic) and an increment Inline graphic or Inline graphic with probability Inline graphic. The candidate state Inline graphic is given by in- or decreasing the Inline graphic-th index Inline graphic by Inline graphic: Inline graphic.

  • 2.
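    Using (6) and the iteration formula (4) it is not hard to check that the Hastings ratio (acceptance probability) can be calculated by
    $$h = \min\left[1,\; r^{-1/2} \left(\frac{\lambda \tau_b}{n_b^{+}}\right)^{\Delta n_b} \exp\left(\frac{\alpha\, \Delta n_b}{2\, r\, s_0^2}\, \bigl|(x - \mu 1)'\, T^{-1} u_b\bigr|^2\right)\right],$$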
    where Inline graphic and Inline graphic. If the candidate state contains a negative component (i.e., if Inline graphic and Inline graphic) then set Inline graphic. This ensures that the chain is indeed symmetric.
  • 3.
    With probability Inline graphic jump to the candidate state Inline graphic, otherwise stay at Inline graphic. In the first case, update
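    $$n \leftarrow n', \qquad T^{-1} \leftarrow T^{-1} - \frac{\alpha_0\, \Delta n_b}{r}\,\bigl(T^{-1}u_b\bigr)\bigl(T^{-1}u_b\bigr)'$$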
    and go to step Inline graphic.

No matrix inverse needs to be calculated within this scheme thanks to the update in step Inline graphic. (To counterbalance the accumulation of numerical errors it might however be wise to occasionally calculate Inline graphic from scratch.)
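
The sampler described above can be sketched as follows. This is a minimal illustration under several assumptions that are not spelled out in this appendix: the quantity r is taken to be 1 + α Δn_b u_b′T⁻¹u_b, n_b⁺ is taken to be the larger of the current and proposed counts, branches are chosen uniformly, increments of +1 and −1 are proposed with probability 1/2 each, and the chain starts without jumps. All function names, the toy tree and the parameter values are hypothetical. The slow function `log_target` exists only so that the fast acceptance probability can be checked against a direct evaluation.

```python
import math
import numpy as np

def log_target(n, x, mu, s02, alpha, lam, tau, U, T0):
    """Direct (slow) unnormalised log of phi(x | n) * P(n); used as a check."""
    T = T0 + alpha * (U.T * n) @ U        # T0 + alpha * sum_b n_b u_b u_b'
    resid = x - mu
    _, logdet = np.linalg.slogdet(T)
    log_phi = -0.5 * logdet - resid @ np.linalg.solve(T, resid) / (2.0 * s02)
    log_prior = sum(k * math.log(lam * t) - math.lgamma(k + 1)
                    for k, t in zip(n, tau))   # Poisson terms that depend on n
    return log_phi + log_prior

def acceptance(n_b, delta, u_b, T_inv, resid, s02, alpha, lam, tau_b):
    """Acceptance probability h and factor r for changing n_b by delta."""
    if n_b + delta < 0:
        return 0.0, None                  # negative counts are never accepted
    v = T_inv @ u_b
    r = 1.0 + alpha * delta * (u_b @ v)
    n_plus = max(n_b, n_b + delta)        # assumed meaning of n_b^+
    log_ratio = (-0.5 * math.log(r)
                 + delta * math.log(lam * tau_b / n_plus)
                 + alpha * delta * (resid @ v) ** 2 / (2.0 * r * s02))
    return math.exp(min(0.0, log_ratio)), r

def sample_configurations(x, mu, s02, alpha, lam, tau, U, T0, n_steps, rng):
    """Metropolis sampling of jump configurations with rank-one updates."""
    n = np.zeros(len(tau), dtype=int)     # assumed start: no jumps
    T_inv = np.linalg.inv(T0)             # the only explicit inversion
    resid = x - mu
    samples = []
    for _ in range(n_steps):
        b = rng.integers(len(tau))        # uniform choice of branch
        delta = 1 if rng.random() < 0.5 else -1
        h, r = acceptance(n[b], delta, U[b], T_inv, resid,
                          s02, alpha, lam, tau[b])
        if rng.random() < h:
            v = T_inv @ U[b]
            T_inv = T_inv - (alpha * delta / r) * np.outer(v, v)
            n[b] += delta
        samples.append(n.copy())
    return np.array(samples), T_inv

# Toy example on a hypothetical tree ((A,B),C); all values are made up.
U = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]], dtype=float)
tau = np.array([1.0, 1.0, 1.0, 2.0])
T0 = (U.T * tau) @ U                      # assumed: shared branch lengths
x = np.array([0.1, 2.0, -0.3])
mu, s02, alpha, lam = 0.5, 0.5, 0.8, 0.4

samples, T_inv_end = sample_configurations(
    x, mu, s02, alpha, lam, tau, U, T0, n_steps=500,
    rng=np.random.default_rng(0))
print(samples.mean(axis=0))               # average jump count per branch
```

Averaging the stored inverse matrices over retained states, rather than the counts as printed here, would give the Monte Carlo estimate used in the M-step; in a real run one would also discard a burn-in phase and thin the chain.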

After the burn-in phase, a fraction Inline graphic of the simulated states will be retained (thinning out). These will be used to replace the matrix (9) in the M-step of the EM algorithm by

$$S = \frac{1}{M} \sum_{m=1}^{M} T^{-1}(n_m, \alpha_0). \tag{5}$$

Appendix 3

Conditional Likelihood

If we suppose a priori that there have been only Inline graphic or less Poisson events on the tree Inline graphic, the EM algorithm can be implemented deterministically. In that case, the sum in (5) is over all Inline graphic such that Inline graphic. Observe that we have to use the conditional probabilities

$$P(\nu = n_i \mid \nu \le N) = P(\nu = n_i) \Big/ \sum_{k=1}^{K} P(\nu = n_k).$$

The new Inline graphic-function is

$$Q(\theta) = \sum_{k=1}^{K} \pi_k \log\bigl[\phi(x \mid n_k)\, P(\nu = n_k \mid \nu \le R)\bigr], \tag{6}$$

where we can use

$$\pi_k = \phi(x \mid n_k, \tilde{\mu}, \tilde{s}_0^2, \tilde{\alpha})\, P(\nu = n_k \mid \tilde{\lambda}).$$

In the conditional case, there no longer seems to exist a closed formula like (8) for the optimal Inline graphic. Setting the derivative of (6) w.r.t. Inline graphic equal to Inline graphic, one can show that Inline graphic is the root of the following Inline graphic-th order polynomial:

$$P(\lambda) = \sum_{r=0}^{R} \frac{\tau^{r}}{r!} \left(\sum_{k} \pi_k\, |n_k| - r \sum_{k} \pi_k\right) \lambda^{r},$$

i.e., Inline graphic. The estimation of Inline graphic and Inline graphic, on the other hand, remains exactly as given by (10) and (11), respectively.
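
A numerical root of this polynomial can be obtained with a standard polynomial solver. The sketch below is a minimal illustration that assumes τ denotes the total tree length and |n_k| the total number of jumps in configuration k; the function name and the toy weights are hypothetical. The weights in the example are chosen so that the correct answer is known in advance, which gives a simple self-check.

```python
import math
import numpy as np

def conditional_lambda(weights, jump_totals, tree_length, max_events):
    """Positive real root of P(lambda) for the conditional M-step.

    weights     : the pi_k, one per jump configuration
    jump_totals : |n_k|, the total number of jumps in configuration k
    tree_length : tau (ASSUMED here to be the total tree length)
    max_events  : R, the maximal number of events allowed on the tree
    """
    a = float(np.dot(weights, jump_totals))   # sum_k pi_k |n_k|
    b = float(np.sum(weights))                # sum_k pi_k
    coeffs = [tree_length**r / math.factorial(r) * (a - r * b)
              for r in range(max_events + 1)]
    roots = np.roots(coeffs[::-1])            # numpy wants highest degree first
    positive = [z.real for z in roots if abs(z.imag) < 1e-8 and z.real > 0]
    if not positive:
        raise ValueError("no positive real root found")
    return min(positive)

# Self-check: weights proportional to truncated Poisson probabilities with
# rate lambda_true should give back lambda_true.
lambda_true, tau_total, R = 0.7, 2.0, 3
totals = np.arange(R + 1)
weights = np.array([(lambda_true * tau_total)**r / math.factorial(r)
                    for r in totals])
lambda_hat = conditional_lambda(weights, totals, tau_total, R)
print(lambda_hat)
```

When the weighted mean number of jumps lies strictly between 0 and R, the coefficients change sign exactly once, so by Descartes' rule of signs there is a single positive root.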

Appendix 4

Assessing Convergence of the Stochastic EM

We introduce two measures to assess convergence of the Monte Carlo EM algorithm.

Regression criterion.

We consider a time series Inline graphic and construct a test statistic that allows us to reject the null hypothesis that the time series exhibits no trend. For this we estimate the slope Inline graphic of the regression line passing through the data points Inline graphic and test the null hypothesis Inline graphic (no trend). Determine the following quantities:

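$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad S_{xx} = \frac{n(n^2-1)}{12}, \qquad S_{xy} = -\frac{1}{2}\sum_{i=1}^{n} (n+1-2i)\, y_i, \qquad \hat{\beta} = \frac{S_{xy}}{S_{xx}},$$
$$\hat{\sigma}^2 = \frac{1}{n-2}\left(\Bigl(\sum_{i=1}^{n} y_i^2\Bigr) - n\bar{y}^2 - \hat{\beta}\, S_{xy}\right), \qquad \mathrm{se}(\hat{\beta}) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}}.$$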

The test statistic

$$T = \frac{\hat{\beta}}{\mathrm{se}(\hat{\beta})}$$

has the Student’s Inline graphic-distribution with Inline graphic degrees of freedom. We reject the null hypothesis on the level Inline graphic if Inline graphic. A good rule of thumb (for Inline graphic roughly Inline graphic and Inline graphic) is Inline graphic.
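
The regression criterion amounts to an ordinary least-squares fit of the monitored values against their index 1, ..., n. A minimal sketch, with a hypothetical function name and made-up input values, is given below; the critical value against which T is compared is left to the user.

```python
import math

def trend_statistic(y):
    """Test statistic T for the slope of y regressed on its index 1..n."""
    n = len(y)
    if n < 3:
        raise ValueError("need at least three values")
    y_bar = sum(y) / n
    s_xx = n * (n**2 - 1) / 12.0          # sum of squared centered indices
    s_xy = -0.5 * sum((n + 1 - 2 * i) * y_i
                      for i, y_i in enumerate(y, start=1))
    beta = s_xy / s_xx                    # estimated slope
    sigma2 = (sum(v * v for v in y) - n * y_bar**2 - beta * s_xy) / (n - 2)
    se_beta = math.sqrt(sigma2 / s_xx)    # standard error of the slope
    return beta / se_beta

# Made-up series of five values with a mild upward trend.
t_value = trend_statistic([2.0, 4.0, 5.0, 4.0, 5.0])
print(round(t_value, 4))
```

For this toy series the slope is 0.6 and its standard error is the square root of 0.08, so the statistic is about 2.12.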

Proportion of slope sign changes.

We propose a second way of assessing convergence by taking the last Inline graphic values of the EM algorithm and counting the number of times Inline graphic there is a change in the sign of the slope between consecutive values. If convergence is reached, we expect the number of slopes with a positive sign to be similar to the number of slopes with a negative sign. We report the test statistic Inline graphic

$$N = \frac{c}{n-2},$$

where Inline graphic represents the total number of possible sign changes among the last Inline graphic values.
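
This second criterion is straightforward to compute. The sketch below assumes the function is handed only the tail of the series to be examined and that a sign change is counted whenever two consecutive slopes have strictly opposite signs (how ties with zero slope are treated is an assumption); the function name is hypothetical.

```python
def sign_change_proportion(values):
    """Proportion N = c / (n - 2) of slope sign changes among n values."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    slopes = [later - earlier for earlier, later in zip(values, values[1:])]
    # Count consecutive slope pairs with strictly opposite signs.
    c = sum(1 for s, t in zip(slopes, slopes[1:]) if s * t < 0)
    return c / (n - 2)

print(sign_change_proportion([1, 2, 1, 2, 1]))        # alternates at every step
print(sign_change_proportion([1, 2, 3, 4]))           # monotone, no changes
print(sign_change_proportion([1, 3, 2, 4, 5, 3]))     # three changes out of four
```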

References

  1. Anderson T. 2003. An introduction to multivariate statistical analysis. New York: John Wiley.
  2. Arnold S.J. 2014. Phenotypic evolution: the ongoing synthesis. American Naturalist 183:729–746.
  3. Bedford T., Hartl D.L. 2009. Optimization of gene expression by natural selection. Proc. Nat. Acad. Sci. 106:1133–1138.
  4. Beuttell K., Losos J.B. 1999. Ecological morphology of Caribbean anoles. Herpetological Monographs, p. 1–28.
  5. Bokma F. 2008. Detection of punctuated equilibrium by Bayesian estimation of speciation and extinction rates, ancestral character states, and rates of anagenetic and cladogenetic evolution on a molecular phylogeny. Evolution 62:2718–2726.
  6. Brawand D., Soumillon M., Necsulea A., Julien P., Csárdi G., Harrigan P., Weier M., Liechti A., Aximu-Petri A., Kircher M., et al. 2011. The evolution of gene expression levels in mammalian organs. Nature 478:343–348.
  7. Butler M.A., King A.A. 2004. Phylogenetic comparative analysis: a modeling approach for adaptive evolution. American Naturalist 164:683–695.
  8. Cavalli-Sforza L.L., Edwards A.W. 1967. Phylogenetic analysis. Models and estimation procedures. American J. Human Genet. 19:233.
  9. Charlesworth B., Lande R., Slatkin M. 1982. A neo-Darwinian commentary on macroevolution. Evolution 36:474–498.
  10. Davidson R., MacKinnon J.G. 2004. Econometric theory and methods, vol. 5. New York: Oxford University Press.
  11. Drummond A.J., Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolut. Biol. 7:214.
  12. Eastman J.M., Alfaro M.E., Joyce P., Hipp A.L., Harmon L.J. 2011. A novel comparative method for identifying shifts in the rate of character evolution on trees. Evolution 65:3578–3589.
  13. Edwards A., Cavalli-Sforza L., Heywood V. 1964. Phenetic and phylogenetic classification. Systematics Association, Publication 67.
  14. Eldredge N., Gould S.J. 1972. Punctuated equilibria: an alternative to phyletic gradualism. Models in Paleobiol. 82–115.
  15. Felsenstein J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. American J. Human Genet. 25:471.
  16. Felsenstein J. 1985. Phylogenies and the comparative method. American Naturalist 125(1):1–15.
  17. FitzJohn R.G. 2012. Diversitree: comparative phylogenetic analyses of diversification in R. Methods Ecol. Evol. 3:1084–1092.
  18. Freckleton R.P. 2012. Fast likelihood calculations for comparative analyses. Methods Ecol. Evol. 3:940–947.
  19. Freckleton R.P., Harvey P.H., Pagel M. 2002. Phylogenetic analysis and comparative data: a test and review of evidence. American Naturalist 160:712–726.
  20. Goldschmidt R. 1940. The material basis of evolution, vol. 28. New Haven: Yale University Press.
  21. Güntert M. 2012. Morphologische Untersuchungen zur adaptiven Radiation des Verdauungstraktes bei Papageien (Psittaci). Zoologische Jahrbücher. Abteilung für Anatomie und Ontogenie der Tiere 106:471–526.
  22. Hansen T.F. 1997. Stabilizing selection and the comparative analysis of adaptation. Evolution 51(5):1341–1351.
  23. Harmon L.J., Kolbe J.J., Cheverud J.M., Losos J.B. 2005. Convergence and the multidimensional niche. Evolution 59:409–421.
  24. Huelsenbeck J.P., Nielsen R., Bollback J.P. 2003. Stochastic mapping of morphological characters. Syst. Biol. 52:131–158.
  25. Ingram T., Mahler D.L. 2013. SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise Akaike Information Criterion. Methods Ecol. Evol. 4:416–425.
  26. Izenman A. 2008. Modern multivariate statistical techniques, vol. 1. New York: Springer.
  27. Joseph L., Toon A., Schirtzinger E.E., Wright T.F., Schodde R. 2012. A revised nomenclature and classification for family-group taxa of parrots (Psittaciformes). Zootaxa 3205:26–40.
  28. Landis M.J., Schraiber J.G., Liang M. 2013. Phylogenetic analysis using Lévy processes: finding jumps in the evolution of continuous traits. Syst. Biol. 62:193–204.
  29. Leal M., Losos J.B. 2000. Behavior and ecology of the Cuban “chipojos bobos” Chamaeleolis barbatus and C. porcus. J. Herpetol. 34:318–322.
  30. Levinton J.S. 2001. Genetics, paleontology, and macroevolution. Cambridge: Cambridge University Press.
  31. Losos J.B. 2009. Lizards in an evolutionary tree: ecology and adaptive radiation of anoles, vol. 10. University of California Press.
  32. Losos J.B. 2010. Adaptive radiation, ecological opportunity, and evolutionary determinism. American Naturalist 175:623–639.
  33. Losos J.B., Glor R.E., Kolbe J.J., Nicholson K. 2006. Adaptation, speciation, and convergence: a hierarchical analysis of adaptive radiation in Caribbean Anolis lizards. Annals Missouri Botanical Garden 93:24–33.
  34. Losos J.B., Jackman T.R., Larson A., de Queiroz K., Rodríguez-Schettino L. 1998. Contingency and determinism in replicated adaptive radiations of island lizards. Science 279:2115–2118.
  35. Mahler D.L., Revell L.J., Glor R.E., Losos J.B. 2010. Ecological opportunity and the rate of morphological evolution in the diversification of Greater Antillean anoles. Evolution 64:2731–2745.
  36. O’Meara B.C., Ané C., Sanderson M.J., Wainwright P.C. 2006. Testing for different rates of continuous trait evolution using likelihood. Evolution 60:922–933.
  37. Revell L.J. 2009. Size-correction and principal components for interspecific comparative studies. Evolution 63:3258–3268.
  38. Rohlfs R.V., Harrigan P., Nielsen R. 2013. Modeling gene expression evolution with an extended Ornstein-Uhlenbeck process accounting for within-species variation. Mol. Biol. Evol. 31(1):201–211.
  39. Schluter D. 1995. Uncertainty in ancient phylogenies. Nature 377:108–110.
  40. Schoener T.W. 1970. Size patterns in West Indian Anolis lizards. II. Correlations with the sizes of particular sympatric species-displacement and convergence. Am. Naturalist 104(936):155–174.
  41. Schweizer M., Güntert M., Seehausen O., Leuenberger C., Hertwig S.T. 2014. Parallel adaptations to nectarivory in parrots, key innovations and the diversification of the Loriinae. Ecol. Evol. 4:2867–2883.
  42. Schweizer M., Seehausen O., Hertwig S.T. 2011. Macroevolutionary patterns in the diversification of parrots: effects of climate change, geological events and key innovations. J. Biogeography 38:2176–2194.
  43. Simpson G. 1944. Tempo and Mode in Evolution. A Wartime book. New York: Columbia University Press.
  44. Slater G.J., Harmon L.J., Wegmann D., Joyce P., Revell L.J., Alfaro M.E. 2012. Fitting models of continuous trait evolution to incompletely sampled comparative data using approximate Bayesian computation. Evolution 66:752–762.
  45. Stadler T. 2011. Simulating trees with a fixed number of extant species. Syst. Biol. 60:676–684.
  46. Thomas G.H., Meiri S., Phillimore A.B. 2009. Body size diversification in Anolis: novel environment and island effects. Evolution 63:2017–2030.
  47. Tung Ho L.S., Ané C. 2014. A linear-time algorithm for Gaussian and non-Gaussian trait evolution models. Syst. Biol. 63:397–408.
  48. Uyeda J.C., Hansen T.F., Arnold S.J., Pienaar J. 2011. The million-year wait for macroevolutionary bursts. Proc. Nat. Acad. Sci. 108:15908–15913.
  49. Uyeda J.C., Harmon L.J. 2014. A novel Bayesian method for inferring and interpreting the dynamics of adaptive landscapes from phylogenetic comparative data. Syst. Biol. 63:902–918.

Articles from Systematic Biology are provided here courtesy of Oxford University Press