Abstract
Relaxing the molecular clock using models of how substitution rates change across lineages has become essential for addressing evolutionary problems. The diversity of rate evolution models and their implementations is substantial, and studies have demonstrated that their impact on divergence time estimates can be as significant as that of calibration information. In this review, we trace the development of rate evolution models from the proposal of the molecular clock concept to the development of sophisticated Bayesian and non-Bayesian methods that handle rate variation in phylogenies. We discuss the various approaches to modeling rate evolution, provide a comprehensive list of available software, and examine the challenges and advancements of the prevalent Bayesian framework, contrasting them with faster non-Bayesian methods. Lastly, we offer insights into potential advancements in the field in the era of big data.
Keywords: molecular dating, relaxed molecular clock, rate heterogeneity, model comparison, rate models, molecular clock history
Significance.
Understanding the timing of evolutionary events is crucial for studying how species have evolved over time. Traditional methods assumed a constant rate of genetic change, but this is often not accurate. Newer models that allow for changes in substitution rates across different lineages have become vital tools for researchers. This review provides a comprehensive historical overview of how these models have evolved from the initial concept of a constant “molecular clock” to more advanced methods that account for varying rates of evolution. Additionally, we highlight the software available for these analyses and discuss the current challenges and advancements in the field.
Introduction
When Watson and Crick discovered the structure of DNA in 1953, measuring the rate at which nucleotide or amino acid sequences change over time was unattainable. At that time, the pace of evolution could only be measured for phenotypic traits, as outlined in G.G. Simpson’s classic Tempo and Mode in Evolution. The pioneering works of Zuckerkandl and Pauling in the 1960s (Zuckerkandl and Pauling 1962, 1965) provided the first glimpse of how amino acids of the primary sequence of the hemoglobin protein were replaced over evolutionary time in selected mammalian species. Surprisingly, the number of amino acid changes between species such as humans and horses correlated linearly with their divergence time, suggesting that the rate at which substitutions accumulate in molecular sequences remained constant over time (Fig. 1a). This rate constancy, which worked as a molecular evolutionary clock, was later confirmed for other proteins (Margoliash 1963; Doolittle and Blomback 1964).
Fig. 1.
The strict molecular clock (a) compared to models of rate evolution that assume discrete variation of substitution rates across branches: b) local clocks and c) discrete multirate clocks.
These findings marked the birth of molecular evolution and allowed the convergence of seemingly disparate disciplines such as paleontology and molecular biology. Since then, the molecular clock has been at the center of numerous debates, providing crucial evidence for evolutionary theories and enabling the estimation of divergence times between biological lineages and molecular evolutionary rates (Sarich and Wilson 1967; Wilson and Sarich 1969; Ohta and Kimura 1971; Langley and Fitch 1974; Read 1975; Gillespie and Langley 1979; Kimura 1983; Britten 1986; Gillespie 1986b; Martin and Palumbi 1993; Mooers and Harvey 1994). Furthermore, the field has experienced a flourishing of methodological developments, particularly over the last 25 years since the implementation of the first model of rate evolution based on the Bayesian statistical framework (Thorne et al. 1998). While several reviews have addressed various important aspects of divergence time estimation, such as molecular clock calibration (Ho and Duchêne 2014; Donoghue and Yang 2016; Bromham et al. 2018; Guindon 2020; Tiley et al. 2020), modeling of the branching process (Ho and Duchêne 2014; Bromham et al. 2018), and associated statistical challenges (dos Reis et al. 2016; Kumar and Hedges 2016; Bromham et al. 2018; Bromham 2019; Guindon 2020), the modeling of rates has not been thoroughly reviewed in light of recent developments. Here, we provide a comprehensive overview of the development of models of how substitution rates evolve across lineages, since the proposal of the molecular clock to later models that relax the rate constancy assumption. Our goal was to concentrate on the evolution of rate models themselves, which we believe warrants independent consideration.
Early Models of Rate Evolution and Testing the Clock
The concept of the molecular clock has been controversial from its inception. Efforts to evaluate the molecular clock hypothesis within a statistical framework began in the 1970s. At its core, the molecular clock hypothesis assumes that substitutions in nucleotide and amino acid sequences follow a Poisson process, where both the mean and the variance of the distribution are equal to the product of rate and time. Under this model, the number of substitutions occurring in independent lineages over the same period should be similar, reflecting rate constancy across lineages. In a Poisson process, the ratio of the variance to the mean, known as the index of dispersion, should approach unity, indicating that the observed substitution rate is consistent with a constant rate of evolution. However, when this index is greater than 1, the distribution is described as overdispersed, suggesting that the variance in evolutionary rates exceeds the level expected by chance alone. This overdispersion indicates deviations from the strict molecular clock, where the assumption of rate constancy no longer holds, pointing to heterogeneity in substitution rates across lineages.
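As a minimal sketch of this test statistic, the snippet below computes the index of dispersion, taken here as the variance-to-mean ratio of substitution counts across lineages. The function name and the substitution counts are hypothetical and chosen only for illustration; the sample variance (n - 1 denominator) is an assumption, since several estimators of the index exist.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Variance-to-mean ratio of substitution counts across lineages."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean substitution count is zero")
    # Sample variance (n - 1 denominator); an assumption of this sketch.
    return variance(counts) / m


# Hypothetical substitution counts accumulated over the same time span.
clock_like = [14, 22, 20, 18, 26]      # variance equals the mean
overdispersed = [5, 40, 12, 33, 10]    # variance far exceeds the mean

print(index_of_dispersion(clock_like))
print(index_of_dispersion(overdispersed))
```

A value near 1 is what a constant-rate Poisson process predicts, whereas a value well above 1 signals overdispersion.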
The study of Ohta and Kimura (1971) was the first attempt to assess the rate constancy while treating the molecular clock as a null hypothesis. By considering a Poisson process, they found evidence of overdispersion, suggesting that the molecular clock was not strictly constant. Later, Langley and Fitch (1974) employed a likelihood ratio test and also rejected the clock when both lineage and residual effects were considered (Gillespie 1991). Lineage effects are those expected to affect all genes homogeneously, while residual effects result from the interaction between lineage and gene-specific effects (Dickerson 1971; Gillespie 1989; Duchêne and Ho 2015). For example, generation time impacts all genes uniformly as a lineage effect, whereas positive selection often targets a specific subset of loci (Muse and Gaut 1997; Gaut et al. 2011).
Therefore, from the outset, molecular data suggested an overdispersed molecular clock, indicating that rate constancy was not a universal feature of molecular evolution. However, exactly how the rate varied over evolutionary time was unclear. Langley and Fitch (1974) proposed that substitution rates would change between lineages in a correlated manner. Thus, if instantaneous mutation rates were approximately homogeneous between lineages, rate variation was accounted for by differences in the rate of allele fixation (substitution rates). According to them, and contrary to the neutral theory, a significant portion of observed sequence differences would result from the action of natural selection. Curiously, Langley and Fitch (1974) noticed that assuming an overall average rate of evolution yielded a reasonable correspondence between the inferred ages of between-species splits and ages derived from the fossil record. It was evident that the molecular clock was useful in establishing the timescale of evolution, even with some degree of rate variation.
As the number of studies confirming rate heterogeneity of molecular rates in various biological lineages accumulated (Giebel et al. 1985; Wu and Li 1985; Britten 1986; Li and Tanimura 1987; Moriyama 1987; Li et al. 1990), authors devised several explanations besides natural selection, from variation in generation times (or, more precisely, the number of germline DNA replications per year) to the effectiveness of DNA repair mechanisms (Wu and Li 1985; Britten 1986). Motivated by the early findings that a constant rate Poisson process inadequately describes the substitutions observed in molecular sequences, Gillespie (1984a, 1984b, 1986a, 1986b, 1986c, 1989) conducted an in-depth exploration of the factors influencing rate variation in the 1980s. He also proposed a model where substitution rates evolved in a correlated manner, with descendant lineages inheriting the rates from their ancestral lineage. These rates could subsequently change throughout the evolution of newly formed descendant lineages (Gillespie 1991).
The discussion regarding the adequacy of a single-rate Poisson process continued in the 1990s (e.g. Takahata 1991; Ohta 1995; Nielsen 1997). Goldman (1994) assessed various estimators of the dispersion index (Kimura 1983; Bulmer 1989) and concluded that it might be an inadequate statistic to test the constancy of evolutionary rates. Cutler (2000) pointed out that variation in rates over time might be overestimated if the variance in branch length estimates is underestimated, especially if the independence of sites is violated. While the precise model of the evolution of rates was still unknown, the usefulness of the clock as a means of establishing the ages of lineage divergence fomented the development of methodological tools for molecular dating (Fitch 1976; Wu and Li 1985; Felsenstein 1987; Muse and Weir 1992; Tajima 1993; Takezaki et al. 1995; Rambaut and Bromham 1998). In general, most data sets analyzed exhibited some level of rate heterogeneity, limiting their applicability for inferring divergence times. A potential solution was the removal of lineages that violated rate constancy from the data, which had the drawback of information loss.
Molecular Dating without Rate Constancy
Driven by the difficulty in devising a mechanistic model of rate evolution, several methodological approaches were developed to estimate divergence times while accommodating evolutionary rate variation between lineages during the 1990s (Kishino and Hasegawa 1990; Rambaut and Bromham 1998). These methods avoided the assumption of an explicit rate evolution model either by smoothing rate change between branches or by a priori definition of local molecular clocks. They constitute the first attempts to relax the strict molecular clock. It is important to note that relaxing the strict clock assumption is essential not only for accurately estimating divergence times between lineages but also for elucidating the evolutionary trajectory of substitution rates. This understanding is crucial for addressing various evolutionary questions such as convergent rate changes in distinct genomic regions (e.g. Clarck et al. 2012; Hu et al. 2019), correlations between molecular rates and phenotypic traits (e.g. Lartillot and Poujol 2011; Partha et al. 2019), and genomic evolution (e.g. Wolf et al. 2013; Mello and Schrago 2019).
The adoption of local molecular clocks is a straightforward strategy to estimate node ages under rate heterogeneity. Instead of requiring that all branches evolve under a single rate, some predefined branches are allowed to have different substitution rates, which can be inferred from data (Fig. 1b). This approach was pioneered by Kishino and Hasegawa (1990), and later, a maximum likelihood (ML) method was developed for a case of two calibrated sister lineages with independent rates (Rambaut and Bromham 1998). Further extensions of the ML implementation, allowing for phylogenetic trees of any size, were proposed (Yoder and Yang 2000; Yang and Yoder 2003). However, all these methods required a priori assignment of the branches to local clocks. Efforts to mitigate this drawback were put forward, allowing allocation of branches into rate categories using some statistical criteria (Yang 2004; Aris-Brosou 2007).
Around this time, it became evident that relaxing the rate constancy constraint of the original molecular clock was imperative. To achieve this, the evolution of substitution rates across branches had to be modeled. However, this presented a significant challenge because it required the development of complex, parameter-rich models that raised issues of statistical identifiability (Sanderson 1997; Thorne et al. 1998). As the number of substitutions in a branch (branch lengths) is the product between the rate in that branch and its time duration (elapsed time), infinite combinations of rate and time values yield the same likelihood. Low rates with long elapsed times may produce the same branch length as high rates with short durations. This hindered the development of methods for estimating node ages assuming a continuous change in evolutionary rates.
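The confounding of rate and time can be illustrated with a toy calculation. The sketch below assumes, as a simplification, that the number of substitutions on a single branch is Poisson distributed with mean equal to rate times elapsed time; the function name and all numerical values are hypothetical.

```python
import math


def poisson_loglik(k, rate, time):
    """Log-probability of k substitutions when the expected count is rate * time."""
    mu = rate * time
    return k * math.log(mu) - mu - math.lgamma(k + 1)


# Two hypothetical scenarios with the same product rate * time = 10.
slow_long = poisson_loglik(12, rate=0.001, time=10000.0)
fast_short = poisson_loglik(12, rate=0.01, time=1000.0)

print(slow_long, fast_short)
```

Because the likelihood depends on rate and time only through their product, both scenarios fit the data equally well; additional information, such as calibrations or a prior on rate change, is needed to separate them.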
Sanderson (1997) addressed this challenge by relying on a nonparametric rate smoothing (NPRS) strategy. The idea was to impose a constraint on how rates could vary across a phylogeny. This was achieved through the autocorrelation principle, where daughter branches inherit the evolutionary rate from their ancestor before evolving their own rates (Fig. 2a). Although nonparametric, the method implicitly assumes that rates evolve in a correlated manner, minimizing rate change across lineages. Later, Sanderson (2002) extended his approach into a semiparametric penalized likelihood (PL) framework (Green 1987), adopting a global penalization of rate changes across the phylogeny and determining the optimal level of rate smoothing by a cross-validation procedure. This method has been successful in establishing a penalty function that governs the rate variation from branch to branch. This avoided the identifiability problem of heavily parameterized models by forcing the rates to change smoothly along branches. PL was implemented in the r8s software (Sanderson 2003).
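The trade-off that a penalized likelihood makes can be sketched as follows. This is an illustration under stated assumptions, not the r8s implementation: substitution counts per branch are treated as Poisson, the roughness penalty is assumed to be the sum of squared rate differences between each branch and its parent branch, and the smoothing value is simply supplied rather than chosen by cross-validation. All names and numbers are hypothetical.

```python
import math


def penalized_loglik(subs, rates, durations, parent, smoothing):
    """Poisson log-likelihood of branch substitution counts minus a roughness penalty."""
    loglik = 0.0
    for k, r, t in zip(subs, rates, durations):
        mu = r * t
        loglik += k * math.log(mu) - mu - math.lgamma(k + 1)
    # Assumed penalty: squared rate change between each branch and its parent branch.
    penalty = sum(
        (rates[i] - rates[p]) ** 2 for i, p in enumerate(parent) if p is not None
    )
    return loglik - smoothing * penalty


# Hypothetical three-branch example: branch 0 is the parent of branches 1 and 2.
subs = [10, 12, 8]
durations = [5.0, 5.0, 5.0]
parent = [None, 0, 0]
smooth_rates = [2.0, 2.0, 2.0]   # clock-like rates
rough_rates = [2.0, 2.4, 1.6]    # rates that match each count exactly

for lam in (0.0, 100.0):
    print(lam,
          penalized_loglik(subs, smooth_rates, durations, parent, lam),
          penalized_loglik(subs, rough_rates, durations, parent, lam))
```

With no smoothing, the branch-specific rates score best; with strong smoothing, the clock-like rates are preferred, which is the behavior that the smoothing level controls.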
Fig. 2.
Models of rate evolution that assume continuous variation of substitution rates across branches: a) the autocorrelated lognormal relaxed clock and b) the uncorrelated lognormal relaxed clock. Autocorrelated rates were simulated following the method described by Thorne et al. (1998), resulting in a distribution of evolutionary rates shown as a mixture of lognormal distributions, which arises from the recursive iteration of the autocorrelated process. Uncorrelated rates were simulated by independently drawing from a lognormal distribution.
The Emergence of Bayesian Modeling of Rate Evolution
The use of explicit models of substitution rate evolution across tree branches effectively took off after the seminal work of Thorne et al. (1998). This fully parametric method for node age inference was implemented in a Bayesian framework which, at that period, was beginning to be explored in phylogenetics (Rannala and Yang 1996; Mau and Newton 1997; Yang and Rannala 1997). Thorne et al. (1998) recognized that modeling rate evolution under the ML framework posed serious challenges in terms of computational tractability. On the other hand, Bayesian inference, largely due to the Markov chain Monte Carlo (MCMC) algorithm, is computationally feasible and offers greater statistical flexibility. This statistical framework allows for the use of probability distributions as priors, enabling the explicit modeling of rate evolution and the branching process (node heights) in the phylogenetic tree (e.g. Lartillot and Philippe 2004; Blanquart and Lartillot 2008; Pagel and Meade 2008; Heath et al. 2014; Zhang et al. 2016). These distributions can take various shapes, such as normal, lognormal, exponential, gamma, and skew normal. For calibrating node ages, for instance, the Bayesian approach offers significant flexibility to incorporate uncertainties derived from the fossil record. This represented a great advantage, because under the nonparametric and semiparametric frameworks, calibrations were informed by minimum and/or maximum constraints.
To model rate evolution, Thorne et al. (1998) used a lognormal prior probability density. Also drawing inspiration from Gillespie’s concepts of rate correlation (Fig. 2a), the key element of their model was the autocorrelation coefficient (ν). This parameter was fixed and determined the degree of autocorrelation of evolutionary rates between branches; the lower the value, the more closely the model adheres to the strict molecular clock. The model is autocorrelated because the expected rate at the midpoint of the descendant branch equals the rate at the midpoint of the ancestor branch. To avoid negative rate values, the rate logarithm was used, making the probability distribution characteristically skewed to the right. The variance of this lognormal was set to be the product of the time elapsed between the ancestor-descendant branch midpoints and the autocorrelation coefficient, which served as a hyperparameter governing the lognormal prior. The ages of branching events were modeled using a Yule process prior (Yule 1925; Edwards 1970), enabling the estimation of relative node ages, as the incorporation of calibrations had not been integrated yet.
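A minimal simulation sketch of an autocorrelated lognormal rate change is given below. It assumes that the log of the descendant rate is normally distributed with variance ν times the elapsed time, and that the mean of the log is shifted by half that variance so that the expected descendant rate equals the ancestral rate, as described above. The function name, the parameter values, and the chain of five branches are hypothetical.

```python
import math
import random


def descendant_rate(ancestor_rate, elapsed, nu, rng):
    """Draw a descendant rate whose logarithm is normal around the ancestral log-rate."""
    var = nu * elapsed  # variance grows with elapsed time and with nu
    # Shifting the mean by -var / 2 keeps E[descendant rate] equal to ancestor_rate.
    mean_log = math.log(ancestor_rate) - var / 2.0
    return math.exp(rng.gauss(mean_log, math.sqrt(var)))


rng = random.Random(42)
rate = 0.01  # hypothetical rate at the oldest branch midpoint
for step in range(5):
    # Each new rate depends on the previous one: the process is autocorrelated.
    rate = descendant_rate(rate, elapsed=2.0, nu=0.1, rng=rng)
    print(step, rate)
```

Setting `nu` to zero makes every descendant rate equal to its ancestor, which recovers the strict clock.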
Modifications were subsequently implemented. Kishino et al. (2001) modeled rates at nodes rather than branch midpoints, thus ensuring that the correlation of rates between two sister branches remained independent of the time duration of the ancestor branch. In this context, the rate was assumed to evolve under a geometric Brownian motion process. The autocorrelation coefficient continued to play a pivotal role in their modified model. To enhance statistical flexibility, a Dirichlet prior was adopted for branching times instead of the Yule. Importantly, the inclusion of time constraints (minimum and maximum node ages) now enabled the computation of absolute times, thereby facilitating the calibration of evolutionary rates. This method was implemented in Multidivtime, the first Bayesian relaxed clock software (Thorne et al. 1998; Kishino et al. 2001).
The use of a lognormal distribution to model rate change in a phylogeny to relax the molecular clock assumption gained rapid popularity and was readily applied in taxonomically diverse lineages (e.g. Cao et al. 2000; Korber et al. 2000; Nikaido et al. 2000; Douady and Douzery 2003; Schrago and Russo 2003; Springer et al. 2003; Wiegmann et al. 2003; Delsuc et al. 2004; Bell et al. 2005; Opazo 2005). This approach was soon extended to accommodate multigene data sets, providing the flexibility to use either independent ν values for each gene or a single ν value for all loci (Thorne and Kishino 2002). This reflected the growing availability of molecular data and the need for genomic-aware methodologies. The method also allowed for independent rate trajectories among genes, accommodating both gene and residual effects.
At about the same period, Huelsenbeck et al. (2000) introduced a Bayesian approach that considered autocorrelation while employing a distinct rationale to model the evolution of rates. Motivated by the early ideas that substitution rates follow a Poisson process, they proposed that rate switches across the phylogeny should also be Poisson distributed, characterizing a compound Poisson process (CPP). Once changed, the new rate was the product of the ancestral rate and a gamma-distributed multiplier. Therefore, ancestor-descendant autocorrelation was applied. Interestingly, rate switches could occur along the branch instead of being limited to take place at nodes or midpoints. The CPP model was implemented in MrBayes (Ronquist et al. 2012), utilizing a lognormal distribution for the rate multipliers instead of a gamma. It was also implemented in TreeTime (Himmelmann and Metzler 2009).
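The compound Poisson idea can be sketched for a single branch as follows. The sketch assumes exponentially distributed waiting times between rate switches (which yields a Poisson number of switches) and a gamma multiplier with mean 1; the mean-1 parameterization, the function name, and all values are assumptions made for illustration.

```python
import random


def cpp_branch(start_rate, duration, switch_rate, shape, rng):
    """Simulate rate switches along one branch under a compound Poisson process.

    Returns (expected substitutions along the branch, rate at the end of the
    branch, number of rate switches).
    """
    t, rate, length, n_switches = 0.0, start_rate, 0.0, 0
    while True:
        wait = rng.expovariate(switch_rate)  # time until the next rate switch
        if t + wait >= duration:
            length += rate * (duration - t)
            return length, rate, n_switches
        length += rate * wait
        t += wait
        # Gamma multiplier with mean 1 (shape k, scale 1 / k); an assumption.
        rate *= rng.gammavariate(shape, 1.0 / shape)
        n_switches += 1


rng = random.Random(2024)
print(cpp_branch(start_rate=0.01, duration=10.0, switch_rate=0.3, shape=2.0, rng=rng))
```

Because each new rate is a multiple of the previous one, the rate at the end of the branch remains correlated with the rate at its start, and switches can fall anywhere along the branch.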
The emergence of molecular dating methods that model rate evolution in a Bayesian framework enabled a more efficient use of molecular sequence information for inferring absolute times. In comparison to previous nonparametric and semiparametric strategies, the novel Bayesian methods held the advantage of obtaining uncertainties of divergence time estimates using measures of dispersion of their posterior distributions (credibility intervals). This feature was very appealing to empirical biologists. In non-Bayesian approaches, uncertainties of time estimates are attained by confidence intervals (CIs), which have ambiguous interpretation besides being difficult to compute. For instance, in the PL framework, CIs are estimated via the bootstrap procedure, which can produce overly narrow intervals, particularly in the context of phylogenomic data (Barba-Montoya et al. 2021; Costa et al. 2022).
In the Bayesian framework, however, the properties of credibility intervals have been extensively examined (Yang and Rannala 2006; Rannala and Yang 2007; Zhu et al. 2015; Angelis et al. 2018). The Bayesian credibility intervals will not approach zero even when the number of sampled sites approaches statistical infinity, which is generally the case for phylogenomic data. This is because molecular sequences only provide information on the number of site-wise substitutions, without informing on rates and elapsed times independently. Therefore, uncertainties are recovered in the Bayesian posterior distributions of divergence times.
Due to the array of advantages offered by the Bayesian framework, it has become the predominant approach for estimating rates and times, leading to a surge in the proposal of new methods (Table 1). One drawback of the Bayesian framework is the impact of prior distribution choice on divergence time estimates (e.g. Battistuzzi et al. 2010; Christin et al. 2014; dos Reis et al. 2014, 2015; Foster et al. 2017; Pacheco et al. 2018). Therefore, while Bayesian molecular dating is a valuable tool for inferring timescales and evolutionary rates, estimates should be interpreted critically and should be ideally associated with hypothesis testing and prior distribution scrutiny (Bromham et al. 2018; Bromham 2019).
Table 1.
A summary of the most widely used Bayesian dating programs that relax the molecular clock assumption, detailing their available models of rate evolution and their capability to perform node and/or tip-dating under a relaxed clock framework.
| Software | Model of rate evolution | Node dating | Tip dating | References |
|---|---|---|---|---|
| BactDating | UR | … | √ | Didelot et al. 2018 |
| BEAST | AR; UR; RLC | √ | √ | Suchard et al. 2018 |
| BEAST2 | UR; RLC | √ | √ | Bouckaert et al. 2019 |
| Coevol | AR + UR | √ | √a | Lartillot and Poujol 2011 |
| DPPDiv/PLL-DPPDiv | DMC | √ | … | Flouri and Stamatakis 2012; Heath et al. 2012 |
| MCMCTree | AR; UR | √ | √ | Yang 1997, 2007 |
| MrBayes | UR | √ | √a | Huelsenbeck and Ronquist 2001; Ronquist and Huelsenbeck 2003; Ronquist et al. 2012 |
| Multidivtime | AR | √ | … | Thorne et al. 1998; Kishino et al. 2001 |
| PhyloBayes | AR; UR | √ | … | Lartillot and Philippe 2004; Lartillot and Philippe 2006; Lartillot et al. 2007 |
| PhyTime | AR | √ | … | Guindon 2010 |
| RevBayes | AR; UR | √ | √a | Hohna et al. 2016 |
UR, uncorrelated (independent) rates; AR, autocorrelated rates; RLC, random local clock; DMC, discrete multirate clock.
aThese models were primarily developed for tip-dating using fossil data or ancient biological samples, rather than for pathogen data.
Development of Uncorrelated Rate Models: BEAST and MCMCTree
Although autocorrelation of rates is likely to occur when the main component of rate variation is lineage effects, it is difficult to differentiate between these effects from stochastic events over short timescales. In contrast, when comparing distantly related lineages, it is biologically expected that the influence of the inherited determinants on rates will diminish (Drummond et al. 2006). Therefore, ideally, rate evolution models should be flexible enough not to assume between-lineage rate autocorrelation a priori. If rates are indeed autocorrelated, this would be evidenced in the posterior distributions of model parameters.
The first model to eliminate the correlation constraint was the uncorrelated model, initially implemented in BEAST (Drummond et al. 2006; Drummond and Rambaut 2007). In this model, the rate of each branch is an independent draw from a lognormal distribution (Fig. 2b). Drummond et al. (2006) also proposed that the tree topology should be estimated concomitantly with evolutionary rates and divergence times, in a general “relaxed phylogenetics” Bayesian framework. Simulations indicated that the uncorrelated lognormal model of rate evolution was able to estimate rates accurately under several evolutionary scenarios. The authors also developed an uncorrelated exponential model, where an exponential prior distribution on rates was used instead of the lognormal. BEAST was the first software to allow phylogenetic inference while decomposing branch lengths into rates and times.
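For contrast with autocorrelated models, an uncorrelated lognormal clock can be sketched as independent draws, one per branch. The parameterization by the mean rate and the log-scale standard deviation, the function name, and the values are assumptions of this sketch and do not reproduce any particular software.

```python
import math
import random


def uncorrelated_rates(n_branches, mean_rate, sigma, rng):
    """Draw one rate per branch, independently, from a lognormal distribution."""
    # Choose mu so that the lognormal has the requested mean on the natural scale.
    mu = math.log(mean_rate) - sigma ** 2 / 2.0
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


rng = random.Random(11)
print(uncorrelated_rates(n_branches=6, mean_rate=0.01, sigma=0.5, rng=rng))
```

No branch rate depends on the rate of its ancestor, so tree structure plays no role in the draw.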
Rannala and Yang (2007) also proposed an uncorrelated rate evolution model that was implemented in the MCMCTree software (Yang 2007). Unlike BEAST (Drummond et al. 2006), their model used a nondiscretized lognormal distribution as the rate prior. This attenuated a bias for uncorrelated rates that existed in the approach of Drummond et al. (2006). Importantly, in both MCMCTree (Yang and Rannala 2006) and BEAST (Drummond et al. 2006) implementations, it was possible to use probability distributions to calibrate divergence times, instead of relying on hard bounds (Thorne et al. 1998; Kishino et al. 2001). Soft bounds offer greater flexibility to accommodate the inherent uncertainty from the fossil record. Moreover, in BEAST and MCMCTree, the likelihood in each MCMC cycle was fully calculated, in contrast to the multivariate normal approximation used by Thorne et al. (1998). While exact likelihood calculation is statistically advantageous, it imposes a heavy computational burden. Therefore, the approximate likelihood calculation was subsequently integrated into MCMCTree, allowing for faster computation of divergence times (dos Reis and Yang 2011).
For modeling the distribution of node heights along the tree, MCMCTree utilized a birth–death (BD) process, enabling the adjustment of BD parameter values and, consequently, accommodating distinct tree shapes with unevenly spaced branch time lengths. BEAST initially adopted a Yule process as the tree prior for dating species divergences, but later incorporated the BD process (Drummond et al. 2012). Additionally, for population-level data, coalescent priors could be employed (e.g. Pybus and Rambaut 2002; Drummond 2005). The use of soft bounds and the BD process represents advancements in comparison to the methods of Thorne et al. (1998) and Kishino et al. (2001). Over the past decade, significant effort has been invested in developing more realistic tree priors for Bayesian methods (e.g. Heath et al. 2014).
Further Continuous Models within the Bayesian Framework
The theoretical flexibility of the Bayesian relaxed clock fomented the investigation of several models of rate evolution (Fig. 3). Research on these models was driven by the need for greater biological realism, improved statistical properties, or both. Most of these models incorporated the autocorrelation principle evoked by earlier studies. For example, during the initial stages of Bayesian molecular dating, researchers explored the gamma and exponential models of autocorrelated rate evolution (Aris-Brosou and Yang 2002, 2003), along with the Ornstein–Uhlenbeck (OU) process (Aris-Brosou and Yang 2002). In the OU process implemented in PhyBayes (Aris-Brosou and Yang 2002), the rate of a branch is normally distributed, with its mean influenced by the rate of the ancestral branch.
Fig. 3.
Diversity of current molecular dating software depicted according to their assumptions regarding the variation of rates along the phylogenetic tree. Asterisks indicate approaches that use a combination of rate evolution models. Methods marked in light gray indicate approaches that were originally developed with a focus on serially sampled/pathogen data. ME, mixed effects clock (Bletsa et al. 2019); FLC, flexible local clock (Fourment and Darling 2018); S-RLC, shrinkage-based random local clock (Fisher et al. 2023); RLC, random local clock (Drummond and Suchard 2010); AURC, additive uncorrelated relaxed clock (Didelot et al. 2021); WN, white-noise process (Lepage et al. 2007); MRC, mixed relaxed clock (Lartillot et al. 2016); CIR, Cox–Ingersoll–Ross process (Lepage et al. 2007); CPP, compound Poisson process (Huelsenbeck et al. 2000).
In the 2000s, it was argued that Thorne and collaborators’ strategy led to a nonstationary process of rate evolution along a phylogeny, because rate variance would increase linearly with time (Lepage et al. 2007). Alternatively, Lepage et al. (2007) employed a Cox–Ingersoll–Ross (CIR) process (Cox et al. 1985; Lepage et al. 2007) to ensure stationarity of rates and model the evolution of rates in a correlated manner. The same authors also proposed an uncorrelated rates model, known as the white-noise (WN) process, which is a special case of an uncorrelated gamma rate evolution model (Drummond et al. 2006) with a gamma distribution used to model the evolution of rates (Lepage et al. 2007). These rate evolution models were integrated into the PhyloBayes software (Lartillot and Philippe 2004; Lepage et al. 2007; Lartillot et al. 2009), where they can be used alongside complex nucleotide substitution models (Lartillot and Philippe 2004).
In most implementations of relaxed molecular clocks, to calculate the rate of a given branch, the arithmetic average of the rate values at the starting and ending nodes is used. Therefore, although the evolution of rates along the tree is probabilistic, branch rates are treated deterministically. To alleviate this issue, Guindon (2013) assumed that the average substitution rate along branches is governed by a bridged geometric Brownian process, where the expected value is calculated given the fixed rates at the nodes. This strategy was implemented in PhyTime (Guindon 2013).
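The deterministic branch-rate approximation described above amounts to a one-line calculation. The sketch below uses a hypothetical function name and hypothetical node rates; it does not implement the bridged process of Guindon (2013).

```python
def branch_length(rate_start, rate_end, duration):
    """Expected substitutions on a branch using the mean of its two node rates."""
    mean_rate = (rate_start + rate_end) / 2.0  # arithmetic average of node rates
    return mean_rate * duration


# Hypothetical node rates (substitutions per site per Myr) and duration (Myr).
print(branch_length(rate_start=0.002, rate_end=0.004, duration=10.0))
```

Once the two node rates are fixed, the branch rate carries no further randomness, which is the simplification the bridged approach is meant to relax.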
The flexibility of Bayesian inference also allowed the proposition of a mixed model of rate evolution, which aims to account for the various factors generating rate variation (Lartillot et al. 2016). In this model, Brownian motion is used to describe long-term rate variations, while the WN process is used to model short-term rate fluctuations. In this case, long-term changes in life history traits, such as generation times, and short-term rate variations would be accommodated concomitantly. This rate evolution model can accommodate both correlated and uncorrelated changes in rate evolution and avoids choosing between these two processes. Additionally, this method provides a quantitative measure of the extent to which rate variation in the tree results from an autocorrelated process. It also facilitates the identification of rate fluctuations that are not due to autocorrelation. This approach was implemented in the Coevol program (Lartillot and Poujol 2011; Lartillot and Delsuc 2012; Lartillot 2013).
Discrete Models of Rate Evolution within the Bayesian Framework
Unlike models where rate evolution across lineages follows some continuous distribution (such as the lognormal or exponential), the number of rate classes in a phylogeny can be limited according to some criteria, akin to local clocks. These categories, representing discrete models of rate change, allow only a finite number of rate values (Fig. 1). During the 2010s, several such discrete models of rate evolution emerged within a Bayesian framework. The pioneering approach was the Bayesian random local clock (RLC) (Drummond and Suchard 2010). In this model, each branch is a candidate location for a rate switch between local clocks. Upon detecting a rate switch, subsequent daughter lineages inherit the new rate. Whereas continuous Bayesian relaxed clock methods favor a smooth evolution of the rates throughout the tree, the RLC promotes fewer, more distinct rate changes.
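The inheritance rule of a local clock model can be sketched as below. The sketch takes the switch indicators and rate multipliers as given inputs, whereas in a Bayesian analysis they would be sampled; the tree encoding (a list of parent branch indices, with parents listed before their descendants), the function name, and the values are hypothetical.

```python
def rlc_branch_rates(parent, switch, multiplier, root_rate=1.0):
    """Assign branch rates when only some branches carry a rate switch.

    parent[i] is the index of the parent branch of branch i, or None for a
    branch attached to the root. Parents must precede their descendants.
    """
    rates = []
    for i, p in enumerate(parent):
        inherited = root_rate if p is None else rates[p]
        # A switch rescales the inherited rate; otherwise the rate is passed on.
        rates.append(inherited * multiplier[i] if switch[i] else inherited)
    return rates


# Hypothetical six-branch tree with a single switch on branch 2.
parent = [None, None, 0, 0, 2, 2]
switch = [False, False, True, False, False, False]
multiplier = [1.0, 1.0, 2.5, 1.0, 1.0, 1.0]
print(rlc_branch_rates(parent, switch, multiplier))
```

Branches 4 and 5 descend from branch 2 and therefore share its new rate, giving two local clocks on the tree.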
Instead of randomly evaluating local clock classes, Heath et al. (2012) proposed that branch rates could be fitted within a finite number of categories according to a Dirichlet process prior (DPP). Therefore, each branch is assigned to one of the designated DPP categories, permitting branches that are not in ancestor-descendant relationships to be grouped within the same class (Fig. 1c). Unlike the RLC, branches are clustered regardless of their position on the tree. The DPP model was shown to provide more precise rate and time estimates in simulations. It was first implemented in the DPPDiv program (Heath et al. 2012) and was later optimized to a more efficient version (Flouri and Stamatakis 2012).
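The clustering behavior of a Dirichlet process prior can be illustrated with its Chinese restaurant process representation. The sketch below only draws a partition from the prior; it is not the DPPDiv sampler, and the names `crp_assign`, `alpha`, and the gamma base distribution used for category rates are assumptions made for illustration.

```python
import random


def crp_assign(n_branches, alpha, rng):
    """Draw a partition of branches into rate categories from a Chinese
    restaurant process with concentration parameter alpha."""
    assignment = []  # category index of each branch
    counts = []      # number of branches already placed in each category
    for _ in range(n_branches):
        # An existing category k is chosen with weight counts[k];
        # a new category is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignment.append(k)
    return assignment


rng = random.Random(1)
assignment = crp_assign(n_branches=10, alpha=1.0, rng=rng)
# One rate per category, drawn here from a gamma base distribution (assumed).
category_rates = [rng.gammavariate(2.0, 0.5) for _ in range(max(assignment) + 1)]
branch_rates = [category_rates[k] for k in assignment]
```

Because the assignment ignores tree position, two branches in unrelated clades can end up in the same category, which is the contrast with the RLC drawn in the text.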
Further advances in discrete models were made to deal with large data sets. A shrinkage-based approach was developed based on the RLC (shrinkage-based random local clocks [S-RLCs]), saving a significant amount of computational time (Fisher et al. 2023). The autocorrelated shrinkage clock is modeled with a basal autocorrelated rate framework and considers that sudden speed-ups and slowdowns in rates are rare but may eventually occur, resulting in local clocks. Rate variation between ancestral and descendant rates is minimized using a Bayesian bridge prior (Polson et al. 2014) that reduces the number of rate switches, making it useful to determine the number and location of local clocks.
Finally, some discrete rate evolution models still adopted some form of the early a priori definition of rate classes. For instance, in the flexible local clocks (FLCs) (Fourment and Darling 2018), entire clades may share the same rate of evolution, while others may have rates evolving according to a different model. The researcher needs to specify each rate evolution model for the several local clocks. FLC is available in a BEAST2 package (Bouckaert et al. 2019). Similarly, the mixed effects (ME) molecular clock (Vrancken et al. 2014; Bletsa et al. 2019), also implemented in BEAST, assumes that rates vary among branches in addition to clade or lineage-specific effects on the rate. Therefore, this model allows for multiple uncorrelated relaxed clock models in a single tree. It also requires a priori specification of the local clocks.
Organism-Specific Models: Viruses and Bacteria
Rate evolution models were proposed with a focus on fast-evolving pathogens, which typically involves a strategy known as tip dating. In this approach, the dates associated with the tips of a phylogenetic tree, usually the collection times of the sampled pathogen sequences, are used to directly calibrate the tree (i.e. the sampling times of the sequences are used as constraints). In viral evolution, rate evolution modeling is mostly performed using uncorrelated rate models or local clocks. Due to the short timescales involved, changes in life history traits such as generation times—which often justify the use of autocorrelated clocks—do not usually apply (Worobey et al. 2014). Nonetheless, the use of uncorrelated molecular clocks can introduce biases in molecular dating of virus data sets, especially when rates systematically vary across different subtrees. In addition to the FLC and ME discrete rate evolution models, which have been used to analyze pathogen data, substantial advancements have been made in continuous rate evolution models. These models are now applied not only to viral data but also to dating divergences of metazoan lineages (Othman et al. 2022; Asadollahi et al. 2023; Kovacs et al. 2024).
Focusing on bacterial evolution, Didelot et al. (2018) designed a Bayesian method to estimate divergence times in bacterial populations that handles homologous recombination, which can severely impact molecular dating if unaccounted for. Unlike the usual approach of modeling rate evolution itself, the authors modeled branch lengths employing a gamma distribution. By doing this, the authors could isolate substitutions that were associated with the nonrecombinant fraction of the genome. The method was implemented in the program BactDating, which does not inherently assume strict or relaxed clock models; instead, it assigns equal prior weight to both. Then, a reversible jump MCMC computation (Green 1995) is used to propose movements between both clocks.
Later, Didelot et al. (2021) proposed an uncorrelated rate model considering an additive property between branches. This property denotes that the expected total number of substitutions across two neighboring branches should be distributed as the number of substitutions observed on a single branch of equivalent length. The authors proposed the use of a gamma distribution that accounts for the additive property of molecular data to model the mutation rates (Didelot et al. 2021). This additivity is especially important for dating pathogens because of the many short branches present in the tree due to intensive sampling of sequences within a short timeframe. This approach, called additive uncorrelated relaxed clock (AURC), is also implemented in BactDating (Didelot et al. 2018), as well as in treedater (Volz and Frost 2017) and BEAST2 (Bouckaert et al. 2019).
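The additive property can be checked numerically. The sketch below assumes a gamma distribution whose shape is proportional to branch duration and whose scale is shared across branches; this parameterization, and the names `mu`, `omega`, and `draw_substitutions`, are illustrative assumptions and not necessarily those used in BactDating.

```python
import random
import statistics


def draw_substitutions(duration, mu, omega, rng):
    """Gamma draw with shape proportional to branch duration and a shared
    scale, giving mean mu*duration and variance mu*duration*(1+omega)."""
    scale = 1.0 + omega
    return rng.gammavariate(mu * duration / scale, scale)


rng = random.Random(42)
mu, omega = 5.0, 1.0
n = 50_000

# Two neighboring branches of durations 0.3 and 0.7 ...
two_branches = [
    draw_substitutions(0.3, mu, omega, rng) + draw_substitutions(0.7, mu, omega, rng)
    for _ in range(n)
]
# ... compared with a single branch of the combined duration 1.0.
one_branch = [draw_substitutions(1.0, mu, omega, rng) for _ in range(n)]

mean_two, mean_one = statistics.fmean(two_branches), statistics.fmean(one_branch)
var_two, var_one = statistics.variance(two_branches), statistics.variance(one_branch)
```

Gamma variables with a common scale add their shapes, so both samples should share the same mean and variance up to Monte Carlo error. A lognormal or exponential branch-rate model does not have this closure under summation.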
Non-Bayesian Approaches
Since the proposal of the first Bayesian approach to model rate evolution (Thorne et al. 1998), it has been recognized that, although the Bayesian framework offers several advantages for molecular dating, it suffers from the serious drawback of requiring significant computational time (Sanderson et al. 2004; Rutschmann 2006; Ho and Duchêne 2014; dos Reis et al. 2016). This limitation prompted the development of non-Bayesian methods as alternatives to save computational time (e.g. Pérez-Losada et al. 2004; Ericson et al. 2006). Therefore, rate smoothing methods were refined and implemented in other software packages, such as the NPRS in DAMBE (Xia and Yang 2011; Xia 2018), and the PL in both treePL (Smith and O’Meara 2012) and ape (Paradis et al. 2004; Paradis and Schliep 2019) (Table 2). Further methods within the non-Bayesian framework were developed, employing distinct strategies to relax the strict molecular clock assumption. Owing to their rapid execution times, these methods are often classified as fast-dating approaches for estimating divergence times. A comparative evaluation of fast-dating methods is beyond the scope of this review; therefore, we focus on the main assumptions made in estimating rate variation (Fig. 3). For a thorough review about the performance and drawbacks of non-Bayesian methods, refer to Tao et al. (2020b).
Table 2.
A summary of the most widely used non-Bayesian molecular dating software packages that relax the molecular clock assumption, detailing the strategy used to accommodate rate variation and their capability to perform node and/or tip dating under a relaxed clock framework.
| Software | Model of rate evolution | Node dating | Tip dating | Reference |
|---|---|---|---|---|
| ape (chronos) | AR; UR; DMC | √ | … | Paradis and Schliep 2019 |
| DAMBE | LC; AR | √ | √ | Xia and Yang 2011; Xia 2018 |
| LSDᵃ | UR | √ | √ | To et al. 2016 |
| MEGA (RelTime and RTDT) | AR | √ | √ | Tamura et al. 2012, 2018b; Miura et al. 2020; Tamura et al. 2021 |
| PATHd8 | AR | √ | … | Britton et al. 2007 |
| Physher | LC; DMC | √ | √ | Fourment and Holmes 2014 |
| r8s | AR | √ | √ | Sanderson 2003 |
| treedater | UR | … | √ | Volz and Frost 2017 |
| treePL | AR | √ | … | Smith and O’Meara 2012 |
| TreeTime | AR | … | √ | Sagulenko et al. 2018 |
| (w)LogDate | UR | √ | √ | Mai and Mirarab 2021 |
UR, uncorrelated (independent) rates; AR, autocorrelated rates; LC, local clock; DMC, discrete multirate clock.
ᵃ The least-squares dating (LSD) method is also implemented in the IQ-TREE software.
Regarding rate evolution, an approach based on mean path lengths (MPLs) was developed, which calculates the average distance from a node to all its leaves (Britton et al. 2002), and was implemented in the PATHd8 software (Britton et al. 2007). In this method, rate variation is smoothed locally by averaging over all the path lengths descending from a node, implicitly assuming that substitution rates do not vary significantly. This contrasts with PL and Bayesian autocorrelated models, where the smoothing occurs between ancestral and descendant branches. The performance of PATHd8 diminishes as rate variation increases, and studies have demonstrated that, although much faster, it performs worse compared to PL (Britton et al. 2007; Smith and O’Meara 2012).
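The quantity at the core of this approach is simple to compute. The following sketch calculates the mean path length from a node to all of its descendant leaves on a toy tree; the tree encoding and the function name `path_length_sum` are hypothetical and are not taken from PATHd8.

```python
def path_length_sum(children, branch_length, node):
    """Return (sum of path lengths to all leaves, number of leaves) below
    `node`; the mean path length (MPL) is the first value over the second."""
    kids = children.get(node, [])
    if not kids:
        return 0.0, 1
    total, leaves = 0.0, 0
    for child in kids:
        sub_total, sub_leaves = path_length_sum(children, branch_length, child)
        # Every leaf under `child` is reached through the branch above `child`.
        total += sub_total + branch_length[child] * sub_leaves
        leaves += sub_leaves
    return total, leaves


# Toy tree ((A,B)X,C)R with branch lengths in substitutions per site.
children = {"R": ["X", "C"], "X": ["A", "B"]}
branch_length = {"X": 0.1, "A": 0.2, "B": 0.4, "C": 0.6}

total_r, leaves_r = path_length_sum(children, branch_length, "R")
total_x, leaves_x = path_length_sum(children, branch_length, "X")
mpl_root = total_r / leaves_r
mpl_x = total_x / leaves_x
```

Under rate constancy, the MPL of a node is expected to be proportional to its age, so the ratio `mpl_x / mpl_root` serves as a relative age for X in this toy example.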
Building on the original PL approach, Paradis (2013) developed a method where rates evolve independently, relaxing the autocorrelation assumption. This method was implemented in R's ape package through the chronos function (Paradis et al. 2004; Paradis and Schliep 2019). Additionally, PL was improved to deal with large phylogenies in the software treePL, reducing the probability of becoming trapped in local optima (Smith and O’Meara 2012). This implementation has gained considerable popularity for estimating divergence times without the computational burden of Bayesian methods, making treePL one of the most widely used non-Bayesian options (e.g. Lu et al. 2018; Rabosky et al. 2018; Smith and Brown 2018; Li et al. 2019; Zhang et al. 2020; Kawahara et al. 2023).
RelTime is another widely used method, implemented in the popular MEGA software (Tamura et al. 2012, 2018a, 2021). It offers an algebraic solution to minimize rate differences between ancestral and descendant lineages, thereby accommodating rate variation through the calculation of relative rates in a correlated manner (Tamura et al. 2018a). Unlike PL, which employs a global penalty function to regulate rate changes across the entire phylogeny, RelTime minimizes differences in substitution rates recursively (Tao et al. 2020b). The relative rate framework of RelTime was further developed to accommodate serially sampled sequences in the RTDT method, also available in MEGA (Miura et al. 2020). RelTime has been employed for molecular dating across a broad spectrum of biological lineages (Lewin et al. 2016; Irisarri et al. 2018; Fan et al. 2022; Cepeda et al. 2024; Pi et al. 2024; Surizon et al. 2024).
Both treePL and RelTime complete calculations up to orders of magnitude faster than Bayesian methods (Tamura et al. 2012; Tao et al. 2020b). Based on empirical evaluation and simulated data, they provide reliable estimates of divergence times (Smith and O’Meara 2012; Mello et al. 2017, 2021; Barba-Montoya et al. 2021; Costa et al. 2022). RelTime has particularly demonstrated good performance in estimating evolutionary rates under distinct rate evolution scenarios, even when there is rate acceleration in specific clades within a phylogeny (Barba-Montoya et al. 2021). However, the accuracy of non-Bayesian methods may be compromised by extremely short branches. In contrast, Bayesian models can assign longer elapsed times to such branches, facilitated by the use of tree prior models such as the coalescent (Mello et al. 2021).
Various fast-dating methods have an emphasis on tip dating, particularly fast-evolving pathogens. Due to the rapid growth of genomic data of viruses and bacteria, the development of tip dating approaches that accommodate rate variation has flourished over the last decade. Consequently, there has been a surge in studies adopting non-Bayesian methods aimed at enhancing genomic surveillance of pathogens (e.g. Weill et al. 2017; Sánchez-Busó et al. 2019; Avanzato et al. 2020; Naveca et al. 2021; Da Silva et al. 2022; Jones et al. 2022; Viana et al. 2022; Kandeil et al. 2023; Serrano-Fujarte et al. 2024).
For instance, discrete multirate clocks and local clocks were developed and implemented in Physher (Fourment and Holmes 2014), which uses both greedy and genetic algorithms to find the number and location of the local clocks as well as the number and allocations of rate categories. On the other hand, branch-specific rate variation can be accommodated by assuming autocorrelation or independence of rates. RTDT is based on the relative rate framework and, therefore, assumes autocorrelation (Miura et al. 2020). TreeTime is an ML method that allows for controlled rate variation using a normal prior, enabling rates to exhibit either an autocorrelated or uncorrelated pattern (Sagulenko et al. 2018). Treedater combines ML and least squares criteria to allow rates to evolve independently across branches (Volz and Frost 2017). Least-squares dating (LSD) is also centered on the principle of uncorrelated rates (To et al. 2016). It is based on the Langley and Fitch (1974) approach (LF), which assumes a strict clock and uses a Poisson distribution to model the number of substitutions along each branch of the phylogeny; subsequently, a single global evolutionary rate is estimated by ML. Unlike LF, LSD employs a least squares approach to estimate rates and assumes that the noise introduced in rate variation is normally distributed. The method is also implemented in IQ-TREE software (Minh et al. 2020). Similarly inspired by LF, the (w)LogDate method (Mai and Mirarab 2021) minimizes the variance of rate using log transformation and allows uncorrelated rate variation without assuming a specific rate distribution. This is an improvement over previous LF-based methods because it avoids unrealistic assumptions such as negative rates, which can occur when using a Gaussian model, as in LSD (Mai and Mirarab 2021).
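The difference between a Poisson (LF-style) and a least-squares (LSD-style) criterion can be seen in a deliberately simplified setting where branch durations are treated as known. Real implementations estimate node dates and rates jointly, so the sketch below is only an illustration; the function names `lf_rate` and `ls_rate` and the toy values are hypothetical.

```python
def lf_rate(substitutions, durations):
    """ML estimate of a single global rate when substitution counts are
    Poisson with mean rate * duration (strict clock)."""
    return sum(substitutions) / sum(durations)


def ls_rate(substitutions, durations, variances=None):
    """Weighted least-squares estimate of the rate: minimizes
    sum((b_i - rate * t_i)**2 / v_i) over branches."""
    if variances is None:
        variances = [1.0] * len(durations)
    num = sum(b * t / v for b, t, v in zip(substitutions, durations, variances))
    den = sum(t * t / v for t, v in zip(durations, variances))
    return num / den


subs = [4, 9, 2, 15]          # substitutions per branch (toy values)
times = [2.0, 5.0, 1.0, 7.0]  # branch durations, assumed known here

rate_lf = lf_rate(subs, times)
rate_ls = ls_rate(subs, times)                                # equal weights
rate_ls_weighted = ls_rate(subs, times, variances=times)      # variance grows with duration
```

In this toy setting, weighting each branch by a variance proportional to its duration makes the least-squares estimate coincide with the Poisson ML estimate, whereas equal weights give a slightly different value. The Gaussian noise model places no constraint on sign, which is the source of the negative-rate issue mentioned above.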
Choosing between Rate Evolution Models
Rate modeling, like any complex system modeling, requires balancing model complexity with its applicability. Highly complex models often incur significant computational costs and are susceptible to issues such as high variance of the inferred parameters and challenges with their identifiability. Given the diverse range of rate evolution models developed over the years, it is reasonable to ask which model best describes the processes that generated the observed data. As a result, various statistical procedures for model comparison have been devised to address this question, particularly in the context of rate evolution models. Essentially, all tests of the molecular clock hypothesis involve comparisons of rate evolution models, with the assumption of rate constancy serving as the null model. The early tests of Langley and Fitch (1974) and likelihood ratio tests of rate constancy hypothesis (Felsenstein 1988; Muse and Weir 1992) are examples of model comparison within a frequentist statistical framework. It is worth mentioning that the concept of hypothesis testing for rate constancy using likelihood ratio tests was first introduced in Felsenstein's (1981) article, where he presented the application of ML for estimating phylogenetic trees. However, the test itself was neither implemented nor applied in that study.
In a Bayesian framework, most model comparisons rely on calculating the marginal likelihood, which measures the probability of the observed data under a specific model. This quantity is the integral of the likelihood function over the entire parameter space, weighted by the prior distribution of the parameters. Models can be compared using statistical measures such as Bayes factors (BFs), which are calculated as the ratio of marginal likelihoods between models (Jeffreys 1935). A higher BF indicates stronger evidence in favor of one model over another.
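In symbols, and using notation introduced here only for illustration (data $D$, model $M$, parameters $\theta$), the two quantities described above can be written as:

```latex
p(D \mid M) = \int p(D \mid \theta, M)\, p(\theta \mid M)\, \mathrm{d}\theta,
\qquad
\mathrm{BF}_{12} = \frac{p(D \mid M_1)}{p(D \mid M_2)}
```

For a rate evolution model, $\theta$ would include the divergence times, the branch rates, and the hyperparameters of the rate prior, which is why the integral is high dimensional.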
However, calculating the marginal likelihood can be particularly challenging in multiparameter scenarios, which is the case for most rate evolution models. Therefore, approximations of the marginal likelihood are necessary (Oaks et al. 2019). In theory, one could consider using the posterior likelihoods obtained during the MCMC run and calculate the harmonic mean as an approximation of the marginal likelihood of the model (Newton and Raftery 1994; Kass and Raftery 1995). However, using samples from the MCMC run is problematic because these estimates are heavily influenced by the likelihood function, and parameter values with low probability are hardly sampled (Oaks et al. 2019). This leads to poor approximations of the marginal likelihood. Several techniques have been proposed to mitigate this issue, typically aiming to better balance the influence of the posterior and the prior, thereby enhancing the reliability of the marginal likelihood estimation for a model (Fourment et al. 2020).
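The harmonic mean estimator itself is straightforward to compute from the log-likelihoods sampled during an MCMC run, which partly explains its past popularity. The sketch below is a generic illustration (the function name `log_harmonic_mean` is hypothetical) and is shown to clarify the calculation, not to recommend it.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of likelihoods, computed from posterior
    samples of the log-likelihood: -log(mean(exp(-logL)))."""
    n = len(log_likelihoods)
    neg = [-x for x in log_likelihoods]
    m = max(neg)
    # Log-sum-exp keeps the sum stable for very negative log-likelihoods.
    log_sum = m + math.log(sum(math.exp(v - m) for v in neg))
    return -(log_sum - math.log(n))


# Two samples with likelihoods 1 and 0.5: harmonic mean is 2 / (1 + 2) = 2/3.
estimate = log_harmonic_mean([0.0, math.log(0.5)])
```

Because the estimator averages reciprocal likelihoods, it is dominated by the lowest likelihood values in the sample, which are precisely the states that the posterior rarely visits.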
To compare models of rate evolution, Lepage et al. (2007) used path sampling (also known as thermodynamic integration) to approximate BFs and concluded that autocorrelated models (CIR and lognormal) provided a better fit to the three data sets investigated than all uncorrelated models examined. Later, Baele et al. (2013), besides path sampling, also implemented the stepping-stone method to approximate marginal likelihoods and concluded that the autocorrelated lognormal model fit the data better than uncorrelated exponentially distributed rates. Another efficient method to approximate marginal likelihoods is the generalized stepping-stone sampling, which was used to test models of episodic (branch-specific) rate acceleration (Tay et al. 2023). The computational cost of these methods is usually very high; therefore, faster algorithms such as nested sampling have been proposed for phylogenetic problems, including the comparison between rate evolution models (Russel et al. 2019). These approximation methods are implemented in both BEAST and BEAST2 software (Suchard et al. 2018; Bouckaert et al. 2019).
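The final step of path sampling can be sketched as a numerical integration. The example assumes that a separate MCMC run has already been performed at each power β (the "power posterior") and that the mean log-likelihood of each run is available; the function names and all numerical values are made up for illustration.

```python
def path_sampling_log_ml(betas, mean_log_likelihoods):
    """Trapezoidal approximation of the thermodynamic integral
    log p(D) = integral over beta in [0, 1] of E_beta[log L],
    where E_beta is the expectation under the power posterior at beta."""
    pairs = sorted(zip(betas, mean_log_likelihoods))
    total = 0.0
    for (b0, e0), (b1, e1) in zip(pairs, pairs[1:]):
        total += (b1 - b0) * (e0 + e1) / 2.0
    return total


def log_bayes_factor(log_ml_1, log_ml_2):
    """Log BF of model 1 over model 2 from their log marginal likelihoods."""
    return log_ml_1 - log_ml_2


betas = [0.0, 0.5, 1.0]
# Hypothetical mean log-likelihoods at each beta for two clock models.
log_ml_autocorrelated = path_sampling_log_ml(betas, [-120.0, -100.0, -90.0])
log_ml_uncorrelated = path_sampling_log_ml(betas, [-125.0, -104.0, -93.0])
log_bf = log_bayes_factor(log_ml_autocorrelated, log_ml_uncorrelated)
```

In practice many more β values are used, often concentrated near zero where the expected log-likelihood changes fastest, and each one requires its own MCMC run. This is the source of the high computational cost noted above.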
Due to the technical challenges involved in approximating marginal likelihoods, alternative approaches have been explored to assess how well models fit data. In Bayesian inference, Bayesian model averaging (BMA) serves as an effective method for accounting for model uncertainty during parameter estimation. This technique weights parameter estimates across multiple models according to each model’s posterior probabilities. These probabilities are derived from the marginal likelihoods of models and are used for the computation of BFs, which evaluate the relative evidence provided by the models. Li and Drummond (2012) used BMA to show that, based on the mammalian genes analyzed, the lognormal uncorrelated model better described the data than the exponential. Baele et al. (2013) showed that BFs obtained via model averaging performed as well as BFs calculated by the path sampling and stepping-stone methods.
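When the clock model is itself sampled during the MCMC run, model-averaged quantities follow directly from the trace. The sketch below assumes that the sampler visits each model in proportion to its posterior probability; the trace format, the function names, and the toy values are hypothetical and do not reproduce any BEAST output.

```python
from collections import Counter


def model_posteriors(indicator_trace):
    """Posterior model probabilities estimated as visit frequencies."""
    counts = Counter(indicator_trace)
    n = len(indicator_trace)
    return {model: c / n for model, c in counts.items()}


def bayes_factor(post, m1, m2, prior_m1=0.5, prior_m2=0.5):
    """Posterior odds of m1 over m2 divided by the prior odds."""
    return (post[m1] / post[m2]) / (prior_m1 / prior_m2)


# Toy trace: clock model visited at each sampled state and the node age sampled there.
trace = [
    ("lognormal", 10.0),
    ("lognormal", 12.0),
    ("exponential", 16.0),
    ("lognormal", 14.0),
]
post = model_posteriors([model for model, _ in trace])
# Averaging over all states weights each model by its posterior probability.
averaged_age = sum(age for _, age in trace) / len(trace)
bf = bayes_factor(post, "lognormal", "exponential")
```

The model-averaged age never requires choosing a single clock model, and the BF is obtained without an explicit marginal likelihood calculation.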
Another intuitive way to investigate whether a model reasonably describes the data-generating process is to verify how well the model predicts new data. This can be carried out by splitting the data into training and test sets, as performed by several techniques of cross-validation. Another approach is posterior predictive analysis, in which new data are simulated using the posterior distribution of parameters. Then, some metric is employed to verify the extent to which the simulated data differ from the observed data (Lewis et al. 2014). In evaluating the adequacy of rate evolution models for taxonomically diverse data sets, Duchêne et al. (2015) used the posterior predictive approach and concluded that overparameterized models are generally more accurate. Duchêne et al. (2016) showed that cross-validation was capable of distinguishing between the strict and relaxed clocks.
Evaluating rate evolution models in a non-Bayesian framework is complicated by the fact that the phylogenetic likelihood is calculated for branch lengths instead of the rates and times independently. Rate evolution models are essentially implemented via prior density functions in the Bayes formula. Thus, the several probabilistic functions discussed above, such as the lognormal and exponential, are generally not explicitly formulated in non-Bayesian methods (but see Paradis 2013). However, model adequacy to empirical data can be investigated in a non-Bayesian setting. For instance, Tao et al. (2019) elaborated a statistic (CorrTest) that evaluates the probability that the data evolved under rate autocorrelation. Using simulated data from both autocorrelated and uncorrelated scenarios, they trained a logistic regression model with features that were extracted using the relative rates framework (Tamura et al. 2018a). This very fast method outperformed BFs calculated using the stepping-stone approximation of the marginal likelihood, especially in larger data sets, which are increasingly common in the phylogenomic era. They also found that rate autocorrelation was widespread in data sets from taxonomically diverse species compositions.
Perspectives on Modeling Rate Evolution
The development of models for evolutionary rate evolution has increasingly focused on relaxing the molecular clock. This represents a shift from the theoretical studies motivated by the neutral theory, which aimed at elucidating the long-term effects of population-level evolutionary factors on nucleotide and amino acid changes across species (Kimura 1983; Gillespie 1991). The emphasis on divergence time estimation has often relegated the estimation of rates per se to an auxiliary role, even though relaxing the clock fundamentally concerns rate evolution modeling. Here, we have seen that several models were developed, typically guided by specific assumptions—whether correlated or not—or mathematical convenience (e.g. number of parameters), provided they deliver accurate estimates of timescales. In philosophical parlance, models became more phenomenological (pattern oriented) than mechanistic (cause/process oriented) (White and Marshall 2019). This shift is likely a result of the high complexity of the underlying problem.
Advances in theoretical phylogenetics that incorporate population dynamics, such as the multispecies coalescent model (Degnan and Rosenberg 2009; Edwards et al. 2016), may enable rate evolution models across lineages to be linked to population-level parameters. It is worth noting that most relaxed clock methods assume that all genes share the same set of divergence times, a simplification that overlooks the variance in coalescent times among genomic segments. In fact, Thorne and Kishino (2002) recognized this limitation and foresaw the potential of incorporating population processes into molecular dating. Currently, several approaches allow divergence time inference considering that speciation times may differ from the coalescent times of genes (Degnan and Salter 2005; Liu and Pearl 2007; Liu et al. 2009, 2010; Heled and Drummond 2010). Consequently, inference of the elapsed time and evolutionary rate of a branch using multigene data sets should also take this process into account (Ogilvie et al. 2017; Tabatabaee et al. 2023).
Another important consideration is the incorporation of site-specific heterogeneity in rate trajectories along the phylogenetic tree. For example, Lee et al. (2015, 2016) demonstrated that applying different relaxed clocks to various substitution types can significantly impact divergence time estimates. This approach is biologically sound, as mutations arise from diverse sources that may vary over time. For instance, CpG dinucleotides have been shown to evolve more consistently with a molecular clock in mammals compared to non-CpG sites (e.g. Hwang and Green 2004; Kim et al. 2006; Peifer et al. 2008; Moorjani et al. 2016). As a result, different substitution types follow their own distinct rate histories. Although this method has been applied in some studies (Lee et al. 2015, 2016; Campbell et al. 2021), it has yet to be widely implemented in commonly used dating software. Adopting this approach more broadly could represent a significant advancement in addressing rate heterogeneity, leading not only to improved divergence time estimates but also to a deeper understanding of the processes driving genomic evolution.
Rate evolution modeling also faces the challenges associated with estimating branch lengths using models of nucleotide or amino acid substitutions. Significant efforts have been made to develop more realistic substitution models by incorporating mixture models that account for site and lineage heterogeneity, or both (Lartillot and Philippe 2004; Crotty et al. 2020). These models can be integrated with models of rate evolution to estimate divergence times, as seen in PhyloBayes. However, they come at the cost of computational intensity, even within a Bayesian framework. This problem is aggravated if nondichotomous topological relationships between species, such as introgression, are considered (Degnan 2018). Thus, the development of more realistic models relies on efficient algorithms for parametric inference. Despite their efficiency, these algorithms still require substantial computational time to reach stationarity in MCMC runs, which highlights an important concern regarding the carbon footprint of bioinformatics research (Grealey et al. 2022). Consequently, if faster methods offer comparable performance, they should be preferred due to their reduced environmental impact (Tao et al. 2020a; Kumar 2022).
Recent applications of several machine learning algorithms to evolutionary biology have opened up promising avenues for exploring models of rate evolution (Schrider and Kern 2018; Schrago and Mello 2020; Azouri et al. 2021). Most implementations in this field involve supervised learning, where each observation is associated with labels (response variables). These algorithms are trained to optimize model fit to the training data, often using complex mathematical structures such as neural network models (deep learning), which can complicate human interpretation of the models (James et al. 2017). While training these models is computationally intensive, their subsequent application is typically very fast (Tao et al. 2019). However, because the actual histories of rate evolution of observed data sets are unknown, model training generally requires simulated data. Unfortunately, the simulation of biological sequences may be overly simplistic (Trost et al. 2024), which is a concern that recent studies measuring germline mutation rates across various lineages might help mitigate (Bergeron et al. 2023). Additionally, sequence simulation can now be informed by empirical data itself during the training of generative adversarial networks (Smith and Hahn 2023). This approach, which has not yet been applied to rate evolution studies, allows for the estimation of evolutionary parameters, represented by the final state of the generator neural network.
Relaxed Bayesian models have become increasingly complex, requiring sophisticated algorithms for parametric inference. As genome sequencing became more affordable, these methods faced the challenge of handling the data deluge of the genomic era. This challenge was highlighted during the SARS-CoV-2 pandemic, when millions of virus genomes needed to be analyzed using these advanced tools (gisaid.org). Recent methodological developments that update parameter estimates as sequences are generated, rather than reestimating parameters de novo in an automated online phylogenetic approach, could be incorporated into evolutionary rate inference (Kramer et al. 2023; Truszkowski et al. 2023). Alternatively, the development and improvement of fast-dating methods should not cease, as they offer rapid means to retrieve rate and time estimates.
Models of rate evolution and the relaxation of the molecular clock have become major areas of research in evolutionary biology. The methodological flexibility provided by the Bayesian framework has significantly impacted this field, fueling a variety of models and strategies for molecular dating. Furthermore, the use of rate evolution models has expanded beyond estimating biological timescales to address diverse problems. Identifying genomic patterns in the pace of molecular evolution (Duchêne and Ho 2015; Mello and Schrago 2019), inferring macroevolutionary scenarios (Rabosky et al. 2013; Rabosky 2014; Uyeda and Harmon 2014), and detecting punctuated molecular evolution (Manceau et al. 2020) represent advancements that highlight the wide-ranging impact of these models within evolutionary biology.
Acknowledgments
The authors would like to thank both reviewers and the associate editor for their constructive comments on the previous versions of this manuscript.
Contributor Information
Beatriz Mello, Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, RJ 21941-617, Brazil.
Carlos G Schrago, Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, RJ 21941-617, Brazil.
Funding
B.M. is supported by Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ) (grants E-26/211.248/2019 and E-26/201.446/2022) and by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) (grant 311231/2022-5). C.G.S. is supported by CNPq (grants 409963/2023-2, 401725/2022-7, and 309165/2019-9).
Data Availability
No new data were generated or analyzed in support of this research.
Literature Cited
- Angelis K, Álvarez-Carretero S, Dos Reis M, Yang Z. An evaluation of different partitioning strategies for Bayesian estimation of species divergence times. Syst Biol. 2018:67(1):61–77. 10.1093/sysbio/syx061.
- Aris-Brosou S. Dating phylogenies with hybrid local molecular clocks. PLoS One. 2007:2(9):e879. 10.1371/journal.pone.0000879.
- Aris-Brosou S, Yang Z. Effects of models of rate evolution on estimation of divergence dates with special reference to the metazoan 18S ribosomal RNA phylogeny. Syst Biol. 2002:51(5):703–714. 10.1080/10635150290102375.
- Aris-Brosou S, Yang Z. Bayesian models of episodic evolution support a late Precambrian explosive diversification of the Metazoa. Mol Biol Evol. 2003:20(12):1947–1954. 10.1093/molbev/msg226.
- Asadollahi M, Boroumand H, Mohammadi S, Mercado-Salas NF, Ahmadzadeh F. Molecular and morphological evidence reveals the presence of the tadpole shrimp Lepidurus cf. couesii (crustacea: Branchiopoda) in Iran. Zool Anz. 2023:306:1–9. 10.1016/j.jcz.2023.06.009.
- Avanzato VA, Matson MJ, Seifert SN, Williamson PR, Anzick BN, Barbian SL, Judson K, Fischer SD, Martens ER, Bowden C, et al. Case study: prolonged infectious SARS-CoV-2 shedding from an asymptomatic immunocompromised individual with cancer. Cell. 2020:183(7):1901–1912.e9. 10.1016/j.cell.2020.10.049.
- Azouri D, Abadi S, Mansour Y, Mayrose I, Pupko T. Harnessing machine learning to guide phylogenetic-tree search algorithms. Nat Commun. 2021:12(1):1983. 10.1038/s41467-021-22073-8.
- Baele G, Li WLS, Drummond AJ, Suchard MA, Lemey P. Accurate model selection of relaxed molecular clocks in Bayesian phylogenetics. Mol Biol Evol. 2013:30(2):239–243. 10.1093/molbev/mss243.
- Barba-Montoya J, Tao Q, Kumar S. Assessing rapid relaxed-clock methods for phylogenomic dating. Genome Biol Evol. 2021:13(11):evab251. 10.1093/gbe/evab251.
- Battistuzzi FU, Filipski A, Hedges SB, Kumar S. Performance of relaxed-clock methods in estimating evolutionary divergence times and their credibility intervals. Mol Biol Evol. 2010:27(6):1289–1300. 10.1093/molbev/msq014.
- Bell CD, Soltis DE, Soltis PS. The age of the angiosperms: a molecular timescale without a clock. Evolution. 2005:59(6):1245–1258. 10.1111/j.0014-3820.2005.tb01775.x.
- Bergeron LA, Besenbacher S, Zheng J, Li P, Bertelsen MF, Quintard B, Hoffman JI, Li Z, St Leger J, Shao C, et al. Evolution of the germline mutation rate across vertebrates. Nature. 2023:615(7951):285–291. 10.1038/s41586-023-05752-y.
- Blanquart S, Lartillot N. A site- and time-heterogeneous model of amino acid replacement. Mol Biol Evol. 2008:25(5):842–858. 10.1093/molbev/msn018.
- Bletsa M, Suchard MA, Ji X, Gryseels S, Vrancken B, Baele G, Worobey M, Lemey P. Divergence dating using mixed effects clock modelling: an application to HIV-1. Virus Evol. 2019:5(2):vez036. 10.1093/ve/vez036.
- Bouckaert R, Vaughan TG, Barido-Sottani J, Duchêne S, Fourment M, Gavryushkina A, Heled J, Jones G, Kühnert D, De Maio N, et al. BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis. PLoS Comput Biol. 2019:15(4):e1006650. 10.1371/journal.pcbi.1006650. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Britten RJ. Rates of DNA sequence evolution differ between taxonomic groups. Science. 1986:231(4744):1393–1398. 10.1126/science.3082006. [DOI] [PubMed] [Google Scholar]
- Britton T, Anderson CL, Jacquet D, Lundqvist S, Bremer K. Estimating divergence times in large phylogenetic trees. Syst Biol. 2007:56(5):741–752. 10.1080/10635150701613783. [DOI] [PubMed] [Google Scholar]
- Britton T, Oxelman B, Vinnersten A, Bremer K. Phylogenetic dating with confidence intervals using mean path lengths. Mol Phylogenet Evol. 2002:24(1):58–65. 10.1016/s1055-7903(02)00268-3. [DOI] [PubMed] [Google Scholar]
- Bromham L. Six impossible things before breakfast: assumptions, models, and belief in molecular dating. Trends Ecol Evol. 2019:34(5):474–486. 10.1016/j.tree.2019.01.017. [DOI] [PubMed] [Google Scholar]
- Bromham L, Duchêne S, Hua X, Ritchie AM, Duchêne DA, Ho SYW. Bayesian molecular dating: opening up the black box. Biol Rev Camb Philos Soc. 2018:93(2):1165–1191. 10.1111/brv.12390. [DOI] [PubMed] [Google Scholar]
- Bulmer M. Estimating the variability of substitution rates. Genetics. 1989:123(3):615–619. 10.1093/genetics/123.3.615. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Campbell CR, Tiley GP, Poelstra JW, Hunnicutt KE, Larsen PA, Lee HJ, Thorne JL, dos Reis M, Yoder AD. Pedigree-based and phylogenetic methods support surprising patterns of mutation rate and spectrum in the gray mouse lemur. Heredity (Edinb). 2021:127(2):233–244. 10.1038/s41437-021-00446-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cao Y, Fujiwara M, Nikaido M, Okada N, Hasegawa M. Interordinal relationships and timescale of eutherian evolution as inferred from mitochondrial genome data. Gene. 2000:259(1–2):149–158. 10.1016/s0378-1119(00)00427-3. [DOI] [PubMed] [Google Scholar]
- Cepeda AS, Mello B, Pacheco MA, Luo Z, Sullivan SA, Carlton JM, Escalante AA. The genome of Plasmodium gonderi: insights into the evolution of human malaria parasites. Genome Biol Evol. 2024:16(2):evae027. 10.1093/gbe/evae027. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Christin PA, Spriggs E, Osborne CP, Strömberg CA, Salamin N, Edwards EJ. Molecular dating, evolutionary rates, and the age of the grasses. Syst Biol. 2014:63(2):153–165. 10.1093/sysbio/syt072. [DOI] [PubMed] [Google Scholar]
- Clark NL, Alani E, Aquadro CF. Evolutionary rate covariation reveals shared functionality and coexpression of genes. Genome Res. 2012:22(4):714–720. 10.1101/gr.132647.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Costa FP, Schrago CG, Mello B. Assessing the relative performance of fast molecular dating methods for phylogenomic data. BMC Genomics. 2022:23(1):798. 10.1186/s12864-022-09030-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cox JC, Ingersoll JE, Ross SA. A theory of the term structure of interest rates. Econometrica. 1985:53(2):385. 10.2307/1911242. [DOI] [Google Scholar]
- Crotty SM, Minh BQ, Bean NG, Holland BR, Tuke J, Jermiin LS, von Haeseler A. GHOST: recovering historical signal from heterotachously evolved sequence alignments. Syst Biol. 2020:69(2):249–264. 10.1093/sysbio/syz051. [DOI] [PubMed] [Google Scholar]
- Cutler DJ. Estimating divergence times in the presence of an overdispersed molecular clock. Mol Biol Evol. 2000:17(11):1647–1660. 10.1093/oxfordjournals.molbev.a026264. [DOI] [PubMed] [Google Scholar]
- da Silva KE, Tanmoy AM, Pragasam AK, Iqbal J, Sajib MSI, Mutreja A, Veeraraghavan B, Tamrakar D, Qamar FN, Dougan G, et al. The international and intercontinental spread and expansion of antimicrobial-resistant Salmonella Typhi: a genomic epidemiology study. Lancet Microbe. 2022:3(8):e567–e577. 10.1016/S2666-5247(22)00093-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Degnan JH. Modeling hybridization under the network multispecies coalescent. Syst Biol. 2018:67(5):786–799. 10.1093/sysbio/syy040. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Degnan JH, Rosenberg NA. Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol Evol. 2009:24(6):332–340. 10.1016/j.tree.2009.01.009. [DOI] [PubMed] [Google Scholar]
- Degnan JH, Salter LA. Gene tree distributions under the coalescent process. Evolution. 2005:59:24–37. 10.1111/j.0014-3820.2005.tb00891.x. [DOI] [PubMed] [Google Scholar]
- Delsuc F, Vizcaíno SF, Douzery EJ. Influence of Tertiary paleoenvironmental changes on the diversification of South American mammals: a relaxed molecular clock study within xenarthrans. BMC Evol Biol. 2004:4(1):11. 10.1186/1471-2148-4-11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dickerson RE. The structure of cytochrome c and the rates of molecular evolution. J Mol Evol. 1971:1(1):26–45. 10.1007/BF01659392. [DOI] [PubMed] [Google Scholar]
- Didelot X, Croucher NJ, Bentley SD, Harris SR, Wilson DJ. Bayesian inference of ancestral dates on bacterial phylogenetic trees. Nucleic Acids Res. 2018:46(22):e134–e134. 10.1093/nar/gky783. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Didelot X, Siveroni I, Volz EM. Additive uncorrelated relaxed clock models for the dating of genomic epidemiology phylogenies. Mol Biol Evol. 2021:38(1):307–317. 10.1093/molbev/msaa193. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Donoghue PC, Yang Z. The evolution of methods for establishing evolutionary timescales. Philos Trans R Soc Lond B Biol Sci. 2016:371(1699):20160020. 10.1098/rstb.2016.0020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Doolittle RF, Blomback B. Amino-acid sequence investigations of fibrinopeptides from various mammals: evolutionary implications. Nature. 1964:202(4928):147–152. 10.1038/202147a0. [DOI] [PubMed] [Google Scholar]
- dos Reis M, Donoghue PC, Yang Z. Bayesian molecular clock dating of species divergences in the genomics era. Nat Rev Genet. 2016:17(2):71–80. 10.1038/nrg.2015.8. [DOI] [PubMed] [Google Scholar]
- dos Reis M, Thawornwattana Y, Angelis K, Telford MJ, Donoghue PC, Yang Z. Uncertainty in the timing of origin of animals and the limits of precision in molecular timescales. Curr Biol. 2015:25(22):2939–2950. 10.1016/j.cub.2015.09.066. [DOI] [PMC free article] [PubMed] [Google Scholar]
- dos Reis M, Yang Z. Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times. Mol Biol Evol. 2011:28(7):2161–2172. 10.1093/molbev/msr045. [DOI] [PubMed] [Google Scholar]
- dos Reis M, Zhu T, Yang Z. The impact of the rate prior on Bayesian estimation of divergence times with multiple loci. Syst Biol. 2014:63(4):555–565. 10.1093/sysbio/syu020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Douady CJ, Douzery EJP. Molecular estimation of eulipotyphlan divergence times and the evolution of ‘Insectivora’. Mol Phylogenet Evol. 2003:28(2):285–296. 10.1016/s1055-7903(03)00119-2. [DOI] [PubMed] [Google Scholar]
- Drummond AJ, Rambaut A, Shapiro B, Pybus OG. Bayesian coalescent inference of past population dynamics from molecular sequences. Mol Biol Evol. 2005:22(5):1185–1192. 10.1093/molbev/msi103. [DOI] [PubMed] [Google Scholar]
- Drummond AJ, Ho SYW, Phillips MJ, Rambaut A. Relaxed phylogenetics and dating with confidence. PLoS Biol. 2006:4(5):e88. 10.1371/journal.pbio.0040088. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Drummond AJ, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007:7(1):214. 10.1186/1471-2148-7-214. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Drummond AJ, Suchard MA. Bayesian random local clocks, or one rate to rule them all. BMC Biol. 2010:8(1):114. 10.1186/1741-7007-8-114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Drummond AJ, Suchard MA, Xie D, Rambaut A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol. 2012:29(8):1969–1973. 10.1093/molbev/mss075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Duchêne S, Duchêne DA, Di Giallonardo F, Eden JS, Geoghegan JL, Holt KE, Ho SYW, Holmes EC. Cross-validation to select Bayesian hierarchical models in phylogenetics. BMC Evol Biol. 2016:16(1):115. 10.1186/s12862-016-0688-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Duchêne DA, Duchêne S, Holmes EC, Ho SYW. Evaluating the adequacy of molecular clock models using posterior predictive simulations. Mol Biol Evol. 2015:32(11):2986–2995. 10.1093/molbev/msv154. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Duchêne S, Ho SYW. Mammalian genome evolution is governed by multiple pacemakers. Bioinformatics. 2015:31(13):2061–2065. 10.1093/bioinformatics/btv121. [DOI] [PubMed] [Google Scholar]
- Edwards AWF. Estimation of the branch points of a branching diffusion process. J R Stat Soc Ser B Methodol. 1970:32(2):155–174. 10.1111/j.2517-6161.1970.tb00828.x. [DOI] [Google Scholar]
- Edwards SV, Xi Z, Janke A, Faircloth BC, McCormack JE, Glenn TC, Zhong B, Wu S, Lemmon EM, Lemmon AR, et al. Implementing and testing the multispecies coalescent model: a valuable paradigm for phylogenomics. Mol Phylogenet Evol. 2016:94:447–462. 10.1016/j.ympev.2015.10.027. [DOI] [PubMed] [Google Scholar]
- Ericson PG, Anderson CL, Britton T, Elzanowski A, Johansson US, Källersjö M, Ohlson JI, Parsons TJ, Zuccon D, Mayr G. Diversification of Neoaves: integration of molecular sequence data and fossils. Biol Lett. 2006:2(4):543–547. 10.1098/rsbl.2006.0523. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fan W, Wang S, Wang H, Wang A, Jiang F, Liu H, Zhao H, Xu D, Zhang Y. The genomes of chicory, endive, great burdock and yacon provide insights into Asteraceae palaeo-polyploidization history and plant inulin production. Mol Ecol Resour. 2022:22(8):3124–3140. 10.1111/1755-0998.13675. [DOI] [PubMed] [Google Scholar]
- Felsenstein J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol. 1981:17(6):368–376. 10.1007/BF01734359. [DOI] [PubMed] [Google Scholar]
- Felsenstein J. Estimation of hominoid phylogeny from a DNA hybridization data set. J Mol Evol. 1987:26(1–2):123–131. 10.1007/BF02111286. [DOI] [PubMed] [Google Scholar]
- Felsenstein J. Phylogenies from molecular sequences: inference and reliability. Annu Rev Genet. 1988:22(1):521–565. 10.1146/annurev.ge.22.120188.002513. [DOI] [PubMed] [Google Scholar]
- Fisher AA, Ji X, Nishimura A, Baele G, Lemey P, Suchard MA. Shrinkage-based random local clocks with scalable inference. Mol Biol Evol. 2023:40(11):msad242. 10.1093/molbev/msad242. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fitch WM. Molecular evolutionary clocks. Molecular evolution. Sunderland: Sinauer Associates; 1976. p. 160–178. [Google Scholar]
- Flouri T, Stamatakis A. An improvement to DPPDIV. Heidelberg Institute for Theoretical Studies, Exelixis-RRDR-2012-7; 2012. https://cme.h-its.org/exelixis/pubs/Exelixis-RRDR-2012-7.pdf. [Google Scholar]
- Foster CSP, Sauquet H, van der Merwe M, McPherson H, Rossetto M, Ho SYW. Evaluating the impact of genomic data and priors on Bayesian estimates of the angiosperm evolutionary timescale. Syst Biol. 2017:66(3):338–351. 10.1093/sysbio/syw086. [DOI] [PubMed] [Google Scholar]
- Fourment M, Darling AE. Local and relaxed clocks: the best of both worlds. PeerJ. 2018:6:e5140. 10.7717/peerj.5140. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fourment M, Holmes EC. Novel non-parametric models to estimate evolutionary rates and divergence times from heterochronous sequence data. BMC Evol Biol. 2014:14(1):163. 10.1186/s12862-014-0163-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fourment M, Magee AF, Whidden C, Bilge A, Matsen FA, Minin VN. 19 dubious ways to compute the marginal likelihood of a phylogenetic tree topology. Syst Biol. 2020:69(2):209–220. 10.1093/sysbio/syz046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gaut B, Yang L, Takuno S, Eguiarte LE. The patterns and causes of variation in plant nucleotide substitution rates. Annu Rev Ecol Evol Syst. 2011:42(1):245–266. 10.1146/annurev-ecolsys-102710-145119. [DOI] [Google Scholar]
- Giebel LB, Van Santen VL, Slightom JL, Spritz RA. Nucleotide sequence, evolution, and expression of the fetal globin gene of the spider monkey Ateles geoffroyi. Proc Natl Acad Sci U S A. 1985:82(20):6985–6989. 10.1073/pnas.82.20.6985. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gillespie JH. Molecular evolution over the mutational landscape. Evolution. 1984a:38(5):1116–1129. 10.2307/2408444. [DOI] [PubMed] [Google Scholar]
- Gillespie JH. The molecular clock may be an episodic clock. Proc Natl Acad Sci U S A. 1984b:81(24):8009–8013. 10.1073/pnas.81.24.8009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gillespie JH. Natural selection and the molecular clock. Mol Biol Evol. 1986a:3(2):138–155. 10.1093/oxfordjournals.molbev.a040382. [DOI] [PubMed] [Google Scholar]
- Gillespie JH. Rates of molecular evolution. Annu Rev Ecol Syst. 1986b:17(1):637–665. 10.1146/annurev.es.17.110186.003225. [DOI] [Google Scholar]
- Gillespie JH. Variability of evolutionary rates of DNA. Genetics. 1986c:113(4):1077–1091. 10.1093/genetics/113.4.1077. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gillespie JH. Lineage effects and the index of dispersion of molecular evolution. Mol Biol Evol. 1989:6(6):636–647. 10.1093/oxfordjournals.molbev.a040576. [DOI] [PubMed] [Google Scholar]
- Gillespie JH. The causes of molecular evolution. Oxford, USA: Oxford University Press; 1991. [Google Scholar]
- Gillespie JH, Langley CH. Are evolutionary rates really variable? J Mol Evol. 1979:13(1):27–34. 10.1007/BF01732751. [DOI] [PubMed] [Google Scholar]
- Goldman N. Variance to mean ratio, R(t), for Poisson processes on phylogenetic trees. Mol Phylogenet Evol. 1994:3(3):230–239. 10.1006/mpev.1994.1025. [DOI] [PubMed] [Google Scholar]
- Grealey J, Lannelongue L, Saw WY, Marten J, Méric G, Ruiz-Carmona S, Inouye M. The carbon footprint of bioinformatics. Mol Biol Evol. 2022:39(3):msac034. 10.1093/molbev/msac034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Green PJ. Penalized likelihood for general semi-parametric regression models. Int Stat Rev. 1987:55:245. 10.2307/1403404. [DOI] [Google Scholar]
- Green PJ. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika. 1995:82(4):711–732. 10.1093/biomet/82.4.711. [DOI] [Google Scholar]
- Guindon S. Bayesian estimation of divergence times from large sequence alignments. Mol Biol Evol. 2010:27(8):1768–1781. 10.1093/molbev/msq060. [DOI] [PubMed] [Google Scholar]
- Guindon S. From trajectories to averages: an improved description of the heterogeneity of substitution rates along lineages. Syst Biol. 2013:62(1):22–34. 10.1093/sysbio/sys063. [DOI] [PubMed] [Google Scholar]
- Guindon S. Rates and rocks: strengths and weaknesses of molecular dating methods. Front Genet. 2020:11:526. 10.3389/fgene.2020.00526. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Heath TA, Holder MT, Huelsenbeck JP. A Dirichlet process prior for estimating lineage-specific substitution rates. Mol Biol Evol. 2012:29(3):939–955. 10.1093/molbev/msr255. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Heath TA, Huelsenbeck JP, Stadler T. The fossilized birth-death process for coherent calibration of divergence-time estimates. Proc Natl Acad Sci U S A. 2014:111(29):E2957–E2966. 10.1073/pnas.1319091111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Heled J, Drummond AJ. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 2010:27(3):570–580. 10.1093/molbev/msp274. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Himmelmann L, Metzler D. TreeTime: an extensible C++ software package for Bayesian phylogeny reconstruction with time-calibration. Bioinformatics. 2009:25(18):2440–2441. 10.1093/bioinformatics/btp417. [DOI] [PubMed] [Google Scholar]
- Ho SYW, Duchêne S. Molecular-clock methods for estimating evolutionary rates and timescales. Mol Ecol. 2014:23(24):5947–5965. 10.1111/mec.12953. [DOI] [PubMed] [Google Scholar]
- Höhna S, Landis MJ, Heath TA, Boussau B, Lartillot N, Moore BR, Huelsenbeck JP, Ronquist F. RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Syst Biol. 2016:65(4):726–736. 10.1093/sysbio/syw021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hu Z, Sackton TB, Edwards SV, Liu JS. Bayesian detection of convergent rate changes of conserved noncoding elements on phylogenetic trees. Mol Biol Evol. 2019:36(5):1086–1100. 10.1093/molbev/msz049. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huelsenbeck JP, Larget B, Swofford D. A compound Poisson process for relaxing the molecular clock. Genetics. 2000:154(4):1879–1892. 10.1093/genetics/154.4.1879. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001:17(8):754–755. 10.1093/bioinformatics/17.8.754. [DOI] [PubMed] [Google Scholar]
- Hwang DG, Green P. Bayesian Markov chain Monte Carlo sequence analysis reveals varying neutral substitution patterns in mammalian evolution. Proc Natl Acad Sci U S A. 2004:101(39):13994–14001. 10.1073/pnas.0404142101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Irisarri I, Singh P, Koblmüller S, Torres-Dowdall J, Henning F, Franchini P, Fischer C, Lemmon AR, Lemmon EM, Thallinger GG, et al. Phylogenomics uncovers early hybridization and adaptive loci shaping the radiation of Lake Tanganyika cichlid fishes. Nat Commun. 2018:9(1):3159. 10.1038/s41467-018-05479-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- James G, Witten D, Hastie T, Tibshirani R. An introduction to statistical learning: with applications in R. New York: Springer; 2017. [Google Scholar]
- Jeffreys H. Some tests of significance, treated by the theory of probability. Math Proc Camb Philos Soc. 1935:31(2):203–222. 10.1017/S030500410001330X. [DOI] [Google Scholar]
- Jones RA, Vazquez-Iglesias I, Hajizadeh M, McGreig S, Fox A, Gibbs AJ. Phylogenetics and evolution of wheat streak mosaic virus: its global origin and the source of the Australian epidemic. Plant Pathol. 2022:71(8):1660–1673. 10.1111/ppa.13609. [DOI] [Google Scholar]
- Kandeil A, Patton C, Jones JC, Jeevan T, Harrington WN, Trifkovic S, Seiler JP, Fabrizio T, Woodard K, Turner JC, et al. Rapid evolution of A(H5N1) influenza viruses after intercontinental spread to North America. Nat Commun. 2023:14(1):3082. 10.1038/s41467-023-38415-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kass RE, Raftery AE. Bayes factors. J Am Stat Assoc. 1995:90(430):773–795. 10.1080/01621459.1995.10476572. [DOI] [Google Scholar]
- Kawahara AY, Storer C, Carvalho APS, Plotkin DM, Condamine FL, Braga MP, Ellis EA, St Laurent RA, Li X, Barve V, et al. A global phylogeny of butterflies reveals their evolutionary history, ancestral hosts and biogeographic origins. Nat Ecol Evol. 2023:7(6):903–913. 10.1038/s41559-023-02041-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kim SH, Elango N, Warden C, Vigoda E, Yi SV. Heterogeneous genomic molecular clocks in primates. PLoS Genet. 2006:2(10):e163. 10.1371/journal.pgen.0020163. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kimura M. The neutral theory of molecular evolution. 1st ed. New York: Cambridge University Press; 1983. [Google Scholar]
- Kishino H, Hasegawa M. Converting distance to time: application to human evolution. Methods Enzymol. 1990:183:550–570. 10.1016/0076-6879(90)83036-9. [DOI] [PubMed] [Google Scholar]
- Kishino H, Thorne JL, Bruno WJ. Performance of a divergence time estimation method under a probabilistic model of rate evolution. Mol Biol Evol. 2001:18(3):352–361. 10.1093/oxfordjournals.molbev.a003811. [DOI] [PubMed] [Google Scholar]
- Korber B, Muldoon M, Theiler J, Gao F, Gupta R, Lapedes A, Hahn BH, Wolinsky S, Bhattacharya T. Timing the ancestor of the HIV-1 pandemic strains. Science. 2000:288(5472):1789–1796. 10.1126/science.288.5472.1789. [DOI] [PubMed] [Google Scholar]
- Kovacs TGL, Walker J, Hellemans S, Bourguignon T, Tatarnic NJ, McRae JM, Ho SYW, Lo N. Dating in the dark: elevated substitution rates in cave cockroaches (Blattodea: Nocticolidae) have negative impacts on molecular date estimates. Syst Biol. 2024:73(3):532–545. 10.1093/sysbio/syae002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kramer AM, Thornlow B, Ye C, De Maio N, McBroome J, Hinrichs AS, Lanfear R, Turakhia Y, Corbett-Detig R. Online phylogenetics with matOptimize produces equivalent trees and is dramatically more efficient for large SARS-CoV-2 phylogenies than de novo and maximum-likelihood implementations. Syst Biol. 2023:72(5):1039–1051. 10.1093/sysbio/syad031. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kumar S. Embracing green computing in molecular phylogenetics. Mol Biol Evol. 2022:39(3):msac043. 10.1093/molbev/msac043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kumar S, Hedges SB. Advances in time estimation methods for molecular data. Mol Biol Evol. 2016:33(4):863–869. 10.1093/molbev/msw026. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Langley CH, Fitch WM. An examination of the constancy of the rate of molecular evolution. J Mol Evol. 1974:3(3):161–177. 10.1007/BF01797451. [DOI] [PubMed] [Google Scholar]
- Lartillot N. Phylogenetic patterns of GC-biased gene conversion in placental mammals and the evolutionary dynamics of recombination landscapes. Mol Biol Evol. 2013:30(3):489–502. 10.1093/molbev/mss239. [DOI] [PubMed] [Google Scholar]
- Lartillot N, Brinkmann H, Philippe H. Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC Evol Biol. 2007:7 Suppl 1(Suppl 1):S4. 10.1186/1471-2148-7-S1-S4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lartillot N, Delsuc F. Joint reconstruction of divergence times and life-history evolution in placental mammals using a phylogenetic covariance model. Evolution. 2012:66(6):1773–1787. 10.1111/j.1558-5646.2011.01558.x. [DOI] [PubMed] [Google Scholar]
- Lartillot N, Lepage T, Blanquart S. PhyloBayes 3: a Bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics. 2009:25(17):2286–2288. 10.1093/bioinformatics/btp368. [DOI] [PubMed] [Google Scholar]
- Lartillot N, Philippe H. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol Biol Evol. 2004:21(6):1095–1109. 10.1093/molbev/msh112. [DOI] [PubMed] [Google Scholar]
- Lartillot N, Philippe H. Computing Bayes factors using thermodynamic integration. Syst Biol. 2006:55(2):195–207. 10.1080/10635150500433722. [DOI] [PubMed] [Google Scholar]
- Lartillot N, Phillips MJ, Ronquist F. A mixed relaxed clock model. Philos Trans R Soc B Biol Sci. 2016:371(1699):20150132. 10.1098/rstb.2015.0132. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lartillot N, Poujol R. A phylogenetic model for investigating correlated evolution of substitution rates and continuous phenotypic characters. Mol Biol Evol. 2011:28(1):729–744. 10.1093/molbev/msq244. [DOI] [PubMed] [Google Scholar]
- Lee H-J, Kishino H, Rodrigue N, Thorne JL. Grouping substitution types into different relaxed molecular clocks. Philos Trans R Soc Lond B Biol Sci. 2016:371(1699):20150141. 10.1098/rstb.2015.0141. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lee H-J, Rodrigue N, Thorne JL. Relaxing the molecular clock to different degrees for different substitution types. Mol Biol Evol. 2015:32(8):1948–1961. 10.1093/molbev/msv099. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lepage T, Bryant D, Philippe H, Lartillot N. A general comparison of relaxed molecular clock models. Mol Biol Evol. 2007:24(12):2669–2680. 10.1093/molbev/msm193. [DOI] [PubMed] [Google Scholar]
- Lewin GR, Carlos C, Chevrette MG, Horn HA, McDonald BR, Stankey RJ, Fox BG, Currie CR. Evolution and ecology of Actinobacteria and their bioenergy applications. Annu Rev Microbiol. 2016:70(1):235–254. 10.1146/annurev-micro-102215-095748. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lewis PO, Xie W, Chen M-H, Fan Y, Kuo L. Posterior predictive Bayesian phylogenetic model selection. Syst Biol. 2014:63(3):309–321. 10.1093/sysbio/syt068. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li WLS, Drummond AJ. Model averaging and Bayes factor calculation of relaxed molecular clocks in Bayesian phylogenetics. Mol Biol Evol. 2012:29(2):751–761. 10.1093/molbev/msr232. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li WH, Gouy M, Sharp PM, O’hUigin C, Yang YW. Molecular phylogeny of Rodentia, Lagomorpha, Primates, Artiodactyla, and Carnivora and molecular clocks. Proc Natl Acad Sci U S A. 1990:87(17):6703–6707. 10.1073/pnas.87.17.6703. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li WH, Tanimura M. The molecular clock runs more slowly in man than in apes and monkeys. Nature. 1987:326(6108):93–96. 10.1038/326093a0. [DOI] [PubMed] [Google Scholar]
- Li HT, Yi TS, Gao LM, Ma PF, Zhang T, Yang JB, Gitzendanner MA, Fritsch PW, Cai J, Luo Y, et al. Origin of angiosperms and the puzzle of the Jurassic gap. Nat Plants. 2019:5(5):461–470. 10.1038/s41477-019-0421-0. [DOI] [PubMed] [Google Scholar]
- Liu L, Pearl DK. Species trees from gene trees: reconstructing Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions. Syst Biol. 2007:56(3):504–514. 10.1080/10635150701429982. [DOI] [PubMed] [Google Scholar]
- Liu L, Yu L, Edwards SV. A maximum pseudo-likelihood approach for estimating species trees under the coalescent model. BMC Evol Biol. 2010:10(1):302. 10.1186/1471-2148-10-302. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liu L, Yu L, Pearl DK, Edwards SV. Estimating species phylogenies using coalescence times among sequences. Syst Biol. 2009:58(5):468–477. 10.1093/sysbio/syp031. [DOI] [PubMed] [Google Scholar]
- Lu LM, Mao LF, Yang T, Ye JF, Liu B, Li HL, Sun M, Miller JT, Mathews S, Hu HH, et al. Evolutionary history of the angiosperm flora of China. Nature. 2018:554(7691):234–238. 10.1038/nature25485. [DOI] [PubMed] [Google Scholar]
- Mai U, Mirarab S. Log transformation improves dating of phylogenies. Mol Biol Evol. 2021:38(3):1151–1167. 10.1093/molbev/msaa222. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Manceau M, Marin J, Morlon H, Lambert A. Model-based inference of punctuated molecular evolution. Mol Biol Evol. 2020:37(11):3308–3323. 10.1093/molbev/msaa144. [DOI] [PubMed] [Google Scholar]
- Margoliash E. Primary structure and evolution of cytochrome C. Proc Natl Acad Sci U S A. 1963:50(4):672–679. 10.1073/pnas.50.4.672. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Martin AP, Palumbi SR. Body size, metabolic rate, generation time, and the molecular clock. Proc Natl Acad Sci U S A. 1993:90(9):4087–4091. 10.1073/pnas.90.9.4087. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mau B, Newton MA. Phylogenetic inference for binary data on dendograms using Markov chain Monte Carlo. J Comput Graph Stat. 1997:6(1):122–131. 10.1080/10618600.1997.10474731. [DOI] [Google Scholar]
- Mello B, Schrago CG. The estimated pacemaker for great apes supports the hominoid slowdown hypothesis. Evol Bioinform Online. 2019:15:1–8. 10.1177/1176934319855988. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mello B, Tao Q, Barba-Montoya J, Kumar S. Molecular dating for phylogenies containing a mix of populations and species by using Bayesian and RelTime approaches. Mol Ecol Resour. 2021:21(1):122–136. 10.1111/1755-0998.13249. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mello B, Tao Q, Tamura K, Kumar S. Fast and accurate estimates of divergence times from big data. Mol Biol Evol. 2017:34(1):45–50. 10.1093/molbev/msw247. [DOI] [PubMed] [Google Scholar]
- Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, Lanfear R. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020:37(5):1530–1534. 10.1093/molbev/msaa015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Miura S, Tamura K, Tao Q, Huuki LA, Kosakovsky Pond SL, Priest J, Deng J, Kumar S. A new method for inferring timetrees from temporally sampled molecular sequences. PLoS Comput Biol. 2020:16(1):e1007046. 10.1371/journal.pcbi.1007046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mooers AO, Harvey PH. Metabolic rate, generation time, and the rate of molecular evolution in birds. Mol Phylogenet Evol. 1994:3(4):344–350. 10.1006/mpev.1994.1040. [DOI] [PubMed] [Google Scholar]
- Moorjani P, Amorim CE, Arndt PF, Przeworski M. Variation in the molecular clock of primates. Proc Natl Acad Sci U S A. 2016:113(38):10607–10612. 10.1073/pnas.1600374113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Moriyama EN. Higher rates of nucleotide substitution in Drosophila than in mammals. Jpn J Genet. 1987:62(2):139–147. 10.1266/jjg.62.139. [DOI] [Google Scholar]
- Muse SV, Gaut BS. Comparing patterns of nucleotide substitution rates among chloroplast loci using the relative ratio test. Genetics. 1997:146(1):393–399. 10.1093/genetics/146.1.393. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Muse SV, Weir BS. Testing for equality of evolutionary rates. Genetics. 1992:132(1):269–276. 10.1093/genetics/132.1.269. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Naveca FG, Nascimento V, de Souza VC, Corado AL, Nascimento F, Silva G, Costa Á, Duarte D, Pessoa K, Mejía M, et al. COVID-19 in Amazonas, Brazil, was driven by the persistence of endemic lineages and P.1 emergence. Nat Med. 2021:27(7):1230–1238. 10.1038/s41591-021-01378-7. [DOI] [PubMed] [Google Scholar]
- Newton MA, Raftery AE. Approximate Bayesian inference with the weighted likelihood bootstrap. J R Stat Soc Ser B Methodol. 1994:56(1):3–26. 10.1111/j.2517-6161.1994.tb01956.x. [DOI] [Google Scholar]
- Nielsen R. Robustness of the estimator of the index of dispersion for DNA sequences. Mol Phylogenet Evol. 1997:7(3):346–351. 10.1006/mpev.1997.0411. [DOI] [PubMed] [Google Scholar]
- Nikaido M, Harada M, Cao Y, Hasegawa M, Okada N. Monophyletic origin of the order Chiroptera and its phylogenetic position among Mammalia, as inferred from the complete sequence of the mitochondrial DNA of a Japanese megabat, the Ryukyu flying fox (Pteropus dasymallus). J Mol Evol. 2000:51(4):318–328. 10.1007/s002390010094. [DOI] [PubMed] [Google Scholar]
- Oaks JR, Cobb KA, Minin VN, Leaché AD. Marginal likelihoods in phylogenetics: a review of methods and applications. Syst Biol. 2019:68(5):681–697. 10.1093/sysbio/syz003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ogilvie HA, Bouckaert RR, Drummond AJ. StarBEAST2 brings faster species tree inference and accurate estimates of substitution rates. Mol Biol Evol. 2017:34(8):2101–2114. 10.1093/molbev/msx126.
- Ohta T. Synonymous and nonsynonymous substitutions in mammalian genes and the nearly neutral theory. J Mol Evol. 1995:40(1):56–63. 10.1007/BF00166595.
- Ohta T, Kimura M. On the constancy of the evolutionary rate of cistrons. J Mol Evol. 1971:1(1):18–25. 10.1007/BF01659391.
- Opazo JC. A molecular timescale for caviomorph rodents (Mammalia, Hystricognathi). Mol Phylogenet Evol. 2005:37(3):932–937. 10.1016/j.ympev.2005.05.002.
- Othman SN, Choe M, Chuang MF, Purevdorj Z, Maslova I, Schepina NA, Jang Y, Borzée A. Across the Gobi Desert: impact of landscape features on the biogeography and phylogeographically-structured release calls of the Mongolian Toad, Strauchbufo raddei in East Asia. Evol Ecol. 2022:36(6):1007–1043. 10.1007/s10682-022-10206-4.
- Pacheco MA, Matta NE, Valkiūnas G, Parker PG, Mello B, Stanley CE Jr, Lentino M, Garcia-Amado MA, Cranfield M, Kosakovsky Pond SL, et al. Mode and rate of evolution of haemosporidian mitochondrial genomes: timing the radiation of avian parasites. Mol Biol Evol. 2018:35(2):383–403. 10.1093/molbev/msx285.
- Pagel M, Meade A. Modelling heterotachy in phylogenetic inference by reversible-jump Markov chain Monte Carlo. Philos Trans R Soc Lond B Biol Sci. 2008:363(1512):3955–3964. 10.1098/rstb.2008.0178.
- Paradis E. Molecular dating of phylogenies by likelihood methods: a comparison of models and a new information criterion. Mol Phylogenet Evol. 2013:67(2):436–444. 10.1016/j.ympev.2013.02.008.
- Paradis E, Claude J, Strimmer K. APE: analyses of phylogenetics and evolution in R language. Bioinformatics. 2004:20(2):289–290. 10.1093/bioinformatics/btg412.
- Paradis E, Schliep K. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics. 2019:35(3):526–528. 10.1093/bioinformatics/bty633.
- Partha R, Kowalczyk A, Clark NL, Chikina M. Robust method for detecting convergent shifts in evolutionary rates. Mol Biol Evol. 2019:36(8):1817–1830. 10.1093/molbev/msz107.
- Peifer M, Karro JE, von Grünberg HH. Is there an acceleration of the CpG transition rate during the mammalian radiation? Bioinformatics. 2008:24(19):2157–2164. 10.1093/bioinformatics/btn391.
- Pérez-Losada M, Høeg JT, Crandall KA. Unraveling the evolutionary radiation of the thoracican barnacles using molecular and morphological evidence: a comparison of several divergence time estimation approaches. Syst Biol. 2004:53(2):244–264. 10.1080/10635150490423458.
- Pi H-W, Chiang Y-R, Li W-H. Mapping geological events and nitrogen fixation evolution onto the timetree of the evolution of nitrogen-fixation genes. Mol Biol Evol. 2024:41(2):msae023. 10.1093/molbev/msae023.
- Polson NG, Scott JG, Windle J. The Bayesian bridge. J R Stat Soc Series B Stat Methodol. 2014:76(4):713–733. 10.1111/rssb.12042.
- Pybus OG, Rambaut A. GENIE: estimating demographic history from molecular phylogenies. Bioinformatics. 2002:18(10):1404–1405. 10.1093/bioinformatics/18.10.1404.
- Rabosky DL. Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees. PLoS One. 2014:9:e89543. 10.1371/journal.pone.0089543.
- Rabosky DL, Chang J, Cowman PF, Sallan L, Friedman M, Kaschner K, Garilao C, Near TJ, Coll M, Alfaro ME. An inverse latitudinal gradient in speciation rate for marine fishes. Nature. 2018:559(7714):392–395. 10.1038/s41586-018-0273-1.
- Rabosky D, Santini F, Eastman J, Smith SA, Sidlauskas B, Chang J, Alfaro ME. Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nat Commun. 2013:4:1958. 10.1038/ncomms2958.
- Rambaut A, Bromham L. Estimating divergence dates from molecular sequences. Mol Biol Evol. 1998:15(4):442–448. 10.1093/oxfordjournals.molbev.a025940.
- Rannala B, Yang Z. Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J Mol Evol. 1996:43(3):304–311. 10.1007/BF02338839.
- Rannala B, Yang Z. Inferring speciation times under an episodic molecular clock. Syst Biol. 2007:56(3):453–466. 10.1080/10635150701420643.
- Read DW. Primate phylogeny, neutral mutations, and ‘molecular clocks’. Syst Biol. 1975:24(2):209–221. 10.1093/sysbio/24.2.209.
- Ronquist F, Huelsenbeck JP. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003:19(12):1572–1574. 10.1093/bioinformatics/btg180.
- Ronquist F, Teslenko M, Van Der Mark P, Ayres DL, Darling A, Höhna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012:61(3):539–542. 10.1093/sysbio/sys029.
- Russel PM, Brewer BJ, Klaere S, Bouckaert RR. Model selection and parameter inference in phylogenetics using nested sampling. Syst Biol. 2019:68(2):219–233. 10.1093/sysbio/syy050.
- Rutschmann F. Molecular dating of phylogenetic trees: a brief review of current methods that estimate divergence times. Divers Distrib. 2006:12(1):35–48. 10.1111/j.1366-9516.2006.00210.x.
- Sagulenko P, Puller V, Neher RA. TreeTime: maximum-likelihood phylodynamic analysis. Virus Evol. 2018:4(1):vex042. 10.1093/ve/vex042.
- Sánchez-Busó L, Golparian D, Corander J, Grad YH, Ohnishi M, Flemming R, Parkhill J, Bentley SD, Unemo M, Harris SR. The impact of antimicrobials on gonococcal evolution. Nat Microbiol. 2019:4(11):1941–1950. 10.1038/s41564-019-0501-y.
- Sanderson MJ. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 1997:14(12):1218–1231. 10.1093/oxfordjournals.molbev.a025731.
- Sanderson MJ. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol Biol Evol. 2002:19(1):101–109. 10.1093/oxfordjournals.molbev.a003974.
- Sanderson MJ. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics. 2003:19(2):301–302. 10.1093/bioinformatics/19.2.301.
- Sanderson MJ, Thorne JL, Wikström N, Bremer K. Molecular evidence on plant divergence times. Am J Bot. 2004:91(10):1656–1665. 10.3732/ajb.91.10.1656.
- Sarich VM, Wilson AC. Immunological time scale for hominid evolution. Science. 1967:158(3805):1200–1203. 10.1126/science.158.3805.1200.
- Schrago CG, Mello B. Employing statistical learning to derive species-level genetic diversity for mammalian species. Mamm Rev. 2020:50(3):240–251. 10.1111/mam.12192.
- Schrago CG, Russo CAM. Timing the origin of New World monkeys. Mol Biol Evol. 2003:20(10):1620–1625. 10.1093/molbev/msg172.
- Schrider DR, Kern AD. Supervised machine learning for population genetics: a new paradigm. Trends Genet. 2018:34(4):301–312. 10.1016/j.tig.2017.12.005.
- Serrano-Fujarte I, Calva E, García-Domínguez J, Ortiz-Jiménez S, Puente JL. Population structure and ongoing microevolution of the emerging multidrug-resistant Salmonella typhimurium ST213. npj Antimicrob Resist. 2024:2(1):10. 10.1038/s44259-024-00027-6.
- Smith SA, Brown JW. Constructing a broadly inclusive seed plant phylogeny. Am J Bot. 2018:105(3):302–314. 10.1002/ajb2.1019.
- Smith ML, Hahn MW. Phylogenetic inference using generative adversarial networks. Bioinformatics. 2023:39(9):btad543. 10.1093/bioinformatics/btad543.
- Smith SA, O’Meara BC. treePL: divergence time estimation using penalized likelihood for large phylogenies. Bioinformatics. 2012:28(20):2689–2690. 10.1093/bioinformatics/bts492.
- Springer MS, Murphy WJ, Eizirik E, O’Brien SJ. Placental mammal diversification and the Cretaceous-Tertiary boundary. Proc Natl Acad Sci U S A. 2003:100(3):1056–1061. 10.1073/pnas.0334222100.
- Suchard MA, Lemey P, Baele G, Ayres DL, Drummond AJ, Rambaut A. Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. Virus Evol. 2018:4(1):vey016. 10.1093/ve/vey016.
- Surizon GM, Geffen E, Roll U, Gafny S, Perl RGB. The phylogeography of Middle Eastern tree frogs in Israel. Sci Rep. 2024:14(1):2788. 10.1038/s41598-024-52700-5.
- Tabatabaee Y, Zhang C, Warnow T, Mirarab S. Phylogenomic branch length estimation using quartets. Bioinformatics. 2023:39(Supplement_1):i185–i193. 10.1093/bioinformatics/btad221.
- Tajima F. Simple methods for testing the molecular evolutionary clock hypothesis. Genetics. 1993:135(2):599–607. 10.1093/genetics/135.2.599.
- Takahata N. Statistical models of the overdispersed molecular clock. Theor Popul Biol. 1991:39(3):329–344. 10.1016/0040-5809(91)90027-d.
- Takezaki N, Rzhetsky A, Nei M. Phylogenetic test of the molecular clock and linearized trees. Mol Biol Evol. 1995:12(5):823–833. 10.1093/oxfordjournals.molbev.a040259.
- Tamura K, Battistuzzi FU, Billing-Ross P, Murillo O, Filipski A, Kumar S. Estimating divergence times in large molecular phylogenies. Proc Natl Acad Sci U S A. 2012:109(47):19333–19338. 10.1073/pnas.1213199109.
- Tamura K, Stecher G, Kumar S. MEGA11: molecular evolutionary genetics analysis version 11. Mol Biol Evol. 2021:38(7):3022–3027. 10.1093/molbev/msab120.
- Tamura K, Tao Q, Kumar S. Theoretical foundation of the RelTime method for estimating divergence times from variable evolutionary rates. Mol Biol Evol. 2018a:35(7):1770–1782. 10.1093/molbev/msy044.
- Tamura K, Tao Q, Kumar S. Theoretical foundation of the RelTime method for estimating divergence times from variable evolutionary rates. Mol Biol Evol. 2018b:35(7):1770–1782. 10.1093/molbev/msy044.
- Tao Q, Barba-Montoya J, Huuki LA, Durnan MK, Kumar S. Relative efficiencies of simple and complex substitution models in estimating divergence times in phylogenomics. Mol Biol Evol. 2020a:37(6):1819–1831. 10.1093/molbev/msaa049.
- Tao Q, Tamura K, Kumar S. Efficient methods for dating evolutionary divergences. In: Ho SYW, editor. The molecular evolutionary clock. Cham: Springer International Publishing; 2020b. p. 197–219.
- Tao Q, Tamura K, Battistuzzi FU, Kumar S. A machine learning method for detecting autocorrelation of evolutionary rates in large phylogenies. Mol Biol Evol. 2019:36(4):811–824. 10.1093/molbev/msz014.
- Tay JH, Baele G, Duchene S. Detecting episodic evolution through Bayesian inference of molecular clock models. Mol Biol Evol. 2023:40(10):msad212. 10.1093/molbev/msad212.
- Thorne JL, Kishino H. Divergence time and evolutionary rate estimation with multilocus data. Syst Biol. 2002:51(5):689–702. 10.1080/10635150290102456.
- Thorne JL, Kishino H, Painter IS. Estimating the rate of evolution of the rate of molecular evolution. Mol Biol Evol. 1998:15(12):1647–1657. 10.1093/oxfordjournals.molbev.a025892.
- Tiley GP, Poelstra JW, dos Reis M, Yang Z, Yoder AD. Molecular clocks without rocks: new solutions for old problems. Trends Genet. 2020:36(11):845–856. 10.1016/j.tig.2020.06.002.
- To T-H, Jung M, Lycett S, Gascuel O. Fast dating using least-squares criteria and algorithms. Syst Biol. 2016:65(1):82–97. 10.1093/sysbio/syv068.
- Trost J, Haag J, Höhler D, Jacob L, Stamatakis A, Boussau B. Simulations of sequence evolution: how (un)realistic they are and why. Mol Biol Evol. 2024:41(1):msad277. 10.1093/molbev/msad277.
- Truszkowski J, Perrigo A, Broman D, Ronquist F, Antonelli A. Online tree expansion could help solve the problem of scalability in Bayesian phylogenetics. Syst Biol. 2023:72(5):1199–1206. 10.1093/sysbio/syad045.
- Uyeda JC, Harmon LJ. A novel Bayesian method for inferring and interpreting the dynamics of adaptive landscapes from phylogenetic comparative data. Syst Biol. 2014:63:902–918. 10.1093/sysbio/syu057.
- Viana R, Moyo S, Amoako DG, Tegally H, Scheepers C, Althaus CL, Anyaneji UJ, Bester PA, Boni MF, Chand M, et al. Rapid epidemic expansion of the SARS-CoV-2 Omicron variant in Southern Africa. Nature. 2022:603(7902):679–686. 10.1038/s41586-022-04411-y.
- Volz EM, Frost SDW. Scalable relaxed clock phylogenetic dating. Virus Evol. 2017:3(2):vex025. 10.1093/ve/vex025.
- Vrancken B, Rambaut A, Suchard MA, Drummond A, Baele G, Derdelinckx I, Van Wijngaerden E, Vandamme AM, Van Laethem K, Lemey P. The genealogical population dynamics of HIV-1 in a large transmission chain: bridging within and among host evolutionary rates. PLoS Comput Biol. 2014:10(4):e1003505. 10.1371/journal.pcbi.1003505.
- Weill FX, Domman D, Njamkepo E, Tarr C, Rauzier J, Fawal N, Keddy KH, Salje H, Moore S, Mukhopadhyay AK, et al. Genomic history of the seventh pandemic of cholera in Africa. Science. 2017:358(6364):785–789. 10.1126/science.aad5901.
- White CR, Marshall DJ. Should we care if models are phenomenological or mechanistic? Trends Ecol Evol. 2019:34(4):276–278. 10.1016/j.tree.2019.01.006.
- Wiegmann BM, Yeates DK, Thorne JL, Kishino H. Time flies, a new molecular time-scale for brachyceran fly evolution without a clock. Syst Biol. 2003:52(6):745–756. 10.1080/10635150390250965.
- Wilson AC, Sarich VM. A molecular time scale for human evolution. Proc Natl Acad Sci U S A. 1969:63(4):1088–1093. 10.1073/pnas.63.4.1088.
- Wolf YI, Snir S, Koonin EV. Stability along with extreme variability in core genome evolution. Genome Biol Evol. 2013:5(7):1393–1402. 10.1093/gbe/evt098.
- Worobey M, Han G-Z, Rambaut A. A synchronized global sweep of the internal genes of modern avian influenza virus. Nature. 2014:508(7495):254–257. 10.1038/nature13016.
- Wu CI, Li WH. Evidence for higher rates of nucleotide substitution in rodents than in man. Proc Natl Acad Sci U S A. 1985:82(6):1741–1745. 10.1073/pnas.82.6.1741.
- Xia X. DAMBE7: new and improved tools for data analysis in molecular biology and evolution. Mol Biol Evol. 2018:35(6):1550–1552. 10.1093/molbev/msy073.
- Xia X, Yang Q. A distance-based least-square method for dating speciation events. Mol Phylogenet Evol. 2011:59(2):342–353. 10.1016/j.ympev.2011.01.017.
- Yang Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Bioinformatics. 1997:13(5):555–556. 10.1093/bioinformatics/13.5.555.
- Yang Z. A heuristic rate smoothing procedure for maximum likelihood estimation of species divergence times. Acta Zool Sin. 2004:50:645–656. https://caod.oriprobe.com//articles/7671053/A_heuristic_rate_smoothing_procedure_for_maximum_likelihood_estmation_.htm
- Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007:24(8):1586–1591. 10.1093/molbev/msm088.
- Yang Z, Rannala B. Bayesian phylogenetic inference using DNA sequences: a Markov chain Monte Carlo method. Mol Biol Evol. 1997:14(7):717–724. 10.1093/oxfordjournals.molbev.a025811.
- Yang Z, Rannala B. Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds. Mol Biol Evol. 2006:23(1):212–226. 10.1093/molbev/msj024.
- Yang Z, Yoder AD. Comparison of likelihood and Bayesian methods for estimating divergence times using multiple gene loci and calibration points, with application to a radiation of cute-looking mouse lemur species. Syst Biol. 2003:52(5):705–716. 10.1080/10635150390235557.
- Yoder AD, Yang Z. Estimation of primate speciation dates using local molecular clocks. Mol Biol Evol. 2000:17(7):1081–1090. 10.1093/oxfordjournals.molbev.a026389.
- Yule GU. A mathematical theory of evolution, based on the conclusions of Dr. J. C. Willis. Philos Trans R Soc Lond B. 1925:213:21–87. 10.1098/rstb.1925.0002.
- Zhang L, Chen F, Zhang X, Li Z, Zhao Y, Lohaus R, Chang X, Dong W, Ho SY, Liu X, et al. The water lily genome and the early evolution of flowering plants. Nature. 2020:577(7788):79–84. 10.1038/s41586-019-1852-5.
- Zhang C, Stadler T, Klopfstein S, Heath TA, Ronquist F. Total-evidence dating under the fossilized birth–death process. Syst Biol. 2016:65(2):228–249. 10.1093/sysbio/syv080.
- Zhu T, Dos Reis M, Yang Z. Characterization of the uncertainty of divergence time estimation under relaxed molecular clock models using multiple loci. Syst Biol. 2015:64(2):267–280. 10.1093/sysbio/syu109.
- Zuckerkandl E, Pauling L. Molecular disease, evolution, and genic heterogeneity. In: Kasha M, Pullman B, editors. Horizons in biochemistry. New York, USA: Academic Press; 1962. p. 189–225.
- Zuckerkandl E, Pauling L. Evolutionary divergence and convergence in proteins. Evolving genes and proteins. New York: Elsevier; 1965. p. 97–166.
Data Availability Statement
No new data were generated or analyzed in support of this research.



