Abstract
A relationship between the rate of molecular change and diversification has long been discussed, on both theoretical and empirical grounds. However, the effect on our understanding of evolutionary patterns is yet to be fully explored. Here, we develop a new model, the Covariant Evolutionary Tempo model, with the aim of integrating patterns of diversification and molecular evolution within a framework of a continuously changing “tempo” variable that acts as a master control for molecular, morphological, and diversification rates. Importantly, tempo itself is treated as being variable at a rate proportional to its own value. This model predicts that diversity is dominated by a small number of extremely large clades at any historical epoch including the present; that these large clades are expected to be characterised by explosive early radiations accompanied by elevated rates of molecular evolution; and that extant organisms are likely to have evolved from species with unusually fast evolutionary rates. Under such a model, the amount of molecular change along a particular lineage is essentially independent of its height, which weakens the molecular clock hypothesis. Finally, our model explains the existence of “living fossil” sister groups to large clades that are species poor and exhibit slow rates of morphological and molecular change. Our results demonstrate that the observed historical patterns of evolution can be modelled without invoking special evolutionary mechanisms or innovations that are unique to specific times or taxa, even when they are highly nonuniform.
Keywords: living fossils, molecular clocks, patterns of diversification
The relationship between micro- and macroevolution has long been debated (Jablonski, 2000; Erwin, 2000; Rolland et al., 2023). A central question is the extent to which large-scale evolutionary patterns, observed in the fossil record and inferred from phylogenies, are shaped by the processes operating at the population level. Regardless of the outcome of this debate, however, there is often a methodological assumption of independence between microevolutionary changes (e.g., shifts in gene frequencies due to selection) and macroevolutionary patterns (e.g., diversification trends within a clade). Contemporary models of evolutionary history conceptualize the overall process as being governed by three independent components: the model of molecular substitution, the rate at which substitutions occur, and the nature of the branching process (Warnock and Wright, 2021). The simplest approach would be to employ a strict molecular clock with a Jukes–Cantor substitution model (Jukes and Cantor, 1969) on a known phylogeny, and to assume a fixed rate of branching, often represented by a homogeneous birth–death process (BDP) (Nee, 2006). Methodological advances, such as the development of relaxed clocks, now allow substitution rates to vary across the tree (see Dos Reis et al. (2016) for a review). Additionally, increasingly sophisticated models of molecular evolution have been introduced (Arenas, 2015). More recently, models have also emerged that incorporate variable diversification rates (see below), allowing for more complex representations of evolutionary trees, although the broad-scale patterns resulting from such models remain relatively unexplored.
Increasing sophistication in modeling ability has naturally also fuelled attempts to understand the causes behind the variation being captured. To take molecular substitution rate variation first: two broad hypotheses exist about its causes. The first encompasses a range from mutational effects to features of the entire organism (such as body size or generation time), and the second is a “speciation rate hypothesis” that links molecular change to speciation (Jobson and Albert, 2002). There are sound empirical and conceptual reasons for thinking that speciation and molecular change may well be intimately related (Hua and Bromham, 2017), and attempts have sometimes been made to consider them jointly (e.g., Sarver et al. (2019); Ritchie et al. (2022b)). Indeed, Eo and DeWoody go so far as to claim that “One of the most basic predictions in evolutionary biology is that the rate of diversification along a particular branch of the tree of life is some function of the rate of genome evolution on that branch.” (Eo and DeWoody (2010), p. 3587). Provocative evidence for a close correlation of the two processes is seen for example in the early history of arthropods (Lee et al., 2013), where early branches of the clade contain just as much molecular change as later branches despite being far shorter in duration (Budd and Mann, 2020b), at least when the tree height is constrained by the fossil record. However, this is just one of several studies that over the last few decades have debated a potential link between both morphological and molecular rates of change and rates of speciation (e.g., Barraclough and Savolainen (2001); Webster et al. (2003); Xiang et al. (2004); Venditti and Pagel (2010); Lanfear et al. (2010); Rabosky et al. (2013); Berv and Field (2018); Bromham (2024)), although it should be noted that not all studies have found clear evidence of this link (e.g., Goldie et al. (2011)). 
There are at least two factors that might cloud the relationship between diversification and molecular change through time. The first is the so-called “node density” effect: clades with more terminals have more internal nodes, which allow more molecular change to be recovered and thus generate a spurious relationship between clade size and amount of molecular change (Hugall and Lee, 2007). The second is that if a relaxed clock methodology is employed to ascertain the time of origin of a clade, then any early burst of molecular (or morphological (Beck and Lee, 2014)) change or indeed diversification is likely to be smoothed out by pushing the age of the root deeper (Bromham, 2003; Beaulieu et al., 2015; Budd and Mann, 2020b; Bromham, 2020; Shafir et al., 2020). If one were simply to accept the result of the molecular clock, then the apparent elevated early rates could theoretically be explained as an artefact caused by “bunching up” the early lineages to artificially squeeze the clade into a too-narrow time interval (cf. Bromham and Hendy (2000)). However, we have previously marshalled strong reasons for thinking that the fossil record in such instances is often reliable, in which case early bursts of diversification should be taken seriously and not dismissed as dating artefacts (Budd and Mann, 2020a,b, 2024; Holmes and Budd, 2022). As a result, the well-known mismatch between the explicit fossil record and molecular clock origination estimates for many major clades such as animals (Budd and Mann, 2020b), birds (Berv and Field, 2018), placental mammals (Budd and Mann, 2024), and angiosperms (Coiro et al., 2019; Smith and Beaulieu, 2024) itself points to cryptic excess molecular change at the base of trees (Beaulieu et al., 2015; Berv and Field, 2018).
Previous critiques of molecular clocks have focused on either inappropriate age priors (e.g., Budd and Mann (2024); Brown and Smith (2018)) or issues with rate heterogeneity (e.g., Bromham and Woolfit (2004); Berv and Field (2018)); below, we will suggest that these are effectively two sides of the same coin. Clearly, if the branching process and rate of molecular change really are correlated, then this would have a significant impact on our understanding of the patterns of evolutionary change through time (see Duchêne et al. (2017) for investigation and discussion of this point).
Causes of variation in diversification rates are likewise much debated (e.g., Moen and Morlon (2014)). It is clear that, as with molecular evolution itself, rates of diversification must vary across the tree, as a single homogeneous BDP cannot possibly capture the true patterns of diversification reflected in evolutionary history (cf. Benton and Emerson (2007)). Notwithstanding this, the homogeneous BDP (Nee, 2006), in which rates of speciation and extinction are fixed, is still commonly employed in molecular analysis, especially for dating purposes, although its inadequacies are increasingly being recognised (e.g., Khurana et al. (2024)).
Any attempt to investigate a link between rates of genetic/morphological evolution and speciation must reckon with the heterogeneous nature of all of these variables. Historically, rate heterogeneity has largely been addressed in one of two ways: either by assuming rate shifts occur at significant points (e.g., Soltis and Soltis (2016)), or by assuming broad secular variation, for example, with declining rates through time across the entire tree (Strathmann and Slatkin, 1983; Nee et al., 1994b); or some combination of both (e.g., in BAMM (Rabosky et al., 2014)). More recent models have moved away from considering isolated rate shifts to allow rates to vary either in small frequent increments associated with speciations (Maliet et al., 2019; Shafir et al., 2020) or continuously through anagenetic diffusion (Quintero et al., 2024) (for other noncontinuous models, see the review in the supplementary information of Maliet et al. (2019)). The primary goal of these models has been the inference of rates through time, based on molecular data from extant taxa (Barido-Sottani and Morlon, 2023) which has now been implemented in BEAST2 (Bouckaert et al., 2019), clearly a substantial step forward from homogeneous models. However, some forward simulation has also revealed that these models can generate clades that match empirical observation; in particular, simulated clades are often imbalanced and “stemmy” (Maliet et al., 2019). This suggests that diversification rate heterogeneity may be one key to understanding the patterns of modern diversity. This is largely because the distribution of modern diversity predicted by homogeneous or epochally time-varying BDPs is geometric (Kendall, 1948; Nee et al., 1994b), and this remains the case even when nonselective mass extinctions are considered (Budd and Mann, 2020a). 
However, a certain amount of evidence suggests that extant sizes are in fact over-dispersed relative to this expectation (Blum and François, 2006; Stadler et al., 2016). Consider, for example, the crown-group animal phyla, which for the sake of argument, we can assume all emerged around 500 Ma (Budd and Mann, 2024). Estimating total species diversity in the phyla is fraught with difficulty, but even so the species count differs widely. For example, the phyla have an average diversity of c. 50 000 species, but the arthropods have a diversity of well over one million species, thus being over twenty times larger than expected. Under a geometric distribution, this is essentially impossible (). This pattern is seen repeated hierarchically: for example, most arthropods are insects, and most insects appear to be hymenopterans (Forbes et al., 2018). Similarly, the angiosperms are much more diverse than any other plant clades (e.g., c. 300 000 vs. 1000 gymnosperms) and birds much more so than crocodiles in the archosaurs (c.10 000 vs. c. 85). In other words, the existence of Stanley’s “supertaxa” (Stanley, 1998) does not seem compatible with a purely geometric distribution of clade sizes as predicted by the homogeneous BDP. In addition, clade sizes show a complex relationship with age that is not easily explained by homogeneous diversification (Magallon and Sanderson, 2001; McPeek and Brown, 2007; Rabosky, 2010), and indeed attempts to estimate absolute diversification rates within a clade suggest several orders of magnitude variation (Magallon and Sanderson, 2001). It thus seems that clade sizes do often appear overdispersed relative to any expected geometric distribution (Khurana et al., 2024).
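The scale of this mismatch can be checked with a short calculation. The sketch below assumes that clade sizes follow a geometric distribution on 1, 2, 3, … with the mean quoted above (50 000), and asks how probable a clade of at least one million species would be. The function name and the choice of support are illustrative assumptions, not part of the original analysis.

```python
import math

def geometric_tail(mean_size: float, threshold: float) -> float:
    """P(N >= threshold) for a geometric distribution on {1, 2, ...} with the given mean."""
    p = 1.0 / mean_size                      # success probability; mean = 1/p
    k = math.ceil(threshold)
    # P(N >= k) = (1 - p)^(k - 1); log1p keeps accuracy when p is tiny
    return math.exp((k - 1) * math.log1p(-p))

mean_phylum_size = 50_000                    # average phylum diversity quoted in the text
arthropod_size = 1_000_000                   # lower bound on arthropod diversity quoted in the text
tail = geometric_tail(mean_phylum_size, arthropod_size)
print(f"P(N >= {arthropod_size}) = {tail:.2e}")
```

Under these assumptions the tail probability is of order exp(-20), that is, about two in a billion, which is why a clade twenty times the mean size is effectively excluded by a geometric distribution.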
Taking these empirical findings together, and noting the apparent importance of rate heterogeneity across both microscopic and macroscopic evolutionary scales (Henao-Diaz and Pennell, 2023), it seems that a need exists for a synthesis that unites molecular evolution and species diversification, in which both vary through time. In this paper, then, we develop a model of diversification and molecular change in which all evolutionary rates covary, being controlled by a single variable evolutionary tempo that differs both between species and within a species over time. Although our model does not depend on a particular instantiation of tempo, we nevertheless offer some suggestions below about how it might be encoded in a realistic way in the genome (see the schematic for genetic encoding of tempo in Appendix 1). Our analysis of this model will show that it is consistent with the concentration of species into relatively few “supertaxa” (Stanley, 1998); that it offers a resolution to the conflict between the fossil record and molecular clocks; and that it makes new predictions about the early history of major clades and the fate of the smaller clades that constitute the remaining part of modern diversity. Because of the way we formulate the model, it is amenable to numerical solution that allows us to investigate its general features, as opposed to simulations that would show the outcomes of rates over specific trees.
Methods and Materials
Model Outline
As indicated above, heterogeneity in rates of speciation and extinction is key to explaining important empirical features of diversification. We here extend earlier approaches to model such heterogeneity (Rabosky et al., 2014; Maliet et al., 2019; Ritchie et al., 2022a; Quintero et al., 2024) and create a BDP model in which rates of speciation and extinction vary continuously and covariantly through anagenetic diffusion. We call this model the Covariant Evolutionary Tempo (CET) model. Under CET, all evolutionary rates are specific to a given taxon at a specific moment in time. Our model is close in formulation to that of Quintero et al. (2024). However, whereas they model this variation in speciation and extinction rates as geometric Brownian motion with an overall drift, and treat speciation and extinction independently, we instead posit that there exist baseline rates of speciation () and extinction () that are linearly modulated by a new variable we label as tempo, , which controls the relative rates of all evolutionary processes. At any given time, a taxon with tempo has a speciation rate and an extinction rate .
This model is fully covariant, in that all rates are linked directly to ; in effect, the tempo represents a local speeding-up or slowing-down of evolutionary time, such that all processes happen faster or slower. In particular, we posit that tempo itself varies through time, and because we posit that tempo is in some way genetically encoded, this implies that the evolution of itself proceeds at a rate proportional to , since the effect on molecular rates of mutation will obtain upon whichever part of the genome is responsible for this encoding. Specifically, we model the log-tempo () as evolving according to a modified Ornstein–Uhlenbeck (OU) process that incorporates the effect of the tempo itself on all rates
| (1) |
where represents an incremental change from a Wiener process (popularly known as Brownian motion). We impose this model for the evolution of the log-tempo since the tempo itself is constrained to be positive. The parameters of this stochastic differential equation are the mean reversion rate and the stationary variance of the process, . The terms in this equation come from the self-interaction of the tempo, which as well as multiplying the rate of all other processes also determines the rate at which it evolves itself, such that the effective increment of time is . Our use of an OU process is motivated by two considerations. First, as we shall show, a Wiener process without a restoring force would lead to a runaway effect, where tempos increase without limit. Secondly, in Appendix 1, we describe a plausible schematic for how tempo is inherited that produces an inherent reversion to a mean value via entropic forces.
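As an aid to reading equation 1, the following sketch writes out one stochastic differential equation consistent with the verbal description above. The notation is assumed here rather than taken from the original: $x = \ln \tau$ for the log-tempo, $\theta$ for the mean reversion rate, $\sigma^2$ for the stationary variance, and $W$ for the Wiener process. It starts from a standard OU process in "intrinsic" time $s$ and substitutes the tempo-scaled time increment $ds = e^{x}\,dt$.

```latex
% Standard OU process in intrinsic time s (stationary variance sigma^2):
%   dx = -\theta x \, ds + \sqrt{2\theta\sigma^2} \, dW_s
% Time change ds = e^{x} dt, so that dW_s has variance e^{x} dt:
\begin{equation*}
  dx = -\theta\, x\, e^{x}\, dt + \sqrt{2\theta\sigma^{2} e^{x}}\; dW_t ,
  \qquad \tau = e^{x}.
\end{equation*}
```

In this form both the restoring drift and the diffusion coefficient scale with the tempo $e^{x}$, so under these assumptions high-tempo lineages move through tempo space quickly and low-tempo lineages slowly.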
As we show in Appendix 1, this results in a drift-diffusion partial differential equation for the generating function of the resulting BDP
| (2) |
where , with being the probability of generating species over time in a process starting with log-tempo . Solving this equation for an initial condition provides the value of the generating function .
Equation 2 does not appear to permit solution in closed form, except for the long-term extinction probability for , which is for all , and is therefore tempo invariant. More generally, equation 2 can be straightforwardly solved numerically. The values of can be retrieved from this generating function by Fourier inversion (see Appendix 1).
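The Fourier inversion step can be illustrated on a generating function whose answer is known. The sketch below assumes a probability generating function of the form G(z) = Σ P(n) zⁿ, samples it on the unit circle, and recovers P(n) with a discrete Fourier transform. The geometric distribution used as a stand-in, the function names, and the grid size are illustrative assumptions; the model's own generating function would come from the numerical solution of equation 2.

```python
import cmath
import math

def invert_pgf(pgf, n_points: int) -> list[float]:
    """Recover P(n), n = 0..n_points-1, from a probability generating function
    G(z) = sum_n P(n) z^n by sampling it on the unit circle and applying a
    discrete Fourier transform."""
    samples = [pgf(cmath.exp(2j * math.pi * k / n_points)) for k in range(n_points)]
    probs = []
    for n in range(n_points):
        total = sum(g * cmath.exp(-2j * math.pi * k * n / n_points)
                    for k, g in enumerate(samples))
        probs.append((total / n_points).real)
    return probs

# Stand-in generating function (an assumption for illustration): a geometric
# distribution on {1, 2, ...} with success probability p.
p = 0.2
def geometric_pgf(z: complex) -> complex:
    return p * z / (1 - (1 - p) * z)

recovered = invert_pgf(geometric_pgf, 256)
exact = [0.0] + [p * (1 - p) ** (n - 1) for n in range(1, 256)]
max_error = max(abs(r - e) for r, e in zip(recovered, exact))
print(f"largest absolute error: {max_error:.2e}")
```

The grid size must be large enough that the probability mass beyond it is negligible, otherwise the recovered values are aliased; for heavy-tailed distributions of the kind discussed here this would require a far larger grid and a fast Fourier transform rather than the direct sum used in this sketch.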
We can derive further equations specifying the evolution of the mean number of species generated by the process over time, the expected number of lineages (species that will have modern descendants), and the distribution of tempos over time. Derivation of these equations is described in Appendix 1. The most important of these equations specifies the evolution of the mean number of species through time. Given a generating function , the mean of the distribution is given by
| (3) |
Using this relation, equation 2 can be transformed into a simpler, linear form to represent the dynamics of the mean
| (4) |
where is the baseline net diversification rate. This equation reveals the key dynamics of the process: the expected number of species with log-tempo locally increases exponentially at the rate modulated by . At the same time, a drift-diffusion process modifies the tempo of each species, such that species tend to move toward a log-tempo of 0 (i.e., ).
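To make the structure described here concrete, the following sketch writes out a mean equation with these properties. It is a derivation under stated assumptions rather than a transcription of equation 4. The notation is assumed: $x$ for the log-tempo, $\theta$ for the mean reversion rate, $\sigma^2$ for the stationary variance, $\lambda$ and $\mu$ for the baseline speciation and extinction rates, and $r = \lambda - \mu$. The log-tempo is assumed to follow an OU process run in tempo-scaled time, and $m(x,t)$ denotes the expected number of species at time $t$ in a clade founded with log-tempo $x$.

```latex
% Expected clade size m(x, t) for a clade founded with log-tempo x
% (backward equation for a branching diffusion; every term carries the factor e^{x}):
\begin{equation*}
  \frac{\partial m}{\partial t}
  = e^{x}\left[\, r\, m
      \;-\; \theta\, x\, \frac{\partial m}{\partial x}
      \;+\; \theta\sigma^{2}\, \frac{\partial^{2} m}{\partial x^{2}} \right],
  \qquad m(x, 0) = 1 .
\end{equation*}
```

The first term is exponential growth at rate $r e^{x}$, the second is drift toward $x = 0$, and the third is diffusion. As a consistency check, setting $\sigma = 0$ and $x = 0$ removes the drift and diffusion terms and recovers the familiar $m = e^{rt}$ of a homogeneous BDP.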
Justification for a Covariant Theory
Why should all evolutionary rates be covariant? As we have discussed above, previous birth–death models have allowed for independent variation in speciation and extinction (while in practice sometimes holding one of these constant), while the rates of molecular evolution have been assumed (generally implicitly) to be completely independent of diversification rates. In one sense, our choice is pragmatic: we seek to explore the consequences of linking changing rates of molecular evolution to diversification rates, and the most parsimonious way to do this is to impose a perfect correlation between the two. Allowing for speciation and extinction rates to vary independently (or with some nonunitary correlation) would greatly complicate the mathematical formulation of the birth–death model and its analysis, and cloud its implications. Empirically, we are also strongly motivated by the apparently close (inverse) correlation between rates of molecular evolution and branch durations in for example Lee et al. (2013) and other studies, as noted in the introduction. Finally, our choice is also theoretically informed. It is clear that as speciation and extinction vary, they must remain close to one another over time; a sustained period of much higher speciation will quickly produce an unrealistically large number of species, while a period of greater extinction than speciation will almost certainly drive the clade to extinction. Indeed, the linkage between the two has been formulated by Marshall as the third of his five “paleobiological laws” (see Marshall (2017) for discussion and justification of this point). Moreover, we expect that rates of speciation and extinction may largely be driven by the same causal factors, for example, generation times and population size (for a classical discussion of the various links between speciation and extinction rates, see Stanley (1990), and more recently Greenberg and Mooers (2017)). 
Therefore, while we anticipate significant deviations from covariance between these processes at sufficiently short time scales, we expect it to be a realistic first-order approximation when considering rates on the scale of millions of years. We also note that although most discussions of molecular evolution have considered a link with speciation, we consider that in practice, this implies a link with extinction too, for the reasons given above.
As far as our model is concerned, we note that many of the factors operating on speciation rates are also likely to affect molecular rates of change. For example, Bromham has stressed the need to consider the genome itself as a life-history trait (Bromham, 2003, 2009, 2020), and thus open to the same influences (population size, generation time, etc.) as other traits. Thus, under such a view of evolution, small body size or small populations might influence both speciation rate (Martin, 2017; Cooney and Thomas, 2021) and molecular evolution rates (Bromham, 2020) together, thus uniting the two broad ways of considering the causes of molecular change (Jobson and Albert, 2002). Naturally, such a linkage between the two might itself vary, but in order to investigate its general effects, and certainly to greatly simplify the analysis, we have chosen a model with complete linkage.
Few studies have shown a convincing direct link between molecular substitution rates and phenotypic change (Bromham and Woolfit, 2004). Nevertheless, the two may be indirectly linked by other factors such as speciation rate, as both phenotypic and molecular change are plausibly linked to speciation (for discussion of this point with some examples such as placental mammals and lungfish, see Budd and Mann (2018)). As we suggest below, some empirical evidence points to this being true, at least in some clades.
Results
We analyzed our model by solving the probabilistic equations given above to obtain distributions at different time epochs, rather than by direct simulation of the tree evolution. Notably, our analysis does not provide a probability distribution over specific trees, but over coarser-grained variables such as diversity. It is not our goal to quantitatively fit our model to the modern diversity or evolutionary history of any specific clade, but rather to reveal the qualitative features the model predicts. Throughout, we use a core set of parameters per species per myr, per species per myr, /myr, and . These parameters are chosen to reflect reasonable expectations about the real evolutionary process: a baseline extinction rate of per species per myr comports with that chosen in previous analyses (e.g., Budd and Mann (2018)) and, combined with a speciation rate of per species per myr, is consistent with a typical species existing for c. 1 myr, in broad agreement with the fossil record (see, e.g., Budd and Mann (2018)). The speciation rate is chosen to be of similar magnitude to the extinction rate, such that extinction plays a significant role in the evolutionary dynamics (Marshall, 2017), but is otherwise arbitrary. We choose a mean-reversion parameter /myr to be equal to the net diversification rate, as we will later show that if , then the mean log-tempo converges to 0 (see Appendix 1, equation A28). Although this choice is mathematically convenient, we do not expect that it represents any necessary feature of the evolutionary process, nor do the general features of our results depend on it. Finally, the diffusion parameter is chosen to be large enough to produce significant effects of the diffusive dynamics, and otherwise is simply a mathematically convenient choice.
Distribution of clade sizes
We solved equation 2 for times myr and starting log-tempos and performed a Fourier inversion (see Appendix 1) to retrieve the implied probability distribution . The distribution of clade sizes for a clade that starts with log-tempo , excluding clades of size zero, is shown in Figure 1A. The clade sizes follow a distribution that differs strongly from the geometric distribution expected under a typical BDP (indicated by the dashed line, assuming the same mean clade size). This distribution is characterised by most clades being small, but with a few extremely large clades. This means that clades that are many times greater than average (either mean or median) are much more probable than under a standard BDP. A corollary of this is that the clade size that a typical species “experiences” (i.e., the expected size of the clade containing a randomly selected species) is c. 8 times greater than the mean clade size. For clarity, we here define the experienced and mean clade sizes as the sizes of clades containing living organisms that have the same time of origin (e.g., the sizes of parent clades that are all 500 myr old).
Figure 1.
A) Distribution of the number of species generated in clades that survive 500 myr, with parameters per species per myr, per species per myr, /myr ,, and an initial log-tempo . Note the log scale on the y-axis. The distribution is long-tailed and is characterized by a high probability of few species () and a long tail allowing some very large clades to be generated (. The blue and red lines indicate the mean clade size (c. 60 000) and the mean experienced clade size of a randomly chosen taxon (c. 400 000) respectively, indicating that most taxa are found in very large clades. The dashed line shows the geometric distribution with the same mean expected under a standard BDP. B) The probability distribution for the proportion of diversity contained within one randomly chosen sister group of a crown group, indicating that clades are typically highly imbalanced, with one sister group being much larger than the other. The dashed line shows the uniform distribution expected under a standard BDP.
In Figure 1A, we indicate both the mean clade size and the mean experienced clade size for illustration. This result should be compared to the equivalent result from a standard BDP, where the mean experienced clade size is only two times greater than the mean (Budd and Mann, 2018). This implies that the large majority of species we might encounter and/or study are contained in extremely large clades. Since clades are hierarchically structured, this also implies that the diversity of any clade is likely to be dominated by its largest sub-clade. To illustrate this, we consider the two sister-groups of a clade originating 500 Ma and calculate the expected proportion of the total diversity that is contained in one sister-group chosen at random. As shown in Figure 1B, the probability that a given proportion of total diversity is contained in a given sister-group is peaked strongly close to zero and one, indicating that one sister-group or the other typically contains the large majority of species in the clade as a whole. For example, there is a c. 50% chance that the larger sister group is at least 20 times larger than the other. This can be compared with the equivalent result under a standard BDP, in which the proportion of diversity contained in one sister-group is uniformly distributed between zero and one (indicated by the dashed line), and thus, the probability of such an imbalance is only c. 10%. This implies that diversity among clades of the same age tends to follow the Single Big Jump principle (Vezzani et al., 2019), whereby sums of heavy-tailed random variables are dominated by their largest component.
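The two standard-BDP benchmarks quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes, as the text states, a uniform distribution for the share of diversity in one sister-group and a geometric distribution (on 1, 2, 3, …) for clade size. The function names are illustrative, and the mean of 60 000 is the approximate value given in the caption to Figure 1.

```python
from fractions import Fraction

def prob_imbalance_uniform(ratio: int) -> Fraction:
    """P(larger sister group is at least `ratio` times the smaller) when the
    share of diversity in one sister group is uniform on [0, 1]."""
    # The share p must satisfy p <= 1/(ratio + 1) or p >= ratio/(ratio + 1).
    return Fraction(2, ratio + 1)

def experienced_to_mean_ratio_geometric(mean_size: float) -> float:
    """Ratio of the size-biased ('experienced') mean E[N^2]/E[N] to the plain
    mean E[N] for a geometric distribution on {1, 2, ...}."""
    # For this distribution E[N^2]/E[N] = 2 * mean - 1.
    return (2 * mean_size - 1) / mean_size

imbalance = prob_imbalance_uniform(20)
ratio = experienced_to_mean_ratio_geometric(60_000)
print(f"P(imbalance >= 20) = {float(imbalance):.3f}")
print(f"experienced / mean clade size = {ratio:.4f}")
```

The imbalance probability is exactly 2/21, just under 10%, and the experienced-to-mean ratio is effectively 2 for any large mean, in line with the figures quoted for a standard BDP.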
Diversification Through Time
The above analysis reveals the expected pattern of diversity in clades of a fixed age (500 myr) which all start from a common ancestor with a typical tempo (). How does this pattern change through time, and between clades with different initial tempos? To explore these questions, we focused on how the expected clade size varies through time for different initial values of . We numerically solved equation 4 to obtain the expected clade size as a function of time values myr, and for different initial values of . In Figure 2A, we show how the mean clade size varies through time for different initial tempos including clades that have gone extinct before the time in question. In Figure 2B, we show the variation in the mean number of species through time conditioned on knowing that the clade survives to the present day (solid lines), and also the expected number of lineages (dashed lines) through time—these are species that have at least one descendant in the present day, and form the “reconstructed process” that can (in principle) be inferred from modern molecular data. Clades that survive to the present experience the “Push of the Past” (Budd and Mann, 2018), an initial period of increased diversification when the clade is small.
Figure 2.
Diversification through time as a function of starting tempo. A) The expected number of species through time for a clade starting 500 Ma with different initial log-tempos: (blue line); (black line); (red line). These expectations include clades that are extinct. Clades with a higher starting tempo initially diversify more quickly (on average); eventually diversification stabilizes to a fixed rate independent of the starting tempo. B) Expected diversification profiles for clades that survive to the present day. Solid lines indicate the expected number of species through time; dashed lines indicate the expected number of lineages—species with surviving descendants. Surviving clades of all starting tempos experience the Push of the Past, mirrored by the Pull of the Present in the lineages (Nee et al., 1994a). This effect is especially pronounced in the clades starting with the highest tempo.
These results show that the initial tempo has a substantial impact on how the clade diversifies and its eventual expected size. As we would intuitively expect, clades with high tempos initially diversify more quickly, and conversely, those with low tempos diversify slowly. However, after some period of time, the rate of diversification becomes stable: initially high-tempo clades slow down and initially low-tempo clades speed up, such that all clades eventually diversify at the same fixed rate, as seen in the emergence of parallel lines of growth from all three initial conditions.
The tempo of the root node of a clade therefore has transient effects that eventually decay as new species emerge whose own tempos diffuse away from the initial state. The duration of these transient effects is longer in clades that start with low tempos, since all processes including those that control the diffusion of tempos over time run slower. Although the effect of initial tempo is transient, it leaves an important signature in the eventual size of clades over the long term: because initially high tempo clades diversify more quickly in their early history, they reach a larger size before reverting to a constant diversification rate, meaning that they have a much greater expected diversity in the present. This intuitively suggests that the largest clades of a given age in the present are likely to be those that originated from a high-tempo common ancestor.
Distribution of Tempos Over Time
As a clade diversifies, the various taxa will develop different tempos as they diverge independently from the initial starting tempo, leading to a time-dependent distribution of log-tempos . In Appendix 1, we show that the evolution of this distribution obeys a replicator-mutation equation
| (5) |
where the term indicates the average value of at a given time.
We numerically integrated this equation through times myr for 3 initial starting log-tempos: specified by initial conditions of the form , where is the Dirac delta function (Shutovskyi, 2023). The resulting evolution of the log-tempo probability distributions is shown in Figure 3. These results show that regardless of the starting tempo of the process, our model converges over time to the same stable distribution of log-tempos that is approximately normally distributed. Using the core set of model parameters described earlier gives a mean log-tempo of zero. When the process is initiated with a high tempo (), the convergence to this stable distribution is very rapid (red line). This is because the initially high tempo forces all processes to run fast, so time is effectively compressed. Conversely when the process is initiated with a slow tempo , the convergence is much slower, potentially taking hundreds of millions of years. In practical terms, this predicts the existence of long-lived substructures of the evolutionary tree in which evolution is effectively “running slow.” If other evolutionary processes such as molecular and morphological change are also covariant to the tempo, this would imply the existence of lineages with low diversity and minimal morphological or molecular change over very long periods of time. Since such small clades are common (Figure 1A), we expect that these “living fossils” will be ubiquitous, and in particular, that they will often be the sister group to the few large clades that dominate total diversity.
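The asymmetry in convergence speed can be seen even in a stripped-down calculation. The sketch below is not the replicator-mutation equation: it keeps only a deterministic restoring drift of the assumed form dx/dt = -θ·x·exp(x) for the log-tempo x, ignoring both diffusion and the effect of differential diversification. The value of θ, the starting log-tempos of ±2, and the target of 0.5 are hypothetical choices made purely for illustration.

```python
import math

def time_to_reach(x0: float, target_abs: float, theta: float, dt: float = 0.01) -> float:
    """Integrate dx/dt = -theta * x * exp(x) by forward Euler and return the
    time (in myr) until |x| first drops to target_abs."""
    x, t = x0, 0.0
    while abs(x) > target_abs:
        x += -theta * x * math.exp(x) * dt
        t += dt
    return t

theta = 0.01          # hypothetical mean reversion rate per myr (not from the source)
fast = time_to_reach(+2.0, 0.5, theta)   # clade founded with a high tempo
slow = time_to_reach(-2.0, 0.5, theta)   # clade founded with a low tempo
print(f"high-tempo start: {fast:.0f} myr, low-tempo start: {slow:.0f} myr")
```

With these illustrative values the low-tempo start takes several times longer to approach the mean than the high-tempo start, because the factor exp(x) slows every process, including mean reversion itself, when the tempo is low.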
Figure 3.
A) The evolution of the distribution of log-tempos through time for clades starting from different initial log-tempos: (blue line); (black line); (red line). Lines indicate the expected log-tempo of a randomly chosen species, and shaded areas represent the standard deviation. Regardless of starting tempo, clades converge to the same equilibrium distribution of log-tempos. This convergence is fast in clades that start with high tempos. B) Evolution of the log-tempo distribution for clades with different values of the diversification parameter , with a fixed value of /myr. Starting from the same tempo (), clades reach different equilibrium log-tempo distributions depending on the value of ; higher values of produce higher average tempos.
Varying the parameters of our model produces changes in the stable distribution of tempos. In particular, the mean of this distribution increases with larger and decreases with larger (Figure 3B); in the limiting case where , the mean log-tempo can be shown to converge to in closed form (see Appendix 1, equation A26). The dynamics of diversification tend to elevate the mean tempo, since higher tempo lineages produce more descendants on average per unit time, which inherit the same high tempo from their parent nodes. An interesting corollary to this point is that without any sort of mean reversion process, tempos (and thus diversification rates) would simply tend to rapidly increase without limit. As this is not observed empirically, the suggestion must be that something tends to draw log-tempos towards a characteristic mean value (c.f., Aris-Brosou and Yang (2003); Lepage et al. (2006); Maliet et al. (2019)). In Appendix 1, we show that such a mean reversion can arise without implying any necessary ecological mechanism: if tempo is encoded genetically, then intermediate tempos are consistent with a greater number of possible genetic configurations, such that random mutations tend to cause a drift toward these values.
Patterns of Historical Tempo
So far, we have considered what happens to various features of the evolutionary process as it is run forward from a particular initial condition. However, evolutionary analysis can be considered to be retrospective as well: one attempts to identify and explain patterns of evolution looking back in time from a vantage point in the present. As discussed by Budd and Mann (2018), this perspective necessarily distorts the patterns we are likely to observe, especially if one also chooses to analyze clades that have unusual modern-day properties. Such choices are commonplace: the most studied clades are often unusually diverse relative to clades of similar age; since most species are contained in these large clades, they are often taken to be particularly representative of a particular epoch, despite in fact being a highly unrepresentative sample of clades in general.
To investigate the role that contingencies of clade selection have on the observed patterns of evolution, we considered 2 questions. First, if one randomly selects a species in the present and traces its lineage back in time, what expectations should we have about the evolutionary tempo of those ancestors? Second, what expectations should we have about the average tempo of earlier members of that clade overall? These are different questions since most historical taxa, even those with modern descendants (the lineages), will contribute little to modern diversity, owing to the Single Big Jump Principle (Vezzani et al., 2019) identified earlier (c.f., Figure 1B). That is, the ancestors of most modern taxa constitute a very small subset of historical diversity.
First, we consider how likely it is that a species alive today at time originated from an ancestor at time with log-tempo . Since we assumed that the clade originates with an ancestor drawn from the equilibrium distribution , the prior probability that a species alive at time has log-tempo remains by definition of the equilibrium. We can determine a posterior estimate for the ancestor’s log-tempo by application of Bayes’ rule
| (6) |
The likelihood term in this equation, is proportional to the expected number of modern species that an ancestor at time will generate, . This means we can rewrite the above as
| (7) |
The equation above estimates the log-tempo of a direct ancestor of a modern taxon. We can also ask what the tempo of a randomly chosen member of the clade in the past is. To estimate this, we consider the probability of generating species at time from any starting log-tempo (based on solution of the generating function ) and the probability that a randomly chosen species at time has log-tempo if the process starts at , . From these probabilities, we can infer the probability of a historical log-tempo conditioned on the current diversity , using Bayes’ rule and marginalizing over the unknown starting log-tempo
| (8) |
where the final approximation assumes that . In general, this approximation will be reasonable, because of the earlier result that modern diversity arises from a small subset of historical taxa. If the historical number of species at time is high, a randomly chosen taxon is unlikely to contribute significantly to modern diversity and we can therefore treat as being independent of this species and its tempo. Because of the Push of the Past (Budd and Mann, 2018), surviving clades will rapidly reach this state, and in the special case where (i.e., the origin of the clade), the approximation holds exactly.
Figure 4 illustrates our expectations about the historical patterns of tempo. Figure 4A shows the distribution of log-tempos for ancestors of a randomly chosen modern taxon, conditioned on our standard set of parameters ( per species per myr, per species per myr, /myr, ). In the present, these are centered around , which is the stable overall distribution of log-tempos shown in Figure 3A. As we look backward in time the expected log-tempo of the ancestor rises sharply, before plateauing at at c. 100 Ma. While the uncertainty represented by the standard deviation in gray permits a wide variety of ancestral tempos, beyond 100 Ma, these ancestors will have elevated tempos with very high probability. Conversely, the tempo of the clade as a whole tends to peak at its origin, as shown in Figure 4B. This illustrates the overall expected log-tempo of historical species within a clade inhabited by a typical modern taxon (i.e., one with a diversity equal to the mean experienced clade size). That is, the clades that contain most modern taxa are defined by a high early rate of evolution, which then undergoes a consistent secular decline to the present, while the direct ancestors of most modern taxa have uniformly elevated rates of evolution across the history of the clade until close to the present. A consequence of this result is that most modern taxa share relatively recent common ancestors (c. 100–150 Ma), as they overwhelmingly tend to originate via a small subset of lineages that maintain high tempos until this point. This is despite the most recent common ancestor of all species being close to the origin of the clade (in other words, the crown group is expected to emerge soon after the total group; for analysis see, e.g., Budd and Mann (2018)).
Figure 4.
Expected patterns of historical tempo evolution. Lines indicate the expected log-tempo and shaded areas represent the standard deviation. A) The expected historical log-tempo of ancestors of a randomly chosen modern taxon, following its lineage back to the origin of the clade. Throughout this lineage, expected log-tempos are elevated relative to the present day, declining rapidly shortly before the present. B) Expected historical log-tempo of species in the clade as a whole. This is shown for a clade of the mean experienced clade size (the typical clade size of a randomly chosen modern species). Expected tempos are highest at the origin of the clade and decline through time as the clade diversifies. In both panels, the distribution of tempos at time = 0 Ma represents the equilibrium distribution derived as the stable solution to equation A24.
Effect of Tempo Variation on Branch Lengths and Duration
We have now considered the effect of tempo variation on the dynamics of the BDP, and by extension on diversification. We motivated our approach by noting that rates of molecular evolution are commonly assumed to vary in modern relaxed molecular clock analyses, and now, we turn our attention to the interaction of molecular evolution and diversification. Specifically, we consider the expected duration (in real time, equivalent to branch height) and amount of molecular change along branches (= branch length) with differing initial log-tempo values. In our model, tempo can vary within a branch, so the duration of branches is not necessarily exponentially distributed, in contrast to standard BDP models. Instead, the probability that a branch terminates (either by speciation or extinction) in a small interval of time depends on its current log-tempo and is given by .
As shown in Appendix 1, this implies that the probability density that a branch originating with log-tempo terminates at time obeys a partial differential equation of the form
| (9) |
with initial condition . Figure 5A shows the solution to this equation for 3 different values of , illustrating the intuitive result that branches with lower initial tempos tend to have a greater duration—that is they exist for a longer time before either speciating or going extinct.
Figure 5.
The distribution of branch durations A) and amounts of molecular change along branches B) for branches starting with different log-tempos: (blue); (black); (red), assuming that molecular evolution is covariant with tempo. Branches that start with lower tempos are much longer on average in real time than those with high tempos. However, the expected amount of molecular change is independent of the starting tempo. 1 myrs-equivalent is the expected molecular change in 1 myrs at a fixed tempo of .
How does this effect of the initial tempo translate into the amount of molecular change that occurs within a branch? This is an important question, because the relationship between branch duration and molecular change is fundamental to the practice of molecular dating and potentially more broadly to the inference of phylogenetic relationships based on the molecular genetic data from modern taxa because of the problems caused by long branch attraction (Shafir et al., 2020; Kapli et al., 2021).
If we assume that rates of molecular change covary with tempo alongside all other rates, then the amount of molecular change that occurs in some small unit of time is given by
| (10) |
Applying a change of variables to express Equation 9 in terms of the molecular change gives an equation obeyed by the probability density of molecular change in a branch that starts with log-tempo
| (11) |
with initial condition . Noticing that the partial derivatives in this equation will remain zero for all values of , this simplifies to a standard exponential distribution
| (12) |
That is, the amount of molecular change contained in a branch is independent of the value of the tempo. This is illustrated in Figure 5B.
The key result then is that branches that start at higher tempos are typically shorter, but contain just as much molecular change, as longer branches that originate from lower tempos. This implies that a clade that starts with a high tempo is likely to be characterized in its early stages by short-duration branches that nonetheless contain just as much molecular change as later branches that are longer in duration. Since we have shown above that early high tempos are expected especially in clades that are particularly large, we can expect this pattern to be commonly observed. As a corollary, if we further assume that morphological change also covaries with tempo (c.f., Omland (1997); Lee et al. (2013)), then the same pattern of rapid change along short early branches would be observed morphologically by an analogous argument.
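The independence of molecular change from starting tempo can be checked with a small simulation. The sketch below is illustrative only: it assumes that the branch termination hazard, the rate of molecular change, and the speed of the OU process on log-tempo all scale with the tempo, and it steps the branch forward in tempo-scaled ("effective") time. The function names and every parameter value (`rate`, `theta`, `sigma2`, `x_mean`, `h`) are hypothetical choices made for the example, not the parameters used in our analysis.

```python
import math
import random


def simulate_branch(x0, rate=1.0, theta=0.5, sigma2=0.5, x_mean=0.0,
                    h=0.01, rng=random):
    """Simulate one branch; return (duration in real time, molecular change).

    All default parameter values are illustrative assumptions.
    """
    # With every rate proportional to tempo, the hazard per unit of
    # molecular change is the constant `rate`, so the branch ends once the
    # accumulated change reaches an exponentially distributed threshold.
    threshold = rng.expovariate(rate)

    # Split the threshold into effective-time steps of size h plus a remainder.
    n_full = int(threshold // h)
    steps = [h] * n_full + [max(0.0, threshold - n_full * h)]

    x, duration = x0, 0.0
    for step in steps:
        # Real time elapsed while accumulating `step` units of change,
        # holding the tempo e^x fixed within the step (an approximation).
        duration += step * math.exp(-x)
        # Exact OU transition over an effective-time increment `step`.
        decay = math.exp(-theta * step)
        sd = math.sqrt(sigma2 * (1.0 - decay * decay))
        x = x_mean + (x - x_mean) * decay + sd * rng.gauss(0.0, 1.0)
    return duration, threshold


def summarize(x0, n=500, seed=1):
    """Mean branch duration and mean molecular change for starting log-tempo x0."""
    rng = random.Random(seed)
    results = [simulate_branch(x0, rng=rng) for _ in range(n)]
    mean_duration = sum(d for d, _ in results) / n
    mean_change = sum(c for _, c in results) / n
    return mean_duration, mean_change


if __name__ == "__main__":
    for x0 in (-1.0, 0.0, 1.0):
        print(x0, summarize(x0))
```

Under these assumptions the mean duration falls as the starting log-tempo rises, while the mean molecular change stays near `1 / rate` for every starting value. The molecular change is drawn directly as the exponential threshold, which is the time-change argument in computational form rather than an independent test of it.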
Discussion
We have described the CET model of macroevolution that allows the rates of speciation, extinction, and molecular/morphological evolution to coevolve through a variable evolutionary tempo parameter. This model provides a resolution to several outstanding difficulties in reconciling classical birth death models with empirical data. Allowing for tempo variation produces much greater variation in clade sizes over a given time horizon than under homogeneous models, consistent with the fact that modern diversity is dominated by a relatively small number of very large clades across different taxonomic levels. An underappreciated consequence of this distribution is that if we wish to understand how modern patterns of diversity arose, it is important to study the characteristic behavior of such large clades, which, as we have shown here, differs markedly from that of clades as a whole. In other words, large and arguably charismatic clades such as arthropods, birds, and angiosperms that are the subject of understandable interest have quite different patterns of evolution than what an “average” clade might be inferred to have.
Our analysis predicts that these clades containing the bulk of modern diversity are likely to result from very high early evolutionary tempos, leading to short early branches (measured in real time). Because we conjecture that evolutionary tempo affects all rates in a covariant fashion, these short early branches are nonetheless expected to contain as much molecular and morphological change as later, longer branches, because the rates of molecular and morphological change are elevated in direct proportion to speciation and extinction.
This offers an explanation for observations such as the elevated rates concentrated in short early branches that are found in molecular studies that take the fossil record as a reliable guide to the age of the clade (e.g., Lee et al. (2013)). In that example, early rates seem to be 10 times higher than later ones, which would give a log-tempo of c. 2.3 (the natural logarithm of 10), in a clade that is at least 20 times larger than average. Our initial expected log-tempo of c. 0.7 for a clade c. 8 times larger than average in Figure 4B seems to be broadly compatible with this. We note that such studies tend to indicate a much older origin of the clade when the firm calibration based on the fossil record is removed; this emerges because of the use of a model that assumes a homogeneous BDP as the underlying description of diversification (Budd and Mann, 2024), and because of a questionable assumption that the processes of diversification and molecular evolution are independent. Modern molecular clock analyses typically employ a “relaxed-clock” methodology that permits substantial changes in the rate of molecular evolution across time and between lineages, but these rates are decoupled from the rates of speciation, extinction, and lineage creation (e.g., Aris-Brosou and Yang (2003)). Such a rigorous decoupling between evolutionary processes seems intuitively unrealistic, and indeed, elevated rates of molecular evolution have been posited as a cause of radiations (Lancaster, 2010), while in the fossil record, morphological change is (necessarily) the key signature of diversification. As such, we argue that recognizing the likely covariance between these rates is key to understanding apparent discrepancies between molecular signatures of diversification and the fossil record. Nevertheless, our model does not rely on any particular causal relationship between molecular change and diversification, and indeed, these variables may be linked by underlying factors such as body size (Berv and Field, 2018).
A covariant process that extends to rates of molecular evolution will produce similar amounts of molecular change on all branches of the tree, regardless of their duration in time. This suggests that from a molecular standpoint, there will be little or no difference between an older tree whose branch rates exhibit no secular trend, and a younger tree that experiences rapid early evolution and diversification followed by a slowdown (or indeed an even older tree that experienced very slow early evolution, although these will typically represent only a small proportion of modern diversity). As such, molecular data from modern taxa are unlikely to be able to discern which of these scenarios led to the molecular and species diversity we observe today. Precise and reliable fossil calibrations, in combination with molecular data, can potentially reveal the typical distribution of rates within the time scope of those calibrations. However, extrapolation of younger rates into deeper time is problematic, as we have shown that these are likely to be higher in the past, beyond the deepest precise calibrations (c.f., Budd and Mann (2020b)). This imposes a currently insurmountable barrier to the use of the molecular clock for providing reliable clade age estimates, unless one can argue that rates of speciation and extinction are substantially decoupled from the process of molecular change. As noted earlier, making such an argument would preclude many putative explanations for observed rapid radiations, as well as being counter-intuitive. Although we have analyzed a model in which there is a perfect correlation between all evolutionary rates, in practice, we expect that any significant coupling will severely hamper the use of current clock methodologies. 
We suggest therefore that the use of molecular clocks for making extrapolative deep-time age estimates is fundamentally unreliable (interpolations within a tree, between nodes of known age, are likely to be more constrained, but here we expect that molecular data will add little to dates derived directly from fossils; e.g., Brown and Smith (2018)).
As well as revealing the broad outlines of the dynamics of a varying tempo model of evolution, our analysis of this model also provides several empirical predictions:
Analysis of clades which are known to originate at similar times will show that the large majority of modern diversity is contained in a small subset of these clades. Most concretely, we anticipate that in pairs of sister groups, one group is likely to greatly dominate the diversity of the total (c.f., Aldous (2001)).
The smaller sister group in a clade will be that which also experiences lower aggregate molecular and morphological change over its history. As such, the species in this group will tend to retain more plesiomorphic features relative to those in the larger sister group. Potential examples of such a phenomenon include the onychophorans relative to arthropods, cyclostomes relative to gnathostomes (Yu et al., 2024), or priapulids relative to other ecdysozoans (e.g., Webster et al. (2006)). This prediction gives some succour to the popular notion of “living fossils” that are slow-evolving, have few species, and to some extent resemble ancestral taxa (c.f., Crisp and Cook (2005) for the traditional view that “basal,” species-poor groups should not be regarded as ancestral or “primitive”; and Jenner (2022) for a more general discussion of the issue).
The direct ancestors of most modern species will show elevated rates of evolution (diversification, molecular, and morphological) throughout their history. Those lineages that gave rise to a majority of modern species will therefore show consistent rates of molecular evolution until close to the present, when they fall. However, if one analyses all historical taxa in a large clade (which is where most modern taxa reside), we expect to see very high rates of molecular change concentrated at the origin of the clade, declining consistently to the present. Nevertheless, both of these expected patterns take place within a wider context in which rates of evolution remain consistent overall—that is, measured over all species in all clades at a given time.
If we further assume that rates of evolution are associated with body size and generation time (e.g., high rates being linked to small bodies and short generation times), we expect that a randomly chosen modern species will have experienced an increase in body size and generation time in the recent past, having probably originated from ancestors with smaller body size and shorter generation time (c.f., Berv and Field (2018)).
Each of these predictions already enjoys some degree of empirical support in the existing literature, as indicated above. However, further research is needed to test each systematically to the extent that these predictions could be judged to be successful or falsified.
In conclusion, our analysis suggests that a strong correlation between rates of molecular evolution and diversification would explain several empirical features of the natural world, unify two key areas of statistical modeling within a common framework, and point toward necessary developments in phylogenetic inference and molecular dating in which this link is made explicit, such as an extension of the CET model to permit direct inference of actual historical rates from molecular data.
Acknowledgements
We are grateful to Tobias Uller, Ivan Prates, Dan Rabosky, Ben Slater, and Jonathan Ward for discussions and help with literature for various aspects of this work. Hélène Morlon and her group kindly made comments on a previous draft of the manuscript. We are particularly grateful to Antonio Segalini, who was generous with his time and expertise in discussing and coding the numerical solution of PDEs, and the careful and constructive comments of David Černý and 2 other reviewers.
Appendix 1
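We will make extensive use of probability generating functions. A quick review of their important properties follows. A probability generating function, $G(z)$, for the random variable $N$ is defined as:
$$G(z) = \mathbb{E}\left[z^{N}\right] = \sum_{n=0}^{\infty} P(N = n)\, z^{n} \tag{A.1}$$
The probability generating function has several important properties that will be useful in the subsequent exposition. In particular:
1. Normalisation: $G(1) = 1$ (in cases where $P$ represents a full probability distribution)
2. Extinction probability: $G(0) = P(N = 0)$
3. Expectation: $\mathbb{E}[N] = \left.\dfrac{\partial G}{\partial z}\right|_{z=1}$
4. Sum of random variables: If $Z = X + Y$ with $X$ and $Y$ independent, then $G_Z(z) = G_X(z)\, G_Y(z)$
5. Retrieval of probabilities: $P(N = n) = \dfrac{1}{n!} \left.\dfrac{\partial^{n} G}{\partial z^{n}}\right|_{z=0}$
In respect of point (5) above, the values of $P(N = n)$ can be retrieved efficiently by Fourier inversion:
$$P(N = n) = \frac{1}{2\pi i} \oint_{|z|=1} \frac{G(z)}{z^{n+1}}\, dz = \frac{1}{2\pi} \int_{0}^{2\pi} G\!\left(e^{i\theta}\right) e^{-i n \theta}\, d\theta \tag{A.2}$$
where the integral expression makes use of the Cauchy integral formula. This expression can be efficiently solved numerically using Fast Fourier Transform methods (Gleeson et al., 2014).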
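The inversion step can be sketched in a few lines. The example below is a minimal illustration rather than the code used for our results: it approximates the contour integral by a discrete Fourier sum over equally spaced points on the unit circle, written out directly so that it needs only the standard library. An FFT routine such as `numpy.fft.fft` evaluates the same sums much faster. The Poisson generating function is a stand-in chosen because its probabilities are known exactly; `pgf_to_probabilities`, `poisson_pgf`, and the grid size `m` are hypothetical names and values.

```python
import cmath
import math


def pgf_to_probabilities(pgf, m=64):
    """Approximate P(N = n) for n = 0..m-1 from a probability generating function.

    The estimate for each n also contains the (aliased) mass at n + m, n + 2m, ...,
    so m should be large enough that this tail mass is negligible.
    """
    # Sample the generating function at m equally spaced points on |z| = 1.
    samples = [pgf(cmath.exp(2j * math.pi * k / m)) for k in range(m)]
    probs = []
    for n in range(m):
        # Discrete version of (1 / 2*pi) * integral of G(e^{i t}) e^{-i n t} dt.
        total = sum(s * cmath.exp(-2j * math.pi * k * n / m)
                    for k, s in enumerate(samples))
        probs.append((total / m).real)
    return probs


def poisson_pgf(z, mean=2.0):
    """Generating function of a Poisson variable, used only as a test case."""
    return cmath.exp(mean * (z - 1.0))


if __name__ == "__main__":
    probs = pgf_to_probabilities(poisson_pgf)
    for n in range(5):
        exact = math.exp(-2.0) * 2.0 ** n / math.factorial(n)
        print(n, probs[n], exact)
```

In the setting of this appendix, `pgf` would be replaced by a numerical solution of the generating function evaluated at each sample point on the unit circle.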
Derivation of equation specifying evolution of the generating function
Define as the generating function for the number of species alive at time from a process that starts at log-tempo at time . We indicate the dependence by means of a subscript for reasons of notational clarity in later analysis.
Assume that we know the generating function for all at some time . How will the generating function change over a small increment of time ? Since the process is fundamentally homogeneous in time (i.e., there are no ‘special’ times), we can construct this by considering a process that starts incrementally earlier than the known generating function. Within this small interval of time the process will change log-tempo incrementally according to an OU process, and furthermore may either speciate (producing two new independent processes with identical starting tempos) or go extinct. Given a current tempo , the probability of speciation is , and that of extinction is . Based on these possible events, the new generating function is given by a mixture of generating functions at time :
| (A.3) |
Here specifies the probability for the tempo to transition from to over the time interval . We take to evolve via an OU process, with autocorrelation parameter and a stationary variance , experiencing an effective time within real time . Given this specification we have:
| (A.4) |
which yields: and up to first order terms in .
Taking a 2nd-order Taylor expansion of around and retaining first-order terms in gives:
| (A.5) |
Where we have dropped the explicit dependence of on arguments and for concision. Substituting the above expressions for and and taking the limit as gives the fundamental PDE of diversity evolution as given in equation 2.
| (A.6) |
Initial and boundary conditions
The most obvious question one can ask of this equation is: what is the probability that a process starting at log-tempo will generate species over time ? To answer this question we must solve equation 2 for different values of , and use the Fourier inversion formula to retrieve the probability distribution . Solving equation 2 requires both initial and boundary conditions. For the question posed above the appropriate initial condition is given by , since a process that does not evolve for any time must have one species. Choosing appropriate boundary conditions is more difficult; since we must solve equation 2 numerically we take ’no flow’ boundary conditions ( = 0) at some finite bounds and (we will usually use ).
We can also ask how many species of log-tempo will be produced at time by a process that starts with log-tempo at time . Define the generating function of this distribution by . Some consideration will show that the time evolution of obeys the same PDE as that of , but with a different initial condition. Since a process that starts with log-tempo cannot instantaneously evolve to one of , we use the initial condition: , where is the Dirac delta function.
Evolution of the mean diversity
The mean of a distribution is straightforwardly recovered from its generating function via the relationship . Applying this to the equation derived above for the evolution of the generating function gives the evolution of the mean diversity for a process that starts with log-tempo . Defining as the expected value of at time for a process starting with log-tempo :
| (A.7) |
Since by definition, we can simplify this to the expression given in equation 4:
| (A.8) |
where .
By using initial conditions , solving this equation gives the mean number of species generated by a process starting at time and log-tempo . As with the discussion of initial conditions above, we can also apply the same equation with different initial conditions to consider how many species with specific log-tempo are generated by a process that starts at log-tempo . Denoting the expected number of such species of this type as , in this case we use the initial condition , analogously to the case of solving for the generating function. By definition, the expected number of species in total will be the sum over all final log-tempos: . Furthermore, we can ask what the expected number of species with log-tempo is at time if the starting log-tempo is unknown but specified by a probability distribution . In this case we have:
| (A.9) |
and the expected total number of species (considering all possible starting and current log-tempos) can be denoted simply as and is given by:
| (A.10) |
Conditioning on survival
Equation 4 describes the evolution of the mean number of species through time, including all cases where the process goes extinct before the current time. If we want to ask how many species will be alive at time , assuming that the process hasn’t gone extinct, we can do so straightforwardly by excluding the extinct cases:
| (A.11) |
where is the survival probability for a process starting at log-tempo , determined from solving equation 2 for . However, we may also want to know the expected number of species at some time , conditioned on knowing that the process will survive to some future time . In this case the conditioning is more complex. We make use of the identity:
| (A.12) |
which leads to:
| (A.13) |
where is a correction term depending on and that we need to determine. Define a new generating function . Differentiating with respect to and evaluating at gives the required correction term in the equation above. As with the generating function , the evolution of is governed by equation 2:
| (A.14) |
Differentiating with respect to gives:
| (A.15) |
Unlike in the case for , varies as a function of and , and so solution of this equation for requires simultaneously solving this PDE and 2 with initial conditions: and .
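For orientation, the simplest form of conditioning on survival can be illustrated in the special case of a fixed tempo, that is, a homogeneous BDP, where the mean diversity and survival probability have well-known closed forms. The sketch below uses those closed forms and is not a solution of the variable-tempo equations, for which the survival probability must be obtained numerically as described above. The function names, and the symbols `lam` (speciation rate) and `mu` (extinction rate), both per species per myr, are assumptions made for the example.

```python
import math


def mean_species(lam, mu, t):
    """Expected number of species at time t from one species, extinct cases included."""
    return math.exp((lam - mu) * t)


def survival_probability(lam, mu, t):
    """Probability that a homogeneous birth-death process is not extinct at time t."""
    r = lam - mu
    if r == 0.0:
        # Critical case: speciation and extinction rates are equal.
        return 1.0 / (1.0 + lam * t)
    return r / (lam - mu * math.exp(-r * t))


def mean_species_given_survival(lam, mu, t):
    """Expected number of species at time t, excluding realisations that went extinct."""
    return mean_species(lam, mu, t) / survival_probability(lam, mu, t)


if __name__ == "__main__":
    lam, mu = 1.0, 0.5
    for t in (0.0, 1.0, 2.0, 5.0):
        print(t, mean_species(lam, mu, t), mean_species_given_survival(lam, mu, t))
```

Dividing by the survival probability always raises the expected diversity, and the increase is largest when extinction is high relative to speciation. This is the same inflation that underlies the Push of the Past.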
Lineages
Lineages are species in the past that have descendants in the present. Since molecular studies are based on extant species, any phylogeny reconstructed from these must consist of lineages. The evolution of lineages has thus been dubbed the ‘reconstructed process’ (Nee et al., 1994b), since these constitute the phylogeny that can, in principle, be reconstructed from molecular or morphological analysis of modern taxa.
We are interested in the number of species alive at time which will have descendants at some later time . Recall is the expected number of species of log-tempo at time in a process that starts at log-tempo . The expected number of these that will have descendants at time is (the survival probability over time for a new process starting with log-tempo ). Thus the expected number of lineages of log-tempo at time is . Summing over values of gives the total expected number of lineages, at time for a process starting with log-tempo , viewed from the perspective of time (we leave this dependence on the time of observation implicit in the notation, but note that lineages are only defined from the perspective of a specific point in time):
| (A.16) |
This expectation includes the cases where the number of lineages is zero, i.e., where there are no species at time . If we wish to condition on the process surviving to the present, we must remove these cases by dividing by
| (A.17) |
Evolution of tempo distribution
Assuming that we start a process with log-tempo , over time species generated by that process will diverge in tempos. How does this distribution of tempos evolve?
Consider starting a process with log-tempo , and then selecting a species at random at some time . The probability that this species has log-tempo is given by:
| (A.18) |
If the starting log-tempo is unknown, but drawn from a distribution , then we can marginalise the above equation with respect to to find the later distribution :
| (A.19) |
Taking the derivative with respect to time gives:
| (A.20) |
where . That is, the distribution of log-tempos evolves according to a replicator equation, where the ‘fitness’ of a log-tempo is given by the proportional increase in .
If we assume that at some point in time the distribution of log-tempos is given by , we can consider the instantaneous evolution of from this time. Defining the current time to be , we have the initial condition:
| (A.21) |
From equation 4, this implies that:
| (A.22) |
Applying standard rules for the operation of derivatives of the Dirac delta function, we can marginalise the above equation with respect to the initial distribution to give:
| (A.23) |
Substituting this into equation A.20, and noting again that , we get:
| (A.24) |
Where is the mean value of .
This then provides a replicator-mutation equation for the evolution of the tempo distribution, with the ‘fitness’ of log-tempo being . In particular, it specifies that the stable long term distribution of log-tempos is given by the solution to:
| (A.25) |
Notably, we can see that if , we recover the standard Fokker-Planck representation for the stationary OU process in the transformed distribution :
| (A.26) |
with the stationary solution , implying a mean log-tempo of .
From the equilibrium equation we can also find another useful relationship on the mean value. Multiplying equation A.25 by and integrating gives:
| (A.27) |
Integrating the partial differential terms by parts yields:
| (A.28) |
from which we can see that if then , i.e. the mean log-tempo will converge to zero when the diversification parameter is equal to the mean-reversion parameter.
Branch duration and expected molecular change
Considering a branch that begins with log-tempo , what is the expected time until that branch terminates, either by speciation or extinction? For a branch to endure for time it must first fail to terminate in time , and then survive for a further time with some new log-tempo . Integrating over the possible values of we have:
| (A.29) |
where is the cumulative probability that the branch originating with log-tempo has terminated by time .
Taking a second-order Taylor expansion of around and retaining first-order terms in we have:
| (A.30) |
The probability density for the branch to terminate at time is given by differentiation of the cumulative distribution: . Applying this transformation to the equation above yields:
| (A.31) |
The probability density for a branch to terminate at time thus follows the same form of differential equation as that for the mean number of species (equation 4), but with taking the place of . Solving this equation requires the initial condition , which implies .
Assuming that molecular rates of change are covariant to tempo (), for every increment of time the expected amount of molecular change (in arbitrary units that we label as myrs-equivalent; 1 myrs-equivalent being the expected molecular change in 1 myrs at a fixed tempo of ) is . We can transform the above equation for (which is given in terms of real time ) into one that applies over via a change of variables, to give the cumulative probability that a branch terminates before accumulating units of molecular change:
| (A.32) |
and as above we obtain the probability density to terminate at , , by differentiation:
| (A.33) |
Here we have the initial condition , which implies . Consideration of this equation will show that the partial derivatives in are initially zero and will remain zero for all values of . Thus we can simplify the equation to:
| (A.34) |
The solution to this equation is straightforward and shows that follows an exponential distribution with rate :
| (A.35) |
The notable feature of this density is that it does not depend on the starting log-tempo , implying that the amount of molecular change in a branch is independent of tempo.
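The independence from starting log-tempo can be illustrated numerically. The sketch below is not the code used to generate the figures in this paper; it is a minimal simulation under stated assumptions. The names `theta`, `sigma` and `r`, their values, and the Euler discretisation are all hypothetical choices made for illustration: log-tempo is assumed to follow a mean-reverting process whose drift and variance both scale with tempo, a branch is assumed to terminate at a rate `r` times tempo, and molecular change is assumed to accumulate at a rate equal to tempo.

```python
import math
import random


def molecular_change_at_termination(x0, r=1.0, theta=0.5, sigma=0.5,
                                    dt=0.005, rng=random):
    """Simulate one branch and return the molecular change accumulated when it ends.

    Assumed (illustrative) dynamics, with tau = exp(x) the tempo:
      log-tempo:       dx = -theta * tau * x * dt + sigma * sqrt(tau) * dW
      termination:     hazard r * tau per unit of real time
      molecular rate:  tau per unit of real time
    """
    x, change = x0, 0.0
    while True:
        tau = math.exp(x)
        # Molecular change accrued during this time step.
        change += tau * dt
        # Probability that the branch terminates during this time step.
        if rng.random() < 1.0 - math.exp(-r * tau * dt):
            return change
        # Euler-Maruyama update of the log-tempo.
        x += (-theta * tau * x * dt
              + sigma * math.sqrt(tau * dt) * rng.gauss(0.0, 1.0))


def mean_change(x0, n=2000, seed=1, **kwargs):
    """Average molecular change at termination over n simulated branches."""
    rng = random.Random(seed)
    total = sum(molecular_change_at_termination(x0, rng=rng, **kwargs)
                for _ in range(n))
    return total / n


if __name__ == "__main__":
    # Under these assumptions the mean should be close to 1 / r for every x0.
    for x0 in (-1.0, 0.0, 1.0):
        print(f"x0 = {x0:+.1f}: mean molecular change = {mean_change(x0):.3f}")
```

Branches that start with a high tempo terminate sooner in real time but accumulate change faster, so under these assumptions the two effects cancel and the sample means agree across starting values up to sampling and discretisation error.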
Schematic for genetic encoding of tempo
Here we describe a simple model for how a genetic encoding of tempo can lead to the modified OU process we take as the basis for tempo evolution. Consider a binary string of bases represented as ‘1’ or ‘0’, and define as the proportion of bases that are ‘active’ – that is, encoded as ‘1’. We assume that these bases mutate independently and neutrally, and with a rate that is covariant to the tempo , such that the probability for each base to mutate in a small interval of time is .
If the number of active bases at time is given by , then in the interval of time the number of bases that mutate from ‘1’ to ‘0’ is binomially distributed as , and similarly the number mutating from ‘0’ to ‘1’ is binomially distributed as . If we take to be large and to be small these binomial distributions can be approximated by normal distributions, such that the number of mutations from ‘1’ to ‘0’ is normally distributed with mean and variance , and the number of mutations from ‘0’ to ‘1’ is normally distributed with mean and variance .
The change in the number of active bases is given by the number mutating from ‘0’ to ‘1’, minus the number mutating from ‘1’ to ‘0’. Given the results above, this change is also normally distributed. Taking the limit as becomes infinitesimal (denoted ) and retaining only terms first order in we have:
| (A.36) |
This is equivalent to the following form of stochastic differential equation:
| (A.37) |
where is an increment from a standard Wiener process, with mean zero and variance .
To specify a genetic encoding of the log-tempo, let us now define , where is some arbitrary constant of proportionality, such that when half of bases are active this defines . We can then rewrite the above equation as:
| (A.38) |
Defining new variables and , we have:
| (A.39) |
which is precisely the modified OU process specified in equation 1. By taking to be sufficiently large we can extend the boundaries of minimum and maximum values of such that arbitrarily high or low values of are possible within this model. We have assumed that is large, and this assumption means that boundary effects around and can be safely ignored as these states are highly unlikely to occur under a random mutation process.
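The mean-reverting behaviour of this encoding can be sketched with a short simulation. The code below is illustrative only and rests on assumed notation that is not taken from the equations above: `n_bases` is the string length, `mu` the per-base mutation rate at a tempo of one, and `c` the constant relating the proportion of active bases to log-tempo. It applies the normal approximation directly, so that in each small time step the change in the number of active bases has a mean equal to the expected gains minus the expected losses and a variance equal to the sum of the two binomial variances.

```python
import math
import random


def simulate_active_bases(n0, n_bases=2000, mu=1.0, c=5.0,
                          dt=0.01, steps=500, seed=0):
    """Euler-Maruyama simulation of the number of active bases.

    Hypothetical notation (assumptions, not the paper's symbols):
      n_bases  total number of bases in the string
      mu       per-base mutation rate at a tempo of one
      c        constant linking the active proportion p to log-tempo,
               x = c * (p - 0.5), so that tempo tau = exp(x)
    Returns the trajectory of the active proportion p.
    """
    rng = random.Random(seed)
    n = float(n0)
    path = [n / n_bases]
    for _ in range(steps):
        tau = math.exp(c * (n / n_bases - 0.5))
        # Expected gains (0 -> 1) minus expected losses (1 -> 0).
        drift = (n_bases - 2.0 * n) * mu * tau * dt
        # Sum of the two (approximate) binomial variances is n_bases * mu * tau * dt.
        noise = math.sqrt(n_bases * mu * tau * dt) * rng.gauss(0.0, 1.0)
        # Keep the count within its physical bounds.
        n = min(max(n + drift + noise, 0.0), float(n_bases))
        path.append(n / n_bases)
    return path


if __name__ == "__main__":
    for start in (600, 1400):
        p = simulate_active_bases(start)
        print(f"start p = {p[0]:.2f}, end p = {p[-1]:.3f}")
```

Whether the string starts with an excess or a deficit of active bases, the proportion drifts back towards one half, which corresponds to a log-tempo of zero under the assumed encoding. Strings that start with a higher tempo relax faster in real time because their mutation rate is higher.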
This then provides a schematic representation of how tempo could be genetically encoded in a manner that naturally leads to the modified OU process description that we employ in this paper. The purpose of this schematic is not to argue that this represents the actual genetic encoding of tempo in any specific detail, but instead to illustrate how such an encoding would naturally give rise to the mean-reversion properties of the OU process, via the action of entropic forces. That is, the log-tempo tends to revert to the mean not due to any ecological mechanism, but simply because there are more possible encodings with than those that encode more extreme values of . One way in which tempo might influence rates in the way required by the CET model would be if it were encoded by a multilocus set of genes that influence body size, as body size appears to be associated with a syndrome of other features such as generation time and mutation rate (Martin, 2017). This encoding would satisfy the requirements of the CET model, although we stress again that we have no formal commitment to it.
Contributor Information
Graham E Budd, Department of Earth Sciences, Palaeobiology, Uppsala University, SE 752 36 Uppsala, Sweden.
Richard P Mann, Department of Statistics, School of Mathematics, University of Leeds, Leeds LS2 9JT, UK.
Supplementary Material
Data available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.q573n5ts2
Conflict of Interest
None declared.
Funding
This work was supported by Vetenskapsrådet (VR); UK Research and Future Leaders Fellowship grant 2022-03522 and UK Research and Innovation Future Leaders Fellowship MR/S032525/1 and MR/X036863/1.
Data availability
The R code for generating the figures is available from the Dryad Repository at https://doi.org/10.5061/dryad.q573n5ts2
References
- Aldous D.J. 2001. Stochastic models and descriptive statistics for phylogenetic trees, from Yule to today. Stat. Sci. 16:23–34.
- Arenas M. 2015. Trends in substitution models of molecular evolution. Front. Genet. 6:319.
- Aris-Brosou S., Yang Z. 2003. Bayesian models of episodic evolution support a late Precambrian explosive diversification of the Metazoa. Mol. Biol. Evol. 20:1947–1954.
- Barido-Sottani J., Morlon H. 2023. The ClaDS rate-heterogeneous birth–death prior for full phylogenetic inference in BEAST2. Syst. Biol. 72:1180–1187.
- Barraclough T.G., Savolainen V. 2001. Evolutionary rates and species diversity in flowering plants. Evol. 55:677–683.
- Beaulieu J.M., O’Meara B.C., Crane P., Donoghue M.J. 2015. Heterogeneous rates of molecular evolution and diversification could explain the Triassic age estimate for angiosperms. Syst. Biol. 64:869–878.
- Beck R.M., Lee M.S. 2014. Ancient dates or accelerated rates? Morphological clocks and the antiquity of placental mammals. Proc. R. Soc. B-Biol. Sci. 281:20141278.
- Benton M.J., Emerson B.C. 2007. How did life become so diverse? The dynamics of diversification according to the fossil record and molecular phylogenetics. Paleontol. 50:23–40.
- Berv J.S., Field D.J. 2018. Genomic signature of an avian Lilliput effect across the K-Pg extinction. Syst. Biol. 67:1–13.
- Blum M.G., François O. 2006. Which random processes describe the tree of life? A large-scale study of phylogenetic tree imbalance. Syst. Biol. 55:685–691.
- Bouckaert R., Vaughan T.G., Barido-Sottani J., Duchêne S., Fourment M., Gavryushkina A., Heled J., Jones G., Kühnert D., De Maio N., Matschiner M., Mendes F.K., Müller N.F., Ogilvie H.A., du Plessis L., Popinga A., Rambaut A., Rasmussen D., Siveroni I., Suchard M.A., Wu C-H., Xie D., Zhang C., Stadler T., Drummond A.J. 2019. BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 15(4):e1006650.
- Bromham L. 2003. Molecular clocks and explosive radiations. J. Mol. Evol. 57:S13–S20.
- Bromham L. 2009. Why do species vary in their rate of molecular evolution? Biol. Lett. 5:401–404.
- Bromham L. 2020. Causes of variation in the rate of molecular evolution. In: Ho S., editor. The molecular evolutionary clock. Cham: Springer. p. 45–64.
- Bromham L. 2024. Combining molecular, macroevolutionary, and macroecological perspectives on the generation of diversity. Cold Spring Harb. Perspect. Biol. 16:a041453.
- Bromham L., Woolfit M. 2004. Explosive radiations and the reliability of molecular clocks: island endemic radiations as a test case. Syst. Biol. 53:758–766.
- Bromham L.D., Hendy M.D. 2000. Can fast early rates reconcile molecular dates with the Cambrian explosion? Proc. R. Soc. Lond. B Biol. Sci. 267:1041–1047.
- Brown J.W., Smith S.A. 2018. The past sure is tense: on interpreting phylogenetic divergence time estimates. Syst. Biol. 67:340–353.
- Budd G.E., Mann R.P. 2018. History is written by the victors: the effect of the push of the past on the fossil record. Evol. 72:2276–2291.
- Budd G.E., Mann R.P. 2020a. The dynamics of stem and crown groups. Sci. Adv. 6:eaaz1626.
- Budd G.E., Mann R.P. 2020b. Survival and selection biases in early animal evolution and a source of systematic overestimation in molecular clocks. Interface Focus 10:20190110.
- Budd G.E., Mann R.P. 2024. Two notorious nodes: a critical examination of relaxed molecular clock age estimates of the bilaterian animals and placental mammals. Syst. Biol. 73:223–234.
- Coiro M., Doyle J.A., Hilton J. 2019. How deep is the conflict between molecular and fossil evidence on the age of angiosperms? New Phytol. 223:83–99.
- Cooney C.R., Thomas G.H. 2021. Heterogeneous relationships between rates of speciation and body size evolution across vertebrate clades. Nat. Ecol. Evol. 5:101–110.
- Crisp M.D., Cook L.G. 2005. Do early branching lineages signify ancestral traits? Trends Ecol. Evol. 20:122–128.
- Dos Reis M., Donoghue P.C., Yang Z. 2016. Bayesian molecular clock dating of species divergences in the genomics era. Nat. Rev. Genet. 17:71–80.
- Duchêne D.A., Hua X., Bromham L. 2017. Phylogenetic estimates of diversification rate are affected by molecular rate variation. J. Evol. Biol. 30:1884–1897.
- Eo S.H., DeWoody J.A. 2010. Evolutionary rates of mitochondrial genomes correspond to diversification rates and to contemporary species richness in birds and reptiles. Proc. R. Soc. B-Biol. Sci. 277:3587–3592.
- Erwin D.H. 2000. Macroevolution is more than repeated rounds of microevolution. Evol. Dev. 2:78–84.
- Forbes A.A., Bagley R.K., Beer M.A., Hippee A.C., Widmayer A. 2018. Quantifying the unquantifiable: why Hymenoptera, not Coleoptera, is the most speciose animal order. BMC Ecol. 18:1–11.
- Gleeson J.P., Ward J.A., O’Sullivan K.P., Lee W.T. 2014. Competition-induced criticality in a model of meme popularity. Phys. Rev. Lett. 112:048701.
- Goldie X., Lanfear R., Bromham L. 2011. Diversification and the rate of molecular evolution: no evidence of a link in mammals. BMC Evol. Biol. 11:286.
- Greenberg D.A., Mooers A.Ø. 2017. Linking speciation to extinction: diversification raises contemporary extinction risk in amphibians. Evol. Lett. 1:40–48.
- Henao-Diaz L.F., Pennell M. 2023. The major features of macroevolution. Syst. Biol. 72:1188–1198.
- Holmes J.D., Budd G.E. 2022. Reassessing a cryptic history of early trilobite evolution. Comm. Biol. 5:1177.
- Hua X., Bromham L. 2017. Darwinism for the genomic age: connecting mutation to diversification. Front. Genet. 8:12.
- Hugall A.F., Lee M.S. 2007. The likelihood node density effect and consequences for evolutionary studies of molecular rates. Evol. 61:2293–2307.
- Jablonski D. 2000. Micro-and macroevolution: scale and hierarchy in evolutionary biology and paleobiology. Paleobiol. 26:15–52.
- Jenner R.A. 2022. Ancestors in evolutionary biology: linear thinking about branching trees. Cambridge University Press.
- Jobson R.W., Albert V.A. 2002. Molecular rates parallel diversification contrasts between carnivorous plant sister lineages. Cladistics 18:127–136.
- Jukes T.H., Cantor C.R. 1969. Evolution of protein molecules. Mamm. Prot. Metab. 3:21–132.
- Kapli P., Flouri T., Telford M.J. 2021. Systematic errors in phylogenetic trees. Curr. Biol. 31:R59–R64.
- Kendall D.G. 1948. On the generalized “birth-and-death” process. Ann. Math. Stat. 19:1–15.
- Khurana M.P., Scheidwasser-Clow N., Penn M.J., Bhatt S., Duchêne D.A. 2024. The limits of the constant-rate birth–death prior for phylogenetic tree topology inference. Syst. Biol. 73:235–246.
- Lancaster L.T. 2010. Molecular evolutionary rates predict both extinction and speciation in temperate angiosperm lineages. BMC Evol. Biol. 10:1–10.
- Lanfear R., Ho S.Y., Love D., Bromham L. 2010. Mutation rate is linked to diversification in birds. Proc. Natl. Acad. Sci. 107:20423–20428.
- Lee M.S., Soubrier J., Edgecombe G.D. 2013. Rates of phenotypic and genomic evolution during the Cambrian explosion. Curr. Biol. 23:1889–1895.
- Lepage T., Lawi S., Tupper P., Bryant D. 2006. Continuous and tractable models for the variation of evolutionary rates. Math. Biosci. 199:216–233.
- Magallon S., Sanderson M.J. 2001. Absolute diversification rates in angiosperm clades. Evol. 55:1762–1780.
- Maliet O., Hartig F., Morlon H. 2019. A model with many small shifts for estimating species-specific diversification rates. Nat. Ecol. Evol. 3:1086–1092.
- Marshall C.R. 2017. Five palaeobiological laws needed to understand the evolution of the living biota. Nat. Ecol. Evol. 1:0165.
- Martin R.A. 2017. Body size in (mostly) mammals: mass, speciation rates and the translation of gamma to alpha diversity on evolutionary timescales. Hist. Biol. 29:576–593.
- McPeek M.A., Brown J.M. 2007. Clade age and not diversification rate explains species richness among animal taxa. Am. Nat. 169:E97–E106.
- Moen D., Morlon H. 2014. Why does diversification slow down? Trends Ecol. Evol. 29:190–197.
- Nee S. 2006. Birth-death models in macroevolution. Annu. Rev. Ecol. Evol. Syst. 37:1–17.
- Nee S., Holmes E.C., May R.M., Harvey P.H. 1994a. Extinction rates can be estimated from molecular phylogenies. Philos. Trans. Biol. Sci. 344:77–82.
- Nee S., May R.M., Harvey P.H. 1994b. The reconstructed evolutionary process. Philos. Trans. R. Soc. Lond. B Biol. Sci. 344:305–311.
- Omland K.E. 1997. Correlated rates of molecular and morphological evolution. Evol. 51:1381–1393.
- Quintero I., Lartillot N., Morlon H. 2024. Imbalanced speciation pulses sustain the radiation of mammals. Sci. 384:1007–1012.
- Rabosky D.L. 2010. Primary controls on species richness in higher taxa. Syst. Biol. 59:634–645.
- Rabosky D.L., Grundler M., Anderson C., Title P., Shi J.J., Brown J.W., Huang H., Larson J.G. 2014. BAMMtools: an R package for the analysis of evolutionary dynamics on phylogenetic trees. Methods Ecol. Evol. 5:701–707.
- Rabosky D.L., Santini F., Eastman J., Smith S.A., Sidlauskas B., Chang J., Alfaro M.E. 2013. Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nat. Comm. 4:1958.
- Ritchie A.M., Hua X., Bromham L. 2022a. Diversification rate is associated with rate of molecular evolution in ray-finned fish (Actinopterygii). J. Mol. Evol. 90:200–214.
- Ritchie A.M., Hua X., Bromham L. 2022b. Investigating the reliability of molecular estimates of evolutionary time when substitution rates and speciation rates vary. BMC Ecol. Evol. 22:1–19.
- Rolland J., Henao-Diaz L.F., Doebeli M., Germain R., Harmon L.J., Knowles L.L., Liow L.H., Mank J.E., Machac A., Otto S.P., Pennell M., Salamin N., Silvestro D., Sugawara M., Uyeda J., Wagner C.E., Schluter D. 2023. Conceptual and empirical bridges between micro- and macroevolution. Nat. Ecol. Evol. 7:1181–1193.
- Sarver B.A., Pennell M.W., Brown J.W., Keeble S., Hardwick K.M., Sullivan J., Harmon L.J. 2019. The choice of tree prior and molecular clock does not substantially affect phylogenetic inferences of diversification rates. PeerJ 7:e6334.
- Shafir A., Azouri D., Goldberg E.E., Mayrose I. 2020. Heterogeneity in the rate of molecular sequence evolution substantially impacts the accuracy of detecting shifts in diversification rates. Evol. 74:1620–1639.
- Shutovskyi A.M. 2023. Some applied aspects of the Dirac delta function. J. Math. Sci. 276:685–694.
- Smith S.A., Beaulieu J.M. 2024. Ad fontes: divergence-time estimation and the age of angiosperms. New Phytol. 244:760–766.
- Soltis P.S., Soltis D.E. 2016. Ancient WGD events as drivers of key innovations in angiosperms. Curr. Opin. Plant Biol. 30:159–165.
- Stadler T., Degnan J.H., Rosenberg N.A. 2016. Does gene tree discordance explain the mismatch between macroevolutionary models and empirical patterns of tree shape and branching times? Syst. Biol. 65:628–639.
- Stanley S.M. 1990. The general correlation between rate of speciation and rate of extinction: fortuitous causal linkages. In: Ross R.M., Allmon W.D., editors. Causes of evolution: a paleontological perspective. Chicago: University of Chicago Press. p. 103–127.
- Stanley S.M. 1998. Macroevolution: pattern and process. Baltimore (MD): Johns Hopkins University Press.
- Strathmann R.R., Slatkin M. 1983. The improbability of animal phyla with few species. Paleobiol. 9:97–106.
- Venditti C., Pagel M. 2010. Speciation as an active force in promoting genetic evolution. Trends Ecol. Evol. 25:14–20.
- Vezzani A., Barkai E., Burioni R. 2019. Single-big-jump principle in physical modeling. Phys. Rev. E 100:012108.
- Warnock R., Wright A. 2021. Understanding the tripartite approach to Bayesian divergence time estimation. Elements of Paleontology. Cambridge: Cambridge University Press.
- Webster A.J., Payne R.J., Pagel M. 2003. Molecular phylogenies link rates of evolution and speciation. Sci. 301:478.
- Webster B.L., Copley R.R., Jenner R.A., Mackenzie-Dodds J.A., Bourlat S.J., Rota-Stabelli O., Littlewood D., Telford M.J. 2006. Mitogenomics and phylogenomics reveal priapulid worms as extant models of the ancestral ecdysozoan. Evol. Develop. 8:502–510.
- Xiang Q.-Y., Zhang W.H., Ricklefs R.E., Qian H., Chen Z.D., Wen J., Li J.H. 2004. Regional differences in rates of plant speciation and molecular evolution: a comparison between eastern Asia and eastern North America. Evol. 58:2175–2184.
- Yu D., Ren Y., Uesaka M., Beavan A.J., Muffato M., Shen J., Li Y., Sato I., Wan W., Clark J.W., et al. 2024. Hagfish genome elucidates vertebrate whole-genome duplication events and their evolutionary consequences. Nat. Ecol. Evol. 8:519–535.