. 2021 Feb 25;82(4):22. doi: 10.1007/s00285-021-01559-5

Analysis of diversity-dependent species evolution using concepts in population genetics

Ingemar Kaj 1, Sylvain Glémin 2, Daniah Tahir 1, Martin Lascoux 3
PMCID: PMC7907050  PMID: 33634353

Abstract

In this work, we consider a two-type species model with trait-dependent speciation, extinction and transition rates under an evolutionary time scale. The scaling approach and the diffusion approximation techniques, which are widely used in mathematical population genetics, provide modeling tools and conceptual background to assist in the study of species dynamics, and help explore the analogy between trait-dependent species diversification and the evolution of allele frequencies in the population genetics setting. The analytical framework specified is then applied to models incorporating diversity-dependence, in order to infer effective results from processes in which the net diversification of species depends on the total number of species. In particular, the long-term fate of a rare trait may be analyzed under a partly symmetric scenario, using a time-change transform technique.

Keywords: Wright–Fisher diffusion, Two-type branching, Scaling limit process, Trait fixation probability, Carrying capacity model

Introduction

Models of species richness based on simple birth-death mechanisms with constant speciation and extinction rates suffer from the classical dichotomy of supercritical branching processes: species must either go extinct or grow in number without bound. Such models, applied to characterizing the divergence of living organisms in terms of the number of species in taxa and families, consequently fail to produce what is typically observed. The situation is similar for diversification models of multi-trait species based on multi-type, linear branching process theory. To qualify as a species model, at least one trait should have a strictly positive net growth rate, which again leads to the same supercritical dichotomy as for a single trait. See e.g., Haccou et al. (2005) for a general account of branching processes with emphasis on variation, growth and extinction of populations.

In reality, species numbers derived from fossil data or estimated from analysis of phylogenetic trees often tend to vary over long periods of time in a mode of stationarity or quasi-stationarity within a finite range of realistic values. It is thus rather natural to seek to implement, in the modeling set-up, some form of population size control or diversity-dependent regulation. While logistic-type population-size dependence is successfully implemented in branching process theory and easily interpretable in terms of population dynamics, the nature and causes of diversity-dependent diversification are still debated, cf. Lambert (2006), Rabosky (2013). The diversity-independent models studied in e.g., Maddison et al. (2007) and Tahir et al. (2019), assume that the per-lineage rates of speciation, extinction, and transition of traits are trait-dependent but constant in the sense that they do not depend on the current number of species of a particular trait, or the current total number of species. On the other hand, systems involving diversity-dependence retard supercritical growth by regulating the increase in species numbers through birth and death rates in various ways (Quental and Marshall 2010). In the papers of Parsons and Quince (2007a, 2007b), which consider population-level dynamics, the birth rates are assumed to be decreasing functions of the total number of individuals in the population, thus imposing a maximal carrying capacity of the system in case the birth rates vanish at some level. In logistic branching models, additional deaths are imposed in proportion to the square of the number of individuals to slow down population growth (Lambert 2005). Similarly, Parsons et al. (2008, 2010) apply population size dependent mortality rates. Fournier and Méléard (2004) consider an ecological system where individuals are characterized by their location, and the mortality rate depends on the local population density.
In an environment of, say, two competing traits, both speciation and extinction rates could be diversity-dependent, and the rates could be allowed to have different sensitivity to competition (Mallet 2012). Further variations have been proposed, such as using time varying, instead of fixed, carrying capacities (Marshall and Quental 2016), and including adaptive radiations by decoupling of the diversity-dependent dynamics of a sub-clade (Etienne and Haegeman 2012). Also proposed are models in which positive net diversification rates are followed by negative net diversification rates to produce ‘waxing and waning’ diversity dynamics (Morlon 2014), and models where the speciation rates are decreasing functions of time, such that the magnitude of the decline in speciation increases as time approaches the environment’s carrying capacity (Rabosky and Lovette 2008). Typical examples are discussed in e.g., Rabosky (2013) and Etienne et al. (2012), which summarize the parametrizations and properties of models involving density-dependence. In Rabosky (2013), Darwinian diversity-dependence and asymptotic diversity-dependence are contrasted. The former entails that a slowdown in speciation rates with diversity is in agreement with inter-species competition and Darwin’s principle of divergence. The latter represents patterns of species saturation and long term stability of diversity trajectories, but does not specify clear mechanisms of cause and effect. In general, the time scales relevant for diversity-dependence or diversity-independence are unspecified, thus leaving open whether regulatory effects act on microevolutionary rates over shorter ecological time scales, or on macroevolutionary rates averaged over much longer geological time spans (cf. Rabosky 2013; see also Benton and Emerson 2007).

We present here a framework to address diversity-independent versus diversity-dependent dynamics for a family of species with a binary trait, for which the rates of creation and extinction of species as well as the rates of transition of the trait between species are trait-dependent. Formal equivalence can be made between species level and population level dynamics, where number of species is equivalent to number of individuals, diversification rate is equivalent to growth rate, and transition in species traits is equivalent to mutation (e.g. Chevin (2016), Vellend (2010)). Useful insights can thus be gained by applying concepts of population genetics to species diversification analysis. We first rigorously investigate the mechanisms in species tree models which are analogous to those of population genetics models, such as mutation, selection, and genetic drift in allelic frequency models, and later, we apply the population genetics concepts to study diversity-dependent processes. The fraction of species in the family which carries one of the traits evolves in a manner directly comparable with the evolution of allele frequencies in the population genetics framework. The transition of a species of one trait to a species of the other trait resembles mutational change from one allelic type to another. Similarly, the trait-dependence present in creation rates, extinction rates, and the associated turnover rates causes selection effects as well as frequency dependent genetic drift coefficients. Furthermore, various forms of population size dependence may be cast as density-dependent selection mechanisms. Despite this formal equivalence, there are relevant biological differences between species level and population level models. For instance, whereas mutation is often much lower than selection in population genetics, there is no a priori reason to assume that rates of trait transition are of different magnitude than speciation and extinction rates.
This makes species traits less “heritable” than individual traits. Besides, the number of species relevant to the species diversification dynamics is likely far less than the effective size of a species (100 to 1000 for the former versus usually much more than $10^4$ for the latter). This makes diversification dynamics more sensitive to stochastic effects (see below).

The central technique in our approach is a diffusion approximation of a species tree Markov chain, which runs on a suitable time scale of evolutionary time units. In particular, our scaling method incorporates the point of view that “macroevolutionary speciation rates can involve processes associated with both the splitting and extinction of populations over ecological and demographic timescales”, as discussed in Rabosky (2013). From the complementary viewpoint of diffusion processes where the dynamics is provided by solutions of a PDE, our contribution relates closely to the pioneering work by Gillespie (1974). Using these links we study the effective population size in the model, and shed some light on an apparent paradoxical conclusion reported in Gillespie (1974). Several simplified, partly symmetric, parameter settings are identified in the stochastic model leading up to a reference case, or neutral case, which corresponds to fully neutral evolution in population genetics modeling.

Research has previously been carried out on similar topics, in which diffusion approximation methods were used to study density-dependent processes. Lambert (2006) provides a branching process approach to studying probability of fixation for two-allele population models, including density-dependence mechanisms. Parsons and Quince (2007a) extended a logistic growth model to a supercritical model of two competing types, and using diffusion approximation methods, estimated the probability of fixation for both types. They proposed a so called ‘non neutral’ model, in which the ratios of birth to death rates were assumed to be different for the two types, and showed that the type with higher birth to death rate ratio eventually takes over the entire population. In a subsequent analysis by Parsons and Quince (2007b), the fixation probability was again approximated using diffusion processes, but for ‘quasi neutral’ populations in which both types had equal birth to death ratios. In this case, the type with the higher birth rate shows an increase in numbers during the growth phase at densities below the carrying capacity, whereas near the carrying capacity, the type with lower birth rate is favored, due to smaller fluctuations in population density – a phenomenon termed as ‘r vs K selection’ (see e.g., Pianka (1970)). Abu Awad and Coron (2018) use a scaling approach similar to ours to study effective population mass, absorption and extinction times, etc., for a population controlled by a carrying capacity. Chevin (2016) utilized diffusion approximation methods to analyze the evolution of binary discrete and continuous traits, and interpreted diversification models in terms of population genetics concepts of species selection and random drift. In our current work, we develop a further direction by providing a thorough mathematical basis and an analytical framework for the analogy between binary trait species diversification and population genetics models. 
For completeness and clarity, we re-derived some of these previous results using our proposed analytical framework.

The species branching model

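We consider the dynamics of the size of a species family which carries a binary trait, marked 0 or 1, and undergoes trait-dependent splitting, extinction, and transition. We apply a two-type, continuous time Markov process $X=(X_u)_{u\ge 0}$ with components $X=(K,L)$, such that $X_u=(K_u,L_u)$, where $K_u$ and $L_u$ represent the number of species with trait 0 and trait 1, respectively, at time $u$. The two-type branching events are

$$
\begin{array}{ll}
\text{target state} & \text{branching rate}\\
(k,\ell)\to(k+1,\ell) & \lambda_0 k\\
\phantom{(k,\ell)\to{}}(k,\ell+1) & \lambda_1 \ell\\
\phantom{(k,\ell)\to{}}(k-1,\ell) & \mu_0 k\\
\phantom{(k,\ell)\to{}}(k,\ell-1) & \mu_1 \ell\\
\phantom{(k,\ell)\to{}}(k-1,\ell+1) & \delta_{01} k\\
\phantom{(k,\ell)\to{}}(k+1,\ell-1) & \delta_{10} \ell
\end{array}
\tag{1}
$$

in terms of splitting rates $\lambda_0$, $\lambda_1$, extinction rates $\mu_0$, $\mu_1$, and transition rates $\delta_{01}$ and $\delta_{10}$. For the analysis in this work, however, it will be convenient to use the equivalent set of parameters consisting of net diversification rates $d_i$ and turnover rates $\tau_i$, $i=0,1$, given by

$$d_0=\lambda_0-\mu_0,\quad d_1=\lambda_1-\mu_1,\quad \tau_0=\lambda_0+\mu_0,\quad \tau_1=\lambda_1+\mu_1,$$

in addition to the transition rates $\delta_{01}$ and $\delta_{10}$. Important aspects of the stochastic behavior of $X_u$ are determined by the eigenvalues, $\gamma_-$ and $\gamma_+$, $\gamma_-\le\gamma_+$, of the $2\times 2$ mean matrix

$$A=\begin{pmatrix} d_0-\delta_{01} & \delta_{10}\\ \delta_{01} & d_1-\delta_{10}\end{pmatrix}.$$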

For the case $\gamma_+\le 0$ of subcritical ($<$) and critical ($=$) branching, the species family will go extinct at a random time $\eta_0$, the extinction time, which is finite with probability one and such that $X_u=(0,0)$ for all $u\ge\eta_0$. Under the supercritical assumption $\gamma_+>0$, the process goes extinct with a positive probability strictly less than one; otherwise the species family survives and grows in size without bound. The species model is a two-type branching process $X=(X_u)_{u\ge 0}$ with linear jump rates, such that the dynamics of $X=(K,L)$ (here viewed as a column vector) satisfy

$$X_u=X_0+\int_0^u AX_s\,ds+M_u,\quad u\ge 0, \tag{2}$$

where $X_0$ is the initial state of the model (e.g., $X_0=(1,0)$) and $(M_u)_{u\ge 0}$ is a stochastic (martingale) term with mean $E(M_u)=0$. A general class of two-type branching processes, including population size dependent models, is characterized in Kaj and Tahir (2019) as unique solutions of such stochastic equations.

We are particularly interested in the representation $(P,R)$ of $X$, given by

$$P_u=\frac{K_u}{K_u+L_u},\quad R_u=K_u+L_u,\quad u\ge 0,$$

so that $P$ is the fraction of trait 0 among the species, and $R$ is the total number of species, regardless of trait. Here, $R_u=0$ for $u\ge\eta_0$, and we note that $P_u$ is well-defined for all $u$, with either $P_u=0$ or $P_u=1$ for $u\ge\eta_0$. Conversely, given $(P,R)$, we obtain $(K,L)$ from

$$K_u=P_uR_u,\quad L_u=(1-P_u)R_u,\quad u\ge 0.$$

By using Eq. (2) and Itô’s formula for pure jump processes, one obtains a stochastic equation for $(P,R)$, see Section 4 in Kaj and Tahir (2019).

As a simple illustration, Fig. 1 shows a simulation of the two-type process $X=(K,L)$ and the corresponding representation $(P,R)$. The choice of parameters is supercritical, such that the net growth rate is strictly positive for trait 0, $\gamma_+=6$, and strictly negative for trait 1, $\gamma_-=-5$. Thus, trait 1 on its own would go extinct. Here, however, the number of trait-1 species counted by $L$ is sustained and growing by transitions from trait 0.
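The eigenvalues quoted above can be checked with a short calculation. The following is a minimal sketch that evaluates $\gamma_-$ and $\gamma_+$ in closed form for the $2\times 2$ mean matrix $A$, using the parameter values given in the caption of Fig. 1. The function name `mean_matrix_eigenvalues` is illustrative and not part of the original work.

```python
import math

def mean_matrix_eigenvalues(d0, d1, delta01, delta10):
    """Return (gamma_minus, gamma_plus) for
    A = [[d0 - delta01, delta10], [delta01, d1 - delta10]]."""
    a, b = d0 - delta01, delta10
    c, d = delta01, d1 - delta10
    trace = a + d
    det = a * d - b * c
    # trace^2 - 4 det = (a - d)^2 + 4 b c >= 0 for nonnegative transition rates,
    # so both eigenvalues are real.
    root = math.sqrt(trace * trace - 4.0 * det)
    return (trace - root) / 2.0, (trace + root) / 2.0

# Parameter values from the caption of Fig. 1.
gamma_minus, gamma_plus = mean_matrix_eigenvalues(d0=7, d1=4, delta01=4, delta10=6)
print(gamma_minus, gamma_plus)
```

With these values the trace of $A$ is 1 and the determinant is $-30$, so the characteristic polynomial is $\gamma^2-\gamma-30$ with roots $-5$ and $6$.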

Fig. 1. Upper panels: simulation of the two-type branching process $X$; $L$ versus $K$ in (a) and $(K_u,L_u)$ versus time $u$ in (b). Lower panels: simulation of $P$ versus time in (c) and $R$ versus time in (d). Parameter values used for the simulations are $d_0=7$, $d_1=4$, $\tau_0=17$, $\tau_1=16$, $\delta_{01}=4$ and $\delta_{10}=6$.
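A simulation of this kind can be sketched with a standard Gillespie-type algorithm for the jump rates in (1). The code below is an illustration under stated assumptions, not the authors' simulation code: the function names, the cap `max_species`, and the time horizon are hypothetical choices, and the rates $\lambda_i=(\tau_i+d_i)/2$, $\mu_i=(\tau_i-d_i)/2$ are derived from the caption values.

```python
import random

# Jumps of (k, l) in the order of the six events in (1).
JUMPS = [(1, 0), (0, 1), (-1, 0), (0, -1), (-1, 1), (1, -1)]

def simulate_two_type(k0, l0, lam0, lam1, mu0, mu1, delta01, delta10,
                      t_max, max_species=10_000, seed=None):
    """Simulate (K_u, L_u) up to time t_max; returns a list of (u, k, l)."""
    rng = random.Random(seed)
    u, k, l = 0.0, k0, l0
    path = [(u, k, l)]
    # Stop at extinction, at the time horizon, or at the (assumed) size cap.
    while 0 < k + l < max_species:
        rates = [lam0 * k, lam1 * l, mu0 * k, mu1 * l, delta01 * k, delta10 * l]
        total = sum(rates)
        u += rng.expovariate(total)          # exponential waiting time
        if u > t_max:
            break
        dk, dl = rng.choices(JUMPS, weights=rates)[0]
        k, l = k + dk, l + dl
        path.append((u, k, l))
    return path

def fraction_and_total(path):
    """Map (u, k, l) to (u, P_u, R_u) while the family is non-extinct."""
    return [(u, k / (k + l), k + l) for u, k, l in path if k + l > 0]

# Fig. 1 values: d0=7, tau0=17 give lam0=12, mu0=5; d1=4, tau1=16 give lam1=10, mu1=6.
path = simulate_two_type(1, 0, 12, 10, 5, 6, 4, 6, t_max=0.5, seed=1)
pr_path = fraction_and_total(path)
print(len(path), path[-1])
```

Since a single initial species goes extinct with positive probability, repeated runs with different seeds will show both early extinction and rapid growth.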

Law of large numbers

To understand the average behavior of $(K,L)$ and of $(P,R)$, it is useful to consider a system which starts at time $u=0$ with $m$ species, which is equivalent to summing $m$ i.i.d. copies of the original model starting with one initial species each. Write $X^{(m)}=(K^{(m)},L^{(m)})$ for such a system, with e.g. $X_0^{(m)}=(m,0)$, and let $\hat X^m=\frac{1}{m}X^{(m)}$ be the resulting average. The (strong) law of large numbers applies, so for each $u\ge 0$,

$$\hat X^m_u\to\hat x_u=(k_u,\ell_u)\quad\text{a.s.},\quad m\to\infty,$$

where the limit is the expected value $\hat x_u=E(X_u)$, which solves the linear ODE

$$\frac{d}{du}\hat x_u=A\hat x_u,\quad \hat x_0=(1,0).$$

It follows from the law of large numbers for $\hat X^m=(\hat K^m,\hat L^m)$ that the process $(P^m,R^m)$, defined by

$$P^m_u=\frac{\hat K^m_u}{\hat K^m_u+\hat L^m_u},\quad R^m_u=\hat K^m_u+\hat L^m_u,\quad u\ge 0,$$

converges, as $m\to\infty$, to the solution $(p,r)$, where $p_u=k_u/(k_u+\ell_u)$, $r_u=k_u+\ell_u$, of the deterministic limiting ODE

$$
\begin{aligned}
\frac{d}{du}p_u&=(d_0-d_1)p_u(1-p_u)-\delta_{01}p_u+\delta_{10}(1-p_u), & p_0&=1,\\
\frac{d}{du}r_u&=r_u\big(d_0p_u+d_1(1-p_u)\big), & r_0&=1.
\end{aligned}
$$

The limit equation for the fraction $p$ is a first indication of a connection to population genetics modeling, as it resembles the deterministic part of allele frequency dynamics (with mutation intensities $\delta_{01}$, $\delta_{10}$ and selection intensity $d_0-d_1$). In this model the net transition rate of trait 0, given by the term $-\delta_{01}p_u+\delta_{10}(1-p_u)$, arises from so-called anagenetic transitions between the two types. Anagenetic transitions occur along the branches of the species tree separate from speciation events. In contrast, a cladogenetic transition is a change in trait combined with the survival of the original type, hence associated with two-type speciation. To account for cladogenetic transitions occurring with probabilities $a_0$ and $a_1$ respectively for the two types, we modify the speciation and transition rates in (1) as

$$
\begin{array}{ll}
(k,\ell)\to(k+1,\ell) & \lambda_0k+a_1\delta_{10}\ell\\
\phantom{(k,\ell)\to{}}(k,\ell+1) & \lambda_1\ell+a_0\delta_{01}k\\
\phantom{(k,\ell)\to{}}(k-1,\ell+1) & (1-a_0)\delta_{01}k\\
\phantom{(k,\ell)\to{}}(k+1,\ell-1) & (1-a_1)\delta_{10}\ell,
\end{array}
$$

while the extinction rates remain the same (Tahir et al. 2019). Then the stated ODE for $(p,r)$ holds with $d_0$ and $d_1$ replaced by $\tilde d_0=d_0+a_0\delta_{01}$ and $\tilde d_1=d_1+a_1\delta_{10}$. Hence the relative impact of cladogenetic transitions as measured by $a_0$ and $a_1$ affects the “selection intensity” $\tilde d_0-\tilde d_1$. This type of transition is analogous to a ‘reduction in species level heritability of the trait’ (Chevin 2016).
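The relation between the linear ODE for $(k_u,\ell_u)$ and the nonlinear ODE for $(p_u,r_u)$ can be checked numerically. The sketch below integrates both systems with a classical fourth-order Runge–Kutta scheme (our choice of integrator, not taken from the source) and compares the results for the Fig. 1 parameters. The optional arguments `a0`, `a1` implement the cladogenetic replacement $d_i\mapsto\tilde d_i$. All function names are illustrative.

```python
def rk4(rhs, y0, u_end, steps):
    """Integrate the autonomous system y' = rhs(y) from 0 to u_end."""
    h = u_end / steps
    y = list(y0)
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y

def linear_rhs(d0, d1, delta01, delta10):
    """Right-hand side of d/du (k, l) = A (k, l)."""
    def rhs(y):
        k, l = y
        return [(d0 - delta01) * k + delta10 * l,
                delta01 * k + (d1 - delta10) * l]
    return rhs

def fraction_rhs(d0, d1, delta01, delta10, a0=0.0, a1=0.0):
    """Right-hand side of the limiting ODE for (p, r)."""
    d0_eff = d0 + a0 * delta01   # cladogenetic correction for trait 0
    d1_eff = d1 + a1 * delta10   # cladogenetic correction for trait 1
    def rhs(y):
        p, r = y
        dp = (d0_eff - d1_eff) * p * (1 - p) - delta01 * p + delta10 * (1 - p)
        dr = r * (d0_eff * p + d1_eff * (1 - p))
        return [dp, dr]
    return rhs

k_u, l_u = rk4(linear_rhs(7, 4, 4, 6), [1.0, 0.0], u_end=1.0, steps=2000)
p_u, r_u = rk4(fraction_rhs(7, 4, 4, 6), [1.0, 1.0], u_end=1.0, steps=2000)
print(p_u, k_u / (k_u + l_u))
print(r_u, k_u + l_u)
```

For these parameter values the stationary fraction solves $3p(1-p)-4p+6(1-p)=0$, that is $p=2/3$, which is also the trait-0 share of the eigenvector of $A$ belonging to $\gamma_+=6$.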

Evolutionary time scaling

In addition to the short time scale of “species generations”, we introduce an evolutionary time scale and apply diffusion approximation methods to help understand the dynamics of the species model over the long time scale. Indeed, some of the most powerful tools of mathematical population genetics rely on approximation with diffusion processes, which allow for efficient computation of quantities such as fixation probabilities, expected times to fixation, and expected frequency spectra. The starting point for this approach is to relate the Markov chain X, with the population genetics pre-limit process, such as the Wright–Fisher model or Moran model, using the time scale of generations.

Scaled Wright–Fisher and Moran models

To see the relevance of time scales, we recall the haploid version of the standard bi-allelic Wright–Fisher model in discrete time $k\ge 0$, with fixed population size $N$, selection coefficient $s$ representing reproductive weights $1+s$ for trait 0 and $1$ for trait 1, and probabilities $p_{01}$ and $p_{10}$ for mutations from allele 0 to 1 and 1 to 0, respectively. Letting $Z_0$ be the initial number of trait-0 alleles and $Z_k$, $k\ge 1$, the number of 0-alleles after selective sampling of $k$ new generations allowing for mutational change of traits in each step, this is the Markov chain $Z=(Z_k)_{k\ge 0}$. The standard diffusion approximation of the Wright–Fisher model involves a re-scaling of both time and population size with $N$, as well as a scaling of the model parameters. For this aim, let $\gamma$, $\rho_{01}$, and $\rho_{10}$ be the scaling parameters which control the rate at which the strength of selection and mutation tends to 0 with increasing $N$, as

$$s_N=\gamma/N,\quad p_{01}^{(N)}=\rho_{01}/N,\quad p_{10}^{(N)}=\rho_{10}/N.$$

More generally, these relations can be understood as limit relations, for example $s_N\sim\gamma/N$, i.e., $\lim_{N\to\infty}Ns_N=\gamma$. We write $Z^{(N)}$ for the associated scaled Wright–Fisher model, and consider the frequency process $\xi^N=(\xi^N_t)_{t\ge 0}$ on the evolutionary time scale of $Nt$ generations, defined as

$$\xi^N_t=\frac{1}{N}Z^{(N)}_{[Nt]},\quad t\ge 0.$$

The Wright–Fisher diffusion, $\xi=(\xi_t)_{t\ge 0}$, is the limit process of $\xi^N$ as $N\to\infty$, and is known to be the unique, strong solution of the stochastic differential equation

$$d\xi_t=\gamma\xi_t(1-\xi_t)\,dt-\rho_{01}\xi_t\,dt+\rho_{10}(1-\xi_t)\,dt+\sqrt{\xi_t(1-\xi_t)}\,dB_t,\quad \xi_0=x, \tag{3}$$

with $B$ a Brownian motion. Written in the form $d\xi_t=b(\xi_t)\,dt+\sigma(\xi_t)\,dB_t$, with

$$b(y)=\gamma y(1-y)-\rho_{01}y+\rho_{10}(1-y),\quad \sigma^2(y)=y(1-y),$$

we may refer to $b$ as the infinitesimal mean or the diffusion drift function and to $\sigma^2$ as the infinitesimal variance, the diffusion variance, or the genetic drift function. The Moran model applies a birth-death mechanism for reproduction of alleles, rather than the characteristic binomial sampling of the Wright–Fisher model. Since birth events, where alleles copy and spread, are always compensated for by deaths of randomly sampled individuals, the population size is still maintained at a constant level $N$. The corresponding rescaling and diffusion approximation as $N\to\infty$ yield the same limit process except for a factor 2 in the variance function, $\sigma^2(y)=2y(1-y)$. The further parameterized case, $\sigma^2(y)=y(1-y)/N_{\mathrm{eff}}$, can be associated with the Wright–Fisher approximation of a system modulated by an effective population size, $N_{\mathrm{eff}}$. For the general theory we refer to Etheridge (2011).
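The role of the $1/N$ scaling can be made concrete by computing the one-generation moments of the Wright–Fisher chain exactly. The sketch below assumes one common ordering of the update (selection first, then mutation, then binomial sampling); the text does not fix this ordering, and the function names are illustrative. Since one generation corresponds to $1/N$ units of evolutionary time, $N$ times the mean and variance of the one-step frequency increment should be close to $b(x)$ and $\sigma^2(x)$ for large $N$.

```python
from math import comb

def wf_next_generation_pmf(z, N, s=0.0, p01=0.0, p10=0.0):
    """Distribution of Z_{k+1} given Z_k = z (assumed order: selection, mutation, sampling)."""
    x = z / N
    x_sel = (1.0 + s) * x / ((1.0 + s) * x + (1.0 - x))   # weights 1+s versus 1
    x_mut = x_sel * (1.0 - p01) + (1.0 - x_sel) * p10     # mutation 0->1 and 1->0
    return [comb(N, j) * x_mut**j * (1.0 - x_mut)**(N - j) for j in range(N + 1)]

def scaled_increment_moments(z, N, gamma, rho01, rho10):
    """Return (N * E[increment], N * Var[increment]) of the frequency over one generation."""
    pmf = wf_next_generation_pmf(z, N, gamma / N, rho01 / N, rho10 / N)
    x = z / N
    mean = sum(j / N * q for j, q in enumerate(pmf))
    var = sum((j / N - mean) ** 2 * q for j, q in enumerate(pmf))
    return N * (mean - x), N * var

def drift_b(y, gamma, rho01, rho10):
    """Infinitesimal mean b(y) of the Wright-Fisher diffusion (3)."""
    return gamma * y * (1 - y) - rho01 * y + rho10 * (1 - y)

N, z = 200, 60                      # initial frequency x = 0.3
gamma, rho01, rho10 = 1.0, 0.5, 0.3
drift_est, var_est = scaled_increment_moments(z, N, gamma, rho01, rho10)
print(drift_est, drift_b(z / N, gamma, rho01, rho10))
print(var_est, (z / N) * (1 - z / N))
```

The remaining discrepancy is of order $1/N$ and shrinks as the population size in the sketch is increased.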

Gillespie approach

The infinitesimal generator of the Wright–Fisher diffusion process is the differential operator $G$, defined for a class of sufficiently regular functions $f$ on the unit interval by

$$Gf(y)=\frac{1}{2}y(1-y)f''(y)+\gamma y(1-y)f'(y),\quad 0\le y\le 1,$$

for which the function $u(t,x)=u_t(x)=E_x[f(\xi_t)]$ solves Kolmogorov’s backward equation

$$\frac{\partial}{\partial t}u_t(x)=Gu_t(x),\quad t>0,\qquad u_0(x)=f(x),\quad 0<x<1.$$

The population genetics model of Gillespie (1974) allows the population size to vary following an offspring distribution with mean $1+\mu_i$ and variance $\sigma_i^2$, for two alleles $A_i$, $i=1,2$. Writing $p$ for the frequency of allele $A_1$ and $r$ for the total population size, this approach suggests the two-dimensional Markov generator

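$$
\begin{aligned}
Gf(p,r)={}&p(1-p)\Big(\mu_1-\mu_2+\frac{1}{r}(\sigma_2^2-\sigma_1^2)\Big)\frac{\partial f}{\partial p}
+\frac{p(1-p)}{r}\big(p\sigma_2^2+(1-p)\sigma_1^2\big)\frac{1}{2}\frac{\partial^2 f}{\partial p^2}\\
&+r\big(p\mu_1+(1-p)\mu_2\big)\frac{\partial f}{\partial r}
+r\big(p\sigma_1^2+(1-p)\sigma_2^2\big)\frac{1}{2}\frac{\partial^2 f}{\partial r^2}
+p(1-p)(\sigma_1^2-\sigma_2^2)\frac{\partial^2 f}{\partial p\,\partial r}.
\end{aligned}
\tag{4}
$$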

Here, we are using $r$ for population size as opposed to the more classical population genetics notation $n$ as in Gillespie (1974), to avoid mix-up with our general scaling parameter $n$ used below. The associated Kolmogorov backward equation is dual to the corresponding forward equation given as Equation 3 in Gillespie (1974). In particular, we read out that $r/(p\sigma_2^2+(1-p)\sigma_1^2)$ acts as a varying effective population size. It is perhaps worth stressing that this measure of effective population size varies not only with the size $r$ but also, in general, with the frequency $p$, due to possibly different variance parameters $\sigma_1^2$, $\sigma_2^2$. Earlier references on random effective population size and stochastic population size models can be found in Kaj and Krone (2003), Sjödin et al. (2005), together with developments on coalescent models for such situations. An additional result in Gillespie (1974), in particular Equations 6 and 7, refers to the special case $\mu_1=\mu_2$, saying that the probability of fixation of allele $A_1$, given an initial frequency $p$, equals

$$u(p)=\frac{\sigma_2^2p}{\sigma_1^2(1-p)+\sigma_2^2p}. \tag{5}$$

The diffusion approximation model of species richness we develop next will be closely related to the model described by Eq. (4). We will recover and discuss further the role of both the random effective population size $r/(p\sigma_2^2+(1-p)\sigma_1^2)$ and the fixation probability $u(p)$.
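As a plausibility check on Eq. (5), one can verify numerically that $u(p)$ is annihilated by the $p$-part of the generator in Eq. (4) when $\mu_1=\mu_2$, that is, by the operator with drift $p(1-p)(\sigma_2^2-\sigma_1^2)/r$ and variance $p(1-p)(p\sigma_2^2+(1-p)\sigma_1^2)/r$. The sketch below does this with central finite differences; the function names and the step size `h` are our own choices.

```python
def fixation_probability(p, var1, var2):
    """u(p) from Eq. (5): fixation probability of allele A1 when mu1 = mu2."""
    return var2 * p / (var1 * (1.0 - p) + var2 * p)

def generator_residual(p, var1, var2, r=1.0, h=1e-4):
    """Apply the p-part of the generator (4), with mu1 = mu2, to u by finite differences."""
    u = lambda q: fixation_probability(q, var1, var2)
    du = (u(p + h) - u(p - h)) / (2.0 * h)
    d2u = (u(p + h) - 2.0 * u(p) + u(p - h)) / h**2
    drift = p * (1.0 - p) * (var2 - var1) / r
    variance = p * (1.0 - p) * (p * var2 + (1.0 - p) * var1) / r
    return drift * du + 0.5 * variance * d2u

# Allele A1 has the larger offspring variance, so it fixes less often than neutrality suggests.
print(fixation_probability(0.5, var1=2.0, var2=1.0))
print(generator_residual(0.3, var1=2.0, var2=1.0, r=5.0))
```

Analytically, $u'(p)=\sigma_1^2\sigma_2^2/D(p)^2$ and $u''(p)=-2\sigma_1^2\sigma_2^2(\sigma_2^2-\sigma_1^2)/D(p)^3$ with $D(p)=\sigma_1^2(1-p)+\sigma_2^2p$, so the drift and variance contributions cancel exactly; the residual printed above differs from zero only by discretization and rounding error.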

Scaled species model

Our investigation takes the view that, given non-extinction of the continuous time branching process $X$, the component $K=(K_u)_{u\ge 0}$ in the species model relates naturally to the discrete time Markov chain $(Z_k)_{k\ge 0}$ of the population genetics model. Of course these quantities are not immediately comparable since the total number of species $R$ varies randomly, while the population size $N$ is fixed from one generation to the next. For this reason, we introduce a species family process $X^{(n)}=(K^{(n)},L^{(n)})$ marked by a separate scaling variable $n$. The two-type branching process is driven by an initial number of species of magnitude $n$, and has scaled parameters $d_0^{(n)}$, $d_1^{(n)}$, $\tau_0^{(n)}$, $\tau_1^{(n)}$, $\delta_{01}^{(n)}$ and $\delta_{10}^{(n)}$, which control the rates of diversification, turnover, and transition on the population timescale. For fixed $t$, the fraction $n^{-1}X^{(n)}_{[nt]}$ is the typical configuration of trait frequencies in the species family seen over the span of $t$ time units that elapse upon completing $[nt]$ generations. The precise behavior of the rates as functions of $n$ is not crucial as long as they scale for large $n$ with the following macroevolutionary rates which act on the new time scale,

$$d_i^{(n)}\sim\frac{\beta_i}{n},\quad \tau_i^{(n)}\sim\tau_i,\quad i=0,1,\qquad \delta_{01}^{(n)}\sim\frac{\rho_{01}}{n},\quad \delta_{10}^{(n)}\sim\frac{\rho_{10}}{n}. \tag{6}$$

Here, $\tau_i$ measures variance in the production of daughter species and is the reference turnover parameter for extinction and/or speciation events of trait $i$, $\beta_i$ is the macroevolutionary net diversification rate of trait $i$, seen as the long-time average net effect of speciation and extinction, and $\rho_{01}$ and $\rho_{10}$ are the resulting macroevolutionary rates of exchange of traits. For later reference we also adopt from Gillespie (1977) the new evolutionary principle that trait success (fitness in population genetics terminology) not only increases with offspring mean but also may decrease with offspring variance inversely proportional to population size. Using macroevolution terminology, $\tau$ refers to the species turnover rate, a quantity which has received increasing attention in the empirical literature aiming at interpreting patterns of species diversity (Gamisch and Comes 2019; Han et al. 2020; Nakov et al. 2019). In the current situation of developmental stochastic dynamics, as opposed to environmental stochastic influences, $\beta_i$ (or rather $1+\beta_i$) represents offspring mean and $\tau_i$ offspring variance. Then, following Gillespie (1974, 1977), the fitness is properly measured by $f_i(r)$, where

$$f_i(r)=\beta_i-\frac{\tau_i}{r},\quad i=0,1. \tag{7}$$

Conditionally on the set of paths of $X$ which do not go extinct, let

$$P^n_t=\frac{K^{(n)}_{nt}}{K^{(n)}_{nt}+L^{(n)}_{nt}},\quad R^n_t=\frac{1}{n}\big(K^{(n)}_{nt}+L^{(n)}_{nt}\big),\quad t\ge 0.$$

The ratio $P^n$ will play the role of the frequency process $\xi^N$, even though the scaled Wright–Fisher process $\xi^N$ is derived from a population of fixed size $N$ whereas $P^n$ is a frequency with respect to the stochastically varying species richness process $R^n$.

The final step towards a generic species trait richness model is finding the limit processes in the scaling regime as $n\to\infty$, expecting the remaining macroevolutionary rates to be $\tau_0$, $\tau_1$ as rates of change for the speciation/extinction process, $\beta_0$, $\beta_1$ diversification rates for net growth (if positive) or net decline (if negative), and $\rho_{01}$, $\rho_{10}$ controlling trait interchange. We expect in the limit to obtain a bivariate, coupled diffusion process, $X=(X_t)_{t\ge 0}$, with representations $(K,L)$ and $(P,R)$, say. To assist in identifying the limit model we will exploit the existing theory of measure-valued branching processes, briefly laid out in Sect. 5 below. In fact, it is straightforward to re-interpret $X^{(n)}$ as a measure-valued branching process on trait space $E=\{0,1\}$ with scaled parameters. In this view our macroevolutionary scaling is consistent with a super-process approximation, for which the spatial motion is the on-off process on $E$ with jump rates $\rho_{01}$, $\rho_{10}$. As a consequence, $(n^{-1}X^{(n)}_{[nt]})_{t\ge 0}$ has a weak limit which is a super-on/off process with binary branching.

Proposition 1

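Suppose that

$$n^{-1}X_0^{(n)}\to X_0,\quad n\to\infty,$$

where $X_0$ is a nonnegative deterministic column vector. As $n\to\infty$ the scaled branching process $(n^{-1}X^{(n)}_{nt})_{t\ge 0}$ converges weakly to the continuous state branching process $X=(K,L)$, which is the unique solution of the SDE

$$X_t=X_0+\int_0^t\begin{pmatrix}\beta_0-\rho_{01} & \rho_{10}\\ \rho_{01} & \beta_1-\rho_{10}\end{pmatrix}X_s\,ds+\begin{pmatrix}1\\0\end{pmatrix}\int_0^t\sqrt{\tau_0K_s}\,dB^0_s+\begin{pmatrix}0\\1\end{pmatrix}\int_0^t\sqrt{\tau_1L_s}\,dB^1_s,\quad t\ge 0,$$

where $B^0$, $B^1$ are independent, standard Brownian motions. Furthermore, $(P^n,R^n)$ converges weakly to $(P,R)$, the unique solution of

$$dP_t=P_t(1-P_t)\big(f_0(R_t)-f_1(R_t)\big)\,dt-\rho_{01}P_t\,dt+\rho_{10}(1-P_t)\,dt+\sqrt{P_t(1-P_t)\big(\tau_0(1-P_t)+\tau_1P_t\big)\frac{1}{R_t}}\,dB^-_t, \tag{8}$$

$$dR_t=R_t\big(\beta_0P_t+\beta_1(1-P_t)\big)\,dt+\sqrt{R_t\big(\tau_0P_t+\tau_1(1-P_t)\big)}\,dB^+_t, \tag{9}$$

where $f_0(r)$ and $f_1(r)$ are the fitness measures in Eq. (7) and $B^-_t$, $B^+_t$ are standard Brownian motions. The Brownian motions are independent when $\tau_0=\tau_1$. In general, the quadratic covariation processes of $(B^-,B^+)$ and $(P,R)$ are given by

$$\langle B^-,B^+\rangle_t=\Big(\frac{\tau_0}{\tau_1}-1\Big)\int_0^t\sqrt{\frac{P_s(1-P_s)}{\big(\tau_0(1-P_s)/\tau_1+P_s\big)\big(\tau_0P_s/\tau_1+1-P_s\big)}}\,ds,$$

and

$$\langle P,R\rangle_t=(\tau_0-\tau_1)\int_0^tP_s(1-P_s)\,ds. \tag{10}$$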
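The limit process in Proposition 1 can be explored with a simple Euler–Maruyama discretization of the SDE for $(K,L)$, from which $(P,R)$ follows as $P=K/(K+L)$ and $R=K+L$. The sketch below is a crude illustration only: the truncation at zero, the step size, and the parameter values are our own assumptions, and no claim is made about the accuracy of the scheme near the boundary.

```python
import math
import random

def euler_maruyama_kl(k0, l0, beta0, beta1, tau0, tau1, rho01, rho10,
                      t_end, steps, seed=None):
    """Approximate (K_t, L_t) on a grid of `steps` intervals; returns a list of (t, k, l)."""
    rng = random.Random(seed)
    h = t_end / steps
    k, l = float(k0), float(l0)
    path = [(0.0, k, l)]
    for i in range(1, steps + 1):
        # Drift from the matrix [[beta0 - rho01, rho10], [rho01, beta1 - rho10]].
        drift_k = (beta0 - rho01) * k + rho10 * l
        drift_l = rho01 * k + (beta1 - rho10) * l
        # Independent Brownian increments with variances tau0*K*h and tau1*L*h.
        noise_k = math.sqrt(tau0 * k * h) * rng.gauss(0.0, 1.0)
        noise_l = math.sqrt(tau1 * l * h) * rng.gauss(0.0, 1.0)
        # Truncate at zero so the square roots stay defined (an ad hoc choice).
        k = max(k + drift_k * h + noise_k, 0.0)
        l = max(l + drift_l * h + noise_l, 0.0)
        path.append((i * h, k, l))
        if k + l == 0.0:
            break                      # the family is extinct
    return path

def to_fraction_and_richness(path):
    """Map (t, k, l) to (t, P_t, R_t) while R_t > 0."""
    return [(t, k / (k + l), k + l) for t, k, l in path if k + l > 0.0]

# Hypothetical macroevolutionary parameters, chosen only for illustration.
kl_path = euler_maruyama_kl(1.0, 1.0, beta0=1.0, beta1=0.5, tau0=2.0, tau1=1.0,
                            rho01=0.3, rho10=0.2, t_end=2.0, steps=2000, seed=7)
pr_values = to_fraction_and_richness(kl_path)
print(pr_values[-1])
```

Working with $(K,L)$ rather than discretizing Eqs. (8) and (9) directly avoids having to generate the correlated pair $(B^-,B^+)$, since $B^0$ and $B^1$ are independent.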

Effective population size: resolving Gillespie’s paradox

The infinitesimal mean and infinitesimal variance terms of the diffusions $P$ and $R$, except for the mutation terms involving $\rho_{01}$ and $\rho_{10}$, as well as the covariation in Eq. (10), have matching terms in the generator $G$ in Eq. (4). In Eq. (8) we recognize a Wright–Fisher diffusion with selection and mutation, and with trait-dependent genetic drift (or ‘species drift’) $P_t(1-P_t)/N_t$ in the diffusion variance of $B^-$, such that

$$N_t=\frac{R_t}{\tau_0(1-P_t)+\tau_1P_t} \tag{11}$$

acts as a stochastically varying effective population size for the differential dynamics of $dP_t$. This observation is parallel to that in the original work Gillespie (1974), where the corresponding quantity appears in the coefficient of $\partial^2f/\partial p^2$ in Eq. (4). The species drift increases as the number of species decreases, analogous to genetic drift of the allele frequency in population genetics. Similarly, the species drift increases with the turnover rates $\tau_0$ and $\tau_1$, which represent the species offspring variance. By applying Itô’s formula to $N_t=g(P_t,R_t)$, $t\ge 0$, with $g(p,r)=r/(\tau_0(1-p)+\tau_1p)$, we obtain directly the dynamics of $(P,N)$ as an alternative to that of $(P,R)$. For simplicity, we consider the case where the diversification rates are equal, $\beta_0=\beta_1=\beta$, and the rates of trait exchange vanish, $\rho_{01}=\rho_{10}=0$. Then

$$
\begin{aligned}
dP_t&=-\frac{(\tau_0-\tau_1)P_t(1-P_t)}{\tau_0(1-P_t)+\tau_1P_t}\,\frac{1}{N_t}\,dt+\sqrt{P_t(1-P_t)\frac{1}{N_t}}\,dB^-_t,\\
dN_t&=\beta N_t\,dt+\frac{(\tau_0-\tau_1)^2P_t(1-P_t)}{\big(\tau_0(1-P_t)+\tau_1P_t\big)^2}\,dt+\sqrt{N_t\,\frac{\tau_0P_t+\tau_1(1-P_t)}{\tau_0(1-P_t)+\tau_1P_t}}\,dB^+_t+\frac{(\tau_0-\tau_1)\sqrt{N_tP_t(1-P_t)}}{\tau_0(1-P_t)+\tau_1P_t}\,dB^-_t.
\end{aligned}
$$

The benefit of this additional representation of P in comparison with the previous stochastic equation (8), is that the effective population size N in the species drift of P now appears in classical form. The disadvantage is the increased complexity of the diffusion drift for P as well as diffusion drift and variance of N, as compared to those of R in (9).

The discussion in Gillespie (1974) also points to a potential drawback of the appearance of $N$ in the present form in (11). If trait 1 is rare or near elimination from the species family, so that $P\approx 1$, then formally $N\approx R/\tau_1$, which suggests that the now-absent trait 1 controls the effective size through the parameter $\tau_1$. Quoting Gillespie (1974): “This rather uncomfortable conclusion suggests that the concept of effective population size loses a good deal of its value in the context of the present model.” Using the wider frame of linked diffusion processes, however, this apparent paradoxical conclusion may be avoided. The above relation for $dN_t$ suggests that if trait 1 were eliminated and hence $P$ reached fixation at 1, then the resulting effective size $(N^0_t)_{t\ge 0}$ would satisfy

$$dN^0_t=\beta N^0_t\,dt+\sqrt{N^0_t(\tau_0/\tau_1)}\,dB^+_t,$$

and hence depend on the relative turnover ratio $\tau_0/\tau_1$ and $\beta$, rather than just $\tau_1$. Of course, for the same parameter settings, if we let $P$ tend to 1 in (9), the resulting richness $R^0_t$ of trait 0 satisfies

$$dR^0_t=\beta R^0_t\,dt+\sqrt{R^0_t\tau_0}\,dB^+_t, \tag{12}$$

consistent with the relation $N^0=R^0/\tau_1$, which appears to have been the origin of the “uncomfortable conclusion” discussed in Gillespie (1974). The diffusions $N^0$ and $R^0$ are Feller processes with linear drift belonging to the larger class of continuous state branching processes. It is a classical result in branching process theory that, if $R^0_0=r$, and accordingly $N^0_0=r/\tau_1$, then the extinction probabilities are given by

$$P_r\Big(\lim_{t\to\infty}R^0_t=0\Big)=e^{-2r\beta/\tau_0}$$

and

$$P_{r/\tau_1}\Big(\lim_{t\to\infty}N^0_t=0\Big)=e^{-2(r/\tau_1)\cdot\beta/(\tau_0/\tau_1)}=e^{-2r\beta/\tau_0}.$$

Naturally, they no longer depend on the parameter $\tau_1$. In response to a question raised by an anonymous referee, we also mention an alternative approach to this topic. By changing variables from $(p,r)$ to $(p,n)$ with $n=r/(\sigma_1^2(1-p)+\sigma_2^2p)$ one obtains a new generator $Gf(p,n)$ parallel to $Gf(p,r)$ in (4). The coefficients in $Gf(p,n)$ will now have matching terms to those in the SDE for $(P_t,N_t)$.
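The cancellation of $\tau_1$ in the two extinction probabilities is easy to confirm numerically. The sketch below evaluates the classical formula $e^{-2x_0\beta/\sigma^2}$ for a Feller diffusion $dX_t=\beta X_t\,dt+\sqrt{\sigma^2X_t}\,dB_t$ started at $x_0$, once for $R^0$ (variance parameter $\tau_0$, start $r$) and once for $N^0$ (variance parameter $\tau_0/\tau_1$, start $r/\tau_1$). The function name and the numerical values are illustrative; for $\beta\le 0$ the function returns 1, since extinction is then certain.

```python
import math

def feller_extinction_probability(x0, beta, sigma2):
    """P(X_t -> 0) for dX = beta*X dt + sqrt(sigma2*X) dB, X_0 = x0."""
    if beta <= 0.0:
        return 1.0          # critical or subcritical case: extinction is certain
    return math.exp(-2.0 * x0 * beta / sigma2)

r, beta, tau0 = 3.0, 0.4, 2.0          # hypothetical values for illustration
for tau1 in (0.5, 1.0, 4.0):
    q_richness = feller_extinction_probability(r, beta, tau0)
    q_effective = feller_extinction_probability(r / tau1, beta, tau0 / tau1)
    print(tau1, q_richness, q_effective)
```

For every choice of `tau1` the two printed probabilities agree, as the displayed identity requires.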

Further interpretation of Eqs. (8)–(10)

The species ‘selection coefficient’ in Eq. (8) is the fitness difference

$$\gamma_t=f_0(R_t)-f_1(R_t)=\beta_0-\beta_1-\frac{\tau_0-\tau_1}{R_t},\quad t\ge 0,$$

dependent on net diversification rates, turnover rates and species richness. If $\beta_0>\beta_1$, so that trait 0 has larger net diversification, then trait 0 has a “fitness” advantage over trait 1, $\gamma_t>0$ at time $t$, if either $\tau_1>\tau_0$, or $\tau_0>\tau_1$ and $R_t>(\tau_0-\tau_1)/(\beta_0-\beta_1)$. In other words, species-level selection is reduced in small populations and enhanced in large ones. If $\tau_1>\tau_0$, the opposite holds. Hence, traits with higher diversification rates and slower turnover rates are favored by species selection. The time average of $\gamma$ up to time $t$ is

$$\frac{1}{t}\int_0^t\gamma_s\,ds=(\beta_0-\beta_1)(1-F_t),$$

where

$$F_t=\frac{\tau_0-\tau_1}{\beta_0-\beta_1}\,\frac{1}{t}\int_0^t\frac{1}{R_s}\,ds$$

can be seen as the relative contribution of $\tau_0-\tau_1$ to species selection (Chevin 2016).
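A small numerical example may help to fix the roles of the quantities above. The sketch below evaluates the selection coefficient $\gamma$ as a function of richness, the critical richness $(\tau_0-\tau_1)/(\beta_0-\beta_1)$ at which it changes sign, and a discrete-time analogue of $F_t$ for richness values sampled on an equally spaced grid. The parameter values, the sample path, and the function names are hypothetical.

```python
def fitness(r, beta, tau):
    """Fitness f_i(r) = beta_i - tau_i / r from Eq. (7)."""
    return beta - tau / r

def selection_coefficient(r, beta0, beta1, tau0, tau1):
    """gamma = f_0(r) - f_1(r)."""
    return fitness(r, beta0, tau0) - fitness(r, beta1, tau1)

def critical_richness(beta0, beta1, tau0, tau1):
    """Richness at which gamma changes sign (requires beta0 != beta1)."""
    return (tau0 - tau1) / (beta0 - beta1)

def relative_turnover_contribution(richness_samples, beta0, beta1, tau0, tau1):
    """Discrete analogue of F_t using the sample mean of 1/R over the grid."""
    mean_inverse = sum(1.0 / r for r in richness_samples) / len(richness_samples)
    return (tau0 - tau1) / (beta0 - beta1) * mean_inverse

params = dict(beta0=1.0, beta1=0.5, tau0=3.0, tau1=1.0)   # illustrative values
r_crit = critical_richness(**params)
print(r_crit, selection_coefficient(2.0, **params), selection_coefficient(8.0, **params))

samples = [2.0, 4.0, 8.0, 16.0]                           # hypothetical richness path
F = relative_turnover_contribution(samples, **params)
mean_gamma = sum(selection_coefficient(r, **params) for r in samples) / len(samples)
print(F, mean_gamma, (params["beta0"] - params["beta1"]) * (1.0 - F))
```

In this example trait 0 has the higher net diversification but also the higher turnover, so it is selected against below the critical richness and favored above it.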

The total species richness is measured by R in Eq. (9), which is a continuous state branching process with linear drift and trait-dependence in both the diffusion drift and variance functions. Conditional on non-extinction, R will grow exponentially as t. Hence, the diffusion variance term for dPt in (8), driven by B-, vanishes. The diffusion drift function for dPt also simplifies and we expect Ptp as t, where p is the solution of the second order equation

p(1-p)(β0-β1)-ρ01p+ρ10(1-p)=0. 13
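Equation (13) is a quadratic in $p$ and can be solved directly. The sketch below assumes strictly positive exchange rates $\rho_{01}$ and $\rho_{10}$; in that case the left-hand side equals $\rho_{10}>0$ at $p=0$ and $-\rho_{01}<0$ at $p=1$, so exactly one root lies in the unit interval. The function name and the example values are illustrative.

```python
import math

def limiting_fraction(beta0, beta1, rho01, rho10):
    """Root in [0, 1] of p(1-p)(beta0-beta1) - rho01*p + rho10*(1-p) = 0.

    Assumes rho01 > 0 and rho10 > 0."""
    delta = beta0 - beta1
    # Rewrite as a*p^2 + b*p + c = 0.
    a = -delta
    b = delta - rho01 - rho10
    c = rho10
    if a == 0.0:
        return -c / b                      # neutral case: rho10 / (rho01 + rho10)
    root = math.sqrt(b * b - 4.0 * a * c)
    candidates = [(-b + root) / (2.0 * a), (-b - root) / (2.0 * a)]
    return next(p for p in candidates if 0.0 <= p <= 1.0)

print(limiting_fraction(beta0=3.0, beta1=0.0, rho01=4.0, rho10=6.0))
print(limiting_fraction(beta0=1.0, beta1=1.0, rho01=1.0, rho10=3.0))
```

In the first example the equation reads $3p(1-p)-4p+6(1-p)=0$ with roots $2/3$ and $-3$, so the limiting fraction is $2/3$; in the second, neutral example the limit is the mutation balance $\rho_{10}/(\rho_{01}+\rho_{10})=3/4$.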

The covariation of P and R in Eq. (10) measures the degree to which these quantities vary simultaneously, in the sense that the expected change over a small time interval [t,t+h), conditional on the present state (Pt,Rt), is given by

$$E\big[(P_{t+h}-P_t)(R_{t+h}-R_t)\,\big|\,(P_t,R_t)\big] = (\tau_0-\tau_1)P_t(1-P_t)\,h + o(h), \quad h\to 0.$$

We observe that, while the covariation of P and R is proportional to the difference in turnover rates, τ0-τ1, the analogous covariation of the two underlying Brownian motions, B- driving P and B+ driving R, is a function only of the turnover-ratio τ0/τ1. In both cases, the dependence is positive in case τ0>τ1 and negative otherwise. We conclude that the species richness co-varies positively with the trait with the highest turnover. The turnover ratio τ0/τ1 determines the strength of the covariation which is intrinsic at the level of Brownian noise, whereas both of τ0 and τ1 are required in order to find the covariation of the state variables P and R.

Special cases

(a) Symmetric net diversification rates, β0=β1

Let β0=β1=β. Then, from Eq. (8) and Eq. (9), we have

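$$\begin{aligned}
dP_t &= -P_t(1-P_t)\frac{\tau_0-\tau_1}{R_t}\,dt - \rho_{01}P_t\,dt + \rho_{10}(1-P_t)\,dt + \sqrt{P_t(1-P_t)\big(\tau_0(1-P_t)+\tau_1P_t\big)}\,\frac{1}{\sqrt{R_t}}\,dB_t^-,\\
dR_t &= \beta R_t\,dt + \sqrt{R_t\big(\tau_0P_t+\tau_1(1-P_t)\big)}\,dB_t^+.
\end{aligned}$$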

By Eq. (10) the covariance structure remains the same as in the general model. Here, R is a supercritical Feller-type diffusion process with diffusion variance function modulated by the trait proportions P and 1-P. The SDE for P can be viewed as a Wright–Fisher diffusion with population size dependent infinitesimal drift and variance functions. The selection coefficient -(τ0-τ1)/R shows that the species associated with the trait of the smallest of the two turnover rates, i.e. trait 0 if τ0<τ1 and trait 1 if τ0>τ1, are selected for instead of species with the higher turnover. Biologically, this means that, all other things being equal, more long-lived species are selected for, as increasing the species generation time will automatically reduce τ (Lin et al. 2012).

(b) Symmetric turnover rates, τ0=τ1

Let τ0=τ1=τ. Then Eq. (8) and Eq. (9) simplify as

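$$\begin{aligned}
dP_t &= P_t(1-P_t)(\beta_0-\beta_1)\,dt - \rho_{01}P_t\,dt + \rho_{10}(1-P_t)\,dt + \sqrt{\frac{\tau P_t(1-P_t)}{R_t}}\,dB_t^-,\\
dR_t &= R_t\big(\beta_0P_t+\beta_1(1-P_t)\big)\,dt + \sqrt{\tau R_t}\,dB_t^+,
\end{aligned}$$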

with independent Brownian motions B- and B+. The equation for P is the SDE of the Wright–Fisher diffusion process with selection coefficient β0-β1 and mutation rates ρ01 and ρ10 as in Eq. (3), except that now the genetic drift term is inversely proportional to R, which acts as a randomly varying effective population size. At the same time, the population richness process R is a Feller-type diffusion process with diffusion drift function modulated by the trait proportions P and 1-P.

(c) Neutral evolution: symmetric net diversification rates, β0=β1, and turnover rates, τ0=τ1

Let τ0=τ1=τ and β0=β1=β be nonnegative parameters. By combining the cases a) and b), we obtain

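$$\begin{aligned}
dP_t &= -\rho_{01}P_t\,dt + \rho_{10}(1-P_t)\,dt + \sqrt{\frac{\tau P_t(1-P_t)}{R_t}}\,dB_t^-,\\
dR_t &= \beta R_t\,dt + \sqrt{\tau R_t}\,dB_t^+.
\end{aligned}$$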

Here, the first equation, for $P$, is comparable to the Wright–Fisher diffusion process Eq. (3) with only mutation and no selection, hence the term ‘neutral evolution’. The total size process is a supercritical Feller diffusion process, which we encountered previously in (12) as a limiting case under trait fixation. In the present context, we use once more that for a given initial size $R_0=r$, the extinction probability, that is, the probability that $R_t\to 0$ as $t\to\infty$, equals $e^{-2\beta r/\tau}$. From this we retrieve the known result in stochastic demography that a higher turnover rate increases the extinction risk. On the complementary set of nonextinction, the total size process tends to infinity almost surely, $R_t\to\infty$ a.s.

(d) Quasi-neutral rates

Quasi-neutrality, as proposed and discussed in Parsons and Quince (2007b), Parsons et al. (2008), is a rate symmetry condition, which in our notation would mean

$$\frac{\tau_0}{\beta_0} = \frac{\tau_1}{\beta_1} = \kappa.$$

According to Parsons et al. (2008), the ratio 2βi/τi is an alternative measure of relative success or relative fitness of trait i. In this framework, therefore, quasi-neutral traits are equally fit. Eq. (8) and Eq. (9) become

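$$\begin{aligned}
dP_t &= P_t(1-P_t)(\beta_0-\beta_1)\Big(1-\frac{\kappa}{R_t}\Big)\,dt - \rho_{01}P_t\,dt + \rho_{10}(1-P_t)\,dt + \sqrt{P_t(1-P_t)\big(\beta_0(1-P_t)+\beta_1P_t\big)\frac{\kappa}{R_t}}\,dB_t^-,\\
dR_t &= R_t\big(\beta_0P_t+\beta_1(1-P_t)\big)\,dt + \sqrt{\kappa R_t\big(\beta_0P_t+\beta_1(1-P_t)\big)}\,dB_t^+.
\end{aligned}$$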

Application of the population genetics approach to diversity-dependent models

We discuss the effect of diversity dependence on the species richness model. The interaction is imposed by letting the instantaneous rates of speciation and extinction in (6) depend on the scaled total size Rn. A variety of interaction schemes can be incorporated into the measure-branching model formalism, which we use to derive Proposition 1. For an introduction to some of the principles of interaction in measure-branching and superprocesses, see Etheridge (2000). The relevant approach for us is the use of an analogue of the Girsanov theorem that allows for the introduction of “non-linear branching”, and in the end logistic versions of the superprocess limits obtained by an absolutely continuous change-of-drift. Our analysis is therefore similar in many ways to the development of the spatial model in Fournier and Méléard (2004), considering our trait space as the mark of location. As in Fournier and Méléard (2004) we apply superprocess techniques but at this stage our interaction schemes are less general. It is also known that logistic Feller diffusion processes arise under much more general branching mechanisms than ours (Lambert 2005), indicating more general results.

Suppose the current state of the species model at time $t$ is $(P_t^n,R_t^n)$. A constant $c$ will have the role of a ‘carrying capacity’. We consider in Sect. 4.1 the diversity-dependent species models which have jump rates at $t$, given by

$$\tau_i^{(n)} = \tau_i, \qquad d_i^{(n)} = \frac{\beta_i}{n}\Big(1-\frac{R_t^n}{c}\Big).$$

In greater generality, the carrying capacity is determined by a pair of parameters, $c_0$, $c_1$, and the strength in reduction of the net growth is trait-dependent. In Sect. 4.2 we consider the corresponding model with jump rates

$$\tau_i^{(n)} = \tau_i, \qquad d_i^{(n)} = \frac{\beta_i}{n}\Big(1-\frac{R_t^n}{c_i}\Big), \quad i=0,1.$$

Further generalization can also be made by assuming that the two kinds of species both affect and are affected differently by species richness:

$$d_i^{(n)} = \frac{\beta_i}{n}\Big(1 - R_t^n\Big(\frac{P_t^n}{c_{0i}} + \frac{1-P_t^n}{c_{1i}}\Big)\Big), \quad i=0,1.$$

This parallels Lotka-Volterra competition models but this latter case will not be developed further here and is left for future work. We have applied classical logistic type diversity-dependence acting on the macroevolutionary net growth rate $n\,d_i^{(n)}$. Starting from $R_0^n<c$, the net effect of speciations and extinctions is reduced with increasing species richness and turns into a subcritical regime above the level $c$, in case $R_t^n>c$.
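The logistic rates with a single carrying capacity can be turned into speciation and extinction rates in a few lines. The sketch below assumes the convention $d=\lambda-\mu$ and $\tau=\lambda+\mu$ for the net diversification and turnover rates, as suggested by the generating function used in the final section; the function name and all numbers are hypothetical.

```python
def diversity_dependent_rates(beta, tau, richness, capacity, n):
    """Return (speciation, extinction) jump rates for one trait, assuming
    net rate d = lambda - mu and turnover tau = lambda + mu."""
    d = (beta / n) * (1.0 - richness / capacity)   # logistic net diversification
    return (tau + d) / 2.0, (tau - d) / 2.0

# Below the carrying capacity speciation dominates, above it extinction does.
lam_below, mu_below = diversity_dependent_rates(0.5, 2.0, richness=20.0, capacity=50.0, n=100)
lam_above, mu_above = diversity_dependent_rates(0.5, 2.0, richness=80.0, capacity=50.0, n=100)
print(lam_below - mu_below, lam_above - mu_above)
```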

To simplify notation in the rest of this section we introduce the function

$$\psi(p) = \tau_0(1-p) + \tau_1p, \quad 0\le p\le 1. \tag{14}$$

Diversity-dependent macroevolutionary diversification rates

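The limit process as $n\to\infty$ is obtained as in Proposition 1, with $\beta_0$ and $\beta_1$ at time $t$ replaced by the species richness dependent functions $\beta_0(1-R_t/c)$ and $\beta_1(1-R_t/c)$. The system in Eqs. (8)–(9) becomes

$$\begin{aligned}
dP_t &= P_t(1-P_t)\big(f_0^c(R_t)-f_1^c(R_t)\big)\,dt - \rho_{01}P_t\,dt + \rho_{10}(1-P_t)\,dt + \sqrt{P_t(1-P_t)\psi(P_t)}\,\frac{1}{\sqrt{R_t}}\,dB_t^-,\\
dR_t &= R_t\Big(1-\frac{R_t}{c}\Big)\big(\beta_0P_t+\beta_1(1-P_t)\big)\,dt + \sqrt{R_t\,\psi(1-P_t)}\,dB_t^+,
\end{aligned} \tag{15}$$

with trait fitness now of the form

$$f_i^c(r) = \beta_i\Big(1-\frac{r}{c}\Big) - \frac{\tau_i}{r}.$$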

In particular, if we impose the additional assumptions of neutral rates discussed as special case c) in Sect. 3.6, the process R is the solution of

$$dR_t = \beta R_t(1-R_t/c)\,dt + \sqrt{\tau R_t}\,dB_t^+, \quad t\ge 0.$$

This is the logistic Feller diffusion process, which has non-negative paths and goes extinct with probability one. The logistic Feller diffusion is known to have a quasi-stationary distribution in the sense of a Yaglom limit, namely that the law of $R_t$ conditioned on $\{R_t>0\}$ converges as $t\to\infty$ to a probability measure (Cattiaux et al. 2009). Such a Yaglom limit, however, is not known in explicit terms and need not even be unique. For our model, we note that Cattiaux et al. (2009), Thm 8.2, yields uniqueness of the Yaglom limit for the case $c<1$, whereas under biologically relevant conditions we expect $c>1$.
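To get a feeling for the logistic Feller diffusion, one can simulate sample paths with an Euler–Maruyama scheme. This is only a rough sketch under stated assumptions: the discretization, the clipping at zero and all parameter values are choices made here for illustration and are not part of the paper's analysis.

```python
import math
import random

def simulate_logistic_feller(r0, beta, tau, c, dt, steps, seed=0):
    """Euler-Maruyama path of dR = beta*R*(1 - R/c) dt + sqrt(tau*R) dB.
    Negative values are clipped to zero (an ad hoc fix), and zero is absorbing."""
    rng = random.Random(seed)
    path = [r0]
    for _ in range(steps):
        r = path[-1]
        if r <= 0.0:
            path.append(0.0)          # extinction is absorbing
            continue
        noise = rng.gauss(0.0, math.sqrt(dt))
        r_next = r + beta * r * (1.0 - r / c) * dt + math.sqrt(tau * r) * noise
        path.append(max(r_next, 0.0))
    return path

path = simulate_logistic_feller(r0=5.0, beta=1.0, tau=0.5, c=20.0, dt=0.01, steps=2000)
print(min(path), path[-1])
```

For a carrying capacity that is large relative to $\tau/\beta$, paths of this kind typically fluctuate around $c$ for a long time before extinction, in line with the quasi-stationary behavior described above.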

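For the general, non-neutral, case, letting $\tilde R=R/c$ be the total species richness in relation to the carrying capacity $c$, the SDE system for the state variables is

$$\begin{aligned}
dP_t &= P_t(1-P_t)\Big((\beta_0-\beta_1)(1-\tilde R_t) - \frac{\tau_0-\tau_1}{c\tilde R_t}\Big)\,dt - \rho_{01}P_t\,dt + \rho_{10}(1-P_t)\,dt + \sqrt{P_t(1-P_t)\psi(P_t)}\,\frac{1}{\sqrt{c\tilde R_t}}\,dB_t^-,\\
d\tilde R_t &= \tilde R_t(1-\tilde R_t)\big(\beta_0P_t+\beta_1(1-P_t)\big)\,dt + \sqrt{\frac{1}{c}\tilde R_t\,\psi(1-P_t)}\,dB_t^+.
\end{aligned}$$

The two SDEs with respect to $B^-$ and $B^+$ have second order moments proportional to $1/c$. This can be used to show formally that the diffusion terms vanish in the subsequent limit of large carrying capacity, $c\to\infty$, revealing the deterministic limit equation

$$\begin{aligned}
dP_t &= P_t(1-P_t)(\beta_0-\beta_1)(1-\tilde R_t)\,dt - \rho_{01}P_t\,dt + \rho_{10}(1-P_t)\,dt,\\
d\tilde R_t &= \tilde R_t(1-\tilde R_t)\big(\beta_0P_t+\beta_1(1-P_t)\big)\,dt,
\end{aligned}$$

that is, the ODE system

$$\begin{aligned}
p_t' &= p_t(1-p_t)(1-\tilde r_t)(\beta_0-\beta_1) - \rho_{01}p_t + \rho_{10}(1-p_t),\\
\tilde r_t' &= \tilde r_t(1-\tilde r_t)\big(\beta_0p_t+\beta_1(1-p_t)\big).
\end{aligned}$$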

With initial value $(p_0,\tilde r_0)\in(0,1)^2$, the equilibrium solution as $t$ tends to infinity is $p_\infty=\rho_{10}/(\rho_{10}+\rho_{01})$, $\tilde r_\infty=1$. We see that species selection, acting on the net diversification rates $\beta_0$ and $\beta_1$, is effective only during the growth phase but vanishes as the number of species approaches the carrying capacity, i.e., when $\tilde r_t$ approaches 1. In ecology, this is said to be a model with only r-selected traits but no K-selected traits (where r is the growth rate and K is the carrying capacity), that is, the diversification of the two traits is regulated in the same way as $\tilde r_t$ approaches 1 (Pianka 1970). This scenario would correspond to a trait allowing adaptive radiation, with rapid diversification of species possessing this trait while many niches are still available, followed by almost neutral dynamics once the carrying capacity is reached.
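The convergence to this equilibrium can be checked numerically. The following sketch integrates the ODE system for $(p_t,\tilde r_t)$ with a plain forward-Euler scheme; the step size, time horizon and parameter values are assumptions chosen only for illustration.

```python
def integrate_deterministic_limit(p0, r0, beta0, beta1, rho01, rho10,
                                  dt=0.01, t_end=100.0):
    """Forward-Euler integration of the deterministic (p, r~) system."""
    p, r = p0, r0
    for _ in range(int(t_end / dt)):
        dp = p * (1 - p) * (1 - r) * (beta0 - beta1) - rho01 * p + rho10 * (1 - p)
        dr = r * (1 - r) * (beta0 * p + beta1 * (1 - p))
        p, r = p + dt * dp, r + dt * dr
    return p, r

rho01, rho10 = 0.2, 0.1
p_end, r_end = integrate_deterministic_limit(0.1, 0.05, beta0=1.0, beta1=0.5,
                                             rho01=rho01, rho10=rho10)
print(p_end, rho10 / (rho10 + rho01), r_end)
```

The trait frequency first responds to selection while $\tilde r_t$ is small and then relaxes to the transition balance $\rho_{10}/(\rho_{10}+\rho_{01})$ once $\tilde r_t$ is close to 1.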

Trait- and diversity-dependent macroevolutionary diversification rates

The density dependent mechanism is again to reduce the growth of the species family with increasing diversity, but now the efficiency of the nonlinear influence is tuned for each trait, with the resulting trait fitness

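$$f_i^{c_i}(r) = \beta_i(1-r/c_i) - \tau_i/r, \quad i=0,1.$$

By adapting the derivation leading to Eq. (8)–(9) to this case and putting $\tilde R=R/c$ as before, then in the limit $n\to\infty$,

$$\begin{aligned}
dP_t &= P_t(1-P_t)\big(f_0^{c_0}(c\tilde R_t)-f_1^{c_1}(c\tilde R_t)\big)\,dt - \rho_{01}P_t\,dt + \rho_{10}(1-P_t)\,dt + \sqrt{P_t(1-P_t)\psi(P_t)}\,\frac{1}{\sqrt{c\tilde R_t}}\,dB_t^-,\\
d\tilde R_t &= \tilde R_t\Big\{\beta_0P_t\Big(1-\frac{c}{c_0}\tilde R_t\Big) + \beta_1(1-P_t)\Big(1-\frac{c}{c_1}\tilde R_t\Big)\Big\}\,dt + \sqrt{\frac{\tilde R_t}{c}\,\psi(1-P_t)}\,dB_t^+.
\end{aligned}$$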

A special case of interest is obtained by taking β0/c0=β1/c1=α/c, τ0=τ1=τ, using a new parameter α. For simplicity, let us also take ρ01=ρ10=0. Then

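$$\begin{aligned}
dP_t &= (\beta_0-\beta_1)P_t(1-P_t)\,dt + \sqrt{P_t(1-P_t)\frac{\tau}{c\tilde R_t}}\,dB_t^-,\\
d\tilde R_t &= \tilde R_t\big\{\beta_0P_t+\beta_1(1-P_t)-\alpha\tilde R_t\big\}\,dt + \sqrt{\tau\tilde R_t/c}\,dB_t^+.
\end{aligned}$$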

In this example we notice that the growth of species richness is diversity- and density-dependent while the selection term is now density-independent.

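Let us now come back to the general model studied in this section and suppose that as $c\to\infty$ then both of $c_0$ and $c_1$ also tend to infinity, such that $c/c_i\to\alpha_i$, $i=0,1$, for some positive constants $\alpha_0$ and $\alpha_1$. Then formally

$$\begin{aligned}
p_t' &= p_t(1-p_t)\big(\beta_0(1-\alpha_0\tilde r_t)-\beta_1(1-\alpha_1\tilde r_t)\big) - \rho_{01}p_t + \rho_{10}(1-p_t),\\
\tilde r_t' &= \tilde r_t\big(\beta_0p_t(1-\alpha_0\tilde r_t)+\beta_1(1-p_t)(1-\alpha_1\tilde r_t)\big).
\end{aligned} \tag{16}$$

The case $\beta_0=\beta_1=\beta>0$ gives the system of ODEs

$$\begin{aligned}
p_t' &= \beta(\alpha_1-\alpha_0)p_t(1-p_t)\tilde r_t - \rho_{01}p_t + \rho_{10}(1-p_t),\\
\tilde r_t' &= \beta\tilde r_t\big(1-\tilde r_t(\alpha_0p_t+\alpha_1(1-p_t))\big).
\end{aligned}$$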

The equation for p above shows that selection, positive or negative, strengthens with increasing species richness. In ecology, this would be equivalent to K-selection, whereby the trait less sensitive to competition is favored. For example, if c0>c1 so that trait 0 is less constrained than trait 1 then the selective rate of growth of trait 0 is positive, β(α1-α0)>0. For both “r-like” and “K-like” cases, selection is diversity-dependent. Returning to (16) and the special case α0β0=α1β1=α, although total species richness is regulated, species selection becomes diversity-independent and the selection coefficient reduces to β0-β1. This would correspond to a scenario where the total diversity is regulated from mechanisms unrelated to the focal traits.
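For completeness, the reduction of (16) in the special case $\alpha_0\beta_0=\alpha_1\beta_1=\alpha$ can be written out as follows. This is a short worked step added here for illustration, using only the notation of (16).

```latex
\begin{align*}
\beta_0(1-\alpha_0\tilde r_t)-\beta_1(1-\alpha_1\tilde r_t)
  &= \beta_0-\beta_1-(\alpha_0\beta_0-\alpha_1\beta_1)\tilde r_t
   = \beta_0-\beta_1,\\
\beta_0p_t(1-\alpha_0\tilde r_t)+\beta_1(1-p_t)(1-\alpha_1\tilde r_t)
  &= \beta_0p_t+\beta_1(1-p_t)-\alpha\tilde r_t\big(p_t+(1-p_t)\big)\\
  &= \beta_0p_t+\beta_1(1-p_t)-\alpha\tilde r_t,
\end{align*}
so that (16) becomes
\begin{align*}
p_t' &= (\beta_0-\beta_1)p_t(1-p_t)-\rho_{01}p_t+\rho_{10}(1-p_t),\\
\tilde r_t' &= \tilde r_t\big(\beta_0p_t+\beta_1(1-p_t)-\alpha\tilde r_t\big).
\end{align*}
```

The selection term no longer involves $\tilde r_t$, while the richness equation keeps its logistic regulation.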

In macroevolution, several traits have been supposed to be evolutionary dead-ends, such as asexuality (Maynard-Smith 1978) or self-fertilization (Igic and Busch 2013). The classical view is that negative diversification is associated with such traits but they are continuously reintroduced through asymmetrical (or even unidirectional) shifts from traits associated with positive diversification (Goldberg et al. 2010). The above formulation suggests a more elaborate scenario whereby the dead-end trait has not necessarily a basal negative diversification rate but is more sensitive to diversity dependence. Such traits could lead to initial diversification in initially species-poor environments (i.e. much lower than the species carrying capacity) but diversification would then decrease and eventually be negative through time, as the number of species would increase. This is in agreement with recent empirical observations in the plant genus Capsella that selfing species are more sensitive to competition than outcrossing ones (Petrone Mendoza et al. 2018; Yang et al. 2018), and a plausible explanation for the hypothesis that selfing and asexual lineages “senesce” in diversification rates (Ho and Agrawal 2017).

Fixation and extinction of traits

To study trait fixation probabilities using a similar approach as in Parsons et al. (2008) we let ρ01=ρ10=0 in Eq. (15), which means that species are unable to change traits. The initial composition of traits in the scaled species family at time 0 is a fraction x of trait 0 and the remaining fraction 1-x consisting of trait 1 species, 0<x<1. As in Parsons et al. (2008), we strive to analyze the fate of the trait 0 element over time as a function of x. For this aim we must restrict further our considerations to the density dependent case with symmetric net growth rates, β0=β1=β, and a single carrying capacity c, the same for both traits. Then

$$\begin{aligned}
dP_t &= -P_t(1-P_t)(\tau_0-\tau_1)\frac{1}{R_t}\,dt + \sqrt{P_t(1-P_t)\psi(P_t)}\,\frac{1}{\sqrt{R_t}}\,dB_t^-,\\
dR_t &= \beta R_t\Big(1-\frac{R_t}{c}\Big)\,dt + \sqrt{R_t\,\psi(1-P_t)}\,dB_t^+,
\end{aligned} \tag{17}$$

which is a logistic version of the model listed as special case (a) in Sect. 3.6. The crucial aspect of the resulting SDE for $P$, which is required for our method of proof to work, is that all dependence on $R$ is mediated through a single function $g(R)$, in this case $g(r)=1/r$, which appears as a multiple in both the diffusion drift term and the diffusion variance term. This is the reason why the closely related special case $\beta_0=\beta_1$, $c_0\ne c_1$ is not covered by our method, nor is $\beta_0\ne\beta_1$, $c_0=c_1$ or $\tau_0=\tau_1$, $\beta_0\ne\beta_1$.

In the present situation, however, general properties of logistic branching processes show that the species richness $R$ goes extinct with probability one as $t\to\infty$. We may then consider the sequence of stopped processes $(P_{t\wedge\delta_n},R_{t\wedge\delta_n})_{t\ge0}$, where $\delta_n=\inf\{t>0:R_t=1/n\}$, and have a well-defined solution of (17) with positive richness component on $\{0\le t\le\delta_n\}$ for each fixed $n$. Yet, the mixture of traits as captured by $(P_{t\wedge\delta_n})_{t\ge0}$ along the path to extinction of the species family, as $n\to\infty$, would perhaps be of limited interest, indicating that the fixation probability concept might have limited relevance. On the other hand, the quasi-stationary behavior of $R$ mentioned previously means that the time to extinction can be very long. This allows us to circumvent species extinction by modeling the species family rather using a process $R^+$, say, which is $R$ conditioned on the event $\{R_t>0,\ t>0\}$ of ultimate nonextinction. In the following we study trait absorption with respect to such non-extinct paths. Let $(P^+,R^+)$ denote the solution of (17) conditioned on the event of ultimate nonextinction. The boundaries 0 and 1 are both absorbing for $P^+$. Let us denote by $\eta_0$ the fixation time of trait 1, that is, the random time at which $P^+$ first hits the lower boundary 0, if ever. Similarly, $\eta_1$ is the fixation time of trait 0, the time at which the upper boundary point 1 is first hit. If $\eta_0$ is finite, then all species are trait 1 from that time and onward, meaning that the upper limit is never reached, $\eta_1=\infty$. Similarly, if $\eta_1<\infty$ all species end up as trait 0, and $\eta_0=\infty$. The absorption time $\eta=\min(\eta_0,\eta_1)$ is the time of extinction or fixation, whichever occurs first. Next we recover the fixation probability (5) within our setting, and derive a result on the “species frequency spectrum”.

Theorem 1

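The fixation probability $q(x)=P_x(\eta_1<\infty)$ of $P^+$ rendering all species trait 0 eventually, given initial frequency $x$ of trait 0, equals

$$q(x) = \frac{\tau_1x}{\tau_0(1-x)+\tau_1x}.$$

Moreover, for bounded functions $f$ defined on the unit interval [0, 1], and with $\eta^+$ denoting the absorption time of $P^+$,

$$E_x\Big[\int_0^{\eta^+}\frac{f(P_s^+)}{R_s^+}\,ds\Big] = \frac{2}{\tau_0(1-x)+\tau_1x}\Big\{x\int_x^1\frac{f(y)}{y}\,dy + (1-x)\int_0^x\frac{f(y)}{1-y}\,dy\Big\}.$$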

Proof

Let $X=(X_t)_{t\ge0}$ be a diffusion process, which solves

$$dX_t = -X_t(1-X_t)(\tau_0-\tau_1)\,dt + \sqrt{X_t(1-X_t)\psi(X_t)}\,dB_t^-. \tag{18}$$

The SDE for $X$ is the same as that of $P^+$ under the enforced condition $R^+\equiv1$. It turns out that the fixation probabilities of $P^+$ and $X$, respectively, coincide. The reason for this is that the distribution of $X$ can be extracted from that of $P^+$ via a random time change. Since $R_t^+>0$ for all $t$, the transform of $R^+$ given by the function

$$B_t = \int_0^t\frac{1}{R_s^+}\,ds, \quad t>0,$$

is a strictly increasing random time change with time change rate $1/R_t^+$, such that its left-inverse

$$A_t = \inf\{s>0: B_s>t\}, \quad t>0,$$

is continuous, and $A_{B_t}=t$ for all $t\ge0$. Thus,

$$q(x) = P_x(\eta_1^+<\infty) = P_x(B_{\eta_1^+}<\infty), \qquad B_{\eta_1^+} = \inf\{t>0: P^+_{A_t}=1\}.$$

The desired function $q(x)$ is now obtained as the fixation probability of the time-changed process $(P^+_{A_t})_{t\ge0}$, which turns out to be closely connected to $X$. Indeed, it is a consequence of the time change result Theorem 8.5.1 in Øksendal (2007), that, for each $t\ge0$, $P^+_{A_t}$ and $X_t$ have the same distribution. Let $\eta_1^X$ denote the fixation time of $X$. Since

$$P_x(\eta_1^X>t) = P_x(X_t\ne1) = P_x(P^+_{A_t}\ne1) = P_x(B_{\eta_1^+}>t),$$

it follows that $\eta_1^X$ and $B_{\eta_1^+}$ have the same distribution, and so $q(x)=P_x(\eta_1^X<\infty)$. Similarly, $\eta^X$ and $B_{\eta^+}$ have the same distribution.

The SDE for X in Eq. (18) is

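$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t^-, \qquad b(x) = (\tau_1-\tau_0)x(1-x), \quad \sigma^2(x) = x(1-x)\psi(x),$$

with $\psi(x)=\tau_0(1-x)+\tau_1x$ introduced in Eq. (14), and where $X_0=x$, $0<x<1$, and the boundary points $\{0,1\}$ are absorbing. The scale function $S(x)$ and speed function $m(x)$ associated with $(X_t)_{t\ge0}$, $X_0=x$, are defined by

$$S(x) = \int_0^x s(y)\,dy, \qquad m(x) = \frac{1}{\sigma^2(x)s(x)}, \quad 0<x<1,$$

where

$$s(x) = \exp\Big\{-\int_0^x\frac{2b(y)}{\sigma^2(y)}\,dy\Big\}.$$

Here,

$$S(x) = \frac{\tau_0x}{\tau_0(1-x)+\tau_1x}, \qquad m(x) = \frac{\tau_0(1-x)+\tau_1x}{\tau_0^2\,x(1-x)}.$$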

By using Feller’s boundary classification and the theory of one-dimensional diffusion processes (Karlin and Taylor 1981; Etheridge 2011), it follows that the boundary points 0 and 1 are both accessible from the interior state space as exit boundaries. The fixation probability is determined by the scale function (using the normalization $S(0)=0$), as

$$q(x) = \frac{S(x)}{S(1)} = \frac{\tau_1x}{\tau_0(1-x)+\tau_1x}.$$

Moreover, for bounded functions f,

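$$E_x\Big[\int_0^{\eta^X}f(X_s)\,ds\Big] = 2q(x)\int_x^1\big(S(1)-S(y)\big)m(y)f(y)\,dy + 2\big(1-q(x)\big)\int_0^xS(y)m(y)f(y)\,dy.$$

It is straightforward to check that the right hand side equals

$$\frac{2}{\tau_0(1-x)+\tau_1x}\Big\{x\int_x^1\frac{f(y)}{y}\,dy + (1-x)\int_0^x\frac{f(y)}{1-y}\,dy\Big\}.$$

To complete the proof, we first use once again that $P^+_{A_t}$ and $X_t$ have the same distribution for each fixed $t$, and then make the change-of-variables $s=B_r$, to obtain

$$E_x\Big[\int_0^{\eta^X}f(X_s)\,ds\Big] = E_x\Big[\int_0^\infty 1_{\{0<X_s<1\}}f(X_s)\,ds\Big] = E_x\Big[\int_0^\infty 1_{\{0<P^+_{A_s}<1\}}f(P^+_{A_s})\,ds\Big] = E_x\Big[\int_0^{B_{\eta^+}}f(P^+_{A_s})\,ds\Big] = E_x\Big[\int_0^{\eta^+}\frac{f(P_r^+)}{R_r^+}\,dr\Big].$$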
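The closed forms for $S$ and $q$ can be verified numerically. In the sketch below the scale density is taken as $s(y)=(\tau_0/\psi(y))^2$, which follows from integrating $2b(y)/\sigma^2(y)=2(\tau_1-\tau_0)/\psi(y)$; the midpoint quadrature, the function names and the parameter values are assumptions made for this illustration.

```python
def psi(p, tau0, tau1):
    return tau0 * (1.0 - p) + tau1 * p

def scale_density(y, tau0, tau1):
    """s(y) = exp(-int_0^y 2b/sigma^2 du) = (tau0/psi(y))^2 for the SDE (18)."""
    return (tau0 / psi(y, tau0, tau1)) ** 2

def midpoint_integral(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def fixation_probability(x, tau0, tau1):
    """Closed form q(x) = tau1*x / psi(x) from Theorem 1."""
    return tau1 * x / psi(x, tau0, tau1)

tau0, tau1, x = 3.0, 1.0, 0.2
S_x = midpoint_integral(lambda y: scale_density(y, tau0, tau1), 0.0, x)
S_1 = midpoint_integral(lambda y: scale_density(y, tau0, tau1), 0.0, 1.0)
print(S_x / S_1, fixation_probability(x, tau0, tau1))
```

The two printed numbers agree to quadrature accuracy, and $S(1)$ is close to $\tau_0/\tau_1$.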

Application of Theorem 1: fixation and extinction of a rare trait

Suppose we have a family in which all species carry a single trait, and we introduce a species with a second trait of the same net growth rate into the population. What are the chances of the new trait getting fixed or lost? What are the implications if the new trait causes a shift in turnover rate? For example, all else being equal, this would correspond to a shift in life span, lifespan being inversely proportional with respect to turnover rate. As another example, recently, polyploidy (the doubling of the genome) has been shown to have no effect on diversification (β0=β1) but to increase turnover rate (τ0>τ1) (Zenil-Ferguson et al. 2019).

To address these questions we observe that if a small fraction $x$ of trait-0 species are inserted into a population otherwise consisting entirely of trait-1 species, then as $x\to0$,

$$q(x) = \frac{\tau_1x}{\tau_0(1-x)+\tau_1x} \sim \frac{\tau_1}{\tau_0}\,x.$$

In addition, by an application of the second statement in Theorem 1 with $f(y)=1$, $y\in[0,1]$, the expected absorption time weighted by the inverse species richness process satisfies

$$E_x\Big[\int_0^{\eta^+}\frac{1}{R_s^+}\,ds\Big] = \frac{-2x\ln(x)-2(1-x)\ln(1-x)}{\tau_0(1-x)+\tau_1x} \sim \frac{-2x\ln(x)+2x}{\tau_0},$$

as $x\to0$. Keeping the turnover rate $\tau_1$ of the original trait fixed, we see that the probability of fixation of the new trait is inversely proportional to the turnover rate $\tau_0$ of the new trait. Thus, traits which cause a burst in both speciation and extinction are suppressed whereas traits with low turnover rates are favored, with respect to possible fixation. Similarly, the expected weighted absorption time decreases with increasing turnover rate $\tau_0$, consistent with shorter time to extinction.
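The small-$x$ behavior can be tabulated directly from the formulas of Theorem 1. The following sketch compares the exact expressions with their asymptotic forms for a few values of the new trait's turnover rate; the chosen values of $x$, $\tau_0$ and $\tau_1$ are illustrative assumptions.

```python
import math

def fixation_probability(x, tau0, tau1):
    """q(x) from Theorem 1."""
    return tau1 * x / (tau0 * (1.0 - x) + tau1 * x)

def weighted_absorption_time(x, tau0, tau1):
    """E_x of the integral of 1/R up to absorption (Theorem 1 with f = 1), 0 < x < 1."""
    numerator = -2.0 * x * math.log(x) - 2.0 * (1.0 - x) * math.log(1.0 - x)
    return numerator / (tau0 * (1.0 - x) + tau1 * x)

x, tau1 = 1e-4, 1.0
for tau0 in (0.5, 1.0, 2.0):
    q_exact = fixation_probability(x, tau0, tau1)
    q_approx = tau1 * x / tau0
    t_exact = weighted_absorption_time(x, tau0, tau1)
    t_approx = (-2.0 * x * math.log(x) + 2.0 * x) / tau0
    print(tau0, q_exact, q_approx, t_exact, t_approx)
```

Both the fixation probability and the weighted absorption time decrease as $\tau_0$ grows, in line with the discussion above.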

An alternative interpretation of the result in Theorem 1 suggests a notion of trait frequency spectrum in analogy to the allele frequency spectrum in population genetics. For this we assume, with reference to the space and time scaling parameter n, that in each of the n time steps forming one evolutionary time unit, trait injections occur at rate θ>0. When one of these events happens, a fraction 1/n of the family gets a new trait of turnover rate τ0. The remaining fraction has turnover rate τ1. We think of the successive injection events representing each time a new trait unrelated to previous ones. Each new trait traces out its own path P+ but all relate to the same R+. Most of the new traits quickly go extinct. A few might survive for a while and even get fixed, eventually. The scaled trait 0 fixation probability per time unit is the large n limit

$$n\,q(1/n) \to \frac{\tau_1}{\tau_0}.$$

A possible interpretation is that the larger the turnover rate $\tau_0$ of an “invading” trait, the smaller is the fixation rate $\tau_1/\tau_0$, and hence, the more efficient is the existing species family in purging such an intruder. For non-negative bounded functions $f$ on the interval [0, 1] with $f(0)=0$, which satisfy the integrability condition $\int_0^1f(y)y^{-1}\,dy<\infty$, we find

$$\lim_{n\to\infty}\theta n\,E_{1/n}\Big[\int_0^{\eta^+}\frac{f(P_r^+)}{R_r^+}\,dr\Big] = \theta\,\frac{\tau_1}{\tau_0}\int_0^1f(y)\frac{2}{\tau_1y}\,dy = \int_0^1f(y)\frac{2\theta}{\tau_0y}\,dy.$$

In particular,

$$\lim_{n\to\infty}\theta n\,E_{1/n}\Big[\int_0^{\eta^+}\frac{P_r^+}{R_r^+}\,dr\Big] = \frac{2\theta}{\tau_0}.$$

The weight function $2\theta\tau_0^{-1}/y$ which arises in the above scheme of limiting expected values plays a similar role as the stationary allele frequency spectrum in population genetics. While such a frequency weight function is not integrable over $y\in(0,1)$, and hence does not allow a probability density interpretation, it does define an intensity measure for a well-defined Poisson random measure on (0, 1). For the Poisson random field approach in population genetics, see Sawyer and Hartl (1992) and e.g. Kaj and Mugal (2016), Section 2.3. Intuitively, for each $y\in(0,1)$, $(2\theta/\tau_0)y^{-1}$ is in this sense the stationary intensity that $(P_t)_{t\ge0}$ occupies the frequency band near $y$, as measured by the size of $(P_t^+/R_t^+)_{t\ge0}$ within the time frame during which the trait remains in the family.

The case $\tau_0=\tau_1$ is the fully neutral case with fixation probability $q(x)=x$ and trait spectrum intensity $2\theta/y$. The case $\tau_0\ne\tau_1$ represents a form of trait selection, which only affects the relative magnitude of trait frequencies present in the species family, but not the shape of the trait frequency spectrum.
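The two scaling limits behind the trait frequency spectrum can be checked numerically from the right-hand side of Theorem 1. The sketch below uses a simple midpoint rule and the test function $f(y)=y$; the helper names and the values of $\theta$, $\tau_0$, $\tau_1$ are assumptions for illustration only.

```python
def midpoint(g, a, b, m=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / m
    return h * sum(g(a + (k + 0.5) * h) for k in range(m))

def weighted_occupation(x, tau0, tau1, f):
    """Right-hand side of the second statement of Theorem 1, for 0 < x < 1."""
    psi_x = tau0 * (1.0 - x) + tau1 * x
    upper = x * midpoint(lambda y: f(y) / y, x, 1.0)
    lower = (1.0 - x) * midpoint(lambda y: f(y) / (1.0 - y), 0.0, x)
    return 2.0 * (upper + lower) / psi_x

theta, tau0, tau1 = 0.7, 2.0, 1.0
for n in (10, 100, 1000, 10000):
    x = 1.0 / n
    q_rate = n * tau1 * x / (tau0 * (1.0 - x) + tau1 * x)     # n * q(1/n)
    spectrum_mass = theta * n * weighted_occupation(x, tau0, tau1, lambda y: y)
    print(n, q_rate, spectrum_mass)
```

As $n$ grows, the first column approaches $\tau_1/\tau_0$ and the second approaches $2\theta/\tau_0$.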

The super-process representation of the limit process

In this final section, we return to the two-type branching process X with rates (1) running on the scale of species generations. The aim is to provide a sketch of the proof of the limit result Proposition 1, shifting to the evolutionary time scale.

As an alternative and from a wider perspective, we may consider X as a measure-branching process with spatial motion on the trait space E={0,1} and spatially dependent binary branching. Each lineage in the branching tree changes its current species trait according to independent copies of the space motion J=(Ju), which is the two-state Markov jump process with jump rates δ01 and δ10 for transitions from 0 to 1 and vice versa. In addition to the on/off motion caused by J, the number of species of each trait develop independently as linear binary branching processes with generating functions

$$F(z;j) = \frac{\lambda_jz^2+\mu_j}{\lambda_j+\mu_j} = \frac{d_j+\tau_j}{2\tau_j}\,z^2 + \frac{\tau_j-d_j}{2\tau_j}, \quad j=0,1.$$

The branching offspring distribution for daughter species of trait $j$ has mean $d_j/\tau_j$, and variance $(\tau_j-d_j)/\tau_j$. Now let us put $\langle X_u,f\rangle = f(0)K_u+f(1)L_u$, $u\ge0$, where $f=(f(0),f(1))$ is a function (row vector) on $E$. We write $E_j$ for the expected value conditional on $J_0=j$, that is $X_0=(1,0)$ when $j=0$ and $X_0=(0,1)$ when $j=1$. Then $X$ is the measure-branching process for which the function $v_u$ defined by

$$v_u(j) = E_j\big[e^{\langle X_u,f\rangle}-1\big],$$

is the unique solution of the integral equation

$$v_u(j) = E_j\big[e^{f(J_u)}-1\big] + E_j\int_0^u\big(F(1+v_{u-r}(J_r);J_r)-1-v_{u-r}(J_r)\big)\tau_{J_r}\,dr, \tag{19}$$

for $j\in E$, cf. Li (2011), Ch. 4.1. It is well-known that suitably scaled measure-valued branching processes converge in the limit of many particles, long times, and small masses to an associated superprocess, in this case a super-on/off process. More exactly, the superprocess is the weak limit as $n\to\infty$, when $n$ independent copies $X^{(k,n)}$, $k=1,\dots,n$, of $X$ with spatial motion $J^{(n)}$ and branching mechanism $F^{(n)}(z;\cdot)$ scaled according to (6), are averaged over the time span $u=nt$. This, however, is the species model under evolutionary time scaling, and hence the superprocess will be the limit $X$ of $X_t^{(n)}=n^{-1}X^{(n)}_{nt}$, $t\ge0$, in the sense of weak convergence in path space as $n\to\infty$. To study the limit process we put

$$V_t^{(n)}[f](j) = n\,E_j\big[e^{\langle X^{(n)}_{nt},f\rangle/n}-1\big],$$

so that $V_t^{(1)}[f](j)=v_t(j)$. Then, for large $n$,

$$\ln E_j\big[e^{\langle X_t^{(n)},f\rangle}\big] = n\ln E_j\big[\exp\{\langle X^{(n)}_{nt},f\rangle/n\}\big] = n\ln\Big(1+\frac{n\,E_j[e^{\langle X^{(n)}_{nt},f\rangle/n}-1]}{n}\Big) \sim V_t^{(n)}[f](j).$$

By using once more the scaling assumption (6), we have

$$P\big(J^{(n)}_{n(t+h)}=1\,\big|\,J^{(n)}_{nt}=0\big) = \delta_{01}^{(n)}\,nh + \frac{o(nh)}{n} = \rho_{01}h + o(h),$$

and the analogous relation for converse transitions from 1 to 0. This means that the limit process of $J^{(n)}_{nt}$, $t\ge0$, is again an on/off process on $E$ but now with jump rates $\rho_{01}$, $\rho_{10}$. For simplicity we retain the notation $J$ in the limit. Also, a calculation shows,

$$n^2\big(F^{(n)}(1+v/n;j)-1-v/n\big)\tau_j^{(n)} = \frac{n^2}{2}\Big(\tau_j^{(n)}(v/n)^2 + d_j^{(n)}\big(2(v/n)+(v/n)^2\big)\Big) \to \beta_jv + \frac{\tau_jv^2}{2}.$$
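The limit of the scaled branching mechanism can be checked numerically. The sketch below assumes the scaling $d_j^{(n)}=\beta_j/n$ and $\tau_j^{(n)}=\tau_j$, which is what the displayed limit requires; the function names and the values of $\beta$, $\tau$ and $v$ are hypothetical.

```python
def generating_function(z, d, tau):
    """F(z) = ((tau + d)/(2*tau)) * z^2 + (tau - d)/(2*tau), binary branching."""
    return (tau + d) / (2.0 * tau) * z * z + (tau - d) / (2.0 * tau)

def scaled_branching_term(v, beta, tau, n):
    """n^2 * (F_n(1 + v/n) - 1 - v/n) * tau with d_n = beta/n and tau_n = tau."""
    w = v / n
    return n * n * (generating_function(1.0 + w, beta / n, tau) - 1.0 - w) * tau

beta, tau, v = 0.8, 1.5, -0.6
limit = beta * v + tau * v * v / 2.0
for n in (10, 100, 1000):
    print(n, scaled_branching_term(v, beta, tau, n), limit)
```

Expanding the generating function shows that the finite-$n$ value differs from the limit by exactly $\beta v^2/(2n)$, so the convergence is of order $1/n$.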

As we combine these observations with Eq. (19), it follows that the function V(n)[f](j) satisfies

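$$V_t^{(n)}[f](j) = E_j\big[n(e^{f(J_t)/n}-1)\big] + E_j\int_0^t\Big(\beta_{J_r}V_{t-r}^{(n)}[f](J_r) + \tau_{J_r}V_{t-r}^{(n)}[f](J_r)^2/2\Big)\,dr + H_t^{(n)},$$

where the remainder term $H^{(n)}$ can be controlled and shown to vanish in the limit $n\to\infty$, see Ch. 4.2 in Li (2011). With reference to the theory of superprocesses we can now conclude that $X^{(n)}$ possesses a weak limit process $X$, such that $V_t[f](j)=\ln E_j[e^{\langle X_t,f\rangle}]$ is the unique solution of the integral equation

$$V_t[f](j) = E_j[f(J_t)] + \int_0^tE_j\Big[\beta_{J_r}V_{t-r}[f](J_r) + \tau_{J_r}V_{t-r}[f](J_r)^2/2\Big]\,dr, \quad j\in E,$$

cf. Prop. 4.5 in Li (2011). The associated stochastic equation for the superprocess is

$$\langle X_t,f\rangle = \langle X_0,f\rangle + \int_0^t\langle X_s,Gf\rangle\,ds + \int_0^t\langle X_s,\beta f\rangle\,ds + M_t(f). \tag{20}$$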

Here, Gf is the infinitesimal generator of J, defined by

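$$Gf(0) = (f(1)-f(0))\rho_{01}, \qquad Gf(1) = (f(0)-f(1))\rho_{10},$$

and $M(f)$ is a continuous martingale with representation

$$M_t(f) = f(0)\int_0^t\sqrt{\tau_0K_s}\,dB_s^0 + f(1)\int_0^t\sqrt{\tau_1L_s}\,dB_s^1,$$

using a 2-dimensional standard Brownian motion $(B^0,B^1)$ with independent components. Hence the quadratic variation of $M(f)$ is

$$\langle M(f),M(f)\rangle_t = \int_0^t\langle X_s,\tau f^2\rangle\,ds.$$

By observing that $\langle X_t,f\rangle$ is the scalar product $f\cdot X_t$ of the vectors $f$ and $X_t$ and that

$$\langle X_s,Gf\rangle + \langle X_s,\beta f\rangle = f\cdot AX_s,$$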

it follows that the equation for X in Proposition 1 is an alternative representation of Eq. (20), cf. Li (2011), Ch. 7.5.

Conditional on ultimate survival, the results for (P,R) now follow from Ito’s formula with

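$$B_t^+ = \int_0^t\sqrt{\frac{\tau_0P_s}{\tau_0P_s+\tau_1(1-P_s)}}\,dB_s^0 + \int_0^t\sqrt{\frac{\tau_1(1-P_s)}{\tau_0P_s+\tau_1(1-P_s)}}\,dB_s^1,$$

and

$$B_t^- = \int_0^t\sqrt{\frac{\tau_0(1-P_s)}{\tau_0(1-P_s)+\tau_1P_s}}\,dB_s^0 - \int_0^t\sqrt{\frac{\tau_1P_s}{\tau_0(1-P_s)+\tau_1P_s}}\,dB_s^1.$$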
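The representation of $B^+$ and $B^-$ in terms of $(B^0,B^1)$ makes it possible to check two facts numerically: each has unit quadratic variation rate, and together with the noise coefficients of $dP_t$ and $dR_t$ they reproduce the covariation rate $(\tau_0-\tau_1)P_t(1-P_t)$. The following sketch does this at a single state; the function names and the numerical state are assumptions for illustration.

```python
import math

def brownian_weights(p, tau0, tau1):
    """Coefficients of (dB^0, dB^1) in dB^+ and dB^- at trait frequency p."""
    psi_plus = tau0 * p + tau1 * (1.0 - p)       # psi(1 - p)
    psi_minus = tau0 * (1.0 - p) + tau1 * p      # psi(p)
    plus = (math.sqrt(tau0 * p / psi_plus),
            math.sqrt(tau1 * (1.0 - p) / psi_plus))
    minus = (math.sqrt(tau0 * (1.0 - p) / psi_minus),
             -math.sqrt(tau1 * p / psi_minus))
    return plus, minus

def state_covariation_rate(p, r, tau0, tau1):
    """Rate of d<P, R>_t computed from the noise terms of dP and dR."""
    plus, minus = brownian_weights(p, tau0, tau1)
    corr = plus[0] * minus[0] + plus[1] * minus[1]    # rate of d<B^+, B^->_t
    sigma_P = math.sqrt(p * (1.0 - p) * (tau0 * (1.0 - p) + tau1 * p) / r)
    sigma_R = math.sqrt(r * (tau0 * p + tau1 * (1.0 - p)))
    return sigma_P * sigma_R * corr

p, r, tau0, tau1 = 0.3, 7.0, 2.0, 0.5
print(state_covariation_rate(p, r, tau0, tau1), (tau0 - tau1) * p * (1.0 - p))
```

The result does not depend on the richness $r$, and for $\tau_0=\tau_1$ the correlation of $B^+$ and $B^-$ vanishes.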

Conclusions

In this work, we presented a stochastic modeling framework for a binary trait-dependent species family. The model provides insights on the joint evolution of trait frequencies and species diversity. Guided by the long-term successful use of stochastic techniques in population genetics, we applied various probabilistic methods to study the effects of diversity-dependence and related properties in the species family, hence developing a phylogenetic methodology. At the core of the study is the interpretation that phylogenetic trait frequency and population genetics allele frequency are closely related from the view point of stochastic models. We showed that evolutionary time re-scaling is a powerful tool to reveal relevant analogies and distinctions, and to provide a bridge between two seemingly distant application areas.

We applied methods, which rely on the analysis of trajectories of solutions, to stochastic differential equations of Wright–Fisher type with diversity dependent selection and genetic drift coefficients. Our main structural result, Proposition 1, derived from an embedding argument into an abstract two-type superprocess, provides the precise dynamics of trait frequency jointly with species richness. In parallel, we also discussed the corresponding Markov generator dynamics, referred to as the Gillespie approach. In doing so, we were able to resolve a paradoxical conclusion in Gillespie (1974), dealing with the effective population size. To put these abstract results into context, we discussed a number of special cases with partly symmetric parameter settings leading up to a fully symmetric, or neutral, case. To help clarify the role of parameters, we identified trait success (or trait fitness) functions, which are further utilized for the application of the model to diversity-dependent interaction in terms of carrying capacity. A particular logistic version of the diversification-dependent model was the subject of an in-depth study of fixation and extinction of traits. The main result, Theorem 1, provides the fixation probability as function of the initial frequency of a newly injected trait, as well as a type of trait frequency spectrum, which is analogous to allele frequency spectrum in population genetics.

Funding

Open access funding provided by Uppsala University.

Footnotes

Sylvain Glémin and Martin Lascoux were supported by the Marie Curie Intra-European Fellowship (Grant Number IEF-623486, project SELFADAPT)

Contributor Information

Ingemar Kaj, Email: ikaj@math.uu.se.

Sylvain Glémin, Email: sylvain.glemin@univ-rennes1.fr.

Daniah Tahir, Email: daniahtahir@gmail.com.

Martin Lascoux, Email: martin.lascoux@ebc.uu.se.

References

  1. Abu Awad D, Coron C. Effects of demographic stochasticity and life-history strategies on times and probabilities to fixation. Heredity. 2018;121:374–386. doi: 10.1038/s41437-018-0118-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Benton MJ, Emerson BC. How did life become so diverse? The dynamics of diversification according to the fossil record and molecular phylogenetics. Paleontology. 2007;50:23–40. [Google Scholar]
  3. Cattiaux P, Collet P, Lambert A, Martinez S, Méléard S, San Martin J. Quasi-stationary distributions and diffusion models in population genetics. Ann Prob. 2009;37:1926–1969. [Google Scholar]
  4. Chevin ML. Species selection and random drift in macroevolution. Evolution. 2016;70:513–525. doi: 10.1111/evo.12879. [DOI] [PubMed] [Google Scholar]
  5. Etheridge A. An introduction to superprocesses. Providence: American Mathematical Society; 2000. [Google Scholar]
  6. Etheridge A. Some mathematical models from population genetics. Berlin: Springer; 2011. [Google Scholar]
  7. Etienne RS, Haegeman B. A conceptual and statistical framework for adaptive radiations with a key role for diversity-dependence. Am Nat. 2012;180:E75–E89. doi: 10.1086/667574. [DOI] [PubMed] [Google Scholar]
  8. Etienne RS, Haegeman B, Stadler T, Aze T, Pearson PN, Purvis A, Phillimore AB. Diversity-dependence brings molecular phylogenies closer to agreement with the fossil record. Proc R Soc Lond B Biol Sci. 2012;297:1300–1309. doi: 10.1098/rspb.2011.1439. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Fournier N, Méléard S. A microscopic probabilistic description of a locally regulated population and macroscopic approximation. Ann Appl Probab. 2004;14:1880–1919. [Google Scholar]
  10. Gamisch A, Comes HP. Clade-age-dependent diversification under high species turnover shapes species richness disparities among tropical rainforest lineages of Bulbophyllum (orchidaceae) BMC Evol Biol. 2019;19:93. doi: 10.1186/s12862-019-1416-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Gillespie JH. Natural selection for within-generation variance in offspring number. Genetics. 1974;76:601–606. doi: 10.1093/genetics/76.3.601.
  12. Gillespie JH. Natural selection for variances in offspring numbers: a new evolutionary principle. Am Nat. 1977;111:1010–1014.
  13. Goldberg EE, Kohn JR, Lande R, Robertson KA, Smith SA, Igic B. Species selection maintains self-incompatibility. Science. 2010;330:493–495. doi: 10.1126/science.1194513.
  14. Haccou P, Jagers P, Vatutin VA. Branching processes: variation, growth, and extinction of populations. Cambridge: Cambridge University Press; 2005.
  15. Han TS, Zheng QJ, Onstein RE, Rojas-Andres BM, Hauenschild F, Muellner-Riehl AN, Xing YW. Polyploidy promotes species diversification of Allium through ecological shifts. New Phytol. 2020;225:571–583. doi: 10.1111/nph.16098.
  16. Ho EKH, Agrawal AF. Aging asexual lineages and the evolutionary maintenance of sex. Evolution. 2017;71:1865–1875. doi: 10.1111/evo.13260.
  17. Igic B, Busch JW. Is self-fertilization an evolutionary dead end? New Phytol. 2013;198:386–397. doi: 10.1111/nph.12182.
  18. Kaj I, Krone SM. The coalescent process in a population with stochastically varying size. J Appl Prob. 2003;40:33–48.
  19. Kaj I, Mugal CF. The non-equilibrium allele frequency spectrum in a Poisson random field framework. Theor Pop Biol. 2016;111:51–64. doi: 10.1016/j.tpb.2016.06.003.
  20. Kaj I, Tahir D. Stochastic equations and limit results for some two-type branching models. Stat Probab Lett. 2019;150:35–46.
  21. Karlin S, Taylor HM. A second course in stochastic processes. New York: Academic Press; 1981.
  22. Lambert A. The branching process with logistic growth. Ann Appl Probab. 2005;15:1506–1535.
  23. Lambert A. Probability of fixation under weak selection: a branching process unifying approach. Theor Popul Biol. 2006;69:419–441. doi: 10.1016/j.tpb.2006.01.002.
  24. Li Z. Measure-valued branching Markov processes. Berlin: Springer; 2011.
  25. Lin YT, Kim H, Doering CR. Features of fast living: on the weak selection for longevity in degenerate birth-death processes. J Stat Phys. 2012;148:647–663.
  26. Maddison WP, Midford PE, Otto SP. Estimating a binary character’s effect on speciation and extinction. Syst Biol. 2007;56:701–710. doi: 10.1080/10635150701607033.
  27. Mallet J. The struggle for existence: how the notion of carrying capacity, K, obscures the links between demography, Darwinian evolution, and speciation. Evol Ecol Res. 2012;14:627–665.
  28. Marshall CR, Quental TB. The uncertain role of diversity dependence in species diversification and the need to incorporate time-varying carrying capacities. Phil Trans R Soc B. 2016;371:20150217. doi: 10.1098/rstb.2015.0217.
  29. Maynard-Smith J. The evolution of sex. Cambridge: Cambridge University Press; 1978.
  30. Morlon H. Phylogenetic approaches for studying diversification. Ecol Lett. 2014;17:508–525. doi: 10.1111/ele.12251.
  31. Nakov T, Beaulieu JM, Alverson AJ. Diatoms diversify and turn over faster in freshwater than marine environments. Evolution. 2019;73:2497–2511. doi: 10.1111/evo.13832.
  32. Øksendal B. Stochastic differential equations, an introduction with applications. Berlin: Springer; 2007.
  33. Parsons TL, Quince C. Fixation in haploid populations exhibiting density dependence I: the non-neutral case. Theor Popul Biol. 2007;72:121–135. doi: 10.1016/j.tpb.2006.11.004.
  34. Parsons TL, Quince C. Fixation in haploid populations exhibiting density dependence II: the quasi-neutral case. Theor Popul Biol. 2007;72:468–479. doi: 10.1016/j.tpb.2007.04.002.
  35. Parsons TL, Quince C, Plotkin JB. Absorption and fixation times for neutral and quasi-neutral populations with density dependence. Theor Popul Biol. 2008;74:302–310. doi: 10.1016/j.tpb.2008.09.001.
  36. Parsons TL, Quince C, Plotkin JB. Some consequences of demographic stochasticity in population genetics. Genetics. 2010;185:1345–1354. doi: 10.1534/genetics.110.115030.
  37. Petrone Mendoza S, Lascoux M, Glémin S. Competitive ability of Capsella species with different mating systems and ploidy levels. Ann Bot. 2018;121:1257–1264. doi: 10.1093/aob/mcy014.
  38. Pianka ER. On r- and K-selection. Am Nat. 1970;104:592–597.
  39. Quental TB, Marshall CR. Diversity dynamics: molecular phylogenies need the fossil record. Trends Ecol Evol. 2010;25:434–441. doi: 10.1016/j.tree.2010.05.002.
  40. Rabosky DL. Diversity-dependence, ecological speciation, and the role of competition in macroevolution. Annu Rev Ecol Evol Syst. 2013;44:481–502.
  41. Rabosky DL, Lovette IJ. Density-dependent diversification in North American wood warblers. Proc R Soc B. 2008;275:2363–2371. doi: 10.1098/rspb.2008.0630.
  42. Sawyer SA, Hartl DL. Population genetics of polymorphism and divergence. Genetics. 1992;132:1161–1176. doi: 10.1093/genetics/132.4.1161.
  43. Sjödin P, Kaj I, Krone S, Lascoux M, Nordborg M. On the meaning and existence of an effective population size. Genetics. 2005;169:1061–1070. doi: 10.1534/genetics.104.026799.
  44. Tahir D, Glémin S, Lascoux M, Kaj I. Modeling a trait-dependent diversification process coupled with molecular evolution on a random species tree. J Theor Biol. 2019;461:189–203. doi: 10.1016/j.jtbi.2018.10.032.
  45. Vellend M. Conceptual synthesis in community ecology. Q Rev Biol. 2010;85:183–206. doi: 10.1086/652373.
  46. Yang X, Lascoux M, Glémin S. Variation in competitive ability with mating system, ploidy and range expansion in four Capsella species. bioRxiv 214866, ver 5 peer-reviewed and recommended by PCI Evolutionary Biology; 2018.
  47. Zenil-Ferguson R, Burleigh JG, Freyman W, Igic B, Mayrose I, Goldberg EE. Interaction among ploidy, breeding system and lineage diversification. New Phytol. 2019;224:1252–1265. doi: 10.1111/nph.16184.

Articles from Journal of Mathematical Biology are provided here courtesy of Springer