2024 Oct 1;89(4):46. doi: 10.1007/s00285-024-02143-3

A general multi-scale description of metastable adaptive motion across fitness valleys

Manuel Esser 1, Anna Kraut 1,2
PMCID: PMC11445367  PMID: 39354121

Abstract

We consider a stochastic individual-based model of adaptive dynamics on a finite trait graph G=(V,E). The evolution is driven by a linear birth rate, a density dependent logistic death rate and the possibility of mutations along the directed edges in E. We study the limit of small mutation rates for a simultaneously diverging population size. Closing the gap between Bovier et al. (Ann Appl Probab 29(6):3541–358, 2019) and Coquille et al. (Electron J Probab 26:1–37, 2021) we give a precise description of transitions between evolutionary stable conditions (ESC), where multiple mutations are needed to cross a valley in the fitness landscape. The system shows a metastable behaviour on several divergent time scales, corresponding to the widths of these fitness valleys. We develop the framework of a meta graph that is constituted of ESCs and possible metastable transitions between them. This allows for a concise description of the multi-scale jump chain arising from concatenating several jumps. Finally, for each of the various time scales, we prove the convergence of the population process to a Markov jump process visiting only ESCs of sufficiently high stability.

Keywords: Adaptive dynamics, Stochastic individual-based models, Birth death processes with immigration, Metastability, Multi-scale limits

Introduction

The theory of evolution aims to understand the adaptation of biological populations to their environment through mutation and selection. Following the principles originally proposed by Darwin, it associates to each individual a fitness, which characterises their ability to survive and produce a growing population. The path of evolution, tracing the types of individuals that were able to fixate in the population, usually follows a sequence of types of increasing fitness. However, in many cases the mutational path has to pass through a number of deleterious or neutral intermediate types in order to reach a type of higher fitness. This can for example be seen in cancer initiation, where multiple driver mutations need to be accumulated to induce an outgrowing population (Martincorena et al. 2017). Other examples are the formation of complex mechanisms like flagella in bacteria, where only partially functional intermediate stages of flagella yield an evolutionary disadvantage but fully functional apparatuses lead to increased fitness (Pallen 2006). See also (De Visser and Krug 2014) for a review of empirical fitness landscapes arising in nature.

When the population needs to cross types of lower fitness in order to reach a fitter type, many such attempts will be unsuccessful. This is because the intermediate unfit types are destined to go extinct within a short time and might not produce a new mutant type before this happens. As a result, the waiting time to cross a valley in the fitness landscape is much longer than the invasion time of fit mutant types that are directly accessible. Once a fit type is attained, however, it rapidly fixates in the population. These dynamics, which can also be analysed in the framework of metastability, as illustrated below, have already been studied heuristically by Gillespie in the 1980s (Gillespie 1984). Since then, fitness valleys have been studied in a variety of mathematical models, ranging from Moran models (Komarova 2007; Gokhale et al. 2009) to multi-type branching processes (Nicholson and Antal 2019).

The model that we want to focus on in this paper is a stochastic individual-based model of adaptive dynamics, for which Bovier, Coquille and Smadi have studied fitness valleys in the simple case of a linear trait space (Bovier et al. 2019). This type of model tracks the sizes of different subpopulations and, in contrast to many others such as the Moran model, does not assume a constant overall population size. It is in this aspect closer to branching processes, where the population size varies over time. However, infinite growth is limited due to competitive interactions. Moreover, selective advantages of certain traits are not prescribed by a fixed parameter but arise through these interactions. This is particularly important for the long-term evolution of the population since the fitness landscape depends on the current composition of the dominant population and changes over time.

This study of the interplay of ecology and evolution goes back to ideas from Metz and Geritz (among others) in the early 1990s (Metz et al. 1992). Shortly after, an individual-based approach was proposed by Bolker and Pacala (1997), and a rigorous construction was first presented by Fournier and Méléard almost 20 years ago (Fournier and Méléard 2004). Since then, these models have been the topic of study for scaling limits in a variety of parameter regimes and extensions to the base model (e.g. Champagnat 2006; Champagnat and Méléard 2011; Baar et al. 2017; Smadi 2017; Bovier et al. 2018; Kraut and Bovier 2019; Champagnat et al. 2021; Coquille et al. 2021). We refer to Bovier (2021) for a comprehensive overview of various scaling limits.

To study the typical long-term behaviour of the population, two scaling parameters are introduced: the carrying capacity $K$, which scales the order of the population size, and the mutation probability $\mu_K$, which scales the frequency of mutation events. For large populations ($K\to\infty$) and rare mutations ($\mu_K\to 0$), different mechanisms that change the state of the population (such as mutations introducing a new type, or interactions between individuals that lead to a new equilibrium state of resident traits) act on different time scales. There are three important time scales in this setting. Ecological interactions between well-established subpopulations, like the competition for resources, can change the composition of the overall population within a short time of order 1. This is related to classical Lotka–Volterra dynamics and leads to equilibrium states between the larger traits. Short-range mutations and the initial exponential growth of small mutant populations can be witnessed on a logarithmic time scale of order $\ln K$. Finally, long-range mutations, in particular those that need to traverse a large fitness valley of width $L$, are quite rare and occur on a time scale of order $1/(K\mu_K^L)$. The distinction between long and short-range mutations depends on the choice of the mutation probability $\mu_K$, where long ranges $L$ satisfy $K\mu_K^L\ll 1$. To obtain a non-trivial limit as $K\to\infty$, the population size is usually rescaled by $K$. As a result, only the established resident traits are visible. Since the ecological changes of these traits happen very fast in comparison with the other time scales, the limit of the population process yields a jump process that transitions between different equilibrium states.

The effects of short-range mutations on the $\ln K$-time scale have been studied extensively by Coquille et al. (2021). The authors give a full description of the limiting dynamics for the scenario of a general finite graph as a trait space. As mentioned above, the crossing of fitness valleys through long-range mutations (on the $1/(K\mu_K^L)$-time scale) has been analysed for a simple linear trait space in Bovier et al. (2019). Moreover, the case of very rare mutations, where even neighbouring traits are regarded as long-range mutations, has already been studied by Champagnat (2006) and Champagnat and Méléard (2011), who showed convergence to the trait substitution sequence or polymorphic evolution sequence.

The present paper closes the gap between the previous works and gives a full description of the jump processes resulting from long-range mutations on general finite trait graphs, thus extending the results of Bovier et al. (2019) to the more general setting of Coquille et al. (2021). This general setting entails that, for a given equilibrium state, there might be several paths to cross the surrounding fitness valley. Concentrating on the decisive, shortest paths, we calculate the rate of a transition to the next evolutionary stable condition and give the precise asymptotics in Theorem 7 and Corollary 8. The length of the shortest paths determines the time scale to cross the valley. Based on this, we introduce the notion of a stability degree $L$ to classify the equilibrium states. Combining several of these steps gives rise to a jump chain that moves on a so-called metastability graph, stated in Corollary 10. This graph typically consists of fitness valleys of different width, which can be crossed on different time scales of the form $1/(K\mu_K^L)$. Depending on the choice of time scale, only some of these transitions are possible (valleys of width strictly larger than $L$ cannot be crossed) or visible (transitions of valleys of width strictly smaller than $L$ are immediate). This leads to different limiting jump processes in Theorem 11.

When long-range mutations are necessary to cross a large fitness valley, the system displays an almost stable behaviour on shorter time scales but can change its state when waiting a long time. This type of phenomenon is also known as metastability. It has been studied mathematically mostly in the context of physics and statistical mechanics (e.g. Cirillo and Nardi 2013). However, the concept is very versatile and can be applied to many dynamical systems, including models for biological processes. This has for example been mentioned in Bovier et al. (2019) for models of adaptive dynamics, and in Dawson and Greven (2014) for population dynamics.

In the former case, as well as in this paper, the role of the traditional physical energy (landscape) is taken over by the fitness (landscape). Instead of passing a critical state of high energy, the process has to cross a valley of negative fitness through a sequence of deleterious mutations. Similarly to the fast dynamics after passing a high energy state, the adaptive dynamics system quickly attains a new metastable equilibrium once a fit mutant is reached due to fast exponential growth. The results of Bovier et al. (2019) and this paper even confirm classical definitions of the mean time for a metastable transition (e.g. Bovier and den Hollander 2015), by proving that the waiting times for jumps between equilibrium states are exponentially distributed when considering the correct time scale.

While single jumps across a fitness valley can be regarded as metastable transitions, the limiting jump chain can be related to the concept of adaptive walks or flights. Those are stochastic processes that directly study the motion of the macroscopic population on the trait space, focussing on successful invasions and omitting the microscopic dynamics (see Krug 2021 for an overview). There are two sources of randomness in adaptive walks: A random fitness landscape and a random motion towards neighbours of higher fitness, according to some transition law. Based on these, properties of interest are the distribution and accessibility of fitness maxima (Schmiegelt and Krug 2014; Nowak and Krug 2015; Berestycki et al. 2016, 2017), as well as the time or path length to reach those maxima (Orr 2003). In adaptive flights, transitions are not just possible between neighbouring traits but from one local fitness maximum to another (Jain and Krug 2005, 2007; Jain 2007; Neidhart and Krug 2011). This relates back to the limiting processes derived in this paper, where the population jumps between equilibrium states that are surrounded by valleys of traits of lower fitness.

A major difference between the models of adaptive walks/flights and adaptive dynamics is that the former assume a fitness landscape that is random but fixed in time, while in the latter case the fitness landscape is dynamic and depends on the current resident traits. As mentioned before, the notion of local fitness maxima can nevertheless be translated. Moreover, if equal competition between all traits is assumed in the adaptive dynamics model, the fitness landscape can again be regarded as fixed. We study this special case in a number of examples. Overall, the results of this paper can be seen as a validation of certain types of adaptive walks or flights, deriving their macroscopic dynamics from a microscopic, individual-based model.

The remainder of this paper is structured as follows: In Sect. 2, we rigorously define the individual-based model of adaptive dynamics, for which we derive our limit theorems. We introduce key quantities, like the fitness of a trait, and recapitulate the most important results of Coquille et al. (2021) that lead to a metastable state on the lnK-time scale. Finally, we heuristically derive the limit behaviour on longer time scales and present the formal convergence results, starting with a single metastable transition in Sect. 2.3 and treating the full jump process in Sect. 2.4. Section 3 is devoted to the discussion of a number of examples that highlight different aspects of the complicated limiting dynamics in an easy set up. The proofs of the main results of this paper can be found in Sect. 4. A combinatorial result on excursions of subcritical birth death processes and the complete version of the results from Coquille et al. (2021) are stated in Appendix A, for the convenience of the reader.

Model and main results

In this chapter we introduce the individual-based model of adaptive dynamics and develop the main results of this paper. After a rigorous definition of the population process and its driving parameters we give a short overview of the behaviour on the time scales of order 1 and lnK in Sect. 2.2. Moreover, in this section we derive the key quantities that lead us to the definition of the notion of an evolutionary stable condition. Our main results on the transition out of an ESC are stated in Sect. 2.3 and we give a heuristic explanation there. Finally, Sect. 2.4 is devoted to our results on multi-scale jump chains and the convergence of the population process. For the convenience of the reader, we provide a preview of the different time scales and the main results of this paper at the end of Sect. 2.1.

Individual-based model

To study the evolution of a heterogeneous population, we consider a classical stochastic individual-based model of adaptive dynamics. Each individual of our haploid population is characterised by its trait, which can be interpreted as its geno- or phenotype. Note that we assume a one to one correspondence between trait and physical properties. In this paper we consider a finite trait space that is given by a directed graph G=(V,E). Here, the set of vertices V represents the possible traits that individuals can obtain. The set of edges E marks the possibility of mutation between traits.

To each trait we associate a number of parameters that describe the dynamics of the system. For $v,w\in V$ and $K\in\mathbb{N}$, denote by

  • $b(v)\in\mathbb{R}_+$, the birth rate of an individual of trait $v$,

  • $d(v)\in\mathbb{R}_+$, the (natural) death rate of an individual of trait $v$,

  • $c_K(v,w)=c(v,w)/K\in\mathbb{R}_+$, the competition imposed by an individual of trait $w$ onto an individual of trait $v$,

  • $\mu_K\in[0,1]$, the probability of mutation at a birth event,

  • $m(v,\cdot)\in\mathcal{M}_p(V)$, the law of the trait of a mutant offspring produced by an individual of trait $v$.

Here, $\mathcal{M}_p(V)$ denotes the set of probability measures on $V$. The parameter $K$ scales the competitive interaction between individuals. It is called carrying capacity and can be interpreted as the environment’s capacity to support life, e.g. through the supply of nutrients or space. The way in which the mutation probability $\mu_K$ may depend on $K$ is discussed below.

To ensure a limited population size and to establish the connection between the possibility of mutation and the edges of our trait graph, we make the following assumptions on our parameters.

Assumption 1

For every $v\in V$, $c(v,v)>0$. Moreover, $m(v,v)=0$ for all $v\in V$, and $(v,w)\in E$ if and only if $m(v,w)>0$.

The evolution of the population over time is described by the Markov process $N^K$ with values in $\mathbb{D}(\mathbb{R}_+,\mathbb{N}^V)$. Here, $N_v^K(t)$ denotes the number of individuals of trait $v\in V$ that are alive at time $t\geq 0$. The process is characterised by its infinitesimal generator:

$$\mathcal{L}^K\phi(N)=\sum_{v\in V}\big(\phi(N+\delta_v)-\phi(N)\big)\Big(N_v b(v)(1-\mu_K)+\sum_{w\in V}N_w b(w)\mu_K m(w,v)\Big)+\sum_{v\in V}\big(\phi(N-\delta_v)-\phi(N)\big)N_v\Big(d(v)+\sum_{w\in V}c_K(v,w)N_w\Big), \tag{1}$$

where $\phi:\mathbb{N}^V\to\mathbb{R}$ is measurable and bounded and $\delta_v$ denotes the unit vector at $v\in V$. The process can be constructed algorithmically following a Gillespie algorithm (Gillespie 1976). Alternatively, the process can be represented via Poisson measures (see Fournier and Méléard 2004), a representation that is used in the proofs of our results. Throughout this paper, we assume that all processes $N^K$, $K\in\mathbb{N}$, are defined on a common probability space. We give an example of a joint construction in the proof of Lemma 14. However, we emphasize that we do not assume a specific dependence or independence between the different processes in order for our results to hold true.
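The algorithmic construction mentioned above can be sketched as a direct Gillespie simulation of the rates in the generator (1). This is a minimal illustration rather than the construction used in the proofs: the function name `simulate`, the dictionary-based layout of the parameters, and the fixed random seed are all assumptions made for this example.

```python
import random

def simulate(b, d, c, m, K, mu, n0, t_max, seed=0):
    """Gillespie simulation of the trait counts up to time t_max (illustrative sketch).

    b, d : dicts trait -> birth rate / natural death rate
    c    : dict (v, w) -> unscaled competition c(v, w); missing pairs count as 0
    m    : dict v -> {w: m(v, w)}, mutation kernel of trait v
    K    : carrying capacity, mu : mutation probability mu_K
    n0   : dict trait -> initial number of individuals
    """
    rng = random.Random(seed)
    traits = list(b)
    n = {v: n0.get(v, 0) for v in traits}
    t = 0.0
    while True:
        # Total birth and death rate of each trait, as in the generator (1).
        birth = {v: n[v] * b[v] for v in traits}
        death = {
            v: n[v] * (d[v] + sum(c.get((v, w), 0.0) / K * n[w] for w in traits))
            for v in traits
        }
        total = sum(birth.values()) + sum(death.values())
        if total == 0.0:            # empty population: absorbing state
            return t, n
        t += rng.expovariate(total)  # exponential waiting time until the next event
        if t > t_max:
            return t_max, n
        u = rng.random() * total     # choose the event proportionally to its rate
        for v in traits:
            if u < birth[v]:
                child = v
                # With probability mu the offspring mutates according to m(v, .).
                if m.get(v) and rng.random() < mu:
                    targets = list(m[v])
                    weights = [m[v][w] for w in targets]
                    child = rng.choices(targets, weights=weights)[0]
                n[child] += 1
                break
            u -= birth[v]
            if u < death[v]:
                n[v] -= 1
                break
            u -= death[v]
```

For a single trait with b = 2, d = 1 and c = 1, the simulated population started at size K fluctuates around K, in line with the equilibrium size of order K discussed in the text.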

We want to study the typical behaviour of this process for large populations and moderately rare mutations. We do not have a fixed population size. However, due to our scaling of $c_K(v,w)$, the equilibrium size of the population is always of order $K$. We therefore consider the limit of the processes $(N^K/K, K\in\mathbb{N})$ as $K\to\infty$ and $\mu_K\to 0$ simultaneously in this paper.

Outlook: In the following sections, we develop the theory to describe the system's behaviour on various time scales. Since the description of each increasing time scale builds on the behaviour on previous shorter time scales, we go through these step by step, introducing the relevant notation as well as previous and new results along the way. To give the reader some orientation, we provide a brief overview of the time scales and preview the main results:

  • During times of order 1, the limiting rescaled stochastic process can be approximated by the solution of deterministic differential equations of Lotka–Volterra type. These describe how the larger subpopulations attain an equilibrium state (if existent). Since we consider the regime of $\mu_K\to 0$, mutations cannot be observed on this time scale.

  • For moderately rare mutations $\mu_K=K^{-1/\alpha}$, mutations occur on the time scale $1/(K\mu_K)$ and mutant subpopulations grow from a single individual to a size of order $K$ on the time scale $\ln K\gg 1/(K\mu_K)$. The limiting dynamics on the $\ln K$-time scale have been described in Coquille et al. (2021). We provide the heuristics of this result in Sect. 2.2 and give the precise statement in A.2. On this time scale, the system evolves until it reaches an equilibrium state, where there are no fit mutant traits of (graph-)distance at most $\alpha$ to the resident traits. This state is what we call an evolutionary stable condition (ESC).

  • In Sect. 2.3, we discuss how, on a more accelerated time scale $1/(K\mu_K^L)$ that corresponds to the distance $L>\alpha$ of the closest fit mutant, the process can escape an ESC. Our first result, Theorem 7, states that the time to produce a new fit mutant outside of the ESC is of order $1/(K\mu_K^L)$ and approximately exponentially distributed with a rate that can be calculated precisely. It moreover states the probabilities to produce specific mutant types. Corollary 8 deduces that the time to reach a new ESC has the same distribution as the time of leaving the old ESC and calculates transition probabilities to reach specific new ESCs. These single transitions between ESC states, which can be regarded as metastable transitions, are used to define the (directed) metastability graph $\mathcal{G}^{\mathrm{ESC}}$ in Definition 9, in the beginning of Sect. 2.4. It consists of subsets of $V$ that allow for an ESC and the possible transitions between them.

  • Since the time scales on which transitions on the metastability graph occur depend on the distances $L$ between fit mutants and current resident traits, the corresponding jump chain (characterised in Corollary 10) cannot be obtained as a limiting process on a single time scale. Instead, if we fix a time scale $1/(K\mu_K^L)$, only transitions of this precise distance $L$ are visible in the limit of $N^K/K$ as $K\to\infty$. Shorter jumps occur immediately and longer jumps cannot be observed. To describe these dynamics, we introduce an $L$-scale graph $\mathcal{G}^{L}$, consisting of all ESCs that are not left immediately on the time scale $1/(K\mu_K^L)$, and characterize the limiting jump process on this graph in Theorem 11.

Short-term dynamics and frequent mutations

A law of large numbers result by Ethier and Kurtz (1986) states that, for $\mu_K\to 0$, the rescaled processes $N^K/K$ converge to the solution of a system of Lotka–Volterra equations. The study of these equations is central to determine the short term evolution, i.e. the evolution on a finite time scale, of the process $N^K$.

Definition 1

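(Lotka-Volterra system, equilibrium states, invasion fitness) For a subset $\mathbf{v}\subseteq V$ we denote by $\mathrm{LVS}(\mathbf{v})$ the system of Lotka-Volterra equations given by

$$\frac{d}{dt}n_v(t)=\Big(b(v)-d(v)-\sum_{w\in\mathbf{v}}c(v,w)\,n_w(t)\Big)n_v(t),\quad v\in\mathbf{v},\ t\geq 0. \tag{2}$$

By $\mathrm{LVE}(\mathbf{v})$, we denote the set of all equilibrium states $\bar n\in\mathbb{R}_{\geq 0}^{\mathbf{v}}$ such that

$$\Big(b(v)-d(v)-\sum_{w\in\mathbf{v}}c(v,w)\,\bar n_w\Big)\bar n_v=0,\quad v\in\mathbf{v}, \tag{3}$$

and by $\mathrm{LVE}_+(\mathbf{v}):=\mathrm{LVE}(\mathbf{v})\cap\mathbb{R}_{>0}^{\mathbf{v}}$ the subset of positive equilibrium states. If $\mathrm{LVE}_+(\mathbf{v})$ consists of a single globally asymptotically stable element, we denote it by $\bar n(\mathbf{v})$ and call it coexistence equilibrium.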

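For a trait $w\in V$ and coexistence equilibrium $\bar n(\mathbf{v})$, we denote by

$$f(w,\mathbf{v})=b(w)-d(w)-\sum_{v\in\mathbf{v}}c(w,v)\,\bar n_v(\mathbf{v}) \tag{4}$$

the invasion fitness of $w$. For a given equilibrium $\bar n(\mathbf{v})$, we call a trait $w$ fit if $f(w,\mathbf{v})>0$ and unfit if $f(w,\mathbf{v})<0$.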

Note that the invasion fitness $f(w,\mathbf{v})$ describes the approximate growth rate of a small population of trait $w$ in a bulk population of coexisting traits $\mathbf{v}$, in the mutation-free system. To simplify notation for later use, in the case of monomorphic equilibria, i.e. $\mathbf{v}=\{v\}$, we write

$$\bar n(v):=\bar n_v(\{v\})\quad\text{and}\quad f(w,v):=f(w,\{v\}). \tag{5}$$
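To make Definition 1 concrete, the following sketch computes a candidate positive equilibrium by solving the linear system behind (3) and then evaluates the invasion fitness (4). The helper names `solve_linear`, `positive_equilibrium` and `invasion_fitness`, as well as the dictionary layout of the parameters, are assumptions of this illustration. The code only checks positivity; it does not verify that the equilibrium is unique or globally asymptotically stable.

```python
def solve_linear(A, r):
    """Solve A x = r by Gauss-Jordan elimination with partial pivoting."""
    k = len(r)
    M = [list(map(float, A[i])) + [float(r[i])] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda i: abs(M[i][col]))
        if abs(M[piv][col]) < 1e-12:
            raise ValueError("singular competition matrix")
        M[col], M[piv] = M[piv], M[col]
        for i in range(k):
            if i != col:
                factor = M[i][col] / M[col][col]
                M[i] = [a - factor * p for a, p in zip(M[i], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]

def positive_equilibrium(residents, b, d, c):
    """Solve sum_w c(v, w) n_w = b(v) - d(v) for v in residents, see (3).

    Returns a dict trait -> equilibrium size, or None if some entry is not positive.
    Stability of the equilibrium is NOT checked here.
    """
    A = [[c[(v, w)] for w in residents] for v in residents]
    r = [b[v] - d[v] for v in residents]
    n_bar = solve_linear(A, r)
    if min(n_bar) <= 0:
        return None
    return dict(zip(residents, n_bar))

def invasion_fitness(w, n_bar, b, d, c):
    """Invasion fitness f(w, v) = b(w) - d(w) - sum_v c(w, v) * n_bar_v, see (4)."""
    return b[w] - d[w] - sum(c[(w, v)] * x for v, x in n_bar.items())
```

With equal competition c = 1, death rate 1 and birth rates 3, 2, 4 for three traits 0, 1, 2, a resident population of trait 0 sits at equilibrium size 2, trait 1 has fitness -1 (unfit) and trait 2 has fitness 1 (fit).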

Going back to the stochastic process $N^K$, it is of interest to study the logarithm of the population size as $K\to\infty$. Only subpopulations with a size of order $K$ are visible in the rescaled limit of $N^K/K$, and exponential growth of the absolute population size translates to linear growth of the $K$-exponent when studying a logarithmic time scale via $e^{t\ln K\cdot f}=K^{t\cdot f}$. This makes it easier to describe the limiting dynamics. We therefore define $\beta^K=(\beta_v^K)_{v\in V}$, where

$$\beta_v^K(t):=\frac{\ln(1+N_v^K(t))}{\ln K}, \tag{6}$$

which is equivalent to $N_v^K(t)=K^{\beta_v^K(t)}-1$. Note that we add or subtract 1 here, respectively, to ensure that $\beta_v^K(t)=0$ if and only if $N_v^K(t)=0$. As $K\to\infty$, $\beta_v^K$ ranges between 0 and 1.
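The change of variables in (6) and its inverse can be written as two small helpers. The function names `beta` and `size` are chosen for this illustration only.

```python
import math

def beta(n, K):
    """Exponent ln(1 + n) / ln K of a subpopulation of size n, as in (6)."""
    return math.log1p(n) / math.log(K)

def size(beta_value, K):
    """Inverse map: population size K**beta - 1 for a given exponent."""
    return K ** beta_value - 1
```

An extinct trait has exponent exactly 0, and a subpopulation of size K - 1 has exponent 1 up to rounding.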

Remark 1

In contrast to Champagnat et al. (2021) and Coquille et al. (2021), we do not rescale the time by $\ln K$ in this definition of $\beta^K$ since we are studying a variety of different time scales.

Based on this definition, we introduce the following subsets of traits.

Definition 2

(macroscopic, microscopic, living and resident traits)

  • (i)

    A trait $v\in V$ with exponent $\beta_v^K$ is called macroscopic if $\liminf_{K\to\infty}\beta_v^K=1$.

  • (ii)

    A trait that is not macroscopic is called microscopic.

  • (iii)

    The set of living traits is the set $V^K_{\text{living}}:=\{v\in V:\beta_v^K>0\}$.

  • (iv)

    A subset of traits $\mathbf{v}\subseteq V$ is called resident if all $v\in\mathbf{v}$ are macroscopic and have a population size close to the coexistence equilibrium $\bar n(\mathbf{v})$.

Remark 2

Note that these definitions are time dependent when considering an evolving population. The macroscopic traits change according to $\beta^K(t)$ and the varying subset of living traits is denoted by $V^K_{\text{living}}(t)$. Most of the time macroscopic and resident traits coincide. A non-resident macroscopic trait is either unfit and will shrink to an order lower than $K$ within a short time, or it is fit and will therefore induce a change in resident traits according to the short-term Lotka–Volterra dynamics.

To study multi-step mutations we consider paths on the trait graph G=(V,E).

Definition 3

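(paths and distances) We denote a (finite) path on $G=(V,E)$ by $\gamma=(\gamma_0,\dots,\gamma_\ell)$ such that $\gamma_i\in V$, $0\leq i\leq\ell$, and $(\gamma_i,\gamma_{i+1})\in E$, $0\leq i\leq\ell-1$.

The length of a path $\gamma=(\gamma_0,\dots,\gamma_\ell)$ is defined as $|\gamma|=\ell$. We write $\gamma:\mathbf{v}\to\mathbf{v}'$ as a short notation for all paths $\gamma$ that connect $\mathbf{v}\subseteq V$ to $\mathbf{v}'\subseteq V$, i.e. that satisfy $\gamma_0\in\mathbf{v}$ and $\gamma_{|\gamma|}\in\mathbf{v}'$.

We introduce the graph distance between two vertices $v,w\in V$ as the length of the shortest connecting path

$$d(v,w):=\min_{\gamma:v\to w}|\gamma|, \tag{7}$$

where the minimum over an empty set is taken to be $\infty$. For two subsets $\mathbf{v},\mathbf{v}'\subseteq V$ we define

$$d(\mathbf{v},\mathbf{v}'):=\min_{v\in\mathbf{v},\,v'\in\mathbf{v}'}d(v,v'). \tag{8}$$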
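The distances (7) and (8) can be computed with a breadth-first search along the directed edges. The sketch below is one possible way to do this; the function names `distances_from` and `graph_distance` and the edge-list representation of the graph are assumptions of this example.

```python
from collections import deque
import math

def distances_from(sources, edges):
    """Directed distance from the set `sources` to every reachable vertex (BFS)."""
    succ = {}
    for v, w in edges:
        succ.setdefault(v, []).append(w)
    dist = {v: 0 for v in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for w in succ.get(v, []):
            if w not in dist:          # first visit gives the shortest path length
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def graph_distance(sources, targets, edges):
    """Set distance as in (8); math.inf if no directed path exists."""
    dist = distances_from(sources, edges)
    return min((dist[w] for w in targets if w in dist), default=math.inf)
```

On the directed line 0 → 1 → 2 → 3, `graph_distance({0}, {3}, edges)` is 3 while `graph_distance({3}, {0}, edges)` is infinite, which illustrates the lack of symmetry on directed graphs.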

Remark 3

Note that $d(\cdot,\cdot)$ is not a distance in the classical sense, as it may not be symmetric in the case of a directed graph.

Along these paths $\gamma$, mutants can be produced. A macroscopic trait produces subpopulations of a size of order $K\mu_K$ of its neighbouring traits, which then produce subpopulations of a size of order $K\mu_K^2$ of the second order neighbours, and so on. These subpopulations, which are produced along a path $\gamma$, can survive as long as $K\mu_K^{\ell}\gg 1$, where $\ell$ is the distance from the macroscopic trait. This motivates the study of mutation probabilities $\mu_K=K^{-1/\alpha}$, $\alpha>0$, where mutants can survive within a radius $\alpha$ of the resident traits.
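As a worked instance of this scaling, substituting the mutation probability $K^{-1/\alpha}$ into the size of the subpopulation at distance $\ell$ from a macroscopic trait gives the following. The value $\alpha=2.5$ is an illustrative assumption and not taken from the text.

```latex
\begin{align*}
K\mu_K^{\ell} &= K\cdot K^{-\ell/\alpha} = K^{1-\ell/\alpha},
\qquad\text{hence}\qquad K\mu_K^{\ell}\gg 1 \iff \ell<\alpha,\\
\alpha = 2.5:\quad K\mu_K^{1} &= K^{0.6},\qquad K\mu_K^{2} = K^{0.2},\qquad K\mu_K^{3} = K^{-0.2}\to 0 .
\end{align*}
```

In this example only traits at distance 1 or 2 carry diverging subpopulations, while traits at distance 3 or more are not sustained by mutation alone.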

Remark 4

We could also study mutation probabilities $\mu_K=f(K)K^{-1/\alpha}$ such that $|\ln(f(K))|\in o(\ln K)$. This would not change the following results. However, we restrict ourselves to the case of $f(K)\equiv 1$ to simplify notation.

To avoid mutant subpopulations with a size of order $K^0=1$ and to ensure that non-resident traits are always either fit or unfit, we make the following assumptions.

Assumption 2

  • (i)

    The mutation probability satisfies $\mu_K=K^{-1/\alpha}$ for some $\alpha\in\mathbb{R}_+\setminus\mathbb{N}$.

  • (ii)

    For each $\mathbf{v}\subseteq V$ such that $\mathrm{LVE}_+(\mathbf{v})=\{\bar n(\mathbf{v})\}$, it holds that $f(w,\mathbf{v})\neq 0$ for all $w\notin\mathbf{v}$.

Remark 5

Both of these assumptions are purely technical. The first one prevents the case where a fit mutant population of size of order 1 can die out due to stochastic fluctuations such that fixation in the population becomes random. The second one allows us to approximate non-resident subpopulations by branching processes that are either super- or subcritical, but not critical. Note that the second assumption is only required for subsets $\mathbf{v}$ that allow for a unique positive equilibrium state (i.e. such that $\mathrm{LVE}_+(\mathbf{v})$ contains exactly one element).

Under these assumptions, the evolution of the population on the time scale $\ln K$ has been studied in Coquille et al. (2021). The authors give an algorithmic description of the limiting evolution of $\beta^K(t\ln K)$ as long as there always exists a unique asymptotically stable equilibrium of the Lotka–Volterra system (2) involving all macroscopic traits. In the following, we give the heuristics of this description. For the precise result we refer to Sect. A.2.

Roughly speaking, for a given set of resident traits $\mathbf{v}$ at their (coexistence) equilibrium $\bar n(\mathbf{v})$, every living microscopic trait $w\in V_{\text{living}}$ can grow (or shrink) with rate at least $f(w,\mathbf{v})$. This is due to the fact that the competitive interaction with all microscopic traits can be neglected in comparison with this rate. If there was no mutation (i.e. $\mu_K=0$), $f(w,\mathbf{v})$ would be the exact growth rate of $w$. However, due to incoming mutants from neighbouring traits, the population size of $w$ is also at least as big as a $\mu_K$-fraction of the population sizes of its (incoming) neighbours. Since we only consider the order of the population size $\beta_w^K$, the largest of these influences dominates the asymptotics and a sum of population sizes (coming from different mutation sources) yields a maximum in the exponent. Overall, we obtain the relation

$$\beta_w^K(t\ln K)\approx\big[\beta_w^K(0)+t\,f(w,\mathbf{v})\big]\vee\max_{u\in V:\,d(u,w)=1}\Big[\beta_u^K(t\ln K)-\frac{1}{\alpha}\Big]. \tag{9}$$

Iterating this argument for traits at increasing distance to $w$ yields that, as long as the resident traits remain unchanged (i.e. traits $\mathbf{v}$ stay close to their equilibrium $\bar n(\mathbf{v})$ and no new traits become macroscopic), $\beta^K(t\ln K)$ converges to $\beta(t)$ such that

$$\beta_w(t)=\max_{u\in V}\Big[\beta_u(0)+(t-t_u)f(u,\mathbf{v})-\frac{d(u,w)}{\alpha}\Big]_+. \tag{10}$$

Here,

$$t_u:=\begin{cases}\inf\big\{s\geq 0:\exists\,u'\in V:\beta_{u'}(s)=\tfrac{1}{\alpha},\ (u',u)\in E\big\}&\text{if }\beta_u(0)=0,\\ 0&\text{if }\beta_u(0)>0.\end{cases} \tag{11}$$

Once a formerly microscopic trait $w$ becomes macroscopic, the population sizes of $\mathbf{v}\cup\{w\}$ follow the Lotka–Volterra dynamics of (2) to reach a new equilibrium associated to the resident traits $\mathbf{v}'\subseteq\mathbf{v}\cup\{w\}$ within a time of order 1 (if such a new unique equilibrium does not exist, or in a number of other technical special cases, the algorithm terminates as described in Sect. A.2). During this phase, the orders of population sizes $\beta_w$ do not change significantly. After the change of resident traits, the population sizes again follow (10), now with the changed fitnesses $f(u,\mathbf{v}')$.

This algorithmic description yields a series of successive resident traits. The macroscopically visible evolution stops as soon as an equilibrium $\mathbf{v}$ is reached such that $f(w,\mathbf{v})<0$ for all $w\in V_{\text{living}}\setminus\mathbf{v}$. All traits $w\in V$ such that $d(\mathbf{v},w)<\alpha$ stay alive due to incoming mutations but all other traits eventually go extinct according to (10) on the $\ln K$-time scale.

This observation leads us to the following definitions (visualised in Fig. 1).

Fig. 1. Trait graph $G=(V,E)$ with an ESC associated to the non-connected set of spotted resident traits $\mathbf{v}$. Unfit traits inside the mutation spreading neighbourhood $V_\alpha(\mathbf{v})$ are marked light blue while those outside are dark orange. In this case, the stability degree is $L(\mathbf{v})=4$. Therefore, only the two nearest of the three fit, spiked green traits make up the set of mutant candidates $V_{\text{mut}}(\mathbf{v})$

Definition 4

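(mutation spreading neighbourhood) For a subset $\mathbf{v}\subseteq V$, we denote by $V_\alpha(\mathbf{v}):=\{w\in V:d(\mathbf{v},w)<\alpha\}$ the mutation spreading neighbourhood of $\mathbf{v}$. The traits at the boundary of $V_\alpha$ are denoted by $\partial V_\alpha(\mathbf{v}):=\{w\in V:d(\mathbf{v},w)=\lfloor\alpha\rfloor\}$.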

Definition 5

((asymptotic) evolutionary stable condition)

  • (i)
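    A subset $\mathbf{v}\subseteq V$ and (orders of) population sizes $\beta$ are called an evolutionary stable condition (ESC) if the traits $\mathbf{v}$ can coexist at a unique globally asymptotically stable equilibrium $\bar n(\mathbf{v})$,
    $$f(w,\mathbf{v})<0,\quad\forall\,w\in V_\alpha(\mathbf{v})\setminus\mathbf{v}, \tag{12}$$
    and
    $$\beta_w=\Big(1-\frac{d(\mathbf{v},w)}{\alpha}\Big)_+,\quad\forall\,w\in V. \tag{13}$$
  • (ii)
    A subset $\mathbf{v}\subseteq V$ and population sizes $(\beta^K)_{K\geq 0}$ are called an asymptotic evolutionary stable condition if the traits $\mathbf{v}$ can coexist at equilibrium $\bar n(\mathbf{v})$, (12) is satisfied,
    $$\Big|\beta_w^K-\big(1-d(\mathbf{v},w)/\alpha\big)\Big|\in O\Big(\frac{1}{\ln K}\Big),\quad\forall\,w\in V_\alpha(\mathbf{v}), \tag{14}$$
    and there exists a $K_0<\infty$ such that $\beta_w^K=0$ for all $K>K_0$ and $w\in V\setminus V_\alpha(\mathbf{v})$.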
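Conditions (12) and (13) of Definition 5 can be checked mechanically once the invasion fitnesses are known. The sketch below is an illustration under stated assumptions: the function names, the edge-list representation of the trait graph, and the `fitness` dictionary (assumed to be precomputed for a resident set that can coexist at a stable equilibrium) are all hypothetical choices made for this example.

```python
from collections import deque

def distances_from(residents, edges):
    """BFS distance from the resident set to every reachable trait."""
    succ = {}
    for v, w in edges:
        succ.setdefault(v, []).append(w)
    dist = {v: 0 for v in residents}
    queue = deque(residents)
    while queue:
        v = queue.popleft()
        for w in succ.get(v, []):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def esc_exponents(traits, edges, residents, fitness, alpha):
    """Return the exponents of (13), or None if condition (12) fails.

    fitness maps each non-resident trait w to its invasion fitness with respect
    to the resident set (assumed given; coexistence is not checked here).
    """
    dist = distances_from(residents, edges)
    # Mutation spreading neighbourhood: traits at distance < alpha.
    neighbourhood = {w for w, k in dist.items() if k < alpha}
    # Condition (12): every non-resident trait in the neighbourhood is unfit.
    if any(fitness[w] >= 0 for w in neighbourhood - set(residents)):
        return None
    # Condition (13): positive part of 1 - distance / alpha, zero if unreachable.
    return {w: max(0.0, 1 - dist[w] / alpha) if w in dist else 0.0 for w in traits}
```

On the directed line 0 → 1 → 2 → 3 → 4 with resident trait 0, $\alpha=2.5$ and only trait 4 fit, the exponents are 1, 0.6, 0.2, 0 and 0.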

Remark 6

  • (i)

    Note that (12) is only a necessary condition for a subset $\mathbf{v}\subseteq V$ to be able to attain an ESC during the evolution of a population. The values in (13) are the orders of population sizes that unfit traits stabilise at purely due to (multi-step) mutations from $\mathbf{v}$. Condition (12) guarantees that these will be reached for $w\in V_\alpha(\mathbf{v})$. To attain an ESC $(\mathbf{v},\beta)$, in addition all other traits $w\in V_{\text{living}}(\tau_{\mathbf{v}})$ that are alive at the time $\tau_{\mathbf{v}}$ when the new equilibrium $\bar n(\mathbf{v})$ is reached have to satisfy $f(w,\mathbf{v})<0$. If this is the case, all traits outside of $V_\alpha(\mathbf{v})$ will die out within a time of order $\ln K$ and (13) will be reached. Otherwise, if there is a $w\in V_{\text{living}}(\tau_{\mathbf{v}})\setminus V_\alpha(\mathbf{v})$ such that $f(w,\mathbf{v})>0$ (the case $f(w,\mathbf{v})=0$ is excluded by Assumption 2), its subpopulation is able to grow, will not die out, and hence will not satisfy (13). The characterization of ESCs is therefore highly dependent on the state of the whole system.

  • (ii)
    Note that the definition of an asymptotic ESC forces the population process to be in an ESC up to a multiplicative error of order one. That is,
    $$N_w^K=\big(K^{(1-d(\mathbf{v},w)/\alpha)_+}-1\big)\times O(1). \tag{15}$$
    The reason for introducing this error is that, for finite $K$, $N_w^K$ might never reach exactly $K^{(1-d(\mathbf{v},w)/\alpha)_+}$. This is for example the case if $\bar n_v(\mathbf{v})<1$ for some $v\in\mathbf{v}$.

By definition, an evolutionary stable condition is surrounded by unfit traits, at least within an α-radius. This form of a fitness landscape is referred to as a fitness valley and has been studied in a special case in Bovier et al. (2019). Based on this, we introduce a measure for the stability of a coexistence equilibrium, connected to the width of the surrounding fitness valley.

Definition 6

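(Stability degree) For a subset $\mathbf{v}\subseteq V$ we define its stability degree $L(\mathbf{v})$ by

$$L(\mathbf{v}):=\begin{cases}\min_{w\in V:\,f(w,\mathbf{v})>0}d(\mathbf{v},w)&\text{if }\mathbf{v}\text{ can coexist},\\ 0&\text{else}.\end{cases} \tag{16}$$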

Remark 7

A subset $\mathbf{v}$ associated to an ESC satisfies $L(\mathbf{v})>\alpha$ by definition. The evolution of the population process reaches a final state, independent of the time scale, once the resident traits satisfy $L(\mathbf{v})=\infty$, i.e. there are no fit traits anymore.

Transitioning out of an ESC and first convergence result

Once an ESC is obtained, there is no further evolution on the lnK-time scale. However, as long as there is a fitter trait that is connected to the resident traits, i.e. that can be reached along a finite path in G, we can witness metastable transitions on an even more accelerated time scale. On this time scale, under certain assumptions on the lnK-dynamics, we observe a direct transition from one ESC to another.

In the following, we consider one of these transitions for an arbitrary initial asymptotic ESC. We split the transition into two phases: In the first phase, a new fit mutant beyond the fitness valley fixates in the population within a time of order $1/(K\mu_K^{L(\mathbf{v})})$. In the second phase, a new ESC is obtained, starting with these new initial conditions, which takes a time of order $\ln K$. We assume that $\mathbf{v}$ and $(\beta^K(0))_{K\ge 0}$ are an asymptotic ESC. We could also consider more general initial conditions that lead to an ESC within finitely many steps of the $\ln K$-algorithm in Coquille et al. (2021), see Remark 11. For the sake of a simpler notation, we stick with the assumption of starting in an (asymptotic) ESC here.

To consider the first phase of the transition, we introduce the set

$$V_{\mathrm{mut}}(\mathbf{v}):=\operatorname*{arg\,min}_{w\in V:\,f(w,\mathbf{v})>0}d(\mathbf{v},w)=\{w\in V:f(w,\mathbf{v})>0,\ d(\mathbf{v},w)=L(\mathbf{v})\}. \tag{17}$$

This consists of all fit mutant traits that are closest to $\mathbf{v}$ (visualised in Fig. 1).

Note that $V_{\mathrm{mut}}(\mathbf{v})\cap V_\alpha(\mathbf{v})=\emptyset$ by the definition of an ESC. It turns out that the traits in $V_{\mathrm{mut}}(\mathbf{v})$ are the only ones that need to be considered for a crossing of the fitness valley since one of them will be the first new trait to fixate in the equilibrium population. If $V_{\mathrm{mut}}(\mathbf{v})=\emptyset$, i.e. $L(\mathbf{v})=\infty$, there is no fitter trait connected to the resident traits and the equilibrium associated to $\mathbf{v}$ is the final state of the population.
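The stability degree in (16) and the set of closest fit mutants in (17) can be computed with a breadth-first search on the trait graph. The sketch below is only an illustration: the adjacency dictionary `edges`, the predicate `is_fit` (standing in for $f(w,\mathbf{v})>0$) and the function names are assumptions, and the check that $\mathbf{v}$ can coexist is taken for granted.

```python
from collections import deque


def distances_from(residents, edges):
    """Breadth-first search: shortest directed path length from the resident set to each trait."""
    dist = {v: 0 for v in residents}
    queue = deque(residents)
    while queue:
        u = queue.popleft()
        for w in edges.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def stability_and_mutants(residents, edges, is_fit):
    """Return (L, V_mut); L is infinite and V_mut empty if no fit trait is reachable."""
    dist = distances_from(residents, edges)
    fit = {w: d for w, d in dist.items() if is_fit(w)}
    if not fit:
        return float("inf"), set()
    L = min(fit.values())
    return L, {w for w, d in fit.items() if d == L}


if __name__ == "__main__":
    # Part of the trait graph of Example 1, read off from the paths named there.
    edges = {"0": ["1a", "1b"], "1a": ["2a"], "1b": ["2b", "2c"]}
    fit_traits = {"2a", "2b", "2c"}
    print(stability_and_mutants(["0"], edges, lambda w: w in fit_traits))
```

With the paths $(0,1a,2a)$, $(0,1b,2b)$ and $(0,1b,2c)$ of Example 1, this returns stability degree 2 and the three mutant candidates $2a$, $2b$, $2c$.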

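For $L(\mathbf{v})<\infty$, we define the stopping time

$$T^K_{\mathrm{fix}}:=\inf\left\{t\ge 0:\exists\,w\in V\setminus V_\alpha(\mathbf{v}):\beta^K_w(t)\ge\frac{1}{\alpha}\right\}, \tag{18}$$

the first time when a new trait reaches a size of order $K^{1/\alpha}$ and can thus produce neighbouring mutants within a time of order 1 and influence the subpopulations of other traits.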

Remark 8

Note that the name $T^K_{\mathrm{fix}}$ might be a little misleading at first glance. Generally, we speak of the fixation of a trait within a population as the event that the subpopulation corresponding to this trait does not go extinct (due to random fluctuations or negative fitness), as long as the fitness landscape stays unchanged. As this event is determined by the future progression of the population, there is no precise time point to pin it to. In particular, whether a trait fixates or goes extinct is not foreseeable at the time point when the first individual of this trait arises. Therefore, we choose instead the time point when the subpopulation has reached a size that guarantees non-extinction with probability 1, asymptotically as $K\to\infty$. We could choose a much smaller size than $K^{1/\alpha}$ for this, however, this will not influence the event of fixation and only change the stopping time by a time of order $\ln K$, which is negligible compared to the much longer time scale on which mutants arise. We thus pick the first time when mutants can influence the population size of other traits.

Our first result describes the limiting distribution of this stopping time $T^K_{\mathrm{fix}}$.

For a path $\gamma:\mathbf{v}\to V_{\mathrm{mut}}(\mathbf{v})$ such that $|\gamma|=L(\mathbf{v})$, the rate at which a $w=\gamma_{L(\mathbf{v})}$ mutant population arises along this path $\gamma$ and fixates can be derived as the product of several factors. The rate at which the first trait in $\gamma$ outside of $V_\alpha(\mathbf{v})$ arises can be calculated in terms of the equilibrium population sizes of the traits in $V_\alpha(\mathbf{v})$ (see Sect. 4.1). This rate then has to be multiplied by the probabilities that all of the following unfit traits on the path $\gamma$ produce mutants of the correct trait before extinction, i.e. during small subcritical excursions. This yields the rate at which single mutants of trait $w$ arise, which finally has to be multiplied by their probability of fixating in the population, i.e. of non-extinction.

In order to calculate the probability of mutation during a subcritical excursion, we need to introduce some notation. For a subset $\mathbf{v}\subset V$ and a trait $w\in V$ we define

$$\rho(w,\mathbf{v}):=\frac{b(w)}{b(w)+d(w)+\sum_{v\in\mathbf{v}}c(w,v)\bar n_v(\mathbf{v})}, \tag{19}$$

which is connected to the probability of a birth event in the branching process approximating the growth of a mutant $w$ in a bulk population of coexisting traits $\mathbf{v}$. Moreover, we let

$$\lambda(\rho):=\sum_{\ell=1}^{\infty}\frac{(2\ell)!}{(\ell-1)!\,(\ell+1)!}\,\rho^{\ell}(1-\rho)^{\ell+1}, \tag{20}$$

which is the expected number of birth events before extinction in a subcritical birth death process with birth probability $\rho$ (related to the expected number of positive jumps in a simple random walk on $\mathbb{N}$ before hitting 0, as explained in the proof of this result in Sect. A, Lemma 17). Note that, for $\rho\in[0,1/2)$, one can explicitly calculate that $\lambda(\rho)=\rho/(1-2\rho)<\infty$. Moreover, the symmetry relation $\lambda(\rho)\rho=\lambda(1-\rho)(1-\rho)$ shows convergence of the series for $\rho\in(1/2,1]$ as well.
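The closed form and the symmetry relation for (20) can be checked numerically. The following sketch sums a truncated version of the series, updating each summand from the previous one to avoid large factorials, and compares it with $\rho/(1-2\rho)$. The function names and the truncation level are illustrative choices, and values of $\rho$ close to $1/2$ would need more terms.

```python
def lam_series(rho, terms=5000):
    """Truncated series sum_{l>=1} (2l)!/((l-1)!(l+1)!) rho^l (1-rho)^(l+1)."""
    x = rho * (1.0 - rho)
    term = rho * (1.0 - rho) ** 2  # summand for l = 1 (coefficient 2!/(0! 2!) = 1)
    total = 0.0
    for l in range(1, terms + 1):
        total += term
        # ratio of consecutive coefficients: (2l+2)(2l+1) / (l (l+2))
        term *= (2 * l + 2) * (2 * l + 1) / (l * (l + 2)) * x
    return total


def lam_closed(rho):
    """Closed form rho / (1 - 2 rho), valid for rho in [0, 1/2)."""
    return rho / (1.0 - 2.0 * rho)


if __name__ == "__main__":
    for rho in (0.1, 0.2, 0.3):
        print(rho, lam_series(rho), lam_closed(rho))
    # symmetry relation lambda(rho) * rho = lambda(1 - rho) * (1 - rho)
    print(0.3 * lam_series(0.3), 0.7 * lam_series(0.7))
```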

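With these definitions, the overall rate of mutants of trait $w\in V_{\mathrm{mut}}(\mathbf{v})$ arising along path $\gamma$ and fixating in the population is approximately equal to $R(\mathbf{v},\gamma)K\mu_K^{L(\mathbf{v})}$, where

$$\begin{aligned}R(\mathbf{v},\gamma):={}&\bar n_{\gamma_0}(\mathbf{v})\left(\prod_{i=1}^{\lfloor\alpha\rfloor}\frac{b(\gamma_{i-1})m(\gamma_{i-1},\gamma_i)}{|f(\gamma_i,\mathbf{v})|}\right)b(\gamma_{\lfloor\alpha\rfloor})m(\gamma_{\lfloor\alpha\rfloor},\gamma_{\lfloor\alpha\rfloor+1})\\&\cdot\left(\prod_{j=\lfloor\alpha\rfloor+1}^{L(\mathbf{v})-1}\lambda(\rho(\gamma_j,\mathbf{v}))m(\gamma_j,\gamma_{j+1})\right)\cdot\frac{f(\gamma_{L(\mathbf{v})},\mathbf{v})}{b(\gamma_{L(\mathbf{v})})}.\end{aligned} \tag{21}$$

Here, the first line is the rate at which the first trait in $\gamma$ outside of $V_\alpha(\mathbf{v})$ arises, which is related to the equilibrium size of trait $\gamma_{\lfloor\alpha\rfloor}$. The first factor in the second line is the probability of producing consecutive mutants during subcritical excursions and the last factor is the fixation probability of trait $w=\gamma_{L(\mathbf{v})}$. Note that, as $b(\gamma_{L(\mathbf{v})})$ increases, so does $f(\gamma_{L(\mathbf{v})},\mathbf{v})$ (cf. (4)), and hence this fixation probability is in fact increasing in the birth rate $b(\gamma_{L(\mathbf{v})})$.

The total rate at which a mutant population of trait $w\in V_{\mathrm{mut}}(\mathbf{v})$ arises and fixates collects all shortest paths that end in $w$ and is approximately equal to $R(\mathbf{v},w)K\mu_K^{L(\mathbf{v})}$, where

$$R(\mathbf{v},w):=\sum_{\substack{\gamma:\mathbf{v}\to w\\|\gamma|=L(\mathbf{v})}}R(\mathbf{v},\gamma). \tag{22}$$

Finally, the total rate at which any mutant population of a trait in $V_{\mathrm{mut}}(\mathbf{v})$ arises and fixates, i.e. the rate at which the population exits the ESC associated to $\mathbf{v}$, is approximately equal to $R(\mathbf{v})K\mu_K^{L(\mathbf{v})}$, where

$$R(\mathbf{v}):=\sum_{w\in V_{\mathrm{mut}}(\mathbf{v})}R(\mathbf{v},w). \tag{23}$$

The probability that this population is of trait $w\in V_{\mathrm{mut}}(\mathbf{v})$ is proportional to the rate $R(\mathbf{v},w)$.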
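The product structure of the path rate (21) can be sketched in a few lines. The reading of (21) used here is an assumption: the first product runs up to $\lfloor\alpha\rfloor$, the unfit invasion fitnesses enter through their absolute value, and $\lambda$ is evaluated with the closed form $\rho/(1-2\rho)$ for subcritical traits. All names (`path_rate`, the dictionaries `b`, `m`, `f`, `rho`) and all numbers are hypothetical.

```python
import math


def lam(rho):
    """Expected number of births in a subcritical excursion, closed form for rho < 1/2."""
    return rho / (1.0 - 2.0 * rho)


def path_rate(path, alpha, nbar0, b, m, f, rho):
    """Sketch of R(v, gamma) along path = (gamma_0, ..., gamma_L).

    nbar0: equilibrium size of the resident gamma_0; b[x]: birth rates;
    m[(x, y)]: mutation kernel; f[x]: invasion fitness of x against the residents;
    rho[x]: birth probability (19) of x against the residents.
    """
    a = math.floor(alpha)
    L = len(path) - 1
    rate = nbar0
    for i in range(1, a + 1):  # traits inside the mutation spreading neighbourhood
        rate *= b[path[i - 1]] * m[(path[i - 1], path[i])] / abs(f[path[i]])
    rate *= b[path[a]] * m[(path[a], path[a + 1])]  # first mutant outside of it
    for j in range(a + 1, L):  # subcritical excursions further along the valley
        rate *= lam(rho[path[j]]) * m[(path[j], path[j + 1])]
    return rate * f[path[L]] / b[path[L]]  # fixation probability of the fit mutant


if __name__ == "__main__":
    b = {0: 2.0, 1: 1.0, 4: 3.0}
    m = {(0, 1): 0.5, (1, 4): 1.0}
    f = {1: -0.5, 4: 1.5}
    print(path_rate((0, 1, 4), alpha=1.5, nbar0=4.0, b=b, m=m, f=f, rho={}))
```

Summing `path_rate` over all shortest paths ending in $w$ gives $R(\mathbf{v},w)$ as in (22), and summing over $w\in V_{\mathrm{mut}}(\mathbf{v})$ gives the exit rate $R(\mathbf{v})$ of (23).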

With these heuristics, we can now state the first main result of this paper.

Theorem 7

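Let $G=(V,E)$ be a finite graph. Suppose that Assumptions 1 and 2 are satisfied and consider the model defined by (1) with $\mu_K=K^{-1/\alpha}$. Assume that $\mathbf{v}\subset V$ and $(\beta^K(0))_{K\ge 0}$ are an asymptotic ESC. Then there exist constants $\varepsilon_0>0$ and $0<c<\infty$ such that, for all $0<\varepsilon<\varepsilon_0$, there exist exponential random variables $E_+^K(\varepsilon)$ and $E_-^K(\varepsilon)$ with parameters $R(\mathbf{v})(1+c\varepsilon)$ and $R(\mathbf{v})(1-c\varepsilon)$, such that

$$\liminf_{K\to\infty}P\left(E_-^K(\varepsilon)\ge T^K_{\mathrm{fix}}K\mu_K^{L(\mathbf{v})}\ge E_+^K(\varepsilon)\right)\ge 1-c\varepsilon. \tag{24}$$

Moreover, for all $w\in V$, the probability of $w$ being the trait to trigger $T^K_{\mathrm{fix}}$ is

$$\lim_{K\to\infty}P\left(\beta^K_w(T^K_{\mathrm{fix}})=1/\alpha\right)=\begin{cases}R(\mathbf{v},w)/R(\mathbf{v})&\text{if }w\in V_{\mathrm{mut}}(\mathbf{v}),\\0&\text{else}.\end{cases} \tag{25}$$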

Remark 9

Note that traits $w\in V_\alpha(\mathbf{v})$ do not attain $\beta^K_w=1/\alpha$ before $T^K_{\mathrm{fix}}$ due to the assumption that $\alpha\notin\mathbb{N}$. Therefore the probability in (25) is zero for such traits.
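Theorem 7 suggests a simple way to sample an approximate fixation event: an exponential waiting time with rate $R(\mathbf{v})$ on the time scale $1/(K\mu_K^{L(\mathbf{v})})=K^{L(\mathbf{v})/\alpha-1}$, and a fixating trait drawn with weights $R(\mathbf{v},w)$. The sketch below only illustrates this limiting description and is not a simulation of the individual-based model; the rates and all names are hypothetical.

```python
import random


def valley_time_scale(K, alpha, L):
    """1/(K mu_K^L) with mu_K = K^(-1/alpha), i.e. K^(L/alpha - 1)."""
    return K ** (L / alpha - 1.0)


def sample_fixation(rates, K, alpha, L, rng):
    """Draw (approximate T_fix, fixating trait) from the limiting description.

    rates: dict mapping each trait w in V_mut to its (hypothetical) rate R(v, w).
    """
    total = sum(rates.values())  # exit rate R(v)
    t = rng.expovariate(total) * valley_time_scale(K, alpha, L)
    traits = list(rates)
    w = rng.choices(traits, weights=[rates[x] for x in traits])[0]
    return t, w


if __name__ == "__main__":
    rng = random.Random(0)
    rates = {"2a": 1.0, "2b": 2.0, "2c": 1.0}  # hypothetical values of R({0}, w)
    print(valley_time_scale(10**6, 1.5, 2))
    print(sample_fixation(rates, K=10**6, alpha=1.5, L=2, rng=rng))
```

Since $L(\mathbf{v})>\alpha$ for an ESC, the exponent $L(\mathbf{v})/\alpha-1$ is positive and the time scale diverges with $K$.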

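Once some $w\in V_{\mathrm{mut}}(\mathbf{v})$ has reached $\beta^K_w\ge 1/\alpha$, the $\ln K$-dynamics evolve as described in Coquille et al. (2021), initiated with $\beta^K_w=1/\alpha$ and $\beta^K_u=(1-d(\mathbf{v},u)/\alpha)_+$, for $u\in V\setminus\{w\}$. These dynamics are deterministic and, in case they do not terminate early and lead to a new ESC, we denote the associated set of resident traits by $\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w)$.

Observe that there is no general formula to express $\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w)$ in terms of $\mathbf{v}$ and $w$ and the parameters of the system. An interesting case is illustrated in Example 3.

Under the assumption that all traits $w\in V_{\mathrm{mut}}(\mathbf{v})$ lead to asymptotic ESCs $\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w)$, we define the stopping time at which one of these asymptotic ESCs is obtained by

$$\begin{aligned}T^K_{\mathrm{ESC}}:=\inf\Big\{t\ge T^K_{\mathrm{fix}}:\exists\,w\in V_{\mathrm{mut}}(\mathbf{v}):{}&\forall u\in V_\alpha(\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w)):\left|\beta^K_u(t)-\left(1-\frac{d(\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w),u)}{\alpha}\right)\right|<\varepsilon_K,\\&\forall u\notin V_\alpha(\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w)):\beta^K_u(t)=0\Big\},\end{aligned} \tag{26}$$

where we pick $\varepsilon_K=C/\ln K$ for a large enough $0<C<\infty$. Then this definition is precisely in line with the definition of an asymptotic ESC.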

Remark 10

The minimal necessary C can be made precise using the prefactors of the population sizes in equilibrium, calculated in Lemma 14. We refrain from doing so here as it is notationally very heavy and does not provide any deeper insight.

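Since the time $T^K_{\mathrm{ESC}}-T^K_{\mathrm{fix}}$ is of order $\ln K$, the asymptotics for $T^K_{\mathrm{fix}}$ translate to $T^K_{\mathrm{ESC}}$. Moreover, the transition probabilities from one ESC to another can be expressed in terms of the probabilities of traits $w\in V_{\mathrm{mut}}(\mathbf{v})$ fixating in the population. For $\mathbf{w}\subset V$ we define

$$p(\mathbf{v},\mathbf{w}):=\sum_{\substack{w\in V_{\mathrm{mut}}(\mathbf{v}):\\\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w)=\mathbf{w}}}\frac{R(\mathbf{v},w)}{R(\mathbf{v})}. \tag{27}$$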

Example 1 treats a case where this probability is indeed the sum over multiple mutant candidates w.
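The sum in (27) simply groups the mutant candidates by the ESC they lead to. A minimal sketch, using the map from mutants to new ESCs stated in Example 1 together with hypothetical values for the rates $R(\{0\},w)$ (the function and variable names are also assumptions):

```python
from collections import defaultdict


def transition_probabilities(rates, v_esc):
    """p(v, .) as in (27): group the fixation probabilities R(v, w)/R(v) by target ESC.

    rates: dict w -> R(v, w) for w in V_mut(v)
    v_esc: dict w -> set of resident traits of the ESC reached after w fixates
    """
    total = sum(rates.values())  # exit rate R(v)
    p = defaultdict(float)
    for w, r in rates.items():
        p[frozenset(v_esc[w])] += r / total
    return dict(p)


if __name__ == "__main__":
    rates = {"2a": 1.0, "2b": 2.0, "2c": 1.0}           # hypothetical rates
    v_esc = {"2a": {"2a"}, "2b": {"4"}, "2c": {"4"}}    # as stated in Example 1
    print(transition_probabilities(rates, v_esc))
```

Here the mutants $2b$ and $2c$ both lead to the ESC $\{4\}$, so their fixation probabilities add up.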

We can now state the result on transitions between ESCs as a direct corollary of Theorem 7.

Corollary 8

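Suppose the same assumptions as in Theorem 7 are satisfied. Moreover, assume that, for every $w\in V_{\mathrm{mut}}(\mathbf{v})$, the algorithmic description of the $\ln K$-dynamics in Sect. A.2, initiated with

$$\beta_u(0)=\begin{cases}\dfrac{1}{\alpha}&\text{if }u=w,\\[1ex]\left(1-\dfrac{d(\mathbf{v},u)}{\alpha}\right)_+&\text{else},\end{cases} \tag{28}$$

does not stop early due to one of its termination criteria and reaches an ESC associated to some traits $\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w)$ after finitely many steps. Then, $T^K_{\mathrm{ESC}}-T^K_{\mathrm{fix}}\in O(\ln K)$ and therefore, with the same constants $\varepsilon_0$ and $c$ and with the same random variables $E_+^K(\varepsilon)$ and $E_-^K(\varepsilon)$ as in Theorem 7,

$$\liminf_{K\to\infty}P\left(E_-^K(\varepsilon)\ge T^K_{\mathrm{ESC}}K\mu_K^{L(\mathbf{v})}\ge E_+^K(\varepsilon)\right)\ge 1-c\varepsilon. \tag{29}$$

Moreover, for all $\mathbf{w}\subset V$,

$$\lim_{K\to\infty}P\left(\{u\in V:\beta^K_u(T^K_{\mathrm{ESC}})>1-\varepsilon_K\}=\mathbf{w}\right)=p(\mathbf{v},\mathbf{w}). \tag{30}$$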

Remark 11

  • (i)

    Note that Theorem 7 and Corollary 8 only consider a specific transition from the ESC associated to some $\mathbf{v}$ to another ESC. The constants $\varepsilon_0$ and $c$ can however be chosen uniformly for all ESCs by reason of the finite trait graph.

  • (ii)

    Both results assume that the system starts out in an asymptotic ESC. These are the natural initial conditions, particularly when a first transition between asymptotic ESCs has already occurred. We could however allow for more general initial conditions, as long as they lead to an asymptotic ESC within finitely many steps of the lnK-algorithm.

Multi-scale jump chain and limiting Markov jump processes

Building on the previous description of a single transition step from one ESC to another, we now want to describe the multi-step transitions between ESCs as a jump chain $(\mathbf{v}^{(k)})_{k\ge 0}$ on a meta-graph. We first introduce the underlying metastability graph $G_{\mathrm{ESC}}$, consisting of all sufficiently stable macroscopic equilibrium configurations, and then describe the dynamics of the jump chain. Finally, we give a convergence result that derives different Markov jump processes, depending on the chosen time scale.

Definition 9

(Metastability graph) As vertices for the general metastability graph $G_{\mathrm{ESC}}=(V_{\mathrm{ESC}},E_{\mathrm{ESC}})$ we take all sets of resident traits that correspond to an ESC, i.e. that have stability degree strictly bigger than $\alpha$, and edges represent possible transitions to other ESCs. More precisely,

$$V_{\mathrm{ESC}}:=\{\mathbf{v}\subset V:L(\mathbf{v})>\alpha\}, \tag{31}$$
$$E_{\mathrm{ESC}}:=\{(\mathbf{v},\mathbf{w}):\exists\,w\in V_{\mathrm{mut}}(\mathbf{v})\text{ s.t. }\mathbf{w}=\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w)\}. \tag{32}$$

Recall that $\mathbf{v}_{\mathrm{ESC}}(\mathbf{v},w)$ stands for the resident traits associated to the new ESC that is attained at the end of the $\ln K$-algorithm being started with resident set $\mathbf{v}$ and invading mutant $w\in V_{\mathrm{mut}}(\mathbf{v})$. We already assigned to each vertex $\mathbf{v}\in V_{\mathrm{ESC}}$ the exit rate $R(\mathbf{v})$ in (23) and to each edge $(\mathbf{v},\mathbf{w})\in E_{\mathrm{ESC}}$ the transition probability $p(\mathbf{v},\mathbf{w})$ in (27).

Using Corollary 8, we can now work out inductively the multi-scale jump chain $(\mathbf{v}^{(k)})_{k\ge 0}$ on $G_{\mathrm{ESC}}$. To this end, let $\mathbf{v}^{(0)}\in V_{\mathrm{ESC}}$ be the resident traits of the initial ESC that the process starts in and set $T^{(0,K)}_{\mathrm{ESC}}:=0$. We describe the $k$th transition, for $k\ge 1$, conditioned on the knowledge of $\mathbf{v}^{(k-1)}$. We denote the set of closest fit mutant traits by $V^{(k)}_{\mathrm{mut}}=V_{\mathrm{mut}}(\mathbf{v}^{(k-1)})$, the width of the next fitness valley to cross by $L^{(k)}=L(\mathbf{v}^{(k-1)})$, and the exit rate by $R^{(k)}=R(\mathbf{v}^{(k-1)})$. Moreover, we keep track of the time when the first mutant population fixates and when the next ESC is reached by introducing the stopping times

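$$T^{(k,K)}_{\mathrm{fix}}:=\inf\left\{t\ge T^{(k-1,K)}_{\mathrm{ESC}}:\exists\,w\in V\setminus V_\alpha(\mathbf{v}^{(k-1)}):\beta^K_w(t)\ge\frac{1}{\alpha}\right\}, \tag{33}$$
$$\begin{aligned}T^{(k,K)}_{\mathrm{ESC}}:=\inf\Big\{t\ge T^{(k,K)}_{\mathrm{fix}}:\exists\,w\in V^{(k)}_{\mathrm{mut}}:{}&\forall u\in V_\alpha(\mathbf{v}_{\mathrm{ESC}}(\mathbf{v}^{(k-1)},w)):\left|\beta^K_u(t)-\left(1-\frac{d(\mathbf{v}_{\mathrm{ESC}}(\mathbf{v}^{(k-1)},w),u)}{\alpha}\right)\right|<\varepsilon_K,\\&\forall u\notin V_\alpha(\mathbf{v}_{\mathrm{ESC}}(\mathbf{v}^{(k-1)},w)):\beta^K_u(t)=0\Big\},\end{aligned} \tag{34}$$

with $\varepsilon_K$ as in (26).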

With this notation, we can now state the result on the $k$th transition of the multi-scale jump chain.

Corollary 10

Assume that we constructed the process up to time $T^{(k-1,K)}_{\mathrm{ESC}}$, when the ESC associated to $\mathbf{v}^{(k-1)}$ is obtained, and suppose the same assumptions as in Theorem 7 are satisfied. Moreover assume that the $\ln K$-dynamics behave as in Corollary 8, for every $w\in V^{(k)}_{\mathrm{mut}}$. Then there exist constants $\varepsilon_0>0$ and $0<c<\infty$ such that, for all $0<\varepsilon<\varepsilon_0$, there are exponentially distributed random variables $E_+^{(k,K)}(\varepsilon)$ and $E_-^{(k,K)}(\varepsilon)$ with parameters $R_\pm^{(k)}(\varepsilon):=R^{(k)}(1\pm c\varepsilon)$ such that

$$\liminf_{K\to\infty}P\left(E_-^{(k,K)}(\varepsilon)\ge\left(T^{(k,K)}_{\mathrm{ESC}}-T^{(k-1,K)}_{\mathrm{ESC}}\right)K\mu_K^{L^{(k)}}\ge E_+^{(k,K)}(\varepsilon)\,\Big|\,\mathbf{v}^{(k-1)}\right)\ge 1-c\varepsilon. \tag{35}$$

Moreover, for all $\mathbf{w}\subset V$,

$$\lim_{K\to\infty}P\left(\{v\in V:\beta^K_v(T^{(k,K)}_{\mathrm{ESC}})>1-\varepsilon_K\}=\mathbf{w}\,\Big|\,\mathbf{v}^{(k-1)}\right)=p(\mathbf{v}^{(k-1)},\mathbf{w}). \tag{36}$$

The preceding corollary allows us to construct a limiting random jump chain $(\mathbf{v}^{(k)})_{k\ge 0}$ on the metastability graph $G_{\mathrm{ESC}}$. To be precise, given the current state $\mathbf{v}^{(k-1)}$, the next ESC $\mathbf{v}^{(k)}$ is taken at random from $V_{\mathrm{ESC}}$ with probability distribution $p(\mathbf{v}^{(k-1)},\cdot)$. However, the jumps take place on varying time scales of type $1/(K\mu_K^{L^{(k)}})$. The construction is valid until an ESC is obtained such that some mutant $w\in V^{(k)}_{\mathrm{mut}}$ does not induce a unique new ESC, following the deterministic $\ln K$-dynamics. A visualisation of the metastability graph including a particular realisation of the jump chain is given in Fig. 2.

Fig. 2.

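Metastability graph $G_{\mathrm{ESC}}$ including a jump chain $(\mathbf{v}^{(k)})_{k\ge 0}$, where $\mathbf{v}_i^{(4)}=\mathbf{v}_{\mathrm{ESC}}(\mathbf{v}^{(3)},w_i)$, for $V_{\mathrm{mut}}(\mathbf{v}^{(3)})=\{w_1,w_2,w_3\}$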

After this general description of the multi-scale jump chain we can now easily elaborate the true Markov jump process on each time scale. To be more precise, for each stability degree $L>\alpha$, we are looking for the limit process of $N^K_w(t/(K\mu_K^L))/K$, for $K\to\infty$. The support of this process jumps between sets of coexisting traits of sufficiently high stability degree, which can only be exited on this time scale. In this context, we define the level sets of equal stability degree as

$$S^L:=\{\mathbf{v}\subset V:\mathrm{LVE}_+(\mathbf{v})=\{\bar n(\mathbf{v})\},\ L(\mathbf{v})=L\}. \tag{37}$$

Note that, for $L>\alpha$, a stability degree of $L(\mathbf{v})=L$ ensures that the coexisting traits $\mathbf{v}$ allow for an asymptotic ESC, see Remark 7.

As the state space for the limiting jump process, we introduce the $L$-scale graph $G^L$, which is a collapsed version of $G_{\mathrm{ESC}}$. The vertex set consists of all ESCs that are stable enough to be visible on the respective time scale. Therefore, we set

$$V^L:=\bigcup_{L'\ge L}S^{L'}. \tag{38}$$

Note that it is possible that the process jumps into an ESC $\mathbf{v}\in S^{L'}$, for $L'>L$, on the $1/(K\mu_K^L)$-time scale. However, there is no possibility to escape from those on this time scale, which means that these are absorbing states.

Edges $E^L$ in $G^L$ represent possible transitions of the limiting process. To construct these, we study the limiting jump chain from Corollary 10.

In order to use the corollary and in particular the process $(\mathbf{v}^{(k)})_{k\ge 0}$, we have to ensure that, for fixed $L>\alpha$, this process always reaches an ESC of stability degree at least $L$ in finitely many steps.

Assumption 3

$$\forall\,\mathbf{v}\in S^L:\quad P\left(\exists\,k\in\mathbb{N}_{>0}:L(\mathbf{v}^{(k)})\ge L\,\Big|\,\mathbf{v}^{(0)}=\mathbf{v}\right)=1 \tag{39}$$

Note that, if this assumption is satisfied for some fixed $L$, this has no implications for its validity for different $L'\ne L$. This is due to the fact that only the initial conditions $\mathbf{v}\in S^L$ are considered. One can easily think of counterexamples where $G_{\mathrm{ESC}}$ is non-connected such that there may be cycles of lower time scale but there is no danger of running into them. For a broader discussion of the assumption we refer to Examples 4 and 5.

Remark 12

If the process runs into a cycle or stable cluster on a lower time scale, there are still possibilities to escape from these by accelerating and looking at higher time scales. The detailed description of such behaviour is much more involved. This is mainly due to technical reasons: Errors accumulate in the approximation of each transition step. As long as it is ensured that the system reaches a (sufficiently stable) ESC after finitely many steps, these errors can be iteratively bounded to ensure convergence. This however fails if the number of lower time scale transitions between higher time scale jumps is not bounded. Heuristically, if one can observe ergodic behaviour on the $L'$-scale graph, for some $L'<L$, transitions out of the ergodic cluster will occur along one of the shortest fitness valleys of width $L$. Transition rates will be weighted according to the stationary distribution on states in $S^{L'}$ and the transition takes a time of order $1/(K\mu_K^L)$. Rather than defining vertices of $G^L$ as single sets of coexisting traits in $S^L$, one would then choose communication classes of such sets in $S^{L'}$ (for possibly multiple $L'<L$) that support an ergodic stationary distribution. Rigorously justifying this argument is a topic of current and future research of the authors.

Now asking for possible jumps in $G^L$ we have to respect again the principle that jumps on lower time scales are absorbed in those happening on the $1/(K\mu_K^L)$-time scale. This means that the critical event for a transition starting in $\mathbf{v}\in S^L$ is to escape from $\mathbf{v}$, which needs a time of order $1/(K\mu_K^L)$. Compared to that, the subsequent transitions in $G_{\mathrm{ESC}}$ until reaching again a state $\mathbf{w}$ of stability $L(\mathbf{w})\ge L$ take place in very short time. Therefore we say that the (directed) edge $(\mathbf{v},\mathbf{w})$ is in $E^L$ if and only if $L(\mathbf{v})=L$ and there exists a finite path $\Gamma:\mathbf{v}\to\mathbf{w}$ in $G_{\mathrm{ESC}}$ such that $L(\Gamma_i)<L$, $1\le i<|\Gamma|$.

The probability of possible transitions $(\mathbf{v},\mathbf{w})\in E^L$ is then the sum over all possible paths $\Gamma$ that give rise to this edge, while the probability of taking a particular path is easily computed as the product of its segments in $G_{\mathrm{ESC}}$.

$$p^L(\mathbf{v},\mathbf{w}):=\sum_{\substack{\Gamma:\mathbf{v}\to\mathbf{w}\\L(\Gamma_i)<L,\ 1\le i<|\Gamma|}}\ \prod_{i=1}^{|\Gamma|}p(\Gamma_{i-1},\Gamma_i) \tag{40}$$

For an explanatory computation of these probabilities we refer to Examples 6 and 7.

The transition rates for the jumps on the $1/(K\mu_K^L)$-time scale are then given by the overall rate to escape from $\mathbf{v}$ weighted with the transition probability to end in $\mathbf{w}$.

$$R^L(\mathbf{v},\mathbf{w}):=R(\mathbf{v})\,p^L(\mathbf{v},\mathbf{w}) \tag{41}$$
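The path sum in (40) can be evaluated without enumerating paths, by pushing probability mass along the jump chain and collecting it whenever it hits a state of stability degree at least $L$. The sketch below does this iteratively, which also handles cycles among less stable ESCs as long as they are left with probability one. The graph and stability degrees are those described for Example 4; the branching probability `q` standing in for $p(\{3\},\{4\})$ and all names are hypothetical.

```python
def collapsed_probabilities(start, p, L_of, L, tol=1e-12, max_steps=10_000):
    """Approximate p^L(start, .): law of the first state with stability >= L after leaving start.

    p: dict state -> dict of one-step transition probabilities in the metastability graph
    L_of: dict state -> stability degree
    """
    mass = dict(p[start])  # distribution after the first jump out of start
    result = {}
    for _ in range(max_steps):
        pushed = {}
        for state, q in mass.items():
            if L_of[state] >= L:   # stable enough: this path ends here
                result[state] = result.get(state, 0.0) + q
            else:                  # less stable: jump again on a faster time scale
                for nxt, pr in p[state].items():
                    pushed[nxt] = pushed.get(nxt, 0.0) + q * pr
        mass = pushed
        if sum(mass.values()) < tol:
            break
    return result


if __name__ == "__main__":
    q = 0.3  # hypothetical value of p({3}, {4})
    p = {"0": {"2": 1.0}, "2": {"3": 1.0}, "3": {"4": q, "7": 1.0 - q},
         "7": {"2": 1.0}, "4": {"6": 1.0}, "6": {}}
    L_of = {"0": 2, "2": 1, "3": 1, "7": 1, "4": 2, "6": float("inf")}
    print(collapsed_probabilities("0", p, L_of, L=2))
```

Multiplying the resulting probabilities by the exit rate $R(\mathbf{v})$ of the starting state gives the rates in (41).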

Now we are prepared to formulate the main result, i.e. the convergence to a Markov jump process on different time scales.

Theorem 11

Let $L>\alpha$ such that Assumption 3 holds true and take $\mathbf{v}^L(0)\in V^L$. Suppose the same assumptions as in Theorem 7 are satisfied for $\mathbf{v}=\mathbf{v}^L(0)$ and assume that the $\ln K$-dynamics behave as in Corollary 8, for every $\mathbf{v}\in\bigcup_{L'\le L}S^{L'}$ and all $w\in V_{\mathrm{mut}}(\mathbf{v})$. Then, for all $T<\infty$, the rescaled process $(N^K_v(t/(K\mu_K^L))/K,\ v\in V,\ t\in[0,T])$ converges in the sense of finite dimensional distributions to a jump process of the form

$$N^L_v(t)=\mathbb{1}_{v\in\mathbf{v}^L(t)}\,\bar n_v(\mathbf{v}^L(t)),\quad v\in V,\ t\in[0,T]. \tag{42}$$

Here $(\mathbf{v}^L(t),t\in[0,T])$ is a Markov jump process on the $L$-scale graph $G^L=(V^L,E^L)$, with transition rates given by (41).
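The limiting object in Theorem 11 is an ordinary continuous-time Markov jump process, so it can be simulated with the standard scheme of exponential holding times followed by a jump chosen with probability proportional to the rates. The sketch below does this for a hypothetical rate table shaped like the 3-scale graph of Example 6; the numerical rates and all names are illustrative assumptions.

```python
import random


def simulate_jump_process(start, rates, T, rng):
    """Simulate a Markov jump process with rates[v][w] from `start` up to time T.

    Returns the list of (jump time, new state) pairs, starting with (0.0, start).
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        out = rates.get(state, {})
        total = sum(out.values())
        if total == 0:
            break  # absorbing state: no outgoing edges on this time scale
        t += rng.expovariate(total)
        if t > T:
            break
        targets = list(out)
        state = rng.choices(targets, weights=[out[w] for w in targets])[0]
        path.append((t, state))
    return path


if __name__ == "__main__":
    # Hypothetical rates R^3(v, w) on a graph with edges {0}->{5} and {5}->{8}.
    rates = {"0": {"5": 1.0}, "5": {"8": 0.5}}
    print(simulate_jump_process("0", rates, T=10.0, rng=random.Random(1)))
```

Given such a path of resident sets, the limit (42) places the equilibrium mass $\bar n_v(\mathbf{v}^L(t))$ on the current residents and zero on all other traits.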

Remark 13

  • (i)

    We would like to point out that Assumption 3 does not exclude the cases where we have cycles in $G^L$, i.e. on the time scale $1/(K\mu_K^L)$. It only prevents the process from running into a cycle of lower time scale. We even allow for self-connecting edges, i.e. edges of the form $(\mathbf{v},\mathbf{v})$.

  • (ii)

    As shown in Champagnat (2006, Prop. 1) it is not possible to get convergence with respect to the Skorohod (J1)-topology since this would imply continuity for the limit of the total mass process, which cannot be true.

Interesting examples

In this chapter, we present and analyse a variety of examples that aim to highlight different aspects of the complicated dynamics covered in our main results. The first two examples are dedicated to single transition steps from one ESC to another, applying the results of Theorem 7 and Corollary 8. The next three examples focus on the metastability graph GESC that is constructed in Corollary 10 and we study two cases that are concerned with Assumption 3. The final two examples are focussed on applications of Theorem 11, studying the limiting Markov jump processes on different time scales as well as the L-scale-graphs GL.

In order to give a manageable and clear description of the dynamically changing fitness landscape, we introduce some new notation that helps to simplify the set up of the examples.

Definition 12

We speak of a regime of equal competition if and only if $c(v,w)\equiv\mathrm{const}>0$, for all $v,w\in V$.

This is by no means a necessary assumption to produce the studied phenomena, however, it allows us to characterise the fitness landscape in a much simpler way. In the case of equal competition, the invasion fitness of a trait w with respect to a single resident trait v is fully characterised by

f(w,v)=r(w)-r(v), 43

where we set r(v):=b(v)-d(v) as the individual fitness of trait v, i.e. its net growth rate in the absence of competitive interactions. As a consequence, traits w with higher individual fitness than the resident v are able to invade the population. Hence, instead of specifying the invasion fitnesses for all possible resident traits, the fitness landscape is fully described by the individual fitnesses r(v).
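The identity (43) is easy to verify numerically. The sketch below assumes the usual form of the invasion fitness against a monomorphic resident, $f(w,v)=b(w)-d(w)-c(w,v)\bar n(v)$ with $\bar n(v)=(b(v)-d(v))/c(v,v)$, and a constant competition kernel; the parameter values and function names are hypothetical.

```python
def monomorphic_equilibrium(b, d, c, v):
    """Equilibrium size (b(v) - d(v)) / c of a single resident trait under equal competition."""
    return (b[v] - d[v]) / c


def invasion_fitness(b, d, c, w, v):
    """f(w, v) = b(w) - d(w) - c * nbar(v), assuming a constant competition kernel c."""
    return b[w] - d[w] - c * monomorphic_equilibrium(b, d, c, v)


if __name__ == "__main__":
    b = {"v": 3.0, "w": 4.0}  # hypothetical birth rates
    d = {"v": 1.0, "w": 1.0}  # hypothetical death rates
    c = 2.0                   # hypothetical constant competition
    r = {x: b[x] - d[x] for x in b}
    print(invasion_fitness(b, d, c, "w", "v"), r["w"] - r["v"])
```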

To specify the fitness relations between different traits - in particular in the case of non-equal competition - we introduce the following notation.

Definition 13

For $v,w\in V$, we write $v\ll w$ if and only if $f(w,v)>0$ and $f(v,w)<0$. Moreover, we write $v_1,\dots,v_k\ll w_1,\dots,w_l$ whenever $v_i\ll w_j$, for all $1\le i\le k$ and $1\le j\le l$.

This reflects the case where the equilibrium of the Lotka–Volterra system involving $v$ and $w$ is the monomorphic equilibrium $\bar n(w)$ of $w$. In other words, $w$ can invade the $v$ population and fixate.

Single transition steps

A first example with multiple mutation paths

Example 1

Let us consider the directed graph $G$ depicted in Fig. 3. Assume equal competition and the individual fitness $r$ plotted in Fig. 3. Moreover, let $\alpha\in(1,2)$.

Fig. 3.

Trait graph G and fitness landscape in terms of individual fitness r of Example 1

In this case, the initial resident trait 0 has stability degree $L(\{0\})=2>\alpha$. This is due to the fact that traits $1a$ and $1b$ are unfit in presence of the resident, while traits $2a$, $2b$ and $2c$ are fit, with connecting paths $\gamma^A=(0,1a,2a)$, $\gamma^B=(0,1b,2b)$ and $\gamma^C=(0,1b,2c)$ of length 2 respectively. Therefore, we have the possible mutant candidates $V_{\mathrm{mut}}(\{0\})=\{2a,2b,2c\}$. An application of Theorem 7 yields that we can observe a new fixating trait at rescaled time $T^K_{\mathrm{fix}}K\mu_K^2$, which is distributed approximately as an exponential random variable with rate $R(\{0\})=R(\{0\},2a)+R(\{0\},2b)+R(\{0\},2c)$. The probability for, say, trait $2b$ to be the trait that fixates in the population and triggers the stopping time is $R(\{0\},2b)/R(\{0\})$.

Asking for the new ESCs, which are reached after fixation, we have to take into account the subsequent evolution on the $\ln K$-time scale. This allows for jumps towards traits of higher fitness, which are in the mutation spreading neighbourhood, i.e. direct neighbours in this case. Therefore, we end up with

$$\mathbf{v}_{\mathrm{ESC}}(\{0\},2a)=\{2a\},\quad\mathbf{v}_{\mathrm{ESC}}(\{0\},2b)=\{4\},\quad\mathbf{v}_{\mathrm{ESC}}(\{0\},2c)=\{4\}. \tag{44}$$

In particular, note that $\{2b\},\{2c\}$ are not ESCs and thus not part of the metastability graph $G_{\mathrm{ESC}}$ as plotted in Fig. 4.

Fig. 4.

Metastability graph $G_{\mathrm{ESC}}$ and 2-scale graph $G^2$ of Example 1

This puts us into the setting where the sum in (27) becomes relevant. In particular, despite the micro-evolutionary branching from $1b$ into $2b$ and $2c$ in the trait graph $G$, there is no such branching on the macro-evolutionary level in $G_{\mathrm{ESC}}$. There, we only observe a transition from $\{0\}$ to $\{4\}$. Note also that the different path lengths of $2b\to 4$ and $2c\to 4$ do not matter for the asymptotics of the time $T_{\mathrm{ESC}}$ until stabilising in the new ESC. This is because this time is dominated by the waiting time $T_{\mathrm{fix}}$ for the first fixation of a fit mutant trait. Since $L(\{0\})=2$, this time is of order $1/(K\mu_K^2)$ and thus absorbs the much faster $\ln K$ evolution.

Note that, since all transitions between ESCs occur on the time scale $1/(K\mu_K^2)$ here, the metastability graph $G_{\mathrm{ESC}}$ agrees with the 2-scale graph $G^2$.

An ESC with coexistence

Since in this paper we discuss the occurrence of metastable behaviour in a rather general setting, we would like to point out that Definition 5 explicitly allows for ESCs $\mathbf{v}$ that consist of several coexisting traits. This clearly enlarges the mutation spreading neighbourhood $V_\alpha(\mathbf{v})$ and changes the set of mutant candidates $V_{\mathrm{mut}}$ in a non-trivial way.

Example 2

Let us consider the directed graph $G$ depicted in Fig. 5. Let $\alpha\in(1,2)$ and consider a fitness landscape that satisfies

$$f(0,3),\ f(3,0)>0, \tag{45}$$
$$f(1,\{0,3\}),\ f(2,\{0,3\})<0, \tag{46}$$
$$f(4,\{0,3\}),\ f(5,\{0,3\})>0, \tag{47}$$
$$0,1,2,3\ll 4,5, \tag{48}$$
$$1,2\ll 0,3, \tag{49}$$
$$f(4,5),\ f(5,4)<0, \tag{50}$$

and allows for no polymorphic coexistence equilibria apart from $\{0,3\}$. Moreover, assume that the unique stable equilibrium of the Lotka–Volterra system involving traits $\{0,3,4\}$ is $\bar n(4)$ and the same is true for 5 replacing 4.

Fig. 5.

Trait graph $G$ and metastability graph $G_{\mathrm{ESC}}$ (which agrees with the 2-scale graph $G^2$) of Example 2

Checking for traits that do not have any fitter direct neighbours, and hence do not allow for transitions on the $\ln K$-time scale, the monomorphic ESCs in this case correspond to $\{0\}$, $\{3\}$, $\{4\}$, and $\{5\}$. Classical results on Lotka–Volterra systems yield that under assumption (45) traits 0 and 3 can coexist, i.e. $\bar n(\{0,3\})\in\mathbb{R}^2_{>0}$. Now the mutation spreading neighbourhood is given by $V_\alpha(\{0,3\})=\{0,1,2,3\}$. Apart from the resident traits themselves, those traits are by assumption unfit with respect to $\{0,3\}$ and thus $\{0,3\}$ allows for an ESC.
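For two traits, the coexistence equilibrium is the solution of a $2\times 2$ linear system, and mutual invasibility as in (45) can be checked directly. The following sketch assumes the competitive Lotka–Volterra form $\dot n_i=n_i\,(r_i-\sum_j c(i,j)n_j)$ with $r_i=b(i)-d(i)$; the parameter values and function names are hypothetical and chosen only so that both invasion fitnesses are positive.

```python
def coexistence_equilibrium(r, c, i, j):
    """Solve r_i = c_ii n_i + c_ij n_j and r_j = c_ji n_i + c_jj n_j by Cramer's rule."""
    det = c[i, i] * c[j, j] - c[i, j] * c[j, i]
    if det == 0:
        raise ValueError("degenerate competition matrix")
    n_i = (r[i] * c[j, j] - c[i, j] * r[j]) / det
    n_j = (c[i, i] * r[j] - c[j, i] * r[i]) / det
    return n_i, n_j


def invasion_fitness(r, c, w, v):
    """f(w, v) = r_w - c_wv * nbar(v) against a monomorphic resident v."""
    return r[w] - c[w, v] * r[v] / c[v, v]


if __name__ == "__main__":
    r = {0: 2.0, 3: 2.0}                                      # hypothetical net growth rates
    c = {(0, 0): 1.0, (3, 3): 1.0, (0, 3): 0.5, (3, 0): 0.5}  # hypothetical competition
    print(invasion_fitness(r, c, 0, 3), invasion_fitness(r, c, 3, 0))
    print(coexistence_equilibrium(r, c, 0, 3))
```

With these numbers both traits can invade each other and the equilibrium has two strictly positive components.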

Looking for the stability degree and possible mutant candidates, the assumptions on the fitness landscape imply that

$$L(\{0,3\})=2\quad\text{and}\quad V_{\mathrm{mut}}(\{0,3\})=\{4,5\}. \tag{51}$$

By Theorem 7, we can observe a fixating mutant population of one of the traits $w\in V_{\mathrm{mut}}(\{0,3\})$ on the time scale $1/(K\mu_K^2)$. The corresponding rates are given by

$$R(\{0,3\},4)=\bar n_0(\{0,3\})\frac{b(0)m(0,1)}{|f(1,\{0,3\})|}\,b(1)m(1,4)\,\frac{f(4,\{0,3\})}{b(4)},\quad\text{for }w=4, \tag{52}$$
$$R(\{0,3\},5)=\bar n_3(\{0,3\})\frac{b(3)m(3,2)}{|f(2,\{0,3\})|}\,b(2)m(2,5)\,\frac{f(5,\{0,3\})}{b(5)},\quad\text{for }w=5. \tag{53}$$

Note that, although there are also paths connecting $3\to 4$ and $0\to 5$, only the paths of shortest length $|\gamma|=2$ do have an impact on the above rates.

To conclude this example, we see that both mutant traits 4 and 5 are fit enough to invade the coexisting resident population. Overall, we obtain the metastability graph $G_{\mathrm{ESC}}$ pictured in Fig. 5, which in this case agrees with the 2-scale graph $G^2$. Note that the traits 0 and 3 appear both as monomorphic ESCs, as well as a polymorphic coexistence ESC.

Successive metastable transitions

Self connection in $G_{\mathrm{ESC}}$

By definition of an ESC, the first fixating mutant has a distance of at least $\lfloor\alpha\rfloor+1$ from the corresponding resident traits. Despite this fact, the $\ln K$-mechanism triggered by such a mutant may lead to a new ESC that is closer to the old one than $\lfloor\alpha\rfloor+1$. It can even be the same and thus give rise to a self-connecting edge in $G_{\mathrm{ESC}}$.

Example 3

Let us consider the directed graph depicted in Fig. 6 and take $\alpha\in(1,2)$. Consider a fitness landscape that satisfies

$$0\ll 2\ll 4\ll 5\ll 2, \tag{54}$$
$$1\ll 2,\quad 3\ll 4, \tag{55}$$
$$f(1,0),\ f(3,2),\ f(3,5)<0 \tag{56}$$

and assume that there are no polymorphic coexistence equilibria.

Fig. 6.

Trait graph $G$ and metastability graph $G_{\mathrm{ESC}}$ of Example 3

After a first jump from $\mathbf{v}^{(0)}=\{0\}$ to $\mathbf{v}^{(1)}=\{2\}$ on the time scale $1/(K\mu_K^2)$, the next fixating mutant is of trait 4 and arises on the same time scale. The chosen fitness landscape ensures that it grows and can invade the population of trait 2 within a $\ln K$-time. Since $\alpha\in(1,2)$, we obtain a non-vanishing population of trait 5 on the same time scale, which can grow as soon as trait 4 is the new resident trait. Due to its positive invasion fitness, 5 invades the trait 4 population. Finally, the same argument applies for an invasion by trait 2, where the process then gets stuck because $\{2\}$ is an ESC of stability degree $L(\{2\})=2>\alpha$.

Overall, we obtain that

$$\mathbf{v}^{(2)}=\mathbf{v}_{\mathrm{ESC}}(\{2\},4)=\{2\}. \tag{57}$$

In view of Definition 9, this gives rise to the self-connecting edge $(\{2\},\{2\})$ in $G_{\mathrm{ESC}}$, which is illustrated in Fig. 6.

On Assumption 3

Since the assumption that prevents the process from getting stuck on a slower time scale is somewhat involved, we give two examples. First, we illustrate in Example 4 that Assumption 3 may hold true even if there is a cycle in the metastability graph GESC. Second, we slightly modify the trait graph G and the fitness landscape to get Example 5, where Assumption 3 is not satisfied, and explain why this leads to difficulties.

Example 4

Let us consider the directed graph depicted in Fig. 7. Let $\alpha\in(0,1)$ and consider a fitness landscape that satisfies

$$0\ll 2\ll 3\ll 4\ll 6, \tag{58}$$
$$1\ll 2,\quad 5\ll 6, \tag{59}$$
$$3\ll 7\ll 2, \tag{60}$$
$$f(1,0),\ f(5,4)<0 \tag{61}$$

and assume that there are no polymorphic coexistence equilibria.

Fig. 7.

Trait graph $G$, metastability graph $G_{\mathrm{ESC}}$ and $L$-scale graphs $G^1$ and $G^2$ of Example 4

Let us first remark that, because of $\alpha\in(0,1)$, we are in the regime of the trait substitution sequence (cf. Champagnat 2006). This means that we can neglect the $\ln K$-algorithm. In particular, if $v\ll w$, for some $w\in V_{\mathrm{mut}}(\{v\})$, then $\mathbf{v}_{\mathrm{ESC}}(\{v\},w)=\{w\}$.

With this knowledge, let us construct the jump chain step by step. The first two jumps are determined easily, noting that

$$\mathbf{v}^{(0)}=\{0\},\quad L(\{0\})=2,\quad V_{\mathrm{mut}}(\{0\})=\{2\}, \tag{62}$$
$$\mathbf{v}^{(1)}=\{2\},\quad L(\{2\})=1,\quad V_{\mathrm{mut}}(\{2\})=\{3\}, \tag{63}$$
$$\mathbf{v}^{(2)}=\{3\},\quad L(\{3\})=1,\quad V_{\mathrm{mut}}(\{3\})=\{4,7\}. \tag{64}$$

For the third jump, there are two possible triggering mutants. If trait 7 fixates first, the process jumps to the ESC $\mathbf{v}^{(3)}=\{7\}$ and then returns to $\mathbf{v}^{(4)}=\{2\}$, all on the time scale $1/(K\mu_K)$. If instead trait 4 fixates earlier, the jump chain continues to $\mathbf{v}^{(3)}=\{4\}$ within a time of order $1/(K\mu_K)$ and then to $\mathbf{v}^{(4)}=\{6\}$ on the time scale $1/(K\mu_K^2)$, since $f(5,4)<0$.

Noting that $V_{\mathrm{mut}}(\{1\})=\{2\}$ and $V_{\mathrm{mut}}(\{5\})=\{6\}$, we obtain the metastability graph drawn in Fig. 7.

To check whether Assumption 3 is satisfied, we decompose the set of ESCs $V_{\mathrm{ESC}}$ according to the stability degree,

$$S^1=\{\{1\},\{2\},\{3\},\{5\},\{7\}\},\quad S^2=\{\{0\},\{4\}\},\quad S^\infty=\{\{6\}\}. \tag{65}$$

For all $\mathbf{v}\in S^1$, one directly sees that an ESC of the same or a higher stability is reached after one jump with probability one. Thus the assumption is true for $L=1$ and we can construct the graph $G^1$ as drawn.

In the case of $L=2$, for $\mathbf{v}^{(0)}=\{4\}$, we obtain that with probability one the process jumps to $\mathbf{v}^{(1)}=\{6\}$, which is of higher stability. Finally, we have to check the most involved case of $\mathbf{v}^{(0)}=\{0\}$. From the metastability graph we identify $\mathbf{v}=\{4\}$ as the only reachable ESC of degree $L\ge 2$. Due to the branching at $\{3\}$, we have to ensure that the process does not get stuck in the cycle $(\{2\},\{3\},\{7\},\{2\})$ for infinitely many steps. We can see that

$$P\left(\forall\,k\in\mathbb{N}_{>0}:\mathbf{v}^{(k)}\ne\{4\}\,\Big|\,\mathbf{v}^{(0)}=\{0\}\right)=0 \tag{66}$$

since the number of cycles that the process runs through before exiting towards $\{4\}$ has a geometric law with success probability $p(\{3\},\{4\})>0$. Therefore, Assumption 3 also holds true for $L=2$. This yields the $L$-scale graph $G^2$, depicted in Fig. 7.
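The geometric argument can be spelled out in one line. Writing $q:=p(\{3\},\{4\})>0$ (a shorthand introduced only for this sketch) and $N$ for the number of completed cycles through $\{7\}$ and $\{2\}$ before the chain exits from $\{3\}$ to $\{4\}$, and using that the jump chain chooses independently at each visit to $\{3\}$, one gets:

```latex
\begin{align*}
P(N = n) &= (1-q)^{n}\, q, \qquad n \in \mathbb{N}_{0}, \\
P(N \ge n) &= (1-q)^{n} \xrightarrow[n\to\infty]{} 0, \\
\mathbb{E}[N] &= \frac{1-q}{q} < \infty .
\end{align*}
```

In particular, the event of never leaving the cycle is contained in $\{N\ge n\}$ for every $n$ and hence has probability zero.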

Let us now modify the example by inserting an additional trait 8, that can be viewed as an intermediate unfit mutation between 3 and 4. Moreover, for the sake of clarity, we cut off the traits 5 and 6.

Example 5

Let us consider the directed graph depicted in Fig. 8 and let $\alpha\in(0,1)$. Consider a fitness landscape that satisfies

$$0\ll 2\ll 3\ll 4, \tag{67}$$
$$3\ll 7\ll 2, \tag{68}$$
$$1\ll 2,\quad 8\ll 4, \tag{69}$$
$$f(1,0),\ f(8,3)<0 \tag{70}$$

and assume that there are no polymorphic coexistence equilibria.

Fig. 8.

Trait graph $G$, metastability graph $G_{\mathrm{ESC}}$ and $L$-scale graphs $G^1$ and $G^2$ of Example 5

Since we only changed the trait graph $G$ slightly, the metastability graph $G_{\mathrm{ESC}}$ also stays almost the same. Apart from the omitted traits 5 and 6, the main difference is that the valley from the ESC $\{3\}$ to the fit mutant 4 is now of width 2. Therefore, trait 4 is no longer one of the nearest fit traits to trait 3 and the set of possible mutants gets reduced to $V_{\mathrm{mut}}(\{3\})=\{7\}$. In particular, there is no longer an edge $(\{3\},\{4\})$ in the metastability graph.

To check whether Assumption 3 is satisfied, we again separate the stability classes

$$S^1=\{\{1\},\{2\},\{3\},\{7\},\{8\}\},\quad S^2=\{\{0\}\},\quad S^\infty=\{\{4\}\}. \tag{71}$$

For $L=1$, it is again easy to see from $G_{\mathrm{ESC}}$ that the assumption holds true. To check this for $L=2$, we have to consider how the process can get from the initial ESC $\{0\}$ to some ESC of at least the same stability degree. This is not possible since the only candidate would be $\{4\}$, which is not reachable since the metastability graph is disconnected. As a conclusion, Assumption 3 is not satisfied for $L=2$ and thus we can neither construct the $L$-scale graph $G^2$ nor apply Theorem 11.

Remark 14

Although the population process gets stuck in a cycle between the ESCs $\{2\},\{3\},\{7\}$ for infinite time, we expect that it might escape through the fitness valley $3\to 8\to 4$ eventually, when looking at the time scale $1/(K\mu_K^2)$. This is due to the fact that, from the microscopic point of view, it is possible to observe mutants of trait 4 in the phases where 3 is the resident trait. Indeed, those mutants appear with a much smaller rate than those of trait 7, but since these phases occur infinitely often, it should only be a question of acceleration to escape from this cycle (cf. Remark 12).

Collapse on higher time scales

In the two final examples, we demonstrate how paths in the metastability graph that pass through ESCs of different stability degree collapse to a single edge in the L-scale graph when focussing on a particular time scale. To this end we start with an example of a simple linear trait graph with multiple successive fitness valleys of different length. The second example allows for a branching in the metastability graph, which again vanishes in the L-scale graph.

Example 6

Let us consider the directed graph $G$ depicted in Fig. 9. Assume equal competition and the individual fitness $r$ plotted in Fig. 9. Moreover, let $\alpha\in(1,2)$.

Fig. 9. Trait graph $G$ and fitness landscape in terms of individual fitness $r$ of Example 6

Due to the linear and directed structure of the trait graph, we can extract the fitness valleys and thus the stability degrees directly from the plotted individual fitness $r$. The jump chain $(v^{(k)})_{k\ge0}$ is the deterministic sequence

$$v^{(0)}=\{0\},\quad v^{(1)}=\{3\},\quad v^{(2)}=\{5\},\quad v^{(3)}=\{8\}.\tag{72}$$

This is reflected in the metastability graph drawn in Fig. 10. Note that {6} is also an ESC of stability degree 2, but it cannot be reached starting from {0}.

Fig. 10. Metastability graph $G_{\mathrm{ESC}}$ and L-scale graphs $G_2$ and $G_3$ of Example 6

Let us now have a look at the L-scale graphs, i.e. at how the limiting jump process evolves when fixing a particular time scale. To this end we focus on the sets of ESCs of equal stability degree, namely

$$S_2=\{\{3\},\{6\}\},\quad S_3=\{\{0\},\{5\}\},\quad S_\infty=\{\{8\}\}.\tag{73}$$

Following our construction in (38), the L-scale graph $G_2$ consists of the vertices $V_2=\{\{0\},\{3\},\{5\},\{6\},\{8\}\}$. Since all but {3} and {6} are of stability degree higher than $L=2$, the only edges are $E_2=\{(\{3\},\{5\}),(\{6\},\{8\})\}$.

The construction of the edges of $G_3$ is far more interesting. In particular, starting in the initial ESC $v^{(0)}=\{0\}$, we cannot simply take the edge ({0},{3}) from the metastability graph since $L(\{3\})<3$ and thus {3} is not stable enough. Instead, we have to consider the whole path $\Gamma=(\{0\},\{3\},\{5\})$ until an ESC of sufficiently high stability is reached. This is because the second jump of $\Gamma$ happens much faster (more precisely on the time scale $1/(K\mu_K^2)$) and hence becomes absorbed in the slower first jump when rescaling the process with $1/(K\mu_K^3)$. This gives us one edge in $E_3$. The second one is given by the jump ({5},{8}). Since $L(\{8\})=\infty$, no further evolution is possible here.

Overall, these considerations lead to the pictures of G2 and G3 in Fig. 10.
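The collapse of paths described for Example 6 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `l_scale_edges` and the encoding of each ESC by its single resident trait are assumptions made here, and the stability degrees are the ones read off from the text above.

```python
import math

# Metastability graph of Example 6; each ESC {v} is encoded by its trait v.
G_ESC = {0: [3], 3: [5], 5: [8], 6: [8], 8: []}
# Stability degrees L({v}) as stated in the text; math.inf marks the final ESC {8}.
STABILITY = {0: 3, 3: 2, 5: 3, 6: 2, 8: math.inf}


def l_scale_edges(graph, stability, level):
    """Collapse paths through less stable ESCs into single edges (hypothetical helper)."""
    edges = set()
    for start, degree in stability.items():
        if degree != level:
            continue  # only ESCs of stability degree exactly `level` get outgoing edges
        stack = list(graph[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if stability[node] >= level:
                edges.add((start, node))  # stable enough: the collapsed path ends here
            else:
                stack.extend(graph[node])  # too unstable: this jump is absorbed
    return edges
```

Calling `l_scale_edges(G_ESC, STABILITY, 2)` and `l_scale_edges(G_ESC, STABILITY, 3)` reproduces the edge sets of $G_2$ and $G_3$ discussed above; transition rates are deliberately left out of this sketch.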

Example 7

Let us consider the directed graph $G$ depicted in Fig. 11. Assume equal competition and the individual fitness $r$ plotted in Fig. 11. Moreover, let $\alpha\in(1,2)$.

Fig. 11. Trait graph $G$ and fitness landscape in terms of individual fitness $r$ of Example 7

Starting with the resident population in $v^{(0)}=\{0\}$, we can directly extract from the plotted individual fitness $r$ that only the traits 3 and 5 have positive invasion fitness. Moreover, both can be reached via a path of length $|\gamma|=3$, namely

$$\gamma_A=(0,1,2,3),\quad \gamma_B=(0,1,6,5).\tag{74}$$

Hence, we associate to this ESC the stability degree $L(\{0\})=3$ and the set of mutant candidates $V_{\mathrm{mut}}(\{0\})=\{3,5\}$.

If trait 5 fixates first, there is no further evolution and we end with $v_{\mathrm{ESC}}(\{0\},5)=\{5\}$. In the case where trait 3 fixates, it can grow and becomes macroscopic. Moreover, since $\alpha\in(1,2)$, the population of trait 4 grows by frequent incoming mutants. However, due to its negative invasion fitness with respect to the resident {0} and later against the macroscopic population {3}, it cannot invade. Hence $v_{\mathrm{ESC}}(\{0\},3)=\{3\}$ is the corresponding ESC and is of stability degree $L(\{3\})=2$. From there on, only trait 5 is a fit reachable mutant, which arises after a waiting time of order $O(1/(K\mu_K^2))$ and replaces 3 as an ESC. Those three jumps form the edges of the metastability graph $G_{\mathrm{ESC}}$ drawn in Fig. 12.

Fig. 12. Metastability graph $G_{\mathrm{ESC}}$ and L-scale graphs $G_2$ and $G_3$ of Example 7

The L-scale graph $G_2$ is constructed easily, whereas the really interesting behaviour occurs when asking for $G_3$. Since $L(\{0\})=3$, the jumps ({0},{3}) and ({0},{5}) happen on the visible time scale. The latter one is clearly also an edge in $G_3$, due to the high stability of the final ESC, $L(\{5\})=\infty$. However, in the case of the former, the ESC that the process jumps to is of smaller stability, i.e. $L(\{3\})=2$. Therefore, the next jump ({3},{5}) occurs within a time that vanishes under rescaling. The path $\Gamma=(\{0\},\{3\},\{5\})$ in $G_{\mathrm{ESC}}$ thus yields an edge ({0},{5}) for $G_3$. This edge already exists and we do not allow for double edges in $G_L$. However, the two edges are merged in the sense of adding up the transition rates and probabilities as in (40).

Overall, we see that even a branching in the metastability graph can disappear when multiple paths collapse to the same edge on a particular time scale.

Proofs

In this chapter, we discuss the proofs of the results on metastable transitions and limiting jump processes that are presented in Sects. 2.3 and 2.4. These build on the results in Bovier et al. (2019) on the crossing of a fitness valley on a linear trait space and in Coquille et al. (2021) on the faster lnK-dynamics on general finite graphs. The main idea is to extend the techniques from Bovier et al. (2019) to more complex trait spaces by considering sequential mutations along certain paths. Since mutations are very rare outside of the mutation spreading neighbourhood of the resident traits and unfit traits quickly go extinct, mutations along different paths can essentially be regarded as independent. Consequently, the overall rate of transitioning out of an ESC is obtained by summing over the rates of taking specific paths through the surrounding fitness valley.

The remainder of this chapter is structured as follows: In Sect. 4.1, we determine the precise equilibrium size of the subpopulations with traits inside the mutation spreading neighbourhood. In Sect. 4.2, we consider the rates at which mutants of any fitness arise along specific paths and combine these into the overall rate at which single mutants are born. Finally, in Sect. 4.3, we combine these rates of producing mutants beyond the fitness valley with the probability of fixation and the faster $\ln K$-dynamics of reaching a new ESC to conclude Theorem 7 and Corollary 8. Section 4.4 is dedicated to the proof of Corollary 10 and Theorem 11, where we concatenate several jumps across fitness valleys to obtain the multi-scale jump chain and carefully study which transitions are visible on the respective time scales to obtain the dynamics of the limiting Markov jump process.

Estimation of the equilibrium size

In this section we discuss the equilibrium population sizes of the living traits once an ESC is obtained. The results from Coquille et al. (2021) only characterize the orders of population sizes $\beta_w$ and the actual size $\bar n(\mathbf v)$ of the resident traits associated to an ESC. In order to calculate the precise transition rates from one ESC to another, we do, however, need a better estimate for the population sizes of the non-resident traits in $V_\alpha(\mathbf v)$.

We prove that, if the initial conditions of our process satisfy the assumptions of an asymptotic ESC, all living traits in $V_\alpha(\mathbf v)$ get arbitrarily close to their equilibrium size within a finite time. This equilibrium size preserves the orders of population sizes and is of the form

$$N_v^K(t)=a_vK\mu_K^{d(\mathbf v,v)}+o\big(K\mu_K^{d(\mathbf v,v)}\big)\quad\forall\,v\in V_\alpha(\mathbf v),\tag{75}$$

for some $a_v\in\mathbb R_+$, which can be calculated precisely. The populations of living traits stay close to these equilibrium sizes as long as no new trait arises and reaches a size at which it can influence the population sizes of other traits, i.e. a size of order $K^{1/\alpha}$. To this end, we recall the definition of the stopping time

$$T^K_{\mathrm{fix}}:=\inf\big\{t\ge0:\ \exists\,w\in V\setminus V_\alpha(\mathbf v):\ \beta_w^K(t)\ge 1/\alpha\big\}.\tag{76}$$

Lemma 14

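(Equilibrium size inside the $\alpha$-radius) Let $\mathbf v\subset V$ and $(\beta^K(0))_{K\ge0}$ be an asymptotic ESC. Then, for all $\varepsilon>0$, there exist constants $\tau_\varepsilon<\infty$, $U_\varepsilon>0$ and Markov processes $\big(N_v^{(K,\pm)}(t),t\ge0\big)_{K\ge0}$ such that

$$\lim_{K\to\infty}\mathbb P\Big(N_v^{(K,-)}(t)\le N_v^K(t)\le N_v^{(K,+)}(t)\ \ \forall\,t\in\big(\tau_\varepsilon,T^K_{\mathrm{fix}}\wedge e^{U_\varepsilon K}\big),\ \forall\,v\in V_\alpha(\mathbf v)\Big)=1\tag{77}$$

and

$$\left|\frac{\mathbb E\big[N_v^{(K,\pm)}(t)\big]}{K\mu_K^{d(\mathbf v,v)}}-a_v\right|<\varepsilon\quad\forall\,t\ge\tau_\varepsilon,\tag{78}$$

where

$$a_v:=\sum_{\substack{\gamma:\mathbf v\to v\\ |\gamma|=d(\mathbf v,v)}}\bar n_{\gamma_0}(\mathbf v)\prod_{i=1}^{|\gamma|}\frac{b(\gamma_{i-1})\,m(\gamma_{i-1},\gamma_i)}{|f(\gamma_i,\mathbf v)|}.\tag{79}$$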

Proof

We will prove the claim by induction w.r.t. the distance from the resident traits. For the initialisation let us start with $v\in\mathbf v$. That is, we also count a single vertex as a path of length zero, together with the convention that an empty product has the value one. In this case $(N_v^K,v\in\mathbf v)$ can be coupled with logistic birth-death processes with immigration, by estimating the incoming and outgoing mutants, which are of order $O(K\mu_K)$ or smaller. Hence we already know from Coquille et al. (2021, Lemma A.6(ii)) that the residents stabilize near their Lotka–Volterra equilibrium within a time of order $O(1)$. To make this more precise, define, for all $\varepsilon>0$, the stopping time when the resident populations enter an $\varepsilon$-neighbourhood of their equilibrium size

$$\tau_\varepsilon^K:=\inf\big\{t\ge0:\ \forall\,v\in\mathbf v:\ \big|K^{-1}N_v^K(t)-\bar n_v(\mathbf v)\big|<\varepsilon C\big\}.\tag{80}$$

Here $C$ is a constant, depending only on the competition rates $c(v,w)$, which compensates the slight shift of the equilibrium due to small fluctuations of non-resident traits. Then there exists a constant time $\tilde\tau_\varepsilon<\infty$ such that $\lim_{K\to\infty}\mathbb P\big(\tau_\varepsilon^K<\tilde\tau_\varepsilon\big)=1$. After this time $\tilde\tau_\varepsilon$, the environment of competitive pressure stays almost constant, unless the fluctuations of the resident populations become too big or the non-residents reach a macroscopic level. These two events are described by the stopping times

$$S_\varepsilon^K:=\inf\big\{t\ge\tau_\varepsilon^K:\ \exists\,v\in\mathbf v:\ \big|K^{-1}N_v^K(t)-\bar n_v(\mathbf v)\big|>2\varepsilon C\big\}\tag{81}$$

and

$$\sigma_\varepsilon^K:=\inf\Big\{t\ge0:\ \sum_{w\in V\setminus\mathbf v}N_w^K(t)\ge\varepsilon K\Big\}.\tag{82}$$

We know from Champagnat and Méléard (2011, Proposition A.2) that, for some constant $U_\varepsilon>0$,

$$\lim_{K\to\infty}\mathbb P\big(S_\varepsilon^K>e^{U_\varepsilon K}\wedge\sigma_\varepsilon^K\big)=1.\tag{83}$$

For the other traits in the $\alpha$-radius, $v\in V_\alpha\setminus\mathbf v$, we prove as the induction step that (77) is satisfied with

$$a_v=\sum_{\substack{(w,v)\in E\\ d(\mathbf v,w)=d(\mathbf v,v)-1}}\frac{a_w\,b(w)\,m(w,v)}{|f(v,\mathbf v)|}\tag{84}$$

by deriving an upper and a lower bound on the population size through couplings. These bounds then immediately imply the claim.
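The recursive structure of the coefficients $a_v$ (each trait collects $a_w\,b(w)\,m(w,v)$ from its predecessors one step closer to the residents and divides by $|f(v,\mathbf v)|$) can be illustrated numerically. The following Python sketch is not part of the proof; the function name, the dictionary-based data layout and all parameter values in the usage note are hypothetical.

```python
from collections import deque


def equilibrium_coefficients(edges, residents, n_bar, b, m, fitness, radius):
    """Return {trait: a_v} for all traits within `radius` mutation steps of the residents."""
    successors = {}
    for (w, v) in edges:
        successors.setdefault(w, []).append(v)
    # Breadth-first search yields the graph distance d(residents, v).
    dist = {v: 0 for v in residents}
    queue = deque(residents)
    while queue:
        w = queue.popleft()
        for v in successors.get(w, []):
            if v not in dist:
                dist[v] = dist[w] + 1
                queue.append(v)
    a = {v: n_bar[v] for v in residents}  # base case: equilibrium size of the residents
    for v in sorted(dist, key=dist.get):  # process traits by increasing distance
        if dist[v] == 0 or dist[v] > radius:
            continue
        inflow = sum(a[w] * b[w] * m[(w, v)]
                     for (w, u) in edges
                     if u == v and dist.get(w) == dist[v] - 1)
        a[v] = inflow / abs(fitness[v])  # divide by |f(v, residents)|
    return a
```

For a hypothetical diamond graph with edges `[(0, 1), (0, 2), (1, 3), (2, 3)]`, resident `0` with `n_bar = {0: 2.0}`, birth rates all equal to one, `m = {(0, 1): 0.5, (0, 2): 0.5, (1, 3): 1.0, (2, 3): 1.0}` and fitness values `{1: -1.0, 2: -2.0, 3: -0.5}`, the recursion gives the same value for trait 3 as summing the path products over both shortest paths.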

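Following the notation of Fournier and Méléard (2004), we represent the population processes in terms of Poisson random measures. For this purpose let $(Q_v^{(b)},Q_v^{(d)},Q_{w,v}^{(m)};\,v,w\in V)$ be independent homogeneous Poisson random measures on $\mathbb R_+^2$ with intensity $\mathrm ds\,\mathrm d\theta$. Then we can write

$$\begin{aligned}N_v^K(t)=N_v^K(0)&+\int_0^t\int_{\mathbb R_+}\mathbf 1_{\theta\le b(v)(1-\mu_K)N_v^K(s^-)}\,Q_v^{(b)}(\mathrm ds,\mathrm d\theta)\\&-\int_0^t\int_{\mathbb R_+}\mathbf 1_{\theta\le[d(v)+\sum_{w\in V}c^K(v,w)N_w^K(s^-)]N_v^K(s^-)}\,Q_v^{(d)}(\mathrm ds,\mathrm d\theta)\\&+\sum_{(w,v)\in E}\int_0^t\int_{\mathbb R_+}\mathbf 1_{\theta\le\mu_Kb(w)m(w,v)N_w^K(s^-)}\,Q_{w,v}^{(m)}(\mathrm ds,\mathrm d\theta).\end{aligned}\tag{85}$$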

Note that we use the same Poisson measures to construct the processes for each K here. However, as already pointed out in Sect. 2.1, this is not necessary and we do not use any particular correlation between the processes for different K. We can use a specific joint construction here since we are only considering the convergence of probabilities of certain events, rather than of the processes themselves.
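For readers who prefer an algorithmic view, the three types of events in the Poisson representation (clonal births, deaths with competition, and mutations along edges) correspond to the event rates of a standard Gillespie-type simulation. The sketch below is an illustration only: it assumes that the rescaled competition kernel is $c^K(v,w)=c(v,w)/K$, which is not stated in this section, and the function names and data layout are hypothetical.

```python
import random


def event_rates(N, b, d, c, m, mu, K):
    """Event rates in state N (dict trait -> count); assumes c^K(v, w) = c(v, w) / K."""
    rates = {}
    for v, n_v in N.items():
        competition = sum(c[(v, w)] / K * n_w for w, n_w in N.items())
        rates[("birth", v)] = b[v] * (1.0 - mu) * n_v    # clonal birth
        rates[("death", v)] = (d[v] + competition) * n_v  # natural death plus competition
    for (w, v), m_wv in m.items():
        rates[("mutation", w, v)] = mu * b[w] * m_wv * N[w]  # mutant of trait v born from w
    return rates


def gillespie_step(N, rates, rng):
    """Draw the waiting time and the next state; returns None if no event can occur."""
    total = sum(rates.values())
    if total == 0:
        return None
    waiting_time = rng.expovariate(total)
    events = list(rates)
    event = rng.choices(events, weights=[rates[e] for e in events])[0]
    new_N = dict(N)
    if event[0] == "birth":
        new_N[event[1]] += 1
    elif event[0] == "death":
        new_N[event[1]] -= 1
    else:
        new_N[event[2]] += 1  # the mutant individual is added to the target trait
    return waiting_time, new_N
```

Repeatedly calling `gillespie_step` with a seeded `random.Random` instance produces one trajectory of the population process; this direct simulation is only practical for moderate $K$ and is not the coupling construction used in the proof.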

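Since we already know from Coquille et al. (2021, Theorem 2.2) that in the equilibrium state the non-resident populations $w\in V_\alpha(\mathbf v)$ stay of size $O(K\mu_K^{d(\mathbf v,w)})$, the main part of the mutations in the last line comes only from traits lying closer to the resident traits. Thus we can adopt the inductive structure of Bovier et al. (2019, Lemma 7.1) and approximate the population size of $v$ analogously by coupling it, for $K$ large enough, with two processes

$$N_v^{(K,-)}(t)\le N_v^K(t)\le N_v^{(K,+)}(t),\quad\forall\,\tilde\tau_\varepsilon\le t\le\sigma_\varepsilon^K\wedge T^K_{\mathrm{fix}}\wedge S_\varepsilon^K.\tag{86}$$

To be precise, we take care of the admissible fluctuations of the residents by defining

$$\bar n_v^{(\pm)}(\mathbf v):=\bar n_v(\mathbf v)\pm2\varepsilon C.\tag{87}$$

Then, for $v\in V\setminus\mathbf v$ and $\mu_K<\varepsilon$, we set

$$\begin{aligned}N_v^{(K,-)}(t)=N_v^K(\tilde\tau_\varepsilon)&+\int_{\tilde\tau_\varepsilon}^t\int_{\mathbb R_+}\mathbf 1_{\theta\le b(v)(1-\varepsilon)N_v^{(K,-)}(s^-)}\,Q_v^{(b)}(\mathrm ds,\mathrm d\theta)\\&-\int_{\tilde\tau_\varepsilon}^t\int_{\mathbb R_+}\mathbf 1_{\theta\le[d(v)+\sum_{w\in\mathbf v}c(v,w)\bar n_w^{(+)}(\mathbf v)+\varepsilon\max_{\tilde w\in V\setminus\mathbf v}c(v,\tilde w)]N_v^{(K,-)}(s^-)}\,Q_v^{(d)}(\mathrm ds,\mathrm d\theta)\\&+\sum_{(w,v)\in E}\int_{\tilde\tau_\varepsilon}^t\int_{\mathbb R_+}\mathbf 1_{\theta\le\mu_Kb(w)m(w,v)N_w^K(s^-)}\,Q_{w,v}^{(m)}(\mathrm ds,\mathrm d\theta),\end{aligned}\tag{88}$$

$$\begin{aligned}N_v^{(K,+)}(t)=N_v^K(\tilde\tau_\varepsilon)&+\int_{\tilde\tau_\varepsilon}^t\int_{\mathbb R_+}\mathbf 1_{\theta\le b(v)N_v^{(K,+)}(s^-)}\,Q_v^{(b)}(\mathrm ds,\mathrm d\theta)\\&-\int_{\tilde\tau_\varepsilon}^t\int_{\mathbb R_+}\mathbf 1_{\theta\le[d(v)+\sum_{w\in\mathbf v}c(v,w)\bar n_w^{(-)}(\mathbf v)]N_v^{(K,+)}(s^-)}\,Q_v^{(d)}(\mathrm ds,\mathrm d\theta)\\&+\sum_{(w,v)\in E}\int_{\tilde\tau_\varepsilon}^t\int_{\mathbb R_+}\mathbf 1_{\theta\le\mu_Kb(w)m(w,v)N_w^K(s^-)}\,Q_{w,v}^{(m)}(\mathrm ds,\mathrm d\theta),\end{aligned}\tag{89}$$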

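where we use the same Poisson measures as in (85). Note that this coupling satisfies (86) only on the event $\{\tau_\varepsilon^K<\tilde\tau_\varepsilon\}$. However, as mentioned above, the probability of this event converges to 1 and we can hence restrict our considerations to this case to obtain the desired convergence.

On closer inspection, the approximating processes $N_v^{(K,-)},N_v^{(K,+)}$ are nothing but subcritical branching processes with immigration stemming from incoming mutations.

Similar to the proof of Bovier et al. (2019, Equation (7.8) et sqq.) we can use the martingale decomposition of $N_v^{(K,+)}$ and $N_v^{(K,-)}$ to derive, for $t>\tilde\tau_\varepsilon$ and $\ast\in\{+,-\}$, the differential equation

$$\begin{aligned}\frac{\mathrm d}{\mathrm dt}\mathbb E\big[N_v^{(K,\ast)}(t)\big]=\Big(&b(v)(1-\mathbf 1_{\{\ast=-\}}\varepsilon)-d(v)-\sum_{w\in\mathbf v}c(v,w)\bar n_w^{(\bar\ast)}(\mathbf v)-\mathbf 1_{\{\ast=-\}}\varepsilon\sup_{\tilde w\in V\setminus\mathbf v}c(v,\tilde w)\Big)\\&\times\mathbb E\big[N_v^{(K,\ast)}(t)\big]+\sum_{(w,v)\in E}\mu_Kb(w)m(w,v)\mathbb E\big[N_w^K(t)\big]\end{aligned}\tag{90}$$

$$=f^{(\ast)}(v,\mathbf v)\,\mathbb E\big[N_v^{(K,\ast)}(t)\big]+\sum_{(w,v)\in E}\mu_Kb(w)m(w,v)\mathbb E\big[N_w^K(t)\big],\tag{91}$$

where $\bar\ast=\{+,-\}\setminus\{\ast\}$ denotes the inverse sign.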

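Here, we introduce $f^{(\ast)}(v,\mathbf v)$ as a short notation to point out that this is nothing but a perturbation of the invasion fitness. Then we can apply our a priori knowledge on the size of the sub-populations, i.e.

$$\mathbb E\big[N_w^K(t)\big]=O\big(K\mu_K^{d(\mathbf v,w)}\big)\quad\forall\,w\in V_\alpha(\mathbf v),\tag{92}$$

to rewrite the ODE system

$$\frac{\mathrm d}{\mathrm dt}\mathbb E\big[N_v^{(K,\ast)}(t)\big]=f^{(\ast)}(v,\mathbf v)\,\mathbb E\big[N_v^{(K,\ast)}(t)\big]+\sum_{\substack{(w,v)\in E\\ d(\mathbf v,w)=d(\mathbf v,v)-1}}\mu_Kb(w)m(w,v)\mathbb E\big[N_w^K(t)\big]+O\big(K\mu_K^{d(\mathbf v,v)+1}\big)\tag{93}$$

$$=f^{(\ast)}(v,\mathbf v)\,\mathbb E\big[N_v^{(K,\ast)}(t)\big]+\sum_{\substack{(w,v)\in E\\ d(\mathbf v,w)=d(\mathbf v,v)-1}}b(w)m(w,v)\,a_wK\mu_K^{d(\mathbf v,v)}+o\big(K\mu_K^{d(\mathbf v,v)}\big).\tag{94}$$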

Here we use the induction hypothesis to estimate the populations with traits lying closer to the residents in the latter equality.

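Rescaling with $K\mu_K^{d(\mathbf v,v)}$ and using (84), the equation becomes

$$\frac{\mathrm d}{\mathrm dt}\frac{\mathbb E\big[N_v^{(K,\ast)}(t)\big]}{K\mu_K^{d(\mathbf v,v)}}=f^{(\ast)}(v,\mathbf v)\frac{\mathbb E\big[N_v^{(K,\ast)}(t)\big]}{K\mu_K^{d(\mathbf v,v)}}+a_v|f(v,\mathbf v)|+o(1).\tag{95}$$

By variation of constants the solution is given by

$$\frac{\mathbb E\big[N_v^{(K,\ast)}(t)\big]}{K\mu_K^{d(\mathbf v,v)}}=e^{f^{(\ast)}(v,\mathbf v)(t-\tilde\tau_\varepsilon)}\left(\frac{\mathbb E\big[N_v^K(\tilde\tau_\varepsilon)\big]}{K\mu_K^{d(\mathbf v,v)}}-\frac{f(v,\mathbf v)}{f^{(\ast)}(v,\mathbf v)}a_v+o(1)\right)+\frac{f(v,\mathbf v)}{f^{(\ast)}(v,\mathbf v)}a_v+o(1).\tag{96}$$

Note that the term in brackets can be bounded uniformly in $K$ and $\varepsilon$, for $\varepsilon$ small enough. Moreover, the ratio of (perturbed) fitnesses can be expressed as $(1\pm\varepsilon\tilde c_\varepsilon)$. So (96) becomes

$$\frac{\mathbb E\big[N_v^{(K,\ast)}(t)\big]}{K\mu_K^{d(\mathbf v,v)}}=e^{f^{(\ast)}(v,\mathbf v)(t-\tilde\tau_\varepsilon)}O(1)+(1\pm\varepsilon\tilde c_\varepsilon)a_v+o(1).\tag{97}$$

Finally, since the perturbed fitness satisfies $f^{(\ast)}(v,\mathbf v)<0$ for $v\in V_\alpha(\mathbf v)$, the first term vanishes as time increases. Hence we see that for all $\tilde\varepsilon>0$ there are $\varepsilon>0$, $\tau_{\tilde\varepsilon}\in(\tilde\tau_\varepsilon,\infty)$ and $K_0\in\mathbb N$ such that, for all $t>\tau_{\tilde\varepsilon}$ and $K>K_0$,

$$\left|\frac{\mathbb E\big[N_v^{(K,\ast)}(t)\big]}{K\mu_K^{d(\mathbf v,v)}}-a_v\right|<\tilde\varepsilon.\tag{98}$$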
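The relaxation of the rescaled mean towards a small perturbation of $a_v$ can be checked numerically. The sketch below drops the $o(1)$ error terms, treats the perturbed fitness as a fixed negative number, and compares the closed-form solution of the resulting linear ODE with an explicit Euler integration; all names and the parameter values used to try it out are hypothetical.

```python
import math


def rescaled_mean(t, y0, f_star, a_v, f):
    """Closed-form solution of y' = f_star * y + a_v * |f| with y(0) = y0, f_star < 0."""
    limit = a_v * abs(f) / abs(f_star)  # stationary value, a small perturbation of a_v
    return math.exp(f_star * t) * (y0 - limit) + limit


def euler_mean(t, y0, f_star, a_v, f, steps=100_000):
    """Explicit Euler integration of the same ODE, used as an independent check."""
    h = t / steps
    y = y0
    for _ in range(steps):
        y += h * (f_star * y + a_v * abs(f))
    return y
```

With, say, `f = -1.0`, `f_star = -1.05`, `a_v = 3.0` and `y0 = 10.0`, both functions agree closely at moderate times, and for large `t` the value approaches `a_v * |f| / |f_star|`, which differs from `a_v` only by a relative error of order $\varepsilon$.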

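Finally, we can deduce again from our knowledge on the orders of population sizes that

$$\lim_{K\to\infty}\mathbb P\big(\sigma_\varepsilon^K<T^K_{\mathrm{fix}}\wedge e^{U_\varepsilon K}\big)=0,\tag{99}$$

which allows us to drop the stopping time $\sigma_\varepsilon^K$ in the claim.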

Pathwise evolution rates

From the precise description of the population sizes inside the mutation spreading neighbourhood we can now deduce the rate of occurrence of mutants that lie outside.

To observe a new mutant whose trait is far away from the resident population, a whole sequence of mutation steps is needed. Traits outside the $\alpha$-neighbourhood $V_\alpha(\mathbf v)$ cannot avoid extinction solely by means of incoming mutants. Therefore, if such a trait has negative invasion fitness, mutants only give rise to small excursions approximated by subcritical branching processes. During each of these excursions there is a small probability that a new mutant is produced before extinction.

To overcome the problem of tracking possible back mutations, we do not only observe the sizes of the different mutant populations. Instead, we distinguish mutants by the mutational path along which they arose and keep track of the genealogy. We set

$$N_v^K(t)=\sum_{\gamma:V_\alpha\to v}N_{v,\gamma}^K(t)\quad\forall\,v\in V\setminus V_\alpha,\tag{100}$$

where the pathwise mutations can be represented by

$$\begin{aligned}N_{v,\gamma}^K(t)=&\int_0^t\int_{\mathbb R_+}\mathbf 1_{\theta\le b(v)(1-\mu_K)N_{v,\gamma}^K(s^-)}\,Q_{v,\gamma}^{(b)}(\mathrm ds,\mathrm d\theta)\\&+\int_0^t\int_{\mathbb R_+}\mathbf 1_{\theta\le\mu_Kb(\tilde v)m(\tilde v,v)N_{\tilde v,\gamma\setminus v}^K(s^-)}\,Q_{v,\gamma}^{(m)}(\mathrm ds,\mathrm d\theta)\\&-\int_0^t\int_{\mathbb R_+}\mathbf 1_{\theta\le[d(v)+\sum_{w\in V}c^K(v,w)N_w^K(s^-)]N_{v,\gamma}^K(s^-)}\,Q_{v,\gamma}^{(d)}(\mathrm ds,\mathrm d\theta).\end{aligned}\tag{101}$$

Here $\tilde v$ stands for the next-to-last vertex in $\gamma$, which is the progenitor of $v$ in $\gamma$, and for $\tilde v\in V_\alpha$ we set

$$N_{\tilde v,(\tilde v)}(t):=N_{\tilde v}(t).\tag{102}$$

As before, $(Q_{v,\gamma}^{(b)},Q_{v,\gamma}^{(d)},Q_{v,\gamma}^{(m)};\,v\in V,\gamma:V_\alpha\to v)$ are independent homogeneous Poisson random measures with constant intensity one.

Remark 15

It suffices to sum only over the paths starting in $V_\alpha$ in the decomposition. By the definition of $T^K_{\mathrm{ESC}}$ all populations outside of $V_\alpha$ are extinct at that time. The probability that a mutant of trait $v\in V\setminus V_\alpha$ arises before the finite time $\tau_\varepsilon$ in Lemma 14, when the populations in $V_\alpha$ reach their equilibrium, goes to zero. After this time we have good bounds on the population sizes of all traits in $V_\alpha$ and it is therefore sufficient to trace back the genealogy of new mutants to the last trait in $V_\alpha$, i.e. a trait in $V_\alpha$.

With this representation at hand, we are now able to define the cumulated number of mutant individuals of trait $v$ that arose as mutants of the progenitor $\tilde v$, along the path $\gamma$,

$$M_{v,\gamma}^K(t)=\int_0^t\int_{\mathbb R_+}\mathbf 1_{\theta\le\mu_Kb(\tilde v)m(\tilde v,v)N_{\tilde v,\gamma\setminus v}^K(s^-)}\,Q_{v,\gamma}^{(m)}(\mathrm ds,\mathrm d\theta),\tag{103}$$

as well as the respective occurrence times of these mutants

$$T_{v,\gamma}^{(i,K)}:=\inf\big\{t\ge0:\ M_{v,\gamma}^K(t)\ge i\big\},\tag{104}$$

where we set $T_{v,\gamma}^{(0,K)}:=0$.

Our aim is to show that new mutants outside of $V_\alpha$ appear at the end of a mutation path approximately as a Poisson point process with a rate that scales with the length of the path.

Lemma 15

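Suppose $\mathbf v$ and $(\beta^K(0))_{K\ge0}$ are an asymptotic ESC and let $T^K_{\mathrm{fix}}$ be defined as in (76). Let $v\in V\setminus V_\alpha$ and $\gamma:V_\alpha\to v$ be such that $|\gamma|\le L-\lfloor\alpha\rfloor$ and $f(\gamma_i,\mathbf v)<0$, for all $i=0,\dots,|\gamma|-1$. Then there exist $0<c,C<\infty$ such that, for each $\varepsilon>0$, there exist two Poisson point processes $M_{v,\gamma}^{(K,\pm)}$ with rates $\tilde R_{v,\gamma}^{(\pm)}K\mu_K^{\lfloor\alpha\rfloor+|\gamma|}$ such that

$$\liminf_{K\to\infty}\mathbb P\Big(M_{v,\gamma}^{(K,-)}(t)<M_{v,\gamma}^K(t)<M_{v,\gamma}^{(K,+)}(t),\ \forall\,t<T^K_{\mathrm{fix}}\Big)\ge1-c\varepsilon,\tag{105}$$

where the rate parameters are defined as

$$\tilde R_{v,\gamma}:=a_{\gamma_0}b(\gamma_0)m(\gamma_0,\gamma_1)\prod_{j=1}^{|\gamma|-1}\lambda(\rho(\gamma_j,\mathbf v))\,m(\gamma_j,\gamma_{j+1}),\qquad\tilde R_{v,\gamma}^{(\pm)}=(1\pm C\varepsilon)\tilde R_{v,\gamma}.\tag{106}$$

For the definitions of $\lambda(\rho)$ and $\rho(v,\mathbf v)$ we refer to (20) and (19) respectively, while $a_{\gamma_0}$ is the equilibrium size defined in (79).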

Proof

Note that, throughout the whole proof, we assume that $\tau_\varepsilon<t<T^K_{\mathrm{fix}}\wedge e^{U_\varepsilon K}$, where $\tau_\varepsilon$ and $U_\varepsilon$ are defined in Lemma 14. This can then be extended to all $0\le t<T^K_{\mathrm{fix}}$ in the limit $K\to\infty$ since $T^K_{\mathrm{fix}}<e^{U_\varepsilon K}$ with probability converging to 1 and, since $\mu_K\to0$, there is almost surely no mutation event during the finite time interval $[0,\tau_\varepsilon]$.

Let $v\in V\setminus V_\alpha$ and $\gamma:V_\alpha\to v$ be given as in the lemma. To keep the notation readable, we refer to the vertices of the path via $\gamma=(v_0,v_1,\dots,v_{|\gamma|})$. The idea of this proof is to consider the path isolated from the remaining graph and adapt the tools from Bovier et al. (2019, Ch. 7.3.) to the present situation. We refrain from adding much more notation to our already complicated situation. Instead, we handle the far more general structure of our trait graph by translating the notation of the central objects between the articles.

The first observation is that, for every $t<T^K_{\mathrm{fix}}$, we can bound the mutant counting process of trait $v_1$ by

$$M_{v_1,\gamma}^{(K,-)}(t)\le M_{v_1,\gamma}^K(t)\le M_{v_1,\gamma}^{(K,+)}(t)\quad\text{a.s.},\tag{107}$$

with the bounding processes being defined as

$$M_{v_1,\gamma}^{(K,\pm)}(t)=\int_0^t\int_{\mathbb R_+}\mathbf 1_{\theta\le\mu_Kb(v_0)m(v_0,v_1)N_{v_0}^{(K,\pm)}(s^-)}\,Q_{v_0,\gamma}^{(m)}(\mathrm ds,\mathrm d\theta).\tag{108}$$

Note that the estimate corresponds to equation (7.42) in Bovier et al. (2019), while the definition is the adapted version of (7.17) therein. In order to make use of Lemma 14, we continue temporarily with the simplified processes

$$\bar M_{v_1,\gamma}^{(K,\pm)}(t)=\int_0^t\int_{\mathbb R_+}\mathbf 1_{\theta\le\mu_Kb(v_0)m(v_0,v_1)\mathbb E[N_{v_0}^{(K,\pm)}(s^-)]}\,Q_{v_0,\gamma}^{(m)}(\mathrm ds,\mathrm d\theta)\tag{109}$$

and

$$\bar T_{v_1,\gamma}^{(i,K,\pm)}:=\inf\big\{t\ge0:\ \bar M_{v_1,\gamma}^{(K,\pm)}(t)\ge i\big\}.\tag{110}$$

In fact, this turns out to be sufficient for our results since a standard application of Doob’s martingale inequality shows that, with probability converging to 1, the difference of the processes $M_{v_1,\gamma}^{(K,\pm)}$ and $\bar M_{v_1,\gamma}^{(K,\pm)}$ during the relevant time interval stays of sufficiently small order. To be precise there exist sequences of numbers $N_1(K)$ and $N_2(K)$, with

$$N_1(K)\gg(K\mu_K^L)^{-1}\quad\text{and}\quad N_2(K)\ll(\mu_K^{L-1-\lfloor\alpha\rfloor})^{-1}\tag{111}$$

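such that

$$\lim_{K\to\infty}\mathbb P\Big(\sup_{s\le N_1(K)}\big|M_{v_1,\gamma}^{(K,\pm)}(s)-\bar M_{v_1,\gamma}^{(K,\pm)}(s)\big|>N_2(K)\Big)=0.\tag{112}$$

For details, see Bovier et al. (2019, p. 3583). At each time $\bar T_{v_1,\gamma}^{(i,K,\pm)}$ an individual of trait $v_1$ is born. In order to track its descendants until potentially a trait $v_{|\gamma|}$ individual is born, in a similar way as done in the previous section, we couple the $k$-mutant population, for $1\le k\le|\gamma|-1$, to birth-death processes with individual birth and death rates

$$b^{(\ast)}(v_k)=b(v_k)(1-\mathbf 1_{\{\ast=-\}}\varepsilon),\tag{113}$$

$$d^{(\ast)}(v_k)=d(v_k)+\sum_{w\in\mathbf v}c(v_k,w)\bar n_w^{(\bar\ast)}(\mathbf v)+\mathbf 1_{\{\ast=-\}}\varepsilon\sup_{\tilde w\in V\setminus\mathbf v}c(v_k,\tilde w).\tag{114}$$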

Note that in contrast to Sect. 4.1, these subcritical processes do not gain from any immigration and hence go extinct in finite time. However, there is a small probability during such an excursion of the $k$-mutant population that an individual of trait $v_{k+1}$ is born. Analogously to Bovier et al. (2019, pp. 3581–3582), we can use Lemma 17 (see Appendix A.1) to derive

$$\mathbb P\big(\text{an excursion of trait }v_k\text{ produces exactly 1 mutant of type }v_{k+1}\big)=\mu_K\lambda(\rho(v_k,\mathbf v))\,m(v_k,v_{k+1})(1+O(\varepsilon)),\tag{115}$$

while on the other hand

$$\mathbb P\big(\text{an excursion of trait }v_k\text{ produces at least 2 mutants of type }v_{k+1}\big)=O(\mu_K^2).\tag{116}$$

Hence, the probability that the $i$-th mutant of trait $v_1$ (i.e. the one triggering $\bar T_{v_1,\gamma}^{(i,K,\pm)}$) produces a $v_{|\gamma|}$-mutant is, for large $K$,

$$\mu_K^{|\gamma|-1}\prod_{k=\lfloor\alpha\rfloor+1}^{|\gamma|-1}\lambda(\rho(v_k,\mathbf v))\,m(v_k,v_{k+1})(1+O(\varepsilon)).\tag{117}$$

Since Lemma 14 implies that $\bar M_{v_1,\gamma}^{(K,\pm)}$ can be treated as a Poisson process with intensity

$$K\mu_K^{d(\mathbf v,v_0)+1}a_{v_0}b(v_0)m(v_0,v_1),\tag{118}$$

we obtain the appearance of $v_{|\gamma|}$-mutants also as a Poisson process with thinned intensity

$$K\mu_K^{d(\mathbf v,v_0)+|\gamma|}a_{v_0}b(v_0)m(v_0,v_1)\prod_{k=\lfloor\alpha\rfloor+1}^{|\gamma|-1}\lambda(\rho(v_k,\mathbf v))\,m(v_k,v_{k+1})(1+O(\varepsilon))\tag{119}$$

$$=\tilde R_{v,\gamma}^{(\pm)}K\mu_K^{\lfloor\alpha\rfloor+|\gamma|}.\tag{120}$$

Finally, the difference between $M_{v_1,\gamma}^{(K,\pm)}$ and $\bar M_{v_1,\gamma}^{(K,\pm)}$ is of smaller order than $(\mu_K^{L-1-\lfloor\alpha\rfloor})^{-1}$. Multiplying with the thinning probability (117), which is of order $\mu_K^{|\gamma|-1-\lfloor\alpha\rfloor}$, this only changes the appearance rate of the $v_{|\gamma|}$-mutants by a vanishing order.
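To make the structure of the pathwise rate concrete, the following Python sketch evaluates the prefactor $\tilde R_{v,\gamma}$ and the resulting appearance rate for one path. It rests on an assumption that is not spelled out in this section: the expected number of births $\lambda(\rho)$ during an excursion of an unfit trait $u$ is taken as $b(u)/|f(u,\mathbf v)|$, which is the mean number of births $b/(d-b)$ of a subcritical linear birth death process whose death rate includes the competitive pressure of the residents. Function names and the data layout are hypothetical.

```python
def path_rate(path, a_start, b, m, fitness):
    """Prefactor R~ for a path (v_0, ..., v_n); a_start is the coefficient a_{v_0}."""
    rate = a_start * b[path[0]] * m[(path[0], path[1])]
    for j in range(1, len(path) - 1):
        v_j = path[j]
        if fitness[v_j] >= 0:
            raise ValueError("intermediate traits must have negative invasion fitness")
        # Assumption: lambda(rho(v_j, residents)) = b(v_j) / |f(v_j, residents)|.
        expected_births = b[v_j] / abs(fitness[v_j])
        rate *= expected_births * m[(v_j, path[j + 1])]
    return rate


def appearance_rate(path, a_start, b, m, fitness, K, mu, floor_alpha):
    """Rate K * mu^(floor(alpha) + |path|) * R~, where |path| counts mutation steps."""
    length = len(path) - 1
    return path_rate(path, a_start, b, m, fitness) * K * mu ** (floor_alpha + length)
```

For a hypothetical path `(2, 3, 4)` with `a_start = 0.5`, `b = {2: 1.0, 3: 2.0}`, `m = {(2, 3): 0.5, (3, 4): 0.25}` and `fitness = {3: -1.0}`, the prefactor is the product of the first-step mutation rate and one excursion factor for the unfit trait 3.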

Remark 16

Note that in general there could be an overlap of two excursions of $N_{v_k,\gamma}^K$, associated to different incoming mutants. Nevertheless, in the limit $K\to\infty$ this does not happen since the time interval between the incoming mutants diverges, while the durations of the excursions stay of order one, i.e. $T_{v_k,\gamma}^{(i+1,K)}-T_{v_k,\gamma}^{(i,K)}\gg1$.

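As a direct corollary we can deduce the law of the appearance times of new mutants with trait $v\in V\setminus V_\alpha$.

Corollary 16

Suppose $\mathbf v$ and $(\beta^K(0))_{K\ge0}$ are an asymptotic ESC. Let $v\in V\setminus V_\alpha$ be a trait such that all paths $\gamma:V_\alpha\to v$ of shortest length $|\gamma|=d(V_\alpha,v)$ only visit traits with negative invasion fitness, excluding the last trait $v$, i.e. $f(\gamma_i,\mathbf v)<0$ for all $i=0,\dots,|\gamma|-1$. Denote by $T_v^{(i,K)}$ the appearance time of the $i$-th mutant of trait $v$ descended from a nearest neighbour trait. Then there exist $0<c,C<\infty$ such that, for each $\varepsilon>0$, there exist sequences of i.i.d. exponential random variables $E_v^{(i,K,\pm)}$, $i\ge1$, with rates $\tilde R_v^{(\pm)}=(1\pm C\varepsilon)\tilde R_v$, where

$$\tilde R_v:=\sum_{\substack{\gamma:V_\alpha\to v\\ |\gamma|=d(V_\alpha,v)}}a_{\gamma_0}b(\gamma_0)m(\gamma_0,\gamma_1)\prod_{j=1}^{|\gamma|-1}\lambda(\rho(\gamma_j,\mathbf v))\,m(\gamma_j,\gamma_{j+1}),\tag{121}$$

such that

$$\liminf_{K\to\infty}\mathbb P\Big(E_v^{(i,K,-)}\le K\mu_K^{d(\mathbf v,v)}\big(T_v^{(i,K)}-T_v^{(i-1,K)}\big)\le E_v^{(i,K,+)}\ \ \forall\,i:\ T_v^{(i,K)}<T^K_{\mathrm{fix}}\Big)\ge1-c\varepsilon.\tag{122}$$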

Proof

Due to Lemma 15, we can describe the arrivals of new $v$-type mutants approximately as a sum of Poisson point processes. Since the Poisson measures $Q_{\cdot,\cdot}^{(\cdot)}$ in our representation (101) are taken as independent, the resulting mutation counting processes $M_{v,\gamma}^K$ are also independent. Hence their sum can be approximated by a Poisson process with intensity

$$\sum_{\gamma:V_\alpha\to v}\tilde R_{v,\gamma}K\mu_K^{|\gamma|+\lfloor\alpha\rfloor}.\tag{123}$$

Since each summand scales with the length of the respective path, the first order of the overall rate is given only by the shortest paths (i.e. $\gamma$ such that $|\gamma|=d(V_\alpha,v)=d(\mathbf v,v)-\lfloor\alpha\rfloor$). As a result, the first order becomes (121) multiplied by $K\mu_K^{d(\mathbf v,v)}$. Finally, the waiting times of homogeneous Poisson point processes are exponentially distributed with the same rate.
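The dominance of the shortest paths in the sum of pathwise rates is easy to verify numerically. In the following sketch each path is represented only by its length and its prefactor; both helper functions and the example numbers in the usage note are hypothetical.

```python
def total_rate(path_rates, K, mu, floor_alpha):
    """Sum of all pathwise rates; path_rates is a list of (length, prefactor) pairs."""
    return sum(r * K * mu ** (floor_alpha + n) for n, r in path_rates)


def leading_order(path_rates, K, mu, floor_alpha):
    """Contribution of the shortest paths only."""
    shortest = min(n for n, _ in path_rates)
    prefactor = sum(r for n, r in path_rates if n == shortest)
    return prefactor * K * mu ** (floor_alpha + shortest)
```

For two shortest paths of length 2 with prefactors 0.3 and 0.2 and one longer path of length 3 with prefactor 5.0, the ratio `total_rate / leading_order` equals $1+10\,\mu_K$, so it tends to one as $\mu_K\to0$ even though the longer path has a much larger prefactor.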

Proof of Theorem 7 and Corollary 8

We have now assembled all the tools to finish the proof of Theorem 7 and Corollary 8.

Note that, with the notation from the proof of Lemma 14, all following considerations are only valid up to the stopping time $S_\varepsilon^K\wedge\sigma_\varepsilon^K$, for sufficiently small $\varepsilon$. Since we have seen previously that $T^K_{\mathrm{fix}}\le S_\varepsilon^K\wedge\sigma_\varepsilon^K$ with probability converging to one, as $K\to\infty$, we do not condition on this anymore in the following. Moreover, constants $c$ and $C$ may vary throughout the proof but are always assumed to satisfy $0<c,C<\infty$.

Both results assume that the initial conditions $(\beta^K(0))_{K\ge0}$ compose an asymptotic ESC associated to the coexisting traits $\mathbf v\subset V$. In a first step, we study the time until the fixation of the first mutant trait outside of $V_\alpha:=V_\alpha(\mathbf v)$, i.e. $T^K_{\mathrm{fix}}$. Corollary 16 implies that, for all traits $w\in V\setminus V_\alpha$ such that all shortest paths $\gamma:\mathbf v\to w$ only pass through unfit traits, new mutants of this trait arise approximately according to a Poisson point process with rate $\tilde R_w$ (on the time scale $1/(K\mu_K^{d(\mathbf v,w)})$). By assumption, $\beta_w^K(0)=0$, for all $K>K_0$ and $w\in V\setminus V_\alpha$, i.e. all traits outside of $V_\alpha$ are initially extinct. As a result, individuals of such traits $w$ are only present due to the above incoming mutations.

We now argue why it suffices to consider traits $w\in V\setminus V_\alpha$ such that $f(w,\mathbf v)>0$ and $d(\mathbf v,w)=L(\mathbf v)$, i.e. the $w\in V_{\mathrm{mut}}:=V_{\mathrm{mut}}(\mathbf v)$, as candidates to reach $\beta_w^K=1/\alpha$ first and trigger the stopping time $T^K_{\mathrm{fix}}$.

For all $w$ such that $\alpha<d(\mathbf v,w)<L(\mathbf v)$, the definition of $L(\mathbf v)$ yields $f(w,\mathbf v)<0$. Therefore, the descendants of a mutant of such traits can be bounded from above by a subcritical birth-death process with rates that do not depend on $K$, which dies out within a finite time with probability 1. As a result,

$$\lim_{K\to\infty}\mathbb P\Big(\sup_{t\in[0,T^K_{\mathrm{fix}}\wedge e^{U_\varepsilon K}]}\beta_w^K(t)\ge\frac1\alpha\Big)=0.\tag{124}$$

For $w$ such that $d(\mathbf v,w)=L(\mathbf v)$ and $f(w,\mathbf v)<0$, the same argument can be applied.

Finally, for all $w$ such that $d(\mathbf v,w)>L(\mathbf v)$ and for all $T<\infty$, Corollary 16 implies that the arrival time of the first $w$ mutant, $T_w^{(1)}$, satisfies

$$\lim_{K\to\infty}\mathbb P\Big(T_w^{(1)}\le\frac{T}{K\mu_K^{L(\mathbf v)}}\wedge T^K_{\mathrm{fix}}\Big)=0.\tag{125}$$

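Focussing on the $w\in V_{\mathrm{mut}}$, we can use couplings to supercritical birth-death processes (similar to the arguments in the previous sections) to bound the different mutant populations. Using classical results on branching processes (e.g. from Athreya and Ney 1972, Ch. III.4) we can approximate the probability that the descendants of a single mutant of a particular trait $w$ do not go extinct by $(1\pm C\varepsilon)f(w,\mathbf v)/b(w)$. Moreover, conditioned on not going extinct, the time that such a population needs to grow to a size of $K^{1/\alpha}$ can be bounded by $(1\pm C\varepsilon)\ln K/(\alpha f(w,\mathbf v))$. It is therefore negligible on the time scale $1/(K\mu_K^{L(\mathbf v)})$, on which the $w$ mutants arise.

Overall, we can deduce from Corollary 16 that there is a constant $0<c<\infty$ and exponential random variables $E_{w,\mathrm{fix}}^{(K,\pm)}$ with parameters $(1\pm c\varepsilon)\tilde R_wf(w,\mathbf v)/b(w)=(1\pm c\varepsilon)R(\mathbf v,w)$ such that

$$\liminf_{K\to\infty}\mathbb P\Big(E_{w,\mathrm{fix}}^{(K,-)}\le K\mu_K^{L(\mathbf v)}T^K_{\mathrm{fix}}\le E_{w,\mathrm{fix}}^{(K,+)}\ \Big|\ \beta_w^K(T^K_{\mathrm{fix}})=\frac1\alpha\Big)\ge1-c\varepsilon.\tag{126}$$

Since the mutants arising along different paths are independent (see the proof of Corollary 16), the actual stopping time $K\mu_K^{L(\mathbf v)}T^K_{\mathrm{fix}}$ (without conditioning on a trait $w$) is roughly exponentially distributed with the sum of all rates $R(\mathbf v)=\sum_{w\in V_{\mathrm{mut}}}R(\mathbf v,w)$. In addition, the probability that a certain trait $w\in V_{\mathrm{mut}}$ triggers the stopping time $T^K_{\mathrm{fix}}$ can be approximated by $R(\mathbf v,w)/R(\mathbf v)$. More precisely, there are exponential random variables $E^{(K,\pm)}(\varepsilon)$ such that

$$\liminf_{K\to\infty}\mathbb P\Big(E^{(K,-)}(\varepsilon)\le K\mu_K^{L(\mathbf v)}T^K_{\mathrm{fix}}\le E^{(K,+)}(\varepsilon)\Big)\ge1-c\varepsilon,\tag{127}$$

$$\frac{R(\mathbf v,w)}{R(\mathbf v)}(1-c\varepsilon)\le\lim_{K\to\infty}\mathbb P\Big(\beta_w^K(T^K_{\mathrm{fix}})=\frac1\alpha\Big)\le\frac{R(\mathbf v,w)}{R(\mathbf v)}(1+c\varepsilon).\tag{128}$$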

Since ε can be picked arbitrarily small, this concludes the proof of Theorem 7.
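The limiting picture behind Theorem 7 is that of competing exponential clocks: each candidate mutant trait $w$ fixates at rate $R(\mathbf v,w)$, the first fixation time is exponential with the total rate, and trait $w$ wins with probability proportional to its rate. The following seeded Monte Carlo sketch illustrates this for hypothetical rate values; it simulates only the limiting exponential clocks, not the population process itself, and all function names are assumptions.

```python
import random


def fixation_rates(mutant_rates, fitness, birth):
    """R(v, w) = R~_w * f(w, v) / b(w): arrival rate thinned by the survival probability."""
    return {w: mutant_rates[w] * fitness[w] / birth[w] for w in mutant_rates}


def simulate_first_fixation(rates, n_runs, seed=0):
    """Estimate the mean first fixation time and the winning probabilities."""
    rng = random.Random(seed)
    wins = {w: 0 for w in rates}
    total_time = 0.0
    for _ in range(n_runs):
        clocks = {w: rng.expovariate(r) for w, r in rates.items()}
        winner = min(clocks, key=clocks.get)  # the trait whose clock rings first
        wins[winner] += 1
        total_time += clocks[winner]
    return total_time / n_runs, {w: n / n_runs for w, n in wins.items()}
```

With hypothetical inputs `fixation_rates({3: 2.0, 5: 4.0}, {3: 1.0, 5: 0.5}, {3: 2.0, 5: 1.0})` the rates are 1 and 2, so the estimated mean waiting time should be close to 1/3 and the estimated winning probabilities close to 1/3 and 2/3.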

To deduce Corollary 8, we note that at time $T^K_{\mathrm{fix}}$ the population sizes satisfy (28), for some $w\in V_{\mathrm{mut}}(\mathbf v)$. Hence the assumption of the corollary and Theorem 18 imply that a new ESC associated to $v_{\mathrm{ESC}}(\mathbf v,w)$ is obtained within a time of order $\ln K$. We emphasise that, although Theorem 18 only implies that $\beta_u^K\to0$ for traits $u\notin V_\alpha(v_{\mathrm{ESC}}(\mathbf v,w))$ after this time, these subpopulations can be bounded from above by subcritical branching processes that go extinct within a time of order 1, such that the conditions of $T^K_{\mathrm{ESC}}$ are truly satisfied. This yields the first claim of Corollary 8. Since this time is again negligible with respect to the $1/(K\mu_K^{L(\mathbf v)})$-time scale, the second claim follows directly. For the last claim, we realise that a new ESC $\mathbf w$ might be reached from multiple $w\in V_{\mathrm{mut}}(\mathbf v)$, and we therefore add up all corresponding probabilities to obtain $p(\mathbf v,\mathbf w)$. This concludes the proof of Corollary 8.

Proof of Corollary 10 and Theorem 11

In order to derive results for the jump chain $(v^{(k)})_{k\ge0}$ on $G_{\mathrm{ESC}}$, we observe that, after a successful transition according to Corollary 8, the final state of the process again satisfies the initial assumptions for another application of the corollary. We simply need to recompute the state-dependent quantities ($L(\mathbf v)$, $V_{\mathrm{mut}}(\mathbf v)$, etc.). As a consequence, the strong Markov property allows us to use Corollary 8 to construct the random sequence $(v^{(k)})_{k\ge0}$ as well as derive the asymptotics of the stopping times $T^{(k,K)}_{\mathrm{ESC}}$ by an inductive procedure. This proves Corollary 10.

To extract the limiting process on the time scale $1/(K\mu_K^L)$ for fixed $L>\alpha$, take an initial configuration of this stability degree, i.e. $\mathbf v\in S_L$. Considering the jump chain $(v^{(k)})_{k\ge0}$ with $v^{(0)}=\mathbf v$, Assumption 3 implies that, with probability one, $(v^{(k)})_{k\ge0}$ reaches an ESC of stability degree at least $L$ within finitely many steps. We now consider such a finite path $\Gamma:\mathbf v\to\mathbf w$ in $G_{\mathrm{ESC}}$, where $L(\mathbf w)\ge L$. Without loss of generality we may assume that the intermediate ESCs are of strictly lower stability degree, i.e. $L(\Gamma_i)<L$ for all $1\le i<|\Gamma|$. Otherwise we could shorten the path. Asking now for the time $T_\Gamma^K$ that it takes to transition from $\mathbf v$ to $\mathbf w$ along $\Gamma$, we can simply add up the single step transition times $T^{(i,K)}_{\mathrm{ESC}}-T^{(i-1,K)}_{\mathrm{ESC}}$. By Corollary 10, we know that, on the time scale $1/(K\mu_K^{L^{(i)}})$, those are well approximated by exponential random variables $E_\pm^{(i,K)}$. Since $L=L^{(1)}>L^{(i)}$, for $2\le i\le|\Gamma|$, we can deduce that the rescaled transition time $T_\Gamma^KK\mu_K^L$ is dominated by the very first transition and thus well described by exponential random variables.

To compute the respective transition rates, notice that by Corollary 10, on the time scale $1/(K\mu_K^L)$, the rate to escape from $\mathbf v=\Gamma_0$ is given by $R(\mathbf v)=R^{(1)}$. Moreover, we have to take into account that we consider the case where the limit process $(v^{(i)})_{i\ge0}$ takes a particular path, i.e. $v^{(i)}=\Gamma_i$, for $0\le i\le|\Gamma|$. The probability of this event is simply given by the product of the one-step probabilities $p(v^{(i-1)},v^{(i)})$. Similarly to previous arguments, there might be different paths $\Gamma:\mathbf v\to\mathbf w$ and hence we add up their probabilities. This yields the rates $R_L(\mathbf v,\mathbf w)$ in (41) and therefore the claimed dynamics of the jump process $(v_L(t))_{t\in[0,T]}$ on the L-scale graph $G_L$.
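The rate computation just described (escape rate of the initial ESC, multiplied by the probability of each collapsed path, summed over all paths with the same endpoint) can be sketched as follows. The sketch assumes that the less stable ESCs visited in between do not form cycles, so that the depth-first enumeration of paths terminates; the function name and all numerical values in the usage note are hypothetical.

```python
import math


def l_scale_rates(start, escape_rate, jump_prob, stability, level):
    """Rates R_L(start, w): escape rate times summed path probabilities (acyclic case only)."""
    rates = {}
    stack = [(start, 1.0)]  # (current ESC, probability of the path taken so far)
    while stack:
        node, prob = stack.pop()
        for nxt, p in jump_prob.get(node, {}).items():
            if stability[nxt] >= level:
                # The path ends in a sufficiently stable ESC: add its contribution.
                rates[nxt] = rates.get(nxt, 0.0) + escape_rate * prob * p
            else:
                stack.append((nxt, prob * p))  # fast intermediate jump, keep following
    return rates
```

For a branching like the one in Example 7, with hypothetical one-step probabilities `{0: {3: 0.4, 5: 0.6}, 3: {5: 1.0}}`, stability degrees `{0: 3, 3: 2, 5: math.inf}` and escape rate 2.0, both paths from {0} end in {5} and their contributions merge into a single rate of 2.0 on the level-3 graph.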

To finally deduce the limit of the rescaled population process $N^K/K$, we note that there is no macroscopic evolution during almost the entire waiting time for a transition on $G_L$. The set of macroscopic traits $\{v\in V:\beta_v^K(t)>1-\varepsilon_K\}$ only changes after a new mutant fixates, which happens at time $T^{(1,K)}_{\mathrm{fix}}$. The rest of the transition time, which may consist of many changes of the macroscopic traits, vanishes when rescaling with $K\mu_K^L$. Therefore, we obtain the limit process of Theorem 11, which jumps between the Lotka–Volterra equilibria associated to the state of $(v_L(t))_{t\in[0,T]}$.

Acknowledgements

This work was partially supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy GZ 2047/1, Projekt-ID 390685813 and GZ 2151, Project-ID 390873048 and through Project-ID 211504053 - SFB 1060. The authors thank Anton Bovier for stimulating discussion and feedback on the manuscript. Moreover, the authors are very grateful to the anonymous referees for their numerous comments and questions, which helped to greatly improve the manuscript.

Appendix A Technical results

The aim of this chapter is to collect some results on the $O(1)$- and $O(\ln K)$-time scale behaviour of the population process. While Sect. A.1 explains the form of $\lambda(\rho)$, Sect. A.2 justifies the notation $v_{\mathrm{ESC}}(\mathbf v,v)$. The statements have been derived in Bovier et al. (2019) and Coquille et al. (2021), to which we refer for detailed proofs.

A.1 Excursions of subcritical birth death processes

The first lemma quantifies the mean number of birth events before a subcritical birth death process goes extinct, corresponding to $\lambda(\rho)$. Although we restate an existing result here, we provide a short proof below. This proof differs from the one for the more general scenario cited in Bovier et al. (2019) and gives the reader an intuition behind the expression.

Lemma 17

(Bovier et al. 2019, Lemma A.3) Consider a subcritical linear birth death process with individual birth and death rates $0<b<d$. Denote by $Z$ the total number of birth events during an excursion of this process initiated with exactly one individual. Then, for $k\in\mathbb N_0$,

$$p^{(b,d)}(k):=\mathbb P(Z=k)=\frac{(2k)!}{k!\,(k+1)!}\Big(\frac{b}{b+d}\Big)^k\Big(\frac{d}{b+d}\Big)^{k+1}\tag{A1}$$

and in particular

$$e^{(b,d)}:=\mathbb E[Z]=\sum_{k=1}^\infty\frac{(2k)!}{(k-1)!\,(k+1)!}\Big(\frac{b}{b+d}\Big)^k\Big(\frac{d}{b+d}\Big)^{k+1}.\tag{A2}$$

Moreover we have the following continuity result. There exist two positive constants $c,\varepsilon_0>0$, such that, for all $0<\varepsilon<\varepsilon_0$ and $0<b_i<d_i$, if $|b_1-b_2|<\varepsilon$ and $|d_1-d_2|<\varepsilon$, then

$$\big|e^{(b_1,d_1)}-e^{(b_2,d_2)}\big|<c\varepsilon.\tag{A3}$$

Remark 17

Note that (A2) corresponds to (19) via $e^{(b,d)}=\lambda(\rho)$, where $\rho=b/(b+d)$.
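The distribution of the number of births during an excursion can be evaluated numerically. The sketch below truncates the series for the mean and compares it with $b/(d-b)$, the mean number of births obtained from the standard total-progeny argument for subcritical branching processes (each individual has on average $b/d$ children). The function names and the truncation level are choices made for this illustration only.

```python
from math import comb


def excursion_pmf(k, b, d):
    """P(Z = k): Catalan number times rho^k (1 - rho)^(k + 1) with rho = b / (b + d)."""
    rho = b / (b + d)
    catalan = comb(2 * k, k) // (k + 1)  # equals (2k)! / (k! (k + 1)!)
    return catalan * rho ** k * (1.0 - rho) ** (k + 1)


def expected_births(b, d, terms=400):
    """Truncated series for E[Z]; terms is kept below about 500 to avoid float overflow."""
    return sum(k * excursion_pmf(k, b, d) for k in range(1, terms))
```

For example, `expected_births(1.0, 2.0)` is numerically indistinguishable from `1.0 / (2.0 - 1.0)`. The truncation is only accurate when $b$ is not too close to $d$, since the terms of the series decay like $(4\rho(1-\rho))^k$.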

Proof

Although the considered process takes place in continuous time, it suffices to consider the sequence of birth and death events as a jump chain in discrete time. This is nothing but a simple random walk on $\mathbb{N}_0$ with transition probabilities

$$p(x,x+1)=\frac{b}{b+d},\qquad p(x,x-1)=\frac{d}{b+d},\qquad x\geq 1, \tag{A4}$$

and absorbing state 0. From this point of view, it only remains to count the number of paths leading from one individual to extinction that consist of exactly $k$ births and hence $k+1$ death events. The final event has to be a death, and the population must not vanish before. So the first $2k$ events form a walk from 1 to 1. There are $\binom{2k}{k}$ such paths, but some of them would lead to extinction earlier. To determine their number, we apply a reflection principle in the following way. Let $x=(x_0,x_1,\ldots,x_{2k})$ be a path leading from 1 to 1 such that there exists a $0<j<2k$ with $x_j=0$, and let $j$ be the smallest such index. Then we define the partially reflected path $\tilde{x}$ by

$$\tilde{x}_i:=\begin{cases}x_i & \text{for } i\leq j,\\ -x_i & \text{for } i>j.\end{cases} \tag{A5}$$

This gives us a unique path from $\tilde{x}_0=1$ to $\tilde{x}_{2k}=-1$ (cf. Fig. 13). Moreover, there is a one-to-one correspondence between paths that go extinct prematurely and paths leading from 1 to $-1$. The latter consist of only $k-1$ births and hence there are $\binom{2k}{k-1}$ of them. Finally, the total number of admissible paths is

$$\#\left\{x=(x_0,x_1,\ldots,x_{2k})\,\middle|\,x_0=1,\ x_{2k}=1,\ x_i>0\ \forall i\right\}=\binom{2k}{k}-\binom{2k}{k-1}=\frac{(2k)!}{k!\,(k+1)!}. \tag{A6}$$

We now obtain (A1) by multiplying with the probability $\left(\frac{b}{b+d}\right)^{k}\left(\frac{d}{b+d}\right)^{k+1}$ of $k$ births and $k+1$ death events. The continuity statement (A3) is a simple consequence of the mean value theorem.
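The counting argument of the proof can be checked by brute force for small $k$. The sketch below enumerates all $2k$-step paths from 1 to 1, applies the partial reflection at the first visit to 0, and compares the resulting counts with the binomial expressions in (A6). The function names are hypothetical, and the enumeration is exponential in $k$, so it is meant only as an illustration.

```python
from itertools import product
from math import comb


def reflect(path):
    """Partial reflection as in (A5): flip the sign after the first visit to 0."""
    j = path.index(0)  # smallest index j with x_j = 0
    return path[: j + 1] + tuple(-x for x in path[j + 1:])


def count_paths(k):
    """Enumerate all 2k-step nearest-neighbour paths from 1 to 1.

    Returns (all paths, paths visiting 0, strictly positive paths).
    """
    total = 0
    hitting = 0
    reflected = set()
    for steps in product((1, -1), repeat=2 * k):
        if sum(steps) != 0:  # returning to 1 needs k births and k deaths
            continue
        total += 1
        positions = [1]
        for step in steps:
            positions.append(positions[-1] + step)
        path = tuple(positions)
        if 0 in path:
            hitting += 1
            image = reflect(path)
            assert image[-1] == -1  # reflected paths lead from 1 to -1
            reflected.add(image)
    assert len(reflected) == hitting  # the reflection map is injective
    return total, hitting, total - hitting


if __name__ == "__main__":
    for k in range(1, 7):
        total, hitting, positive = count_paths(k)
        assert total == comb(2 * k, k)
        assert hitting == comb(2 * k, k - 1)
        assert positive == comb(2 * k, k) // (k + 1)
        print(k, total, hitting, positive)
```

Since the number of reflected images equals the number of all paths from 1 to $-1$, the check also confirms surjectivity for the tested values of $k$.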

Fig. 13. Original path $x$ that prematurely goes extinct and its reflection $\tilde{x}$

A.2 Fast evolution until ESC

In this subsection we discuss the first phase of evolution, where an ESC is obtained on the $\ln K$-time scale. The convergence of $N^K(t\ln K)/K$ and $\beta^K(t\ln K)$, as $K\to\infty$, is studied in Coquille et al. (2021). In the following we cite the respective results in the notation of this paper.

For a finite graph $G=(V,E)$ and under Assumptions 1 and 2, the trajectories $(\beta_w(t),w\in V)$ (which turn out to be the limit of $(\beta^K_w(t\ln K),w\in V)$) are defined by an inductive procedure. The construction is valid until a stopping time $T_0$.

Denote by $\tilde{\mathbf{v}}^{(\ell)}$, $\ell\geq 0$, the sequence of consecutive coexisting resident traits. We emphasize that these are not to be confused with the sequence of resident traits $\mathbf{v}^{(k)}$, $k\geq 0$, that are associated to ESCs. The invasion times, at which the sets of resident traits change due to upcoming mutant traits, are denoted by the increasing sequence $(s_\ell)_{\ell\geq 0}$.

For initial conditions $\tilde{\beta}(0)$, the support of the unique asymptotically stable equilibrium of the Lotka–Volterra system (2) associated to the traits $\{w\in V:\tilde{\beta}_w(0)=1\}$ (if it exists) is denoted by $\tilde{\mathbf{v}}^{(0)}$. The equilibrium $\bar{n}(\tilde{\mathbf{v}}^{(0)})$ is reached within a time of order 1 and we set $s_0:=0$. Moreover, we define $\beta_w(0):=\max_{u\in V}[\tilde{\beta}_u(0)-d(u,w)/\alpha]_+$ as the initial condition of the limiting trajectories. This reflects that, within a time of order 1, living traits produce neighbouring mutant populations with the size of a $\mu_K$-fraction of their own size. This time of order 1 is negligible on the $\ln K$-time scale, on which the limit $\beta$ is defined.
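The initial condition of the limiting trajectories can be illustrated with a minimal sketch. It assumes that $d(u,w)$ is the length of the shortest directed path from $u$ to $w$ in $G$ (infinite if no such path exists); the function names, the example graph and the value of $\alpha$ are hypothetical.

```python
from collections import deque
from math import inf


def graph_distances(vertices, edges):
    """All-pairs directed distances d(u, w), computed by breadth-first search."""
    successors = {v: [] for v in vertices}
    for u, w in edges:
        successors[u].append(w)
    dist = {}
    for source in vertices:
        d = {v: inf for v in vertices}
        d[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in successors[u]:
                if d[w] == inf:  # first visit gives the shortest distance
                    d[w] = d[u] + 1
                    queue.append(w)
        dist[source] = d
    return dist


def initial_beta(beta_tilde, dist, alpha):
    """beta_w(0) = max_u [beta_tilde_u(0) - d(u, w) / alpha]_+ for every trait w."""
    return {
        w: max(max(beta_tilde[u] - dist[u][w] / alpha for u in beta_tilde), 0.0)
        for w in beta_tilde
    }


if __name__ == "__main__":
    # Assumed example: directed line 0 -> 1 -> 2 -> 3, only trait 0 is macroscopic.
    vertices = [0, 1, 2, 3]
    dist = graph_distances(vertices, [(0, 1), (1, 2), (2, 3)])
    print(initial_beta({0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}, dist, alpha=2.5))
```

In this example, traits at distance one and two from the resident start with positive exponents, while the trait at distance three starts at exponent 0, since $1-3/\alpha<0$ for $\alpha=2.5$.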

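Assuming that $s_{\ell-1}$, $\tilde{\mathbf{v}}^{(\ell-1)}$ such that $\mathrm{LVE}_+(\tilde{\mathbf{v}}^{(\ell-1)})=\bar{n}(\tilde{\mathbf{v}}^{(\ell-1)})$, and $\beta(s_{\ell-1})$ are known, the next phase can be described as follows. The $\ell$th invasion time is set to

$$s_\ell:=\inf\{t>s_{\ell-1}:\exists\,w\notin\tilde{\mathbf{v}}^{(\ell-1)}:\beta_w(t)=1\}. \tag{A7}$$

For $s_{\ell-1}\leq t\leq s_\ell$ and any $w\in V$, $\beta_w(t)$ is defined by

$$\beta_w(t):=\max_{u\in V}\left[\beta_u(s_{\ell-1})+(t-t_{u,\ell}\wedge t)\,f(u,\tilde{\mathbf{v}}^{(\ell-1)})-\frac{d(u,w)}{\alpha}\right]\vee 0, \tag{A8}$$

where, for any $w\in V$,

$$t_{w,\ell}:=\begin{cases}\inf\left\{t\geq s_{\ell-1}:\exists\,u\in V:d(u,w)=1,\ \beta_u(t)=\frac{1}{\alpha}\right\} & \text{if } \beta_w(s_{\ell-1})=0,\\ s_{\ell-1} & \text{else}\end{cases} \tag{A9}$$

is the first time in $[s_{\ell-1},s_\ell]$ when this trait arises. If we define $V_{\mathrm{living}}(t):=\{w\in V:\beta_w(t)>0\}$ analogously to $V^K_{\mathrm{living}}$ (on the $\ln K$-time scale), then this implies $\beta_w(t_{w,\ell})\geq 0$ and $\beta_w(t_{w,\ell}+\delta)>0$, for small $\delta>0$.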
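To make (A8) concrete, the following sketch evaluates the exponents within a single phase. It assumes that the arrival times $t_{u,\ell}$, the invasion fitnesses with respect to the current residents and the graph distances are already known and passed in as dictionaries; it does not solve (A7) or (A9). All names and numerical values are hypothetical.

```python
from math import inf


def beta_in_phase(t, beta_prev, t_arrive, fitness, dist, alpha):
    """Evaluate (A8) at a time t of the current phase.

    beta_prev[u] : exponent of trait u at the start of the phase
    t_arrive[u]  : time at which trait u arises in this phase, cf. (A9)
    fitness[u]   : invasion fitness of u against the current residents
    dist[u][w]   : directed graph distance from u to w (inf if unreachable)
    """
    beta = {}
    for w in beta_prev:
        best = 0.0  # the outer maximum with 0 in (A8)
        for u in beta_prev:
            grown = beta_prev[u] + (t - min(t_arrive[u], t)) * fitness[u]
            best = max(best, grown - dist[u][w] / alpha)
        beta[w] = best
    return beta


if __name__ == "__main__":
    # Assumed example: directed line 0 -> 1 -> 2 with resident 0, unfit trait 1
    # and fit trait 2; all traits are alive at the start of the phase (time 0).
    dist = {0: {0: 0, 1: 1, 2: 2}, 1: {0: inf, 1: 0, 2: 1}, 2: {0: inf, 1: inf, 2: 0}}
    beta_prev = {0: 1.0, 1: 2.0 / 3.0, 2: 1.0 / 3.0}
    t_arrive = {0: 0.0, 1: 0.0, 2: 0.0}
    fitness = {0: 0.0, 1: -0.5, 2: 0.5}
    for t in (0.0, 1.0, 4.0 / 3.0):
        print(t, beta_in_phase(t, beta_prev, t_arrive, fitness, dist, alpha=3.0))
```

In this example the exponent of trait 2 grows linearly with slope 0.5 and reaches 1 at time 4/3, which would be the next invasion time in the sense of (A7), while trait 1 is kept at exponent 2/3 by incoming mutants from the resident.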

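The stopping time $T_0$, which terminates the inductive construction of the limiting trajectories, is set to $s_\ell$ if one of the following conditions holds:

  1. There is more than one $w\in V\setminus\tilde{\mathbf{v}}^{(\ell-1)}$ such that $\beta_w(s_\ell)=1$;

  2. The mutation-free Lotka–Volterra system associated to $\tilde{\mathbf{v}}^{(\ell-1)}$ and the unique $w\in V\setminus\tilde{\mathbf{v}}^{(\ell-1)}$ such that $\beta_w(s_\ell)=1$ does not have a unique globally attractive stable equilibrium (in particular, if such an equilibrium does not exist for $\{w\in V:\tilde{\beta}_w(0)=1\}$, $T_0$ is set to 0);

  3. There exists $w\in V\setminus\tilde{\mathbf{v}}^{(\ell-1)}$ such that $\beta_w(s_\ell)=0$ and $\beta_w(s_\ell-\delta)>0$ for all $\delta>0$ small enough;

  4. There exists $w\in V\setminus\tilde{\mathbf{v}}^{(\ell-1)}$ such that $s_\ell=t_{w,\ell}$.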

These conditions are mostly technical and are discussed in Coquille et al. (2021).

With this construction, the results can be stated as follows:

Theorem 18

(Coquille et al. 2021, Theorem 2.7) Let $G=(V,E)$ be a finite graph. Suppose that Assumptions 1 and 2 hold and consider the model defined by (1) with $\mu_K=K^{-1/\alpha}$. Let $\tilde{\mathbf{v}}^{(0)}\subset V$ and assume that, for every $w\in V$,

$$\beta^K_w(0)\to\tilde{\beta}_w(0)\quad(K\to\infty)\quad\text{in probability}. \tag{A10}$$

Then, for all $T>0$, as $K\to\infty$, the sequence $((\beta^K_w(t\ln K),w\in V),t\in[0,T\wedge T_0])$ converges in probability in $\mathbb{D}([0,T\wedge T_0],\mathbb{R}_+^V)$ to the deterministic, piecewise affine, continuous function $((\beta_w(t),w\in V),t\in[0,T\wedge T_0])$, which is defined in (A8).

Theorem 19

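(Coquille et al. 2021, Proposition 2.8) Under the same assumptions as in Theorem 18, for all $T>0$, as $K\to\infty$, the sequence $((N^K_w(t\ln K)/K,w\in V),t\in[0,T\wedge T_0])$ converges in the sense of finite-dimensional distributions to a deterministic jump process $((N_w(t),w\in V),t\in[0,T\wedge T_0])$, which jumps between different Lotka–Volterra equilibria according to

$$N_w(t):=\sum_{\ell\in\mathbb{N}:\,s_{\ell+1}\leq T_0}\mathbb{1}_{s_\ell\leq t<s_{\ell+1}}\,\mathbb{1}_{w\in\tilde{\mathbf{v}}^{(\ell)}}\,\bar{n}_w(\tilde{\mathbf{v}}^{(\ell)}). \tag{A11}$$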

Moreover, the invasion times $s_\ell$ and the times $t_{w,\ell}$ at which new mutants arise are calculated precisely in Coquille et al. (2021). This is, however, not relevant to the discussion in this paper.
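The jump process in (A11) is a piecewise constant lookup, which the following sketch makes explicit. It assumes that the invasion times, resident sets and equilibrium sizes are given as lists indexed by the phase, starting with phase 0 at time $s_0=0$; this indexing convention, the function name and the example values are assumptions made for illustration.

```python
def limiting_population(t, w, invasion_times, residents, equilibria, T0):
    """Evaluate N_w(t) as in (A11) from precomputed phase data.

    invasion_times : increasing list [s_0, s_1, ...]
    residents[l]   : set of resident traits during [s_l, s_{l+1})
    equilibria[l]  : dict mapping each resident trait to its equilibrium size
    T0             : stopping time of the construction
    """
    for l in range(len(invasion_times) - 1):
        s_l, s_next = invasion_times[l], invasion_times[l + 1]
        if s_next <= T0 and s_l <= t < s_next and w in residents[l]:
            return equilibria[l][w]
    return 0.0  # w is not resident at time t, or t lies outside the known phases


if __name__ == "__main__":
    # Assumed example with two phases: trait "a" is replaced by trait "b" at time 1.5.
    times = [0.0, 1.5, 4.0]
    residents = [{"a"}, {"b"}]
    equilibria = [{"a": 2.0}, {"b": 3.0}]
    for t in (1.0, 2.0):
        print(t, {w: limiting_population(t, w, times, residents, equilibria, T0=10.0)
                  for w in ("a", "b")})
```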

We notice that the constructed trajectories $(\beta_w(t),w\in V)$ stay constant precisely once an ESC is obtained. In this case, there is no more visible evolution on the $\ln K$-time scale.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Athreya KB, Ney PE (1972) Branching processes. Die Grundlehren der mathematischen Wissenschaften, vol 196, Springer-Verlag, New York-Heidelberg
  2. Baar M, Bovier A, Champagnat N (2017) From stochastic, individual-based models to the canonical equation of adaptive dynamics in one step. Ann Appl Probab 27(2):1093–1170. 10.1214/16-AAP1227
  3. Berestycki J, Brunet E, Shi Z (2016) The number of accessible paths in the hypercube. Bernoulli 22(2):653–68. 10.3150/14-BEJ641
  4. Berestycki J, Brunet E, Shi Z (2017) Accessibility percolation with backsteps. ALEA Lat Am J Probab Math Stat 14(1):45–62
  5. Bolker B, Pacala SW (1997) Using moment equations to understand stochastically driven spatial pattern formation in ecological systems. Theor Popul Biol 52(3):179–19. 10.1006/tpbi.1997.1331
  6. Bovier A (2021) Stochastic models for adaptive dynamics: scaling limits and diversity. In: Baake E, Wakolbinger A (eds) Probabilistic structures in evolution, EMS series of congress reports, vol 17. EMS Press, Berlin, pp 127–149
  7. Bovier A, den Hollander F (2015) Metastability, a potential-theoretic approach, Grundlehren der Mathematischen Wissenschaften, vol 351. Springer Cham Heidelberg New York Dordrecht London
  8. Bovier A, Coquille L, Neukirch R (2018) The recovery of a recessive allele in a Mendelian diploid model. J Math Biol 77(4):971–103. 10.1007/s00285-018-1240-z
  9. Bovier A, Coquille L, Smadi C (2019) Crossing a fitness valley as a metastable transition in a stochastic population model. Ann Appl Probab 29(6):3541–358. 10.1214/19-AAP1487
  10. Champagnat N (2006) A microscopic interpretation for adaptive dynamics trait substitution sequence models. Stoch Process Appl 116(8):1127–116. 10.1016/j.spa.2006.01.004
  11. Champagnat N, Méléard S (2011) Polymorphic evolution sequence and evolutionary branching. Probab Theory Relat Fields 151(1–2):45–9. 10.1007/s00440-010-0292-9
  12. Champagnat N, Méléard S, Tran VC (2021) Stochastic analysis of emergence of evolutionary cyclic behavior in population dynamics with transfer. Ann Appl Probab 31(4):1820–1867
  13. Cirillo ENM, Nardi FR (2013) Relaxation height in energy landscapes: an application to multiple metastable states. J Stat Phys 150:1080–1114
  14. Coquille L, Kraut A, Smadi C (2021) Stochastic individual-based models with power law mutation rate on a general finite trait space. Electron J Probab 26:1–37
  15. Dawson DA, Greven A (2014) Spatial Fleming-Viot models with selection and mutation, vol 2092. Lecture notes in mathematics. Springer
  16. De Visser JAG, Krug J (2014) Empirical fitness landscapes and the predictability of evolution. Nat Rev Genet 15(7):480–490
  17. Ethier SN, Kurtz TG (1986) Markov processes. Wiley Ser Probab Math Stat. Wiley, New York. 10.1002/9780470316658
  18. Fournier N, Méléard S (2004) A microscopic probabilistic description of a locally regulated population and macroscopic approximations. Ann Appl Probab 14(4):1880–191. 10.1214/105051604000000882
  19. Gillespie DT (1976) A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J Comput Phys 22(4):403–43. 10.1016/0021-9991(76)90041-3
  20. Gillespie JH (1984) Molecular evolution over the mutational landscape. Evolution 38(5):1116–1129
  21. Gokhale CS, Iwasa Y, Nowak MA et al (2009) The pace of evolution across fitness valleys. J Theor Biol 259(3):613–620
  22. Jain K (2007) Evolutionary dynamics of the most populated genotype on rugged fitness landscapes. Phys Rev E Stat Nonlinear Soft Matter Phys 76(3):031922
  23. Jain K, Krug J (2005) Evolutionary trajectories in rugged fitness landscapes. J Stat Mech Theory Exp 4:400. 10.1088/1742-5468/2005/04/p04008
  24. Jain K, Krug J (2007) Deterministic and stochastic regimes of asexual evolution on rugged fitness landscapes. Genetics 175:1275–8. 10.1534/genetics.106.067165
  25. Komarova NL (2007) Loss- and gain-of-function mutations in cancer: mass-action, spatial and hierarchical models. J Stat Phys 128:413–446
  26. Kraut A, Bovier A (2019) From adaptive dynamics to adaptive walks. J Math Biol 79(5):1699–174. 10.1007/s00285-019-01408-6
  27. Krug J (2021) Accessibility percolation in random fitness landscapes. In: Baake E, Wakolbinger A (eds) Probabilistic structures in evolution. EMS series of congress reports, vol 17. EMS Press, Berlin, pp 1–22
  28. Martincorena I, Raine KM, Gerstung M et al (2017) Universal patterns of selection in cancer and somatic tissues. Cell 171(5):1029–1041
  29. Metz JAJ, Nisbet RM, Geritz SAH (1992) How should we define ‘fitness’ for general ecological scenarios? Trends Ecol Evol 7(6):198–202
  30. Neidhart J, Krug J (2011) Adaptive walks and extreme value theory. Phys Rev Lett 107(178):10. 10.1103/PhysRevLett.107.178102
  31. Nicholson M, Antal T (2019) Competing evolutionary paths in growing populations with applications to multidrug resistance. PLoS Comput Biol 15(4):e1006866
  32. Nowak S, Krug J (2015) Analysis of adaptive walks on NK fitness landscapes with different interaction schemes. J Stat Mech Theory Exp 6:06014. 10.1088/1742-5468/2015/06/p06014
  33. Orr HA (2003) A minimum on the mean number of steps taken in adaptive walks. J Theor Biol 220(2):241–24. 10.1006/jtbi.2003.3161
  34. Pallen MNM (2006) From the origin of species to the origin of bacterial flagella. Nat Rev Microbiol 4:784–790
  35. Schmiegelt B, Krug J (2014) Evolutionary accessibility of modular fitness landscapes. J Stat Phys 154(1–2):334–35. 10.1007/s10955-013-0868-8
  36. Smadi C (2017) The effect of recurrent mutations on genetic diversity in a large population of varying size. Acta Appl Math 149:11–5. 10.1007/s10440-016-0086-x

Articles from Journal of Mathematical Biology are provided here courtesy of Springer
