. 2017 Jul 3;76(3):697–754. doi: 10.1007/s00285-017-1153-2

Stochastic population growth in spatially heterogeneous environments: the density-dependent case

Alexandru Hening 1,2, Dang H Nguyen 3, George Yin 3
PMCID: PMC5772867  PMID: 28674928

Abstract

This work is devoted to studying the dynamics of a structured population that is subject to the combined effects of environmental stochasticity, competition for resources, spatio-temporal heterogeneity and dispersal. The population is spread throughout $n$ patches whose population abundances are modeled as the solutions of a system of nonlinear stochastic differential equations living on $[0,\infty)^n$. We prove that $r$, the stochastic growth rate of the total population in the absence of competition, determines the long-term behaviour of the population. The parameter $r$ can be expressed as the Lyapunov exponent of an associated linearized system of stochastic differential equations. Detailed analysis shows that if $r>0$, the population abundances converge polynomially fast to a unique invariant probability measure on $(0,\infty)^n$, while when $r<0$, the population abundances of the patches converge almost surely to 0 exponentially fast. This generalizes and extends the results of Evans et al. (J Math Biol 66(3):423–476, 2013) and proves one of their conjectures. Compared to recent developments, our model incorporates very general density-dependent growth rates and competition terms. Furthermore, we prove that persistence is robust to small, possibly density dependent, perturbations of the growth rates, dispersal matrix and covariance matrix of the environmental noise. We also show that the stochastic growth rate depends continuously on the coefficients. Our work allows the environmental noise driving our system to be degenerate. This is relevant from a biological point of view since, for example, the environments of the different patches can be perfectly correlated. We show how one can adapt the nondegenerate results to the degenerate setting. As an example we fully analyze the two-patch case, $n=2$, and show that the stochastic growth rate is a decreasing function of the dispersion rate. In particular, coupling two sink patches can never yield persistence, in contrast to the results from the non-degenerate setting treated by Evans et al. which show that sometimes coupling by dispersal can make the system persistent.

Keywords: Stochastic population growth, Density-dependence, Ergodicity, Spatial and temporal heterogeneity, Lotka–Volterra model, Lyapunov exponent, Habitat fragmentation, Stochastic environment, Dispersion

Introduction

The survival of an organism is influenced by both biotic (competition for resources, predator-prey interactions) and abiotic (light, precipitation, availability of resources) factors. Since these factors are space-time dependent, all types of organisms have to choose their dispersal strategies: If they disperse they can arrive in locations with different environmental conditions while if they do not disperse they face the temporal fluctuations of the local environmental conditions. The dispersion strategy impacts key attributes of a population including its spatial distribution and temporal fluctuations in its abundance. Individuals selecting more favorable habitats are more likely to survive or reproduce. When population densities increase in these habitats, organisms may prosper by selecting habitats that were previously unused. There have been numerous studies of the interplay between dispersal and environmental heterogeneity and how this influences population growth; see Hastings (1983), Gonzalez and Holt (2002), Schmidt (2004), Roy et al. (2005), Schreiber (2010), Cantrell et al. (2012), Durrett and Remenik (2012), Evans et al. (2013) and references therein. The mathematical analysis for stochastic models with density-dependent feedbacks is less explored. In the setting of discrete-space discrete-time models there have been thorough studies by Benaïm and Schreiber (2009); Schreiber (2010); Schreiber et al. (2011). Continuous-space discrete-time population models that disperse and experience uncorrelated, environmental stochasticity have been studied by Hardin et al. (1988a, b, 1990). They show that the leading Lyapunov exponent r of the linearization of the system around the extinction state almost determines the persistence and extinction of the population. For continuous-space continuous-time population models Mierczyński and Shen (2004) study the dynamics of random Kolmogorov type PDE models in bounded domains. 
Once again, it is shown that the leading Lyapunov exponent $r$ of the linearization around the trivial equilibrium 0 almost determines when the population goes extinct and when it persists. In the current paper we explore the question of persistence and extinction when the population dynamics is given by a system of stochastic differential equations. In our setting, even though our methods and techniques are very different from those used by Hardin et al. (1988a) and Mierczyński and Shen (2004), we still make use of the system linearized around the extinction state. The Lyapunov exponent of this linearized system plays a key role throughout our arguments.

Evans et al. (2013) studied a linear stochastic model that describes the dynamics of populations that continuously experience uncertainty in time and space. Their work has shed some light on key issues from population biology. Their results provide fundamental insights into “ideal free” movement in the face of uncertainty, the evolution of dispersal rates, the single large or several small (SLOSS) debate in conservation biology, and the persistence of coupled sink populations. In this paper, we propose a density-dependent model of stochastic population growth that captures the interactions between dispersal and environmental heterogeneity and complements the work of Evans et al. (2013). We then present a rigorous and comprehensive study of the proposed model based on stochastic analysis.

The dynamics of a population in nature is stochastic. This is due to environmental stochasticity—the fluctuations of the environment make the growth rates random. One of the simplest models for a population living in a single patch is

$$dU(t) = U(t)\big(a - bU(t)\big)\,dt + \sigma U(t)\,dW(t), \quad t\ge 0, \tag{1.1}$$

where $U(t)$ is the population abundance at time $t$, $a$ is the mean per-capita growth rate, $b>0$ is the strength of intraspecific competition, $\sigma^2$ is the infinitesimal variance of fluctuations in the per-capita growth rate and $(W(t))_{t\ge 0}$ is a standard Brownian motion. The long-term behavior of (1.1) is determined by the stochastic growth rate $a-\frac{\sigma^2}{2}$ in the following way (see Evans et al. 2015; Dennis and Patil 1984):

  • If $a-\frac{\sigma^2}{2}>0$ and $U(0)=u>0$, then $(U(t))_{t\ge 0}$ converges weakly to its unique invariant probability measure $\rho$ on $(0,\infty)$.

  • If $a-\frac{\sigma^2}{2}<0$ and $U(0)=u>0$, then $\lim_{t\to\infty}U(t)=0$ almost surely.

  • If $a-\frac{\sigma^2}{2}=0$ and $U(0)=u>0$, then $\liminf_{t\to\infty}U(t)=0$ almost surely, $\limsup_{t\to\infty}U(t)=\infty$ almost surely, and $\lim_{t\to\infty}\frac{1}{t}\int_0^t U(s)\,ds=0$ almost surely.
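The trichotomy above can be explored numerically. The following is a minimal sketch, not part of the original analysis: it applies an Euler–Maruyama step to $\ln U$ (using Itô's formula, $d\ln U = (a - \sigma^2/2 - bU)\,dt + \sigma\,dW$), which keeps the simulated abundance strictly positive. The function names, step size and parameter values are illustrative assumptions.

```python
import math
import random


def stochastic_growth_rate(a, sigma):
    """Stochastic growth rate a - sigma^2/2 of equation (1.1)."""
    return a - 0.5 * sigma ** 2


def simulate_logistic_sde(a, b, sigma, u0, T, dt, seed=0):
    """Simulate (1.1) on a uniform grid via Euler-Maruyama applied to ln U.

    Assumption of this sketch: the scheme discretises
    d ln U = (a - sigma^2/2 - b U) dt + sigma dW,
    so every simulated value exp(ln U) is strictly positive.
    """
    rng = random.Random(seed)
    n_steps = int(round(T / dt))
    log_u = math.log(u0)
    path = [u0]
    for _ in range(n_steps):
        u = math.exp(log_u)
        dW = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Brownian increment
        log_u += (a - 0.5 * sigma ** 2 - b * u) * dt + sigma * dW
        path.append(math.exp(log_u))
    return path


if __name__ == "__main__":
    # Hypothetical parameters with a - sigma^2/2 < 0 (extinction regime).
    path = simulate_logistic_sde(a=0.1, b=1.0, sigma=1.0, u0=1.0, T=200.0, dt=0.01)
    print(stochastic_growth_rate(0.1, 1.0), path[-1])
```

Working on the logarithmic scale is a design choice of the sketch; a direct Euler step on $U$ can produce negative values for large noise increments.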

Organisms are always affected by temporal heterogeneities, but they are subject to spatial heterogeneities only when they disperse. Population growth is influenced by spatial heterogeneity through the way organisms respond to environmental signals (see Hastings 1983; Cantrell and Cosner 1991; Chesson 2000; Schreiber and Lloyd-Smith 2009). There have been several analytic studies that contributed to a better understanding of the separate effects of spatial and temporal heterogeneities on population dynamics. However, few theoretical studies have considered the combined effects of spatio-temporal heterogeneities, dispersal, and density-dependence for discretely structured populations with continuous-time dynamics.

As seen in both the continuous (Evans et al. 2013) and the discrete (Palmqvist and Lundberg 1998) settings, the extinction risk of a population is greatly affected by the spatio-temporal correlation between the environment in the different patches. For example, if spatial correlations are weak, one can show that populations coupled via dispersal can survive even though every patch, on its own, would go extinct (see Evans et al. 2013; Jansen and Yoshimura 1998; Harrison and Quinn 1989). Various species usually exhibit spatial synchrony. Ecologists are interested in this pattern as it can lead to the extinction of rare species. Possible causes for synchrony are dispersal and spatial correlations in the environment (see Legendre 1993; Kendall et al. 2000; Liebhold et al. 2004). Consequently, it makes sense to look at stochastic patch models coupled by dispersion for which the environmental noise of the different patches can be strongly correlated. We do this by extending the setting of Evans et al. (2013) by allowing the environmental noise driving the system to be degenerate.

The rest of the paper is organized as follows. In Sect. 2, we introduce our model for a population living in a patchy environment. It takes into account the dispersal between different patches and density-dependent feedback. The temporal fluctuations of the environmental conditions of the various patches are modeled by Brownian motions that are correlated. We start by considering the relative abundances of the different patches in a low density approximation. We show that these relative abundances converge in distribution to their unique invariant probability measure asymptotically as time goes to infinity. Using this invariant probability measure we derive an expression for $r$, the stochastic growth rate (Lyapunov exponent) in the absence of competition. We show that this $r$ is key in analyzing the long-term behavior of the populations. In Appendix A we show that if $r>0$ then the abundances converge weakly, polynomially fast, to their unique invariant probability measure on $(0,\infty)^n$. In Appendix B, we show that if $r<0$ then all the population abundances go extinct asymptotically, at an exponential rate (with exponential constant $r$). Appendix C is dedicated to the case when the noise driving our system is degenerate (that is, the dimension of the noise is lower than the number of patches). In Appendix D, we show that $r$ depends continuously on the coefficients of our model and that persistence is robust: small perturbations of the model do not make a persistent system become extinct. We provide some numerical examples and possible generalizations in Sect. 4.

Model and results

We study a population with overlapping generations, which lives in a spatio-temporally heterogeneous environment consisting of $n$ distinct patches. The growth rate of each patch is determined by both deterministic and stochastic environmental inputs. We denote by $X_i(t)$ the population abundance at time $t\ge 0$ of the $i$th patch and write $X(t)=(X_1(t),\dots,X_n(t))$ for the vector of population abundances. Following Evans et al. (2013), it is appropriate to model $X(t)$ as a Markov process with the following properties when $0\le\Delta t\ll 1$:

  • the conditional mean is
    $$\mathbb{E}\big[X_i(t+\Delta t)-X_i(t)\,\big|\,X(t)=x\big]\approx\Big[a_ix_i-x_ib_i(x_i)+\sum_{j\ne i}\big(x_jD_{ji}-x_iD_{ij}\big)\Big]\Delta t,$$
    where $a_i\in\mathbb{R}$ is the per-capita growth rate in the $i$th patch, $b_i(x_i)$ is the per-capita strength of intraspecific competition in patch $i$ when the abundance of the patch is $x_i$, and $D_{ij}\ge 0$ is the dispersal rate from patch $i$ to patch $j$;
  • the conditional covariance is
    $$\mathrm{Cov}\big[X_i(t+\Delta t)-X_i(t),\,X_j(t+\Delta t)-X_j(t)\,\big|\,X(t)=x\big]\approx\sigma_{ij}x_ix_j\,\Delta t$$
    for some covariance matrix $\Sigma=(\sigma_{ij})$.

The difference between our model and the one from Evans et al. (2013) is that we added density-dependent feedback through the $x_ib_i(x_i)$ terms.

We work on a complete probability space $(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{t\ge 0},\mathbb{P})$ with filtration $\{\mathcal{F}_t\}_{t\ge 0}$ satisfying the usual conditions. We consider the system

$$dX_i(t)=\Big(X_i(t)\big(a_i-b_i(X_i(t))\big)+\sum_{j=1}^nD_{ji}X_j(t)\Big)dt+X_i(t)\,dE_i(t),\quad i=1,\dots,n, \tag{2.1}$$

where $D_{ij}\ge 0$ for $j\ne i$ is the per-capita rate at which the population in patch $i$ disperses to patch $j$, $D_{ii}=-\sum_{j\ne i}D_{ij}$ (so that $-D_{ii}$ is the total per-capita rate of emigration out of patch $i$), $E(t)=(E_1(t),\dots,E_n(t))^\top=\Gamma^\top B(t)$, $\Gamma$ is an $n\times n$ matrix such that $\Gamma^\top\Gamma=\Sigma=(\sigma_{ij})_{n\times n}$ and $B(t)=(B_1(t),\dots,B_n(t))$ is a vector of independent standard Brownian motions adapted to the filtration $\{\mathcal{F}_t\}_{t\ge 0}$. Throughout the paper, we work with the following assumption regarding the growth of the intraspecific competition rates.

Assumption 2.1

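For each $i=1,\dots,n$ the function $b_i:\mathbb{R}_+\to\mathbb{R}$ is locally Lipschitz and vanishing at 0. Furthermore, there are $M_b>0$, $\gamma_b>0$ such that

$$\frac{\sum_{i=1}^nx_i\big(b_i(x_i)-a_i\big)}{\sum_{i=1}^nx_i}>\gamma_b\quad\text{for any } x_i\ge 0,\ i=1,\dots,n,\ \text{satisfying } \sum_{i=1}^nx_i\ge M_b. \tag{2.2}$$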

Remark 2.1

Note that if we set $x_j=x\ge M_b$ and $x_i=0$, $i\ne j$, we get from (2.2) that

$$b_j(x)-a_j>\gamma_b,\quad x\ge M_b,\ j=1,\dots,n.$$

Remark 2.2

Note that condition (2.2) is biologically reasonable because it holds if the $b_i$'s are sufficiently large for large $x_i$'s. We provide some simple scenarios when Assumption 2.1 is satisfied.

  1. Suppose $b_i:[0,\infty)\to[0,\infty)$, $i=1,\dots,n$, are locally Lipschitz and vanishing at 0. Assume that there exist $\gamma_b>0$, $\tilde M_b>0$ such that
    $$\inf_{x\in[\tilde M_b,\infty)}b_i(x)-a_i-\gamma_b>0,\quad i=1,\dots,n.$$
    It is easy to show that Assumption 2.1 holds.

  2. Particular cases of item 1 are, for example, any $b_i:\mathbb{R}_+\to\mathbb{R}$ that are locally Lipschitz, vanishing at 0 and such that $\lim_{x\to\infty}b_i(x)=\infty$.

  3. One natural choice for the competition functions, which is widely used throughout the literature, is $b_i(x)=\kappa_ix$, $x\in(0,\infty)$, for some $\kappa_i>0$. In this case the competition terms become $-x_ib_i(x_i)=-\kappa_ix_i^2$.

Remark 2.3

Note that if we have the SDE

$$dX_i(t)=\Big(X_i(t)f_i(X_i(t))+\sum_{j=1}^nD_{ji}X_j(t)\Big)dt+X_i(t)\,dE_i(t),\quad i=1,\dots,n, \tag{2.3}$$

where the $f_i$ are locally Lipschitz, this can always be rewritten in the form (2.1) with

$$a_i:=f_i(0)\quad\text{and}\quad b_i(x):=f_i(0)-f_i(x),\quad i=1,\dots,n.$$

Therefore, our setting is in fact very general and incorporates both nonlinear growth rates and nonlinear competition terms.
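The rewriting in Remark 2.3 is mechanical, and a short sketch may make it concrete. The helper name `split_growth` and the sample growth function below are hypothetical illustrations, not objects from the paper.

```python
def split_growth(f):
    """Split a per-capita growth function f into (a, b) as in Remark 2.3.

    Returns a = f(0) and the competition function b(x) = f(0) - f(x),
    so that f(x) = a - b(x) and b(0) = 0.
    """
    a = f(0.0)

    def b(x):
        return a - f(x)

    return a, b


if __name__ == "__main__":
    # Hypothetical nonlinear per-capita growth rate.
    f = lambda x: 1.5 - 2.0 * x - 0.1 * x * x
    a, b = split_growth(f)
    print(a, b(0.0), b(2.0))
```

By construction $b$ vanishes at 0, which is the normalisation required in Assumption 2.1; whether condition (2.2) holds still depends on the growth of $f$ at infinity.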

The drift $\tilde f(x)=(\tilde f_1(x),\dots,\tilde f_n(x))$, where $\tilde f_i(x)=x_i(a_i-b_i(x_i))+\sum_{j=1}^nD_{ji}x_j$, is sometimes said to be cooperative. This is because $\tilde f_i(x)\le\tilde f_i(y)$ whenever $x,y\in\mathbb{R}_+^n$ satisfy $x_i=y_i$ and $x_j\le y_j$ for $j\ne i$. A distinctive property of cooperative systems is that comparison arguments generally apply. We refer to Chueshov (2002) for more details.

Remark 2.4

If the dispersal matrix $(D_{ij})$ has a normalized dominant left eigenvector $\alpha=(\alpha_1,\dots,\alpha_n)$ then one can show that the system

$$dX_i(t)=\Big(X_i(t)\big(a_i-b_i(X_i(t))\big)+\delta\sum_{j=1}^nD_{ji}X_j(t)\Big)dt+X_i(t)\,dE_i(t),\quad i=1,\dots,n,$$

converges as $\delta\to\infty$ to a system $(\tilde X_1(t),\dots,\tilde X_n(t))$ for which

$$\tilde X_i(t)=\alpha_i\tilde X(t),\quad t\ge 0,\ i=1,\dots,n,$$

where $\tilde X(t)=\tilde X_1(t)+\dots+\tilde X_n(t)$ and $\tilde X$ is an autonomous Markov process that satisfies the SDE

$$d\tilde X(t)=\tilde X(t)\sum_{i=1}^n\alpha_i\big(a_i-b_i(\alpha_i\tilde X(t))\big)\,dt+\tilde X(t)\sum_{i=1}^n\alpha_i\,dE_i(t).$$

As such, our system is a general version of the system treated in Evans et al. (2015). One can recover the system from Evans et al. (2015) as an infinite dispersion limit of ours.

We denote by $X^x(t)$ the solution of (2.1) started at $X(0)=x\in\mathbb{R}_+^n$. Following Evans et al. (2013), we call matrices $D$ with zero row sums and non-negative off-diagonal entries dispersal matrices. If $D$ is a dispersal matrix, then it is a generator of a continuous-time Markov chain. Define $P_t:=\exp(tD)$, $t\ge 0$. Then $P_t$, $t\ge 0$, is a matrix with non-negative entries that gives the transition probabilities of this Markov chain: the $(i,j)$th entry of $P_t$ gives the proportion of the population that was initially in patch $i$ at time 0 but has dispersed to patch $j$ at time $t$. If one wants to include mortality induced because of dispersal, one can add cemetery patches in which dispersing individuals enter and experience a killing rate before moving to their final destination. Our model is a density-dependent generalization of the one by Evans et al. (2013). We are able to prove that the linearization of the density-dependent model fully determines the non-linear density-dependent behavior, a fact which was conjectured by Evans et al. (2013). Furthermore, we prove stronger convergence results and thus extend the work of Evans et al. (2013). Analogous results for discrete-time versions of the model have been studied by Benaïm and Schreiber (2009) for discrete-space and by Hardin et al. (1988a, b) for continuous-space.

We will work under the following assumptions.

Assumption 2.2

The dispersal matrix D is irreducible.

Assumption 2.3

The covariance matrix Σ is non-singular.

Assumption 2.2 is equivalent to forcing the entries of the matrix $P_t=\exp(tD)$ to be strictly positive for all $t>0$. This means that it is possible for the population to disperse between any two patches. We can always reduce our problem to this setting by working with the maximal irreducible subsets of patches. Assumption 2.3 says that our randomness is non-degenerate, and thus truly $n$-dimensional. We show in Appendix C how to get the desired results when Assumption 2.3 does not hold.
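For two patches the matrix $\exp(tD)$ has a standard closed form, which gives a quick way to see the strict positivity implied by irreducibility. The sketch below assumes the two-patch dispersal matrix $D=\begin{pmatrix}-\alpha&\alpha\\ \beta&-\beta\end{pmatrix}$ with $\alpha,\beta>0$ (the parametrisation used later in the two-patch case study); the function names are illustrative.

```python
import math


def two_patch_transition_matrix(alpha, beta, t):
    """Closed form of exp(tD) for D = [[-alpha, alpha], [beta, -beta]].

    Assumes alpha + beta > 0; entry [i][j] is the proportion of the
    population initially in patch i+1 that sits in patch j+1 at time t.
    """
    lam = alpha + beta
    decay = math.exp(-lam * t)
    return [
        [(beta + alpha * decay) / lam, (alpha - alpha * decay) / lam],
        [(beta - beta * decay) / lam, (alpha + beta * decay) / lam],
    ]


def matmul2(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    P = two_patch_transition_matrix(alpha=0.5, beta=1.0, t=2.0)
    print(P)  # rows sum to 1 and all entries are strictly positive
```

For general $n$ one would use a numerical matrix exponential instead; the closed form is used here only to keep the sketch dependency-free.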

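Throughout the paper we set $\mathbb{R}_+^n:=[0,\infty)^n$ and $\mathbb{R}_+^{n,\circ}:=(0,\infty)^n$. We define the total abundance of our population at time $t\ge 0$ via $S(t):=\sum_{i=1}^nX_i(t)$ and let $Y_i(t):=\frac{X_i(t)}{S(t)}$ be the proportion of the total population that is in patch $i$ at time $t\ge 0$. Set $Y(t)=(Y_1(t),\dots,Y_n(t))^\top$. An application of Itô's lemma to (2.1) yields

$$\begin{aligned}
dY_i(t)={}&Y_i(t)\Big(a_i-\sum_{j=1}^na_jY_j(t)-b_i(S(t)Y_i(t))+\sum_{j=1}^nY_j(t)\,b_j(S(t)Y_j(t))\Big)dt+\sum_{j=1}^nD_{ji}Y_j(t)\,dt\\
&+Y_i(t)\Big(\sum_{j,k=1}^n\sigma_{kj}Y_k(t)Y_j(t)-\sum_{j=1}^n\sigma_{ij}Y_j(t)\Big)dt+Y_i(t)\Big[dE_i(t)-\sum_{j=1}^nY_j(t)\,dE_j(t)\Big],\\
dS(t)={}&S(t)\sum_{i=1}^n\big(a_iY_i(t)-Y_i(t)\,b_i(S(t)Y_i(t))\big)dt+S(t)\sum_{i=1}^nY_i(t)\,dE_i(t).
\end{aligned} \tag{2.4}$$

We can rewrite (2.4) in the following compact form for $(Y(t),S(t))$, where $b(x)=(b_1(x_1),\dots,b_n(x_n))^\top$:

$$\begin{aligned}
dY(t)={}&\big(\mathrm{diag}(Y(t))-Y(t)Y^\top(t)\big)\Gamma^\top dB(t)+D^\top Y(t)\,dt\\
&+\big(\mathrm{diag}(Y(t))-Y(t)Y^\top(t)\big)\big(a-\Sigma Y(t)-b(S(t)Y(t))\big)dt,\\
dS(t)={}&S(t)\big(a-b(S(t)Y(t))\big)^\top Y(t)\,dt+S(t)Y^\top(t)\Gamma^\top dB(t),
\end{aligned} \tag{2.5}$$

where $Y(t)$ lies in the simplex $\Delta:=\{(y_1,\dots,y_n)\in\mathbb{R}_+^n:y_1+\dots+y_n=1\}$. Let $\Delta^\circ=\{(y_1,\dots,y_n)\in\mathbb{R}_+^{n,\circ}:y_1+\dots+y_n=1\}$ be the interior of $\Delta$.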

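Consider Equation (2.5) on the boundary $\{(y,s):y\in\Delta,\ s=0\}$ (that is, we set $S(t)\equiv 0$ in the equation for $Y(t)$). We have the following system

$$d\tilde Y(t)=\big(\mathrm{diag}(\tilde Y(t))-\tilde Y(t)\tilde Y^\top(t)\big)\Gamma^\top dB(t)+D^\top\tilde Y(t)\,dt+\big(\mathrm{diag}(\tilde Y(t))-\tilde Y(t)\tilde Y^\top(t)\big)\big(a-\Sigma\tilde Y(t)\big)dt \tag{2.6}$$

on the simplex $\Delta$. We also introduce the linearized version of (2.1), where the competition terms $b_i(x_i)$ are all set to 0,

$$d\mathcal{X}_i(t)=\Big(\mathcal{X}_i(t)a_i+\sum_{j=1}^nD_{ji}\mathcal{X}_j(t)\Big)dt+\mathcal{X}_i(t)\,dE_i(t),\quad i=1,\dots,n, \tag{2.7}$$

and let $\mathcal{S}(t)=\sum_{i=1}^n\mathcal{X}_i(t)$ be the total population abundance, in the absence of competition. The processes $(\mathcal{X}_1(t),\dots,\mathcal{X}_n(t))$, $\tilde Y(t)$ and $\mathcal{S}(t)$ have been studied by Evans et al. (2013).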

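Evans et al. (2013, Proposition 3.1) proved that the process $(\tilde Y(t))_{t\ge 0}$ is an irreducible Markov process, which has the strong Feller property and admits a unique invariant probability measure $\nu$ on $\Delta$. Let $\tilde Y(\infty)$ be a random variable on $\Delta$ with distribution $\nu$. We define

$$r:=\int_\Delta\Big(a^\top y-\frac{1}{2}y^\top\Sigma y\Big)\nu(dy)=\sum_ia_i\,\mathbb{E}\big[\tilde Y_i(\infty)\big]-\frac{1}{2}\,\mathbb{E}\Big[\sum_{i,j}\sigma_{ij}\tilde Y_i(\infty)\tilde Y_j(\infty)\Big]. \tag{2.8}$$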

Remark 2.5

We note that $r$ is the stochastic growth rate (or Lyapunov exponent) of the total population $\mathcal{S}(t)$ in the absence of competition. That is,

$$\mathbb{P}\Big\{\lim_{t\to\infty}\frac{\ln\mathcal{S}^x(t)}{t}=r\Big\}=1.$$

The expression (2.8) for r coincides with the one derived by Evans et al. (2013).
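Since $r$ in (2.8) is an average against the invariant measure $\nu$ of $\tilde Y$, it can be approximated by a time average along a simulated path of $\tilde Y$. The sketch below does this for $n=2$ with $\Sigma=\mathrm{diag}(\sigma_1^2,\sigma_2^2)$ and dispersal rates $\alpha$ (patch 1 to 2) and $\beta$ (patch 2 to 1). The Euler–Maruyama discretisation, the clipping of $\tilde Y_1$ to $[\varepsilon,1-\varepsilon]$, and all names and default values are assumptions of this illustration, not the method used in the paper.

```python
import math
import random


def estimate_growth_rate(a1, a2, s1, s2, alpha, beta, T=200.0, dt=0.005, seed=0):
    """Monte Carlo time-average estimate of r in (2.8) for two patches.

    Simulates the first coordinate y of the process in (2.6) with
    Sigma = diag(s1^2, s2^2) and averages a.y - (1/2) y^T Sigma y.
    """
    rng = random.Random(seed)
    eps = 1e-9          # clipping level (numerical safeguard, an assumption)
    y = 0.5
    total = 0.0
    n_steps = int(round(T / dt))
    sq = math.sqrt(dt)
    for _ in range(n_steps):
        # Integrand of (2.8) evaluated at (y, 1 - y).
        total += (a1 * y + a2 * (1 - y)
                  - 0.5 * (s1 ** 2 * y ** 2 + s2 ** 2 * (1 - y) ** 2)) * dt
        # Drift and noise of the first coordinate of (2.6) when n = 2.
        drift = (y * (1 - y) * (a1 - a2 - s1 ** 2 * y + s2 ** 2 * (1 - y))
                 - alpha * y + beta * (1 - y))
        noise = y * (1 - y) * (s1 * rng.gauss(0.0, 1.0) - s2 * rng.gauss(0.0, 1.0)) * sq
        y = min(max(y + drift * dt + noise, eps), 1 - eps)
    return total / (n_steps * dt)


if __name__ == "__main__":
    # Hypothetical parameters: each patch alone has a_i - s_i^2/2 < 0.
    print(estimate_growth_rate(a1=0.3, a2=0.3, s1=1.0, s2=1.0, alpha=1.0, beta=1.0))
```

The estimate carries both discretisation and sampling error, so it should be read as indicative only, especially when the true $r$ is close to 0.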

We use superscripts to denote the starting points of our processes. For example $(Y^{y,s}(t),S^{y,s}(t))$ denotes the solution of (2.4) with $(Y(0),S(0))=(y,s)\in\Delta\times(0,\infty)$. Fix $x\in\mathbb{R}_+^n$ and define the normalized occupation measures,

$$\Pi_t^{(x)}(\cdot)=\frac{1}{t}\int_0^t\mathbf{1}_{\{X^x(u)\in\,\cdot\,\}}\,du. \tag{2.9}$$

These random measures describe the distribution of the observed population dynamics up to time $t$. If we define the sets

$$S_\eta:=\{x=(x_1,\dots,x_n)\in\mathbb{R}_+^{n,\circ}:|x_i|\le\eta\ \text{for some}\ i=1,\dots,n\},$$

then $\Pi_t^{(x)}(S_\eta)$ is the fraction of the time in the interval $[0,t]$ that the total abundance of some patch is at most $\eta$ given that our population starts at $X(0)=x$.
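On a path sampled at equally spaced times, the quantity $\Pi_t^{(x)}(S_\eta)$ reduces to the fraction of samples with at least one coordinate at or below $\eta$. A minimal sketch, with a hypothetical function name and a hand-made sample path:

```python
def occupation_fraction(path, eta):
    """Discrete-time analogue of Pi_t^{(x)}(S_eta).

    `path` is a sequence of abundance vectors sampled on a uniform time grid
    (an assumption of this sketch); the result is the fraction of samples in
    which some patch abundance is at most eta.
    """
    hits = sum(1 for x in path if any(abs(xi) <= eta for xi in x))
    return hits / len(path)


if __name__ == "__main__":
    sample_path = [(1.0, 2.0), (0.05, 3.0), (2.0, 0.01), (1.0, 1.0)]
    print(occupation_fraction(sample_path, eta=0.1))
```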

Definition 2.1

One can define a distance on the space of probability measures living on the Borel measurable subsets of $\mathbb{R}_+^n$, that is on the space $(\mathbb{R}_+^n,\mathcal{B}(\mathbb{R}_+^n))$. This is done by defining $\|\cdot\|_{TV}$, the total variation norm, via

$$\|\mu-\nu\|_{TV}:=\sup_{A\in\mathcal{B}(\mathbb{R}_+^n)}|\mu(A)-\nu(A)|.$$
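For intuition, the same definition can be evaluated on a finite state space, where the supremum over sets is attained and equals half the $\ell^1$ distance between the probability vectors. The sketch below is a finite analogue only (the paper's measures live on $\mathbb{R}_+^n$), and the function names are illustrative.

```python
from itertools import combinations


def tv_distance_sup(mu, nu):
    """sup over all subsets A of |mu(A) - nu(A)|, by brute force."""
    n = len(mu)
    best = 0.0
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            diff = abs(sum(mu[i] for i in subset) - sum(nu[i] for i in subset))
            best = max(best, diff)
    return best


def tv_distance_l1(mu, nu):
    """Equivalent formula for probability vectors: half the l1 distance."""
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))


if __name__ == "__main__":
    mu = [0.5, 0.3, 0.2]
    nu = [0.2, 0.3, 0.5]
    print(tv_distance_sup(mu, nu), tv_distance_l1(mu, nu))
```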

Theorem 2.1

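Suppose that Assumptions 2.2 and 2.3 hold and that $r>0$. The process $X(t)=(X_1(t),\dots,X_n(t))_{t\ge 0}$ has a unique invariant probability measure $\pi$ on $\mathbb{R}_+^{n,\circ}$ that is absolutely continuous with respect to the Lebesgue measure and for any $q>0$,

$$\lim_{t\to\infty}t^q\,\big\|P_X(t,x,\cdot)-\pi(\cdot)\big\|_{TV}=0,\quad x\in\mathbb{R}_+^{n,\circ}, \tag{2.10}$$

where $P_X(t,x,\cdot)$ is the transition probability of $(X(t))_{t\ge 0}$. Moreover, for any initial value $x\in\mathbb{R}_+^n\setminus\{0\}$ and any $\pi$-integrable function $f$ we have

$$\mathbb{P}\Big\{\lim_{T\to\infty}\frac{1}{T}\int_0^Tf\big(X^x(t)\big)\,dt=\int_{\mathbb{R}_+^{n,\circ}}f(u)\,\pi(du)\Big\}=1. \tag{2.11}$$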

Remark 2.6

Theorem 2.1 is a direct consequence of Theorem A.2, which will be proved in Appendix A. As a corollary we get the following result.

Definition 2.2

Following Roth and Schreiber (2014), we say that the model (2.1) is stochastically persistent if for all $\varepsilon>0$, there exists $\eta>0$ such that with probability one,

$$\Pi_t^{(x)}(S_\eta)\le\varepsilon$$

for $t$ sufficiently large and $x\in S_\eta\setminus\{0\}$.

Corollary 2.1

If Assumptions 2.2 and 2.3 hold, and r>0, then the process X(t) is stochastically persistent.

Proof

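By Theorem 2.1, we have that for all $x\in\mathbb{R}_+^{n,\circ}$,

$$\mathbb{P}\Big\{\Pi_t^{(x)}\Rightarrow\pi\ \text{as}\ t\to\infty\Big\}=1.$$

Since $\pi$ is supported on $\mathbb{R}_+^{n,\circ}$, we get the desired result.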

Biological interpretation of Theorem 2.1  The quantity $r$ is the Lyapunov exponent or stochastic growth rate of the total population process $(\mathcal{S}(t))_{t\ge 0}$ in the absence of competition. This number describes the long-term growth rate of the population in the presence of a stochastic environment. According to (2.8) $r$ can be written as the difference $\bar\mu-\frac{1}{2}\bar\sigma^2$ where

  • $\bar\mu$ is the average of per-capita growth rates with respect to the asymptotic distribution $\tilde Y(\infty)$ of the population in the absence of competition.

  • $\bar\sigma^2$ is the infinitesimal variance of the environmental stochasticity averaged according to the asymptotic distribution of the population in the absence of competition.

We note by (2.8) that $r$ depends on the dispersal matrix, the growth rates at 0 and the covariance matrix of the environmental noise. As such, the stochastic growth rate can change due to the dispersal strategy or environmental fluctuations.

When the stochastic growth rate of the population in absence of competition is strictly positive (i.e. $r>0$) our population is persistent in a strong sense: for any starting point $(X_1(0),\dots,X_n(0))=(x_1,\dots,x_n)\in\mathbb{R}_+^{n,\circ}$ the distribution of the population densities at time $t$ in the $n$ patches $(X_1(t),\dots,X_n(t))$ converges as $t\to\infty$ to the unique probability measure $\pi$ that is supported on $\mathbb{R}_+^{n,\circ}$.

Definition 2.3

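We say the population of patch $i$ goes extinct if for all $x\in\mathbb{R}_+^n\setminus\{0\}$

$$\mathbb{P}\Big\{\lim_{t\to\infty}X_i^x(t)=0\Big\}=1.$$

We say the population goes extinct if the populations from all the patches go extinct, that is if for all $x\in\mathbb{R}_+^n\setminus\{0\}$

$$\mathbb{P}\Big\{\lim_{t\to\infty}X^x(t)=0\Big\}=1.$$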

Theorem 2.2

Suppose that Assumptions 2.2 and 2.3 hold and that $r<0$. Then for any $i=1,\dots,n$ and any $x=(x_1,\dots,x_n)\in\mathbb{R}_+^{n,\circ}$,

$$\mathbb{P}\Big\{\lim_{t\to\infty}\frac{\ln X_i^x(t)}{t}=r\Big\}=1. \tag{2.12}$$

Biological interpretation of Theorem 2.2  If the stochastic growth rate of the population in the absence of competition is negative (i.e. $r<0$) the population densities of the $n$ patches $(X_1(t),\dots,X_n(t))$ go extinct exponentially fast with rate $r<0$ with probability 1 for any starting point $(X_1(0),\dots,X_n(0))=(x_1,\dots,x_n)\in\mathbb{R}_+^{n,\circ}$.

In Appendix A, we prove Theorem 2.1 while Theorem 2.2 is proven in Appendix B.

Degenerate noise

We consider the evolution of the process $(X(t))_{t\ge 0}$ given by (2.1) when Assumption 2.3 does not hold. If the covariance matrix $\Sigma=\Gamma^\top\Gamma$ coming from the Brownian motions $E(t)=(E_1(t),\dots,E_n(t))^\top=\Gamma^\top B(t)$ is singular, the environmental noise driving our SDEs has a lower dimension than the dimension $n$ of the underlying state space. It becomes much more complex to prove that our process is Feller and irreducible. In order to verify the Feller property, we have to verify the so-called Hörmander condition, and to verify the irreducibility, we have to investigate the controllability of a related control system.

We are able to prove the following extinction and persistence results.

Theorem 2.3

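Assume that $\tilde Y(t)$ has a unique invariant probability measure $\nu$. Define $r$ by (2.8). Suppose that $r<0$. Then for any $i=1,\dots,n$ and any $x=(x_1,\dots,x_n)\in\mathbb{R}_+^n$

$$\mathbb{P}\Big\{\lim_{t\to\infty}\frac{\ln X_i^x(t)}{t}=r\Big\}=1. \tag{2.13}$$

In particular, for any $i=1,\dots,n$ and any $x=(x_1,\dots,x_n)\in\mathbb{R}_+^n$

$$\mathbb{P}\Big\{\lim_{t\to\infty}X_i^x(t)=0\Big\}=1.$$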

Remark 2.7

The extra assumption in this setting is that the Markov process describing the proportions of the populations of the patches evolving without competition, $\tilde Y(t)$, has a unique invariant probability measure. In fact, we conjecture that $\tilde Y(t)$ always has a unique invariant probability measure. We were able to prove this conjecture when $n=2$ (see Remark 3.1 for details).

Theorem 2.4

Assume that $\tilde Y(t)$ has a unique invariant probability measure $\nu$. Define $r$ by (2.8). Suppose that Assumption 2.2 holds and that $r>0$. Assume further that there is a sufficiently large $T>0$ such that the Markov chain $(Y(kT),S(kT))_{k\in\mathbb{N}}$ is irreducible and aperiodic, and that every compact set in $\Delta\times(0,\infty)$ is petite for this Markov chain.

The process $X(t)=(X_1(t),\dots,X_n(t))_{t\ge 0}$ has a unique invariant probability measure $\pi$ on $\mathbb{R}_+^{n,\circ}$ that is absolutely continuous with respect to the Lebesgue measure and for any $q>0$,

$$\lim_{t\to\infty}t^q\,\big\|P_X(t,x,\cdot)-\pi(\cdot)\big\|_{TV}=0,\quad x\in\mathbb{R}_+^{n,\circ}, \tag{2.14}$$

where $\|\cdot\|_{TV}$ is the total variation norm and $P_X(t,x,\cdot)$ is the transition probability of $(X(t))_{t\ge 0}$. Moreover, for any initial value $x\in\mathbb{R}_+^n\setminus\{0\}$ and any $\pi$-integrable function $f$, we have

$$\mathbb{P}\Big\{\lim_{T\to\infty}\frac{1}{T}\int_0^Tf\big(X^x(t)\big)\,dt=\int_{\mathbb{R}_+^{n,\circ}}f(u)\,\pi(du)\Big\}=1. \tag{2.15}$$

Remark 2.8

We require as before that $\tilde Y(t)$ has a unique invariant probability measure. Furthermore, we require that there exists some time $T>0$ such that if we observe the process $(Y(t),S(t))$ at the fixed times $T,2T,3T,\dots,kT,\dots$ it is irreducible (loosely speaking this means that the process can visit any state) and aperiodic (returns to a given state occur at irregular times).

Case study: n=2

Note that Theorems 2.3 and 2.4 above have some extra assumptions. We exhibit how one can get these conditions explicitly as functions of the various parameters of the model. For the sake of a clean exposition we chose to fully treat the case when $n=2$ and $b_i(x)=b_ix$, $x\ge 0$, $i=1,2$, for some $b_1,b_2>0$ (each specific case would have to be studied separately as the computations change in each setting). As a result, (2.1) becomes

$$\begin{aligned}
dX_1(t)&=\big(X_1(t)(a_1-b_1X_1(t))-\alpha X_1(t)+\beta X_2(t)\big)dt+\sigma_1X_1(t)\,dB(t),\\
dX_2(t)&=\big(X_2(t)(a_2-b_2X_2(t))+\alpha X_1(t)-\beta X_2(t)\big)dt+\sigma_2X_2(t)\,dB(t),
\end{aligned}$$

where $\sigma_1,\sigma_2$ are non-zero constants and $(B(t))_{t\ge 0}$ is a one-dimensional Brownian motion. The Lyapunov exponent can now be expressed as (see Remark 3.1)

$$r=a_2-\frac{\sigma_2^2}{2}+(a_1-a_2+\sigma_2^2)\int_0^1y\,\rho_1(y)\,dy-\frac{(\sigma_1-\sigma_2)^2}{2}\int_0^1y^2\rho_1(y)\,dy, \tag{2.16}$$

where $\rho_1$ is given in (3.5) later.

If $\sigma_1=\sigma_2=:\sigma$, one has (see Remark 3.1, where the equilibrium $y^\star$ is identified)

$$r=a_2-\frac{\sigma^2}{2}+(a_1-a_2+\sigma^2)\,y^\star. \tag{2.17}$$

Theorem 2.5

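Define $r$ by (2.16) if $\sigma_1\ne\sigma_2$ and by (2.17) if $\sigma_1=\sigma_2=\sigma$. If $r<0$ then for any $i=1,2$ and any $x=(x_1,x_2)\in\mathbb{R}_+^2$

$$\mathbb{P}\Big\{\lim_{t\to\infty}\frac{\ln X_i^x(t)}{t}=r\Big\}=1. \tag{2.18}$$

Theorem 2.6

Suppose that $\sigma_1\ne\sigma_2$ or $\beta+(b_2/b_1)(a_1-a_2-\alpha+\beta)-\alpha(b_2/b_1)^2\ne 0$. Define $r$ as in Theorem 2.5. If $r>0$ then the conclusion of Theorem 2.4 holds.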

Remark 2.9

Once again the parameter $r$ tells us when the population goes extinct and when it persists. To obtain the conclusion of Theorem 2.4 when $r>0$, we need $\sigma_1\ne\sigma_2$ or $\beta+(b_2/b_1)(a_1-a_2-\alpha+\beta)-\alpha(b_2/b_1)^2\ne 0$. The condition $\sigma_1\ne\sigma_2$ tells us that the noise must at least differ through its variance. If $\sigma_1=\sigma_2$ then we require

$$a_1+\beta\,\frac{b_1+b_2}{b_2}\ne a_2+\alpha\,\frac{b_1+b_2}{b_1}.$$

The term $\beta\frac{b_1+b_2}{b_2}$ measures the dispersion rate of individuals from patch 2 to patch 1 averaged by the inverse relative competition strength of patch 2. In particular, if $b_1=b_2$ we have that

$$2(\beta-\alpha)\ne a_2-a_1,$$

that is, twice the difference of the dispersal rates cannot equal the difference of the growth rates. The dynamics of the system is very different if these conditions do not hold (see Sect. 3.2 and Theorem 2.7).
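The hypothesis of Theorem 2.6 is an explicit algebraic condition on the parameters, so it is easy to check for a given parameter set. The sketch below is an illustration only; the function name and the floating-point tolerance `tol` are assumptions.

```python
def theorem_2_6_condition_holds(a1, a2, b1, b2, alpha, beta, sigma1, sigma2, tol=1e-12):
    """Check: sigma1 != sigma2, or
    beta + (b2/b1)(a1 - a2 - alpha + beta) - alpha (b2/b1)^2 != 0.

    Equality is tested up to the tolerance `tol` (an assumption made to
    cope with floating-point rounding).
    """
    if abs(sigma1 - sigma2) > tol:
        return True
    ratio = b2 / b1
    value = beta + ratio * (a1 - a2 - alpha + beta) - alpha * ratio ** 2
    return abs(value) > tol


if __name__ == "__main__":
    # Hypothetical parameters with b1 = b2 and 2(beta - alpha) = a2 - a1,
    # which is the exceptional case treated in Theorem 2.7.
    print(theorem_2_6_condition_holds(1.0, 2.0, 1.0, 1.0, 0.5, 1.0, 0.3, 0.3))
```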

Theorem 2.7

Suppose that $\sigma_1=\sigma_2=\sigma$, $b_1=b_2=b$ and $2(\beta-\alpha)=a_2-a_1$. In this setting one can show that the stochastic growth rate is given by $r=a_1-\alpha+\beta-\frac{\sigma^2}{2}$. Assume that $(X_1(0),X_2(0))=x=(x_1,x_2)\in\mathbb{R}_+^{2,\circ}$ and let $U(t)$ be the solution to

$$dU(t)=U(t)\big(a_1-\alpha+\beta-bU(t)\big)dt+\sigma U(t)\,dB(t),\quad U(0)=x_2.$$

Then we get the following results

  1. If $x_1=x_2$ then $\mathbb{P}\big(X_1^x(t)=X_2^x(t)=U(t),\ t\ge 0\big)=1$.

  2. If $x_1\ne x_2$ then $\mathbb{P}\big(X_1^x(t)\ne X_2^x(t),\ t\ge 0\big)=1$.

  3. If $r<0$ then $X_1(t)$ and $X_2(t)$ converge to 0 exponentially fast. If $r>0$ then
    $$\mathbb{P}\Big\{\lim_{t\to\infty}\frac{X_1^x(t)}{U^x(t)}=\lim_{t\to\infty}\frac{X_2^x(t)}{U^x(t)}=1\Big\}=1.$$
    Thus, both $X_1(t)$ and $X_2(t)$ converge to a unique invariant probability measure $\rho$ on $(0,\infty)$, which is the invariant probability measure of $U(t)$. The invariant probability measure of $(X_1(t),X_2(t))_{t\ge 0}$ is concentrated on the one-dimensional manifold $\{x=(x_1,x_2)\in\mathbb{R}_+^{2,\circ}:x_1=x_2\}$.

The proof of Theorem 2.7 is presented in Sect. 3.2.
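The synchrony in part 1 of Theorem 2.7 can be observed in a simple simulation of the two-patch system driven by a single Brownian motion. The sketch below uses a plain Euler–Maruyama step with a small positive floor; the scheme, the floor, the function name and the parameter values are assumptions of this illustration and are not taken from the paper.

```python
import math
import random


def simulate_two_patch_shared_noise(a1, a2, b, alpha, beta, sigma,
                                    x1, x2, T, dt, seed=0):
    """Euler-Maruyama path of the two-patch system with one shared noise.

    Both patches use the same competition coefficient b and the same
    Brownian increment dB; values are floored at 1e-12 as a numerical
    safeguard (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n_steps = int(round(T / dt))
    path = [(x1, x2)]
    for _ in range(n_steps):
        dB = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        d1 = (x1 * (a1 - b * x1) - alpha * x1 + beta * x2) * dt + sigma * x1 * dB
        d2 = (x2 * (a2 - b * x2) + alpha * x1 - beta * x2) * dt + sigma * x2 * dB
        x1, x2 = max(x1 + d1, 1e-12), max(x2 + d2, 1e-12)
        path.append((x1, x2))
    return path


if __name__ == "__main__":
    # Hypothetical parameters satisfying 2(beta - alpha) = a2 - a1.
    path = simulate_two_patch_shared_noise(1.0, 2.0, 1.0, 0.5, 1.0, 0.3,
                                           0.5, 0.5, T=10.0, dt=0.001)
    print(max(abs(p[0] - p[1]) for p in path))
```

With equal initial abundances the two drifts coincide algebraically, so any gap between the simulated coordinates comes from floating-point rounding only.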

Robust persistence and extinction

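The model we work with is an approximation of the real biological models. As a result, it is relevant to see if 'close models' behave similarly to ours. This reduces to studying the robustness of our system. Consider the process

$$d\hat X_i(t)=\hat X_i(t)\big(\hat a_i-\hat b_i(\hat X_i(t))\big)dt+\sum_{j=1}^n\hat D_{ji}(\hat X(t))\hat X_j(t)\,dt+\hat X_i(t)\big[\hat\Gamma^\top(\hat X(t))\,dB(t)\big]_i, \tag{2.19}$$

where $\hat b(\cdot),\hat D(\cdot),\hat\Gamma(\cdot)$ are locally Lipschitz functions and $\hat D_{ij}(x)\ge 0$ for all $x\in\mathbb{R}_+^n$, $i\ne j$, and $\hat D_{ii}(x)=-\sum_{j\ne i}\hat D_{ij}(x)$. If there exists $\theta>0$ such that

$$\sup_{x\in\mathbb{R}_+^{n,\circ}}\Big(\|a-\hat a\|\vee\|b(x)-\hat b(x)\|\vee\|D-\hat D(x)\|\vee\|\Gamma-\hat\Gamma(x)\|\Big)<\theta, \tag{2.20}$$

then we call $\hat X$ a $\theta$-perturbation of $X$.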

Theorem 2.8

Suppose that the dynamics of $(X(t))_{t\ge 0}$ satisfy the assumptions of Theorem 2.1. Then there exists $\theta>0$ such that any $\theta$-perturbation $(\hat X(t))_{t\ge 0}$ of $(X(t))_{t\ge 0}$ is persistent. Moreover, the process $(\hat X(t))_{t\ge 0}$ has a unique invariant probability measure $\hat\pi$ on $\mathbb{R}_+^{n,\circ}$ that is absolutely continuous with respect to the Lebesgue measure and for any $q>0$

$$\lim_{t\to\infty}t^q\,\big\|P_{\hat X}(t,x,\cdot)-\hat\pi(\cdot)\big\|_{TV}=0,\quad x\in\mathbb{R}_+^{n,\circ},$$

where $P_{\hat X}(t,x,\cdot)$ is the transition probability of $(\hat X(t))_{t\ge 0}$.

Biological interpretation of Theorem 2.8  As long as the perturbation of our model is small, persistence does not change to extinction. Our model, even though it is only an approximation of reality, can provide relevant information regarding biological systems. Small enough changes in the growth rates, the competition rates, the dispersion matrix and the covariance matrix leave a persistent system persistent.

Theoretical and numerical examples

This subsection is devoted to some theoretical and numerical examples. We choose the dimension to be n=2, so that we can compute the stochastic growth rate explicitly.

Remark 3.1

If an explicit expression for $r$ is desirable, one needs to determine the first and second moments for the invariant probability measure $\nu$. One can show that $\rho$, the density of $\nu$ with respect to Lebesgue measure, satisfies

$$-\sum_i\frac{\partial}{\partial y_i}\big[\mu_i(y)\rho(y)\big]+\frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial y_i\,\partial y_j}\big[v_{ij}(y)\rho(y)\big]=0,\quad y\in\Delta, \tag{3.1}$$

where $\mu_i(y)$ and $v_{ij}(y)$ are the entries of

$$\mu(y)=D^\top y+\big(\mathrm{diag}(y)-yy^\top\big)(a-\Sigma y),\qquad v(y)=\big(\mathrm{diag}(y)-yy^\top\big)\Gamma^\top\Gamma\big(\mathrm{diag}(y)-yy^\top\big),$$

and $\rho$ is constrained by $\int_\Delta\rho(y)\,dy=1$ with appropriate boundary conditions. The boundary conditions are usually found by characterizing the domain of the infinitesimal generator of the Feller diffusion process $\tilde Y(t)$, which is usually a very difficult problem.

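However, following Evans et al. (2013), in the case of two patches ($n=2$) and non-degenerate noise the problem is significantly easier. Let $\Sigma=\mathrm{diag}(\sigma_1^2,\sigma_2^2)$. The system becomes

$$\begin{aligned}
dX_1(t)&=\big(X_1(t)(a_1-bX_1(t))-\alpha X_1(t)+\beta X_2(t)\big)dt+\sigma_1X_1(t)\,dB_1(t),\\
dX_2(t)&=\big(X_2(t)(a_2-bX_2(t))+\alpha X_1(t)-\beta X_2(t)\big)dt+\sigma_2X_2(t)\,dB_2(t).
\end{aligned} \tag{3.2}$$

It is easy to find the density $\rho_1$ of $\tilde Y_1(\infty)$ explicitly (by solving (3.1) and noting that 0 and 1 are both entrance boundaries for the diffusion $\tilde Y_1(t)$). Then

$$\rho_1(x)=Cx^{\beta-\alpha_1}(1-x)^{-\beta-\alpha_2}\exp\Big(-\frac{2}{\sigma_1^2+\sigma_2^2}\Big(\frac{\beta}{x}+\frac{\alpha}{1-x}\Big)\Big),\quad x\in(0,1),$$

where $C>0$ is a normalization constant and

$$\alpha_i:=\frac{2\sigma_i^2}{\sigma_1^2+\sigma_2^2},\ i=1,2,\qquad\beta:=\frac{2}{\sigma_1^2+\sigma_2^2}(a_1-a_2+\beta-\alpha).$$

One can then get the following explicit expression for the Lyapunov exponent

$$r=a_2-\frac{\sigma_2^2}{2}+(a_1-a_2+\sigma_2^2)\int_0^1y\,\rho_1(y)\,dy-\frac{\sigma_1^2+\sigma_2^2}{2}\int_0^1y^2\rho_1(y)\,dy. \tag{3.3}$$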

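Next, consider the degenerate case

$$\begin{aligned}
dX_1(t)&=\big(X_1(t)(a_1-b_1X_1(t))-\alpha X_1(t)+\beta X_2(t)\big)dt+\sigma_1X_1(t)\,dB(t),\\
dX_2(t)&=\big(X_2(t)(a_2-b_2X_2(t))+\alpha X_1(t)-\beta X_2(t)\big)dt+\sigma_2X_2(t)\,dB(t),
\end{aligned} \tag{3.4}$$

where $\sigma_1,\sigma_2$ are non-zero constants and $(B(t))_{t\ge 0}$ is a one-dimensional Brownian motion. Since $\tilde Y_1(t)+\tilde Y_2(t)=1$, to find the invariant probability measure of $\tilde Y(t)$, we only need to find the invariant probability measure of $\tilde Y_1(t)$.

If $\sigma_1\ne\sigma_2$ we can find the invariant density $\rho_1$ of $\tilde Y_1(\infty)$ explicitly (by solving (3.1)). Then

$$\rho_1(x)=Cx^{\hat\beta-\hat\alpha_1}(1-x)^{-\hat\beta-\hat\alpha_2}\exp\Big(-\frac{2}{(\sigma_1-\sigma_2)^2}\Big(\frac{\beta}{x}+\frac{\alpha}{1-x}\Big)\Big),\quad x\in(0,1), \tag{3.5}$$

where $C>0$ is a normalization constant and

$$\hat\alpha_1:=-\frac{2\sigma_1}{\sigma_1-\sigma_2},\qquad\hat\alpha_2:=\frac{2\sigma_2}{\sigma_1-\sigma_2},\qquad\hat\beta:=\frac{2}{(\sigma_1-\sigma_2)^2}(a_1-a_2+\beta-\alpha).$$

The Lyapunov exponent can now be expressed as

$$r=a_2-\frac{\sigma_2^2}{2}+(a_1-a_2+\sigma_2^2)\int_0^1y\,\rho_1(y)\,dy-\frac{(\sigma_1-\sigma_2)^2}{2}\int_0^1y^2\rho_1(y)\,dy.$$

We note that the structure of the stochastic growth rate $r$ for non-degenerate noise (3.3) and for degenerate noise (2.16) with $\sigma_1\ne\sigma_2$ is the same. The only difference is that one needs to make the substitution $\sigma_1^2+\sigma_2^2\to(\sigma_1-\sigma_2)^2$ and the changes in $\hat\alpha_i$.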

If σ1=σ2=:σ the system (2.6) for Y~(t)=(Y~1(t),Y~2(t)) can be written as

dY~1(t)=(Y~1(t)(a1-a1Y~1(t)-a2Y~2(t))-αY~1(t)+βY~2(t))dt+σ2Y~1(t)[(Y~1(t)+Y~2(t))2-(Y~1(t)+Y~2(t))2]dtdY~1(t)=(Y~2(t)(a2-a1Y~1(t)-a2Y~2(t))-βY~2(t)+αY~1(t))dt+σ2Y~2(t)[(Y~1(t)+Y~2(t))2-(Y~1(t)+Y~2(t))2]dt. 3.6

Using the fact that Y~1(t)+Y~2(t)=1 this reduces to

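$$d\tilde Y_1(t)=\big((a_1-a_2)[1-\tilde Y_1(t)]\tilde Y_1(t)+\beta-(\alpha+\beta)\tilde Y_1(t)\big)\,dt. \tag{3.7}$$

The unique equilibrium of (3.7) in $[0,1]$ is the root $y^*\in[0,1]$ of $(a_1-a_2)(1-y)y+\beta-(\alpha+\beta)y=0$. Hence, the unique invariant probability measure of $\tilde Y(t)$ in this case is the Dirac measure concentrated at $(y^*,1-y^*)$. Since $\tilde Y^\top\Sigma\tilde Y=\sigma^2$ on the simplex when $\sigma_1=\sigma_2=\sigma$, this gives

$$r=a_1y^*+a_2(1-y^*)-\frac{\sigma^2}{2}=a_2-\frac{\sigma^2}{2}+(a_1-a_2)y^*.$$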

The degenerate case σ1=σ2,α=β

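Consider the following system, where $\alpha,\sigma,a_i,b_i$, $i=1,2$ are positive constants.

$$\begin{aligned} dX_1(t)&=\big(X_1(t)(a_1-b_1X_1(t))-\alpha X_1(t)+\alpha X_2(t)\big)\,dt+\sigma X_1(t)\,dB(t)\\ dX_2(t)&=\big(X_2(t)(a_2-b_2X_2(t))+\alpha X_1(t)-\alpha X_2(t)\big)\,dt+\sigma X_2(t)\,dB(t). \end{aligned} \tag{3.8}$$

Suppose that $a_1\neq a_2$ or that $b_1\neq b_2$. This system is degenerate since both equations are driven by a single Brownian motion. In this case, the unique equilibrium of (3.7) in $[0,1]$ is the root $y^*\in[0,1]$ of $(a_1-a_2)(1-y)y+\alpha(1-2y)=0$. Solving this quadratic equation, we have

$$y^*=\frac{a_1-a_2-2\alpha+\sqrt{(a_1-a_2)^2+4\alpha^2}}{2(a_1-a_2)}\ \text{ if } a_1\neq a_2\quad\text{and}\quad y^*=\frac12\ \text{ if } a_1=a_2.$$

It can be proved easily that this equilibrium is asymptotically stable and that $\lim_{t\to\infty}\tilde Y_1(t)=y^*$. Thus, if $a_1\neq a_2$

$$r=a_1y^*+a_2(1-y^*)-\frac{\sigma^2}{2}=a_2+\frac{a_1-a_2-2\alpha+\sqrt{(a_1-a_2)^2+4\alpha^2}}{2}-\frac{\sigma^2}{2}=\frac{a_1+a_2-2\alpha+\sqrt{(a_1-a_2)^2+4\alpha^2}}{2}-\frac{\sigma^2}{2}.$$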

As a result

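$$r=\begin{cases}\dfrac{a_1+a_2-2\alpha+\sqrt{(a_1-a_2)^2+4\alpha^2}}{2}-\dfrac{\sigma^2}{2} & \text{if } a_1\neq a_2,\ b_1=b_2\\[2mm] a_1-\dfrac{\sigma^2}{2} & \text{if } a_1=a_2,\ b_1\neq b_2.\end{cases} \tag{3.9}$$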
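The closed forms for the equilibrium fraction and for $r$ in the case $\sigma_1=\sigma_2$, $\alpha=\beta$ are easy to check numerically. The sketch below is only an illustration; the function names are hypothetical and the parameter values used afterwards are arbitrary choices.

```python
import math


def equilibrium_fraction(a1, a2, alpha):
    """Root in [0, 1] of (a1 - a2)(1 - y) y + alpha (1 - 2 y) = 0."""
    d = a1 - a2
    if d == 0:
        return 0.5
    return (d - 2.0 * alpha + math.sqrt(d * d + 4.0 * alpha * alpha)) / (2.0 * d)


def growth_rate_from_equilibrium(a1, a2, alpha, sigma_sq):
    """r = a1 y + a2 (1 - y) - sigma^2 / 2, evaluated at the equilibrium y."""
    y = equilibrium_fraction(a1, a2, alpha)
    return a1 * y + a2 * (1.0 - y) - sigma_sq / 2.0


def growth_rate_closed_form(a1, a2, alpha, sigma_sq):
    """Closed-form expression for r when sigma_1 = sigma_2 and alpha = beta."""
    root = math.sqrt((a1 - a2) ** 2 + 4.0 * alpha ** 2)
    return (a1 + a2 - 2.0 * alpha + root) / 2.0 - sigma_sq / 2.0
```

For instance, with $a_1=3$, $a_2=4$, $\alpha=1$ and $\sigma^2=7$ both routes give $r=(\sqrt5-2)/2$, and the computed fraction satisfies the quadratic equation up to rounding.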

Note that if $a_1\neq a_2$ and $b_1=b_2$

$$\alpha+(b_2/b_1)(a_1-a_2)-\alpha(b_2/b_1)^2=a_1-a_2\neq0$$

and that if $a_1=a_2$ and $b_1\neq b_2$

$$\alpha+(b_2/b_1)(a_1-a_2)-\alpha(b_2/b_1)^2=\alpha\big(1-(b_2/b_1)^2\big)\neq0.$$

Therefore, the assumptions of Theorem 2.6 hold. If r<0, by Theorem 2.5 the population goes extinct, while if r>0, the population persists by Theorem 2.6.

The degenerate case when the conditions of Theorem 2.6 are violated

We analyse the system

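$$\begin{aligned} dX_1(t)&=\big(X_1(t)(a_1-bX_1(t))-\alpha X_1(t)+\beta X_2(t)\big)\,dt+\sigma X_1(t)\,dB(t)\\ dX_2(t)&=\big(X_2(t)(a_2-bX_2(t))+\alpha X_1(t)-\beta X_2(t)\big)\,dt+\sigma X_2(t)\,dB(t), \end{aligned} \tag{3.10}$$

when $2(\beta-\alpha)=a_2-a_1$. In this case $\sigma_1=\sigma_2=\sigma$,

$$\beta+(b_2/b_1)(a_1-a_2-\alpha+\beta)-\alpha(b_2/b_1)=0$$

and

$$r=a_1-\alpha+\beta-\frac{\sigma^2}{2}.$$

If $r<0$ then $\lim_{t\to\infty}X_1(t)=\lim_{t\to\infty}X_2(t)=0$ almost surely as the result of Theorem 2.5.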

We focus on the case r>0 and show that some of the results violate the conclusions of Theorem 2.6.

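If we set $Z(t)=X_1(t)/X_2(t)$ then (see (C.6))

$$dZ(t)=\big(b(1-Z(t))Z(t)X_2(t)+\beta+\hat a_1Z(t)-\alpha Z^2(t)\big)\,dt.$$

Noting that $\hat a_1=a_1-a_2-\alpha+\beta=\alpha-\beta$ yields

$$d(Z(t)-1)=\big(-b(Z(t)-1)Z(t)X_2(t)-(Z(t)-1)(\alpha Z(t)+\beta)\big)\,dt.$$

Assume $Z(0)\neq1$ and without loss of generality suppose $Z(0)>1$. This implies

$$Z(t)-1=(Z(0)-1)\exp\left(-\int_0^t\big[bZ(s)X_2(s)+(\alpha Z(s)+\beta)\big]\,ds\right). \tag{3.11}$$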

Since $Z(t)$ and $X_2(t)$ do not explode to $\pm\infty$ in finite time we can conclude that if $Z(0)\neq1$ then $Z(t)\neq1$ for any $t\ge0$ with probability 1. In other words, if $x=(x_1,x_2)\in\mathbb{R}_+^{2,\circ}$ with $x_1\neq x_2$ then

$$P\big(X_1^x(t)=X_2^x(t)\ \text{for some } t\ge0\big)=0.$$

One can further see from (3.11) that $Z(t)-1$ tends to $0$ exponentially fast. If $Z(0)=1$ let $X_1(0)=X_2(0)=x>0$. Similar arguments to the above show that

$$P\big(X_1^x(t)\neq X_2^x(t)\ \text{for some } t\ge0\big)=0.$$

To gain more insight into the asymptotic properties of (X1(t),X2(t)), we study

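$$\begin{aligned} dX_2(t)&=X_2(t)\big((\hat a_2-bX_2(t))+\alpha Z(t)\big)\,dt+\sigma X_2(t)\,dB(t)\\ &=X_2(t)\big(a_1-\alpha+\beta-bX_2(t)+\alpha(Z(t)-1)\big)\,dt+\sigma X_2(t)\,dB(t). \end{aligned}$$

We have from Itô's formula that

$$d\frac{1}{X_2(t)}=\left(b+\big(-a_1+\alpha-\beta+\sigma^2-\alpha(Z(t)-1)\big)\frac{1}{X_2(t)}\right)dt-\sigma\frac{1}{X_2(t)}\,dB(t).$$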

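By the variation-of-constants formula (see Mao 1997, Section 3.4), we have

$$\frac{1}{X_2(t)}=\phi^{-1}(t)\left[\frac{1}{x_2}+b\int_0^t\phi(s)\,ds\right]$$

where

$$\phi(t):=\exp\left(rt+\alpha\int_0^t(Z(s)-1)\,ds+\sigma B(t)\right).$$

Thus,

$$X_2(t)=\frac{\phi(t)}{x_2^{-1}+b\int_0^t\phi(s)\,ds}.$$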

It is well-known that

$$U(t):=\frac{e^{rt+\sigma B(t)}}{x_2^{-1}+b\int_0^te^{rs+\sigma B(s)}\,ds},$$

is the solution to the stochastic logistic equation

$$dU(t)=U(t)\big(a_1-\alpha+\beta-bU(t)\big)\,dt+\sigma U(t)\,dB(t),\quad U(0)=x_2.$$

By the law of the iterated logarithm, almost surely

$$\lim_{t\to\infty}\phi(t)=\lim_{t\to\infty}e^{rt+\sigma B(t)}=\infty. \tag{3.12}$$

We have

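$$\frac{X_2(t)}{U(t)}=\exp\left(\alpha\int_0^t(Z(s)-1)\,ds\right)\frac{x_2^{-1}+b\int_0^te^{rs+\sigma B(s)}\,ds}{x_2^{-1}+b\int_0^t\phi(s)\,ds}.$$

In view of (3.12), we can use L'Hôpital's rule to obtain

$$\begin{aligned}\lim_{t\to\infty}\frac{X_2(t)}{U(t)}&=\lim_{t\to\infty}\frac{\exp\left(\alpha\int_0^t(Z(s)-1)\,ds\right)e^{rt+\sigma B(t)}}{\phi(t)}+\lim_{t\to\infty}\frac{\alpha(Z(t)-1)\exp\left(\alpha\int_0^t(Z(s)-1)\,ds\right)\left[x_2^{-1}+b\int_0^te^{rs+\sigma B(s)}\,ds\right]}{b\phi(t)}\\ &=1+\lim_{t\to\infty}\frac{\alpha(Z(t)-1)\left[x_2^{-1}+b\int_0^te^{rs+\sigma B(s)}\,ds\right]}{be^{rt+\sigma B(t)}}\end{aligned} \tag{3.13}$$

almost surely. By the law of the iterated logarithm, $\lim_{t\to\infty}\frac{e^{rt+\sigma B(t)}}{e^{(r-\varepsilon)t}}=\infty$ and $\lim_{t\to\infty}\frac{e^{rt+\sigma B(t)}}{e^{(r+\varepsilon)t}}=0$ for any $\varepsilon>0$. Applying this and (3.11) to (3.13), it is easy to show that with probability 1

$$\lim_{t\to\infty}\frac{X_2(t)}{U(t)}=1.$$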

Since $\lim_{t\to\infty}Z(t)=1$ almost surely, we also have $\lim_{t\to\infty}\frac{X_1(t)}{U(t)}=1$ almost surely. Thus, the long term behavior of $X_1(t)$ and $X_2(t)$ is governed by the one-dimensional diffusion $U(t)$. In particular, both $X_1(t)$ and $X_2(t)$ converge to a unique invariant probability measure $\rho$ on $(0,\infty)$, which is the invariant probability measure of $U(t)$. In this case, the invariant probability measure of $X(t)=(X_1(t),X_2(t))_{t\ge0}$ is not absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}_+^{2,\circ}$. Instead, the invariant probability measure is concentrated on the one-dimensional manifold $\{x=(x_1,x_2)\in\mathbb{R}_+^{2,\circ}:x_1=x_2\}$.
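The collapse onto the diagonal can be observed in a simulation. The following sketch is an illustration only and not part of the original analysis: it discretizes (3.10) and the logistic equation for $U(t)$ with an exponential (log-scale) Euler step driven by the same Brownian increments, which keeps all three paths positive. The function name, the default parameter values, the step size and the seed are all assumptions chosen for the example; $a_2$ is computed from the constraint $2(\beta-\alpha)=a_2-a_1$.

```python
import math
import random


def simulate_degenerate_two_patch(a1=1.0, alpha=0.5, beta=1.0, b=1.0, sigma=0.5,
                                  x1=2.0, x2=0.5, horizon=20.0, dt=1e-3, seed=0):
    """Simulate (3.10) together with U(t), all driven by one Brownian motion.

    Returns the terminal values (X1, X2, U). Illustrative scheme only.
    """
    a2 = a1 + 2.0 * (beta - alpha)  # enforces 2 (beta - alpha) = a2 - a1
    c = a1 - alpha + beta           # drift constant of the logistic diffusion U
    rng = random.Random(seed)
    u = x2                          # U(0) = x2, as in the text
    for _ in range(int(round(horizon / dt))):
        dw = rng.gauss(0.0, math.sqrt(dt))
        noise = -0.5 * sigma ** 2 * dt + sigma * dw  # shared by X1, X2 and U
        g1 = a1 - b * x1 - alpha + beta * x2 / x1    # per-capita drift of X1
        g2 = a2 - b * x2 + alpha * x1 / x2 - beta    # per-capita drift of X2
        gu = c - b * u                               # per-capita drift of U
        x1, x2, u = (x1 * math.exp(g1 * dt + noise),
                     x2 * math.exp(g2 * dt + noise),
                     u * math.exp(gu * dt + noise))
    return x1, x2, u
```

Because the noise term is common to both patches, it cancels in the ratio $X_1/X_2$, so the simulated ratio approaches $1$ along every sample path, and the ratios $X_i/U$ approach $1$ as well.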

Biological interpretation The stochastic growth rate in this degenerate setting is given by $r=a_1-\alpha+\beta-\frac{\sigma^2}{2}$. We note that this term is equal to the stochastic growth rate of patch 1, $a_1-\frac{\sigma^2}{2}$, to which we add $\beta$, the rate of dispersal from patch 2 to patch 1, and subtract $\alpha$, the rate of dispersal from patch 1 to patch 2. When

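$$a_1-\frac{\sigma^2}{2}>\alpha-\beta$$

one has persistence, while when

$$a_1-\frac{\sigma^2}{2}<\alpha-\beta$$

one has extinction. In particular, if the patches on their own are sink patches so that $a_1-\frac{\sigma^2}{2}<0$ and $a_2-\frac{\sigma^2}{2}<0$ dispersion cannot lead to persistence since

$$a_1-\frac{\sigma^2}{2}>\alpha-\beta\quad\text{and}\quad a_2-\frac{\sigma^2}{2}>\beta-\alpha$$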

cannot hold simultaneously. The behavior of the system when $r>0$ is different from the behavior in the non-degenerate setting of Theorem 2.1 or the degenerate setting of Theorem 2.6. Namely, if the patches start with equal populations then the patch abundances remain equal for all times and evolve according to the one-dimensional logistic diffusion $U(t)$. If the patches start with different population abundances then $X_1(t)$ and $X_2(t)$ are never equal but tend to each other asymptotically as $t\to\infty$. Furthermore, the long term behavior of $X_1(t)$ and $X_2(t)$ is once again determined by the logistic diffusion $U(t)$ as almost surely $\frac{X_i(t)}{U(t)}\to1$ as $t\to\infty$. As such, if $r>0$ we have persistence but the invariant measure the system converges to does not have $\mathbb{R}_+^{2,\circ}$ as its support anymore. Instead the invariant measure has the line $\{x=(x_1,x_2)\in\mathbb{R}_+^{2,\circ}:x_1=x_2\}$ as its support.

Example 3.1

We discuss the case when $a_1\neq a_2$ and $\sigma_1=\sigma_2$. The stochastic growth rate can be written by the analysis in the sections above as

$$r=\begin{cases}\dfrac{a_1+a_2-2\alpha+\sqrt{(a_1-a_2)^2+4\alpha^2}}{2}-\dfrac{\sigma^2}{2} & \text{if } \alpha=\beta,\ b_1=b_2\\[2mm] a_1-\alpha+\beta-\dfrac{\sigma^2}{2} & \text{if } a_2-a_1=2(\beta-\alpha),\ b_1=b_2.\end{cases} \tag{3.14}$$

Biological interpretation In the case when $a_1=a_2$, $\sigma_1=\sigma_2$ and $b_1\neq b_2$ (so that the two patches only differ in their competition rates) the stochastic growth rate $r$ does not depend on the dispersal rate $\alpha$. The system behaves just as a single-patch system with stochastic growth rate $a_1-\frac{\sigma^2}{2}$. In contrast to Evans et al. (2013, Example 1) coupling two sink patches by dispersion cannot yield persistence.

However, if the growth rates of the patches are different, $a_1\neq a_2$, then the expression for $r$ given in (3.14) yields for $\alpha\gg|a_1-a_2|$ that

$$r\approx\frac{a_1+a_2}{2}-\frac{\sigma^2}{2}+\frac{(a_1-a_2)^2}{8\alpha}.$$

In particular

$$\lim_{\alpha\to\infty}r(\alpha)=\frac{a_1+a_2}{2}-\frac{\sigma^2}{2}.$$

We note that $r$ is a decreasing function of the dispersal rate $\alpha$ for large values of $\alpha$ (also see Fig. 1). This is different from the result of Evans et al. (2013, Example 1) where $r$ was shown to be an increasing function of $\alpha$. In contrast to the non-degenerate case, coupling patches by dispersal decreases the stochastic growth rate and as such makes persistence less likely. This highlights the negative effect of spatial correlations on population persistence and why one may no longer get the rescue effect. This is one of our main biological conclusions. Furthermore, we also recover that dispersal has a negative impact on the stochastic growth rate when there is spatial heterogeneity (i.e. $a_1\neq a_2$). This fact has a long history, going back to the work by Karlin (1982).
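The monotonicity in $\alpha$ and the large-$\alpha$ approximation can be checked directly from the first line of (3.14). The sketch below is illustrative; the function names are hypothetical, and the parameter values $a_1=3$, $a_2=4$, $\sigma^2=7$ used afterwards are simply those listed in the caption of Fig. 1.

```python
import math


def growth_rate_equal_dispersal(a1, a2, alpha, sigma_sq):
    """First line of (3.14): sigma_1 = sigma_2, alpha = beta, b_1 = b_2."""
    root = math.sqrt((a1 - a2) ** 2 + 4.0 * alpha ** 2)
    return (a1 + a2 - 2.0 * alpha + root) / 2.0 - sigma_sq / 2.0


def large_dispersal_approximation(a1, a2, alpha, sigma_sq):
    """Approximation of r valid for alpha much larger than |a1 - a2|."""
    return (a1 + a2) / 2.0 - sigma_sq / 2.0 + (a1 - a2) ** 2 / (8.0 * alpha)


def is_decreasing_on_grid(a1, a2, sigma_sq, alphas):
    """Check that r(alpha) strictly decreases along an increasing grid."""
    values = [growth_rate_equal_dispersal(a1, a2, al, sigma_sq) for al in alphas]
    return all(later < earlier for earlier, later in zip(values, values[1:]))
```

With these parameters $r(\alpha)=\big(\sqrt{1+4\alpha^2}-2\alpha\big)/2$, which is positive, strictly decreasing in $\alpha$, and tends to $(a_1+a_2)/2-\sigma^2/2=0$.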

Fig. 1

Consider (3.2) when $\alpha=\beta$ and the Brownian motions $B_1$ and $B_2$ are assumed to have correlation $\rho$. The graphs show the stochastic growth rate $r$ as a function of the dispersal rate $\alpha$ for different values of the correlation. Note that if $\rho=0$ we get the setting when the Brownian motions of the two patches are independent, while when $\rho=1$ we get that one Brownian motion drives the dynamics of both patches. The parameters are $\alpha=\beta$, $a_1=3$, $a_2=4$, $\sigma_1^2=\sigma_2^2=7$.

Discussion and generalizations

For numerous models of population dynamics it is natural to assume that time is continuous. One reason for this is that often environmental conditions change continuously with time and therefore can naturally be described by continuous time models. There have been a few papers dedicated to the study of stochastic differential equation models of interacting, unstructured populations in stochastic environments (see Benaïm et al. 2008; Schreiber et al. 2011; Evans et al. 2015). These models however do not account for population structure or correlated environmental fluctuations.

Examples of structured populations can be found by looking at a population in which individuals can live in one of $n$ patches (e.g. fish swimming between basins of a lake or butterflies dispersing between meadows). Dispersion is viewed by many population biologists as an important mechanism for survival. Not only does dispersion allow individuals to escape unfavorable landscapes (due to environmental changes or lack of resources), it also helps populations smooth out local spatio-temporal environmental changes. Patch models of dispersion have been studied extensively in the deterministic setting (see for example Hastings 1983; Cantrell et al. 2012). In the stochastic setting, there have been results for discrete time and space by Benaïm and Schreiber (2009), for continuous time and discrete space by Evans et al. (2013) and for structured populations that evolve continuously both in time and space.

We analyze the dynamics of a population that is spread throughout n patches, evolves in a stochastic environment (that can be spatially correlated), disperses among the patches and whose members compete with each other for resources. We characterize the long-term behavior of our system as a function of r—the growth rate in the absence of competition. The quantity r is also the Lyapunov exponent of a suitable linearization of the system around 0. Our analysis shows that r<0 implies extinction and r>0 persistence. The limit case r=0 cannot be analyzed in our framework. We expect that new methods have to be developed in order to tackle the r=0 scenario.

Since mathematical models are always approximations of nature it is necessary to study how the persistence and extinction results change under small perturbations of the parameters of the models. The concept of robust persistence (or permanence) has been introduced by Hutson and Schmitt (1992). They showed that for certain systems persistence holds even when one has small perturbations of the growth functions. There have been results on robust persistence in the deterministic setting for Kolmogorov systems by Schreiber (2000) and Garay and Hofbauer (2003). Recently, robust permanence for deterministic Kolmogorov equations with respect to perturbations in both the growth functions and the feedback dynamics has been analyzed by Patel and Schreiber (2016). In the stochastic differential equations setting results on robust persistence and extinction have been proven by Schreiber et al. (2011) and Benaïm et al. (2008). We prove analogous results in our framework where the populations are coupled by dispersal. For robust persistence we show in Appendix D that even with density-dependent perturbations of the growth rates, dispersion matrix and environmental covariance matrix, if these perturbations are sufficiently small and if the unperturbed system is persistent then the perturbed system is also persistent. In the case of extinction we can prove robustness when there are small constant perturbations of the growth rates, dispersal matrices and covariance matrices.

In ecology there has been an increased interest in the spatial synchrony present in population dynamics. This refers to the changes in the time-dependent characteristics (i.e. abundances etc.) of structured populations. One of the mechanisms which creates synchrony is the dependence of the population dynamics on a synchronous random environmental factor such as temperature or rainfall. The synchronizing effect of environmental stochasticity, or the so-called Moran effect, has been observed in multiple population models. Usually this effect is the result of random but correlated weather effects acting on spatially structured populations. Following Legendre (1993) one could argue that our world is a spatially correlated one. For many biotic and abiotic factors, like population density, temperature or growth rate, values at close locations are usually similar. For an in-depth analysis of spatial synchrony see Kendall et al. (2000) and Liebhold et al. (2004). Most stochastic differential models appearing in population dynamics treat only the case when the noise is non-degenerate (although see Rudnicki 2003; Dieu et al. 2016). This simplifies the technical proofs significantly. However, from a biological point of view it is not clear that the noise should never be degenerate. For example if one models a system with multiple populations then all populations can be influenced by the same factors (a disease, changes in temperature and sunlight etc.). Environmental factors can intrinsically create spatial correlations and as such it makes sense to study how these degenerate systems compare to the non-degenerate ones. In our setting the $n$ different patches could be strongly spatially correlated. Actually, in some cases it could be more realistic to have the same one-dimensional Brownian motion $(B_t)_{t\ge0}$ driving the dynamics of all patches. We were able to find conditions under which the proofs from the non-degenerate case can be generalized to the degenerate setting.
This is a first step towards a model that tries to explain the complex relationship between dispersal, stochastic environments and spatial correlations.

We fully analyze what happens if there are only two patches, $n=2$, and the noise is degenerate. Our results show unexpectedly, and in contrast to the non-degenerate results by Evans et al. (2013), that coupling two sink patches cannot yield persistence. More generally, we show that the stochastic growth rate is a decreasing function of the dispersal rate. In specific instances of the degenerate setting, even when there is persistence, the invariant probability measure the system converges to does not have $\mathbb{R}_+^{2,\circ}$ as its support. Instead, the abundances of the two patches converge to an invariant probability measure supported on the line $\{x=(x_1,x_2)\in\mathbb{R}_+^{2,\circ}:x_1=x_2\}$. These examples show that degenerate noise is not just an added technicality: the results can be completely different from those in the non-degenerate setting. The negative effect of spatial correlations (including the fully degenerate case) has been studied in several papers for discrete-time models (see Schreiber 2010; Harrison and Quinn 1989; Palmqvist and Lundberg 1998; Bascompte et al. 2002; Roy et al. 2005). The negative impact of dispersal on the stochastic growth rate $r$ when there is spatial heterogeneity (i.e. $a_1\neq a_2$) has a long history going back to the work of Karlin (1982) on the Reduction Principle. Following Altenberg (2012) the reduction principle can be stated as the widely exhibited phenomenon that mixing reduces growth, and differential growth selects for reduced mixing. The first use of this principle in the study of the evolution of dispersal can be found in Hastings (1983). The work of Kirkland et al. (2006) provides an independent proof of the Reduction Principle and applications to nonlinear competing species in discrete-time, discrete-space models. In the case of continuous-time, discrete-space models (given by branching processes) a version of the Reduction Principle is analysed by Schreiber and Lloyd-Smith (2009).

k species competing and dispersing in n patches

Real populations do not evolve in isolation and as a result much of ecology is concerned with understanding the characteristics that allow two species to coexist, or one species to take over the habitat of another. It is of fundamental importance to understand what will happen to an invading species. Will it invade successfully or die out in the attempt? If it does invade, will it coexist with the native population? Mathematical models for invasibility have contributed significantly to the understanding of the epidemiology of infectious disease outbreaks (Cross et al. 2005) and ecological processes (Law and Morton 1996; Caswell 2001). There is widespread empirical evidence that heterogeneity, arising from abiotic (precipitation, temperature, sunlight) or biotic (competition, predation) factors, is important in determining invasibility (Davies et al. 2005; Pyšek and Hulme 2005). However, few theoretical studies have investigated this; see, e.g., Schreiber and Lloyd-Smith (2009), Schreiber and Ryan (2011) and Schreiber (2012).

In this paper we have considered the dynamics of one population that disperses through $n$ patches. One possible generalization would be to look at $k$ populations $(X^1,\dots,X^k)$ that compete with each other for resources, have different dispersion strategies and possibly experience the environmental noise differently. Looking at such a model could shed light upon fundamental problems regarding invasions in spatio-temporally heterogeneous environments.

The extension of our results to competition models could lead to the development of a stochastic version of the treatment of the evolution of dispersal developed for patch models in the deterministic setting by Hastings (1983) and Cantrell et al. (2012). In the current paper we have focused on how spatio-temporal variation influences the persistence and extinction of structured populations. In a follow-up paper we intend to look at the dispersal strategies in terms of evolutionarily stable strategies (ESS) which can be characterized by showing that a population having a dispersal strategy (Dij) cannot be invaded by any other population having a different dispersal strategy (D~ij). The first thing to check would be whether this model has ESS and, if they exist, whether they are unique. One might even get that there are no ESS in our setting. For example, Schreiber and Li (2011) show that there exist no ESS for periodic non-linear models and instead one gets a coalition of strategies that act as an ESS. We expect to be able to generalize the results of Cantrell et al. (2012) to a stochastic setting using the methods from this paper.

Acknowledgements

We thank Sebastian J. Schreiber and three anonymous referees for their detailed comments which helped improve this manuscript.

Appendix A: The case r>0

The next sequence of lemmas and propositions is used to prove Theorem 2.1. We start by showing that our processes are well-defined Markov processes.

Proposition A.1

The SDE (stochastic differential equation) defined by (2.1) has unique strong solutions $X(t)=(X_1(t),\dots,X_n(t))$, $t\ge0$ for any $x=(x_1,\dots,x_n)\in\mathbb{R}_+^n$. Furthermore, $X(t)$ is a strong Markov process with the Feller property, is irreducible on $\mathbb{R}_+^n\setminus\{0\}$ and $P\{X_i(t)>0,\ t>0,\ i=1,\dots,n\}=1$ if $X(0)\in\mathbb{R}_+^n\setminus\{0\}$.

Proof

Since the coefficients of (2.1) are locally Lipschitz, there exists a unique local solution to (2.1) with a given initial value. In other words, for any initial value, there is a stopping time $\tau_e>0$ and a process $(X(t))_{t\ge0}$ satisfying (2.1) up to $\tau_e$ and $\lim_{t\to\tau_e}\|X(t)\|=\infty$ (see e.g. Khasminskii 2012, Section 3.4). Clearly, if $X(0)=0$ then $X(t)=0$, $t\in[0,\tau_e)$, which implies that $\tau_e=\infty$. By a comparison theorem for SDEs (see Geiß and Manthey (1994, Theorem 1.2) and Remark A.2 below),

PXi(t)<Xi(t),t(0,τe),i=1,,n=1ifXi(0)=Xi(0)Mb A.1

where (Xi(t))t0 is given by (2.7). Since (2.7) has a global solution due to the Lipschitz property of its coefficients, we have from (A.1) that τe= almost surely. Define the process

dX¯i(t)=-3ai2X¯i(t)+j=1nDjiX¯j(t)dt+X¯i(t)dEi(t),i=1,,n.

Since the bis are continuous and vanish at 0, there exists r>0 such that for |x|r we have

-3ai2ai-bi(xi),i=1,,n. A.2

Let τ be the stopping time

τ:=inft:X¯(t)>r A.3

Now, consider the case X(0)R+n\{0}. By Evans et al. (2013, Proposition 3.1), (A.2), (A.3) and a comparison argument (see Remark A.2 and the proof of Evans et al. (2015, Theorem 4.1)), we can show that

PXiX¯i(t)>0,t(0,τ)=1,

which implies

PXi(t)>0,t(0,)=1for allX(0)R+n\{0}. A.4

Moreover, since P0Xi(t)<Xi(t)for allt0,i=1,,n=1, we can use standard arguments (e.g., Mao 1997, Theorem 2.9.3) to obtain the Feller property of the solution to (2.1).

Remark A.1

There are different possible definitions of “Feller” in the literature. What we mean by Feller is that the semigroup $(T_t)_{t\ge0}$ of the process maps the set of bounded continuous functions $C_b(\mathbb{R}_+^n)$ into itself, i.e.

$$T_t(C_b(\mathbb{R}_+^n))\subset C_b(\mathbb{R}_+^n),\quad t\ge0.$$

Definition A.1

We call a mapping $f:\mathbb{R}^d\to\mathbb{R}^d$ quasi-monotonously increasing, if for $j=1,\dots,d$

$$f_j(x)\le f_j(y),$$

whenever $x_j=y_j$ and $x_l\le y_l$, $l\neq j$.
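Definition A.1 can be probed numerically for a concrete drift. The sketch below is an illustration under stated assumptions: it uses a patch drift of the form $f_j(x)=x_j(a_j-b_jx_j)+\sum_iD_{ij}x_i$ with linear competition (a special case chosen for the example), and a randomized test that can refute quasi-monotonicity but cannot prove it. All function names are hypothetical.

```python
import random


def patch_drift(x, a, b, D):
    """Drift f_j(x) = x_j (a_j - b_j x_j) + sum_i D[i][j] x_i (illustrative form)."""
    n = len(x)
    return [
        x[j] * (a[j] - b[j] * x[j]) + sum(D[i][j] * x[i] for i in range(n))
        for j in range(n)
    ]


def looks_quasi_monotone(f, dim, trials=2000, seed=0):
    """Randomized check of Definition A.1 on the box [0, 10]^dim.

    Returns False as soon as a violating pair (x, y) is found.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        j = rng.randrange(dim)
        x = [rng.uniform(0.0, 5.0) for _ in range(dim)]
        y = [xi + rng.uniform(0.0, 5.0) for xi in x]  # y >= x componentwise
        y[j] = x[j]                                   # equal in coordinate j
        if f(x)[j] > f(y)[j] + 1e-12:
            return False
    return True
```

For a dispersal matrix with non-negative off-diagonal entries the check passes, since raising $x_i$ for $i\neq j$ can only increase $f_j$; flipping the sign of the off-diagonal entries produces a violation.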

Remark A.2

One often wants to apply the well-known comparison theorem for one-dimensional SDEs (see Ikeda and Watanabe 1989) to a multidimensional setting. Below we explain why we can make use of comparison theorems for stochastic differential equations in our setting. Consider the following two systems

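$$dR_j(t)=a_j(t,R(t))\,dt+\sum_{k=1}^r\sigma_{jk}(t,R(t))\,dW_k(t) \tag{A.5}$$

and

$$dS_j(t)=b_j(t,S(t))\,dt+\sum_{k=1}^r\sigma_{jk}(t,S(t))\,dW_k(t) \tag{A.6}$$

for $j=1,\dots,d$, $t\ge0$ together with the initial condition

$$R_j(0)\le S_j(0),\ j=1,\dots,d\quad P\text{-a.s.}, \tag{A.7}$$

where $W=(W_1(t),\dots,W_r(t))_{t\ge0}$ is an $r$-dimensional standard Brownian motion, and the coefficients $a_i,b_i,\sigma_{jk}$ are continuous mappings on $\mathbb{R}_+\times\mathbb{R}^d$. Suppose (A.5) and (A.6) have explosion times $\theta_R,\theta_S$.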

Let (C0), (C1), and (C2) be the following conditions.

  • (C0)

    The solution to (A.5) is pathwise unique and the drift coefficient $a(t,x)$ is quasi-monotonously (see Definition A.1) increasing with respect to $x$.

  • (C1)
    For every $t\ge0$, $j=1,\dots,d$ and $x\in\mathbb{R}^d$ the following inequality holds
    $a_j(t,x)\le b_j(t,x)$.
  • (C2)
    There exists a strictly increasing function $\rho:\mathbb{R}_+\to\mathbb{R}_+$ with $\rho(0)=0$ and
    $\int_{0+}\frac{1}{\rho^2(u)}\,du=\infty$
    such that for each $j=1,\dots,d$
    $\sum_{k=1}^r|\sigma_{jk}(t,x)-\sigma_{jk}(t,y)|\le\rho(|x_j-y_j|)$ for all $t\ge0$, $x,y\in\mathbb{R}^d$.

Sometimes it is assumed incorrectly that conditions (C1) and (C2) suffice to conclude that $P\{R(t)\le S(t),\ t\in[0,\theta_R\wedge\theta_S)\}=1$. Some illuminating counterexamples regarding this issue can be found in Assing and Manthey (1995, Section 3). However, if in addition to conditions (C1) and (C2), one also has condition (C0), then Geiß and Manthey (1994, Theorem 1.2) indicates that $P\{R(t)\le S(t),\ t\in[0,\theta_R\wedge\theta_S)\}=1$. Note that, in the setting of our paper, the drift coefficient of (2.7) is quasi-monotonously increasing and we can pick $\rho(x)=x$, $x\in\mathbb{R}_+$. Therefore, conditions (C0), (C1), and (C2) hold, which allows us to use the comparison results. In special cases one can prove comparison theorems even when quasi-monotonicity fails; see Evans et al. (2015, Theorem 6.1) and Blath et al. (2007, Corollary A.2).

To proceed, let us recall some technical concepts and results needed to prove the main theorem. Let $\Phi=(\Phi_0,\Phi_1,\dots)$ be a discrete-time Markov chain on a general state space $(E,\mathcal{E})$, where $\mathcal{E}$ is a countably generated $\sigma$-algebra. Denote by $P$ the Markov transition kernel for $\Phi$. If there is a non-trivial $\sigma$-finite positive measure $\varphi$ on $(E,\mathcal{E})$ such that for any $A\in\mathcal{E}$ satisfying $\varphi(A)>0$ we have

$$\sum_{n=1}^\infty P^n(x,A)>0,\quad x\in E,$$

where $P^n$ is the $n$-step transition kernel of $\Phi$, then the Markov chain $\Phi$ is called $\varphi$-irreducible. It can be shown (see Nummelin 1984) that if $\Phi$ is $\varphi$-irreducible, then there exists a positive integer $d$ and disjoint subsets $E_0,\dots,E_{d-1}$ such that for all $i=0,\dots,d-1$ and all $x\in E_i$, we have

$$P(x,E_j)=1\quad\text{where } j=i+1\ (\mathrm{mod}\ d)$$

and

$$\varphi\left(E\setminus\bigcup_{i=0}^{d-1}E_i\right)=0.$$

The smallest positive integer $d$ satisfying the above is called the period of $\Phi$. An aperiodic Markov chain is a chain with period $d=1$.
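For a finite state space the period can be computed explicitly, which may help fix ideas before the general-state-space arguments. The sketch below is a toy illustration, not part of the proof: for an irreducible chain on $n$ states it takes the greatest common divisor of the return times to a fixed state that do not exceed $3n$, which is enough to recover the period in the irreducible case. The function name and the matrix representation (a list of rows) are assumptions.

```python
from math import gcd


def period(P, state=0):
    """Period of `state` for a finite irreducible chain with transition matrix P."""
    n = len(P)
    step = [[P[i][j] > 0 for j in range(n)] for i in range(n)]
    # reach[i][j] is True when j can be reached from i in exactly k steps.
    reach = [row[:] for row in step]
    d = 0
    for k in range(1, 3 * n + 1):
        if reach[state][state]:
            d = gcd(d, k)
        reach = [
            [any(reach[i][m] and step[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return d
```

A deterministic cycle on three states has period 3, while any chain with a positive holding probability at the chosen state is aperiodic.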

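A set $C\in\mathcal{E}$ is called petite, if there exists a non-negative sequence $(a_n)_{n\in\mathbb{N}}$ with $\sum_{n=1}^\infty a_n=1$ and a nontrivial positive measure $\nu$ on $(E,\mathcal{E})$ such that

$$\sum_{n=1}^\infty a_nP^n(x,A)\ge\nu(A),\quad x\in C,\ A\in\mathcal{E}.$$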

The following theorem is extracted from Jarner and Roberts (2002, Theorem 3.6).

Theorem A.1

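Suppose that $\Phi$ is irreducible and aperiodic and fix $0<\gamma<1$. Assume that there exists a petite set $C\subset E$, positive constants $\kappa_1,\kappa_2$ and a function $V:E\to[1,\infty)$ such that

$$PV\le V-\kappa_1V^\gamma+\kappa_2\mathbf{1}_C.$$

Then there exists a probability measure $\pi$ on $(E,\mathcal{E})$ such that

$$(n+1)^{\frac{\gamma}{1-\gamma}}\|P^n(x,\cdot)-\pi(\cdot)\|_{TV}\to0\quad\text{as } n\to\infty \text{ for all } x\in E.$$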

The next series of lemmas and propositions are used to show that we can construct a function V satisfying the assumptions of Theorem A.1.

Lemma A.1

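For any $T>0$, there exists an open set $N_0\subset\mathbb{R}_+^{n,\circ}$ such that the Markov chain $\{(Y(kT),S(kT)),k\in\mathbb{N}\}$ on $\Delta\times(0,\infty)$ is $\varphi$-irreducible and aperiodic, where $\varphi(\cdot)=m(\cdot\cap N_0)$ and $m(\cdot)$ is Lebesgue measure. Moreover, every compact set $K\subset\Delta\times(0,\infty)$ is petite. Similarly, $\Delta$ is a petite set of the Markov chain $\{\tilde Y(kT),k\in\mathbb{N}\}$.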

Proof

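To prove this lemma, it is more convenient to work with the process $X(t)$ that lives on $\mathbb{R}_+^n\setminus\{0\}$. Since $(X(t))_{t\ge0}$ is a nondegenerate diffusion with smooth coefficients in $\mathbb{R}_+^{n,\circ}$, by Rey-Bellet (2006, Corollary 7.2), the transition semigroup $P_X(t,x,\cdot)$ of $(X(t))_{t\ge0}$ has a smooth, positive density $(0,\infty)\times\mathbb{R}_+^{2n,\circ}\ni(t,x,x')\mapsto p_X(t,x,x')\in[0,\infty)$. Fix a point $x_0\in\mathbb{R}_+^{n,\circ}$. Since $\int_{\mathbb{R}_+^{n,\circ}}p_X(t,x_0,x)\,dx=1$ there exists $x_1\in\mathbb{R}_+^{n,\circ}$ such that $p_X\left(\frac T2,x_0,x_1\right)>0$. There exist bounded open sets $N_0\ni x_0$, $N_1\ni x_1$ satisfying

$$\hat p:=\inf\left\{p_X\left(\frac T2,x,x'\right):x\in N_0,\ x'\in N_1\right\}>0. \tag{A.8}$$

Slightly modifying the proof of Evans et al. (2013, Proposition 3.1) (the part proving the irreducibility of the solution process), we have that $\tilde p_x:=P_X\left(\frac T2,x,N_0\right)>0$ for all $x\in\mathbb{R}_+^n\setminus\{0\}$. Since $(X(t))_{t\ge0}$ has the Feller property, there is a neighborhood $N_x\ni x$ such that

$$P_X\left(\frac T2,x',N_0\right)>\frac{\tilde p_x}{2},\quad x'\in N_x. \tag{A.9}$$

For any compact set $K\subset\mathbb{R}_+^n\setminus\{0\}$, there are finitely many $x_2,\dots,x_k$ such that $K\subset\bigcup_{i=2}^kN_{x_i}$. As a result,

$$P_X\left(\frac T2,x,N_0\right)>\tilde p_K:=\min\left\{\frac{\tilde p_{x_i}}{2},\ i=2,\dots,k\right\}. \tag{A.10}$$

In view of (A.8), (A.9), and (A.10), an application of the Chapman-Kolmogorov equations yields that for any $x\in K$ and any measurable set $A\subset\mathbb{R}_+^{n,\circ}$,

$$P_X(T,x,A)\ge\int_{N_0}P_X\left(\frac T2,x,dx'\right)P_X\left(\frac T2,x',A\right)\ge\tilde p_K\,\hat p\,m(A\cap N_1),$$

where $m(\cdot)$ is Lebesgue measure on $\mathbb{R}_+^{n,\circ}$. Since the measure $\nu(\cdot)=m(\cdot\cap N_1)$ is non-trivial, we can easily obtain that $K$ is a petite set of the Markov chain $\{X(kT),k\in\mathbb{N}\}$. Moreover, $K$ can be chosen arbitrarily. Hence, for any $x\in\mathbb{R}_+^n\setminus\{0\}$ there is $\bar p_x>0$ such that

$$P_X(T,x,\cdot)\ge\bar p_x\,m(\cdot\cap N_1), \tag{A.11}$$

which means that $\{X(kT),k\in\mathbb{N}\}$ is irreducible.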

Suppose that {(X(kT)),kN} is not aperiodic. Then there are disjoint subsets of R+n\{0}, denoted by A0,,Ad-1 with d>1 such that for any xAi,

PX(T,x,Aj)=1wherej=i+1(modd).

Since P(T,x,·) has a density, m(Ai)>0 for i=0,,d-1. In view of (A.11), we must have m(N0Ai)=0 for any i=0,,d-1. This contradicts the fact that

mN0E\i=0d-1Ai=0.

This contradiction implies that {X(kT),kN} is aperiodic. In the same manner, we can prove that Y~(t) is irreducible, aperiodic and its state space, Δ, is petite.

Lemma A.2

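There exists a positive constant $K_1$ such that

$$\mathbb{E}S^{y,s}(t)\le e^{-\gamma_bt}s+K_1,\quad(y,s)\in\Delta\times(0,\infty),\ t\ge0. \tag{A.12}$$

Moreover, for any $H>0$, $T>0$, and $\varepsilon>0$, there is a $\tilde k=\tilde k(H,T,\varepsilon)>0$ such that

$$P\{S^{y,s}(t)<\tilde k,\ t\in[0,T]\}>1-\varepsilon,\quad(y,s)\in\Delta\times(0,H]. \tag{A.13}$$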

Proof

In view of (2.2), if sMb then -[b(sy)]y+ay+γb0. Let

K~1=supyΔ,sMbs(-[b(sy)]y+ay+γb)<.

For kN, define the bounded stopping time

ηky,s=inf{t0:Sy,s(t)k}. A.14

Dynkin’s formula for the function f(t,s):=eγbts and the bounded stopping time tηky,s yield

E[eγbtηky,sSy,s(tηky,s)]=s+E0tηky,seγbuSy,s(u)γb+[a-b(Sy,s(u)Yy,s(u)]Yy,s(u)dus+E0tηky,sK~1eγbudus+K~1γb(eγbt-1). A.15

The claim (A.13) follows directly from (A.15). Moreover, by letting k in (A.15), we obtain from Fatou’s lemma that

EeγbtSy,s(t)s+K1eγbtforK1=K~1γb, A.16

which implies (A.12).

Proposition A.2

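For any $\varepsilon>0$ and $T>0$, there is a $\delta=\delta(\varepsilon,T)>0$ such that

$$P\left\{\big\|(Y^{y,s}(t),S^{y,s}(t))-(\tilde Y^y(t),0)\big\|\le\varepsilon,\ 0\le t\le T\right\}>1-\varepsilon$$

given that $(y,s)\in\Delta\times(0,\delta)$.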

Proof

In view of (A.13), for any ε>0,T>0, there is k~=k~(ε,T)>0 such that

P{ηk~y,s<T}1-ε3,(y,s)Δ×(0,ε) A.17

where ηky,s is defined by (A.14). Since the coefficients of equation (2.4) are locally Lipschitz, using the arguments from Mao (1997, Lemma 9.4) and noting Sy,0(t)0, we obtain for any (y,s)Δ×(0,ε) that

Esup0tTηk~y,sηk~y,0Yy,s(t),Sy,s(t)-Yy,0(t),02Cs2, A.18

where C is a constant that depends on H,T,k~. Applying Chebyshev’s inequality to (A.18), there is a δ(0,ε) such that for all (y,s)Δ×(0,δ)

Psup0tTηk~y,sηk~y,0Yy,s(t),Sy,s(t)-Yy,0(t),0<ε>1-ε3. A.19

Combining (A.18) and (A.19) yields

Psup0tTYy,s(t),Sy,s(t)-Yy,0(t),0<ε>1-ε. A.20

for any (y,s)Δ×(0,δ). The desired result is obtained by noting that Yy,0(t)=Y~y(t),t0.

Lemma A.3

There are positive constants $K_2$ and $K_3$ such that for any $(y,s)\in\Delta\times(0,\infty)$, $T\ge0$, one has

$$\mathbb{E}[\ln S^{y,s}(T)]^2\le((\ln s)^2+1)K_2\exp\{K_3T\}. \tag{A.21}$$

Proof

In view of Itô’s formula,

dln2S(t)=Y(t)ΣY(t)+2lnS(t)aY(t)-[b(S(t)Y(t))]Y(t)-12Y(t)ΣY(t)dt+2lnS(t)Y(t)dE(t). A.22

Now, we estimate g(y,s)=yΣy+2lnsay-b(sy)-12yΣy for (y,s)Δ×(0,). Let Mb be as in (2.2). If s>Mb then lns>0 and ay-b(sy)-12yΣy<0. Letting

M1:=sup{(y,s)Δ×(0,Mb]}ay-b(sy)-12yΣy<

and

Σ:=sup{yΣy:yΔ},g(y,s)Σ+M1|lns|M1ln2s+2M1+Σfor all(y,s)Δ×(0,).

With this estimate, we can apply Dynkin’s formula to (A.22) and use standard arguments (e.g., Mao 1997, Theorem 2.4.1) to obtain

E[lnSy,s(T)]21AK2(lns)2exp{K3T}for all(y,s)Δ×(0,)

for some positive constants K2 and K3.

Lemma A.4

There is a positive constant K4 such that for any (y,s)Δ×(0,1), and AF,

E[lnSy,s(Tζy,s)]-2(lns)2+K4P(A)(T+1)[lns]-+K4T2, A.23

where [lnx]-:=max{0,-lnx}, and

ζy,s:=inf{t0:Sy,s(t)=1}. A.24

Proof

Let

M2=sup{(y,s)Δ×(0,1)}(-ay+12yΣy+b(sy)y<.

Using Dynkin’s formula,

-lnSy,s(Tζy,s)=-lns-My,s(Tζy,s)+0Tζy,s-aYy,s(t)+b(Sy,s(t)Yy,s(t))TYy,s(t)+12Yy,s(t)ΣYy,s(t)dt[lns]-+M2T+|My,s(Tζy,s)|, A.25

where

My,s(t)=0tY(t)dE(t)=0tY(t)ΓdB(t). A.26

It follows from (A.25) that

[lnSy,s(Tζy,s)]-21A[lns]-+M2T+|My,s(Tζy,s)|21A[lns]-2+2(M2T+|My,s(Tζy,s)|)[lns]-1A+2(M2T)2+2|My,s(Tζy,s)|21A A.27

An application of Itô’s isometry yields

E[|Mz,y(Tζy,s)|21A]E|Mz,y(Tζy,s)|2ΣT. A.28

By a straightforward use of Hölder’s inequality and (A.28),

E[|Mz,y(Tζy,s)|1A](P(A)E|Mz,y(Tζy,s)|2)1/2P(A)ΣTP(A)Σ(T+1). A.29

Taking expectation on both sides of (A.27) and using the estimates from (A.28) and (A.29), we have

E[lnSy,s(Tζy,s)]-21A[lns]-2P(A)+K4(T+1)P(A)[lns]-+K4T2,

for some positive constant K4.

Let M3 be a positive constant such that

ay-12yΣy-ay-12yΣyM3y-y|,y,yΔ. A.30

From now on, we assume that ε(0,1) is chosen small enough to satisfy the following

M3+2ε+sup{0sε,yΔ}{b(sy)y}<r4-3r2(1-3ε)+2K4ε<-r A.31

Lemma A.5

For ε satisfying (A.31), there is δ(ε)=δ(0,1) and T(ε)=T>1 such that

Plns+3rT4lnSy,s(T)<01-3ε A.32

for all (y,s)Δ×(0,δ).

Proof

Since Δ is a petite set of {Y~(t):t0}, in view of Meyn and Tweedie (1993, Theorem 6.1), there are γ1 and γ2>0 such that

P~(t,y,·)-νTVγ1exp(-γ2t),yΔ,t[0,). A.33

where P~(t,y,·) is the transition probability of {Y~(t):t0}. Let M4=maxyΔ{|ay-12yΣy|}<. In view of (2.8) and (A.33), we have

1TE0TaY~y(t)-12Y~y(t)ΣY~y(t)dt-rT1T0TΔay-12yΣyP~(t,y,dy)-ν(dy)M4T0TP~(t,y,·)-νTVdtM4γ1T. A.34

On one hand, letting My,s(T) be defined as (A.26), we have from Itô’s isometry that

EMy,s(T)T2=1T2E0TYy,s(t)ΣYy,s(t)dtΣT. A.35

With standard estimation techniques, it follows from (A.34) and (A.35) that for any ε>0, there is a T=T(ε) such that

P1T0TaY~y(t)-12Y~y(t)ΣY~y(t)dt-r<ε>1-ε,yΔ, A.36

and

PMy,s(T)T<ε>1-ε,(y,s)Δ×(0,). A.37

By virtue of Proposition A.2, (A.30), and (A.36), there is δ=δ(ε,T)(0,ε) such that

P(Ω1y,s)>1-2ε,(y,s)Δ×(0,δ)

where

Ω1y,s:=0TaYy,s(t)-12Yy,s(t)ΣYy,s(t)dt>T(r-(M3+1)ε)Sy,s(t)<ε,t[0,T].

Using yb(sy)<r4for all(y,s)Δ×(0,ε) from (A.31) we have that on the set Ω2y,s:=Ω1y,sMy,s(T)T<ε the following holds

0>lnεlnSy,s(T)=lns+My,s(T)-0TYy,s(t)b(Sy,s(t)Yy,s(t))dt+0TaYy,s(t)-12Yy,s(t)ΣYy,s(t)dtlns+r-(M3+2)ε-sup{0sε,yΔ}{b(sy)y}Tlns+3r4T. A.38

Noting

PΩ2y,s1-3εfor all(y,s)Δ×(0,δ),

the proof is complete.

Proposition A.3

Assume $r>0$. Let $\delta$ and $T$ be as in Lemma A.5. There exists a positive constant $K=K(\delta,T)$ such that

$$\mathbb{E}[\ln S^{y,s}(T)]_-^2\le[\ln s]_-^2-rT[\ln s]_-+K \tag{A.39}$$

for any $(y,s)\in\Delta\times(0,\infty)$.

Proof

We look at three cases of the initial data (y,s).

Case I s(0,δ). We have from Lemma A.5 that P(Ω2y,s)1-3ε where Ω2y,s is defined as in the proof of Lemma A.5. On Ω2y,s, we have

-lns-3rT4-lnSy,s(T)>0.

Hence,

0[lnSy,s(T)]-[lns]--3rT4.

Squaring both sides yields

[lnSy,s(T)]-2[lns]-2-3rT2[lns]-+9r2T216,

which implies that

E1Ω2y,s[lnSy,s(T)]-2P(Ω2y,s)[lns]-2-3rT2P(Ω2y,s)[lns]-+9r2T216P(Ω2y,s). A.40

On Ω3y,s:={ζy,s<T} with ζy,s defined in (A.24), since lnSy,s(ζy,s)=0, we have from Lemma A.3 and the strong Markov property of (Y(t),S(t)) that

E1Ω3y,s[lnSy,s(T)]-2P(Ω3y,s)K2exp(K3T). A.41

On the set Ω4y,s:=Ω\(Ω2y,sΩ3y,s), applying Lemma A.4 and noting that ζy,s>T in Ω4y,s and T>1, we obtain

E1Ω4y,s[lnSy,s(T)]-2[lns]-2P(Ω4y,s)+2K4TP(Ω4y,s)[lns]-+K4T2. A.42

Adding (A.40), (A.41), and (A.42) side by side, we get

E[lnSy,s(T)]-2[lns]-2+(-3r2(1-3ε)+2K4ε)T[lns]-+K5(T)[lns]-2-rT[lns]-+K5(T), A.43

where K5(T) is a positive constant independent of (y,s)Δ×(0,δ).

Case II s[δ,1]. We have from Lemma A.3 that

E[lnSy,s(T)]-2E[lnSy,s(T)]2[lns]2+K2exp(K3T)([lnδ]2+1)K2exp(K3T). A.44

Case III s(1,). Note that if ζy,s>T, then [lnSy,s(T)]-2=0. Thus, using Lemma A.3 and the strong Markov property of (Y(t),S(t)) once more, we obtain

E[lnSy,s(T)]-2=E1{ζy,s<T}[lnSy,s(T)]-2K2exp(K3T). A.45

Combining (A.43), (A.44), and (A.45), and setting $K=\max\{K_5(T),([\ln\delta]^2+1)K_2\exp(K_3T)\}$, the proof is concluded.

Theorem A.2

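Suppose that Assumptions 2.2 and 2.3 hold and that $r>0$. Let $P(t,(y,s),\cdot)$ be the semigroup of the process $(Y(t),S(t))_{t\ge0}$. Then, there exists an invariant probability measure $\pi$ of the process $(Y(t),S(t))_{t\ge0}$ on $\Delta\times(0,\infty)$. Moreover, $\pi(\Delta^\circ\times(0,\infty))=1$, where $\Delta^\circ$ denotes the set of points of $\Delta$ with all coordinates strictly positive, $\pi$ is absolutely continuous with respect to the Lebesgue measure on $\Delta\times(0,\infty)$ and

$$\lim_{t\to\infty}t^{q}\|P(t,(y,s),\cdot)-\pi(\cdot)\|_{TV}=0,\quad(y,s)\in\Delta\times(0,\infty), \tag{A.46}$$

where $\|\cdot\|_{TV}$ is the total variation norm and $q$ is any positive number. In addition, for any initial value $(y,s)\in\Delta\times(0,\infty)$ and any $\pi$-integrable function $f$, we have

$$\mathbb{P}\left\{\lim_{T\to\infty}\frac1T\int_0^Tf\big(Y^{y,s}(t),S^{y,s}(t)\big)dt=\int_{\Delta\times(0,\infty)}f(y,s)\,\pi(dy,ds)\right\}=1. \tag{A.47}$$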

Proof

By virtue of Lemma A.2, there is an h1:=1-exp-γbT>0 satisfying

ESy,s+1s+1-h1s+K1s+1-h1s+1+K1+h1,(y,s)Δ×(0,). A.48

Let V(s)=s+1+[lns]-2. In view of Proposition A.3 and (A.48),

EV(Sy,s(T))s+1-h2(s+1+[lns]-)+H2V(s)-h22V(s)+H2for all(y,s)Δ×(0,), A.49

where h2=min{h1,rT},H2=H1+h1+K1. Let κ>1 such that

V(s)2H2for alls[κ-1,κ]. A.50

Combining (A.49) and (A.50), we arrive at

EV(Sy,s(T))V(s)-h24V(s)+H21{(y,s)Δ×[κ-1,κ]}for all(y,s)Δ×(0,). A.51

Using the estimate (A.51), Lemma A.1, and Theorem A.1, the Markov chain (Y(kT),S(kT))k0 has a unique invariant probability measure π and

kP(kT,(y,s),·)-πTV0ask. A.52

As a direct consequence, for fixed y0,s0, the family {P(kT,(y0,s0),·),kN} is tight, that is, for any θ>0, there is a compact set KθΔ×(0,) such that

P(kT,(y0,s0),Kθ)>1-θfor allkN. A.53

Since s2+ln2s as s0 or s, in view of Lemmas A.2 and A.3 and a standard estimate, there is a κθ>1 such that

PSy,s(t)[κθ-1,κθ]>1-θ,for all(y,s)Kθ,t[0,T],

or equivalently,

Pt,(y,s),Δ×[κθ-1,κθ]>1-θfor all(y,s)Kθ,t[0,T]. A.54

Using the Chapman-Kolmogorov relation together with (A.53) and (A.54) yields

Pu,(y0,s0),Δ×[κθ-1,κθ]>1-2θfor allu0,

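which implies that the family of empirical measures $\left\{\frac1T\int_0^TP(u,(y_0,s_0),\cdot)\,du,\ T>0\right\}$ is tight in $\Delta\times(0,\infty)$. Thus $(Y(t),S(t))$ has an invariant probability measure $\pi_*$ on $\Delta\times(0,\infty)$ (see e.g., Evans et al. 2015, Proposition 6.4). As a result, $\pi_*$ is also an invariant probability measure of the Markov chain $(Y(kT),S(kT))_{k\in\mathbb{N}}$. In view of (A.52) and the uniqueness of the invariant probability measure $\pi$ of this chain, $\pi_*$ must coincide with $\pi$. Thus, $\pi$ is an invariant probability measure of the process $(Y(t),S(t))_{t\ge0}$ on $\Delta\times(0,\infty)$.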

In the proofs, we used the function [lns]-2 for the sake of simplicity. In fact, we can treat [lns]-1+q for any small q(0,1) in the same manner. We can show that there are hq,Hq>0, and a compact set KqΔ×(0,) satisfying

EVq(Sy,s(T))Vq(s)-hq[Vq(s)]11+q+Hq1{(y,s)Kq},(y,s)Δ×(0,), A.55

where Vq(s):=s+1+[lns]-1+q. Then applying Theorem A.1, we obtain

k1/qP(kT,(y,s),·)-π0ask. A.56

Let f:Δ×(0,)[-1,1] be a measurable function. Since π is an invariant measure, then for any u0,

Δ×(0,)f(y,s)π(dy,ds)=Δ×(0,)π(dy1,ds1)Δ×(0,)P(u,(y1,s1),(y,ds))f(y,s).

Using this equality and the Chapman–Kolmogorov equation, we have

|f(y,s)(P(t+u,(y,s),dy,ds)-π(dy,ds)|=|Δ×(0,)P(t,(y,s),dy1,ds1)-π(dy1,ds1)×Δ×(0,)(f(y,s)P(t,(y1,s1),(dy,ds))|P(t,(y,s),·)-πTVsince|Δ×(0,)(f(y,s)P(t,(y1,s1),(dy,ds)|1for ally1,s1,

which means that $\|P(t,(y,s),\cdot)-\pi\|_{TV}$ is decreasing in $t$. As a result, we deduce from (A.56) that

$$t^{q^*}\|P(t,(y,s),\cdot)-\pi\|_{TV}\to0\quad\text{as }t\to\infty,$$

where $q^*=1/q\in(1,\infty)$.

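In view of Proposition A.1, for any $t>0$, $\mathbb{P}\{Y^{y,s}(t)\in\Delta^\circ\}=1$. Thus,

$$\pi(\Delta^\circ\times(0,\infty))=\int_{\Delta\times(0,\infty)}\mathbb{P}\{Y^{y,s}(t)\in\Delta^\circ\}\,\pi(dy,ds)=\pi(\Delta\times(0,\infty))=1.$$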

By Kallenberg (2002, Theorem 20.17), our process (Y(t),S(t))t0 is either Harris recurrent or uniformly transient on Δ×(0,). Using Kallenberg (2002, Theorem 20.21), our process cannot be uniformly transient and also have an invariant probability measure. Therefore, our process is Harris recurrent. Kallenberg (2002, Theorem 20.17) further indicates that any Harris recurrent Feller process on Δ×(0,) with strictly positive transition densities has a locally finite invariant measure that is equivalent to Lebesgue measure and is unique up to normalization. Since we already know that (Y(t),S(t))t0 has a unique invariant probability measure, this probability measure has an almost everywhere strictly positive density with respect to the Lebesgue measure.

Appendix B: The case r<0

Theorem B.1

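Suppose that $r<0$. Then for any $i=1,\dots,n$ and any $x=(x_1,\dots,x_n)\in\mathbb{R}^n_+$,

$$\mathbb{P}\left\{\lim_{t\to\infty}\frac{\ln X_i^x(t)}{t}=r\right\}=1. \tag{B.1}$$

In particular, for any $i=1,\dots,n$ and any $x=(x_1,\dots,x_n)\in\mathbb{R}^n_+$,

$$\mathbb{P}\left\{\lim_{t\to\infty}X_i^x(t)=0\right\}=1.$$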

Proof

Let θ>0 and aˇi=ai+θ, and define the process Xˇx(t)=(Xˇ1x(t),,Xˇnx(t)) as the solution to

dXˇi(t)=Xˇi(t)(aˇi)+j=1nDjiXˇj(t)dt+Xˇi(t)dEi(t),i=1,,n B.2

started at $x=(x_1,\dots,x_n)\in\mathbb{R}^n_+$. Letting $\check S(t)=\sum_{i=1}^n\check X_i(t)$ and $\check Y(t)=\check X(t)/\check S(t)$, we have

dYˇ(t)=diag(Yˇ(t))-Yˇ(t)Yˇ(t)ΓdB(t)+DYˇ(t)dt+diag(Yˇ(t))-Yˇ(t)Yˇ(t)(aˇ-ΣYˇ(t))dtdlnSˇ(t)=aˇYˇ(t)-12Yˇ(t)ΣYˇ(t)dt+Yˇ(t)ΓdB(t) B.3

Let (Yˇy(t),Sˇy,s(t)) be the solution to (B.3) with initial condition (y,s). Note that Yˇy(t) does not depend on s. First, fix y0Δ. We have that

limt1t0taˇYˇy0(u)-12Yˇy0(u)ΣYˇy0(u)du+0tYˇy0(u)ΓdB(u)=rˇ:=Δaˇy-12yΣyνˇ(dy),P-a.s., B.4

where $\check\nu$ is the unique invariant probability measure of $(\check Y(t))_{t\ge0}$. By the continuous dependence of $r$ on the coefficients (established in Proposition D.1), there is $\theta>0$ such that $\check r<\frac r2<0$. Let $\delta>0$ be such that $\sup\{-b_i(x):x<\delta,\ i=1,\dots,n\}<\theta$ (this is possible since the $b_i$’s are continuous and vanish at 0). Because $\check r<0$, it follows from (B.4) that

supt[0,)0taˇYˇy0(s)-12Yˇy0(s)ΣYˇy0(s)ds+0tYˇy0(s)ΓdB(s)<P-a.s..

As a result, for any ε>0, there is an Hε>0 satisfying

Psupt[0,)0taˇYˇy0(u)-12Yˇy0(u)ΣYˇy0(u)du+0tYˇy0(u)ΓdB(u)<Hε>1-ε,

which combined with (B.3) implies that

Psupt[0,)Sˇy0,s0(t)<δ>1-εifs0<δexp(-Hε). B.5

Then, a comparison argument shows (see Remark A.2) that for x0=s0y0R+n and i=1,,n

PXix0(t)Xˇix0(t),t[0,ξx0)=1 B.6

where ξx0=inf{t0:i=1nXˇix0(t)δ}. By virtue of (B.5), P{ξx0=}>1-ε if s0<δexp(-Hε). Using (B.4) and (B.6) yields that

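$$\mathbb{P}\left\{\limsup_{t\to\infty}\frac{\ln S^{y_0,s_0}(t)}{t}\le\check r<0\right\}>1-\varepsilon\quad\text{if }s_0<\delta\exp(-H_\varepsilon). \tag{B.7}$$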

Thus, the process (Y(t),S(t))t0 is not a recurrent diffusion process in Δ×(0,). Hence, it must be transient with probability 1, that is, for any compact K(0,) and any initial value (y,s)Δ×(0,) we have

Plimt1{Sy,s(t)K}=0=1. B.8

In view of Lemma A.2,

$$\mathbb{P}\left\{\lim_{t\to\infty}S^{y,s}(t)=\infty\right\}=0. \tag{B.9}$$

It follows from (B.8) and (B.9) that PlimtSy,s(t)=0=1 for any (y,s)Δ×(0,). Moreover, since (Y~(t)){t0} has a unique invariant probability measure ν, on the boundary Δ×{0},(Y(t),S(t)) has a unique invariant probability measure ν×δ0, where δ0 is the Dirac measure concentrated on {0}. Fix (y,s)Δ×(0,), and define the normalized occupation measures,

Πt(·)=1t0t1{(Yy,s(u),Sy,s(u))·}du.

Since PlimtSy,s(t)=0=1, the family Πk(·),kN is tight in the space Δ×[0,) for almost every ω. In view of the proofs of Evans et al. (2015, Theorem 4.2) or Schreiber et al. (2011, Theorems 4, 5) the set of weak limit points of {Πk,kN} is a nonempty set of invariant probability measures of the process (Y(t),S(t)). As pointed out above, the process (Y(t),S(t)) has only one invariant probability measure, namely, ν×δ0. Thus, for almost every ωΩ,{Πk(·),kN} converges weakly to ν×δ0 as k. As a result, for any bounded continuous function g(·,·):Δ×[0,)R we have limk1k0kg(Yy,s(t),Sy,s(t))dt=Δg(y,0)ν(dy)P-a.s. Since g(·,·) is bounded, we easily obtain

limT1T0Tg(Yy,s(t),Sy,s(t))dt=Δg(y,0)ν(dy)P-a.s. B.10

Consequently,

limT1T0TaYy,s(t)-12Yy,s(t)ΣYy,s(t)dt=rP-a.s B.11

Since PlimtSy,s(t)=0=1 and bi(0)=0,i=1,,n, we have by Dominated Convergence that

limT1T0TYy,s(t)b(Sy,s(t)Yy,s(t))dt=0P-a.s. B.12

Applying the strong law of large numbers for martingales to the process (My,s(t))t0 defined by (A.26), we deduce

limTMy,s(T)T=0P-a.s. B.13

Note that

lnSy,s(T)T=lnsT+My,s(T)T-1T0TYy,s(t)b(Sy,s(t)Yy,s(t))dt+1T0TaYy,s(t)-12Yy,s(t)ΣYy,s(t)dt B.14

Applying (B.11), (B.12), and (B.13) to (B.14), we obtain

limTlnSy,s(T)T=r,P-a.s. B.15

In light of (B.15), to derive PlimTlnXix(T)T=r=1, it suffices to show PlimTlnYiy,s(T)T=0=1 for each i=1,,n. In view of Itô’s lemma,

lnYiy,s(T)T=lnyiT+1T0Tai-j=1najYjy,s(t)-Dii-σii2+j,k=1nσkj2Yky,s(t)Yjy,s(t))dt+1T0T-bi(Sy,s(t)Yiy,s(t))+j=1nYjy,s(t)bj(Sy,s(t)Yjy,s(t))dt+1T0Tj=1,jinDjiYjy,s(t)Yiy,s(t)dt+1T0TdEi(t)-j=1nYjy,s(t)dEj(t), B.16

and

lnY~iy(T)T=lnyiT+1T0Tai-j=1najY~jy(t)-Dii-σii2+j,k=1nσkj2Y~ky(t)Y~jy(t))dt+1T0Tj=1,jinDjiY~jy(t)Y~iy(t)dt+1T0TdEi(t)-j=1nY~jy(t)dEj(t). B.17

By the strong laws of large numbers for martingales,

limT1T0TdEi(t)-j=1nY~jy(t)dEj(t)=0,P-a.s. B.18

Let Gi=supyΔai-j=1najyj-Dii-σii2+j,k=1nσkj2ykyj<. As a result of (B.17) and (B.18) and the fact that lim supTlnY~iy(T)T0 almost surely, we obtain

lim supT1T0Tj=1,jinDjiY~jy(t)Y~iy(t)dtGi,P-a.s. B.19

For any k>0, it follows from (B.18) and the strong law of large numbers that

Δkj=1,jinDjiyjyiν(dy)=limT1T0Tkj=1,jinDjiY~jy(t)Y~iy(t)dtGi

Letting k we have

ρi:=Δj=1,jinDjiyjyiν(dy)Gi,

which implies

limT1T0Tj=1,jinDjiY~jy(t)Y~iy(t)dt=ρi. B.20

Using (B.18), (B.20), and applying the strong law of large numbers for the process (Y~(t))t0, we arrive at

limTlnY~iy(T)T=βi+ρi0,P-a.s., B.21

where

βi:=Δai-j=1najyj-Dii-σii2+j,k=1nσkj2ykyjν(dy).

If βi+ρi<0, then Y~iy(T)0 almost surely as T, which contradicts the fact that Y~y(T) converges weakly to ν that is concentrated on Δ. As a result, βi+ρi=0. For any θ>0, there is kθ>0 such that

Δkθj=1,jinDjiyjyiν(dy)>ρi-θ.

Using (B.10), we have with probability 1 that

lim infT1T0Tj=1,jinDjiYjy,s(t)Yiy,s(t)dtlimT1T0Tkθj=1,jinDjiYjy,s(t)Yiy,s(t)dtρi-θ. B.22

and

limT1T0Tai-j=1najYjy,s(t)-Dii-σii2+j,k=1nσkj2Yky,s(t)Yjy,s(t))dt=βi. B.23

Applying (B.22), (B.23), and the fact PlimTSy,s(T)=0=1 to (B.16), we obtain that

lim infTlnYiy,s(T)Tβi+ρi-θ=-θ,P-a.s.

Since it holds for any θ>0, we have

limTlnYiy,s(T)T=0,P-a.s.

The above equality combined with (B.15) and Xi(T)=Yi(T)S(T) yield the desired result.

Appendix C: Degenerate diffusion in Rn

If the correlation matrix Σ is degenerate, the diffusion Y~(t) from (2.6) still has an invariant probability measure ν since it is a Feller-Markov process in a compact set. Moreover, ν(Δ)=1 because the property that PY~(t)Δ,t>0=1 is satisfied as long as Assumption 2.2 holds, that is, the dispersion matrix (Dij) is irreducible. It is readily seen that the following is true.

Theorem C.1

Assume that Y~(t) has a unique invariant probability measure ν. Define r by (2.8). Suppose that r<0. Then for any i=1,,n and any x=(x1,,xn)R+n

PlimtlnXix(t)t=r=1. C.1

In particular, for any i=1,,n and any x=(x1,,xn)R+n

PlimtXix(t)=0=1.

Remark C.1

The Markov process {Y~(t),t0} has a unique invariant probability measure if it is irreducible. Moreover, since P{Y~y(t)>0for allt>0}=1 for any yΔ, we need only check its irreducibility in Δ. To prove that the diffusion {Y~(t),t0} is irreducible in Δ, we pursue the following approach:

  • First, we show that the process {Y~(t),t0} verifies Hörmander’s condition. As a result, the process {Y~(t),t0} has a smooth density function for any t>0; see e.g., Rey-Bellet (2006).

  • Next, we show that there is an open set NΔ such that for any open set N0N, and yΔ, there is a t0>0 such that P{Y~y(t0)N0}>0. This claim is usually proved by analyzing the control systems corresponding to the diffusion and using the support theorem. We refer to Kliemann (1987) and Rey-Bellet (2006) for more details. This then shows that the process {Y~(t),t0} is irreducible in Δ.

Now we consider the case $r>0$. We still assume that $\{\tilde Y(t):t\ge0\}$ has a unique invariant probability measure. In order to obtain Theorem 2.1 for our degenerate process, we have to show that there is a sufficiently large $T>0$ such that the Markov chain $(Y(kT),S(kT))_{k\in\mathbb{N}}$ is irreducible and aperiodic and every compact subset of $\Delta\times(0,\infty)$ is petite for this Markov chain. Note that if every compact subset of $\Delta^\circ\times(0,\infty)$ is petite with respect to $(Y(kT),S(kT))_{k\in\mathbb{N}}$, where $\Delta^\circ$ is the set of points of $\Delta$ with all coordinates strictly positive, then any compact subset of $\Delta\times(0,\infty)$ is petite with respect to $(Y(kT),S(kT))_{k\in\mathbb{N}}$ by the arguments in the proof of Lemma A.1.

Sufficient conditions for the above properties can be obtained by verifying the well-known Hörmander condition as well as investigating the control systems associated with the diffusion (2.4). Once we have the Markov chain (Y(kT),S(kT))kN being irreducible and aperiodic, and every compact subset of Δ×(0,) being petite for sufficiently large T, we can follow the steps from Appendix A to obtain the following result.

Theorem C.2

Assume that Y~(t) has a unique invariant probability measure ν. Define r by (2.8). Suppose that Assumption 2.2 holds and that r>0. Assume further that there is a sufficiently large T>0 such that the Markov chain (Y(kT),S(kT))kN is irreducible and aperiodic, and that every compact set in Δ×(0,) is petite for this Markov chain.

The process X(t)=(X1(t),,Xn(t))t0 has a unique invariant probability measure π on R+n, that is absolutely continuous with respect to the Lebesgue measure and for any q>0,

limttqPX(t,x,·)-π(·)TV=0,xR+n,, C.2

where ·,·TV is the total variation norm and PX(t,x,·) is the transition probability of (X(t))t0. Moreover, for any initial value xR+n\{0} and any π-integrable function f, we have

PlimT1T0TfXx(t)dt=R+n,f(u)π(du)=1. C.3

C.1: Case study: n=2

In what follows, we show that if r>0, there is a sufficiently large T>0 such that the Markov chain (Y(kT),S(kT))kN is irreducible and aperiodic, and that every compact set in Δ×(0,) is petite for the Markov chain.

For simplicity of presentation, we restrict ourselves to the $n=2$ case, and assume that $b_i(x)=b_ix$, $x\ge0$, $i=1,2$ for some $b_1,b_2>0$. As a result, (2.1) becomes

$$\begin{aligned}dX_1(t)&=\big(X_1(t)(a_1-b_1X_1(t))-\alpha X_1(t)+\beta X_2(t)\big)dt+\sigma_1X_1(t)\,dB(t)\\dX_2(t)&=\big(X_2(t)(a_2-b_2X_2(t))+\alpha X_1(t)-\beta X_2(t)\big)dt+\sigma_2X_2(t)\,dB(t),\end{aligned} \tag{C.4}$$

where $\sigma_1,\sigma_2$ are non-zero constants and $(B(t))_{t\ge0}$ is a one dimensional Brownian motion.
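To get a feel for the two-patch model (C.4), one can simulate sample paths numerically. The following is a minimal sketch using the Euler-Maruyama scheme, which is a choice made here and not a method from the paper; the function name `simulate_two_patch`, the step size, the seed handling and the positivity clamp are all illustrative assumptions.

```python
import math
import random


def simulate_two_patch(a1, a2, b1, b2, alpha, beta, sigma1, sigma2,
                       x0=(1.0, 1.0), t_end=10.0, dt=1e-3, seed=0):
    """Euler-Maruyama path of the two-patch system (C.4).

    Both patches are driven by the same one-dimensional Brownian motion.
    Returns a list of (X1, X2) tuples, one per time step, including t = 0.
    """
    rng = random.Random(seed)
    x1, x2 = x0
    path = [(x1, x2)]
    n_steps = int(round(t_end / dt))
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        dB = rng.gauss(0.0, 1.0) * sqrt_dt  # shared Brownian increment
        drift1 = x1 * (a1 - b1 * x1) - alpha * x1 + beta * x2
        drift2 = x2 * (a2 - b2 * x2) + alpha * x1 - beta * x2
        x1_new = x1 + drift1 * dt + sigma1 * x1 * dB
        x2_new = x2 + drift2 * dt + sigma2 * x2 * dB
        # Numerical safeguard only (an assumption of this sketch, not part of
        # the model): a discrete step can overshoot below zero.
        x1, x2 = max(x1_new, 1e-12), max(x2_new, 1e-12)
        path.append((x1, x2))
    return path
```

With the noise and the dispersal switched off (a setting used here only as a sanity check, since the text assumes non-zero $\sigma_1,\sigma_2$), each patch follows a logistic equation and the path settles at $a_i/b_i$.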

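Setting $S(t)=X_1(t)+X_2(t)$ and $Y_i(t)=X_i(t)/S(t)$, $i=1,2$, we have from Itô’s Lemma,

$$\begin{aligned}dY_i(t)&=Y_i(t)\Big(a_i-\sum_{j=1}^2a_jY_j(t)-b_iS(t)Y_i(t)+S(t)\sum_{j=1}^2b_jY_j^2(t)\Big)dt+(-1)^i\big(\alpha Y_1(t)-\beta Y_2(t)\big)dt\\&\quad+Y_i(t)\Big(\sum_{j,k=1}^2\sigma_k\sigma_jY_k(t)Y_j(t)-\sum_{j=1}^2\sigma_i\sigma_jY_j(t)\Big)dt+(-1)^i(\sigma_2-\sigma_1)Y_1(t)Y_2(t)\,dB(t)\\dS(t)&=S(t)\sum_{i=1}^2\big(a_iY_i(t)-b_iS(t)Y_i^2(t)\big)dt+S(t)\big(\sigma_1Y_1(t)+\sigma_2Y_2(t)\big)dB(t).\end{aligned} \tag{C.5}$$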

We use the process $(Y_1(t),Y_2(t),S(t))_{t\ge0}$ to construct a Lyapunov function for a suitable skeleton $(Y_1(kT),Y_2(kT),S(kT))_{k\in\mathbb{N}}$ as we have done in Appendix A. However, to simplify the computations when verifying the hypotheses of Theorems C.1 and C.2, instead of working with $(Y_1(t),Y_2(t),S(t))$, we treat the system $(Z(t),X_2(t))$ where $Z(t):=X_1(t)/X_2(t)$. An application of Itô’s Lemma yields

$$\begin{aligned}dZ(t)&=\big((b_2-b_1Z(t))Z(t)X_2(t)+\beta+\hat a_1Z(t)-\alpha Z^2(t)\big)dt+Z(t)[\sigma_1-\sigma_2]\,dB(t)\\dX_2(t)&=X_2(t)\big((\hat a_2-b_2X_2(t))+\alpha Z(t)\big)dt+\sigma_2X_2(t)\,dB(t),\end{aligned} \tag{C.6}$$

where $\hat a_1=a_1-a_2-\alpha+\beta+\sigma_2^2-\sigma_1\sigma_2$ and $\hat a_2=a_2-\beta$.
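The coefficients in (C.6) can be double-checked by passing to the logarithmic scale. The computation below is a sketch of that check under the assumptions of (C.4); it is a verification written for this passage, not a derivation quoted from the paper.

```latex
% Log-scale form of (C.4), by It\^o's lemma:
\begin{aligned}
d\ln X_1 &= \Big(a_1 - b_1X_1 - \alpha + \beta\tfrac{X_2}{X_1} - \tfrac{\sigma_1^2}{2}\Big)dt + \sigma_1\,dB,\\
d\ln X_2 &= \Big(a_2 - b_2X_2 + \alpha\tfrac{X_1}{X_2} - \beta - \tfrac{\sigma_2^2}{2}\Big)dt + \sigma_2\,dB.
\end{aligned}
% Subtracting, with Z = X_1/X_2 and X_1 = Z X_2:
d\ln Z = \Big(a_1-a_2-\alpha+\beta + (b_2-b_1Z)X_2 + \tfrac{\beta}{Z} - \alpha Z
          - \tfrac{\sigma_1^2-\sigma_2^2}{2}\Big)dt + (\sigma_1-\sigma_2)\,dB.
% Returning to Z through dZ = Z\,d\ln Z + \tfrac12 Z(\sigma_1-\sigma_2)^2\,dt:
-\tfrac{\sigma_1^2-\sigma_2^2}{2} + \tfrac{(\sigma_1-\sigma_2)^2}{2} = \sigma_2^2-\sigma_1\sigma_2
\;\Longrightarrow\;
\hat a_1 = a_1-a_2-\alpha+\beta+\sigma_2^2-\sigma_1\sigma_2 .
```

The second equation of (C.6) follows directly from (C.4) by writing $\alpha X_1=\alpha ZX_2$, which gives $\hat a_2=a_2-\beta$.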

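To proceed, we first convert (C.6) to Stratonovich form to facilitate the verification of Hörmander’s condition. System (C.6) can be rewritten as

$$\begin{aligned}dZ(t)&=\Big((b_2-b_1Z(t))Z(t)X_2(t)+\beta+\Big[\hat a_1-\frac{(\sigma_1-\sigma_2)^2}{2}\Big]Z(t)-\alpha Z^2(t)\Big)dt+Z(t)[\sigma_1-\sigma_2]\circ dB(t)\\dX_2(t)&=X_2(t)\Big(\hat a_2-\frac{\sigma_2^2}{2}-b_2X_2(t)+\alpha Z(t)\Big)dt+\sigma_2X_2(t)\circ dB(t).\end{aligned} \tag{C.7}$$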
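The drift corrections in (C.7) come from the usual Itô-to-Stratonovich conversion for a single driving Brownian motion. As a sketch (the symbols $f$, $g$ and $Dg$ are introduced here only for this illustration): an Itô equation $dU=f(U)\,dt+g(U)\,dB$ corresponds to the Stratonovich equation $dU=\big(f-\tfrac12(Dg)g\big)(U)\,dt+g(U)\circ dB$, where $Dg$ is the Jacobian of $g$. For the diffusion coefficient of (C.6) this gives:

```latex
g(z,y)=\begin{pmatrix}(\sigma_1-\sigma_2)z\\ \sigma_2y\end{pmatrix},\qquad
Dg=\begin{pmatrix}\sigma_1-\sigma_2&0\\ 0&\sigma_2\end{pmatrix},\qquad
\tfrac12(Dg)\,g=\begin{pmatrix}\tfrac{(\sigma_1-\sigma_2)^2}{2}\,z\\[2pt] \tfrac{\sigma_2^2}{2}\,y\end{pmatrix}.
```

Subtracting this vector from the drift of (C.6) replaces $\hat a_1$ by $\hat a_1-\frac{(\sigma_1-\sigma_2)^2}{2}$ and $\hat a_2$ by $\hat a_2-\frac{\sigma_2^2}{2}$.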

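Let

$$A_0(z,y)=\begin{pmatrix}(b_2-b_1z)zy+\beta+\Big[\hat a_1-\frac{(\sigma_1-\sigma_2)^2}{2}\Big]z-\alpha z^2\\[2pt]y\Big(\hat a_2-\frac{\sigma_2^2}{2}-b_2y\Big)+\alpha zy\end{pmatrix},$$

and

$$A_1(z,y)=\begin{pmatrix}(\sigma_1-\sigma_2)z\\\sigma_2y\end{pmatrix}.$$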

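Recall that the diffusion (C.7) is said to satisfy Hörmander’s condition if the set of vector fields $A_1,[A_1,A_0],[A_1,[A_1,A_0]],[A_0,[A_1,A_0]],\dots$ spans $\mathbb{R}^2$ at every $(z,y)\in\mathbb{R}^{2,\circ}_+$, where $[\cdot,\cdot]$ is the Lie bracket, which is defined as follows (see Rey-Bellet 2006 for more details). If $\Phi(z,y)=(\Phi_1(z,y),\Phi_2(z,y))^\top$ and $\Psi(z,y)=(\Psi_1(z,y),\Psi_2(z,y))^\top$ are vector fields on $\mathbb{R}^2$ (where $z^\top$ denotes the transpose of $z$), then the Lie bracket $[\Phi,\Psi]$ is a vector field given by

$$[\Phi,\Psi]_j(z,y)=\left(\Phi_1\frac{\partial\Psi_j}{\partial z}-\Psi_1\frac{\partial\Phi_j}{\partial z}\right)(z,y)+\left(\Phi_2\frac{\partial\Psi_j}{\partial y}-\Psi_2\frac{\partial\Phi_j}{\partial y}\right)(z,y),\quad j=1,2.$$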
Proposition C.1

Suppose that $\sigma_1\ne\sigma_2$ or $\beta+(b_2/b_1)(a_1-a_2)-\alpha(b_2/b_1)^2\ne0$. Then Hörmander’s condition holds for the diffusion $(Z(t),X_2(t))_{t\ge0}$ given by (C.7). As a result, the transition probability $P(t,(z,y),\cdot)$ of $(Z(t),X_2(t))_{t\ge0}$ has a smooth density $\mathbb{R}_+\times\mathbb{R}^4_+\ni(t,z,y,z',y')\mapsto p(t,z,y,z',y')\in\mathbb{R}_+$ with respect to Lebesgue measure.

Proof

Set σ:=σ1-σ2σ2. By a direct calculation,

A2(z,y):=1σ2[A0,A1](z,y)=σ(β+αz2)+(σ+1)b1z2y-zyb2-σαzy+b2y2,

and for k>2, we have

Ak(z,y):=1σ2[A1,Ak-1](z,y)=σk-1(β+(-1)kαz2)+(-1)k(σ+1)2b1z2y+(-1)k+1zyb2(-1)k+1σ2αzy+(-1)kb2y2..

If σ0 or equivalently σ1σ2, a straightforward but tedious computation shows that the rank of the matrix with columns A1,A2,A3,A4 is always 2 for any (z,y)R+2,. As a result, if σ1σ2, Hörmander’s condition is satisfied for the diffusion (C.7). Therefore, the transition probability P(t,(z,y),·) of (Z(t),X2(t)) has a smooth density function, denoted by p(t,z,y,z,y); see Rey-Bellet (2006, Corollary 7.2).

Now, we show that Hörmander’s condition holds if σ1=σ2 and β+(b2/b1)(a1-a2-α+β)-α(b2/b1)20. In this case,

A2(z,y)=[A0,A1](z,y)=-αyz(b2-b1z)αb2y2,

and

C(z,y)=C1(z,y)C2(z,y):=A0,1αb2A2(z,y),

where

C1(z,y)=y(2b1z/b2-1)A0,1(z,y)+yz(1-zb1/b2)A0,1(z,y)z+z(zb1/b2-1)A0,2(z,y)+y2z(zb1/b2-1).

Here $A_{0,i}(z,y)$ denotes the $i$-th component of $A_0(z,y)$. Observe that $A_1(z,y),A_2(z,y)$ span $\mathbb{R}^2$ for any $(z,y)\in\mathbb{R}^{2,\circ}_+$ satisfying $z\ne b_2/b_1$. If $z=b_2/b_1$ we have $C_1(b_2/b_1,y)=yA_{0,1}(b_2/b_1,y)=y[\beta+(b_2/b_1)(a_1-a_2-\alpha+\beta)-\alpha(b_2/b_1)^2]\ne0$; hence $C(b_2/b_1,y)$ and $A_2(b_2/b_1,y)$ span $\mathbb{R}^2$ for all $y>0$. As a result, we obtain the desired result.
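The spanning claims in this proof can be probed numerically. The sketch below evaluates the Lie bracket with central finite differences, using the bracket convention stated before Proposition C.1, and tests whether two vectors span the plane through a determinant. The helper names `lie_bracket`, `make_fields` and `spans_plane`, the step `h` and the tolerance are assumptions of this illustration; a numerical check of this kind supports but does not replace the symbolic argument.

```python
def lie_bracket(phi, psi, z, y, h=1e-5):
    """Numerical [phi, psi](z, y) = phi . grad(psi_j) - psi . grad(phi_j), j = 1, 2."""
    def partials(field):
        # Central differences in z and y for both components of the field.
        fz_plus, fz_minus = field(z + h, y), field(z - h, y)
        fy_plus, fy_minus = field(z, y + h), field(z, y - h)
        d_dz = [(p - m) / (2 * h) for p, m in zip(fz_plus, fz_minus)]
        d_dy = [(p - m) / (2 * h) for p, m in zip(fy_plus, fy_minus)]
        return d_dz, d_dy

    phi_val, psi_val = phi(z, y), psi(z, y)
    dphi_dz, dphi_dy = partials(phi)
    dpsi_dz, dpsi_dy = partials(psi)
    return [phi_val[0] * dpsi_dz[j] - psi_val[0] * dphi_dz[j]
            + phi_val[1] * dpsi_dy[j] - psi_val[1] * dphi_dy[j]
            for j in range(2)]


def make_fields(a1, a2, b1, b2, alpha, beta, sigma1, sigma2):
    """Drift field A0 and diffusion field A1 of the Stratonovich system (C.7)."""
    a1_hat = a1 - a2 - alpha + beta + sigma2 ** 2 - sigma1 * sigma2
    a2_hat = a2 - beta

    def A0(z, y):
        return [(b2 - b1 * z) * z * y + beta
                + (a1_hat - 0.5 * (sigma1 - sigma2) ** 2) * z - alpha * z ** 2,
                y * (a2_hat - 0.5 * sigma2 ** 2 - b2 * y + alpha * z)]

    def A1(z, y):
        return [(sigma1 - sigma2) * z, sigma2 * y]

    return A0, A1


def spans_plane(u, v, tol=1e-9):
    """True if the two planar vectors are linearly independent (up to tol)."""
    return abs(u[0] * v[1] - u[1] * v[0]) > tol
```

For instance, with $\sigma_1=\sigma_2$ the pair $A_1,[A_0,A_1]$ spans the plane at points with $z\ne b_2/b_1$ but not on the line $z=b_2/b_1$, which is why the proof brings in the additional bracket $C$ there.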

To proceed, we consider the following control system, which is associated with (C.7).

dzϕ(t)=(b2-b1zϕ(t))zϕ(t)yϕ(t)+β+a^1-(σ1-σ2)22zϕ(t)-αzϕ2(t)+(σ1-σ2)zϕϕ(t)dyϕ(t)=yϕ(t)a^2-σ222-b2yϕ(t)+αzϕ(t)yϕ(t)+σ2yϕ(t)ϕ(t) C.8

Let $(z_\phi(t,z,y),y_\phi(t,z,y))$ be the solution to equation (C.8) with control $\phi$ and initial value $(z,y)$. Denote by $O_1^+(z,y)$ the reachable set from $(z,y)$, that is, the set of $(z',y')\in\mathbb{R}^{2,\circ}_+$ such that there exist a $t\ge0$ and a control $\phi(\cdot)$ satisfying $z_\phi(t,z,y)=z'$, $y_\phi(t,z,y)=y'$. We first recall some concepts introduced in Kliemann (1987). Let $U$ be a subset of $\mathbb{R}^{2,\circ}_+$ satisfying $u_2\in\overline{O_1^+(u_1)}$ for any $u_1,u_2\in U$. Then there is a unique maximal set $V\supset U$ such that this property still holds for $V$. Such $V$ is called a control set. A control set $C$ is said to be invariant if $\overline{O_1^+(w)}\subset\overline C$ for all $w\in C$.

Finding invariant control sets for (C.8) is facilitated by using a change of variables argument. Put wϕ(t)=zϕ(t)yϕr+1(t) with r=-σ1σ2. We have

dwϕ(t)=h(wϕ(t),yϕ(t))dtdyϕ(t)=yϕ(t)a^2-σ222-b2yϕ(t)+αwϕ(t)yϕ-r(t)+σ2yϕ(t)ϕ(t), C.9

where

h(w,y)=wa1-σ122+ra2-σ222+rβ-α-b1wyr-b2ry+βy1-rw-1+αrwyr-1.
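The exponent in the substitution is what removes the control from the first equation of (C.9). A short check, reading $r$ as the quotient $-\sigma_1/\sigma_2$ and using only the control terms of (C.8):

```latex
\frac{\dot w_\phi}{w_\phi}=\frac{\dot z_\phi}{z_\phi}+(r+1)\frac{\dot y_\phi}{y_\phi},
\qquad
\text{coefficient of }\phi:\quad(\sigma_1-\sigma_2)+(r+1)\sigma_2=\sigma_1+r\sigma_2=0
\quad\text{for }r=-\frac{\sigma_1}{\sigma_2}.
```

Hence $w_\phi$ evolves through a pure $dt$ term, while the control acts only on $y_\phi$.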

Denote by $O_2^+(w,y)$ the set of $(w',y')\in\mathbb{R}^{2,\circ}_+$ such that there are a $t>0$ and a control $\phi(\cdot)$ such that $w_\phi(t,w,y)=w'$, $y_\phi(t,w,y)=y'$.

Lemma C.1

The control system (C.9) has only one invariant control set C~ and O2+(w,y)¯C~ for any (w,y)R+2,, The set C~ is defined by C~={(w,y)R+2,:w<c}, where

c=supw:supy>0{h(w,y)}0for allw<w.

Consequently, the control system (C.8) has only one invariant control set $C$ and $\overline{O_1^+(z,y)}\supset C$ for any $(z,y)\in\mathbb{R}^{2,\circ}_+$, where $C:=\{(z,y)\in\mathbb{R}^{2,\circ}_+:zy^{r+1}\le c\}$. Moreover, by Kliemann (1987, Lemma 4.1), $(Z(t),X_2(t))$ has at most one invariant probability measure whose support is $C$.

Proof

First, we need to show that $c$ is well-defined (although it can be $+\infty$). Since $\lim_{w\to0}h(w,y)=\infty$, the set over which the supremum defining $c$ is taken is nonempty. Hence $c$ is well-defined. The claim that $\overline{O_2^+(w,y)}\supset\tilde C$ for any $(w,y)\in\mathbb{R}^{2,\circ}_+$ can be proved by standard arguments. Let us explain the main ideas here. On the phase space $(w,y)\in\mathbb{R}^{2,\circ}_+$, since the control $\phi(t)$ only appears in the equation of $y_\phi$, we can easily control vertically, that is, for any initial points $y_0$ and $w_0$, there is a control so that $y_\phi$ can reach any given point $y_1$ while $w_\phi$ stays in a given neighborhood of $w_0$. If $h(w_0,y_0)<0$, we can choose a feedback control such that $(w_\phi(t),y_\phi(t))$ reaches a point to the ‘left’ $(w_1,y_0)$ with $w_1<w_0$ as long as $h(w,y_0)<0$ for $w\in[w_1,w_0]$. Likewise, for $h(w_0,y_0)>0$, we can choose a feedback control such that $(w_\phi(t),y_\phi(t))$ can reach a point to the ‘right’ $(w_1,y_0)$ with $w_1>w_0$ as long as $h(w,y_0)>0$ for $w\in[w_0,w_1]$. We also have that $\inf_{y>0}\{h(w,y)\}=-\infty$ for any $w>0$. Using these facts, we can follow the steps from Du et al. (2016, Section 3) to obtain the desired results.

Lemma C.2

There is a point (z,y)C such that for any open set N(z,y) and T>0, there is an open neighborhood W(z,y) and a control ϕ such that

(zϕ(t,z,y),yϕ(t,z,y))Nfor all(z,y)W,t[0,T].
Proof

To obtain the result, we work on (C.9), which is equivalent to (C.8). By the definition of C~ and the fact that limyh(w,y)=- if r>0 and limy0h(w,y)=- if r<0, there is a point (w,y)C~ such that h(w,y)=0. We can design a feedback control ϕ such that

dwϕ(t)=h(wϕ(t),y)dtdyϕ(t)=0. C.10

If wϕ(t)=w then wϕ(t)=wfor allt>0. By the continuous dependence on initial data of solutions to differential equations, for any given neighborhood N~ of (w,y), we can find a neighborhood W~ of (w,y) such that (wϕ(t,w,y),yϕ(t,w,y))N~ for any t[0,T] and (w,y)W~, which proves the lemma.

Proposition C.2

Suppose $\sigma_1\ne\sigma_2$ or $\beta+(b_2/b_1)(a_1-a_2)-\alpha(b_2/b_1)^2\ne0$. For any $T>0$, every compact set $K\subset\mathbb{R}^{2,\circ}_+$ is a petite set with respect to the Markov chain $(Z(kT),X_2(kT))_{k\in\mathbb{N}}$.

Proof

Let (z,y) be as in Lemma C.2. Pick (z,y)R+2, such that p(T,z,y,z,y)>0. By the smoothness of p(T,·,·,·,·), there exists a neighborhood N and an open set N(z,y) such that

p(1,z,y,z,y)p>0for all(z,y)N,(z,y)N. C.11

Let W be a neighborhood of (z,y) satisfying

(zϕ(t,z,y),yϕ(t,z,y))Nfor all(z,y)W,t[0,T]. C.12

For each (z,y)R+2,, noting that (z,y)CO1+(z,y)¯, there is a control ϕ and tz,y>0 such that

(zϕ(tz,y,z,y),yϕ(tz,y,z,y))W. C.13

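Let $n_{z,y}\in\mathbb{N}$ be such that $(n_{z,y}-1)T<t_{z,y}\le n_{z,y}T$, and let $\tilde\phi$ be defined as $\tilde\phi(t)=\phi(t)$ if $t<t_{z,y}$ and $\tilde\phi(t)=\phi^*(t)$ if $t>t_{z,y}$, where $\phi^*$ is the control of Lemma C.2 appearing in (C.12). Using the control $\tilde\phi$, we obtain from (C.12) and (C.13) that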

zϕ~(nz,yT,z,y),yϕ~(nz,yT,z,y)N. C.14

In view of the support theorem (see Ikeda and Watanabe 1989, Theorem 8.1, p. 518),

P(nz,yT,z,y,N):=2ρz,y>0.

Since (Zz,y(t),Yz,y(t)) is a Markov–Feller process, there exists an open set Vz,y(z,y) such that P(nz,yT,z,y,N)ρu,vfor all(z,y)Vz,y. Since K is a compact set, there is a finite number of Vzi,yi,i=1,,k0 satisfying Ki=1k0Vzi,yi. Let ρK=min{ρzi,yi,i=1,,k0}. For each (z,y)K, there exists nzi,yi such that

P(nzi,yiT,z,y,N)ρK. C.15

From (C.11) and (C.15), for all (z,y)K, there exists nzi,yi such that

p((nzi,yi+1)T,z,y,z,y)ρKpfor all(z,y)N. C.16

It follows from (C.16) that

1k0i=1k0P((nzi,yi+1)T,z,y,A)1k0ρKpm(NA)for allABR+2,, C.17

where m(·) is the Lebesgue measure on R+2,. Equation (C.17) implies that every compact set KR+2, is petite for the Markov chain (Z(kT),X2(kT))kN.

We have shown in the beginning of Sect. 2.2. that Y~(t) has a unique invariant probability measure ν. Having Proposition C.2, we note that the assumptions, and therefore the conclusions, of Theorems C.1 and C.2 hold for model (C.4). This argument proves Theorems 2.5 and 2.6.
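For model (C.4) the stochastic growth rate can also be approximated numerically, as the long-run time average of $a^\top y-\frac12y^\top\Sigma y$ along the frequency process $\tilde Y$ (the same integrand that appears in (B.4) and (D.2)). The sketch below assumes the two-patch setting with a single Brownian motion, so that $y^\top\Sigma y=(\sigma_1y_1+\sigma_2y_2)^2$, and takes the dynamics of $\tilde Y_1$ from (C.5) with $S\equiv0$. The function name `estimate_growth_rate`, the Euler-Maruyama discretisation and the clamping are illustrative assumptions, not the authors' procedure.

```python
import math
import random


def estimate_growth_rate(a1, a2, alpha, beta, sigma1, sigma2,
                         y1_0=0.5, t_end=200.0, dt=1e-3, seed=0):
    """Time-average estimate of r for the two-patch model with common noise."""
    rng = random.Random(seed)
    y1 = y1_0
    total = 0.0
    n_steps = int(round(t_end / dt))
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        y2 = 1.0 - y1
        mean_a = a1 * y1 + a2 * y2              # a^T y
        mean_sigma = sigma1 * y1 + sigma2 * y2  # so that y^T Sigma y = mean_sigma**2
        total += (mean_a - 0.5 * mean_sigma ** 2) * dt
        # Drift and noise of Y1 from (C.5) with S = 0 and i = 1.
        drift = (y1 * (a1 - mean_a)
                 - alpha * y1 + beta * y2
                 + y1 * (mean_sigma ** 2 - sigma1 * mean_sigma))
        noise = (sigma1 - sigma2) * y1 * y2
        dB = rng.gauss(0.0, 1.0) * sqrt_dt
        y1 = y1 + drift * dt + noise * dB
        # Keep the discretised path inside the open simplex (numerical guard).
        y1 = min(max(y1, 1e-12), 1.0 - 1e-12)
    return total / (n_steps * dt)
```

When $a_1=a_2=a$ and $\sigma_1=\sigma_2=\sigma$ the integrand is constant, so the estimate equals $a-\sigma^2/2$ whatever the dispersal rates are; this gives a convenient consistency check for the sketch.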

Appendix D: Robustness of the model

The robustness is studied from several angles, including continuous dependence of r on the coefficients of the stochastic differential equation, robustness of persistence, and robust attenuation against extinction. They are presented in the following subsections.

D.1: Continuous dependence of r on the coefficients

We show that r depends continuously on the coefficients of the stochastic differential equation (2.6). Consider the equation

dY^(t)=diag(Y^(t))-Y^(t)Y^(t)Γ^dB(t)+D^Y^(t)dt+diag(Y^(t))-Y^(t)Y^(t)(a^-Σ^Y^(t))dt D.1

on the simplex Δ. Suppose that Σ^ is positive definite. In this case, (Y^(t))t0 has a unique invariant probability measure ν^. Define

r^:=Δa^y-12yΣ^yν^(dy) D.2

Fix the coefficients of (2.6).

Proposition D.1

For any $\varepsilon>0$, there is a $\theta_2>0$ such that if

$$\max\{\|a-\hat a\|,\|D-\hat D\|,\|\Gamma-\hat\Gamma\|\}<\theta_2$$

then

$$|r-\hat r|<\varepsilon.$$
Proof

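First, let $\theta_1>0$ be such that if $\max\{\|a-\hat a\|,\|D-\hat D\|,\|\Gamma-\hat\Gamma\|\}<\theta_1$, then

$$\left|\Big(\hat a^\top y-\frac12y^\top\hat\Sigma y\Big)-\Big(a^\top y-\frac12y^\top\Sigma y\Big)\right|<\frac\varepsilon3\quad\text{for all }y\in\Delta. \tag{D.3}$$

Let $\gamma_1,\gamma_2,M_3,M_4$ be defined as in the proof of Lemma A.5. Pick $T=T(\varepsilon)>0$ such that

$$\|\tilde P(T,y,\cdot)-\nu\|_{TV}\le\gamma_1\exp(-\gamma_2T)<\frac{\varepsilon}{3M_4}\quad\text{for all }y\in\Delta. \tag{D.4}$$

By standard arguments, there is a $\theta_2\in(0,\theta_1)$ such that if $\max\{\|a-\hat a\|,\|D-\hat D\|,\|\Gamma-\hat\Gamma\|\}<\theta_2$, then

$$\mathbb{P}\left\{\|\tilde Y^y(T)-\hat Y^y(T)\|<\frac{\varepsilon}{6M_3}\right\}>1-\frac{\varepsilon}{6M_4}\quad\text{for all }y\in\Delta. \tag{D.5}$$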

Let y be a Δ-valued and F0-measurable random variable whose distribution is ν^. Clearly,

Δay-12yΣyν^(dy)=EaY^y(T)-12(Y^y(T))ΣY^y(T). D.6

In view of (D.4),

EaY~y(T)-12(Y~y(T))ΣY~y(T)-rM4supyΔ{P~(t,y,·)-μ}ε3. D.7

It follows from (D.5) that

EaY^y(T)-12(Y^y(T))ΣY^y(T)-aY~y(T)+12(Y~y(T))ΣY~y(T)M3ε6M3PY~y-Y^y<ε6M3+M4PY~y-Y^yε6M3ε3. D.8

In view of (D.2), (D.3), (D.6), (D.7), and (D.8), if

maxa-a^,D-D^,Γ-Γ^<θ2

then |r-r^|<ε, which completes the proof.
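The last step combines the preceding estimates through a triangle inequality. The display below spells this out; the shorthand $f$, $\hat f$ and $\xi$ is introduced here only for brevity and is not notation from the paper.

```latex
% Shorthand (assumed for this sketch):
%   f(y)      = a^\top y - \tfrac12 y^\top \Sigma y,
%   \hat f(y) = \hat a^\top y - \tfrac12 y^\top \hat\Sigma y,
%   \xi       = the F_0-measurable initial value with law \hat\nu.
% By (D.2), \hat r = \int_\Delta \hat f\,d\hat\nu, and by (D.6),
% \int_\Delta f\,d\hat\nu = \mathbb{E} f(\hat Y^{\xi}(T)).
|r-\hat r|
\le \underbrace{\Big|\int_\Delta(\hat f-f)\,d\hat\nu\Big|}_{<\,\varepsilon/3\ \text{by (D.3)}}
 + \underbrace{\big|\mathbb{E}f(\hat Y^{\xi}(T))-\mathbb{E}f(\tilde Y^{\xi}(T))\big|}_{\le\,\varepsilon/3\ \text{by (D.8)}}
 + \underbrace{\big|\mathbb{E}f(\tilde Y^{\xi}(T))-r\big|}_{\le\,\varepsilon/3\ \text{by (D.7)}}
< \varepsilon .
```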

Remark D.1

The continuous dependence of r on the coefficients can also be proved by generalizing the arguments from the proof of Evans et al. (2013, Proposition 3). Since Evans et al. (2013, Proposition 3) focuses only on the continuity for a specific parameter rather than all parameters, we provided an alternative proof for the sake of completeness.

D.2: Robust persistence and extinction

Sketch of proof of Theorem 2.8

As usual, we work with

dY^(t)=diag(Y^(t))-Y^(t)Y^(t)Γ^(S^(t)Y^(t))dB(t)+D^(S^(t)Y^(t))Y^(t)dt+diag(Y^(t))-Y^(t)Y^(t)(a^-Σ^(S^(t)Y^(t))Y^(t)-b^(S^(t)Y^(t)))dtdS^(t)=S^(t)a^-b^(S^(t)Y^(t))Y^(t)dt+S^(t)Y^(t)Γ^(S^(t)Y^(t))dB(t), D.9

where S^(t):=iX^i(t),Y^(t):=X^(t)S^(t). In order to have a complete proof for this proposition one can follow the steps from Appendix A. First, since Σ is positive definite then so is Σ^(x):=Γ^(x)Γ^(x) if supxR+n,Γ^(x)-Γ is sufficiently small. As a result, (X^(t))t0 is a nondegenerate diffusion in R+n, and Lemma A.1 holds for (Y^(nT),S^(nT))nN. We also have the following results: there exist positive constants K^i:i=1,,4, which do not depend on θ as long as θ is sufficiently small, such that

ES^y,s(t)e-γbt/2s+K^1,(y,s)Δ×(0,),t0. D.10
E[lnS^y,s(T)]2((lns)2+1)K^2exp{K^3T},(y,s)Δ×(0,),T0, D.11

and

E[lnS^y,s(Tζ^y,s)]-2(lns)2+K^4P(A)(T+1)[lns]-+K^4T2 D.12

for all (y,s)Δ×(0,1),AF where

ζ^y,s:=inf{t0:S^y,s(t)=1}.

On the other hand, standard arguments show that for any ε>0,T>0, there is a θ=θ(ε,T)>0 such that

P(Yy,s(t),Sy,s(t))-(Y^y,s(t),S^y,s(t))ε,0tT>1-ε

given that (y,s)Δ×[0,1]. Combining this fact with Proposition A.2, one can find δ=δ(ε,T)>0 and θ=θ(ε,T)>0 such that

P(Y~y,s(t),0)-(Y^y,s(t),S^y,s(t))ε,0tT>1-ε

given that (y,s)Δ×(0,δ) and (2.20) holds. With this fact, we can use Lemma A.5 with slight modification to show that, for any ε>0, there is a T=T(ε) and δ=δ(ε,T),θ=θ(ε,T) such that

Plns+3rT4lnS^y,s(T)<01-3εfor all(y,s)Δ×(0,δ) D.13

given that (2.20) holds. Having (D.10), (D.11), (D.12), and (D.13), we can use the arguments from Proposition A.3 and Theorem A.2 to finish the proof.

Remark D.2

If $r<0$, $X(t)$ converges to 0 with probability 1. By virtue of Proposition D.1, if $\hat D,\hat\Gamma$ are constant matrices and $\max\{\|a-\hat a\|,\|D-\hat D\|,\|\Gamma-\hat\Gamma\|\}$ is sufficiently small then $\hat X(t)$ converges to 0 with an exponential rate almost surely. We conjecture that this result holds for any $\theta$-perturbation of $X(t)$ defined by (2.20). However, when $\hat D:=\hat D(x),\hat\Gamma:=\hat\Gamma(x)$, comparison arguments may not be applicable. Moreover, it is also difficult to analyze the asymptotic behavior of the equation without competition terms, namely

dX^(t)=diag(X^(t))a^+D^(X^(t))X^(t)dt+diag(X^(t))Γ^(X^(t))dB(t). D.14

Footnotes

A. Hening was in part supported by EPSRC Grant EP/K034316/1.

The research of D. Nguyen and G. Yin was supported in part by the National Science Foundation under grant DMS-1710827.

Contributor Information

Alexandru Hening, Email: Alexandru.Hening@tufts.edu, Email: a.hening@imperial.ac.uk.

Dang H. Nguyen, Email: dangnh.maths@gmail.com

George Yin, Email: gyin@math.wayne.edu.

References

  1. Altenberg L. The evolution of dispersal in random environments and the principle of partial control. Ecol Monogr. 2012;82(3):297–333. doi: 10.1890/11-1136.1. [DOI] [Google Scholar]
  2. Assing S, Manthey R. The behavior of solutions of stochastic differential inequalities. Probab Theory Relat Fields. 1995;103(4):493–514. doi: 10.1007/BF01246336. [DOI] [Google Scholar]
  3. Bascompte J, Possingham H, Roughgarden J. Patchy populations in stochastic environments: critical number of patches for persistence. Am Nat. 2002;159(2):128–137. doi: 10.1086/324793. [DOI] [PubMed] [Google Scholar]
  4. Benaïm M, Schreiber SJ. Persistence of structured populations in random environments. Theor Popul Biol. 2009;76(1):19–34. doi: 10.1016/j.tpb.2009.03.007. [DOI] [PubMed] [Google Scholar]
  5. Benaïm M, Hofbauer J, Sandholm WH. Robust permanence and impermanence for stochastic replicator dynamics. J Biol Dyn. 2008;2(2):180–195. doi: 10.1080/17513750801915269. [DOI] [PubMed] [Google Scholar]
  6. Blath J, Etheridge A, Meredith M. Coexistence in locally regulated competing populations and survival of branching annihilating random walk. Ann Appl Probab. 2007;17(5–6):1474–1507. doi: 10.1214/105051607000000267. [DOI] [Google Scholar]
  7. Cantrell RS, Cosner C. The effects of spatial heterogeneity in population dynamics. J Math Biol. 1991;29(4):315–338. doi: 10.1007/BF00167155. [DOI] [Google Scholar]
  8. Cantrell RS, Cosner C, Lou Y. Evolutionary stability of ideal free dispersal strategies in patchy environments. J Math Biol. 2012;65(5):943–965. doi: 10.1007/s00285-011-0486-5. [DOI] [PubMed] [Google Scholar]
  9. Caswell H. Matrix population models. New York: Wiley Online Library; 2001. [Google Scholar]
  10. Chesson P. General theory of competitive coexistence in spatially-varying environments. Theor Popul Biol. 2000;58(3):211–237. doi: 10.1006/tpbi.2000.1486. [DOI] [PubMed] [Google Scholar]
  11. Chueshov I. Monotone random systems theory and applications. Berlin: Springer Science & Business Media; 2002. [Google Scholar]
  12. Cross PC, Lloyd-Smith JO, Johnson PLF, Getz WM. Duelling timescales of host movement and disease recovery determine invasion of disease in structured populations. Ecol Lett. 2005;8(6):587–595. doi: 10.1111/j.1461-0248.2005.00760.x. [DOI] [Google Scholar]
  13. Davies KF, Chesson P, Harrison S, Inouye BD, Melbourne B, Rice KJ. Spatial heterogeneity explains the scale dependence of the native-exotic diversity relationship. Ecology. 2005;86(6):1602–1610. doi: 10.1890/04-1196. [DOI] [Google Scholar]
  14. Dennis B, Patil GP. The gamma distribution and weighted multimodal gamma distributions as models of population abundance. Math Biosci. 1984;68(2):187–212. doi: 10.1016/0025-5564(84)90031-2. [DOI] [Google Scholar]
  15. Dieu NT, Nguyen DH, Du NH, Yin G. Classification of asymptotic behavior in a stochastic sir model. SIAM J Appl Dyn Syst. 2016;15(2):1062–1084. doi: 10.1137/15M1043315. [DOI] [Google Scholar]
  16. Du NH, Nguyen DH, Yin G. Conditions for permanence and ergodicity of certain stochastic predator-prey models. J Appl Probab. 2016;53:187–202. doi: 10.1017/jpr.2015.18. [DOI] [Google Scholar]
  17. Durrett R, Remenik D. Evolution of dispersal distance. J Math Biol. 2012;64(4):657–666. doi: 10.1007/s00285-011-0444-2. [DOI] [PubMed] [Google Scholar]
  18. Evans SN, Ralph PL, Schreiber SJ, Sen A. Stochastic population growth in spatially heterogeneous environments. J Math Biol. 2013;66(3):423–476. doi: 10.1007/s00285-012-0514-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Evans SN, Hening A, Schreiber SJ. Protected polymorphisms and evolutionary stability of patch-selection strategies in stochastic environments. J Math Biol. 2015;71(2):325–359. doi: 10.1007/s00285-014-0824-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Garay BM, Hofbauer J. Robust permanence for ecological differential equations, minimax, and discretizations. SIAM J Math Anal. 2003;34(5):1007–1039. doi: 10.1137/S0036141001392815. [DOI] [Google Scholar]
  21. Geiß C, Manthey R. Comparison theorems for stochastic differential equations in finite and infinite dimensions. Stoch Process Appl. 1994;53(1):23–35. doi: 10.1016/0304-4149(94)90055-8. [DOI] [Google Scholar]
  22. Gonzalez A, Holt RD. The inflationary effects of environmental fluctuations in source-sink systems. Proc Nat Acad Sci. 2002;99(23):14872–14877. doi: 10.1073/pnas.232589299. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Hardin DP, Takáč P, Webb GF. Asymptotic properties of a continuous-space discrete-time population model in a random environment. J Math Biol. 1988;26(4):361–374. doi: 10.1007/BF00276367. [DOI] [Google Scholar]
  24. Hardin DP, Takáč P, Webb GF. A comparison of dispersal strategies for survival of spatially heterogeneous populations. SIAM J Appl Math. 1988;48(6):1396–1423. doi: 10.1137/0148086. [DOI] [Google Scholar]
  25. Hardin DP, Takáč P, Webb GF. Dispersion population models discrete in time and continuous in space. J Math Biol. 1990;28(1):1–20. doi: 10.1007/BF00171515. [DOI] [PubMed] [Google Scholar]
  26. Harrison S, Quinn JF. Correlated environments and the persistence of metapopulations. Oikos. 1989;56(3):293–298. doi: 10.2307/3565613. [DOI] [Google Scholar]
  27. Hastings A. Can spatial variation alone lead to selection for dispersal? Theor Popul Biol. 1983;24(3):244–251. doi: 10.1016/0040-5809(83)90027-8. [DOI] [Google Scholar]
  28. Hutson V, Schmitt K. Permanence and the dynamics of biological systems. Math Biosci. 1992;111(1):1–71. doi: 10.1016/0025-5564(92)90078-B. [DOI] [PubMed] [Google Scholar]
  29. Ikeda N, Watanabe S. Stochastic differential equations and diffusion processes. Amsterdam: North-Holland Publishing Co.; 1989. [Google Scholar]
  30. Jansen VAA, Yoshimura J. Populations can persist in an environment consisting of sink habitats only. Proc Nat Acad Sci. 1998;95(7):3696–3698. doi: 10.1073/pnas.95.7.3696. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Jarner SF, Roberts GO. Polynomial convergence rates of Markov chains. Ann Appl Probab. 2002;12(1):224–247. doi: 10.1214/aoap/1015961162. [DOI] [Google Scholar]
  32. Kallenberg O. Foundations of modern probability. Berlin: Springer; 2002. [Google Scholar]
  33. Karlin S. Classifications of selection-migration structures and conditions for a protected polymorphism. Evol Biol. 1982;14:61–204. [Google Scholar]
  34. Kendall BE, Bjørnstad ON, Bascompte J, Keitt TH, Fagan WF. Dispersal, environmental correlation, and spatial synchrony in population dynamics. Am Nat. 2000;155(5):628–636. doi: 10.1086/303350. [DOI] [PubMed] [Google Scholar]
  35. Khasminskii R. Stochastic stability of differential equations, volume 66 of Stochastic modelling and applied probability. 2nd ed. Heidelberg: Springer; 2012. (With contributions by G. N. Milstein and M. B. Nevelson)
  36. Kirkland S, Li C-K, Schreiber SJ. On the evolution of dispersal in patchy landscapes. SIAM J Appl Math. 2006;66(4):1366–1382. doi: 10.1137/050628933. [DOI] [Google Scholar]
  37. Kliemann W. Recurrence and invariant measures for degenerate diffusions. Ann Probab. 1987;15:690–707. doi: 10.1214/aop/1176992166. [DOI] [Google Scholar]
  38. Law R, Morton RD. Permanence and the assembly of ecological communities. Ecology. 1996;77:762–775. doi: 10.2307/2265500. [DOI] [Google Scholar]
  39. Legendre P. Spatial autocorrelation: trouble or new paradigm? Ecology. 1993;74(6):1659–1673. doi: 10.2307/1939924. [DOI] [Google Scholar]
  40. Liebhold A, Koenig WD, Bjørnstad ON. Spatial synchrony in population dynamics. Annu Rev Ecol Evolut Syst. 2004;35:467–490. doi: 10.1146/annurev.ecolsys.34.011802.132516. [DOI] [Google Scholar]
  41. Mao X. Stochastic differential equations and their applications. Chichester: Horwood Publishing Limited; 1997. [Google Scholar]
  42. Meyn SP, Tweedie RL. Stability of Markovian processes. III. Foster–Lyapunov criteria for continuous-time processes. Adv Appl Probab. 1993;25(3):518–548. [Google Scholar]
  43. Mierczyński J, Shen W. Lyapunov exponents and asymptotic dynamics in random Kolmogorov models. J Evol Equ. 2004;4(3):371–390. doi: 10.1007/s00028-004-0160-0. [DOI] [Google Scholar]
  44. Nummelin E. General irreducible Markov chains and nonnegative operators, volume 83 of Cambridge tracts in mathematics. Cambridge: Cambridge University Press; 1984. [Google Scholar]
  45. Palmqvist E, Lundberg P. Population extinctions in correlated environments. Oikos. 1998;83(2):359–367.
  46. Patel S, Schreiber SJ. Robust permanence for ecological equations with internal and external feedbacks. 2016. arXiv:1612.06554. [DOI] [PMC free article] [PubMed]
  47. Pyšek P, Hulme PE. Spatio-temporal dynamics of plant invasions: linking pattern to process. Ecoscience. 2005;12(3):302–315. doi: 10.2980/i1195-6860-12-3-302.1. [DOI] [Google Scholar]
  48. Rey-Bellet L. Ergodic properties of Markov processes. In: Attal S, Joye A, Pillet CA, editors. Open quantum systems. II, volume 1881 of Lecture notes in mathematics. Berlin: Springer; 2006. pp. 1–39.
  49. Roth G, Schreiber SJ. Persistence in fluctuating environments for interacting structured populations. J Math Biol. 2014;69(5):1267–1317. doi: 10.1007/s00285-013-0739-6. [DOI] [PubMed] [Google Scholar]
  50. Roy M, Holt RD, Barfield M. Temporal autocorrelation can enhance the persistence and abundance of metapopulations comprised of coupled sinks. Am Nat. 2005;166(2):246–261. doi: 10.1086/431286. [DOI] [PubMed] [Google Scholar]
  51. Rudnicki R. Long-time behaviour of a stochastic prey–predator model. Stoch Process Appl. 2003;108(1):93–107. doi: 10.1016/S0304-4149(03)00090-5. [DOI] [Google Scholar]
  52. Schmidt KA. Site fidelity in temporally correlated environments enhances population persistence. Ecol Lett. 2004;7(3):176–184. doi: 10.1111/j.1461-0248.2003.00565.x. [DOI] [Google Scholar]
  53. Schreiber SJ. Criteria for C^r robust permanence. J Differ Equ. 2000;162(2):400–426. doi: 10.1006/jdeq.1999.3719. [DOI] [Google Scholar]
  54. Schreiber SJ. Interactive effects of temporal correlations, spatial heterogeneity and dispersal on population persistence. Proc R Soc Lond B Biol Sci. 2010. http://rspb.royalsocietypublishing.org/content/early/2010/02/12/rspb.2009.2006.full. Accessed 01 Dec 2016. [DOI] [PMC free article] [PubMed]
  55. Schreiber SJ. The evolution of patch selection in stochastic environments. Am Nat. 2012;180(1):17–34. doi: 10.1086/665655. [DOI] [PubMed] [Google Scholar]
  56. Schreiber SJ, Li C-K. Evolution of unconditional dispersal in periodic environments. J Biol Dyn. 2011;5(2):120–134. doi: 10.1080/17513758.2010.525667. [DOI] [PubMed] [Google Scholar]
  57. Schreiber SJ, Lloyd-Smith JO. Invasion dynamics in spatially heterogeneous environments. Am Nat. 2009;174(4):490–505. doi: 10.1086/605405. [DOI] [PubMed] [Google Scholar]
  58. Schreiber SJ, Ryan ME. Invasion speeds for structured populations in fluctuating environments. Theor Ecol. 2011;4(4):423–434. doi: 10.1007/s12080-010-0098-5. [DOI] [Google Scholar]
  59. Schreiber SJ, Benaïm M, Atchadé KAS. Persistence in fluctuating environments. J Math Biol. 2011;62(5):655–683. doi: 10.1007/s00285-010-0349-5. [DOI] [PubMed] [Google Scholar]

Articles from Journal of Mathematical Biology are provided here courtesy of Springer