2013 Oct 1;260:145–158. doi: 10.1016/j.physd.2012.11.003

Individual based and mean-field modeling of direct aggregation

Martin Burger, Jan Haškovec, Marie-Therese Wolfram
PMCID: PMC4047626  PMID: 24926113


We introduce two models of biological aggregation, based on randomly moving particles with individual stochasticity depending on the perceived average population density in their neighborhood. In the first-order model the location of each individual is subject to a density-dependent random walk, while in the second-order model the density-dependent random walk acts on the velocity variable, together with a density-dependent damping term. The main novelty of our models is that we do not assume any explicit aggregative force acting on the individuals; instead, aggregation is obtained exclusively by reducing the individual stochasticity in response to higher perceived density. We formally derive the corresponding mean-field limits, leading to nonlocal degenerate diffusions. Then, we carry out the mathematical analysis of the first-order model, in particular, we prove the existence of weak solutions and show that it allows for measure-valued steady states. We also perform linear stability analysis and identify conditions for pattern formation. Moreover, we discuss the role of the nonlocality for well-posedness of the first-order model. Finally, we present results of numerical simulations for both the first- and second-order model on the individual-based and continuum levels of description.

Keywords: Direct aggregation, Density dependent random walk, Degenerate parabolic equation, Mean field limit


► We introduce two stochastic individual-based models of biological aggregation. ► The individual particle stochasticity depends on the perceived average population density. ► Both models exhibit formation of aggregates resulting from random fluctuations in the population density. ► We derive the corresponding mean field description and perform its mathematical analysis. ► Extensive numerical results are presented.

1. Introduction

Animal aggregation is the process of finding a higher density of animals at some place compared to the overall mean density. Its formation may be triggered by some environmental heterogeneity that is attractive to animals (the aggregate forms around the environmental template), by physical currents that trap the organisms through turbulent phenomena, or by social interaction between animals  [1,2]. Aggregation may serve diverse functions such as reproduction, formation of local microclimates, anti-predator behavior (see for instance  [3] for a study of reducing the risk of predation to an individual by aggregation in Aphis varians), collective foraging and much more (see, e.g.,  [4] for a relatively recent survey). Aggregation also plays an important role as an evolutionary step towards social organization and collective behaviours  [5]. These aspects explain the continuing interest in understanding not only the function of animal aggregation, but also the underlying mechanisms. The development of quantitative mathematical models is an essential part in this quest. While such models first of all help to link individual behavioral mechanisms to spatio-temporal group patterns, they also often aim at explaining the observed dynamic efficiency of animal groups to adapt to environmental variation, and, moreover, provide a valuable tool to study in more detail the system dynamics and their robustness to variations in individual behavioral parameters or external parameters, see, e.g.,  [6].

Going back to the pioneering work of Skellam  [7], continuum spatio-temporal population dynamics are traditionally modeled by reaction–convection–diffusion PDEs and systems thereof. In these models, diffusion typically describes the avoidance of crowded areas by the individuals, and as such acts as an anti-aggregative force, working against the typically aggregative effect of convection, see the surveys  [8,9]. Our work goes in the opposite direction. We show that biological aggregations can be a consequence of solely random, diffusive motion of individuals, who respond to the local population density observed in their neighborhood by increasing or decreasing the amplitude of their random motion. This kind of behavior was observed in insects, for instance the pre-social German cockroach Blattella germanica. These animals are known to be attracted to dark, warm and humid places  [10]. However, the works by Jeanson et al.  [11,12] have shown that cockroach larvae also aggregate in the absence of any environmental template or heterogeneity. In this case aggregation is the result of social interactions and happens as a self-organized process. The individual based model developed and parameterized in  [12] identified a simple decisive mechanism that can be summarized in the following way: cockroaches do not rest for a long time in places with few conspecifics, and once moving they stop preferentially in places of high cockroach density. A mathematically better tractable version of this decisive mechanism is the model of individuals performing random walks with density dependent coefficients. In the continuum limit, this leads to degenerate diffusion models, which, however, depending on their parameters and data, might have ill-posed regimes. In particular, Turchin  [13] derived a 1D model for a population with density u(x,t) of the form

$$u_t = \partial_x\left[\phi'(u)\,u_x\right], \tag{1.1}$$

where $\phi(u) = (\mu/2)u - k_0u^2 + (2k_0/2\omega)u^3$ with the positive coefficients $\mu$, $k_0$ and $\omega$. Turchin used the above equation to describe the aggregative movement of Aphis varians, a herbivore of fireweeds (Epilobium angustifolium). He also discussed the possible ill-posedness of some initial and boundary value problems associated with (1.1). Depending on the actual profile of $\phi$, Turchin also classified two different types of aggregation. Essentially the same model has been derived independently in  [14] as a model of cell motility with volume filling and cell-to-cell adhesion. The authors showed that the diffusivity can become negative if the cell adhesion coefficient is sufficiently large and related this to the presence of spatial oscillations and development of plateaus in their numerical solutions of the discrete model. Moreover, they used a combination of stability analysis of the discrete equations and steady-state analysis of the limiting PDE to gain better understanding of the qualitative predictions of the model. Another work studying an equation of the type (1.2) is  [15], where existence and uniqueness of solutions of the initial value problem with homogeneous Neumann boundary conditions in a bounded domain of $\mathbb{R}^n$ was shown and some aspects of the aggregating behavior were studied analytically.

In  [16], following  [17], the authors provide an alternative derivation of (1.1) for low population density, using a biased random walk approach. Then they study the existence of traveling wave solutions for a purely negative or zero diffusion equation with a logistic rate of growth g(u),

$$u_t = \partial_x\left[\phi'(u)\,u_x\right] + g(u), \tag{1.2}$$

and discuss the well- and ill-posedness of certain boundary conditions associated with some purely negative diffusion equations with logistic-like kinetic part and provide some numerical examples.

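One possibility to overcome the difficulty of the possible ill-posedness of (1.1) and (1.2) is to introduce a nonlocality, i.e., to substitute the term $\Delta\phi(u)$ by $\Delta J$ with

$$J(x,t) = \int K(x,y)\,\phi(u(y,t))\,\mathrm{d}y,$$

with a nonnegative kernel $K(x,y)$. A particular choice made in  [18] is to define $K(x,y)$ as the Green function of $(I - \lambda\Delta)$ for a constant $\lambda > 0$. Eq. (1.2) becomes then

$$u_t = \Delta (I - \lambda\Delta)^{-1}\phi(u) + g(u),$$

which is equivalent to

$$u_t = \Delta\left(\phi(u) - \lambda g(u) + \lambda u_t\right) + g(u).$$
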
This equation was studied in  [19] and can be considered as a model of an aggregating population with a migration rate determined by $\phi$ and total birth and mortality rates characterized by $g$. In  [19] it was shown that the aggregating mechanism induced by $\phi(u)$ allows for survival of a species in danger of extinction, and numerical simulations were performed suggesting that, for a particular version of $\phi(u)$, the solutions stabilize asymptotically in time to a not necessarily homogeneous stationary solution. Two other works going in this direction are  [20], which we discuss later (Section  4.4), and the model of home range formation in wolves due to scent marking  [21],


Here u(x,t) is the population density of wolves, p(x,t) the density of their scent marks and D0,γ,α and μ positive parameters. The increasing function m(p) describes enhanced scent mark rates in the presence of existing scent marks. A nonlocal version of the model with m depending on the averaged version of p is also considered. The authors of  [21] show that the model produces distinct home ranges; in this case the pattern formation results from the positive feedback interaction between the decreased diffusivity of wolves in the presence of high scent mark densities and the increased production of new scent marks in these locations.

In our paper we introduce and study two models where formation of aggregates results from random fluctuations in the population density and is supported merely by reducing the amplitude of the individual random motion in response to higher perceived density. This leads to nonlocal individual-based and PDE models, where the nonlocality stems from calculating the perceived density as a weighted average over a finite sampling radius. This is usual in modeling biological interactions, see for instance  [1,20,22,23] and many more. Our models were inspired by the above mentioned observations of German cockroach  [11,12], but do not aim to be a realistic description of their social behavior. Instead, we consider our work as a proof of concept, where the main characteristic of our models is that we do not assume any deterministic interaction between the individuals that would actively push them to aggregate. This approach was followed for instance in stochastic run-and-tumble models of chemotaxis  [24]. Closely related to our work is the series of papers  [25–27]. However, only the first-order model under specific conditions was studied there. The new aspects contributed to by our work are the generality of the models (first- and second-order), more rigorous derivation of the mean-field limits based on the generalized BBGKY-hierarchy, and rigorous well-posedness and stability analysis of the first-order model.

Our paper is structured as follows: In Section  2 we introduce two stochastic individual based models, where every individual is able to sense the average population density in its neighborhood, and responds in terms of increased or decreased stochasticity of its movement. In particular, the diffusivity is reduced in response to higher perceived density. We present a first-order model, where the location of each individual is subject to a density-dependent random walk, and a second-order model, where the density-dependent random walk acts on the velocity variable, together with a density-dependent damping term. The advantage of the second-order model is that it is possible to introduce a cone of vision, which depends on the direction of each individual’s movement. In Section  3 we formally derive the corresponding mean-field limits, leading to a nonlocal, nonlinear diffusion equation for the first-order model, and a nonlinear Fokker–Planck kinetic equation for the second-order model. Moreover, we show that the diffusive limit of the kinetic equation leads to the first-order diffusion equation derived before. Then, in Section  4, we show the existence of weak and measure-valued solutions of the first order mean-field model and perform linear stability analysis of the uniform steady states, finding regimes that correspond to the sought-for pattern formation (direct aggregation). Finally, in Section  5 we present the results of numerical simulations of our models. We performed Lagrangian simulations of the first- and second-order agent-based models in a periodic 2D domain, Eulerian simulations of the first-order mean-field model in 1D and 2D periodic domains, and, finally, of the second-order mean-field model in a spatially 1D periodic domain with 1D velocity.

2. The stochastic individual based models

We introduce two models for the dynamics of a set of $N \in \mathbb{N}$ individuals:

  • In the first-order model the individuals are defined by their positions $x_i(t) \in \mathbb{R}^d$, $i = 1,\dots,N$, $d \ge 1$. Every individual is able to sense the average density of its close neighbors, given by
    $$\vartheta_i(t) = \frac{1}{N}\sum_{j \neq i} W(x_i - x_j), \tag{2.1}$$
    where $W(x) = w(|x|)$ with $w: \mathbb{R}^+ \to \mathbb{R}^+$ is a bounded, nonnegative and nonincreasing weight, integrable on $\mathbb{R}^d$. A generic example of $w$ is the characteristic function of the interval $[0,R]$, corresponding to the sampling radius $R > 0$. The average density $\vartheta_i$ is then simply the fraction of individuals located within the distance $R$ from the $i$-th individual. The individual positions are subject to average density-dependent random walk,
    $$\mathrm{d}x_i = G(\vartheta_i)\,\mathrm{d}B_i^t, \qquad i = 1,\dots,N, \tag{2.2}$$
    where $B_i^t$ are independent $d$-dimensional Brownian motions and $G: \mathbb{R}^+ \to \mathbb{R}^+$ is a bounded, nonnegative and nonincreasing function. Let us note that in our forthcoming analysis we allow for degeneracy in $G$, where $G(s) = 0$ for $s \ge s_0 > 0$.
  • In the second-order model the individuals are described not only by their locations $x_i(t) \in \mathbb{R}^d$, but also by the velocities $v_i(t) \in \mathbb{R}^d$, $i = 1,\dots,N$. The advantage of this description, in contrast to the first-order model, is that every individual has a well defined direction of movement. Therefore, the weight $W$ in the calculation of the average densities $\vartheta_i$ can also depend on the relative angle with respect to the individual’s direction of movement,
    $$\vartheta_i(t) = \frac{1}{N}\sum_{j \neq i} W(x_i - x_j, v_i), \tag{2.3}$$
    with $W(x,v) = w\left(|x|, \frac{x \cdot v}{|x||v|}\right)$. This is important from the modeling point of view, since we can define not only the sampling radius $R > 0$, but also a restricted cone of vision with angle $\alpha \in (0,\pi]$; we set then $w(s,z) = \chi_{[0,R]}(s)\,\chi_{[\cos\alpha,1]}(z)$.
    The velocity in our model is subject to a density-dependent random walk and a density-dependent damping term:
    $$\mathrm{d}x_i = v_i\,\mathrm{d}t, \tag{2.4}$$
    $$\mathrm{d}v_i = -H(\vartheta_i)\,v_i\,\mathrm{d}t + G(\vartheta_i)\,\mathrm{d}B_i^t. \tag{2.5}$$
    The function $G$ is as before, while $H: \mathbb{R}^+ \to \mathbb{R}^+$ is a nonnegative and nondecreasing function. The damping term $H(\vartheta_i)v_i$ is introduced in order to slow down the agents’ motion when they approach a crowded place.
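The first-order model (2.1)–(2.2) can be illustrated with a short Euler–Maruyama sketch. This is only a minimal illustration under stated assumptions, not the scheme used for the simulations reported later: the periodic box of side `L`, the top-hat weight of radius `R`, the choice `G(s) = exp(-s)` and all function names and default parameters are hypothetical.

```python
import numpy as np

def perceived_density(x, R, L):
    """theta_i = (1/N) * #{j != i : |x_i - x_j| <= R}, cf. (2.1), on a periodic box [0, L)^d."""
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]
    diff -= L * np.round(diff / L)          # minimum-image convention (periodic distances)
    inside = np.linalg.norm(diff, axis=-1) <= R
    np.fill_diagonal(inside, False)         # the sum in (2.1) excludes j == i
    return inside.sum(axis=1) / n

def step_first_order(x, dt, G, R, L, rng):
    """One Euler-Maruyama step of dx_i = G(theta_i) dB_i, cf. (2.2)."""
    theta = perceived_density(x, R, L)      # density frozen at the start of the step (Ito)
    noise = rng.standard_normal(x.shape)
    return (x + G(theta)[:, None] * np.sqrt(dt) * noise) % L

def simulate(n=200, d=2, L=1.0, R=0.1, dt=1e-3, n_steps=1000, seed=0):
    """Run the first-order model from uniformly random initial positions."""
    rng = np.random.default_rng(seed)
    x = rng.random((n, d)) * L
    G = lambda s: np.exp(-s)                # assumed mobility: bounded, positive, nonincreasing
    for _ in range(n_steps):
        x = step_first_order(x, dt, G, R, L, rng)
    return x
```

Calling `simulate()` returns the final positions as an array of shape `(n, d)`. The pairwise distance matrix costs O(N²) memory and time per step, which is acceptable for a few hundred individuals but would need a cell list or a tree for larger populations.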

3. The mean-field limits

3.1. The first-order model

We start with the derivation of the mean-field limit of the first-order model (2.2). Unfortunately, the standard framework of BBGKY hierarchies cannot be applied, since due to the structure of the model it is impossible to obtain a hierarchy where the evolution of the $k$-th marginal is expressed in terms of a finite number of higher-order marginals. Therefore, we have to apply the recently developed technique of introduction of auxiliary variables  [28] and their elimination after the mean-field limit passage. For this, we make the additional assumption $W \in C^1(\mathbb{R}^d)$. This actually excludes the generic choice of $W(x) = w(|x|)$ with $w$ a characteristic function of an interval, as mentioned in Section  2. However, from the modeling point of view this is not a concern, since we can work with a smoothed version of the characteristic function instead.

We extend the model by introducing the average densities $\vartheta_i$ given by (2.1) as a new set of independent variables, governed by the system of stochastic differential equations

$$\mathrm{d}\vartheta_i = \frac{1}{N}\sum_{j \neq i} \nabla W(x_i - x_j) \cdot \left(G(\vartheta_i)\,\mathrm{d}B_i^t - G(\vartheta_j)\,\mathrm{d}B_j^t\right). \tag{3.1}$$

Let us point out that the random walks $B_i^t$, $B_j^t$ in (3.1) are correlated with those in (2.2). Using the Itô formula, we turn to the equivalent formulation of the stochastic system (2.2), (3.1) in terms of the corresponding Fokker–Planck equation (cf.  [29,30])


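where $f^N = f^N(t,x_1,\vartheta_1,\dots,x_N,\vartheta_N)$ is the $N$-particle distribution function. Defining the $k$-particle marginals $f_k^N$ by

$$f_k^N(t,x_1,\vartheta_1,\dots,x_k,\vartheta_k) = \int f^N(t,x_1,\vartheta_1,\dots,x_N,\vartheta_N)\,\mathrm{d}x_{k+1}\,\mathrm{d}\vartheta_{k+1}\cdots\mathrm{d}x_N\,\mathrm{d}\vartheta_N,$$

we obtain the so-called BBGKY-hierarchy for the system $(f_k^N)_{k=1}^N$. In particular, the equation for $f_1^N = f_1^N(t,x,\vartheta)$ reads

$$\begin{aligned}
\frac{\partial f_1^N}{\partial t} ={}& \frac12\Delta_x\left(G(\vartheta)^2 f_1^N\right) + \frac{\partial}{\partial\vartheta}\nabla_x\cdot\left(G(\vartheta)^2\int_{\mathbb{R}^d}\int_0^\infty \nabla W(x-y)\,f_2^N(x,\vartheta,y,\sigma)\,\mathrm{d}\sigma\,\mathrm{d}y\right)\\
&+ \frac12\frac{\partial^2}{\partial\vartheta^2}\left(G(\vartheta)^2\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}\int_0^\infty\int_0^\infty \nabla W(x-y)\cdot\nabla W(x-z)\,f_3^N(x,\vartheta,y,\sigma,z,\tau)\,\mathrm{d}\sigma\,\mathrm{d}\tau\,\mathrm{d}y\,\mathrm{d}z\right)\\
&+ \frac{1}{2N}\frac{\partial^2}{\partial\vartheta^2}\left(\int_{\mathbb{R}^d}\int_0^\infty G(\vartheta)^2\,|\nabla W(x-y)|^2 f_2^N(x,\vartheta,y,\sigma)\,\mathrm{d}\sigma\,\mathrm{d}y\right),
\end{aligned} \tag{3.2}$$

where for the sake of legibility we dropped the indices at $x_1$ and $\vartheta_1$. Now, we pass formally to the limit $N \to \infty$, assuming that $\lim_{N\to\infty} f_k^N = f_k$ for all $k \ge 1$. Moreover, we admit the usual molecular chaos assumption about vanishing particle correlations as $N \to \infty$,

$$f_k(t,x_1,\vartheta_1,\dots,x_k,\vartheta_k) = \prod_{i=1}^k f_1(t,x_i,\vartheta_i) \qquad \text{for all } k \ge 2.$$

Then, one obtains from (3.2) the one-particle equation

$$\frac{\partial f_1}{\partial t} = \frac12\Delta_x\left(G(\vartheta)^2 f_1\right) + \frac{\partial}{\partial\vartheta}\nabla_x\cdot\left(G(\vartheta)^2(\nabla W\ast f_1)\,f_1\right) + \frac12\frac{\partial^2}{\partial\vartheta^2}\left(G(\vartheta)^2\,|\nabla W\ast f_1|^2 f_1\right), \tag{3.3}$$

where the operator $\nabla W\ast$ is defined as

$$(\nabla W\ast f_1)(t,x) = \int_{\mathbb{R}^d}\int_0^\infty \nabla W(x-y)\,f_1(t,y,\sigma)\,\mathrm{d}\sigma\,\mathrm{d}y.$$

Finally, we reduce (3.3) to obtain the standard mean-field description of the system (2.2) by removing the auxiliary variable $\vartheta$. Indeed, a relatively lengthy, but straightforward formal calculation (analogous to  [28,31]) shows that (3.3) possesses weak solutions of the form

$$f_1(t,x,\vartheta) = \varrho(t,x)\,\delta\big(\vartheta - W\ast\varrho(t,x)\big), \qquad \text{with } W\ast\varrho(t,x) = \int_{\mathbb{R}^d} W(x-y)\,\varrho(t,y)\,\mathrm{d}y,$$

and $\varrho = \varrho(t,x)$ satisfies the nonlinear diffusion equation

$$\frac{\partial\varrho}{\partial t} = \frac12\Delta_x\left(G(W\ast\varrho)^2\varrho\right). \tag{3.4}$$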
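To make the structure of (3.4) concrete, the following sketch integrates it in one space dimension on a periodic interval with an explicit conservative finite-difference scheme. It is a minimal illustration under stated assumptions and not the discretization used for the numerical results of this paper: the Gaussian weight of width `R` (a smooth stand-in for the characteristic function), the FFT-based convolution, the time-step rule and all function names are hypothetical choices.

```python
import numpy as np

def make_kernel(n, L, R):
    """Smooth periodic weight W sampled on the grid x_k = k*L/n (Gaussian of width R)."""
    x = np.arange(n) * (L / n)
    d = np.minimum(x, L - x)                 # periodic distance to the origin
    return np.exp(-0.5 * (d / R) ** 2)

def step(rho, w_hat, dx, dt, G):
    """One explicit Euler step of rho_t = (1/2) (G(W*rho)^2 rho)_xx on a periodic grid."""
    conv = np.fft.ifft(w_hat * np.fft.fft(rho)).real * dx   # W * rho as a circular convolution
    a = 0.5 * G(conv) ** 2 * rho                            # quantity under the Laplacian
    lap = (np.roll(a, -1) - 2.0 * a + np.roll(a, 1)) / dx ** 2
    return rho + dt * lap

def solve(rho0, L, R, T, G, g_max):
    """Integrate up to time T; g_max is an upper bound for G."""
    n = rho0.size
    dx = L / n
    # dt <= dx^2 / (2 g_max^2) keeps this explicit update nonnegativity-preserving
    n_steps = int(np.ceil(T * g_max ** 2 / (0.5 * dx ** 2)))
    dt = T / n_steps
    w_hat = np.fft.fft(make_kernel(n, L, R))
    rho = rho0.copy()
    for _ in range(n_steps):
        rho = step(rho, w_hat, dx, dt, G)
    return rho
```

Because the discrete Laplacian is applied to the product $\frac12 G(W\ast\varrho)^2\varrho$ as a whole, the grid sum of `rho` is preserved up to rounding, which mirrors the mass conservation of (3.4). The explicit time step scales like `dx**2`, so an implicit or semi-implicit scheme would be preferable on fine grids.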

3.2. The second-order model

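With the same procedure as before we derive the formal mean-field limit of the model (2.4)–(2.5). We omit the details here and immediately give the resulting kinetic Fokker–Planck equation for the particle distribution function $f = f(t,x,v)$,

$$\frac{\partial f}{\partial t} + v\cdot\nabla_x f = \nabla_v\cdot\left(H(W\ast f)\,v f + \frac12\nabla_v\left(G(W\ast f)^2 f\right)\right), \tag{3.5}$$

where, with a slight abuse of notation, the convolution operator is defined as

$$W\ast f(t,x,v) = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} W(x-y,v)\,f(t,y,w)\,\mathrm{d}w\,\mathrm{d}y.$$

Let us make the following observation: If the weight $W$ does not depend on $v$, such that $W\ast f$ does not depend on $v$ as well and $W\ast f(t,x) = W\ast\varrho(t,x)$, we can write the (non-closed) system for the mass, momentum and energy densities

$$\varrho(t,x) = \int_{\mathbb{R}^d} f(t,x,v)\,\mathrm{d}v, \qquad \varrho u(t,x) = \int_{\mathbb{R}^d} f(t,x,v)\,v\,\mathrm{d}v, \qquad \varrho E(t,x) = \frac12\int_{\mathbb{R}^d} f(t,x,v)\,|v|^2\,\mathrm{d}v, \tag{3.6}$$

associated with the solution $f$ of (3.5) as

$$\frac{\partial\varrho}{\partial t} + \nabla_x\cdot(\varrho u) = 0, \tag{3.7}$$
$$\frac{\partial(\varrho u)}{\partial t} + \nabla_x\cdot\left(\int_{\mathbb{R}^d} f(t,x,v)\,v\otimes v\,\mathrm{d}v\right) = -H(W\ast\varrho)\,\varrho u, \tag{3.8}$$
$$\frac{\partial(\varrho E)}{\partial t} + \nabla_x\cdot\left(\frac12\int_{\mathbb{R}^d} f(t,x,v)\,|v|^2 v\,\mathrm{d}v\right) = -2H(W\ast\varrho)\,\varrho E + \frac{d}{2}G(W\ast\varrho)^2\varrho. \tag{3.9}$$

We observe that only mass is conserved, while momentum and energy are not (neither locally nor globally). Indeed, the momentum is dissipated due to the “friction” term, whose strength depends nonlocally on $\varrho$ due to $H(W\ast\varrho)$. The energy is, on one hand, dissipated due to the same term, on the other hand is created due to the diffusive term in (3.5), at the rate $\frac{d}{2}G(W\ast\varrho)^2\varrho$. In equilibrium, we have $H(W\ast\varrho)\,\varrho u = 0$, which means that we either have empty regions ($\varrho = 0$) or regions with positive density, but zero velocity. In these populated regions the equilibrium energy, obtained by balancing the two terms on the right-hand side of (3.9), is given by

$$E = \frac{d\,G(W\ast\varrho)^2}{4H(W\ast\varrho)}.$$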
3.3. Diffusive limit of the second-order model (3.5)

We show that the first-order model (3.4) is obtained from (3.5) in the formal diffusive limit, under the assumption that the weight $W$ does not depend on $v$, which we make throughout this section. Let us recall that then $W\ast f$ does not depend on $v$ as well and $W\ast f(t,x) = W\ast\varrho(t,x)$, with $\varrho$ defined by (3.6).

We start by observing that the equilibria of the collision operator of (3.5) are given by the local Maxwellians

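$$M[\varrho](v) = \left(\frac{H(W\ast\varrho)}{\pi\,G(W\ast\varrho)^2}\right)^{d/2}\varrho\,\exp\left(-\frac{H(W\ast\varrho)}{G(W\ast\varrho)^2}|v|^2\right), \tag{3.10}$$

normalized such that $\int_{\mathbb{R}^d} M[\varrho](v)\,\mathrm{d}v = \varrho$. Therefore, at equilibrium the mean velocity $u$ vanishes if $\varrho \neq 0$, and the “statistical temperature” $T$, defined by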


depends nonlocally on ϱ and is given by

$$T = T[W\ast\varrho] = \frac{G(W\ast\varrho)^2}{2H(W\ast\varrho)}. \tag{3.11}$$

Consequently, in crowded regions (where $W\ast\varrho$ is high) the temperature is low (“freezing of the aggregates”), while the uninhabited regions are “hot”.
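The relation between damping, noise amplitude and temperature can be checked numerically in the simplest possible situation. The sketch below assumes a frozen, spatially homogeneous density, so that $H$ and $G$ in (2.5) are constants and each velocity follows an Ornstein–Uhlenbeck process; it then compares the empirical velocity variance per component with $G^2/(2H)$ and the mean kinetic energy with $dG^2/(4H)$. The function names, the exact-in-time update and the parameter values are illustrative assumptions, not part of the model description above.

```python
import numpy as np

def ou_velocities(H, G, d, n_particles, t_end, dt, rng):
    """Sample dv = -H v dt + G dB with constant H, G using the exact OU transition."""
    decay = np.exp(-H * dt)
    std = np.sqrt(G ** 2 / (2.0 * H) * (1.0 - decay ** 2))   # exact one-step noise std
    v = np.zeros((n_particles, d))
    for _ in range(int(round(t_end / dt))):
        v = decay * v + std * rng.standard_normal(v.shape)
    return v

def temperature(G, H):
    """Statistical temperature T = G^2 / (2 H), cf. (3.11)."""
    return G ** 2 / (2.0 * H)

def mean_energy(v):
    """Empirical mean of |v|^2 / 2 over all particles."""
    return 0.5 * np.mean(np.sum(v ** 2, axis=1))
```

For example, with `H = 2.0`, `G = 1.0` and `d = 2`, the velocities returned by `ou_velocities` after a few relaxation times `1/H` should have a per-component variance close to `temperature(G, H)`, and `mean_energy` should be close to `d * temperature(G, H) / 2`. Increasing `H` or decreasing `G`, as happens in crowded regions, lowers both values.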

We introduce the diffusive scaling with the small parameter ε>0 to (3.5),


and consider the Hilbert expansion in terms of $\varepsilon$, $f = M[\varrho] + \varepsilon g$, where $M[\varrho]$ is given by (3.10). Moreover, we perform the Taylor expansion of $H(W\ast\varrho)$, which we formally write as




and similarly for $G(W\ast\varrho) = G_0 + \varepsilon G_1 + O(\varepsilon^2)$. Here, $\varrho_M$ and $\varrho_g$ denote the velocity averages of the lowest order terms in the expansion, i.e.


Then, collecting terms of order ε, we obtain


which yields, after multiplication by v and integration,

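$$\frac{d}{2}\nabla_x\left(\varrho_M\,T[W\ast\varrho_M]\right) = \nabla_x\cdot\int_{\mathbb{R}^d} M\,v\otimes v\,\mathrm{d}v = -H_0\int_{\mathbb{R}^d} g\,v\,\mathrm{d}v, \tag{3.12}$$

with $T[W\ast\varrho_M]$ given by (3.11). Collecting terms of order $\varepsilon^2$ and integrating with respect to $v$, we obtain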


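and using (3.12), we finally obtain the nonlinear diffusion equation

$$\frac{\partial\varrho_M}{\partial t} - \frac{d}{2}\nabla_x\cdot\left(\frac{1}{H(W\ast\varrho_M)}\nabla_x\left(T[W\ast\varrho_M]\,\varrho_M\right)\right) = 0. \tag{3.13}$$

Observe that with the choice $H \equiv \text{const.}$, (3.13) reduces to (3.4), possibly up to a linear rescaling of time.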

4. Mathematical analysis of the first-order model

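In this section we show the existence of weak solutions of the first-order nonlinear diffusion equation (3.4) and some asymptotic properties of these solutions. For simplicity, we consider the full space setting $\Omega = \mathbb{R}^d$, $d \ge 1$, but our analysis can be easily adapted to the case of a bounded domain $\Omega$ with homogeneous Neumann boundary conditions, or to the case of periodic boundary conditions, as in the numerical examples of Section  5.

To simplify the notation, we set $F(z) := \frac12 G(z)^2$, thus we rewrite (3.4) as

$$\frac{\partial\varrho}{\partial t} = \Delta\left(F(W\ast\varrho)\,\varrho\right), \tag{4.1}$$

subject to the initial condition

$$\varrho(t=0) = \varrho_0. \tag{4.2}$$

For the rest of this section and without further notice, we make the following reasonable assumptions on $F$ and $W$:

  • $F: \mathbb{R}^+ \to \mathbb{R}^+$ is a bounded, nonnegative and nonincreasing function.

  • $F$ is continuously differentiable with globally Lipschitz continuous derivative and $G = \sqrt{2F}$ is globally Lipschitz continuous.

  • $W \in W^{1,\infty}(\mathbb{R}^d) \cap H^2(\mathbb{R}^d)$.

Definition 1

We call $\varrho \in L^\infty(0,T;\mathcal{P}(\Omega))$, where $\mathcal{P}(\Omega)$ denotes the set of probability measures on $\Omega$, a weak solution of (4.1) subject to the initial condition (4.2) with $0 \le \varrho_0 \in L^2(\Omega)\cap\mathcal{P}(\Omega)$, if $\nabla(F(W\ast\varrho)\varrho) \in L^2(0,T;L^2(\Omega))$ and for every smooth, compactly supported test function $\varphi \in C_c^\infty([0,T)\times\Omega)$ we have

$$\int_0^T\int_\Omega \varrho\,\frac{\partial\varphi}{\partial t}\,\mathrm{d}x\,\mathrm{d}t + \int_\Omega \varrho_0\,\varphi(t=0)\,\mathrm{d}x = \int_0^T\int_\Omega \nabla\left(F(W\ast\varrho)\varrho\right)\cdot\nabla\varphi\,\mathrm{d}x\,\mathrm{d}t. \tag{4.3}$$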

The proof of the existence of weak solutions in the sense of Definition 1 will be performed in two steps. First, we consider an approximating, uniformly parabolic equation, and prove the existence of its solutions. Then, we remove the approximation in a limiting procedure, to obtain global in time distributional solutions.

4.1. Approximation

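In order to obtain a uniformly positive diffusion coefficient, we use the approximation $F_\varepsilon(z) := F(z) + \varepsilon$ for $\varepsilon > 0$, and analyze the approximating equation $\frac{\partial\varrho}{\partial t} = \Delta(F_\varepsilon(W\ast\varrho)\varrho)$, which we write in the Fokker–Planck form

$$\frac{\partial\varrho}{\partial t} = \nabla\cdot\left(\varrho\,\nabla F_\varepsilon(W\ast\varrho) + F_\varepsilon(W\ast\varrho)\,\nabla\varrho\right) \tag{4.4}$$

subject to the initial condition

$$\varrho(t=0) = \varrho_0. \tag{4.5}$$

Lemma 1

For every $\varepsilon > 0$, $T > 0$ and $\varrho_0 \in L^2(\Omega)$ there exists a nonnegative weak solution

$$\varrho^\varepsilon \in L^2(0,T;H^1(\Omega)) \cap L^\infty(0,T;L^2(\Omega)) \cap H^1(0,T;H^{-1}(\Omega)) \tag{4.6}$$

of (4.4) in the sense of Definition 1 (with $F_\varepsilon$ in place of $F$).

Proof. The announced solution is obtained by a standard application of the Schauder fixed point theorem, see, e.g.,  [32], (4.4) being a uniformly parabolic equation including a drift term with the velocity field $\nabla F_\varepsilon(W\ast u) \in L^\infty((0,T)\times\Omega)$. □

Next, we derive a priori estimates on $\varrho^\varepsilon$ that are uniform in $\varepsilon$, which will allow us to pass to the limit $\varepsilon \to 0$ in (4.4).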

Lemma 2

There exists a constant $C$ independent of $\varepsilon > 0$ small enough, such that the solutions $\varrho^\varepsilon$ of (4.4), constructed in Lemma 1, subject to the initial datum $\varrho_0 \in L^2(\Omega)\cap\mathcal{P}(\Omega)$, satisfy





Proof. For the sake of better legibility, we will drop the superscript at $\varrho^\varepsilon$ in the proof. Since $\varrho$ is constructed as a nonnegative weak solution of the Fokker–Planck equation (4.4) with the initial datum $\varrho_0 \in L^2(\Omega)\cap\mathcal{P}(\Omega)$, the first estimate follows immediately due to the mass conservation


Using ϱ as a test function for (4.4) (note that due to (4.6), ϱ is indeed an admissible test function) and the identity


we obtain


for $s \in (0,T)$, where we introduced the shorthand notation $F_\varepsilon \equiv F_\varepsilon(W\ast\varrho)$. The key point is to use the identity


which leads to


Now, since $|G'(y)| \le C$ for $0 \le y \le \sup_{x\in\Omega} W\ast\varrho(x) \le \|W\|_{L^\infty}$, we have




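and the Gronwall inequality yields a uniform estimate for $\varrho$ in $L^\infty(0,T;L^2(\Omega))$, which in turn implies a uniform estimate for $F_\varepsilon(W\ast\varrho)\varrho$ in $L^2(0,T;H^1(\Omega))$ and, subsequently, also for $\frac{\partial\varrho}{\partial t}$ in $L^2(0,T;H^{-1}(\Omega))$.

Finally, we derive the estimates for $W\ast\varrho$. Since $W \in W^{1,\infty}(\mathbb{R}^d)$, we immediately obtain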


Moreover, we have


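and with $W \in H^2(\mathbb{R}^d)$, $\varrho \in L^\infty(0,T;\mathcal{P}(\Omega))$ and the uniform boundedness of $F_\varepsilon$ in $L^\infty((0,T)\times\Omega)$, we obtain a uniform bound for $W\ast\frac{\partial\varrho}{\partial t}$ in $L^\infty(0,T;L^2(\Omega))$. □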

4.2. Global existence

From the above approximation and a priori estimates we can easily pass to the limit $\varepsilon \to 0$ and conclude the existence of a weak solution for (4.1):

Theorem 1

Let $\varrho_0 \in L^2(\Omega)\cap\mathcal{P}(\Omega)$ and $T > 0$. Then there exists a solution


of (4.1)–(4.2) in the sense of the weak formulation (4.3), such that in addition



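Proof. We use the uniform estimates of Lemma 2 and the Banach–Alaoglu theorem  [33] to extract a subsequence $\varrho^{\varepsilon_n}$ such that

$$\varrho^{\varepsilon_n} \rightharpoonup \varrho \quad \text{weakly-* in } L^\infty(0,T;L^2(\Omega)),$$
$$\frac{\partial\varrho^{\varepsilon_n}}{\partial t} \rightharpoonup \frac{\partial\varrho}{\partial t} \quad \text{weakly in } L^2(0,T;H^{-1}(\Omega)),$$
$$F_{\varepsilon_n}(W\ast\varrho^{\varepsilon_n})\,\varrho^{\varepsilon_n} \rightharpoonup u \quad \text{weakly in } L^2(0,T;H^1(\Omega)),$$

for some $u \in L^2(0,T;H^1(\Omega))$. Moreover, we use a variant of the Aubin–Lions Lemma  [34] to conclude the compact embedding of $L^\infty(0,T;W^{1,\infty}(\Omega)) \cap W^{1,\infty}(0,T;L^2(\Omega))$ into $L^\infty(0,T;C(\Omega))$. Thus, we can extract a further subsequence, again denoted by $\varrho^{\varepsilon_n}$, such that as $\varepsilon_n \to 0$,

$$W\ast\varrho^{\varepsilon_n} \to v \quad \text{strongly in } L^\infty(0,T;C(\Omega)).$$

Since further

$$W\ast\varrho^{\varepsilon_n} \rightharpoonup W\ast\varrho \quad \text{weakly-* in } L^\infty(0,T;L^2(\Omega)),$$

we conclude $v = W\ast\varrho$ by the uniqueness of the limit. Due to the continuity properties of $F$ and $F'$, we also have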

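$$F_{\varepsilon_n}(W\ast\varrho^{\varepsilon_n}) \to F(W\ast\varrho) \quad \text{strongly in } L^\infty(0,T;C(\Omega)),$$
$$F'_{\varepsilon_n}(W\ast\varrho^{\varepsilon_n}) \to F'(W\ast\varrho) \quad \text{strongly in } L^\infty(0,T;C(\Omega)),$$
$$\nabla F_{\varepsilon_n}(W\ast\varrho^{\varepsilon_n}) \to \nabla F(W\ast\varrho) \quad \text{strongly in } L^\infty(0,T;C(\Omega)).$$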


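$$F_{\varepsilon_n}(W\ast\varrho^{\varepsilon_n})\,\varrho^{\varepsilon_n} \rightharpoonup F(W\ast\varrho)\,\varrho \quad \text{weakly-* in } L^\infty(0,T;L^2(\Omega)),$$

and, again, by the uniqueness of the limit we identify $u = F(W\ast\varrho)\varrho$. Finally, we use the identity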


and the above limits to conclude

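$$\nabla\left(F_{\varepsilon_n}(W\ast\varrho^{\varepsilon_n})\,\varrho^{\varepsilon_n}\right) \rightharpoonup \nabla\left(F(W\ast\varrho)\,\varrho\right) \quad \text{weakly in } L^2(0,T;L^2(\Omega)).$$

Hence, we can pass to the limit $\varepsilon_n \to 0$ in the weak formulation (4.3) and conclude that $\varrho$ is a weak solution of (4.1)–(4.2). □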

Finally, we want to reduce the regularity of the initial value towards solely probability measures. This is relevant since we shall see below that indeed Dirac δ distributions can be stationary solutions. For this sake we define the very weak (distributional) notion of the solution:

Definition 2

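We call $\varrho \in L^\infty(0,T;\mathcal{P}(\Omega))$ a very weak (distributional) solution of (4.1) subject to the initial condition (4.2) with $\varrho_0 \in \mathcal{P}(\Omega)$, if for every smooth, compactly supported test function $\varphi \in C_c^\infty([0,T)\times\Omega)$ we have

$$\int_0^T\int_\Omega \frac{\partial\varphi}{\partial t}\,\varrho\,\mathrm{d}x\,\mathrm{d}t + \int_\Omega \varphi(t=0)\,\varrho_0\,\mathrm{d}x = -\int_0^T\int_\Omega \left(F(W\ast\varrho)\,\Delta\varphi\right)\varrho\,\mathrm{d}x\,\mathrm{d}t, \tag{4.7}$$

where we denote by $\varrho\,\mathrm{d}x$ the integration with respect to the probability measure $\varrho(t,\cdot)$.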

Theorem 2

Let $\varrho_0 \in \mathcal{P}(\Omega)$ and $T > 0$. Then there exists a very weak solution


of (4.1)–(4.2) in the sense of (4.7).


Proof. Let us consider a sequence $(\varrho_0^n)_{n\in\mathbb{N}} \subset L^2(\Omega)\cap\mathcal{P}(\Omega)$ such that $\varrho_0^n \to \varrho_0$ in the tight topology of $\mathcal{P}(\Omega)$, see, e.g.,  [35]. Moreover, let us denote by $\varrho^n$ the corresponding weak solutions of (4.1)–(4.2), which are trivially also the very weak solutions in the sense of (4.7).

We proceed as in the proof of Theorem 3.2 of  [36] to show tight equicontinuity in $t$ and tight locally uniform boundedness of the family $\varrho^n$. For a test function $\varphi \in W^{2,\infty}(\Omega)$, we have


and due to the boundedness of F and the mass conservation, we immediately obtain


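This implies the equicontinuity with respect to test functions in $W^{2,\infty}(\Omega)$,

$$\left|\int_\Omega \varphi\,\varrho^n(t,x)\,\mathrm{d}x - \int_\Omega \varphi\,\varrho^n(s,x)\,\mathrm{d}x\right| \le C(\varphi)\,|t-s|. \tag{4.8}$$

Now if $\varphi \in C_b(\Omega)$, then for every $\delta > 0$ there exists $\varphi_\delta \in W^{2,\infty}(\Omega)$ such that $\|\varphi - \varphi_\delta\|_{L^\infty(\Omega)} \le \delta$. By the above inequality and the mass conservation $\|\varrho\|_{L^1(\Omega)} = 1$, we have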


implying the tight equicontinuity of $\varrho^n(t,\cdot)$.

With a test function $\varphi_R(x) := 1 - \beta(|x|^2/R^2)$ with a nonincreasing smooth $\beta$ such that $\beta(r) = 1$ for $0 \le r \le 1/2$ and $\beta(r) = 0$ for $r \ge 1$, (4.8) gives


where $B_R$ denotes the ball in $\mathbb{R}^d$ of radius $R$. This implies the locally uniform tight boundedness of the family $\varrho^n(t,\cdot)$, and, consequently, by the Prokhorov criterion  [35] we have for a subsequence, again denoted by $\varrho^n$,

$$\varrho^n \rightharpoonup \varrho \quad \text{weakly-* in } L^\infty(0,T;\mathcal{P}(\Omega)).$$

To show that $\varrho$ is a solution to (4.7), we need to prove that $F(W\ast\varrho^n)$ converges to $F(W\ast\varrho)$ strongly in $L^1(0,T;C(\Omega))$. First, let us note that, due to the assumptions on $W$, $W\ast\varrho^n$ is uniformly bounded in $L^\infty(0,T;W^{1,\infty}(\Omega))$, and the same holds for $F(W\ast\varrho^n)$. Consequently, the sequence $F(W\ast\varrho^n)\varrho^n$ is uniformly bounded in $L^\infty(0,T;\mathcal{P}(\Omega))$. From (4.1) it immediately follows that $\frac{\partial\varrho^n}{\partial t} = \Delta(F(W\ast\varrho^n)\varrho^n)$ is uniformly bounded in $L^\infty(0,T;(C_c^2(\Omega))')$, where $(C_c^2(\Omega))'$ denotes the dual space to $C_c^2(\Omega)$, the space of twice continuously differentiable functions with compact support in $\Omega$. Consequently, the sequence $\frac{\partial}{\partial t}F(W\ast\varrho^n) = F'(W\ast\varrho^n)\,W\ast\frac{\partial\varrho^n}{\partial t}$ is uniformly bounded in $L^\infty(0,T;(C_c^2(\Omega))')$, and the generalization of the Aubin–Lions lemma by Simon  [34] implies then the strong convergence of $F(W\ast\varrho^n)$ to $F(W\ast\varrho)$ in $L^1(0,T;C(\Omega))$. □

4.3. Stationary solutions

The numerical simulations provided in Section  5 suggest that for $G > 0$ (for instance, $G(s) = e^{-s}$) and a bounded domain with periodic boundary conditions, the stationary solutions of (3.4) consist of one or more localized, but not compactly supported aggregates (clumps), see the bottom right panel of Figs. 5 and 7. However, we were not able to characterize these stationary aggregates analytically. Instead, we provide a few examples of other types of stationary solutions, posed either in the full space setting $\Omega = \mathbb{R}^d$ or on a torus $\Omega = \mathbb{T}^d$ with periodically extended $W$. These examples are rather trivial, however, they still provide an interesting insight into the relatively rich structure of the solutions of (3.4).

  • The most trivial type of stationary solution is the constant state $\varrho \equiv c > 0$. Clearly, this has finite mass only if $\Omega = \mathbb{T}^d$.

  • If there exists an $s_0 > 0$ such that $G(s) \equiv 0$ for all $s \ge s_0$, then any profile $\varrho$ such that $\varrho \ge s_0\left(\int_\Omega W(x)\,\mathrm{d}x\right)^{-1}$ almost everywhere on $\Omega$ is a stationary solution to the distributional formulation (4.7). Indeed, such a solution satisfies $W\ast\varrho \ge s_0$, so that $G(W\ast\varrho) \equiv 0$. However, again, this solution has infinite mass if $\Omega = \mathbb{R}^d$.

  • If there exists an $s_0 > 0$ such that $G(s) \equiv 0$ for all $s \ge s_0$, and, moreover, $W$ is continuous on $\Omega$, we construct the atomic measure
    $$\varrho = \sum_{i=1}^N c_i\,\delta(x - x_i)$$
    for some $N \in \mathbb{N}$, $x_i \in \Omega$ and $c_i > 0$ such that $c_i W(0) > s_0$ for all $i = 1,\dots,N$. Then $\varrho$ is a distributional stationary solution to (4.7). Indeed, for any $i = 1,\dots,N$ we have
    $$W\ast\varrho(x_i) = \sum_{j=1}^N c_j\,W(x_i - x_j) \ge c_i\,W(0) > s_0.$$
    By the continuity of $W$ and, consequently, $W\ast\varrho$, we have $W\ast\varrho > s_0$ on some neighborhood of $x_i$. Therefore, $G(W\ast\varrho) \equiv 0$ on some open set containing $\bigcup_{i=1}^N \{x_i\}$ and, finally, $G(W\ast\varrho)\varrho \equiv 0$ everywhere on $\Omega$.
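The atomic steady state of the last example can be verified numerically in a toy one-dimensional setting. The sketch below is an illustration only: the Gaussian weight with $W(0) = 1$, the piecewise linear degenerate mobility `G` with threshold `s0`, and the particular atoms and weights are hypothetical choices that merely satisfy the stated condition $c_i W(0) > s_0$.

```python
import numpy as np

def perceived_density_of_atoms(points, weights, W, x):
    """(W * rho)(x) for the atomic measure rho = sum_i c_i delta(. - x_i)."""
    return sum(c * W(x - p) for p, c in zip(points, weights))

def G(s, s0=0.2):
    """Assumed degenerate mobility: positive for s < s0, identically zero for s >= s0."""
    return np.maximum(0.0, 1.0 - s / s0)

def is_stationary_atomic_state(points, weights, W, s0=0.2):
    """Check that G(W * rho) vanishes at every atom, so that G(W * rho) rho = 0."""
    return all(
        G(perceived_density_of_atoms(points, weights, W, p), s0) == 0.0
        for p in points
    )
```

With `W = lambda z: np.exp(-0.5 * (z / 0.05) ** 2)`, atoms at `[0.2, 0.7]` and weights `[0.5, 0.5]`, each atom perceives a density of at least `0.5 > s0`, so the mobility vanishes there and the measure does not move, while the mobility stays positive far away from the atoms where no mass is located.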

Fig. 5. The first-order mean-field model in a periodic 1D setting with random initial condition (upper left panel). Steady state was reached at $t \approx 11$ (lower right panel).

Fig. 7. An example of patterns produced by the first-order mean-field model in a periodic 2D setting with the sampling radii $R = 0.6$ (left panel) and $R = 0.11$ (right panel).

4.4. Linear stability analysis

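In this section we perform a linear stability analysis of constant density states of the nonlinear diffusion equation (3.4). As discussed previously, we have constant steady state solutions either in the full-space setting $\Omega = \mathbb{R}^d$ (however, with infinite mass), or on a bounded domain $\Omega \subset \mathbb{R}^d$ with periodic boundary conditions. Without loss of generality, we assume the normalization $\int_\Omega W(x)\,\mathrm{d}x = 1$. Let us make the perturbation ansatz $\varrho = \varrho_0 + \varepsilon\tilde\varrho$, where $\varrho_0 > 0$ is a constant state such that $G(\varrho_0) > 0$ and $G'(\varrho_0) < 0$. Then $W\ast\varrho = \varrho_0 + \varepsilon W\ast\tilde\varrho$ and assuming sufficient smoothness of $G$, we have

$$G(W\ast\varrho) = G(\varrho_0) + \varepsilon\,G'(\varrho_0)\,W\ast\tilde\varrho + O(\varepsilon^2).$$

Inserting the ansatz into (3.4) and collecting terms of order $O(\varepsilon)$, we arrive at

$$\frac{\partial\tilde\varrho}{\partial t} = \frac{G(\varrho_0)}{2}\,\Delta\left(G(\varrho_0)\,\tilde\varrho + 2G'(\varrho_0)\,\varrho_0\,W\ast\tilde\varrho\right).$$

Performing the Fourier transform, we obtain

$$\frac{\partial\hat\varrho}{\partial t} + |\xi|^2\,\frac{G(\varrho_0)}{2}\left(G(\varrho_0) + 2G'(\varrho_0)\,\varrho_0\,\hat W\right)\hat\varrho = 0, \tag{4.9}$$

where we denoted by $\hat\varrho = \hat\varrho(t,\xi)$ the Fourier transform of $\tilde\varrho$. Consequently, with the assumption $G(\varrho_0) > 0$ and $G'(\varrho_0) < 0$, those wavenumbers $\xi$ of $\tilde\varrho$ are stable for which

$$\operatorname{Re}\hat W(\xi) < -\frac{G(\varrho_0)}{2G'(\varrho_0)\,\varrho_0}. \tag{4.10}$$

Since $W \in L^1(\mathbb{R}^d)$, we have $\hat W \in C_0(\mathbb{R}^d)$, and, therefore, all wavenumbers of $\tilde\varrho$ larger than a certain threshold will be stable. On the other hand, wavenumbers violating (4.10) will lead to pattern formation, as we show in our numerical examples, Section  5. Moreover, a quick inspection of (4.9) leads to the expectation that the stable high wavenumbers will be smoothed-out on a faster time scale, compared to the slower emergence of patterns due to the unstable lower wavenumbers. Finally, considering the scalings $W_r(x) := r^{-d}W_1(x/r)$ of a fixed kernel $W_1 \in L^1(\mathbb{R}^d)$, we have $\hat W_r(\xi) = \hat W_1(r\xi)$ and, therefore, we expect that the wavelength of the patterns (size of aggregates) will scale with $r$. This can be also clearly observed in our numerical examples.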
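The stability criterion (4.10) is easy to evaluate for a concrete kernel. The sketch below assumes, purely for illustration, a unit-mass Gaussian weight of width `r` on the real line (whose Fourier transform is again a Gaussian) and the mobility $G(s) = e^{-s}$, for which the right-hand side of (4.10) equals $1/(2\varrho_0)$; the function names and the Fourier convention are hypothetical choices.

```python
import numpy as np

def growth_rate(xi, rho0, G, dG, W_hat):
    """Growth rate sigma(xi) of the mode xi, read off from (4.9): d/dt rho_hat = sigma * rho_hat."""
    g, dg = G(rho0), dG(rho0)
    return -np.abs(xi) ** 2 * 0.5 * g * (g + 2.0 * dg * rho0 * np.real(W_hat(xi)))

def gaussian_W_hat(r):
    """Fourier transform of a unit-mass Gaussian of width r (convention: int W(x) e^{-i x xi} dx)."""
    return lambda xi: np.exp(-0.5 * (r * np.abs(xi)) ** 2)

def unstable_band_edge(rho0, r):
    """Largest unstable wavenumber for G(s) = exp(-s) and the Gaussian kernel above.

    Solves exp(-r^2 xi^2 / 2) = 1 / (2 rho0); returns 0 if every wavenumber is stable.
    """
    if rho0 <= 0.5:
        return 0.0
    return np.sqrt(2.0 * np.log(2.0 * rho0)) / r
```

For `rho0 = 2.0` and `r = 0.1`, wavenumbers below `unstable_band_edge(2.0, 0.1)` have a positive growth rate and those above it decay. Doubling `r` halves the band edge, in line with the scaling $\hat W_r(\xi) = \hat W_1(r\xi)$. On a periodic domain of length $L$ only the discrete wavenumbers $2\pi k/L$ are admissible, so at least one of them must fall inside the unstable band for patterns to emerge.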

It is interesting to consider the formal extremal case $W = \delta_0$, i.e., $W\ast\tilde\varrho = \tilde\varrho$. Then $\hat W \equiv 1$ and we conclude that the constant state $\varrho_0$ is stable if and only if

$$\left(G(\varrho_0)^2\varrho_0\right)' = G(\varrho_0)^2 + 2G(\varrho_0)\,G'(\varrho_0)\,\varrho_0 > 0. \tag{4.11}$$

In fact, violation of (4.11) with $W = \delta_0$ leads to an ill-posed problem, since then (3.4) looks like a backward heat equation, which can be seen if we expand the derivatives and write it in the Fokker–Planck form as

$$\frac{\partial\varrho}{\partial t} = \frac12\nabla\cdot\left(G(\varrho)\left[2G'(\varrho)\,\varrho + G(\varrho)\right]\nabla\varrho\right). \tag{4.12}$$

The equation is parabolic (and thus well-posed) only if the diffusivity $2G(\varrho)G'(\varrho)\varrho + G^2(\varrho)$ is strictly positive, which is our stability condition (4.11). Therefore, only imposing an initial condition $\varrho_0$ uniformly satisfying (4.11) leads to a well-posed diffusion equation for all $t \ge 0$, since (4.11) is preserved due to the maximum principle. On the other hand, if $G > 0$, the nonlocality $W \in L^1(\mathbb{R}^d)$ always stabilizes the equation in the sense of (4.10). Indeed, writing it in the Fokker–Planck form

$$\frac{\partial\varrho}{\partial t} = \frac12\nabla\cdot\left(G^2(W\ast\varrho)\,\nabla\varrho + 2G(W\ast\varrho)\,G'(W\ast\varrho)\,\varrho\,\nabla W\ast\varrho\right),$$

we see that the second-order term appears with the positive diffusivity $G^2(W\ast\varrho)$. The first-order term describes then the transport of $\varrho$ along the generalized gradient $\nabla W\ast\varrho$ and is responsible for the aggregative effect.

Clearly, the ill-posedness of (4.12) can be avoided by merely introducing a nonlocality in the first-order transport term, while the diffusivity may stay local (and possibly degenerate). Such a model was derived and studied in  [20], which with our notation is written as


This equation was constructed as a model for biological aggregations in which individuals experience long-range social attraction and short-range dispersal. Let us note that here, in contrast to our model, the diffusivity is an increasing function of the density $\varrho$. In [20] it was shown that it produces strongly nonlinear states with compact support and steep edges that correspond to localized biological aggregations, or clumps. As in our numerical simulations in Section 5, these clumps are approached through a dynamic coarsening process.

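Another insight into the stabilizing effect of the nonlocality is provided by the introduction of a formal expansion of $W$. Taking $W$ as the standard mollifier and $W_\varepsilon(s):=\varepsilon^{-d}W(s/\varepsilon)$, such that $W_\varepsilon\to\delta_0$ as $\varepsilon\to 0$, we expand

$$W_\varepsilon*\varrho(x) = \int_{\mathbb{R}^d}W(z)\,\varrho(x-\varepsilon z)\,dz = \int_{\mathbb{R}^d}W(z)\left[\varrho(x) - \varepsilon\,z\cdot\nabla\varrho(x) + \frac{\varepsilon^2}{2}\,z\cdot\nabla^2\varrho(x)\,z\right]dz + O(\varepsilon^3).$$
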
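Now, due to the symmetry $W(z)=W(-z)$, which makes all odd moments of $W$ vanish, and the normalization $\int_{\mathbb{R}^d}W(z)\,dz=1$, we have

$$W_\varepsilon*\varrho = \varrho + \frac{\varepsilon^2\beta}{2}\Delta\varrho + O(\varepsilon^4),$$
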
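where the constant $\beta>0$ is such that $\int_{\mathbb{R}^d}W(z)z_iz_j\,dz=\beta\delta_{ij}$. Inserting this into (3.4), we obtain

$$\varrho_t = \frac{1}{2}\Delta\left(G(\varrho)^2\varrho + \varepsilon^2\beta\,G(\varrho)G'(\varrho)\varrho\,\Delta\varrho\right) + O(\varepsilon^4).$$
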
Up to the terms of fourth order in $\varepsilon$, this is a Cahn–Hilliard-type equation which is well-posed for every $\varepsilon>0$ since the term $\varepsilon^2\beta G(\varrho)G'(\varrho)\varrho$ is strictly negative if $G(\varrho)>0$ and $G'(\varrho)<0$. However, if $\varepsilon=0$, i.e. $W=\delta_0$, this regularizing effect is lost.

5. Numerical examples

In this section we present numerical examples for the first- and second-order individual based models (2.2) and (2.4)–(2.5) in 2D, the first-order mean-field limit (3.4) in 1D and 2D, and the second-order mean-field limit (3.5) in 1D with 1D velocity.

5.1. The first-order individual based model (2.2)

We consider a system consisting of $N=400$ individuals in a 2D domain $\Omega=(0,1)\times(0,1)$ with periodic boundary conditions. The initial positions are generated randomly and independently for each individual from the uniform distribution on $\Omega$; we took the same initial condition for all three experiments below. We choose $W(x)=w(|x|)$ with $w$ the characteristic function of the interval $[0,R]$, corresponding to the sampling radius $R$, and for $R$ we take the values $0.025$ (Fig. 1), $0.05$ (Fig. 2) and $0.1$ (Fig. 3). For $G$ we make the choice $G(s)=\exp(-s/3)$. The system of stochastic differential equations (2.2) is integrated in time using the Euler–Maruyama scheme with time-step length $\Delta t=10^{-3}$. We used the linear stability analysis in Section 4.4 to make the “right” choice of $G$, such that we could observe pattern formation. Indeed, if $G$ decreases too quickly, the system will “freeze” immediately, before any aggregates can form; on the other hand, if $G$ does not decrease fast enough, the system is “overheated” and does not allow aggregates to persist.
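
A minimal sketch of one Euler–Maruyama step for a density-dependent random walk of this kind is given below. It assumes that (2.2) has the form $dX_i=G(\vartheta_i)\,dB_i$ and that the perceived density $\vartheta_i$ is the number of other individuals within distance $R$, multiplied by a hypothetical factor `scale`; the exact normalization in (2.2) may differ. The function names are illustrative.

```python
import numpy as np

def perceived_density(X, R, scale=1.0):
    """Count of other particles within distance R on the periodic unit square."""
    D = X[:, None, :] - X[None, :, :]      # pairwise displacements
    D -= np.round(D)                       # minimum-image convention (period 1)
    dist = np.sqrt((D ** 2).sum(axis=-1))
    counts = (dist <= R).sum(axis=1) - 1   # subtract 1 to exclude the particle itself
    return scale * counts

def euler_maruyama_step(X, R, dt, rng, G=lambda s: np.exp(-s / 3.0), scale=1.0):
    """One step of X_i <- X_i + G(theta_i) * sqrt(dt) * N(0, I), wrapped to [0, 1)."""
    theta = perceived_density(X, R, scale)
    noise = rng.standard_normal(X.shape)
    X_new = X + G(theta)[:, None] * np.sqrt(dt) * noise
    return X_new % 1.0                     # periodic boundary conditions

rng = np.random.default_rng(0)
X = rng.random((400, 2))                   # uniform random initial positions
for _ in range(100):
    X = euler_maruyama_step(X, R=0.05, dt=1e-3, rng=rng)
```

The pairwise distance matrix costs $O(N^2)$ memory and time per step, which is unproblematic for $N=400$; a cell list would be the natural replacement for much larger populations.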

Fig. 1. The first-order individual based model with $N=400$ agents and sampling radius $R=0.025$, subject to a random initial condition.

Fig. 2. The first-order individual based model with $N=400$ agents and sampling radius $R=0.05$, subject to a random initial condition.

Fig. 3. The first-order individual based model with $N=400$ agents and sampling radius $R=0.1$, subject to a random initial condition.

In Figs. 1–3 we observe that with a smaller sampling radius $R$, a larger number of small aggregates is created on a faster time scale. The choice $R=0.1$ (Fig. 3) leads to the creation of a single aggregate. This aggregate is approximately ring-shaped, with a higher density of particles around the circumference and a lower density in the middle. This can be explained by the fact that the aggregate grows by “capturing” particles from its neighborhood, and once a particle is captured, its mobility is greatly reduced, so that it only slowly makes its way towards the center of the aggregate. Let us also mention that the right-most panels in Figs. 1–3 present quasi-steady states, where the aggregates are in a dynamic equilibrium with very few free-running particles. However, on a very long time scale, the smaller aggregates typically disintegrate and their particles are caught by the larger ones. These large aggregates are stable and have approximately fixed radii, i.e., they neither disintegrate nor collapse. Such coarsening behavior is typical of diffusive aggregation systems, such as the Cahn–Hilliard equation or the nonlocal continuum model of [20].

5.2. The second-order individual based model (2.4)–(2.5)

Generally, the behavior of the second-order individual based model is very similar to that of the first-order one, at least if we do not impose a restricted cone of vision (i.e., $W$ in (2.3) does not depend on $v$). Indeed, with a suitable choice of parameters, one again observes the formation of quasi-stable aggregates, whose number and size depend on the sampling radius. The most striking difference with respect to the first-order model is that the movement of the agents is smoother (their velocities are continuous) and the shape of the aggregates is more complex (we observed the emergence of ellipsoidal aggregates, instead of the almost-circular ones in the first-order model).

The situation becomes slightly more interesting if we consider a restricted cone of vision. We choose $180^\circ$, such that $W(x,v)=w\left(|x|,\frac{x\cdot v}{|x||v|}\right)$ with $w(s,z)=\chi_{[0,R]}(s)\chi_{[0,1]}(z)$. The sampling radius is chosen as $R=0.05$. Again, we simulate with $N=400$ individuals in a 2D domain $\Omega=(0,1)\times(0,1)$ with periodic boundary conditions, and $G(s)=\exp(-s/3)$. For $H$ we set the constant $H\equiv 2$. The system of stochastic differential equations (2.4)–(2.5) is integrated in time using the Euler–Maruyama scheme with time-step length $\Delta t=10^{-3}$. The initial positions of the agents are generated randomly in the same way as for the first-order model, while their initial velocities are generated independently and randomly from the 2D normalized Gaussian distribution. Three snapshots of the evolution of the system are shown in Fig. 4, with the velocity of each individual marked by a line segment. The right-most panel in Fig. 4 shows the quasi-steady state with two aggregates formed, in dynamic equilibrium with the “free-running” individuals. Compared to the previous simulations, the aggregates are visibly less densely packed and their shapes are less circular. Also, the portion of “free-running” particles is much higher, which clearly is an effect of the restricted cone of vision: the captured particles can leave the aggregate more easily, since the randomness of their motion is increased when their cone of vision has small or no intersection with the aggregate, i.e., when they are “not seeing” the aggregate.
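
The $180^\circ$ cone of vision and one time step of the second-order dynamics can be sketched as follows. The velocity update $dv=-H(\vartheta)v\,dt+G(\vartheta)\,dB$ is the stochastic counterpart of the velocity operator in (5.2). The remaining choices are assumptions: $x$ in $W(x,v)$ is taken as the displacement from the observer to its neighbor, the perceived density is an unnormalized neighbor count, and the function names are hypothetical.

```python
import numpy as np

def cone_density(X, V, R):
    """Number of neighbors within distance R lying in the half-plane ahead of v_i."""
    D = X[None, :, :] - X[:, None, :]        # D[i, j] = x_j - x_i
    D -= np.round(D)                         # periodic unit square
    dist = np.linalg.norm(D, axis=-1)
    ahead = np.einsum("ijk,ik->ij", D, V) >= 0.0   # cosine of the angle in [0, 1]
    seen = (dist <= R) & ahead
    np.fill_diagonal(seen, False)            # a particle does not see itself
    return seen.sum(axis=1).astype(float)

def second_order_step(X, V, R, dt, rng,
                      G=lambda s: np.exp(-s / 3.0),
                      H=lambda s: np.full_like(s, 2.0)):
    """Euler-Maruyama step for dx = v dt, dv = -H(theta) v dt + G(theta) dB."""
    theta = cone_density(X, V, R)
    noise = rng.standard_normal(V.shape)
    X_new = (X + V * dt) % 1.0
    V_new = V - H(theta)[:, None] * V * dt + G(theta)[:, None] * np.sqrt(dt) * noise
    return X_new, V_new

rng = np.random.default_rng(0)
X = rng.random((400, 2))
V = rng.standard_normal((400, 2))            # Gaussian initial velocities
for _ in range(100):
    X, V = second_order_step(X, V, R=0.05, dt=1e-3, rng=rng)
```

Testing the sign of the dot product avoids dividing by $|x||v|$, which would be undefined for coinciding particles or vanishing velocities.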

Fig. 4. The second-order individual based model with $N=400$ agents, sampling radius $R=0.05$ and $180^\circ$ cone of vision, subject to a random initial condition.

5.3. The first-order mean-field model (3.4) in 1D

We simulated the first-order mean-field model (3.4) in the 1D periodic domain $(0,1)$, using a semi-implicit finite difference discretization for the space variable and the first-order forward Euler method for the time variable. The space grid consisted of 200 equidistant points, the time step was $10^{-4}$. As before, we chose $W(x)=w(|x|)$ with $w$ the characteristic function of the interval $[0,0.1]$, and $G(s)=\exp(-s/3)$. We imposed a random initial condition for $\varrho$, generated by drawing, for every grid point, a random number from the uniform distribution on $[0,1]$. Snapshots of the evolution are shown in Fig. 5. We observe that quite a strong smoothing effect takes place on the fast time scale (first row in Fig. 5), while aggregation takes place on a time scale approximately one order of magnitude slower (second row); this is explained by the stability analysis in Section 4.4. First, two aggregates of different sizes are created. However, both of them are unstable: the smaller one is smoothed out, while the larger one grows further, until the steady state is reached (lower right panel). Observe also the characteristic “fork”-like shape of the profile in the lower mid panel ($t=2.65$), which is due to the mass arriving from the neighborhood with higher diffusivity than the diffusivity in the middle of the profile; compare also with the ring-shaped aggregate in the right panel of Fig. 3. However, this fork-like structure is eventually also smoothed out, and the steady profile is finally obtained. Let us note that the steady aggregate, although well localized, is not compactly supported, i.e., the profile has positive density everywhere on $[0,1]$.
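
One possible reading of such a semi-implicit scheme is sketched below: the nonlocal coefficient $G(W*\varrho)^2$ is frozen at the old time level and the resulting linear diffusion step is solved implicitly. This is an assumption about the discretization, not a description of the code used for Fig. 5, and the name `make_step` is hypothetical. The kernel is the unnormalized indicator of $[0,R]$ as stated in the text.

```python
import numpy as np

def make_step(n=200, R=0.1, dt=1e-4, G=lambda s: np.exp(-s / 3.0)):
    """Return a function advancing rho by one step of rho_t = 0.5*(G(W*rho)^2 rho)_xx."""
    dx = 1.0 / n
    k = np.arange(n)
    dist = np.minimum(k, n - k) * dx                 # periodic distance to grid point 0
    w_hat = np.fft.fft((dist <= R).astype(float))    # indicator kernel on the grid
    I = np.eye(n)
    # Periodic second-difference matrix.
    L = (np.roll(I, 1, axis=0) + np.roll(I, -1, axis=0) - 2.0 * I) / dx ** 2

    def step(rho):
        conv = np.real(np.fft.ifft(w_hat * np.fft.fft(rho))) * dx   # W * rho
        g2 = G(conv) ** 2                                           # frozen coefficient
        A = I - 0.5 * dt * L * g2[None, :]                          # I - dt/2 * L @ diag(g2)
        return np.linalg.solve(A, rho)

    return step

rng = np.random.default_rng(0)
rho = rng.random(200)                # uniform random initial condition on the grid
step = make_step()
for _ in range(100):
    rho = step(rho)
```

Because the columns of the difference matrix sum to zero, the discrete mass $\sum_i\varrho_i$ is conserved by each step, and the system matrix is a column diagonally dominant M-matrix, so nonnegative data stay nonnegative.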

5.4. The first-order mean-field model (3.4) in 2D

We simulated the first-order mean-field model (3.4) in the 2D periodic domain $(0,1)^2$, using the same type of discretization as in the 1D case. The space grid consisted of $100\times 100$ equidistant points, the time step was $10^{-4}$. We chose the sampling radius $R=0.07$, i.e., $W(x)=\chi_{[0,0.07]}(|x|)$, and $G(s)=\exp(-s/3)$. Snapshots of the evolution are shown in Fig. 6. Starting again from a random initial condition, we observed a rapid formation of approximately ring-shaped pre-aggregates, which eventually turn into an almost regular pattern of well localized (but not compactly supported) clumps. However, the smaller aggregates may be unstable and diffusively disintegrate, and their mass is absorbed by their neighbors, as shown in the lower right panel of Fig. 6. Finally, in Fig. 7 we present examples of patterns produced with the sampling radii $R=0.6$ (left panel) and $R=0.11$ (right panel).

Fig. 6. The first-order mean-field model in a periodic 2D setting with random initial condition (upper left panel). After the initial rapid smoothing of the high-frequency components, several ring-shaped structures are created (upper right panel), which eventually turn into an almost regular pattern of well localized aggregates (lower left panel). However, the smaller aggregates may be unstable and diffusively disintegrate, and their mass is absorbed by their neighbors (lower right panel).

5.5. The second-order mean-field model (3.5)

We conclude this section with simulations of the second-order mean-field model (3.5) in 1D. The spatial domain $\Omega=[a,b]$ is discretized using an equidistant mesh $x_i=a+i\Delta x$, the velocity domain $V=[v_{\min},v_{\max}]$ at grid points $v_j=v_{\min}+j\Delta v$. The time step $\Delta t$ satisfies the CFL condition for $v_{\max}$, i.e., $\Delta t=\Delta x/|v_{\max}|$. The numerical scheme is based on a splitting method: given an initial datum $f(x,v,0)=f_0(x,v)$, we split the system at every time $t_k=k\Delta t$ into the following steps:

  1. Solve the transport equation in $x$,
     $$f_t + v_j\,\partial_x f = 0, \tag{5.1}$$
     subject to periodic boundary conditions on $\Omega$, for every $v_j$ on the time interval $t\in[t_k,t_k+\frac{1}{2}\Delta t]$, using an upwind scheme with the superbee flux limiter.
  2. Starting with the solution of the transport equation (5.1), solve
     $$f_t = \partial_v\left(H(W*f)\,vf + \frac{1}{2}\partial_v\left(G(W*f)^2f\right)\right), \tag{5.2}$$
     with no-flux boundary conditions on $V$, on the time interval $[t_k,t_{k+1}]$, using a semi-implicit time discretization.
  3. Finally, solve (5.1), starting from the solution of (5.2), for another half time step $\frac{1}{2}\Delta t$.
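
The structure of the three steps above can be sketched in Python as follows. The transport step uses the standard flux-limited upwind form for constant-speed advection with the superbee limiter; the velocity step (5.2) is left as a user-supplied callable `velocity_step`, which is a hypothetical placeholder. All function names are illustrative, and the sketch is not the code used to produce the figures.

```python
import numpy as np

def superbee(theta):
    """Superbee limiter: max(0, min(1, 2*theta), min(2, theta))."""
    return np.maximum(0.0, np.maximum(np.minimum(1.0, 2.0 * theta),
                                      np.minimum(2.0, theta)))

def advect(u, a, dt, dx):
    """One flux-limited upwind step for u_t + a u_x = 0 on a periodic grid."""
    du = np.roll(u, -1) - u                        # jump across interface i+1/2
    du_up = np.roll(du, 1) if a >= 0 else np.roll(du, -1)   # jump on the upwind side
    safe = np.where(du != 0.0, du, 1.0)            # avoid division by zero
    theta = np.where(du != 0.0, du_up / safe, 0.0)
    nu = abs(a) * dt / dx                          # Courant number, must be <= 1
    flux = (max(a, 0.0) * u + min(a, 0.0) * np.roll(u, -1)
            + 0.5 * abs(a) * (1.0 - nu) * superbee(theta) * du)
    return u - dt / dx * (flux - np.roll(flux, 1))

def strang_step(f, v, dt, dx, velocity_step):
    """Half transport step, full velocity step, half transport step."""
    f = f.copy()
    for j, vj in enumerate(v):                     # step 1: transport in x
        f[:, j] = advect(f[:, j], vj, 0.5 * dt, dx)
    f = velocity_step(f, dt)                       # step 2: placeholder for (5.2)
    for j, vj in enumerate(v):                     # step 3: transport in x
        f[:, j] = advect(f[:, j], vj, 0.5 * dt, dx)
    return f

# Usage with an identity velocity step, i.e. pure free transport.
dx = 1e-2
v = np.linspace(-1.0, 1.0, 101)
f = np.ones((100, 101))
f = strang_step(f, v, dt=dx / abs(v).max(), dx=dx, velocity_step=lambda g, dt: g)
```

With $\Delta t=\Delta x/|v_{\max}|$, each half step runs at a Courant number of at most $1/2$, so the limited upwind update stays within its stability range.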

We set the computational domain to $\Omega=[0,1]$, the velocity domain to $V=[-1,1]$ and the mesh sizes to $\Delta x=10^{-2}$ and $\Delta v=2\times 10^{-2}$. We choose conditions similar to those in the individual based model, i.e. a limited cone of vision $W(x,v)=w\left(|x|,\frac{x\cdot v}{|x||v|}\right)$ with $w(s,z)=\chi_{[0,R]}(s)\chi_{[0,1]}(z)$. The sampling radius is set to $R=0.07$, $G(s)=\exp(-2s)$ and $H\equiv 2$. The initial datum corresponds to a small perturbation of a uniform distribution with mass one. Snapshots of the evolution of the particle distribution density $f(t,x,v)$ and the mass density $\rho(t,x)$ at different times are depicted in Fig. 8. We observe a fast smoothing of $f(t,x,v)$ in time and a subsequent formation of a stable aggregate.

Fig. 8. Second order model with limited vision and $R=0.07$; the red line in the left column indicates the initial datum.

If we decrease the sampling radius to R=0.03, two separate aggregates form, see Fig. 9. Again, we observe that with a smaller sampling radius the aggregation happens on a faster time scale.

Fig. 9. Second order model with limited vision and $R=0.03$; the red line in the left column indicates the initial datum.


JH acknowledges the financial support provided by the Austrian Science Foundation (FWF) project Y 432-N15 and the hospitality of the Faculty of Mathematics, University of Münster during his stay, where the work leading to this paper was initiated. MTW acknowledges financial support from the Austrian Science Foundation (FWF) via the Hertha Firnberg project T456-N23 and the Award KUK-I1-006-43 by the King Abdullah University of Science and Technology (KAUST).

