Proceedings of the National Academy of Sciences of the United States of America. 2021 Jun 25;118(26):e2015572118. doi: 10.1073/pnas.2015572118

The origin of cooperation

Nihal Koduri a,1,2, Andrew W Lo b,c,1
PMCID: PMC8255996  PMID: 34172574

Significance

We develop a mathematical theory of how natural selection operates in the presence of interaction between replicating units. Our results show that with interaction, natural selection does more than seek to selfishly maximize fecundity. It also seeks to minimize correlation of fecundity between replicating units. We argue that correlation is a mechanism by which evolution can select for cooperation. This mechanism is distinct from standard biological explanations like kin selection, group selection, and reciprocity, and it relies only on natural selection, without recourse to notions of evolutionary stability.

Keywords: evolution, cooperation, interaction, correlation, model

Abstract

We construct an evolutionary model of a population consisting of two types of interacting individuals that reproduce under random environmental conditions. We show that not only does the evolutionarily dominant behavior maximize the number of offspring of each type, it also minimizes the correlation between the number of offspring of each type, driving it toward −1. We provide several examples that illustrate how correlation can be used to explain the evolution of cooperation.


Cooperation is a key feature of human evolution, and many theories have been developed to explain its existence. These theories typically introduce auxiliary concepts such as inclusive fitness (1, 2), kin selection (3), group selection (4–10), multilevel selection (11), and reciprocal altruism (12). These auxiliary concepts are needed to reconcile the apparent conflict between natural selection (which is assumed to select for selfish offspring-maximizing behavior) and unselfish cooperation. Indeed, inclusive fitness, kin selection, group selection, multilevel selection, and reciprocal altruism are all mechanisms in which cooperation is selfish.

In this article, we develop an alternative theory for the evolution of cooperation. We construct a simple, but general, evolutionary model of a population consisting of two types of interacting individuals that reproduce under stochastic environmental conditions. Without making any additional assumptions about the nature of the individuals or their interactions, we characterize precisely how natural selection operates in this setting. We show that natural selection does more than select for selfish offspring-maximizing behavior. It also selects for behavior that minimizes the correlation of fecundity between individuals. In particular, we show, both theoretically and computationally, that behaviors that decrease correlation (i.e., drive it toward −1) cause exponential increases in population size. We argue that correlation is an avenue by which evolution can select for cooperation without any auxiliary assumption that cooperation is selfish.

Our framework also contributes to the mathematical biology literature. Previous theories of evolution and cooperation have their origins in evolutionary game theory (13–15) and depend on the concept of evolutionary stability, rather than natural selection. They are also tailored to specific interactions. Economists, similarly, have used evolutionary models to explain how humans interact with one another, but also only in specific economic settings (16–19). Our results provide a key step toward a general theory of evolution when replicating units interact with one another and affect each other’s reproductive success, assuming nothing more than the principle of natural selection.

Theory

Our starting point is similar to that of Brennan and Lo (20) and Zhang, Brennan, and Lo (21): A population of individuals reproduces asexually and only once in their lifetime, giving rise to offspring that inherit the behaviors of their parent. “Behavior” in this simple context is modeled abstractly as a single parameter p for a Bernoulli trial that yields one of two possible outcomes for the individual’s number of offspring, each of which is a random variable. Deterministic behavior is modeled as p=0 or 1, and randomizing behavior is any value of p∈(0,1). The evolutionarily dominant behavior is then determined as the p*∈[0,1] that maximizes population growth.

In contrast to refs. 20 and 21, we model behavior as a general random variable, not just a Bernoulli trial. More importantly, instead of assuming that every individual in the population has the same distribution of offspring, we allow for two types of individuals with different, but correlated, distributions of offspring, which we argue is necessary and sufficient to capture interaction between individuals.

Model

Assumptions.

We assume the existence of two types of individuals, called A and B, that have different distributions of random offspring. In generation t, each individual of type j produces a random number of offspring xj,t, j=A,B. The vector (xA,t,xB,t) has some well-defined, but arbitrary, joint distribution function. The only assumptions we impose on the vectors (xA,t,xB,t) are that, across generations t, they are independently and identically distributed (IID) and that all first and second moments of (logxA,t,logxB,t) exist and are given by:

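$$\mu_A \equiv E[\log x_{A,t}],\quad \mu_B \equiv E[\log x_{B,t}],\quad \sigma_A^2 \equiv \mathrm{Var}(\log x_{A,t}),\quad \sigma_B^2 \equiv \mathrm{Var}(\log x_{B,t}),\quad \rho \equiv \mathrm{Corr}(\log x_{A,t}, \log x_{B,t}). \tag{1}$$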

Because individuals reproduce only once, the total population sizes at generation T of types A and B are given by:

$$P_{A,T} \equiv \prod_{t=1}^{T} x_{A,t},\qquad P_{B,T} \equiv \prod_{t=1}^{T} x_{B,t}. \tag{2}$$

Note that, because the joint distribution of (xA,t,xB,t) is arbitrary, reproduction of type A and reproduction of type B are generally dependent. This dependence is both the simplest and most general way to capture interactions between reproducing individuals. The generality of this model of interaction can be illustrated with the following example. Denote by Ω the set of all possible states of nature for a given generation. In the simplest case, Ω might contain just two outcomes representing the possibility of rain or shine, each having probability, say, 1/2. The randomness of xA,t and xB,t then represents different reproductive possibilities in these two states of nature. Suppose that when it rains, type A individuals become predators of type B individuals, but in the absence of rain, the two types do not interact. Therefore, in the rain state, xA,t is large and xB,t is small (say, three and one), and in the sunshine state, xA,t is small and xB,t is large (say, one and three). Since the probability of rain is assumed to be 1/2, the joint distribution of the vector of random offspring is:

$$(x_{A,t}, x_{B,t}) = \begin{cases} (3,1) & \text{with probability } 1/2, \\ (1,3) & \text{with probability } 1/2. \end{cases}$$

Given that the states of nature can include not just rain or shine, but outcomes representing many types of environmental constraints, as well as outcomes representing physiological traits, foraging behavior, predator/prey relations, and so on, this framework can capture a broad spectrum of behaviors and interactions between individuals. But, as we show in the next section, despite the complexity of behaviors that this specification can generate, the characterization of evolutionarily dominant behaviors depends only on the five parameters (μA,μB,σA2,σB2, and ρ) of the joint distribution of xA,t and xB,t.

Observe that a change in behaviors (or traits) of individuals of type A and B will induce a change in the random vector (xA,t,xB,t). For example, in our rain/shine example, if type B individuals are able to evade type A individuals in the rain state, then xA,t may be smaller and xB,t may be larger in that state (say, two and two), in which case the joint distribution of (xA,t,xB,t) becomes:

$$(x_{A,t}, x_{B,t}) = \begin{cases} (2,2) & \text{with probability } 1/2, \\ (1,3) & \text{with probability } 1/2. \end{cases}$$

Therefore, analyzing a particular behavior/trait amounts to analyzing the properties of a particular random vector (xA,t,xB,t).
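The five parameters in Eq. 1 can be computed directly for the two rain/shine distributions above. The following is a minimal sketch in Python; the function name `log_moments` and the `(probability, x_A, x_B)` tuple layout are illustrative choices made here, not part of the original model.

```python
import math

def log_moments(states):
    """Return (mu_A, mu_B, var_A, var_B, rho) of (log x_A, log x_B).

    `states` is a list of (probability, x_A, x_B) tuples describing the
    joint offspring distribution for one generation (hypothetical layout).
    """
    mu_a = sum(p * math.log(a) for p, a, _ in states)
    mu_b = sum(p * math.log(b) for p, _, b in states)
    var_a = sum(p * (math.log(a) - mu_a) ** 2 for p, a, _ in states)
    var_b = sum(p * (math.log(b) - mu_b) ** 2 for p, _, b in states)
    cov = sum(p * (math.log(a) - mu_a) * (math.log(b) - mu_b)
              for p, a, b in states)
    # The correlation is undefined if either variance is zero.
    rho = cov / math.sqrt(var_a * var_b)
    return mu_a, mu_b, var_a, var_b, rho

# The two joint distributions from the rain/shine example.
predation = [(0.5, 3, 1), (0.5, 1, 3)]
evasion = [(0.5, 2, 2), (0.5, 1, 3)]

for name, states in [("predation", predation), ("evasion", evasion)]:
    print(name, log_moments(states))
```

For the first distribution this gives μA=μB=(1/2)log 3 and ρ=−1. For the second, μA=(1/2)log 2 and μB=(1/2)log 6, while ρ is still −1, because any two-state distribution with nonzero variances is perfectly correlated. A richer state space Ω is needed for intermediate values of ρ.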

Results.

We consider a behavior to be evolutionarily optimal if the total population, PA,T+PB,T, grows exponentially faster under the behavior than under any other behavior. To see why this is the correct notion of evolutionary optimality that follows from natural selection, suppose types A and B follow a behavior V1. Suppose some subset of individuals of these types (by agreement, chance, or some other mechanism) engages in behavior V2, and suppose that the sum of populations under V2 grows exponentially faster than under V1. After a few generations, the individuals following V2 will exponentially outnumber those following V1, and behavior V2 dominates. This is the principle of natural selection.

Note that this notion of evolutionary optimality is not incompatible with competition between types A and B. The evolutionarily optimal behavior may have one type growing faster than another type, in which case one type will dominate the population.

We proceed to characterize the growth rate of PA,T+PB,T in full generality for an arbitrary offspring vector, (xA,t,xB,t). To determine which among feasible behaviors is evolutionarily optimal, one can simply compare the growth rate of PA,T+PB,T under the offspring vectors induced by each of them. The following two theorems show that

$$P_{A,T} + P_{B,T} \approx e^{\alpha T + \beta\sqrt{T}}, \tag{3}$$

for some scalars α and β (proofs for all theorems are provided in SI Appendix).

Theorem 1 (Characterization of α). Let the vector (logxA,t,logxB,t) be IID across t with finite first and second moments denoted by Eq. 1. Let PA,T and PB,T be defined as in Eq. 2. Then, as T increases without bound, we have:

$$\frac{\log(P_{A,T} + P_{B,T})}{T} \xrightarrow{a.s.} \max(\mu_A, \mu_B), \tag{4}$$

where $\xrightarrow{a.s.}$ denotes almost sure convergence.

Theorem 2 (Characterization of β). Let the vector (logxA,t,logxB,t) be IID across t with finite first and second moments denoted by Eq. 1. Let PA,T and PB,T be defined as in Eq. 2. Assume that μA=μB=μ. Then, as T increases without bound, we have:

$$\frac{\log(P_{A,T} + P_{B,T}) - \mu T}{\sqrt{T}} \xrightarrow{d} \max(N_A, N_B), \tag{5}$$

where $\xrightarrow{d}$ denotes convergence in distribution, NA and NB are normally distributed random variables with means zero, variances σA2 and σB2, respectively, and correlation ρ.

Theorem 1 confirms that the key parameters in determining the growth rate of the combined population are μA and μB. That is, the traits or behaviors that increase the number of offspring of type A and B individuals separately also increase the growth rate of the total population. To find which traits or behaviors dominate evolutionarily, we can analyze the effects of the traits and behaviors on the fitness of types A and B separately. This is already well known to evolutionary biologists. Theorem 2, though, suggests something additional about evolution. Not only does the number of offspring of each type matter, but the correlation ρ between the reproductive output of these two types also matters. This correlation is determined by the nature of the interaction between the two types.

The main thrust of Theorems 1 and 2 is that evolutionarily optimal behaviors are those that maximize μ and minimize ρ. SI Appendix contains additional propositions that formalize this result, but, intuitively, it is clear from Theorem 1 that increasing μ will increase α, thereby increasing the growth rate of the total population. From Theorem 2, increasing max(NA,NB) increases β, thus also increasing the growth rate. The parameter ρ comes into play because decreasing the correlation between two normal random variables increases their maximum, so decreasing ρ will also increase the exponential growth rate. Fig. 1 provides computational confirmation of this relation.
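The claim that lower correlation raises the maximum of two normal variables can be checked numerically. The sketch below is an illustration added here, not taken from SI Appendix. It uses the identity max(a,b)=(a+b)/2+|a−b|/2, which for zero-mean jointly normal variables gives E[max(NA,NB)]=sqrt((σA²+σB²−2ρσAσB)/(2π)), and compares that value with a Monte Carlo estimate. The function names are hypothetical.

```python
import math
import random

def expected_max_closed_form(sigma_a, sigma_b, rho):
    """E[max(N_A, N_B)] for zero-mean jointly normal N_A, N_B."""
    # max(a, b) = (a + b)/2 + |a - b|/2, and E|Z| = sd * sqrt(2/pi)
    # for a zero-mean normal Z with standard deviation sd.
    var_diff = sigma_a ** 2 + sigma_b ** 2 - 2.0 * rho * sigma_a * sigma_b
    return math.sqrt(var_diff / (2.0 * math.pi))

def expected_max_monte_carlo(sigma_a, sigma_b, rho, n_draws=200_000, seed=0):
    """Monte Carlo estimate of the same expectation."""
    rng = random.Random(seed)
    mix = math.sqrt(1.0 - rho ** 2)
    total = 0.0
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        n_a = sigma_a * z1
        n_b = sigma_b * (rho * z1 + mix * z2)  # correlation rho with n_a
        total += max(n_a, n_b)
    return total / n_draws

for rho in (-0.9, -0.5, 0.0, 0.5, 0.9):
    print(rho, round(expected_max_closed_form(1.0, 1.0, rho), 4))
```

With σA=σB=σ the expression reduces to σ·sqrt((1−ρ)/π), which is strictly decreasing in ρ. This is consistent with the statement that decreasing ρ increases the growth rate.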

Fig. 1.

Total population growth vs. time. Values of ρ range from −0.9 (lightest) to 0.9 (darkest). The number of offspring for both types follows a lognormal distribution with μ=0.1 and σ=1.0.

An important caveat to Theorems 1 and 2 is that, because of the IID assumption, they only apply to behaviors that are sustainable over time. For example, one cannot use Theorem 1 to conclude that a purely altruistic behavior on the part of type A that decreases μA and increases μB is evolutionarily optimal through its increase of max(μA,μB). This is because such a behavior would lead to type B growing exponentially faster than type A; hence, the altruistic type will be exponentially outnumbered, and the behavior is not sustainable.

In general, a behavior is sustainable if μA=μB so that both types are growing at the same rate, or if μA≠μB, but the difference is due not to the nature of the interaction but to better utilization of environmental resources. Naturally, in this article, we are more interested in the case when μA=μB and there can be sustained meaningful interaction.

Although Theorems 1 and 2 show that the effect of ρ on the asymptotic growth rate is second order compared to μ, simulation results show that ρ is often more important than μ in its effect on growth rates over finite time intervals. For example, Table 1 shows that after 10 generations, decreasing ρ from 0.5 to −0.5 has approximately the same effect as increasing μ by about 0.09. More concretely, going from (μ=0.05,ρ=0.5) to (μ=0.0,ρ=−0.5) increases the growth rate from 13% to 18%, despite μ being lower. (Additional simulation results are reported in SI Appendix.) In the same way that the function y=√x grows faster than y=x for some time before its growth rate becomes sublinear, an offspring vector with better ρ and worse μ can grow exponentially faster than one with worse ρ and better μ for many generations before being overtaken.

Table 1.

Simulated population growth rate

μ/ρ –0.9 –0.8 –0.7 –0.6 –0.5 –0.4 –0.3 –0.2 –0.1 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
0.00 21% 20% 19% 18% 18% 17% 16% 15% 15% 13% 12% 11% 10% 9% 8% 7% 5% 4% 2%
0.01 22% 21% 20% 20% 18% 18% 17% 16% 15% 14% 13% 12% 11% 10% 9% 8% 7% 5% 3%
0.02 23% 22% 21% 21% 20% 19% 18% 17% 16% 16% 15% 14% 12% 12% 10% 8% 8% 6% 5%
0.03 24% 24% 23% 22% 21% 20% 19% 19% 18% 17% 16% 15% 14% 13% 11% 10% 9% 7% 5%
0.04 26% 25% 24% 23% 22% 22% 20% 20% 19% 17% 16% 16% 14% 13% 12% 11% 10% 8% 7%
0.05 26% 26% 25% 24% 23% 23% 22% 21% 19% 19% 18% 17% 16% 15% 13% 12% 11% 9% 7%
0.06 28% 27% 26% 26% 24% 24% 23% 22% 21% 20% 19% 18% 16% 16% 15% 14% 12% 10% 8%
0.07 29% 29% 27% 27% 26% 25% 24% 23% 22% 21% 20% 19% 18% 16% 16% 15% 13% 11% 9%
0.08 31% 30% 29% 28% 27% 26% 25% 25% 24% 23% 21% 20% 19% 18% 17% 15% 14% 13% 11%
0.09 32% 31% 30% 29% 28% 28% 27% 26% 25% 24% 22% 22% 21% 20% 18% 17% 15% 13% 12%
0.10 34% 32% 31% 30% 30% 29% 28% 27% 26% 25% 24% 23% 22% 21% 19% 18% 16% 15% 13%
0.11 34% 33% 33% 32% 31% 30% 29% 29% 27% 26% 25% 24% 23% 22% 19% 19% 18% 16% 14%
0.12 36% 35% 34% 34% 32% 31% 30% 29% 28% 28% 27% 26% 23% 23% 22% 20% 20% 17% 15%
0.13 37% 36% 36% 35% 34% 33% 32% 31% 30% 29% 28% 27% 25% 25% 23% 21% 20% 19% 16%
0.14 39% 38% 37% 36% 35% 34% 33% 32% 31% 30% 29% 28% 27% 25% 25% 23% 21% 19% 17%
0.15 40% 39% 39% 38% 37% 36% 35% 33% 33% 31% 30% 29% 27% 27% 25% 24% 22% 21% 19%
0.16 41% 41% 40% 39% 38% 37% 36% 35% 34% 33% 31% 31% 30% 27% 27% 25% 23% 22% 20%
0.17 43% 42% 41% 40% 39% 39% 37% 37% 36% 34% 33% 32% 31% 29% 28% 26% 25% 23% 21%
0.18 44% 43% 42% 42% 41% 39% 39% 39% 36% 36% 34% 33% 32% 30% 29% 28% 26% 24% 22%
0.19 46% 45% 44% 43% 42% 41% 39% 39% 38% 37% 35% 35% 33% 32% 31% 28% 27% 25% 23%
0.20 47% 46% 45% 45% 44% 42% 42% 40% 39% 38% 37% 36% 35% 33% 32% 30% 29% 26% 26%

The number of offspring for both types follows a lognormal distribution with the stated μ and ρ, and with σ=1.0. Quantities shown represent population growth rate per generation after 10 generations, computed as $e^{E[\log(P_{10}/P_0)]/10}$, where $P_T = P_{A,T} + P_{B,T}$.
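A single entry of Table 1 can be approximated by Monte Carlo simulation. The sketch below is one possible implementation, not the authors' code. It assumes that the initial population P0 is split equally between the two types and that the tabulated percentage is the quantity in the caption minus one. The function name and all default arguments are choices made for this illustration.

```python
import math
import random

def simulated_growth_rate(mu, rho, sigma=1.0, horizon=10, n_sims=10_000, seed=0):
    """Estimate exp(E[log(P_T / P_0)] / T) - 1 for lognormal offspring."""
    rng = random.Random(seed)
    mix = math.sqrt(1.0 - rho ** 2)
    total_log_ratio = 0.0
    for _sim in range(n_sims):
        log_pa = 0.0  # log of type-A population relative to its start
        log_pb = 0.0  # log of type-B population relative to its start
        for _gen in range(horizon):
            z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            log_pa += mu + sigma * z1
            log_pb += mu + sigma * (rho * z1 + mix * z2)
        # log((P_A,T + P_B,T) / P_0), assuming P_0 is split equally
        # between the types; computed in a numerically stable way.
        hi, lo = max(log_pa, log_pb), min(log_pa, log_pb)
        total_log_ratio += hi + math.log1p(math.exp(lo - hi)) - math.log(2.0)
    return math.exp(total_log_ratio / (n_sims * horizon)) - 1.0
```

Under these assumptions, `simulated_growth_rate(0.0, -0.5)` can be compared with `simulated_growth_rate(0.05, 0.5)` to check the ordering discussed in the text. Exact agreement with the table is not guaranteed, since the simulation settings used for the published values are not stated here.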

There is yet another reason for taking an interest in finite-horizon growth rates besides developing an understanding of short-term behaviors. The next theorem shows that if the proportion of types in the population is rebalanced every T0 generations—which could be expected in the presence of intraspecies interaction—the finite-horizon growth rate for T0 generations becomes the asymptotic growth rate.

Theorem 3. Let T0 be a positive integer. Let the vector (logxA,t,logxB,t) be IID across t with finite first and second moments denoted by Eq. 1. Let P0>0, let PA,T and PB,T be defined as

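$$P_{A,nT_0+i} \equiv \frac{P_{nT_0}}{2} \prod_{t=nT_0+1}^{nT_0+i} x_{A,t},\quad i=1,\ldots,T_0,\quad n=0,\ldots,\infty, \tag{6}$$

and

$$P_{B,nT_0+i} \equiv \frac{P_{nT_0}}{2} \prod_{t=nT_0+1}^{nT_0+i} x_{B,t},\quad i=1,\ldots,T_0,\quad n=0,\ldots,\infty, \tag{7}$$

and let PT be defined for multiples of T0 as

$$P_{nT_0} = P_{A,nT_0} + P_{B,nT_0},\quad n=1,\ldots,\infty. \tag{8}$$

Then, as n increases without bound, we have:

$$\frac{\log(P_{nT_0})}{nT_0} \xrightarrow{a.s.} \frac{E[\log(P_{T_0}/P_0)]}{T_0}. \tag{9}$$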

This theorem implies that if the types in Table 1 are rebalanced every 10 generations, then the population under behavior (μ=0.0,ρ=−0.5) grows faster asymptotically than under behavior (μ=0.05,ρ=0.5).
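The intuition behind Theorem 3 can be sketched in a few lines. This is an informal outline added here, not the proof in SI Appendix. It reads Eqs. 6–8 as restarting each block of T0 generations from an equal split of the current population, and the block factor F_n is notation introduced only for this sketch.

```latex
% Growth factor over the n-th block of T_0 generations (F_n is notation
% introduced for this sketch).
\[
F_n \equiv \frac{1}{2}\left(\prod_{t=(n-1)T_0+1}^{nT_0} x_{A,t}
      + \prod_{t=(n-1)T_0+1}^{nT_0} x_{B,t}\right),
\qquad
P_{nT_0} = P_{(n-1)T_0}\,F_n = P_0 \prod_{k=1}^{n} F_k .
\]
% The F_k depend on disjoint blocks of the IID offspring vectors, so they
% are IID; the strong law of large numbers then applies to log F_k.
\[
\frac{\log P_{nT_0}}{nT_0}
  = \frac{\log P_0}{nT_0} + \frac{1}{T_0}\cdot\frac{1}{n}\sum_{k=1}^{n}\log F_k
  \;\xrightarrow{a.s.}\; \frac{E[\log F_1]}{T_0}
  = \frac{E[\log(P_{T_0}/P_0)]}{T_0}.
\]
```

The last equality holds because F_1 = P_{T0}/P0 by construction. Rebalancing therefore turns the finite-horizon quantity tabulated in Table 1 into an asymptotic growth rate.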

Although we did not assume group selection as a primitive, the fact that, even asymptotically, ρ can be more important than μ illustrates when and how group selection can appear to occur. When a particular behavior decreases ρ enough to offset a decrease in μ, it can appear that selection is happening at a level other than that of the individual.

Idiosyncratic Risk.

So far, we have assumed that reproductive risk, as represented by the random variables {xj,t}, is purely systematic, meaning that all type-j individuals have an identical number of random offspring xj,t in a given state of the world. Suppose we assume that reproductive risk is, instead, partly idiosyncratic, so that type-j individuals can have different numbers of offspring in a given state of the world. Namely, we allow the number of offspring of a type-A individual i at time t to be the sum of xA,t and x̄A,i,t, where x̄A,i,t are independent across individuals i in the same generation. In this setting, populations of type A and B at time T can be written

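$$P_{A,T} \equiv \sum_{i=1}^{P_{A,T-1}} \left(x_{A,T} + \bar{x}_{A,i,T}\right), \tag{10}$$
$$P_{B,T} \equiv \sum_{i=1}^{P_{B,T-1}} \left(x_{B,T} + \bar{x}_{B,i,T}\right). \tag{11}$$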

The analogue of Theorem 1 for the case with idiosyncratic reproductive risk is:

Theorem 4. Assume the following. For j=A,B

  1. The random variable xj,t is IID across t;

  2. For some Cl>0 and Cu>Cl, Cl≤xj,t≤Cu;

  3. The random variable x̄j,i,t≥0 is IID across i and t;

  4. The expectations E[x̄j,i,t] and E[log(E[x̄j,i,t]+xj,t)] exist; and

  5. E[logxj,t]>0.

Let PA,T and PB,T be defined as in Eqs. 10 and 11. Then, as T increases without bound, we have:

$$\frac{\log(P_{A,T} + P_{B,T})}{T} \xrightarrow{a.s.} \max\left\{E[\log(E[\bar{x}_{A,i,t}] + x_{A,t})],\; E[\log(E[\bar{x}_{B,i,t}] + x_{B,t})]\right\}. \tag{12}$$

We see that idiosyncratic risk gets averaged out over time. Any idiosyncratic component to the number of offspring of a given type modifies the growth rate identically to adding a constant number of offspring. It is important to keep in mind that, by definition, idiosyncratic risk cannot be correlated, so when we refer to correlation of fecundity throughout the article, we mean correlation due to systematic risk. In SI Appendix, we repeat the simulation in Table 1 for the case with idiosyncratic risk. Adding enough idiosyncratic risk to make the growth rate of (μ=0.0,ρ=0.5) equal to 131% decreases the impact of correlation such that decreasing ρ from 0.5 to −0.5 is equivalent to increasing μ by 0.07 (instead of 0.09, as in Table 1).
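The right-hand side of Eq. 12 is easy to evaluate for a discrete systematic component. The sketch below reuses the rain/shine offspring values from the Assumptions section as the systematic part and treats the idiosyncratic means as free inputs. The function name and the `(probability, x_A, x_B)` layout are hypothetical.

```python
import math

def eq12_growth_rate(states, idio_mean_a, idio_mean_b):
    """Right-hand side of Eq. 12: max over types of E[log(E[xbar_j] + x_j)].

    `states` lists (probability, x_A, x_B) for the systematic component;
    `idio_mean_a` and `idio_mean_b` are the idiosyncratic means E[xbar_j].
    """
    rate_a = sum(p * math.log(idio_mean_a + a) for p, a, _ in states)
    rate_b = sum(p * math.log(idio_mean_b + b) for p, _, b in states)
    return max(rate_a, rate_b)

rain_shine = [(0.5, 3, 1), (0.5, 1, 3)]
print(eq12_growth_rate(rain_shine, 0.0, 0.0))  # no idiosyncratic offspring
print(eq12_growth_rate(rain_shine, 1.0, 1.0))  # mean of one extra offspring
```

Only the mean of the idiosyncratic component enters the limit, so an idiosyncratic mean of one gives the same value as shifting every systematic outcome up by one offspring.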

Random Matching.

We can also extend the model to use random matching, a more specific mechanism of interaction that allows explicitly for different behaviors to exist in the same environment and for intermixing of individuals with different behaviors. We show that, even in this case, correlation still plays a key role in determining the evolutionarily dominant behavior.

Instead of types A and B, suppose that there are types 1 and 2. Each generation, all individuals in the population are randomly paired with each other. If a type 1 is paired with a type 1, the offspring vector of the pair is (xA,1,t,xB,1,t). Otherwise, the offspring vector of the pair is (xA,2,t,xB,2,t). We see that, as suggested by the notation, the random pairings determine which individuals play the role of type A and type B from the original model. The new types 1 and 2 represent different behaviors. When two individuals of type 1 are paired, the behavior (xA,1,t,xB,1,t) occurs, but otherwise, the behavior (xA,2,t,xB,2,t) occurs. We use the variable q to denote the pair that each individual is in. For example, in generation T, if individual i of type 1 is matched with type 1 as the first component of the pair, then q1,i,T=A1. The population sizes at generation T of types 1 and 2 are defined recursively:

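$$P_{1,T} \equiv \sum_{i=1}^{P_{1,T-1}} \Big( \mathbf{1}\{q_{1,i,T-1}=A1\}\,x_{A,1,T} + \mathbf{1}\{q_{1,i,T-1}=B1\}\,x_{B,1,T} + \mathbf{1}\{q_{1,i,T-1}=A2\}\,x_{A,2,T} + \mathbf{1}\{q_{1,i,T-1}=B2\}\,x_{B,2,T} \Big), \tag{13}$$
$$P_{2,T} \equiv \sum_{i=1}^{P_{2,T-1}} \Big( \mathbf{1}\{q_{2,i,T-1}=A2\}\,x_{A,2,T} + \mathbf{1}\{q_{2,i,T-1}=B2\}\,x_{B,2,T} \Big). \tag{14}$$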

The next theorem provides a sufficient condition for type 1 individuals to exponentially outnumber type 2 individuals in the limit. We assume a law of large numbers for random matching. As noted by Duffie and Sun (22), such laws of large numbers are nontrivial to prove, but have been assumed without proof by evolutionary biologists dating back to Hardy (23) and Weinberg (24).

Theorem 5. Assume the following:

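  1. For j=A,B and k=1,2, the random variable xj,k,t is IID across t.

  2. For j=A,B and k=1,2, for some Cl>0 and Cu>Cl, Cl≤xj,k,t≤Cu.

  3. For k=1,2, the expectations E[log((1/2)xA,k,t+(1/2)xB,k,t)] exist.

  4. (Law of large numbers for random matching) As $T \to \infty$, $P_{1,T} \xrightarrow{a.s.} \infty$, $P_{2,T} \xrightarrow{a.s.} \infty$, and

$$\frac{\sum_{i=1}^{P_{1,T}} \mathbf{1}\{q_{1,i,T}=A2\}}{\sum_{i=1}^{P_{1,T}} \left(\mathbf{1}\{q_{1,i,T}=A2\} + \mathbf{1}\{q_{1,i,T}=B2\}\right)} \xrightarrow{a.s.} \frac{1}{2}.$$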

In words, the proportion of type 1 individuals matched with type 2 in the left side of the pairing converges almost surely to 1/2.

Let P1,T and P2,T be defined as in Eqs. 13 and 14. Then, if

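$$E\left[\log\left(\tfrac{1}{2}x_{A,1,t} + \tfrac{1}{2}x_{B,1,t}\right)\right] > E\left[\log\left(\tfrac{1}{2}x_{A,2,t} + \tfrac{1}{2}x_{B,2,t}\right)\right], \tag{15}$$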

as T increases without bound, we have:

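$$\frac{\log(P_{1,T}/P_{2,T})}{T} \xrightarrow{a.s.} E\left[\log\left(\tfrac{1}{2}x_{A,1,t} + \tfrac{1}{2}x_{B,1,t}\right)\right] - E\left[\log\left(\tfrac{1}{2}x_{A,2,t} + \tfrac{1}{2}x_{B,2,t}\right)\right]. \tag{16}$$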

The condition Eq. 15 has a close connection to correlation. Decreasing the correlation between xA,k,t and xB,k,t lowers the variance of their average and therefore increases the second-order Taylor approximation of E[log((1/2)xA,k,t+(1/2)xB,k,t)]. Thus, in general, unless the distributions are highly skewed, if xA,1,t and xB,1,t have the same means, but lower correlation, than xA,2,t and xB,2,t, then 1-types will grow exponentially faster than 2-types. In other words, even with random matching, behaviors that result in lower correlation will dominate the population. It is also significant that in the random matching model, correlation is a first-order factor, impacting the coefficient of T in the growth rate and not only that of √T.
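The link between correlation and the quantity in Eq. 15 can be illustrated with two small discrete distributions that share the same means. The sketch below computes E[log Y] for the pair average Y=(xA+xB)/2 exactly and through the standard second-order expansion log E[Y] − Var(Y)/(2E[Y]²). The example distributions and the function name are assumptions made for illustration.

```python
import math

def exact_and_taylor(states):
    """Return (exact, second-order Taylor) values of E[log((x_A + x_B)/2)].

    `states` is a list of (probability, x_A, x_B) tuples (hypothetical layout).
    """
    pair_means = [(p, 0.5 * (a + b)) for p, a, b in states]
    exact = sum(p * math.log(y) for p, y in pair_means)
    m = sum(p * y for p, y in pair_means)               # E[Y]
    var = sum(p * (y - m) ** 2 for p, y in pair_means)  # Var(Y)
    taylor = math.log(m) - var / (2.0 * m * m)
    return exact, taylor

# Both types have mean offspring 2 in each example; only the correlation differs.
same_state = [(0.5, 3, 3), (0.5, 1, 1)]      # correlation +1
opposite_state = [(0.5, 3, 1), (0.5, 1, 3)]  # correlation -1

print(exact_and_taylor(same_state))
print(exact_and_taylor(opposite_state))
```

In the perfectly negatively correlated case the pair average is constant at 2, so both values equal log 2. In the perfectly positively correlated case the exact value is (1/2)log 3 and the approximation is log 2 − 1/8, both lower, which matches the direction stated in the text.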

Density Dependence.

In this section, we study the effect of correlation on population growth when population growth is density-dependent. We do this by extending the standard biological model of density-dependent population growth—in which population at any point in time is the solution to a system of ordinary differential equations (ODEs)—to the stochastic case so that the population size is a solution to a system of stochastic differential equations (SDEs).

To see more clearly the analogy we make between ODEs and SDEs, we consider first a model of population growth with no density dependence. Given types A and B, the standard model is given by:

$$dP_{A,t} = r_A P_{A,t}\,dt, \tag{17}$$
$$dP_{B,t} = r_B P_{B,t}\,dt. \tag{18}$$

We add randomness by introducing (possibly correlated) Wiener processes, WA,t and WB,t, and converting the standard model to the following SDE:

$$dP_{A,t} = r_A P_{A,t}\,dt + s_A P_{A,t}\,dW_{A,t}, \tag{19}$$
$$dP_{B,t} = r_B P_{B,t}\,dt + s_B P_{B,t}\,dW_{B,t}. \tag{20}$$

Whereas an equation of the type Eq. 17 implies that for small Δ,

$$P_{A,t+\Delta} - P_{A,t} = r_A P_{A,t}\,\Delta,$$

an equation of the type Eq. 19 implies that

$$P_{A,t+\Delta} - P_{A,t} = r_A P_{A,t}\,\Delta + s_A P_{A,t}\,(W_{A,t+\Delta} - W_{A,t}),$$

where $W_{A,t+\Delta} - W_{A,t}$ is normally distributed with mean zero and variance Δ. Eqs. 19 and 20 have closed-form solutions, from which we can derive continuous-time versions of Theorems 1 and 2, namely:

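$$\frac{\log(P_{A,T} + P_{B,T})}{T} \xrightarrow{a.s.} \max\left(r_A - \tfrac{1}{2}s_A^2,\; r_B - \tfrac{1}{2}s_B^2\right), \tag{21}$$

and (when $r \equiv r_A = r_B$ and $s \equiv s_A = s_B$)

$$\frac{\log(P_{A,T} + P_{B,T}) - \left(r - \tfrac{1}{2}s^2\right)T}{\sqrt{T}} \xrightarrow{d} \max(N_A, N_B), \tag{22}$$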

where NA and NB are normally distributed random variables with means zero, variances s2, and the same correlation as the Wiener processes, WA,t and WB,t.
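For reference, the closed-form solutions mentioned above are the standard geometric Brownian motion solutions. The short derivation below is a sketch added here to show how Eqs. 21 and 22 arise from them. It is not the proof in SI Appendix.

```latex
% Closed-form solution of Eq. 19 (the solution of Eq. 20 is analogous),
% taking W_{A,0} = 0.
\[
P_{A,T} = P_{A,0}\exp\!\left(\left(r_A - \tfrac{1}{2}s_A^2\right)T + s_A W_{A,T}\right).
\]
% Since W_{A,T}/T -> 0 almost surely, the exponent per unit time converges:
\[
\frac{\log P_{A,T}}{T} = \frac{\log P_{A,0}}{T} + r_A - \tfrac{1}{2}s_A^2 + s_A\frac{W_{A,T}}{T}
\;\xrightarrow{a.s.}\; r_A - \tfrac{1}{2}s_A^2 .
\]
% With common r and s, the centered and scaled log-population is exactly normal
% up to a vanishing term, because W_{A,T}/sqrt(T) is standard normal:
\[
\frac{\log P_{A,T} - \left(r - \tfrac{1}{2}s^2\right)T}{\sqrt{T}}
  = \frac{\log P_{A,0}}{\sqrt{T}} + s\,\frac{W_{A,T}}{\sqrt{T}},
\qquad s\,\frac{W_{A,T}}{\sqrt{T}} \sim N(0, s^2).
\]
% The sum of the two populations is bracketed by the larger one:
\[
\max(\log P_{A,T}, \log P_{B,T}) \le \log(P_{A,T}+P_{B,T}) \le \max(\log P_{A,T}, \log P_{B,T}) + \log 2 .
\]
```

The final inequality shows that the sum of the two populations differs from the larger one by at most a factor of two, which vanishes after dividing by T or √T.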

Adding density dependence, the standard population-growth model becomes:

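$$dP_{A,t} = r_A P_{A,t}\left(1 - \frac{P_{A,t}}{K_A} - \frac{P_{B,t}}{L_A}\right)dt, \tag{23}$$
$$dP_{B,t} = r_B P_{B,t}\left(1 - \frac{P_{B,t}}{K_B} - \frac{P_{A,t}}{L_B}\right)dt, \tag{24}$$

and the stochastic counterpart is simply:

$$dP_{A,t} = r_A P_{A,t}\left(1 - \frac{P_{A,t}}{K_A} - \frac{P_{B,t}}{L_A}\right)dt + s_A P_{A,t}\,dW_{A,t}, \tag{25}$$
$$dP_{B,t} = r_B P_{B,t}\left(1 - \frac{P_{B,t}}{K_B} - \frac{P_{A,t}}{L_B}\right)dt + s_B P_{B,t}\,dW_{B,t}. \tag{26}$$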

The key issue in this setting is how the impact of ρ changes with K. (We assume L is set equal to K.) Tables 2–4 show the population growth rate after 10 generations for different values of K and ρ when r=0.5, r=0.6, and r=0.7, respectively. As K increases without bound, Tables 2–4 approach continuous-time versions of Table 1 with μ=0.0, μ=0.1, and μ=0.2, respectively.
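One way to simulate a path of the density-dependent stochastic model is an Euler–Maruyama scheme applied to log P, which keeps both populations positive. The sketch below is an illustration under stated assumptions, not the method used to produce the tables: it takes common r, s, and K for both types with L=K, an initial population of 1 for each type, a step size of 0.01, and one generation equal to one unit of time. The function name is hypothetical.

```python
import math
import random

def simulate_density_dependent(r, K, rho, s=1.0, horizon=10.0, dt=0.01,
                               p0=1.0, seed=0):
    """Simulate one path of the density-dependent SDE pair with L = K.

    Returns (P_A, P_B) at time `horizon`. Euler-Maruyama is applied to
    log P, so both populations stay strictly positive.
    """
    rng = random.Random(seed)
    mix = math.sqrt(1.0 - rho ** 2)
    sqrt_dt = math.sqrt(dt)
    log_pa = log_pb = math.log(p0)  # assumed initial size for each type
    for _step in range(round(horizon / dt)):
        pa, pb = math.exp(log_pa), math.exp(log_pb)
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dw_a = sqrt_dt * z1
        dw_b = sqrt_dt * (rho * z1 + mix * z2)  # increments with correlation rho
        # With L = K and common K, both types face the same crowding term.
        crowding = 1.0 - pa / K - pb / K
        # Ito's lemma: d(log P) = (r * crowding - s^2 / 2) dt + s dW
        log_pa += (r * crowding - 0.5 * s * s) * dt + s * dw_a
        log_pb += (r * crowding - 0.5 * s * s) * dt + s * dw_b
    return math.exp(log_pa), math.exp(log_pb)

# Deterministic check (s = 0): both types settle at K / 2.
print(simulate_density_dependent(r=0.6, K=10.0, rho=0.0, s=0.0, horizon=50.0))
```

Averaging log((P_A+P_B)/(2·p0)) over many seeds, dividing by the horizon, and exponentiating would give a growth-rate estimate comparable in spirit to the tabulated values, though the initial conditions and discretization here are assumptions and need not match those behind the tables.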

Table 2.

Simulated density-dependent population growth rate (r=0.5)

K/ρ –0.9 –0.7 –0.5 –0.3 –0.1 0.1 0.3 0.5 0.7 0.9
5.0 –5.4% –6.3% –7.3% –8.1% –9.1% –9.8% –11.3% –12.6% –13.8% –15.5%
10.0 –0.0% –1.1% –2.2% –3.0% –4.4% –5.6% –6.6% –8.3% –9.4% –11.3%
20.0 5.0% 3.6% 2.7% 1.4% –0.1% –1.2% –2.6% –3.7% –6.1% –7.9%
40.0 9.2% 8.1% 6.9% 5.4% 3.9% 2.8% 0.9% –0.8% –3.2% –5.6%
80.0 12.4% 11.1% 10.0% 8.5% 7.1% 5.2% 4.1% 1.4% –0.6% –3.3%

The number of offspring for both types follows Eqs. 25 and 26 with the stated K and ρ, and with r=0.5, s=1.0, and L=K. Quantities shown represent population growth rate per generation after 10 generations, computed as $e^{E[\log(P_{10}/P_0)]/10}$, where $P_T = P_{A,T} + P_{B,T}$.

Table 3.

Simulated density-dependent population growth rate (r=0.6)

K/ρ –0.9 –0.7 –0.5 –0.3 –0.1 0.1 0.3 0.5 0.7 0.9
5.0 –3.1% –3.9% –4.5% –5.4% –6.0% –7.0% –8.0% –8.9% –10.4% –11.7%
10.0 3.1% 2.3% 1.4% 0.4% –0.5% –1.7% –2.7% –4.2% –5.4% –7.5%
20.0 9.2% 8.2% 7.0% 5.9% 4.9% 3.6% 2.4% 0.8% –1.0% –2.7%
40.0 15.0% 13.7% 12.4% 11.2% 9.7% 8.1% 6.8% 5.1% 3.1% 0.7%
80.0 19.8% 18.3% 16.6% 15.5% 13.8% 12.4% 10.8% 9.1% 6.9% 4.1%

The number of offspring for both types follows Eqs. 25 and 26 with the stated K and ρ, and with r=0.6, s=1.0, and L=K. Quantities shown represent population growth rate per generation after 10 generations, computed as $e^{E[\log(P_{10}/P_0)]/10}$, where $P_T = P_{A,T} + P_{B,T}$.

Table 4.

Simulated density-dependent population growth rate (r=0.7)

K/ρ –0.9 –0.7 –0.5 –0.3 –0.1 0.1 0.3 0.5 0.7 0.9
5.0 –1.3% –1.9% –2.4% –2.9% –3.7% –4.6% –5.3% –6.0% –7.5% –9.0%
10.0 5.5% 4.8% 3.9% 3.1% 2.4% 1.8% 0.7% –0.5% –2.1% –3.4%
20.0 12.2% 11.4% 10.4% 9.5% 8.5% 7.5% 6.6% 5.0% 3.5% 1.6%
40.0 19.2% 17.9% 17.1% 16.0% 14.5% 13.3% 11.5% 10.8% 8.6% 5.8%
80.0 25.2% 24.0% 22.7% 21.5% 20.1% 18.4% 16.8% 15.3% 13.1% 10.4%

The number of offspring for both types follows Eqs. 25 and 26 with the stated K and ρ, and with r=0.7, s=1.0, and L=K. Quantities shown represent population growth rate per generation after 10 generations, computed as $e^{E[\log(P_{10}/P_0)]/10}$, where $P_T = P_{A,T} + P_{B,T}$.

It is apparent that correlation has a significant impact on the growth rate, even with density dependence extreme enough to cause the population to decline over time. In an absolute sense, the impact of correlation appears to decrease with density dependence. But in a relative sense, it appears to increase. For example, to achieve the same impact on the growth rate as increasing r from 0.5 to 0.6, ρ must be decreased from 0.5 to −0.3 when K=80, but only from 0.5 to −0.1 when K=5. Figs. 2 and 3 show that when density dependence increases, correlation matters more for earlier time horizons and less for later time horizons.

Fig. 2.

Total density-dependent population growth vs. time (K=10). Values of ρ range from −0.9 (lightest) to 0.9 (darkest). The number of offspring for both types follows Eqs. 25 and 26 with r=0.6, K=10, L=10, and s=1.0.

Fig. 3.

Total density-dependent population growth vs. time (K=40). Values of ρ range from −0.9 (lightest) to 0.9 (darkest). The number of offspring for both types follows Eqs. 25 and 26 with r=0.6, K=40, L=40, and s=1.0.

Behavioral Implications

Having shown that evolutionarily optimal behaviors minimize the correlation of fecundity (ρ) along with maximizing fecundity (μ), we now give two examples of behaviors that decrease ρ. The first behavior, which we call “specialization,” causes good outcomes for the two types to occur in different states. The second behavior, “sacrifice,” decreases fecundity for one type in a given state, while increasing fecundity for the other type in the same state. In SI Appendix, we provide a third and more abstract example, which we call “coordination,” where an uncertain action of one type becomes conditionally certain given the action of the other type. Specialization, sacrifice, and coordination can all be considered forms of cooperation.

Specialization.

Let there be two states of the world, ωr and ωs, representing rain and shine. Consider the two behaviors in Fig. 4. Behavior 1 implies no specialization because the good outcomes for types A and B both occur in the rain state. Behavior 2 implies specialization because the good outcomes occur in different states. Since max(μA,μB) is the same under both behaviors, according to Theorems 1 and 2, ρ determines which action is evolutionarily optimal. It is clear that ρ is smaller under Behavior 2; hence, Behavior 2 is evolutionarily optimal. It is worth noting that although Theorems 1 and 2 suggest that a behavior can only be evolutionarily optimal on the basis of ρ if it is not dominated on the basis of μ, the finite-horizon results and Theorem 3 suggest that a behavior can be evolutionarily optimal on the basis of ρ, even if it is dominated on the basis of μ. In other words, there exists some δ sufficiently small such that it could be subtracted from v, w, z, and y in Behavior 2 and still leave Behavior 2 to be evolutionarily optimal.

Fig. 4.

Specialization.

Sacrifice.

We continue with our two states, ωr and ωs, representing rain and shine, each with probability 1/2 of occurring. Consider the two behaviors in Fig. 5. Under Behavior 2, type B individuals sacrifice for type A in state ωr, and type A individuals sacrifice for type B individuals in state ωs. On the other hand, under Behavior 1, both types have the same number of offspring in both states.

Fig. 5.

Sacrifice

We claim that sacrifice is the evolutionarily optimal behavior according to Theorems 1 and 2 when $\gamma \ge \frac{v^2}{v-\epsilon} - v$. When $\gamma > \frac{v^2}{v-\epsilon} - v$, max(μA,μB) under Behavior 2 is larger than under Behavior 1, which means that Behavior 2 is evolutionarily optimal. However, we are particularly interested in the case $\gamma = \frac{v^2}{v-\epsilon} - v$. In this case, neither type benefits from sacrifice, which is to say that μA is the same under Behavior 1 and Behavior 2, and μB is the same under Behavior 1 and Behavior 2. Yet, simply by inducing negative correlation, Behavior 2 becomes evolutionarily optimal. Moreover, as in the case of specialization, if we expand our definition of evolutionary optimality by considering finite-horizon results and Theorem 3, Behavior 2 can be evolutionarily optimal if $\gamma < \frac{v^2}{v-\epsilon} - v$, in which case μA and μB would be smaller under Behavior 2 than Behavior 1, and both types would individually be harmed by sacrifice.
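The threshold on γ can be recovered from a short calculation. Fig. 5 is not reproduced in this text, so the payoffs below are an assumption inferred from the threshold itself: under Behavior 1 both types have v offspring in both states, and under Behavior 2 the favored type has v+γ offspring while the sacrificing type has v−ϵ.

```latex
% Assumed payoffs (inferred, since Fig. 5 is not shown here):
%   Behavior 1: (x_A, x_B) = (v, v) in both states.
%   Behavior 2: (v + \gamma, v - \epsilon) in \omega_r and
%               (v - \epsilon, v + \gamma) in \omega_s, each with probability 1/2.
\[
\mu_A^{(1)} = \log v,
\qquad
\mu_A^{(2)} = \tfrac{1}{2}\log(v+\gamma) + \tfrac{1}{2}\log(v-\epsilon)
            = \tfrac{1}{2}\log\big[(v+\gamma)(v-\epsilon)\big].
\]
\[
\mu_A^{(2)} \ge \mu_A^{(1)}
\iff (v+\gamma)(v-\epsilon) \ge v^2
\iff \gamma \ge \frac{v^2}{v-\epsilon} - v = \frac{v\,\epsilon}{v-\epsilon}.
\]
```

By symmetry the same condition applies to μB. At equality, sacrifice leaves both μA and μB unchanged, and under the assumed payoffs the log offspring counts of the two types move in exactly opposite directions across states, so ρ=−1.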

Discussion

We propose to resolve the apparent conflict between natural selection and cooperation by showing that natural selection is not necessarily selfish, giving rise to not only behaviors that increase fecundity, but also behaviors that decrease the correlation of fecundity between individuals. We use a simple model of evolutionary dynamics without any strategic interactions, assuming nothing about the knowledge, intelligence, or even sentience of individuals in the population. Our principal result (Theorem 2) shows that in a population with two types of individuals, a decrease in correlation of fecundity between the two types increases the exponential growth rate of the population. This result shows that correlation of fecundity has a second-order effect on the growth rate, as compared to the separate fecundity of each type. However, we also show that correlation has a first-order effect in certain situations, including with periodic rebalancing of type proportions (Theorem 3) or in a random matching setting (Theorem 5). Simulation results also suggest that correlation matters relatively more over finite time horizons and if population growth is density-dependent.

We conclude by discussing some limitations of our work. First, our most general model assumes that offspring vectors are IID over time. As we have mentioned, this means that our model is silent regarding behaviors that cannot be sustained, such as purely altruistic behaviors of one type that result in the other type growing exponentially faster. Second, although our results show that correlation of fecundity can play a significant role in determining growth rates, we have not attempted to quantify how much correlation (or systematic risk in general) exists in real-world settings, and whether this is enough to drive natural selection. Finally, while the framework we have developed implies that the evolution of cooperation can be explained without a separate theory of group selection, the fact that correlation can confer survival benefits shows why it is so tempting to identify groups within a given population. Accordingly, there may still be conceptual benefits from viewing selection as operating at multiple levels, even if not strictly necessary from a purely theoretical perspective. We hope to explore these issues in future research.

Supplementary Material

Supplementary File

Acknowledgments

We thank participants at the Evolutionary Models of Financial Markets seminar for helpful discussion and comments; Jayna Cummings for editorial assistance; and especially two anonymous referees for feedback that significantly improved the article. Research support from the MIT Laboratory for Financial Engineering is gratefully acknowledged.

Footnotes

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2015572118/-/DCSupplemental.

Data Availability

All study data are included in the article and/or supporting information.

References

  • 1. Hamilton W. D., The evolution of altruistic behavior. Am. Nat. 97, 354–356 (1963).
  • 2. Hamilton W. D., The genetical evolution of social behavior. I and II. J. Theor. Biol. 7, 1–52 (1964).
  • 3. Price G. R., Selection and covariance. Nature 227, 520–521 (1970).
  • 4. Wynne-Edwards V. C., Intergroup selection in the evolution of social systems. Nature 200, 623–626 (1963).
  • 5. Hauert C., Doebeli M., Spatial structure often inhibits the evolution of cooperation in the snowdrift game. Nature 428, 643–646 (2004).
  • 6. Nowak M., Sasaki A., Taylor C., Fudenberg D., Emergence of cooperation and evolutionary stability in finite populations. Nature 428, 646–650 (2004).
  • 7. Nowak M. A., Five rules for the evolution of cooperation. Science 314, 1560–1563 (2006).
  • 8. Wilson D. S., Wilson E. O., Evolution “for the good of the group”. Am. Sci. 96, 380–389 (2008).
  • 9. Nowak M. A., Tarnita C. E., Wilson E. O., The evolution of eusociality. Nature 466, 1057–1062 (2010).
  • 10. Wilson E. O., The Social Conquest of the Earth (Liveright, New York, NY, 2013).
  • 11. Wilson D. S., Sober E., Reintroducing group selection to the human behavioral sciences. Behav. Brain Sci. 17, 585–607 (1994).
  • 12. Trivers R. L., The evolution of reciprocal altruism. Q. Rev. Biol. 46, 35–57 (1971).
  • 13. Maynard Smith J., Evolution and the Theory of Games (Cambridge University Press, Cambridge, UK, 1982).
  • 14. Maynard Smith J., Game theory and the evolution of behaviour. Behav. Brain Sci. 7, 95–125 (1984).
  • 15. Nowak M., May R. M., Evolutionary games and spatial chaos. Nature 359, 826–829 (1992).
  • 16. Samuelson L., Introduction to the evolution of preferences. J. Econ. Theor. 97, 225–230 (2001).
  • 17. Robson A. J., The biological basis of economic behavior. J. Econ. Lit. 39, 11–33 (2001a).
  • 18. Blume L., Easley D., Evolution and market behavior. J. Econ. Theor. 58, 9–40 (1992).
  • 19. Kogan L., Ross S. A., Wang J., Westerfield M. M., The price impact and survival of irrational traders. J. Finance 61, 195–229 (2006).
  • 20. Brennan T. J., Lo A. W., The origin of behavior. Quart. J. Finance 01, 55–108 (2011).
  • 21. Zhang R., Brennan T. J., Lo A. W., Group selection as behavioral adaptation to systematic risk. PloS One 9, e110848 (2014).
  • 22. Duffie D., Sun Y., Existence of independent random matching. Ann. Appl. Probab. 17, 386–419 (2007).
  • 23. Hardy G. H., Mendelian proportions in a mixed population. Science 28, 49–50 (1908).
  • 24. Weinberg W., Über den Nachweis der Vererbung beim Menschen. Jahreshefte Vereins Vaterländische Naturkunde Württemberg 64, 368–382 (1908).

