Proceedings of the National Academy of Sciences of the United States of America
2024 Jul 17;121(30):e2406993121. doi: 10.1073/pnas.2406993121

The evolution of social behaviors and risk preferences in settings with uncertainty

Guocheng Wang a,b, Qi Su c,d,e,1, Long Wang a,f,1, Joshua B Plotkin b,g,1
PMCID: PMC11287271  PMID: 39018189

Significance

Uncertainty permeates human social lives. Whether arising from imprecise information, ambiguous communication, or intrinsic randomness, we must make decisions about social behavior in the face of uncertain outcomes. Uncertainty and risk preferences are already understood for perfectly rational agents. But human behavior is not rational, and it typically spreads by social learning. A corresponding theory of behavioral evolution in the face of uncertainty has been lacking. Here, we extend the theory of evolutionary games to accommodate uncertainty in social interactions and variation in individual attitudes toward risk. We find that uncertainty and risk preferences can fundamentally alter the predicted behavior in a population, promoting cooperation, for example, when risk-free interactions would otherwise lead to defection.

Keywords: evolutionary game theory, risk preference, uncertainty, cooperation

Abstract

Humans update their social behavior in response to past experiences and changing environments. Behavioral decisions are further complicated by uncertainty in the outcome of social interactions. Faced with uncertainty, some individuals exhibit risk aversion while others seek risk. Attitudes toward risk may depend on socioeconomic status; and individuals may update their risk preferences over time, which will feed back on their social behavior. Here, we study how uncertainty and risk preferences shape the evolution of social behaviors. We extend the game-theoretic framework for behavioral evolution to incorporate uncertainty about payoffs and variation in how individuals respond to this uncertainty. We find that different attitudes toward risk can substantially alter behavior and long-term outcomes, as individuals seek to optimize their rewards from social interactions. In a standard setting without risk, for example, defection always overtakes a well-mixed population engaged in the classic Prisoner’s Dilemma, whereas risk aversion can reverse the direction of evolution, promoting cooperation over defection. When individuals update their risk preferences along with their strategic behaviors, a population can oscillate between periods dominated by risk-averse cooperators and periods of risk-seeking defectors. Our analysis provides a systematic account of how risk preferences modulate, and even coevolve with, behavior in an uncertain social world.


Humans and other animals modulate their social behavior based on past experiences and in response to changing environments. This process of behavioral modulation is often called social learning because it involves individuals observing and imitating the behavior of others who have achieved success in their social interactions (1). Social learning can be modeled using evolutionary game theory (2–5), which extends the classical economic theory of strategic interaction to incorporate the dynamics of learning and imitation among individuals interacting in a population. Evolutionary game theory was first developed to describe behaviors in simplified settings (6, 7), but it has been extended to incorporate a vast range of realistic complications, such as population structure (8–10), iterated interactions (11–14), changing environments (15, 16), and the effects of social reputations and social norms (17–19).

Real-world social interactions are also complicated by intrinsic uncertainty, which has received less attention in the theory of behavioral evolution. Environmental disturbances (20, 21), communication delays, and cultural disparities all introduce a layer of ambiguity and uncertainty associated with the outcome of any social interaction. As a result, individuals often face noisy information about how their behavior will affect outcomes, both for themselves and for their interaction partners. In general, humans are known to modulate their strategic behavior in response to uncertainty (22–26), but how this type of uncertainty, and variation in attitudes toward risk, affect the process of social learning remains largely unexplored in evolutionary game theory.

In the economics literature, von Neumann and Morgenstern developed utility theory to describe how humans make rational decisions when faced with uncertainty (27). The foundational notion of “utility” represents the subjective value an individual assigns to a material outcome or payoff (monetary reward, goods, or even social status) of a strategic interaction. When presented with multiple options that involve uncertainty, a rational individual is assumed to choose the strategy that provides the highest expected utility. In other words, according to classical utility theory, individuals consider the potential gains and losses, as well as the likelihood of each outcome, to arrive at the strategic choice that maximizes their overall expected satisfaction.

Higher payoffs always bring greater utility, but the relationship between payoff and utility need not be linear (27). The concavity or convexity of an individual’s utility function quantifies their attitude toward uncertainty and risk (23, 24, 28, 29). Consider, for example, an individual with the concave utility function u(π)=√π (30), where π represents the payoff or material outcome from a strategic interaction, along with a choice between two scenarios: one that guarantees a payoff of π=50 and the other that provides either payoff π=0 or payoff π=100 with equal likelihood. Both scenarios have the same expected payoff (E(π)=50), but an individual who follows the principles of utility theory will prefer the first scenario because it produces higher expected utility than the second scenario (√50>√100/2). Such an individual is called risk-averse because they prefer the choice with less uncertainty. By contrast, a risk-seeking individual with a convex utility function, such as u(π)=π², will prefer the second (i.e., the risky scenario) over the first.
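Written out, the two comparisons are:

\sqrt{50} \approx 7.07 \;>\; \tfrac{1}{2}\sqrt{0} + \tfrac{1}{2}\sqrt{100} = 5, \qquad \text{whereas} \qquad 50^2 = 2500 \;<\; \tfrac{1}{2}\cdot 0^2 + \tfrac{1}{2}\cdot 100^2 = 5000,

so the concave utility prefers the certain payoff while the convex utility prefers the gamble, even though the two scenarios have the same expected payoff.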

Here, we develop a game-theoretic model of behavior that incorporates uncertainty in the outcome of social interactions. Unlike standard economic theory that assumes individuals are rational [or boundedly rational (31)] and immediately choose the strategy that maximizes their expected utility, we follow the literature of evolutionary game theory that posits a dynamical process whereby individuals in a population imitate others in an attempt to increase their expected utility. We develop a general theory of behavioral evolution in the context of uncertain outcomes, and we study its consequences for a range of well-known social interactions. We develop and apply this theory when everyone in the population has a fixed risk preference, and also when individuals adapt their risk preferences over time while they simultaneously change their social behavior. We find that uncertainty and associated risk preferences can fundamentally alter long-term behavioral outcomes in populations, producing dynamics that cannot occur in the absence of uncertainty.

Model

We consider a well-mixed population of N individuals engaged in pairwise social interactions (Fig. 1A). Each player chooses one of two possible strategies, which we denote generically as cooperate (C) or defect (D), to interact with every other player in a one-shot game. The payoff structure for a social interaction is given by the following matrix

\begin{array}{c|cc} & C & D \\ \hline C & a & b \\ D & c & d \end{array}. [1]

Fig. 1.

Strategic evolution with payoff uncertainty and risk preference. (A) Each player i in a large population chooses to cooperate or defect in pairwise interactions with all others and receives an average material payoff denoted πi. Player i derives an associated utility that depends upon his risk preferences, encoded in his utility function Ui(π). (B) After all pairwise interactions, players make noisy observations of each other’s payoff. In the example shown here, player 1 observes player 2’s payoff as a random variable π21 whose mean equals player 2’s actual payoff, π2, and whose variance depends on their strategies (ξ denotes a standard Gaussian random variable). (C) A player (e.g., player 1, illustrated here) updates their strategy by imitating another player (e.g., player 2 or player 3) with probability proportional to the expected utility they would achieve from adopting their strategy. (D) We consider a family of utility functions (blue lines) spanning three basic categories: risk-neutral, risk-averse, and risk-seeking. For a risk-neutral utility function, symmetric observation noise around the true payoff produces expected utility equal to the utility of the true payoff. (E) A risk-averse utility function, which is concave, produces lower expected utility when payoffs are uncertain. (F) A risk-seeking utility function, which is convex, produces greater expected utility when payoffs are uncertain. Examples shown in panels (D–F) correspond to player 1 observing the payoff of player 2, and the gray Gaussian represents the probability density of noisy payoff observation π21=π2+σ21ξ.

Mutual cooperation yields both players payoff a, while mutual defection yields payoff d. If one player cooperates while the other defects, the cooperator receives payoff b and the defector payoff c. For example, the classic Prisoner’s Dilemma arises when c>a>d>b, whereas all other standard two-player games arise for different rank orderings of payoffs in the matrix. (For other games, we retain the generic terminology C and D, although the interpretation of “cooperate” or “defect” may differ in those settings.)

Following a round of pairwise social interactions in the population, each player i obtains an average payoff, denoted πi. We let πC denote the average payoff to a cooperator in the population and πD to a defector. When n players choose cooperation and the remaining players defection, we have πC=(na+(N−n)b)/N and πD=(nc+(N−n)d)/N. According to standard evolutionary game theory, individuals observe each other’s payoffs and tend to imitate those with higher payoffs (2, 7, 32, 33) (Fig. 1B). However, unlike the standard theory, here we assume that there is noise, or uncertainty, when observing another player’s payoff. To model this we assume that player i observes a noisy perturbation of player j’s actual payoff, given by

\pi_{ji} = \pi_j + \sigma_{ji}\,\xi, [2]

where ξ is a Gaussian random variable with mean 0 and variance 1, and πj equals either πC or πD depending on whether player j is a cooperator or defector. The model parameter σji quantifies the amount of uncertainty when player i observes player j’s payoff. The observed payoff πji is symmetric around the actual payoff πj that j received, and the magnitude of noise σji can vary depending on the identity of the observer i and the observed player j. We will focus on strategy-dependent uncertainty, so that the amount of noise σji depends on the strategies of the observer i (denoted by si) and the observed player j (denoted by sj). We denote strategy-dependent uncertainty as σji=σsjsi ∈ {σCC,σCD,σDC,σDD}.
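As a concrete illustration of the observation model, the short sketch below (our own illustrative code; the payoff and noise values are hypothetical) computes the average payoffs πC and πD and draws one noisy observation according to Eq. 2:

    import numpy as np

    # Illustrative sketch (not the authors' code): average payoffs and one noisy
    # payoff observation (Eq. 2), with hypothetical payoff and noise values.
    rng = np.random.default_rng(seed=1)

    a, b, c, d = 3.0, 0.0, 5.0, 1.0        # example Prisoner's Dilemma: c > a > d > b
    N, n = 100, 60                         # population size, number of cooperators
    pi_C = (n * a + (N - n) * b) / N       # average payoff to a cooperator
    pi_D = (n * c + (N - n) * d) / N       # average payoff to a defector

    # strategy-dependent observation noise sigma_{s_j s_i}
    sigma = {("C", "C"): 0.0, ("C", "D"): 2.0, ("D", "C"): 2.0, ("D", "D"): 0.0}

    def observed_payoff(pi_j, s_j, s_i):
        """Noisy observation pi_{ji} = pi_j + sigma_{ji} * xi of player j's payoff."""
        return pi_j + sigma[(s_j, s_i)] * rng.standard_normal()

    print(observed_payoff(pi_C, "C", "D"))  # a defector observing a cooperator's payoff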

Each player i has a subjective utility function that maps their payoff to their utility, denoted by Ui(π). Higher payoffs correspond to higher utilities, but the exact relationship between them may vary across individuals. We group utility functions into three broad categories, according to their concavity. In a risk-neutral scenario, any change in the payoff leads to a proportional change in utility (U″(π)=0; see Fig. 1D). And so symmetric deviations from the expected payoff result in symmetric changes in utility. In a risk-averse scenario, increasing the payoff by a fixed amount increases utility, but by a smaller amount than the loss of utility caused by a same-sized decrease in payoff (U″(π)<0; see Fig. 1E). Therefore, symmetric noise around the expected payoff results in lower expected utility. Finally, in a risk-seeking scenario, symmetric noise around the expected payoff produces greater expected utility (U″(π)>0; see Fig. 1F).

We adopt a well-known exponential family of utility functions that can unify these three categories of risk preferences (23, 28, 34, 35):

U_i(\pi) = \begin{cases} 1 - \dfrac{s}{\delta_i} + \dfrac{s}{\delta_i}\exp(\delta_i\pi) & \text{if } \delta_i \neq 0, \\[4pt] 1 + s\pi & \text{if } \delta_i = 0. \end{cases} [3]

The parameter s is called the selection intensity, and it quantifies how much payoff affects utility at all. We work in the regime of weak selection (s≪1), which is common in evolutionary game theory (8, 36–40). The parameter δi governs the risk preference of player i. Risk preference is often quantified by the Arrow–Pratt absolute risk aversion measure (ARA) defined by −U″(π)/U′(π) (23, 24, 28), which is given by −δi in our formulation (Eq. 3). In other words, δi>0 means that individual i is risk-seeking, δi<0 means that individual i is risk-averse, and δi=0 means that individual i is risk-neutral.
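The following minimal sketch (illustrative code, not taken from the paper) implements the utility family of Eq. 3 and numerically verifies that the Arrow–Pratt measure −U″(π)/U′(π) evaluates to −δi:

    import numpy as np

    # Exponential utility family of Eq. 3 (illustrative sketch).
    def utility(pi, s=0.01, delta=-0.2):
        if delta == 0.0:
            return 1.0 + s * pi                                   # risk-neutral case
        return 1.0 - s / delta + (s / delta) * np.exp(delta * pi)

    # Finite-difference check of the Arrow-Pratt measure -U''/U' at pi = 1.
    pi0, h, delta = 1.0, 1e-3, -0.2
    U_plus, U_mid, U_minus = (utility(pi0 + h, delta=delta),
                              utility(pi0, delta=delta),
                              utility(pi0 - h, delta=delta))
    U1 = (U_plus - U_minus) / (2 * h)           # first derivative
    U2 = (U_plus - 2 * U_mid + U_minus) / h**2  # second derivative
    print(-U2 / U1)  # approximately 0.2, i.e. -delta: delta < 0 gives risk aversion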

Following a round of pairwise social interactions, a random player i is selected to update their strategy. The player i considers the payoff πji that they observed each opponent j receive that round. Observations are made independently with noise, however, and so different opponents using the same strategy (C or D) were observed receiving different payoffs. Player i treats this noise as uncertainty in what payoff he would achieve by adopting one strategy or another. In particular, player i considers the (empirical) mean and variance in the observed payoffs received by opponents of each strategic type, which allows player i to compute his expected utility E[Ui(πji)] for adopting the strategy of an opponent j. Player i then imitates player j’s strategy with a probability proportional to his expected utility:

e_{ij} = \frac{1}{N}\,\frac{E\!\left[U_i(\pi_{ji})\right]}{\sum_{k\neq i} E\!\left[U_i(\pi_{ki})\right]}. [4]

This process of strategic imitation is identical to the standard “death-birth” model in evolutionary game theory (8, 33, 41), except that each opponent’s payoff is replaced by the expected utility of adopting each opponent’s strategy, from the perspective of player i. This imitation process does not imply that an individual chosen to update will assuredly change their strategy, because they may imitate another individual who shares their current type.
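To see concretely how observation variance enters this update, suppose player i treats the observed payoff πji as Gaussian with mean πj and variance σji². Then the expectation in Eq. 4 has a closed form via the Gaussian moment-generating function (this is our own illustrative calculation; the general weak-selection derivation is in SI Appendix, section 2A):

E\!\left[U_i(\pi_{ji})\right] = 1 - \frac{s}{\delta_i} + \frac{s}{\delta_i}\exp\!\left(\delta_i\,\pi_j + \tfrac{1}{2}\delta_i^2\,\sigma_{ji}^2\right) \;\approx\; 1 + s\left(\pi_j + \tfrac{\delta_i}{2}\,\sigma_{ji}^2\right),

where the approximation expands the exponential to first order in δi(πj+δiσji²/2). The shifted payoff πj+(δi/2)σji² is the same structure that reappears in the replicator equation (Eq. 5) and in the effective payoff matrix (Eq. 6): observation variance helps a strategy when the observer is risk-seeking (δi>0) and hurts it when the observer is risk-averse (δi<0).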

Behavioral Evolution with Fixed Risk Preference

We start by analyzing the evolution of social behaviors when individuals have fixed risk preferences. Each individual i in the population has one of two different risk preferences, denoted δi=δ1 or δi=δ2 (Eq. 3). We denote the fraction of players with risk preference δ1 as p and the remainder as 1−p. We initially assume that each player’s risk preference is fixed throughout the process of behavioral evolution (and so p is a constant, whereas strategies can change). For a sufficiently large population, we have proven that the frequency of cooperators x (including cooperators with risk preference δ1 or δ2) evolves according to the following replicator dynamics equation (SI Appendix, section 2A):

\dot{x} = s\,x(1-x)\left[\pi_C - \pi_D + \frac{\bar{\delta}}{2}\left(V_C - V_D\right)\right], [5]

where δ¯=pδ1+(1−p)δ2 quantifies the average risk preference in the population. Here, VC quantifies the variance when a cooperator’s payoff is observed by a random player, which is given by xσCC²+(1−x)σCD². Likewise, VD represents the variance when a defector’s payoff is observed by a random player, given by xσDC²+(1−x)σDD². Lower values of VC or VD represent less uncertainty about a cooperator’s or a defector’s payoff, respectively.

When the average risk preference in the population is neutral (δ¯=0), or when the variance in payoff observation does not depend on strategies (σCC=σDC and σCD=σDD, resulting in VC=VD), the evolutionary dynamics follow the standard replicator equation of classical models without payoff uncertainty (2, 7, 32). However, when the average preference in the population is either risk-averse or risk-seeking, and noisy payoff observations depend upon strategies, there is a systematic deviation from the classical (noiseless) theory. Specifically, when cooperators’ payoffs are more transparent than defectors’ payoffs (VC−VD<0), a predominantly risk-averse population (δ¯<0) will promote the evolution of cooperation, whereas a predominantly risk-seeking population (δ¯>0) will promote defectors, and vice versa for VC−VD>0. In other words, a predominance of risk aversion promotes whichever type has more transparent payoffs, and risk-seeking preferences promote the type with more noisy payoffs. These conclusions also hold in finite populations and in structured populations (SI Appendix, Figs. S1 and S2), and they agree with the simple intuition that risk-averse individuals prefer options with less uncertainty while risk-seeking individuals prefer greater uncertainty.

Note that the relative transparency of two types’ payoffs (VC−VD) depends upon the current frequency x of the strategic types in the population. In fact, the combined effects of payoff uncertainty and risk preference on the evolutionary dynamics are equivalent to a transformation of the original payoff matrix (Eq. 1) into an “effective payoff matrix” given by (see Materials and Methods and SI Appendix, section 2A):

\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \frac{\bar{\delta}}{2}\begin{pmatrix} \sigma_{CC}^2 & \sigma_{CD}^2 \\ \sigma_{DC}^2 & \sigma_{DD}^2 \end{pmatrix}. [6]

For instance, in the classic Prisoner’s Dilemma with noise-free payoff observation, defectors receive higher payoffs regardless of their frequency, and so the population will always converge to the full-defection state. The same result holds when the population is risk-neutral on average (δ¯=0, Fig. 2B). However, when payoff observations between players of different types are more noisy than between players of the same type (i.e., σDC>σCC and σCD>σDD), then a risk-averse population (δ¯<0) leads to bistable behavioral dynamics (Fig. 2C) so that the population will eventually reach either full cooperation or full defection, depending on its initial composition. This outcome is similar to the dynamics of a classical Stag Hunt game without noise, rather than the Prisoner’s Dilemma. By contrast, when the population is risk-seeking on average (δ¯>0) evolution leads to stable coexistence of both cooperators and defectors (Fig. 2D), similar to the outcome in a classic Snowdrift game without noise. Furthermore, Eq. 6 implies that even for fixed risk preferences in the population, different combinations of uncertainty in payoff observation between types can qualitatively change the outcome of the Prisoner’s Dilemma, producing behavior classically associated with a wide range of social interactions (SI Appendix, Fig. S3). Note that our model of observation uncertainty is qualitatively different from demographic noise (38, 52) arising from an imitation process in a finite population (SI Appendix, Fig. S1).
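As an illustration of Eq. 6, the sketch below (our code; the Prisoner's Dilemma payoffs are hypothetical, and the noise magnitudes mirror Fig. 2) builds the effective payoff matrix for a risk-neutral, a risk-averse, and a risk-seeking population and classifies the resulting game:

    import numpy as np

    # Effective payoff matrix of Eq. 6 (illustrative sketch, not the authors' code).
    def effective_game(payoffs, sigmas, delta_bar):
        """payoffs and sigmas are 2x2 arrays ordered [[a, b], [c, d]] and
        [[sigma_CC, sigma_CD], [sigma_DC, sigma_DD]]."""
        return payoffs + 0.5 * delta_bar * sigmas**2

    payoffs = np.array([[3.0, 0.0],      # a, b  (hypothetical PD: c > a > d > b)
                        [5.0, 1.0]])     # c, d
    sigmas = np.array([[0.0, 20.0],      # noisier observations between players of
                       [20.0, 0.0]])     # different types, as in Fig. 2

    for delta_bar, label in [(0.0, "risk-neutral"), (-0.2, "risk-averse"), (0.2, "risk-seeking")]:
        (a, b), (c, d) = effective_game(payoffs, sigmas, delta_bar)
        if a > c and d > b:
            regime = "bistability (Stag-Hunt-like coordination)"
        elif a < c and d < b:
            regime = "stable coexistence (Snowdrift-like anticoordination)"
        elif c > a and d > b:
            regime = "defection dominates (Prisoner's Dilemma)"
        else:
            regime = "cooperation dominates"
        print(f"{label}: a={a}, b={b}, c={c}, d={d} -> {regime}")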

Fig. 2.

Risk preferences can reverse behavior in the Prisoner’s Dilemma. (A) When payoffs are observed without noise the population evolves to the sole stable equilibrium, pure defection. (B) If the population is risk-neutral, behavioral evolution in the presence of payoff uncertainty is identical to evolution without uncertainty, ending in defection. (C) If the population is risk-averse, it will evolve to either complete cooperation or complete defection, depending upon the initial frequency of cooperators. Payoff uncertainty and risk aversion therefore produce bistability, which is classically associated with the Stag Hunt game instead of the Prisoner’s Dilemma. (D) If the population is risk-seeking, it will evolve to a stable mixture of cooperators and defectors. This outcome of stable coexistence is classically associated with the Snowdrift game, instead of the Prisoner’s Dilemma. The solid arrows on the y-axis in each panel correspond to the analytical prediction in an infinite population, indicating unstable (open circles) and stable (closed circles) equilibria. Lines in each panel represent evolutionary trajectories by Monte Carlo simulations, starting from different initial states, with two representative trajectories highlighted (red and blue thick lines). Colored lines represent trajectories in different basins of attraction. Matrices shown in each panel denote the effective game (Eq. 6) that results from the combination of a Prisoner’s Dilemma with payoff uncertainty and risk preference. In panels (BD) there is greater uncertainty when an individual observes the payoff of a player using a different strategy, versus a player with the same strategy. Parameter values: σCC=σDD=0, σCD=σDC=20, s=0.01, δ1=0.2 (C and D), δ2=0.2 (C and D), N=100,000.

Behavioral Evolution with Adaptive Risk Preferences

So far we have assumed that individuals’ risk preferences are fixed. But in reality, risk preferences are known to vary with age, life experiences, and environmental and cultural factors; and empirical studies have shown that individuals’ risk preferences can change over time (42–45). Generally, wealthy individuals who have more resources to explore opportunities tend to be more risk-seeking, while individuals lacking the buffer of standing capital tend to prioritize security and are more risk-averse (23, 34, 35, 46).

We can extend our modeling framework to accommodate individuals who change their risk preference over time, in response to past experiences. In this formulation, the population will contain a dynamically changing mixture of different risk preferences, as well as a dynamically changing composition of behavioral strategies. To model this, after a player is chosen to update his strategy, we randomly select another player, denoted by j, and we allow them to adjust their risk preference with probability u. (When u<s this means that risk preferences change more slowly than behavioral strategies.) If the selected player j elects to update their risk preference, we assume they will adopt a risk-seeking preference δ1=δ>0 with probability 1/2+(πj−η)/(2D), or alternatively they adopt a risk-averse preference δ2=−δ<0. Here, η denotes a switching threshold and D=|π−η|max represents the maximum possible payoff deviation from η. In other words, if a player’s payoff exceeds some threshold η, they tend to adopt a risk-seeking attitude; otherwise, they tend to be risk-averse.

Our model for how individuals update risk preferences reflects the general finding in economic studies that wealth correlates with risk-seeking behaviors. It has been widely assumed and verified empirically that the absolute risk aversion (i.e., −δi in our model) decreases as wealth increases (23, 34, 35, 46), which corresponds to our assumption that individuals with lower payoffs tend to become more risk-averse. In addition, it is reasonable to assume that u is smaller than s, meaning that players consider changing their behavioral strategy more often than they consider updating their overall attitude toward risk.
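A minimal sketch of this stochastic preference update (our illustrative code; the values of η, D, and the two preference types are assumptions, not taken from any particular figure):

    import numpy as np

    rng = np.random.default_rng(seed=2)

    def updated_preference(pi_j, eta=0.5, D=1.5, delta1=0.03, delta2=-0.03):
        """Adopt the risk-seeking type delta1 with probability 1/2 + (pi_j - eta)/(2*D),
        and the risk-averse type delta2 otherwise."""
        prob_seek = 0.5 + (pi_j - eta) / (2 * D)
        return delta1 if rng.random() < prob_seek else delta2

    print(updated_preference(pi_j=1.0))   # high payoff: risk-seeking more likely
    print(updated_preference(pi_j=0.0))   # low payoff: risk-averse more likely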

The coevolution of strategies and risk preferences in the population is nominally a three-dimensional system (there are now four types: cooperators who are risk-averse or risk-seeking and defectors who are risk-averse or risk-seeking). Nonetheless, for weak selection on both strategies and risk preferences (s≪1 and u≪1), we have proven that the dynamics can be described by two variables (SI Appendix, section 2B): the proportion of cooperators, denoted x, and the overall proportion of risk-seeking players, denoted p, as follows:

\dot{x} = s\,x(1-x)\left[\pi_C - \pi_D + \frac{\bar{\delta}}{2}\left(V_C - V_D\right)\right], [7a]
\dot{p} = u\left(\frac{1}{2} + \frac{\bar{\pi} - \eta}{2D} - p\right). [7b]

Here, π¯=xπC+(1−x)πD represents the average payoff in the population.
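The following numerical sketch integrates Eq. 7 for the donation game (our reimplementation, not the authors' code), with parameter values chosen to match those listed in Fig. 3; the value of D below reflects our reading of |π−η|max for payoffs in the range [−r, 1+r]:

    import numpy as np
    from scipy.integrate import solve_ivp

    # Coevolution of strategies (x) and risk preferences (p), Eq. 7, for the
    # donation game a=1, b=-r, c=1+r, d=0 with delta1 = -delta2 = delta.
    r, s, u, eta, delta = 0.5, 0.01, 0.0003, 0.5, 0.03
    sCC, sCD, sDC, sDD = 0.0, 0.0, 200.0, 100.0
    D = r + 0.5                                  # assumed max |pi - eta| over payoffs in [-r, 1+r]

    def rhs(t, y):
        x, p = y                                 # cooperator and risk-seeker frequencies
        pi_C = x - (1 - x) * r                   # average cooperator payoff
        pi_D = x * (1 + r)                       # average defector payoff
        pi_bar = x * pi_C + (1 - x) * pi_D       # population average payoff
        V_C = x * sCC**2 + (1 - x) * sCD**2      # observation variance for cooperators
        V_D = x * sDC**2 + (1 - x) * sDD**2      # observation variance for defectors
        delta_bar = p * delta + (1 - p) * (-delta)
        dx = s * x * (1 - x) * (pi_C - pi_D + 0.5 * delta_bar * (V_C - V_D))
        dp = u * (0.5 + (pi_bar - eta) / (2 * D) - p)
        return [dx, dp]

    sol = solve_ivp(rhs, t_span=(0.0, 200_000.0), y0=[0.8, 0.6], max_step=50.0)
    # sol.y[0] (cooperation) and sol.y[1] (risk seeking) rise and fall in succession,
    # tracing the cycle sketched in Fig. 3A when the conditions of Eq. 8 hold.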

Persistent Oscillations in the Prisoner’s Dilemma.

The coevolution of behaviors and risk preferences can lead to qualitatively new dynamical phenomena that cannot occur in classical models of behavioral evolution without risk. For example, in the Prisoner’s Dilemma with a=1, b=−r, c=1+r, d=0 (r>0), taking η=1/2, we find the population may enter a stable limit cycle, cycling between risk-averse and risk-seeking preferences while it also cycles between cooperators and defectors (Fig. 3 A–C). This outcome is in sharp contrast to a stable equilibrium, which can occur when risk preferences are fixed, and which always occurs when payoffs are noiseless.

Fig. 3.

Coevolution of behavior and risk preference. We consider a scenario in which individuals with higher payoffs tend to adopt more risk-seeking preferences, and cooperators’ payoffs are more transparent than defectors’ payoffs (VC−VD<0). (A) Coevolution of strategies and risk preferences can generate oscillatory dynamics. In the Prisoner’s Dilemma illustrated here, full cooperation generates high payoffs, which then increases the frequency of risk-seeking types (green arrow on the Right). The resulting risk-seeking population favors the spread of defectors, and so cooperation eventually collapses (blue arrow on the Top), leading to a decrease in payoffs. Subsequently, individuals begin to adopt a risk-averse preference (green arrow on the Left), which again favors the spread of cooperators (blue arrow on the Bottom). These dynamics produce either a stable limit cycle (B and C) or decaying oscillations (D and E), depending upon the relative rate at which individuals update their attitudes to risk versus their behaviors (the persistence of cycles is predicted by the analytic condition in Eq. 8). Panels (B) and (D) show solutions in an infinite population, whereas panels (C) and (E) show sample trajectories of a corresponding stochastic process in a large, finite population. Parameters: r=0.5, σCC=σCD=0, σDC=200, σDD=100, u=0.0003 (B and C), u=0.0009 (D and E), s=0.01, N=500,000, η=0.5, δ1=−δ2=0.03, x=0.8, p=0.6.

A stable limit cycle involving four types is a somewhat unusual outcome for behavioral evolution, compared to classical theories. In particular, we can prove that a stable limit cycle exists provided

\sigma_{CC}^2 - \sigma_{DC}^2 < \sigma_{CD}^2 - \sigma_{DD}^2 < -\frac{2r(2r+1)}{\delta} \quad \text{and} \quad u < u^*, [8]

where u* is a critical rate of risk adaptation (see Materials and Methods and SI Appendix, section 2B) that is monotonically increasing with s (u*=O(s)). Here, we provide some intuition for these conditions that produce stable cycles. The condition σCC²−σDC²<σCD²−σDD²<−2r(2r+1)/δ<0 implies that cooperator payoffs are more transparent than defector payoffs (i.e., σCC²<σDC², σCD²<σDD², and thus VC<VD). Under this condition, we have already seen that a predominance of risk aversion favors the evolution of cooperators, whereas a risk-seeking preference hinders cooperators. Starting from a state of full cooperation, then, individuals enjoy high payoffs and thus they tend to become risk-seeking, which is subsequently detrimental to cooperators and enables invasion by defectors. As defectors overtake the population, the payoffs for all players will decrease, which, in turn, will stimulate conversion to risk aversion, and the subsequent resurgence of cooperators. This cycle of oscillating risk preferences and social behavior will persist indefinitely, provided the conditions of Eq. 8 are met.

Eq. 8 implies that the limit cycle is more likely, all other things equal, when u is small (u≪s) and δ is large. This makes intuitive sense. When u is significantly smaller than s, the strategy composition (x) evolves much more quickly than risk preferences (p). Consequently, small changes in risk-type frequency p lead to substantial changes in strategy composition x, which will cause an overshoot effect that promotes cycles. Likewise, when δ is large, any change in an individual’s risk preference is dramatic, so that even a small proportion of individuals changing their risk preferences can result in considerable changes in the population’s average risk preference. Both of these scenarios contribute to the overshooting of x relative to p (seen in Fig. 3B), which facilitates persistent cycles. The condition for limit cycles in Eq. 8 also mandates that the transparency of cooperators’ payoffs exceeds that of defectors in the full-defection state (σCD²−σDD²<−2r(2r+1)/δ<0). This condition is also intuitive because it means that individuals in the full-defection state will be compelled to cooperate, as illustrated by the lower blue arrow in Fig. 3A. One final stipulation for limit cycles (Eq. 8) is that payoffs to cooperators in the full-cooperation state must be more transparent than in the full-defection state (σCC²−σDC²<σCD²−σDD²), which ensures that the interior fixed point is unstable.
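As a concrete check of the first inequality in Eq. 8 (our own arithmetic, using the parameter values listed in Fig. 3: r=0.5, δ=0.03, σCC=σCD=0, σDC=200, σDD=100):

\sigma_{CC}^2 - \sigma_{DC}^2 = -40000 \;<\; \sigma_{CD}^2 - \sigma_{DD}^2 = -10000 \;<\; -\frac{2r(2r+1)}{\delta} \approx -66.7,

so the transparency conditions hold by a wide margin, and persistent cycles then require only that the preference-update rate satisfy u<u*.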

For simplicity, we have presented results for symmetric risk preferences (δ1=−δ2=δ), which involve two types with an equal amount of risk aversion or risk seeking. In reality, although higher payoffs may reduce risk aversion (35, 46–48), they may not go so far as to induce people to become risk-seeking (δ2<δ1≤0). In SI Appendix, section 2B we provide a sufficient condition for limit cycles for a general Prisoner’s Dilemma and arbitrary risk preferences δ1 and δ2. The condition shows that stable cycles can still occur when there is variation only in the degree of risk aversion (but no risk-seeking type) (SI Appendix, Fig. S4).

Diverse Dynamical Outcomes.

Aside from limit cycles, adaptive risk preferences can lead to a diverse range of dynamical phenomena in the Prisoner’s Dilemma. These include oscillating dynamics with a spiral sink (Fig. 3 D and E), stable coexistence of cooperators and defectors, and bistability of three different forms. We provide a comprehensive analysis of all possible evolutionary dynamics in SI Appendix, Fig. S5 (also see SI Appendix, section 2B). This broad array of dynamical outcomes arises from the simple Prisoner’s Dilemma alone, depending on the rate at which individuals adapt their risk preferences (u) and on the relative amount of uncertainty when observing the payoffs of cooperators and defectors.

Oscillatory Behavior in Other Games.

We have seen that the Prisoner’s Dilemma can produce a diverse range of behavioral dynamics when payoffs are noisy and individuals adapt their risk preferences in response to recent payoffs. This is true despite the fact that the classical Prisoner’s Dilemma leads to the simplest possible outcome, dominance of one type, in the absence of noise or uncertainty. Our method of analysis is general, however, and we can use it to study the effects of uncertainty and risk preference for other classic social interactions, such as the Stag Hunt game and the Snowdrift game. In these cases, we again find diverse evolutionary outcomes. In the Stag Hunt game, for example, where a player is incentivized to choose the same strategy as his opponent (a>c and d>b), the classic replicator dynamics produce either full cooperation or full defection; but with uncertainty and adaptive risk preferences, there are in addition two distinct oscillatory outcomes (Fig. 4A) that sustain both behavioral types in the population. Similar oscillatory behaviors occur in Snowdrift games (a<c and d<b), even though there is only one stable equilibrium in the classic setting without payoff uncertainty (Fig. 4 C and D).

Fig. 4.

Coevolution of behavior and risk preferences for other social interactions. The coevolution of strategies and risk preferences produces complex dynamical phenomena, including multiple cycles in different regions of state space, in the Stag Hunt game and Snowdrift game. (A) For the Stag Hunt game, which is a coordination game, we consider a scenario with greater uncertainty in payoff observations between players using different strategies than the same strategy (σCC=σDD=0 and σDC=σCD=102). Two oscillating dynamical patterns emerge in different regions of state space (x<1/2 versus x>1/2). (B) Evolutionary trajectories in Monte Carlo simulations. (C) For the Snowdrift game, which is an anticoordination game, we consider a scenario with greater uncertainty in payoff observations between players using the same strategy (σCC=σDD=102 and σDC=σCD=0). Again, two oscillatory dynamical patterns emerge in different regions of state space. (D) Evolutionary trajectories in Monte Carlo simulations. Matrices indicate payoffs associated with each game, and the amount of uncertainty when a player of one strategic type observes the payoffs of another type. Parameters: δ1=0.05, δ2=0.05 (A and B), δ2=0 (C and D), s=0.1, u=0.01 (A and B), u=0.003 (C and D), η=0.5.

The emergence of oscillations relies on two factors: how the frequency of strategic types (x) affects the average payoff (and, subsequently, the risk preference of the population) and how risk preference affects the survival of strategic types. There are two routes that can produce oscillations, which may be either persistent or decaying in magnitude (SI Appendix, Fig. S6). One route is exemplified by the Prisoner’s Dilemma: If more cooperators bring higher average payoffs (dπ¯/dx>0) and cooperators’ payoffs are more transparent than defectors’ payoffs (VC−VD<0), then more cooperators drive individuals to become risk-seeking. But a risk-seeking population is detrimental to cooperators, which allows defectors to invade and eventually reduces the population payoff, so that individuals convert back to risk aversion and, eventually, to cooperation once again. The other route occurs when cooperators yield lower population payoffs (dπ¯/dx<0) and defectors’ payoffs are more transparent than cooperators’ (VC−VD>0). In this case, a high frequency of cooperators drives individuals to risk aversion, which is detrimental to cooperators. The subsequent increases in defectors and average payoff drive individuals to be more risk-seeking, which in turn converts the population back to cooperation (since VC>VD).

For general two-by-two games, the effect of cooperator frequency on the population average payoff (dπ¯/dx) and the relative transparency of payoffs for the two strategic types (VC−VD) both depend on the current frequency of strategies. And so oscillations can arise in multiple different regions of state space, through different routes. For example, in the Stag Hunt game (Fig. 4 A and B), in the region x<1/2, more cooperators produce lower average payoffs and defectors’ payoffs are more transparent, which is consistent with the second route toward oscillation. Conversely, when x>1/2 the dynamics agree with the first route toward oscillations. When payoff uncertainty is asymmetric (σCC≠σDD or σCD≠σDC), then oscillatory behavior can be more complicated than in Fig. 4, producing phenomena such as limit cycles and even transitions between two different basins of oscillatory dynamics (SI Appendix, Fig. S7).

Discussion

Uncertainty permeates human social lives. Whether arising from imprecise information, ambiguous communication, the need for confidentiality, or intrinsic randomness in the rewards of social interactions, individuals must make decisions about social behavior in the face of uncertain outcomes. Although utility theory provides a classic framework for rational decision-making under uncertainty, the effects of uncertainty have not been systematically studied for populations engaged in social learning by imitation. We have therefore extended the theory of evolutionary games to accommodate the real-life complications of uncertainty and individual attitudes toward risk.

Regardless of their attitudes toward risk, all individuals have the same rank-order payoff preferences: They prefer higher payoffs. Nonetheless, we have seen that different attitudes to risk can fundamentally reshape the dynamical process of behavioral evolution by imitation, compared to classical predictions in the absence of uncertainty (2–5). A risk-averse population consistently benefits a behavioral type with more transparent payoffs (less uncertainty), whereas a risk-seeking population promotes the evolution of types with noisier payoffs (greater uncertainty). If the average risk preference of the population (δ¯) is nonzero, the dynamics of behavior for a given social interaction involving uncertainty can be mapped to the classical dynamics for a different social interaction without uncertainty. This mapping (Eq. 6) provides a precise understanding of how payoff uncertainty and risk preferences combine to change the nature of social interactions.

Alternatively, if individuals update their risk preferences over time, so that the population contains a dynamic mixture of risk-seeking and risk-averse types, then qualitatively new phenomena can arise, such as persistent cycles in social behavior that do not arise in the absence of uncertainty.

Our analysis provides a step forward in the large field of evolutionary games: we have developed a framework for studying uncertainty in social learning, analogous to the well-established framework in economics for individual decision-making under uncertainty. At the same time, our results are not merely of theoretical interest, as they have real-life implications. For example, because most people tend to be risk-averse (35, 46–48), cooperators have an incentive to clearly disclose to others the payoffs they receive, so as to attract imitators who adopt their behavior and provide future cooperative partners. In other words, increased transparency about the payoffs from cooperation leads to a more efficient equilibrium (Pareto efficiency).

The advantage to cooperators for disclosing their payoffs in a risk-averse population is analogous to the “Market for Lemons” in economics (26). In the economic setting, goods with poor quality can outperform those with high quality due to noisy information about quality. This situation, known as adverse selection, arises from incomplete market information, and it puts pressure on purveyors of high-quality products to make their information more transparent to buyers. Our results provide a counterpart to this phenomenon in the context of social behaviors.

Even if most people are generally averse to risk, the degree of risk aversion is known to vary across individuals, or even within an individual who experiences different circumstances (notably, wealth) over time (46). If larger payoffs tend to induce more risk tolerance, then oscillating dynamics of behavior and risk attitudes can emerge, and they can even persist indefinitely. These results are reminiscent of the oscillations that arise when the environmental state (and game payoffs) changes in response to the strategic composition in the population (15, 16, 40, 50, 51). In this analogy, a population that is more risk-averse, on average, corresponds to a low-quality environmental state, and a population with less risk aversion corresponds to a high-quality environmental state; cooperation drives the environment toward high quality, and defection drives the environment toward low quality. Although this may be a useful analogy for gaining intuition, the actual mechanism at play in our study is fundamentally different from eco-evolutionary games (15, 16, 40, 50, 51); and uncertainty in payoffs can even lead to multiple different basins of oscillatory dynamics for a given type of social interaction (Fig. 4).

Previous work has explored the effects of stochasticity on frequency-dependent dynamics in evolution (20, 38, 49, 52–54). Our work diverges from that literature in several key respects. First, the source of randomness in prior studies was either demographic stochasticity (38, 49, 52, 54) or environmental stochasticity (20, 53, 54). By contrast, we have described uncertainty in the social interactions themselves, arising from noise when individuals observe each other’s payoffs. This is a fundamentally different source of randomness, which is more properly called uncertainty and which aligns more closely with the concept of incomplete information in game theory. As a result, unlike in prior work, our analysis incorporates individuals’ attitudes to uncertainty, using the established theory of expected utility to describe the level of risk aversion. Behavioral dynamics that arise from payoff uncertainty can be qualitatively different from those arising from demographic or environmental stochasticity.

Our study has several important limitations. We have focused on the simple scenario of a well-mixed, large population, even though population structure (in the absence of uncertainty) is known to have a strong impact on behavioral evolution (8–10, 55–57). We have explored the combined effects of population structure and payoff uncertainty by simulations (SI Appendix, Fig. S2), but this remains a topic for future analytical research. Likewise, we have focused on pairwise interactions (two-player games) alone, leaving unexplored the effects of uncertainty and risk preferences on behavioral evolution in multiway interactions, such as public goods games (58, 59), or in iterated games. Finally, although utility theory has been an incisive tool for understanding individual decision-making in the face of uncertainty, Kahneman and Tversky have proposed an alternative theory, prospect theory, that relaxes some assumptions of perfect rationality (25). According to prospect theory, for example, individuals have asymmetric attitudes when facing gains versus losses, known as loss aversion. Analysis of how uncertainty and risk preferences, under the framework of prospect theory, will affect behavioral dynamics in a population undergoing social learning remains an outstanding area for future development in evolutionary game theory.

Materials and Methods

Complete derivations of our mathematical results are detailed in SI Appendix. We briefly outline these derivations below.

Replicator Dynamics.

We briefly summarize the derivation of the replicator equation that describes the evolution of behavior in the presence of payoff uncertainty and risk preferences. Some of the technical details, as indicated, are deferred to SI Appendix.

We consider a population with N individuals consisting of n cooperators and N−n defectors. Among them, M individuals have risk preference δ1 and N−M have preference δ2 (M is a constant, and we define p=M/N). There are four types of individuals, δ1-cooperators, δ2-cooperators, δ1-defectors, and δ2-defectors, whose numbers are denoted by n1, n−n1, M−n1, and N−M−n+n1. The state of the population can be described by a 2-tuple n=(n,n1). The master equation of this stochastic system is

P(n,\tau+1) - P(n,\tau) = \sum_{n'\neq n}\left[P(n',\tau)\,T(n'\to n) - P(n,\tau)\,T(n\to n')\right], [9]

where P(n,τ) is the probability of the state being n at time step τ, and T(n→n′) is the transition rate from state n to n′. In SI Appendix, section 2A, we show that for an arbitrary state (n,n1), only T((n,n1)→(n+1,n1)), T((n,n1)→(n+1,n1+1)), T((n,n1)→(n−1,n1)), T((n,n1)→(n−1,n1−1)), and T((n,n1)→(n,n1)) are nonzero, denoted by T1, T2, T3, T4, and 1−T1−T2−T3−T4. Introducing the notation x=n/N, q=n1/N, and t=τ/N, we obtain the following Fokker–Planck equation

\frac{d}{dt}\rho(x,q,t) = -\frac{\partial}{\partial x}\left[(T_1+T_2-T_3-T_4)\rho\right] - \frac{\partial}{\partial q}\left[(T_2-T_4)\rho\right] + \frac{1}{2N}\frac{\partial^2}{\partial x^2}\left[(T_1+T_2+T_3+T_4)\rho\right] + \frac{1}{2N}\frac{\partial^2}{\partial q^2}\left[(T_2+T_4)\rho\right] + \frac{1}{N}\frac{\partial^2}{\partial x\,\partial q}\left[(T_2+T_4)\rho\right], [10]

where ρ(x,q,t)=N²P(n,τ) is the probability density of state (x,q). The corresponding Langevin equation is

dx = (T_1+T_2-T_3-T_4)\,dt + \sqrt{(T_1+T_3)/N}\,dW_t^{(1)} + \sqrt{(T_2+T_4)/N}\,dW_t^{(2)}, [11a]
dq = (T_2-T_4)\,dt + \sqrt{(T_2+T_4)/N}\,dW_t^{(2)}. [11b]

Here, Wt(1) and Wt(2) are independent Wiener processes. For large populations (N≫1), the stochastic terms vanish and we obtain an ordinary differential equation. Expanding T1–T4 in a Taylor series and truncating to first order in s, we obtain the replicator equations

\dot{x} = s\,x(1-x)\,f(x,q), [12a]
\dot{q} = px - q + O(s). [12b]

For weak selection (s≪1), q reaches its equilibrium much more quickly than x. In other words, before the frequency of cooperators x changes at all, the frequency of δ1-cooperators q will converge to the slow manifold

q = px + O(s) \approx px. [13]

This equation means that each type of risk preference is evenly distributed among cooperators and defectors. Substituting Eq. 13 into Eq. 12a, we obtain the replicator equation (Eq. 5) used in the main text.

When individuals are allowed to adapt their risk preferences over time, the frequency of the risk-seeking type p is a variable, not a constant. We suppose that in each time step after strategy updating, another individual is chosen and has a chance, with probability u, to update their risk preference. For a cooperator, the individual chooses δ1 or δ2 with probability vC+=1/2+(πC−η)/(2D) or vC−=1/2−(πC−η)/(2D), respectively. vD+ and vD− can be defined similarly for defectors. Let E[Δp] denote the expected change of p in each time step. Then, after rescaling time (t=τ/N), the derivative of p is

\dot{p} = \frac{E[\Delta p]}{1/N} = u\left[(x-q)\,v_C^+ + (1-x-p+q)\,v_D^+ - q\,v_C^- - (p-q)\,v_D^-\right] = u\left[\frac{1}{2} + \frac{\bar{\pi}-\eta}{2D} - p\right]. [14]

Combining Eq. 14 with Eq. 12, the system is described by a three-dimensional ordinary differential equation. Similarly, for weak selection and a slow rate of preference updates (s,u≪1), q will quickly equilibrate to px before x and p change. Substituting Eq. 13 into Eqs. 12a and 14, the system simplifies to a two-dimensional ordinary differential equation (see Eq. 7 in the main text).

Evolutionary Dynamics with Fixed Risk Preference.

For a population with fixed risk preferences, the replicator equation is given by Eq. 5. It can be written as

\dot{x} = s\,x(1-x)\left[b - d + (a-b-c+d)\,x + \frac{\bar{\delta}}{2}\left(\sigma_{CD}^2 - \sigma_{DD}^2 + \left(\sigma_{CC}^2 - \sigma_{CD}^2 - \sigma_{DC}^2 + \sigma_{DD}^2\right)x\right)\right]. [15]

Comparing this equation with the classic replicator equation (i.e., x˙=x(1−x)[b−d+(a−b−c+d)x]), we see that the effect of uncertainty and risk preference on the dynamics can be described by a transformation of the game payoff matrix in the following way:

\begin{pmatrix} a & b \\ c & d \end{pmatrix} \longrightarrow \begin{pmatrix} a & b \\ c & d \end{pmatrix} + \frac{\bar{\delta}}{2}\begin{pmatrix} \sigma_{CC}^2 & \sigma_{CD}^2 \\ \sigma_{DC}^2 & \sigma_{DD}^2 \end{pmatrix}. [16]

That is, the behavioral dynamics of a population playing the left game in a setting with noisy payoff observations is the same as for a population playing the right game in the (classical) setting without payoff uncertainty.

Analysis of Global Limit Cycles.

We mainly focus on symmetric risk preferences δ1=−δ2=δ>0 and the donation game a=1, b=−r, c=1+r, d=0. In this case, Eq. 7 has two equilibrium points e1=(0,p1) and e2=(1,p2) on the boundary, which represent the full-defection and full-cooperation states, respectively. Aside from these two outcomes, there are at most two other fixed points, the interior fixed points e+ and e−. According to the Poincaré–Bendixson theorem, if all fixed points are unstable, a globally stable limit cycle will exist. In SI Appendix, section 2B, we show that the instability of e1 requires σCD²−σDD²<−2r(2r+1)/δ, and the instability of e2 requires σCC²−σDC²<2r(2r+1)/δ. Furthermore, if e1 and e2 are unstable, we show that e+ never exists, so we need to consider only the stability of e−. The point e− is unstable if and only if σCC²−σDC²<σCD²−σDD² and u<u*. Here, u* is a critical value of order O(s):

u^* = s\,x^*(1-x^*)\left(x^*-\tfrac{1}{2}\right)(\alpha-\beta)/(2r+1), [17]

where

\alpha = \sigma_{CC}^2 - \sigma_{DC}^2, [18a]
\beta = \sigma_{CD}^2 - \sigma_{DD}^2, [18b]
x^* = \frac{(\alpha-3\beta) - \sqrt{(\alpha-3\beta)^2 + 8(\alpha-\beta)\left(\beta + 2r(2r+1)/\delta\right)}}{4(\alpha-\beta)}. [18c]

Combining these conditions we obtain Eq. 8 in the main text.

General Condition for Oscillating Dynamics in Arbitrary Games.

From the description in the main text, we can obtain a necessary condition for oscillating dynamics in the neighborhood of an equilibrium point e. That is

\frac{d\bar{\pi}}{dx}\,\left(V_C - V_D\right)\Big|_{e} < 0. [19]

More precisely, oscillating dynamics in the neighborhood of e requires that the two eigenvalues of the Jacobian at e have nonzero imaginary parts. Although condition Eq. 19 is not sufficient to guarantee a nonzero imaginary part, in SI Appendix, section 2C, we show that if Eq. 19 holds, oscillating dynamics can always emerge for a sufficiently large amount of risk seeking and risk aversion (i.e., δ1≫0 and δ2≪0).

Supplementary Material

Appendix 01 (PDF)

Acknowledgments

G.W. acknowledges support from the China Scholarship Council (No. 202306010132). Q.S. acknowledges support from Shanghai Pujiang Program (No. 23PJ1405500). L.W. acknowledges support from the National Natural Science Foundation of China (No. 62036002). J.B.P. acknowledges support from the Simons Foundation Math+X grant to the University of Pennsylvania, and from the John Templeton Foundation (Grant #62281).

Author contributions

G.W., Q.S., L.W., and J.B.P. designed research; G.W. performed research; G.W., Q.S., L.W., and J.B.P. analyzed data; and G.W., Q.S., and J.B.P. wrote the paper with input from L.W.

Competing interests

The authors declare no competing interest.

Footnotes

This article is a PNAS Direct Submission.

Contributor Information

Qi Su, Email: qisu@sjtu.edu.cn.

Long Wang, Email: longwang@pku.edu.cn.

Joshua B. Plotkin, Email: jplotkin@sas.upenn.edu.

Data, Materials, and Software Availability

There are no data underlying this work.


References

  • 1.Bandura A., Walters R. H., Social Learning and Personality Development (Holt Rinehart and Winston, New York, NY, 1963). [Google Scholar]
  • 2.Taylor P. D., Jonker L. B., Evolutionary stable strategies and game dynamics. Math. Biosci. 40 (1–2), 145–156 (1978). [Google Scholar]
  • 3.Smith J. M., Evolution and the Theory of Games (Cambridge University Press, Cambridge, 1982). [Google Scholar]
  • 4.Weibull J. W., Evolutionary Game Theory (MIT Press, Cambridge, MA, 1997). [Google Scholar]
  • 5.Sandholm W. H., Population Games and Evolutionary Dynamics (MIT Press, Cambridge, MA, 2010). [Google Scholar]
  • 6.Smith J. M., Price G. R., The logic of animal conflict. Nature 246, 15–18 (1973). [Google Scholar]
  • 7.Hofbauer J., Sigmund K., Evolutionary Games and Population Dynamics (Cambridge University Press, Cambridge, MA, 1998). [Google Scholar]
  • 8.Ohtsuki H., Hauert C., Lieberman E., Nowak M. A., A simple rule for the evolution of cooperation on graphs and social networks. Nature 441, 502–505 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.van Veelen M., García J., Rand D. G., Nowak M. A., Direct reciprocity in structured populations. Proc. Natl. Acad. Sci. U.S.A. 109, 9929–9934 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.McAvoy A., Allen B., Fixation probabilities in evolutionary dynamics under weak selection. J. Math. Biol. 82, 14 (2021). [DOI] [PubMed] [Google Scholar]
  • 11.Axelrod R., Effective choice in the prisoner’s dilemma. J. Conflict Resolut. 24, 3–25 (1980). [Google Scholar]
  • 12.Nowak M. A., Sigmund K., Tit for tat in heterogeneous populations. Nature 355, 250–253 (1992). [Google Scholar]
  • 13.Press W. H., Dyson F. J., Iterated prisoner’s dilemma contains strategies that dominate any evolutionary opponent. Proc. Natl. Acad. Sci. U.S.A. 109, 10409–10413 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Hilbe C., Chatterjee K., Nowak M. A., Partners and rivals in direct reciprocity. Nat. Hum. Behav. 2, 469–477 (2018). [DOI] [PubMed] [Google Scholar]
  • 15.Weitz J. S., Eksin C., Paarporn K., Brown S. P., Ratcliff W. C., An oscillating tragedy of the commons in replicator dynamics with game-environment feedback. Proc. Natl. Acad. Sci. U.S.A. 113, E7518–E7525 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Tilman A. R., Plotkin J. B., Akçay E., Evolutionary games with environmental feedbacks. Nat. Commun. 11, 915 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Nowak M. A., Sigmund K., Evolution of indirect reciprocity by image scoring. Nature 393, 573–577 (1998). [DOI] [PubMed] [Google Scholar]
  • 18.Ohtsuki H., Iwasa Y., The leading eight: Social norms that can maintain cooperation by indirect reciprocity. J. Theor. Biol. 239, 435–444 (2006). [DOI] [PubMed] [Google Scholar]
  • 19.Sigmund K., The Calculus of Selfishness (Princeton University Press, Princeton, NJ, 2010). [Google Scholar]
  • 20.Assaf M., Mobilia M., Roberts E., Cooperation dilemma in finite populations under fluctuating environments. Phys. Rev. Lett. 111, 238101 (2013). [DOI] [PubMed] [Google Scholar]
  • 21.van den Berg P., Wenseleers T., Uncertainty about social interactions leads to the evolution of social heuristics. Nat. Commun. 9, 2151 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Knight F. H., Risk, Uncertainty and Profit (Houghton Mifflin, Boston, MA, 1921). [Google Scholar]
  • 23.Arrow K. J., Essays in the Theory of Risk-Bearing (Markham Pub. Co, Chicago, IL, 1971). [Google Scholar]
  • 24.Pratt J. W., Risk aversion in the small and in the large. Econometrica 44, 420 (1976). [Google Scholar]
  • 25.Kahneman D., Tversky A., Prospect theory: An analysis of decision under risk. Econometrica 47, 263–292 (1979). [Google Scholar]
  • 26.Akerlof G. A., The market for “lemons’’: Quality uncertainty and the market mechanism. Q. J. Econ. 84, 488–500 (1970). [Google Scholar]
  • 27.von Neumann J., Morgenstern O., Theory of Games and Economic Behavior (60th Anniversary Commemorative Edition) (Princeton University Press, Princeton, NJ, 2007). [Google Scholar]
  • 28.Nicholson W., Snyder C. M., Microeconomic Theory: Basic Principles and Extensions (Cengage Learning, 2012). [Google Scholar]
  • 29.Gerber A., The Nash solution as a von Neumann–Morgenstern utility function on bargaining games. Homo Oeconomicus 37, 87–104 (2020). [Google Scholar]
  • 30.Chiappori P. A., Paiella M., Relative risk aversion is constant: Evidence from panel data. J. Eur. Econ. Assoc. 9, 1021–1052 (2011). [Google Scholar]
  • 31.Selten R., Bounded rationality. J. Inst. Theor. Econ. 146, 649–658 (1990). [Google Scholar]
  • 32.Hofbauer J., Sigmund K., Evolutionary game dynamics. Bull. Am. Math. Soc. 40, 479–519 (2003). [Google Scholar]
  • 33.Nowak M. A., Evolutionary Dynamics: Exploring the Equations of Life (Harvard University Press, Cambridge, MA, 2006). [Google Scholar]
  • 34.Kimball M. S., Standard risk aversion. Econometrica 61, 589 (1993). [Google Scholar]
  • 35.Levy H., Absolute and relative risk aversion: An experimental study. J. Risk Uncertain. 8, 289–307 (1994). [Google Scholar]
  • 36.Wu B., Altrock P. M., Wang L., Traulsen A., Universality of weak selection. Phys. Rev. E 82, 046106 (2010). [DOI] [PubMed] [Google Scholar]
  • 37.Wu B., García J., Hauert C., Traulsen A., Extrapolating weak selection in evolutionary games. PLoS Comput. Biol. 9, e1003381 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Wang G., Su Q., Wang L., Plotkin J. B., Reproductive variance can drive behavioral dynamics. Proc. Natl. Acad. Sci. U.S.A. 120, e2216218120 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Allen B., et al. , Evolutionary dynamics on any population structure. Nature 544, 227–230 (2017). [DOI] [PubMed] [Google Scholar]
  • 40.Su Q., McAvoy A., Wang L., Nowak M. A., Evolutionary dynamics with game transitions. Proc. Natl. Acad. Sci. U.S.A. 116, 25398–25404 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Ohtsuki H., Nowak M. A., The replicator equation on graphs. J. Theor. Biol. 243, 86–97 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Friend I., Blume M. E., The demand for risky assets. Am. Econ. Rev. 65, 900–922 (1975). [Google Scholar]
  • 43.Haushofer J., Fehr E., On the psychology of poverty. Science 344, 862–867 (2014). [DOI] [PubMed] [Google Scholar]
  • 44.Brunnermeier M. K., Nagel S., Do wealth fluctuations generate time-varying risk aversion? Micro-evidence on individuals’ asset allocation. Am. Econ. Rev. 98, 713–736 (2008). [Google Scholar]
  • 45.Bonilla C. A., Vergara M., Risk aversion, downside risk aversion, and the transition to entrepreneurship. Theory Decis. 91, 123–133 (2021). [Google Scholar]
  • 46.Wik M., Kebede T. A., Bergland O., Holden S. T., On the measurement of risk aversion from experimental data. Appl. Econ. 36, 2443–2451 (2004). [Google Scholar]
  • 47.March J. G., Learning to be risk averse. Psychol. Rev. 103, 309–319 (1996). [Google Scholar]
  • 48.Holt C. A., Laury S. K., Risk aversion and incentive effects. Am. Econ. Rev. 92, 1644–1655 (2002). [Google Scholar]
  • 49.Traulsen A., Claussen J. C., Hauert C., Coevolutionary dynamics: From finite to infinite populations. Phys. Rev. Lett. 95, 238701 (2005). [DOI] [PubMed] [Google Scholar]
  • 50.Hilbe C., Šimsa Š., Chatterjee K., Nowak M. A., Evolution of cooperation in stochastic games. Nature 559, 246–249 (2018). [DOI] [PubMed] [Google Scholar]
  • 51.Wang G., Su Q., Wang L., Evolution of state-dependent strategies in stochastic games. J. Theor. Biol. 527, 110818 (2021). [DOI] [PubMed] [Google Scholar]
  • 52.Constable G. W. A., Rogers T., McKane A. J., Tarnita C. E., Demographic noise can reverse the direction of deterministic selection. Proc. Natl. Acad. Sci. U.S.A. 113, E4745–E4754 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Stollmeier F., Nagler J., Unfair and anomalous evolutionary dynamics from fluctuating payoffs. Phys. Rev. Lett. 120, 058101 (2018). [DOI] [PubMed] [Google Scholar]
  • 54.Lehmann L., Perrin N., Rousset F., Population demography and the evolution of helping behaviors. Evolution 60, 1137–1151 (2006). [PubMed] [Google Scholar]
  • 55.Tarnita C. E., Ohtsuki H., Antal T., Fu F., Nowak M. A., Strategy selection in structured populations. J. Theor. Biol. 259, 570–581 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Rand D. G., Arbesman S., Christakis N. A., Dynamic social networks promote cooperation in experiments with humans. Proc. Natl. Acad. Sci. U.S.A. 108, 19193–19198 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Rand D. G., Nowak M. A., Fowler J. H., Christakis N. A., Static network structure can stabilize human cooperation. Proc. Natl. Acad. Sci. U.S.A. 111, 17093–17098 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Hauert C., Holmes M., Doebeli M., Evolutionary games and population dynamics: Maintenance of cooperation in public goods games. Proc. R. Soc. B Biol. Sci. 273, 2565–2571 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Santos F. C., Santos M. D., Pacheco J. M., Social diversity promotes the emergence of cooperation in public goods games. Nature 454, 213–216 (2008). [DOI] [PubMed] [Google Scholar]


