PLoS Comput Biol. 2021 Jan 21;17(1):e1008217. doi: 10.1371/journal.pcbi.1008217

Friendly-rivalry solution to the iterated n-person public-goods game

Yohsuke Murase, Seung Ki Baek*
Editor: Yamir Moreno
PMCID: PMC7853487  PMID: 33476337

Abstract

Repeated interaction promotes cooperation among rational individuals under the shadow of the future, but it is hard to maintain cooperation when a large number of error-prone individuals are involved. One way to construct a cooperative Nash equilibrium is to find a ‘friendly-rivalry’ strategy, which aims at full cooperation but never allows the co-players to be better off. Recently, it has been shown that for the iterated Prisoner’s Dilemma in the presence of error, a friendly rival can be designed with the following five rules: Cooperate if everyone did, accept punishment for your own mistake, punish defection, recover cooperation if you find a chance, and defect in all the other circumstances. In this work, we construct such a friendly-rivalry strategy for the iterated n-person public-goods game by generalizing those five rules. The resulting strategy makes a decision by referring to the previous m = 2n − 1 rounds. A friendly-rivalry strategy for n = 2 inherently has evolutionary robustness in the sense that no mutant strategy has a higher fixation probability in a population of this strategy than that of a neutral mutant. Our evolutionary simulation indeed shows excellent performance of the proposed strategy in a broad range of environmental conditions for n = 2 and 3.

Author summary

How to maintain cooperation among a number of self-interested individuals is a difficult problem, especially if they can sometimes commit errors. In this work, we propose a strategy for the iterated n-person public-goods game based on the following five rules: Cooperate if everyone did, accept punishment for your own mistake, punish others’ defection, recover cooperation if you find a chance, and defect in all the other circumstances. These rules are not far from actual human behavior, and the resulting strategy guarantees three advantages: First, if everyone uses it, full cooperation is recovered even if errors occur with small probability. Second, the player of this strategy never obtains a lower long-term payoff than any of the co-players. Third, if the co-players are unconditional cooperators, the player obtains a strictly higher long-term payoff than theirs. Therefore, if everyone uses this strategy, no one has a reason to change it. Furthermore, our simulation shows that this strategy becomes highly abundant over long time scales due to its robustness against the invasion of other strategies. In this sense, the repeated social dilemma is solved for an arbitrary number of players.

Introduction

The success of Homo sapiens can be attributed to its ability to organize collective action toward a common goal among a group of genetically unrelated individuals [1], and this ability is becoming more and more important as the world grows increasingly interconnected. Researchers have identified several mechanisms that promote cooperation in terms of evolutionary game theory [2, 3]. For example, the folk theorem holds that repeated interaction can establish cooperation through reciprocal strategies, a mechanism called direct reciprocity [4]. Yet, how to resolve a conflict between individual and collective interests is a hard problem, especially when a large number of players are involved and they are prone to error [5–7], because an individual player has very limited control over the co-players.

In this respect, the discovery of the zero-determinant (ZD) strategies in the iterated prisoner’s dilemma (PD) has been deemed counter-intuitive because a ZD-strategic player can unilaterally fix the co-player’s long-term payoff or enforce a linear relationship between their long-term payoffs [8]. For instance, one can design an extortionate ZD strategy, with which the player’s long-term payoff increases by χ ≥ 1 units whenever the co-player’s increases by one unit. Another counter-intuitive aspect of the ZD strategy is that it is a memory-one strategy, referring only to the previous round, yet such a simple strategy can perfectly constrain the co-player’s long-term payoff regardless of the co-player’s strategic complexity. Of course, excellent performance in a one-to-one match does not necessarily mean evolutionary success: It is difficult for an extortionate strategy to proliferate in a population because, as its fraction increases, two extortionate players are more likely to meet and keep defecting against each other [9–12]. For this reason, especially in a large population, selection tends to favor a generous ZD strategy whose long-term payoff does not exceed the co-player’s [11]. A generous ZD strategy does not aim at winning a match, but it is efficient because two generous players form mutual cooperation whenever they meet.

The important point in this line of thought is that a player’s strategy can unilaterally impose constraints on the co-player’s long-term payoff, so that we can now characterize strategies according to the constraints that they impose. One such meaningful characterization scheme is to ask whether a strategy works as a ‘partner’ or as a ‘rival’ [13, 14]: By ‘partner’, we mean that the strategy seeks mutual cooperation, but that it will make the co-player’s payoff less than its own if the co-player defects from it. Such a strategy has also been called ‘good’ [15, 16], and the generous ZD strategies can be understood as an intersection between the ZD and partner strategies [11]. On the other hand, a rival strategy always makes its long-term payoff higher than or equal to the co-player’s, so it has been called ‘unbeatable’ [17], ‘competitive’ [13], or ‘defensible’ [18, 19]. A trivial example of a rival strategy is unconditional defection (AllD), and an extortionate ZD strategy also falls into this class. Most well-known strategies in the iterated PD game are classified either as a partner or as a rival [14]. However, which class is favored by selection depends on environmental conditions such as the population size and the benefit-to-cost ratio of cooperation: If the population is small and cooperation is costly, it is better to play a rival strategy than a partner strategy, and vice versa [11, 14, 20]. In the iterated PD game, if a single strategy acts as a partner and a rival simultaneously, it has important implications for evolutionary dynamics because it possesses evolutionary robustness regardless of the environmental conditions [21], in the sense that no mutant strategy can invade a population of this strategy with a fixation probability greater than that of neutral drift [11, 20, 22]. To indicate the partner-rival duality, such a strategy will be called a ‘friendly rival’ [21]. Tit-for-tat (TFT), a special ZD strategy with χ = 1, is a friendly rival in an error-free environment [14], but a friendly rival generally requires a far more complicated structure in the presence of error. So far, the existence of friendly-rivalry strategies has been reported through brute-force enumeration in the iterated PD game [18, 21, 23] and the three-person public-goods (PG) game [19]. However, it is not straightforward to extend these findings to the general n-person PG game. For example, a naive extension of a solution of the iterated PD game fails to solve the three-person PG game because the third player cannot tell whether one of the co-players is correcting the other’s error with good intent or just carrying out a malicious attack [19]. To resolve the ambiguity, a strategic decision must be based on more information about the past interactions: In fact, if a player refers to the previous m rounds to choose an action in the n-person PG game, we can show that m must be greater than or equal to n as a necessary condition for being a friendly rival [19]. Unfortunately, the existing brute-force approach then becomes simply unfeasible because the number of possible strategies expands super-exponentially as $2^{2^{mn}}$. For example, in the three-person game (n = 3), it means that we have to enumerate $2^{512} \sim 10^{154}$ possibilities to find an answer. Although the symmetry among co-players reduces this number down to $2^{288} \sim 10^{86}$, it is still comparable to the estimated number of protons in the universe.

In this work, by generalizing the behavioral patterns of a friendly rival for the iterated PD game through an alternative method [21], we construct a friendly-rivalry strategy for the n-person PG game. This approach makes use of the fact that the computational burden is greatly lessened if we only have to check whether a given strategy qualifies as a friendly rival. The required memory length of our strategy is m = 2n − 1, which satisfies the necessary condition m ≥ n, as shown in Fig 1. We will also numerically confirm that it shows excellent performance in evolutionary dynamics. In this way, this work modifies and generalizes our previous finding for n = 2 and m = 3, i.e., a memory-three friendly-rivalry strategy for the iterated PD game [21].

Fig 1. Memory length m required for each of currently known friendly-rivalry strategies in the n-person PG game [18, 19, 21].


The dashed blue line depicts a theoretical lower bound m = n for friendly rivalry [19], and the strategy proposed in this work, called CAPRI-n, has m = 2n − 1.

Materials and methods

In this section, we define the game and construct a friendly-rivalry strategy by reasoning. See S1 Table for a summary of mathematical symbols.

Public-goods game

Let us consider the n-person public-goods (PG) game, in which a player may choose either cooperation (c), by contributing a token to a public pool, or defection (d), by refusing to contribute. Let the number of cooperators be denoted by $n_c$. The $n_c$ tokens in the public pool are multiplied by a factor of ρ, where 1 < ρ < n, and then equally redistributed to the n players. We assume that the tokens are infinitely divisible. A player’s payoff is thus given as

$$\pi = \begin{cases} \dfrac{\rho n_c}{n} & \text{when the player chooses } c, \\[4pt] 1 + \dfrac{\rho n_c}{n} & \text{when the player chooses } d. \end{cases} \tag{1}$$

Clearly, it is always better to choose d regardless of $n_c$, so full defection is the only Nash equilibrium of this one-shot game. In this study, the game is repeated indefinitely with no discounting factor to facilitate direct reciprocity. Every player chooses an action between c and d by referring to the previous m rounds. At the same time, a player can commit implementation error, e.g., by choosing d while intending c or vice versa, with small probability ϵ ≪ 1.
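To make Eq (1) concrete, here is a minimal Python sketch of the one-shot payoff; the function and variable names are ours for illustration, not taken from the authors’ code.

```python
# A minimal sketch of the one-shot payoff in Eq (1).
def pg_payoff(action: str, n_c: int, n: int, rho: float) -> float:
    """Payoff of one player given his or her action ('c' or 'd'),
    the total number of cooperators n_c, the group size n, and
    the multiplication factor rho (1 < rho < n)."""
    share = rho * n_c / n                       # equal share of the multiplied pool
    return share if action == "c" else 1.0 + share  # a defector also keeps the token

# Example with n = 3 and rho = 2: a lone defector earns 1 + 2*2/3 ~ 2.33,
# while each cooperator earns 2*2/3 ~ 1.33, so d dominates c in one shot.
print(pg_payoff("d", 2, 3, 2.0), pg_payoff("c", 2, 3, 2.0))
```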

Axiomatic approach

The long-term payoff of player X is defined as

$$\Pi_X \equiv \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \pi_X(t), \tag{2}$$

where $\pi_X(t)$ is player X’s instantaneous payoff in round t. If ϵ > 0, the Markovian dynamics of strategic interaction for a given strategy profile converges to a unique stationary distribution, from which $\Pi_X$ can readily be calculated [24, 25]. In terms of the players’ long-term payoffs, we propose the following three criteria that a successful strategy Ω should satisfy [18, 19, 21, 26].

  1. Efficiency: Mutual cooperation must be achieved with probability one as ϵ → 0 if all the players have adopted Ω. In other words, this criterion requires $\lim_{\epsilon \to 0^+} \Pi_X = \rho$ when the strategy profile is $P = \{\Omega, \Omega, \ldots, \Omega\}$.

  2. Defensibility: It must be guaranteed that none of the co-players can obtain higher long-term payoffs against Ω regardless of their strategies and initial states when ϵ = 0. It implies that $\lim_{\epsilon \to 0^+} (\Pi_X - \Pi_Y) \ge 0$, where player X is using strategy Ω and Y denotes any possible co-player of X.

  3. Distinguishability: If X uses Ω and all the co-players are unconditional cooperators (AllC), player X can exploit them to earn a strictly higher long-term payoff than theirs. That is, $\Pi_X > \Pi_Y$ when Y is an AllC player.

When a strategy satisfies both defensibility and efficiency, it is a friendly rival. A symmetric strategy profile consisting of a friendly-rivalry strategy forms a cooperative Nash equilibrium [18, 19, 21], and the proof is straightforward: Assume that everyone initially uses a friendly-rivalry strategy Ω, earning ρ per round. If one player, say, X, changes his or her strategy alone, X’s payoff will change to $\Pi_X$, while each of the co-players earns $\Pi_\Omega$. Defensibility guarantees that $\Pi_X \le \Pi_\Omega$, and full cooperation is Pareto-optimal, i.e., $(n-1)\Pi_\Omega + \Pi_X \le n\rho$. Combining these two inequalities, we see that

$$(n-1)\Pi_X + \Pi_X \le (n-1)\Pi_\Omega + \Pi_X \le n\rho, \tag{3}$$

which means that $\Pi_X \le \rho$. The player cannot increase his or her payoff by deviating from Ω alone. The third criterion is a requirement to suppress the invasion of AllC through neutral drift in the evolutionary context [27–29]. We call a strategy ‘successful’ if it meets all three criteria above. Depending on the definition of successfulness, one could choose a different set of axioms for an alternative characterization [30].

Strategy design

Let us construct a deterministic strategy with memory length m = 2n − 1 and show that the proposed strategy indeed meets all of the above three criteria. In the following, we will take an example of three players (n = 3) who are called Alice (A), Bob (B), and Charlie (C), respectively, and choose Alice as a focal player playing this strategy.

Before proceeding, it is convenient to introduce some notation for the sake of brevity. The three players’ history profile over the previous m = 5 rounds can be represented as $h_t = (A_{t-5} A_{t-4} A_{t-3} A_{t-2} A_{t-1}; B_{t-5} B_{t-4} B_{t-3} B_{t-2} B_{t-1}; C_{t-5} C_{t-4} C_{t-3} C_{t-2} C_{t-1})$, where $A_\tau$, $B_\tau$, and $C_\tau$ denote their respective actions at round τ. The last round of full cooperation will be denoted by t*. According to the payoff definition [Eq (1)], we can fully determine Alice’s cumulative payoff over a given period, $\sum_t \pi_A(t)$, just by counting how many times each of the players has defected during the period. This is due to the linearity of the operations acting on the number of tokens: The tokens contributed to the public pool are multiplied by a constant factor ρ and equally distributed to all the players, and Alice saves a token every time she defects. For example, if all the players have defected the same number of times, their payoffs must be the same irrespective of the exact history. We thus introduce $\Delta_A^{\tau_1, \tau_2}$ to denote Alice’s number of defections in $[\tau_1, \tau_2]$. Likewise, we can define $\Delta_B^{\tau_1, \tau_2}$ for Bob and $\Delta_C^{\tau_1, \tau_2}$ for Charlie. We also define $N_d$ as the maximum difference among the players in their numbers of defections over the previous m rounds:

$$N_d \equiv \max_{i,j \in \{A,B,C\}} \left| \Delta_i^{t-m,t-1} - \Delta_j^{t-m,t-1} \right|. \tag{4}$$
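As a small illustration of Eq (4), the following Python sketch computes the defection counts and $N_d$ directly from the history strings used in the text; all names here are ours, not from the paper’s code.

```python
# Defection counts and N_d [Eq (4)] from m = 5 history strings.
def num_defections(history: str) -> int:
    """Delta_i^{t-m, t-1}: player i's number of defections in the window."""
    return history.count("d")

def max_defection_gap(histories: list[str]) -> int:
    """N_d: the maximum pairwise difference in defection counts."""
    counts = [num_defections(h) for h in histories]
    return max(counts) - min(counts)

# (ccccc; cccdc; ccccc) gives N_d = 1, while (cdddc; ccddc; ccccc) gives N_d = 3.
print(max_defection_gap(["ccccc", "cccdc", "ccccc"]))  # -> 1
print(max_defection_gap(["cdddc", "ccddc", "ccccc"]))  # -> 3
```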

With these notations, we can now design a successful strategy satisfying all the three criteria simultaneously. To this end, we divide the set of history profiles into three mutually exclusive cases: The first case is that full cooperation occurred in the last round (t* = t − 1). The second case is that it is not in the last round but still within memory (t − m ≤ t* < t − 1). The third case is that no player remembers the last round of full cooperation (t* < t − m). Let us consider these cases one by one, together with adequate rules for each.

  1. t* = t − 1

    • Cooperate: If this is the case, Alice has to choose c under the condition that $N_d < n$. For example, the inequality is true for (ccccc;cccdc;ccccc), for which $N_d = 1$. On the other hand, it is not true for (cdddc;ccddc;ccccc) because its $N_d$ is equal to n = 3.

  2. t − m ≤ t* < t − 1

    • Accept: Alice has to accept punishment from the co-players by choosing c, under the condition that $\Delta_A^{t^*,t-1} \ge \Delta_B^{t^*,t-1}$ and $\Delta_A^{t^*,t-1} \ge \Delta_C^{t^*,t-1}$ in addition to $N_d < n$. For example, c will be prescribed to Alice at (cccdc;ccccd;ccccc), where we have t* = t − 3, $\Delta_A^{t^*,t-1} = \Delta_B^{t^*,t-1} = 1$, $\Delta_C^{t^*,t-1} = 0$, and $N_d = 1$, which satisfies the above inequalities. On the other hand, the condition is not met by (ccddd;ccddd;ccccc), which gives $N_d = 3$.

    • Punish: Alice has to punish the co-players by choosing d, under the condition that $\Delta_A^{t^*,t-1} < \Delta_B^{t^*,t-1}$ or $\Delta_A^{t^*,t-1} < \Delta_C^{t^*,t-1}$ in addition to $N_d < n$. For example, d is prescribed at (ccccd;cccdd;ccccc) because $N_d = 2$ and Alice has defected fewer times than Bob since the last round of full cooperation at t* = t − 3.

  3. t* < t − m

    • Recover: Alice has to recover cooperation by choosing c, under the condition that all the players except one cooperated in the last round. For n = 3, it means (ddddd;ddddc;ddddc) and its permutations.

  4. In all the other cases, defect.

A strategy of this sort for the n-person PG game will be called CAPRI-n after the first letters of the five constitutive rules. Note that these five rules may be implemented in a number of different ways [21]; we adopt this particular implementation because it makes the proof of the three criteria most straightforward. Each of the rules can also be regarded as one of the player’s internal states, consisting of multiple history profiles [26]. For example, Alice finds herself at state R, the abbreviation for ‘Recover’, when her history profile is (ddddd;ddddc;ddddc), at which she must choose c. The connection structure of the above five states is graphically represented in Fig 2, which is helpful for understanding how defensibility and efficiency are realized, as shown below.
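The five rules can be condensed into a short decision function. The following Python sketch is our own reading of CAPRI-3 (n = 3, m = 5) and is meant only to illustrate the rules; it is not the authors’ implementation, and the function and variable names are hypothetical.

```python
# An illustrative sketch of the CAPRI-3 decision rule (n = 3, m = 5).
N, M = 3, 5  # group size and memory length m = 2n - 1

def capri_action(histories: list[str]) -> str:
    """Prescribed action for the focal player, histories[0], where each
    string lists that player's actions over the previous m rounds,
    oldest round first."""
    counts = [h.count("d") for h in histories]             # Delta_i^{t-m, t-1}
    n_d = max(counts) - min(counts)                        # Eq (4)
    # t*: index of the last round of full cooperation within memory, if any
    full_c = [k for k in range(M) if all(h[k] == "c" for h in histories)]
    t_star = full_c[-1] if full_c else None

    if t_star == M - 1:                                    # rule C: cooperate
        return "c" if n_d < N else "d"
    if t_star is not None:                                 # t - m <= t* < t - 1
        if n_d >= N:
            return "d"                                     # rule I: defect
        deltas = [h[t_star:].count("d") for h in histories]
        if deltas[0] >= max(deltas[1:]):                   # rule A: accept
            return "c"
        return "d"                                         # rule P: punish
    # t* < t - m: rule R if all players except one cooperated at t - 1
    last_defectors = sum(h[-1] == "d" for h in histories)
    return "c" if last_defectors == 1 else "d"             # otherwise rule I

# (ccccc; cccdc; ccccc): full cooperation at t - 1 and N_d = 1 < n, so 'c'.
print(capri_action(["ccccc", "cccdc", "ccccc"]))  # -> 'c'
# (ccccd; cccdd; ccccc): Alice has defected less since t* = t - 3, so 'd'.
print(capri_action(["ccccd", "cccdd", "ccccc"]))  # -> 'd'
```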

Fig 2. Schematic diagram of the transition between states of CAPRI-n.


The five rules of the strategy can be identified with the player’s internal states [26], each of which is represented as a node in this diagram. An exception is state I, which corresponds to two nodes to clarify the following point: When t* ≥ t − m, the state may have outgoing connections to A and P. When t* < t − m, on the other hand, the only possible next state is R. The player has to choose c at a blue node and d at a red node. We have omitted error-caused transitions for the sake of simplicity.

Defensibility

Let us begin by checking defensibility. Our CAPRI-n player Alice cooperates only at states C, A, and R, so the question is whether she can be forced to visit one of these states repeatedly while giving a strictly higher payoff to one of her co-players. If Alice’s state is C, it means that everyone cooperated at t − 1. If some of her co-players defect from this full cooperation at t, she will retaliate at t + 1 with state P, so she suffers from unilateral defection at most once. Full cooperation is already broken, so she can come back to C only through state A or R. The former case, i.e., when she comes back to C via A, means that Alice has already been compensated for the payoff loss: otherwise, she would not be at state A. In the latter case, when C is accessed via R, the only possible history profile is (ddddd;ddddc;ddddc) unless she made a mistake, which means that the compensation has been done in the last round. Finally, state A can be accessed only from states P and I, at both of which Alice chooses d and thus cannot be exploited. To sum up, it is impossible to see a CAPRI-n player cooperate unilaterally over and over again.

Efficiency

The next criterion is efficiency. Provided that CAPRI-n is employed by all the players, only full cooperation or full defection can be a stationary state, and we can verify this statement by checking each possible case:

  • If t* = t − 1, everyone has to cooperate again as prescribed at state C, so full cooperation will continue.

  • If t − m ≤ t* < t − 1 and $N_d < n$, some players must be at state A while the others are at state P. The latter players at P will keep defecting until $\Delta_A^{t^*,t-1} = \Delta_B^{t^*,t-1} = \Delta_C^{t^*,t-1}$ is satisfied. If they achieve this while keeping t* ≥ t − m, all of them will choose c as prescribed at state A, and the resulting mutual cooperation will continue. If they do not, everyone’s situation reduces to state I, at which they will defect over and over.

  • The remaining state is R, but it is always transient.

To judge efficiency, we need to consider error-caused transitions between these two stationary states, i.e., full cooperation and full defection. The transition from the latter to the former is possible only through state R, which occurs with probability of $O(\epsilon^{n-1})$. On the other hand, full cooperation can be made robust against every possible type of (n − 1)-bit error if m = 2n − 1. To see the basic idea, let us suppose that memory is just long enough, and it will immediately become clear what we mean by ‘enough’. If an initial state of full cooperation is disturbed by error, everyone goes to either P (punish) or A (accept), and what happens is the following: Those who have defected more will accept punishment from the others, so the players’ numbers of defections tend to be equalized as time goes by. When everyone has finally defected the same number of times, they all arrive at state A, where c is the prescribed action. Then, according to the first rule C, full cooperation will continue. In our example with Alice, Bob, and Charlie (n = 3), this means that the strategy corrects every single-bit or double-bit error when its memory length is m = 2n − 1 = 5. To see how this happens, let us consider some possible types of error case by case.

  1. Single-bit error: Imagine that a player, say, Alice, mistakenly defects from full cooperation at t = 1. She will have state A at t = 2, while the others have state P, so their payoffs should be equalized at t = 3 as a result of punishment, by which mutual cooperation is recovered. This scenario can be represented as follows:
$$\begin{array}{lccccc} & t=1 & & t=2 & & t=3 \\ A: & cccc\underline{d} & \xrightarrow{A} & cccdc & \xrightarrow{A} & ccdcc \\ B: & ccccc & \xrightarrow{P} & ccccd & \xrightarrow{A} & cccdc \\ C: & ccccc & \xrightarrow{P} & ccccd & \xrightarrow{A} & cccdc \end{array} \tag{5}$$
    where the underline marks a mistaken action, and the letter above each arrow indicates which rule applies there. The important point is that a single-bit error is corrected in only two rounds.
  2. Double-bit error: In this case, we have several possibilities. First, we consider two players’ simultaneous mistakes, which are corrected in a similar way to Eq (5).
$$\begin{array}{lccccc} & t=1 & & t=2 & & t=3 \\ A: & cccc\underline{d} & \xrightarrow{A} & cccdc & \xrightarrow{A} & ccdcc \\ B: & cccc\underline{d} & \xrightarrow{A} & cccdc & \xrightarrow{A} & ccdcc \\ C: & ccccc & \xrightarrow{P} & ccccd & \xrightarrow{A} & cccdc \end{array} \tag{6}$$
    As another possibility, let us assume that Alice defects by mistake for two successive rounds. It is a simple extension of the recovery pattern in Eq (5):
$$\begin{array}{lccccccc} & t=1 & & t=2 & & t=3 & & t=4 \\ A: & cccc\underline{d} & \xrightarrow{A} & cccd\underline{d} & \xrightarrow{A} & ccddc & \xrightarrow{A} & cddcc \\ B: & ccccc & \xrightarrow{P} & ccccd & \xrightarrow{P} & cccdd & \xrightarrow{A} & ccddc \\ C: & ccccc & \xrightarrow{P} & ccccd & \xrightarrow{P} & cccdd & \xrightarrow{A} & ccddc \end{array} \tag{7}$$
    It makes little difference whether error occurs to a single player twice in a row or it does to one after another:
$$\begin{array}{lccccccc} & t=1 & & t=2 & & t=3 & & t=4 \\ A: & cccc\underline{d} & \xrightarrow{A} & cccdc & \xrightarrow{A} & ccdcc & \xrightarrow{A} & cdccc \\ B: & ccccc & \xrightarrow{P} & cccc\underline{c} & \xrightarrow{P} & ccccd & \xrightarrow{A} & cccdc \\ C: & ccccc & \xrightarrow{P} & ccccd & \xrightarrow{A} & cccdc & \xrightarrow{A} & ccdcc \end{array} \tag{8}$$
    The last possibility to consider is when error occurs again at the end of Eq (5):
$$\begin{array}{lccccccccc} & t=1 & & t=2 & & t=3 & & t=4 & & t=5 \\ A: & cccc\underline{d} & \xrightarrow{A} & cccdc & \xrightarrow{A} & ccdc\underline{d} & \xrightarrow{A} & cdcdc & \xrightarrow{A} & dcdcc \\ B: & ccccc & \xrightarrow{P} & ccccd & \xrightarrow{A} & cccdc & \xrightarrow{P} & ccdcd & \xrightarrow{A} & cdccc \\ C: & ccccc & \xrightarrow{P} & ccccd & \xrightarrow{A} & cccdc & \xrightarrow{P} & ccdcd & \xrightarrow{A} & cdccc \end{array} \tag{9}$$
    which needs two additional rounds to reach full cooperation at t = 5. Among all types of double-bit error, the last pattern and the like (i.e., error occurring again when cooperation is about to be recovered) are the ones that require the longest memory for full recovery: If the distance between two errors is longer than two rounds, they can be regarded as two single-bit errors, which are separately correctable [Eq (5)]. In general, if we have to correct an (n − 1)-bit error that occurs every other round, a memory length of m = 2(n − 1) + 1 is required in total, where the last bit has been added to memorize the last round of full cooperation. It is also enough to correct simpler types of error, as illustrated above. To sum up, with m = 2n − 1, the transition probability from mutual cooperation to defection can be suppressed down to $O(\epsilon^n)$, whereas the transition in the opposite direction through R has probability of $O(\epsilon^{n-1})$. Therefore, the players form full cooperation in the limit of ϵ → 0, fulfilling the efficiency criterion.

Distinguishability

The last criterion is distinguishability. If the others are AllC players, our CAPRI-n player will continue unilateral defection once she has defected n consecutive times by error, as prescribed by rule I. One can escape from such a state only with probability of $O(\epsilon^n)$ due to the condition $N_d < n$ in rule C, so this stationary state coexists with full cooperation in the limit of ϵ → 0.

Evolutionary simulation

We consider a standard stochastic model proposed in [29], where a well-mixed population of size N evolves over time through an imitation process. A key assumption of this model is that the mutation rate is so low that at most one mutant strategy can exist in the resident population. In other words, the time it takes for a mutant to go extinct or take over the whole population under selection is assumed to be much shorter than the time scale of mutation. Let us assume that a mutant strategy x is introduced to a population of strategy y. The population dynamics is modeled by the frequency-dependent Moran process, in which the fixation probability of the mutant is given in a closed form:

$$\phi_{xy} = \left( \sum_{i=0}^{N-1} \prod_{j=1}^{i} \Gamma_j \right)^{-1} \tag{10}$$

with $\Gamma_j \equiv P_{j,j-1}/P_{j,j+1}$, where $P_{j,j\pm 1}$ denotes the probability that the number of mutants increases or decreases from j by one.

For n = 2, the fixation probability is calculated in the following way: Suppose that we have j individuals of the mutant strategy and Nj individuals of the resident strategy. If we randomly choose a mutant and a resident individual, their average payoffs are obtained as

$$\begin{cases} s_x = \dfrac{1}{N-1} \left[ (j-1)\, s_{xx} + (N-j)\, s_{xy} \right] \\[6pt] s_y = \dfrac{1}{N-1} \left[ (N-j-1)\, s_{yy} + j\, s_{yx} \right], \end{cases} \tag{11}$$

respectively, where $s_{\alpha\beta}$ is α’s long-term payoff against β. According to the imitation process, x can change to y with probability $f_{xy}$ defined as follows:

$$f_{xy} = \frac{1}{1 + \exp\left[ \sigma (s_x - s_y) \right]}, \tag{12}$$

where σ denotes the strength of selection. Then, we have

$$\Gamma_j = \exp\left[ \sigma (s_y - s_x) \right], \tag{13}$$

and the fixation probability is calculated as

$$\phi_{xy}^{-1} = \sum_{i=0}^{N-1} \prod_{j=1}^{i} e^{\sigma \left[ (N-j-1) s_{yy} + j s_{yx} - (j-1) s_{xx} - (N-j) s_{xy} \right] / (N-1)} \tag{14}$$
$$= \sum_{i=0}^{N-1} e^{\sigma i \left[ (-i+2N-3) s_{yy} + (i+1) s_{yx} - (-i+2N-1) s_{xy} - (i-1) s_{xx} \right] / \left[ 2(N-1) \right]}. \tag{15}$$

If y is a friendly rival, i.e., if $s_{yy} \ge s_{xx}$ and $s_{yy} \ge s_{xy}$ in addition to $s_{yx} \ge s_{xy}$, Jensen’s inequality shows that $\phi_{xy} \le 1/N$ for arbitrary x, indicating that y has evolutionary robustness for any N, ρ, and σ [21].
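As a hedged illustration, the following Python sketch evaluates Eq (14) numerically; the function name, parameter values, and the example payoff pattern (chosen to satisfy the three inequalities above) are ours, not taken from the paper.

```python
import math

def fixation_probability(s_xx, s_xy, s_yx, s_yy, N=30, sigma=1.0):
    """Fixation probability of a single x-mutant in a y-resident
    population under the Fermi imitation update, Eq (14)."""
    total = 0.0
    for i in range(N):
        expo = 0.0
        for j in range(1, i + 1):  # log of the product of Gamma_j, Eq (13)
            s_y = ((N - j - 1) * s_yy + j * s_yx) / (N - 1)
            s_x = ((j - 1) * s_xx + (N - j) * s_xy) / (N - 1)
            expo += sigma * (s_y - s_x)
        total += math.exp(expo)
    return 1.0 / total

# Against a friendly rival y (s_yy >= s_xx, s_yy >= s_xy, s_yx >= s_xy),
# no mutant beats neutral drift: with s_yy = s_xx = 2 and s_yx = s_xy = 0,
# the fixation probability stays far below 1/N.
print(fixation_probability(s_xx=2.0, s_xy=0.0, s_yx=0.0, s_yy=2.0), 1 / 30)
```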

For n = 3, the fixation probability is calculated in a similar way. We randomly pick three players from a well-mixed population, and the respective average payoffs of playing x and y can be written by using binomial coefficients as follows:

$$\begin{cases} s_x = \dfrac{1}{(N-1)(N-2)} \left[ \dbinom{j-1}{2} s_{xxx} + \dbinom{j-1}{1} \dbinom{N-j}{1} s_{xxy} + \dbinom{N-j}{2} s_{xyy} \right] \\[10pt] s_y = \dfrac{1}{(N-1)(N-2)} \left[ \dbinom{j}{2} s_{yxx} + \dbinom{N-j-1}{1} \dbinom{j}{1} s_{yyx} + \dbinom{N-j-1}{2} s_{yyy} \right], \end{cases} \tag{16}$$

where $s_{\alpha\beta\gamma}$ is α’s long-term payoff against β and γ. Plugging these expressions into Eqs (10) and (13), one can calculate the fixation probability $\phi_{xy}$ for the three-person case as well. Unlike the two-person case, however, friendly rivalry itself does not necessarily guarantee evolutionary robustness when n ≥ 3: Assume that a friendly-rivalry strategy y cannot distinguish a mutant x, whereas x does distinguish y when x forms the majority of the n-person game. If n = 3, for example, it means that $s_{yyx} = s_{xyy} = \rho$ whereas $s_{yxx} = s_{xxy} = 0$. Furthermore, if the mutants are efficient among themselves, i.e., $s_{xxx} = \rho$, then their fixation probability will be higher than 1/N. As of now, we find no reason to rule out the possibility of such a mutant.

We can interpret $\phi_{xy}$ as the transition probability from y to x from the viewpoint of the population. From the stationary distribution of this Markovian dynamics, we can thus calculate the abundance of each available strategy in a numerically exact manner [31, 32]. For the sake of simplicity, we use the donation game, a simplified form of the PD game, as well as its generalization to n players in the numerical calculation. That is, with the benefit of cooperation b > 1, each player can donate b/(n − 1) to each co-player at unit cost, which corresponds to ρ = nb/[b + (n − 1)] up to scaling. In the next section, we will present numerical results obtained by using OACIS [33].
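A minimal sketch of this abundance calculation, assuming a precomputed matrix of pairwise fixation probabilities (e.g., from Eq (14)); the uniform choice of the invading mutant is our assumption for illustration.

```python
import numpy as np

def abundances(phi: np.ndarray) -> np.ndarray:
    """Stationary abundances of K strategies in the low-mutation limit.
    phi[x, y] holds the fixation probability of a single x-mutant in a
    y-resident population; the population hops between monomorphic states."""
    K = phi.shape[0]
    T = np.zeros((K, K))
    for y in range(K):
        for x in range(K):
            if x != y:
                T[y, x] = phi[x, y] / (K - 1)  # y-residents replaced by x
        T[y, y] = 1.0 - T[y].sum()             # stay monomorphic in y
    # Left eigenvector of T for eigenvalue 1, normalized to sum to one.
    w, v = np.linalg.eig(T.T)
    stat = np.real(v[:, np.argmax(np.real(w))])
    return stat / stat.sum()

# Toy example: three strategies with neutral fixation probabilities 1/N
# yield the uniform distribution [1/3, 1/3, 1/3].
print(abundances(np.full((3, 3), 1.0 / 30)))
```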

Results

Friendly rivalry

To check the validity of our construction, we computationally examined the three criteria by using the graph-theoretic calculations of [19, 21, 34]. For n = 2, we directly confirmed that CAPRI-2 is indeed a successful strategy satisfying all three criteria. For n = 3, we mapped the strategy onto an automaton to obtain a simplified yet equivalent graph representation [26], thereby reducing the computational complexity, and our graph-theoretic calculation confirmed that the resulting automaton indeed passes all the criteria. For n = 4, the amount of calculation required to check the criteria directly was beyond our computational resources, so we employed a Monte Carlo method to simulate the game. The Monte Carlo method was also used to double-check the performance of CAPRI-2 and CAPRI-3. See S1 Appendix for more discussion of the computational details.

The Monte Carlo calculation was performed as follows: Let us denote a memory-one strategy as (pcc, pcd, pdc, pdd) where pμν means the player’s probability to cooperate when the player and the co-player did μ and ν, respectively, in the previous round. The initial μ and ν can be omitted in the strategy description because they are irrelevant to the long-term payoff as long as ϵ > 0. Fig 3 shows the distribution of payoffs when Alice used CAPRI-n whereas each of her co-players’ strategies was composed of four pμν’s randomly sampled from the unit interval. The co-players’ payoffs never exceeded Alice’s, as required by defensibility.
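Long-term payoffs of two memory-one strategies can also be obtained linear-algebraically from the stationary distribution of the four-state Markov chain over the previous-round outcomes, as mentioned above [24, 25]. Below is a self-contained Python sketch for the two-person donation game; the implementation details and names are ours, given as an assumption-laden illustration rather than the authors’ code.

```python
import numpy as np

def long_term_payoffs(p, q, b=3.0, eps=1e-4):
    """Long-term payoffs of two memory-one strategies p = (p_cc, p_cd,
    p_dc, p_dd) and q in the two-person donation game with benefit b
    and implementation-error probability eps."""
    # Each intended action is flipped with probability eps.
    p = [x * (1 - 2 * eps) + eps for x in p]
    q = [x * (1 - 2 * eps) + eps for x in q]
    states = [("c", "c"), ("c", "d"), ("d", "c"), ("d", "d")]
    idx = {s: k for k, s in enumerate(states)}
    M = np.zeros((4, 4))
    for s, (a1, a2) in enumerate(states):
        p1 = p[idx[(a1, a2)]]          # player 1 sees (own, other)
        p2 = q[idx[(a2, a1)]]          # player 2 sees the swapped state
        for t, (b1, b2) in enumerate(states):
            M[s, t] = (p1 if b1 == "c" else 1 - p1) * \
                      (p2 if b2 == "c" else 1 - p2)
    w, v = np.linalg.eig(M.T)          # stationary distribution, cf. Eq (2)
    pi = np.real(v[:, np.argmax(np.real(w))])
    pi /= pi.sum()
    pay1 = sum(pi[s] * (b * (a2 == "c") - (a1 == "c"))
               for s, (a1, a2) in enumerate(states))
    pay2 = sum(pi[s] * (b * (a1 == "c") - (a2 == "c"))
               for s, (a1, a2) in enumerate(states))
    return pay1, pay2

# WSLS against itself recovers cooperation after rare errors,
# so both payoffs approach b - 1 = 2.
print(long_term_payoffs([1, 0, 0, 1], [1, 0, 0, 1]))
```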

Fig 3. Distribution of long-term payoffs when a CAPRI-n player meets co-players whose pμν’s are randomly sampled from the unit interval.


Darker shades toward blue indicate higher frequency of occurrence. The multiplication factors for n = 2, 3, and 4 are 1.5, 2, and 3, respectively, and the solid lines indicate the region of feasible payoffs. In each case, the filled circle means the long-term payoffs when CAPRI-n is adopted by all the players, whereas the cross shows those of TFT players as a reference point. In each panel, we have drawn a dotted line along the diagonal as a simple check for defensibility. For n = 3 or 4, the parallelogram surrounding the blue area indicates the set of feasible payoffs when the focal player is AllD, which indicates that the behavior of CAPRI-n is similar to AllD against most of the memory-one players.

We also calculated the probability of full cooperation for n = 2, 3, and 4 when CAPRI-n was adopted by all the players, in order to check efficiency. By using linear-algebraic [18, 19] or Monte Carlo calculations with ϵ = $10^{-4}$, we obtained 0.999, 0.997, and 0.978 for n = 2, 3, and 4, respectively, which supports the conclusion that they all satisfy the efficiency criterion.

Evolutionary performance

Before checking the evolutionary performance of our proposed strategy, we conducted simulations without CAPRI-n for comparison. Figs 4A and 5A show the results when the strategies were sampled from deterministic memory-one strategies for n = 2 and 3. When b was low and/or N was small, defensible strategies such as AllD tended to be favored by selection, and the resulting cooperation level was low. On the other hand, when b or N was large, efficient strategies were favored, and they achieved a high level of cooperation. The reason is that cooperative strategies maintained high payoffs by interacting with many other cooperators even if they were exploited by a small number of aggressive mutants.

Fig 4. Abundance of strategies for n = 2 as the benefit-to-cost ratio b and the population size N vary.


The default values were b = 3 and N = 30 unless otherwise specified. The strength of selection and the error probability were set to σ = 1 and ϵ = $10^{-4}$, respectively. (A) Simulation result with 16 memory-one deterministic strategies, classified into three categories, i.e., efficient, defensible, and the other strategies. (B) Effect of CAPRI-2 when it was added to the available set of strategies.

Fig 5. Abundance of strategies for n = 3 as the benefit-to-cost ratio b and the population size N vary.


The default values were b = 3 and N = 30 unless otherwise specified. The strength of selection and the error probability were set to σ = 1 and ϵ = $10^{-4}$, respectively. (A) Simulation result with 64 memory-one deterministic strategies, classified into three categories, i.e., efficient, defensible, and the other strategies. (B) Effect of CAPRI-3 when it was added to the available set of strategies.

When CAPRI-n was introduced, it occupied a large fraction of the population, as shown in Figs 4B and 5B. Whereas each memory-one strategy flourished only for particular values of the environmental parameters b and N, CAPRI-n was abundant over the entire parameter region. In particular, it is striking that CAPRI-3 overwhelms all the other strategies in the three-person PG game for all moderate values of b and N (Fig 5B).

It is nevertheless worth pointing out that CAPRI-2 gave more and more room to efficient strategies in the iterated PD game as b or N increased (Fig 4B), and this is due to neutral drift: Although CAPRI-2 earns a strictly higher long-term payoff than AllC = (1, 1, 1, 1) and Win-Stay-Lose-Shift (WSLS) = (1, 0, 0, 1), it does not against (1, 1, 1, 0), which can, in turn, be invaded by WSLS. For this reason, WSLS can become abundant in the presence of (1, 1, 1, 0) when the environmental conditions are favorable.

We also tested the performance of CAPRI-n against strategies with the same memory length. An obvious problem is the huge number of possible strategies: Provided that m = 2n − 1, the number amounts to $2^{2^{nm}} \approx 10^{19}$ for n = 2, which grows to $\sim 10^{9864}$ for n = 3. As an alternative to exhaustive enumeration, we calculated fixation probabilities of mutants randomly sampled from the set of deterministic strategies with m = 2n − 1, taking CAPRI-n as the resident strategy. The numbers of sampled mutants were $10^9$ and $5 \times 10^6$ for n = 2 and 3, respectively. As shown in Fig 6, none of them had a fixation probability greater than 1/N, and the tendency was more pronounced in the three-person game than in the two-person case. For comparison, we also tested resident strategies drawn randomly from the same strategy set with m = 2n − 1, in which case a significant fraction of mutants succeeded in fixation with probability higher than 1/N. Although we have no proof of evolutionary robustness for n ≥ 3, the numerical results suggest that it would be extremely unlikely for CAPRI-n to be invaded by random mutants even if they had the same memory length.

Fig 6. Normalized distribution of fixation probabilities of mutants, which were randomly sampled from the set of deterministic strategies with the same memory length as CAPRI-n’s.


When we simulated the two-person game taking CAPRI-2 as the resident strategy, $10^9$ mutants were sampled. In the case of the three-person game, in which CAPRI-3 was the resident strategy, the number of sampled mutants was $5 \times 10^6$. In either case, no mutant had a fixation probability higher than 1/N (the vertical dashed line). On the other hand, when the resident was randomly drawn from the same strategy set, mutants frequently achieved fixation with probability higher than 1/N. For each random sample of the resident strategy, $10^2$ mutants were tested, and this process was repeated $10^7$ and $10^5$ times for n = 2 and 3, respectively. Throughout this calculation, we used N = 10 as the population size, ϵ = $10^{-4}$ as the error probability, σ = 1 as the selection strength, and b = 2 as the benefit of cooperation.

Discussion

In summary, we have constructed a friendly-rivalry strategy for the iterated n-person PG game. It maintains a cooperative Nash equilibrium in the presence of implementation error with probability ϵ ≪ 1, and it shows excellent evolutionary performance regardless of the environmental conditions such as the population size and the strength of selection. In this sense, the n-person social dilemma is solved. The strategy requires memory of the previous m = 2n − 1 rounds and consists of the following five rules: Cooperate if everyone did, accept punishment for your own mistake, punish others’ defection, recover cooperation if you find a chance, and defect in all the other circumstances.

Although we have considered only implementation error, perception error can also be corrected if it occurs with sufficiently low probability: The disagreement between the players’ history profiles due to a perception error will soon be removed at full defection, and the players will escape from mutual defection with probability of $O(\epsilon^{n-1})$. Unless another perception error perturbs this process, the players will eventually arrive at full cooperation, overcoming the perception error.

Another important solution concept for the n-person dilemma can be derived from a different set of criteria: By requiring mutual cooperation, error correction, and retaliation on a time scale of k rounds, one can characterize the all-or-none (AON-k) strategy, which prescribes c only when everyone cooperated or no one did in each of the previous k rounds [30, 35, 36]. For example, WSLS = (1, 0, 0, 1) is equivalent to AON-1. For each k, one can find a threshold of the multiplication factor above which AON-k constitutes a subgame-perfect equilibrium [30]. AON-k performs well in evolutionary simulations because it prescribes d as the default action, just as CAPRI-n does in state I, unless the players have synchronized their behavior over the previous k rounds. As a result, it earns a strictly higher payoff against a broad range of strategies.

In general, CAPRI-n with m = 2n − 1 can repeatedly exploit co-players using AON-k if k < m − 1, which means that an AON-k population can readily be invaded by CAPRI-n unless k is large enough. Considering the condition for AON-k to be subgame perfect, one could speculate that AON-k with small k can be abundant in an environment with a high multiplication factor. However, our finding implies that such a simple solution may not be sustained when CAPRI-n is available. This is especially crucial when the population size is not large, because AON-k lacks defensibility. Still, AON-k remains a strong competitor to CAPRI-n in evolutionary simulations: For example, although WSLS earns a strictly lower payoff against CAPRI-2, it circumvented the difficulty of fixation with the aid of a third strategy, (1, 1, 1, 0).

We have assumed the small-ϵ limit, but an important question is how the performance of CAPRI-n will be affected if ϵ takes a finite value. One possibility is that it could set a limit on the maximum number of players in regard to the efficiency criterion: The transition probability from full defection to cooperation is given as $n \epsilon^{n-1}$ by construction, where the prefactor n originates from the number of possible ways to choose the (n − 1) players who cooperate by mistake. The probability of transition in the opposite direction is of $O(\epsilon^n)$ at most, but it is reasonable to guess that it also has a prefactor that increases with n. If it can be approximated as $n^\tau \epsilon^n$ with τ > 1, for example, the efficiency criterion requires $n \lesssim (1/\epsilon)^{1/(\tau-1)}$, because the cooperation-to-defection rate must stay below the defection-to-cooperation rate, i.e., $n^\tau \epsilon^n \ll n \epsilon^{n-1}$. To achieve cooperation among a large number of players with finite error probability, therefore, we may have to revise the rules so as to adjust the prefactors.

From a practical point of view, it is worth noting that the five rules of CAPRI-n refer mostly to two factors: one is the players’ last actions at t − 1, and the other is the differences in the players’ respective numbers of defections over the previous m rounds. In other words, the exact details of the history profile are irrelevant, and this point greatly reduces the cognitive burden of playing this strategy. In fact, according to a recent experiment, people assign reputations to their co-players based mainly on their last actions and their average numbers of defections [37]. This could explain how such a delicate relationship as friendly rivalry can develop spontaneously and unwittingly among a group of people. How to keep such a relationship healthy and productive has so far been acquired as tacit knowledge surrounded by anecdotes and experiences; CAPRI-n expresses its essential how-tos in the form of explicit knowledge that can be designed, analyzed, and transmitted systematically.

Supporting information

S1 Appendix. Computational check for efficiency, defensibility, and distinguishability.

(PDF)

S1 Table. Summary of mathematical symbols used in this work.

(PDF)

Acknowledgments

Part of the results was obtained by using the Fugaku computer at the RIKEN Center for Computational Science (Proposal number ra000002). We appreciate the hospitality of the APCTP during the completion of this work.

Data Availability

All the data used in this paper are reproducible from the codes available at https://github.com/yohm/sim_CAPRI_nplayers.

Funding Statement

Y.M. acknowledges support from Japan Society for the Promotion of Science (www.jsps.go.jp) (JSPS KAKENHI; Grant no. 18H03621). S.K.B. acknowledges support by Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education (www.moe.go.kr) (NRF-2020R1I1A2071670). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Nowak MA, Highfield R. Supercooperators. New York: Free Press; 2011.
  • 2. Nowak MA. Five rules for the evolution of cooperation. Science. 2006;314(5805):1560–1563. doi:10.1126/science.1133755
  • 3. Sigmund K. The Calculus of Selfishness. Princeton: Princeton Univ. Press; 2010.
  • 4. Fudenberg D, Tirole J. Game Theory. Cambridge, MA: MIT Press; 1991.
  • 5. Molander P. The optimal level of generosity in a selfish, uncertain environment. J Conflict Resolut. 1985;29(4):611–618. doi:10.1177/0022002785029004004
  • 6. Boyd R, Richerson PJ. The evolution of reciprocity in sizable groups. J Theor Biol. 1988;132(3):337–356. doi:10.1016/S0022-5193(88)80219-4
  • 7. Gintis H. Behavioral ethics meets natural justice. Politics Philos Econ. 2006;5(1):5–32. doi:10.1177/1470594X06060617
  • 8. Press WH, Dyson FJ. Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent. Proc Natl Acad Sci USA. 2012;109(26):10409–10413. doi:10.1073/pnas.1206569109
  • 9. Hilbe C, Nowak MA, Sigmund K. Evolution of extortion in Iterated Prisoner’s Dilemma games. Proc Natl Acad Sci USA. 2013;110(17):6913–6918. doi:10.1073/pnas.1214834110
  • 10. Hilbe C, Nowak MA, Traulsen A. Adaptive dynamics of extortion and compliance. PLoS ONE. 2013;8(11):e77886. doi:10.1371/journal.pone.0077886
  • 11. Stewart AJ, Plotkin JB. From extortion to generosity, evolution in the Iterated Prisoner’s Dilemma. Proc Natl Acad Sci USA. 2013;110(38):15348–15353. doi:10.1073/pnas.1306246110
  • 12. Adami C, Hintze A. Evolutionary instability of zero-determinant strategies demonstrates that winning is not everything. Nat Commun. 2013;4(1):1–8. doi:10.1038/ncomms3193
  • 13. Hilbe C, Traulsen A, Sigmund K. Partners or rivals? Strategies for the iterated prisoner’s dilemma. Games Econ Behav. 2015;92:41–52. doi:10.1016/j.geb.2015.05.005
  • 14. Hilbe C, Chatterjee K, Nowak MA. Partners and rivals in direct reciprocity. Nat Hum Behav. 2018;2(7):469–477. doi:10.1038/s41562-018-0320-9
  • 15. Akin E. What you gotta know to play good in the iterated Prisoner’s Dilemma. Games. 2015;6(3):175–190. doi:10.3390/g6030175
  • 16. Akin E. The iterated Prisoner’s Dilemma: good strategies and their dynamics. In: Assani I, editor. Ergodic Theory, Advances in Dynamical Systems. Berlin: de Gruyter; 2016. p. 77–107.
  • 17. Duersch P, Oechssler J, Schipper BC. Unbeatable imitation. Games Econ Behav. 2012;76(1):88–96. doi:10.1016/j.geb.2012.05.002
  • 18. Yi SD, Baek SK, Choi JK. Combination with anti-tit-for-tat remedies problems of tit-for-tat. J Theor Biol. 2017;412:1–7. doi:10.1016/j.jtbi.2016.09.017
  • 19. Murase Y, Baek SK. Seven rules to avoid the tragedy of the commons. J Theor Biol. 2018;449:94–102. doi:10.1016/j.jtbi.2018.04.027
  • 20. Stewart AJ, Plotkin JB. Collapse of cooperation in evolving games. Proc Natl Acad Sci USA. 2014;111(49):17558–17563. doi:10.1073/pnas.1408618111
  • 21. Murase Y, Baek SK. Five rules for friendly rivalry in direct reciprocity. Sci Rep. 2020;10:16904. doi:10.1038/s41598-020-73855-x
  • 22. Stewart AJ, Plotkin JB. Small groups and long memories promote cooperation. Sci Rep. 2016;6:26889. doi:10.1038/srep26889
  • 23. Baek SK, Kim BJ. Intelligent Tit-for-Tat in the iterated prisoner’s dilemma game. Phys Rev E. 2008;78(1):011125. doi:10.1103/PhysRevE.78.011125
  • 24. Nowak M. Stochastic strategies in the prisoner’s dilemma. Theor Popul Biol. 1990;38(1):93–112. doi:10.1016/0040-5809(90)90005-G
  • 25. Nowak MA, Sigmund K, El-Sedy E. Automata, repeated games and noise. J Math Biol. 1995;33(7):703–722. doi:10.1007/BF00184645
  • 26. Murase Y, Baek SK. Automata representation of successful strategies for social dilemmas. Sci Rep. 2020;10:13370. doi:10.1038/s41598-020-70281-x
  • 27. Imhof LA, Fudenberg D, Nowak MA. Evolutionary cycles of cooperation and defection. Proc Natl Acad Sci USA. 2005;102(31):10797–10800. doi:10.1073/pnas.0502589102
  • 28. Imhof LA, Fudenberg D, Nowak MA. Tit-for-tat or win-stay, lose-shift? J Theor Biol. 2007;247(3):574–580. doi:10.1016/j.jtbi.2007.03.027
  • 29. Imhof LA, Nowak MA. Stochastic evolutionary dynamics of direct reciprocity. Proc R Soc B. 2010;277(1680):463–468. doi:10.1098/rspb.2009.1171
  • 30. Hilbe C, Martinez-Vaquero LA, Chatterjee K, Nowak MA. Memory-n strategies of direct reciprocity. Proc Natl Acad Sci USA. 2017;114(18):4715–4720. doi:10.1073/pnas.1621239114
  • 31. Jeong HC, Oh SY, Allen B, Nowak MA. Optional games on cycles and complete graphs. J Theor Biol. 2014;356:98–112. doi:10.1016/j.jtbi.2014.04.025
  • 32. Baek SK, Jeong HC, Hilbe C, Nowak MA. Comparing reactive and memory-one strategies of direct reciprocity. Sci Rep. 2016;6:25676. doi:10.1038/srep25676
  • 33. Murase Y, Uchitane T, Ito N. An open-source job management framework for parameter-space exploration: OACIS. J Phys Conf Ser. 2017;921:012001.
  • 34. Hougardy S. The Floyd–Warshall algorithm on graphs with negative cycles. Inf Process Lett. 2010;110(8–9):279–281. doi:10.1016/j.ipl.2010.02.001
  • 35. Hauert C, Schuster HG. Effects of increasing the number of players and memory size in the iterated Prisoner’s Dilemma: a numerical approach. Proc R Soc B. 1997;264(1381):513–519. doi:10.1098/rspb.1997.0073
  • 36. Lindgren K. Evolutionary dynamics in game-theoretic models. In: Brian Arthur W, Durlauf SN, Lane D, editors. The Economy as an Evolving Complex System II. Upper Saddle River, NJ: Addison-Wesley; 1997. p. 337–368.
  • 37. Cuesta JA, Gracia-Lázaro C, Ferrer A, Moreno Y, Sánchez A. Reputation drives cooperative behaviour and network formation in human groups. Sci Rep. 2015;5(1):1–6. doi:10.1038/srep07843
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008217.r001

Decision Letter 0

Yamir Moreno, Stefano Allesina

3 Oct 2020

Dear Prof. Baek,

Thank you very much for submitting your manuscript "Friendly-rivalry solution to the iterated n-person public-goods game" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

The paper has now been reviewed by three of our reviewers, who have made a number of criticisms that are sufficiently adverse as to suggest that a major revision is due before a final decision about publication can be made. We therefore ask you to address all comments by the reviewers. However, we ask you to pay particular attention to:

- the technical aspects raised by reviewer 3. This might require to show that the results are robust beyond what is now presented in the MS.

- We would also like that similarities and differences with respect to the paper arXiv:2004.00261 were clearly stated. PCB only publishes highly original contributions, and we appreciate that you have studied the same strategy for the PD game, which apparently leads to very similar findings and even the discussion presented.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Yamir Moreno

Guest Editor

PLOS Computational Biology

Stefano Allesina

Deputy Editor

PLOS Computational Biology

***********************


Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: I found the study "Friendly-rivalry solution to the iterated n-person public-goods game" by Murase and Baek very interesting, if not important. I believe that the study should be published subject to addressing the issues listed below.

NON-NEGOTIABLE

• There are many mathematical symbols. A table of symbols is absolutely needed.

MAJOR

L209-225... I really struggled trying to understand the argument about efficiency. If I had more time, maybe I would fully grasp it, but as it is, the available information did not really convince me. I strongly suggest that the authors consider rephrasing the argument for the benefit of their readers.

L265-272... This whole paragraph sounds as if the authors were saying: "We checked our claims and you should trust us!" It is utterly unclear whether the checks were performed on paper or computationally. If the former is the case, then the authors should detail their calculations in an appendix or supporting text. If the latter is the case, then a reference to the precise part of the code should be made available.

Fig. 3... What do gray dots represent? What does blueish color coding stand for?

MINOR

L139... Eq. (1) does not really define a matrix.

L191, 192... I'm not sure if I got 'former' and 'latter' right at this particular spot in the manuscript. Perhaps the authors should rephrase.

L266... CAPRI-2 rather than CAPRI-n.

L366... The authors should cite OACIS in the Methods section.

Reviewer #2: This manuscript presents an interesting study on the social dilemma of cooperation by an approach based on the iterated n-person public-goods Game.

The manuscript is well presented. I found correct mathematics as well as the results obtained both via simulations and formal reasoning. However, I have some concerns about both the justification of the model and the discussion.

The authors extend the concept of Nash equilibrium to long-term payoffs. This extension has been adopted in recent literature, for example, as authors cited, in the context of zero-determinant strategies.

The authors define a set of conditions for a particular collaborative strategy (CAPRI). If I understand the model correctly, the condition for the Nash equilibrium, for the particular case of a symmetric strategic profile where all the agents share the strategy $\Omega$, is reformulated as:

"It must be guaranteed that none of the co-players can obtain higher long-term payoffs against $\\Omega$ regardless of their strategies and initial states when e = 0."

Furthermore, the required memory length of players adopting such a strategy is m = 2n − 1. I find this concept far from the original Nash Equilibrium, and, at least, would expect a deep discussion by the authors on this (or perhaps to rename it). Nevertheless, the authors study the system for small values of n, which, on the one hand, make the memory requirement realistic while, on the other hand, bring this approach very close to that of pairwise interactions which has been addressed by the authors in Ref. 22.

The memory considered (2n − 1) is very long for large sets of co-players, furthermore when the authors state "not far from human behavior." Does it mean that, in an n-agents interaction, an agent may have a memory of 2n-1 previous interactions of everybody? I think that the authors should contextualize their model and justify it according to its particular requirements on memory length to differentiate it from that of pairwise interactions.

Reviewer #3: See attached PDF

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Marko Jusup

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

Attachment

Submitted filename: Review_PLOS_CAPRI.pdf

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008217.r003

Decision Letter 1

Yamir Moreno, Stefano Allesina

12 Dec 2020

Dear Prof. Baek,

We are pleased to inform you that your manuscript 'Friendly-rivalry solution to the iterated n-person public-goods game' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Yamir Moreno

Guest Editor

PLOS Computational Biology

Stefano Allesina

Deputy Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: I'm satisfied with the authors' answers to my comments.

Reviewer #2: I appreciate the efforts made by the authors to improve the manuscript as well as the answers to the reviewers' suggestions.

Reviewers' concerns have revealed some weaknesses in the model. In my opinion, in addition to the improvements made in the current version and the necessary corrections, the authors have contextualized these weak points by making the limitations explicit.

The revised version of this paper has improved significantly the original and, in my opinion, it can be accepted for publication.

Reviewer #3: I thank the authors for their careful response to the queries I raised. Although I would be interested to see a further analysis of the role of errors in determining the evolutionary dynamics of the strategies presented in this paper, I agree that the changes made in #11 and #12 are sufficient to address the issues I raised, and I'm therefore happy to recommend accepting the paper.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes: Alexander J. Stewart

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008217.r004

Acceptance letter

Yamir Moreno, Stefano Allesina

18 Jan 2021

PCOMPBIOL-D-20-01362R1

Friendly-rivalry solution to the iterated n-person public-goods game

Dear Dr Baek,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Jutka Oroszlan

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol


