Philosophical Transactions of the Royal Society B: Biological Sciences
2021 Oct 4;376(1838):20200293. doi: 10.1098/rstb.2020.0293

Reputation and punishment sustain cooperation in the optional public goods game

Shirsendu Podder 1, Simone Righi 2, Francesca Pancotto 3
PMCID: PMC8487744  PMID: 34601913

Abstract

Cooperative behaviour has been extensively studied as a choice between cooperation and defection. However, the possibility to not participate is also frequently available. This type of problem can be studied through the optional public goods game. The introduction of the ‘Loner’ strategy allows players to withdraw from the game, which leads to a cooperator–defector–loner cycle. While pro-social punishment can help increase cooperation, anti-social punishment—where defectors punish cooperators—causes its downfall in both experimental and theoretical studies. In this paper, we introduce social norms that allow agents to condition their behaviour to the reputation of their peers. We benchmark this with respect both to the standard optional public goods game and to the variant where all types of punishment are allowed. We find that a social norm imposing a more moderate reputational penalty for opting out than for defecting increases cooperation. When, besides reputation, punishment is also possible, the two mechanisms work synergistically under all social norms that do not assign to loners a strictly worse reputation than to defectors. Under this latter set-up, the high levels of cooperation are sustained by conditional strategies, which largely reduce the use of pro-social punishment and almost completely eliminate anti-social punishment.

This article is part of the theme issue ‘The language of cooperation: reputation and honest signalling’.

Keywords: reputation, anti-social punishment, optional public goods game

1. Introduction

Explaining the conditions for the evolution of cooperation in groups of unrelated individuals is an important topic for natural and social scientists [1–6]. Research on the subject has been extensive and has led to the identification of several mechanisms contributing to the success of cooperative behaviour: direct and indirect reciprocity, spatial selection, kin and multi-level selection, and punishment [3], among others. A large number of situations in which cooperation is difficult to achieve involve the pursuit of collective action to address problems beyond the individual dimension [7]. Such problems are frequently studied through variations of the public goods game, where the public good thrives through the contributions of its participants. While the social optimum is achieved when everyone’s contribution is maximal, individuals always have the temptation to withhold their contributions, free-riding on the efforts of others [8].

The public goods game has been mainly studied as a choice between contributing (or cooperating) and free-riding (or defecting) [9]. However, in many situations, the possibility to not participate in (or to exit from) a situation is also available owing, for example, to ostracism [10] or to individual choice [11]. An individual adopting this strategy chooses to abstain altogether from interactions, giving up not only its costs but also its potential benefits, instead preferring a lower but guaranteed payoff. For this reason, it is frequently called the ‘Loner’ strategy.

This behaviour has been observed in several social animal species. When prides of lions hunt, individuals that actively pursue prey and contribute to their capture can be characterized as cooperators, members participating in the hunt but remaining immobile (or covering only very short distances) can be characterized as defectors, while the members hunting alone can be considered loners [12]. Similarly, groups of primates frequently display the practice of mutual grooming. In such situations, some individuals both actively receive grooming and groom others (a costly activity in terms of time and effort), while some fail to reciprocate the received benefits. Finally, a third category of individuals tend to groom independently, refusing both to help and to accept help from others [13]. In humans, loners are those individuals that abstain from participation, both from the construction and from the benefits of the public good, for example, refusing to participate in village activities involving conspicuous consumption [14], deciding not to join a business alliance, or quitting a secure job rather than intentionally reducing effort. Finally, at the institutional level, countries can decide to forgo joining geo-political alliances, rather than joining and then not engaging in their activities.

The forces influencing decision making in these situations can be captured in a game theoretical form through the Optional Public Goods Game (OPGG), which introduces the loner strategy into the traditional public goods game. This addition generates significant changes in the game dynamics, undermining the strength of defectors (who cannot exploit loners), while introducing cyclical population shifts. Indeed when most players cooperate, defection is the most effective strategy. When defectors become prevalent, it pays to stay out of the game and to resort to the loner strategy. Finally, loner dominance lays the foundation for the return of cooperation [11,15]. Such rock–paper–scissors dynamics have been confirmed in experiments [16]. Relatedly, the possibility of opting out increases cooperation through enhanced pro-sociality of those who decide to stay in the game [17], through the exit threat [18], and through partner selection [19].
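
To make the cycle concrete, the following sketch (ours, not part of the original model; the function names are hypothetical, and the parameters r = 3, c = 1, σ = 1, n = 5 mirror the simulations reported later) estimates, by Monte Carlo sampling, the expected payoff of each pure strategy for a given population composition:

```python
import random

def opgg_payoff(focal, others, r=3.0, c=1.0, sigma=1.0):
    """Payoff of a focal player given its group mates' actions.
    Loners, and participants in a cancelled game (fewer than two
    participants), receive the guaranteed payoff sigma."""
    group = [focal] + others
    participants = [a for a in group if a != 'L']
    if focal == 'L' or len(participants) < 2:
        return sigma
    share = r * c * participants.count('C') / len(participants)
    return share - c if focal == 'C' else share

def expected_payoffs(freqs, n=5, samples=20000, seed=1):
    """Monte Carlo estimate of each pure strategy's expected payoff
    when the n - 1 group mates are drawn from a population with the
    given strategy frequencies."""
    rng = random.Random(seed)
    pool, weights = zip(*freqs.items())
    return {s: sum(opgg_payoff(s, rng.choices(pool, weights, k=n - 1))
                   for _ in range(samples)) / samples
            for s in pool}
```

Under these assumptions, a mostly cooperative population rewards defection, while a mostly defecting one makes the guaranteed loner payoff the best option, reproducing two legs of the rock–paper–scissors cycle.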

Two mechanisms have been shown to be particularly important in influencing the ability of a group to engage in effective collective action: punishment and reputation. The former can be described as people willing to sustain an individual cost to punish behaviour that they find inappropriate [20]. Experiments [21,22] have found that pro-social punishment (the punishment of a defector by a cooperator)—emergence of which has been linked to the possibility to abstain from participation [23]—can sustain cooperation in iterated games. More recently however, the positive role of punishment has been criticized, as substantial levels of anti-social punishment (the punishment of cooperators by defectors) have been observed in experiments [24–26]. Although the second-order free-riding on anti-social punishment can restore the effectiveness of pro-social punishment in the absence of loners [27], allowing this type of punishment significantly reduces cooperation, and re-establishes the rock–paper–scissors type of dynamics among strategies [25] in the OPGG. Importantly, Rand & Nowak [25] show that the strategy of opting out of interactions, combined with a world in which anyone can punish anyone, does not result in any meaningful increase in cooperation. Crucially, it is not the presence of loners that harms cooperation, but the possibility to punish them. Indeed, if loners are shielded from punishment, García & Traulsen [28] show that cooperators that punish pro-socially prevail, even when anti-social punishment is available. Just like anti-social punishment, anti-social pool rewarding (where agents of some type contribute towards rewarding others of their own type) also destabilizes cooperation in fully-mixed populations [29] but not structured populations [30].

Besides punishment, the other key mechanism in sustaining cooperation towards collective action is reputation [31–34]. Adopting a loose definition, reputation emerges when an individual’s actions can be directly or indirectly observed by his peers and used to condition their own behaviours when playing with him. In both well-mixed [35,36] and structured populations [37,38], simple reputational systems promote and stabilize cooperation. Reputational systems at different levels of complexity have been studied, ranging from the first-order image scoring norm [39], to the second-order standing criterion [40,41] and to the more complex third-order leading eight social norms [42,43] or higher-order norms [44]. While simpler reputational systems have been observed in animals [45], more elaborate systems are more likely to be the domain of human interactions, owing to the complex relationship between actions, reputations and social structures. How reputation evolves when a given behaviour is observed depends on the social norm characterizing a population, in other terms on its ‘notion of goodness’, i.e. the moral value attributed to a given type of action.

Punishment devices and reputation dynamics based on social norms coexist in human social groups [46] and, therefore, their interaction and implication for the success of collective action need to be studied. In this paper, we extend the theory of the OPGG, studying the individual and joint impact of punishment and reputation on the sustainability of cooperation in a population where groups of agents engage in repeated interactions. In line with evidence that anti-social punishment exists and heavily influences cooperation [25], we allow all types of punishment to occur, while studying its joint impact with a simple (first-order) social norm that prizes cooperation and penalizes the loner and defection choices through bad reputations. By comparing the effect of social norms differing by the relative reputations they assign to defectors and loners, we explore the strategic exchanges between punishment and reputation. Similarly to Panchanathan & Boyd [47], we consider ephemeral groups of agents that interact repeatedly during each time period and are then reshuffled.

We find that while reputation alone mildly increases cooperation, and punishment alone does not, the two mechanisms synergistically interact, leading to high cooperation. Our findings stem from the fact that reputation emerges as a partial substitute for punishment, thus making all types of punishment less necessary. Our results indicate a way through which social groups can sustain cooperation despite the presence of anti-social punishment.

2. The model

A fully-mixed population of N agents play the OPGG in randomly chosen groups of n ≥ 4 as in Hauert [11]. During each interaction, players are given the choice between cooperating (C), defecting (D) or abstaining from the game (L). Cooperators incur a cost c to participate and contribute towards the public good, while defectors participate but contribute nothing. Loners withdraw from the game to receive the loner’s payoff σ with 0 < σ < (r − 1)c. In each game, the sum of the contributions is multiplied by a factor r > 1 and is distributed equally between the group’s participants. If there is only a single participant (or equivalently n − 1 loners in the group), the OPGG is cancelled and all players are awarded σ. At this point, payoffs are awarded and additional games are played among the same agents until, with probability 1 − Ω, the interactions terminate and the group is dissolved. Note that, unlike Milinski et al. [31], our set-up can explore a form of indirect reciprocity without the need of adding a different game with respect to the OPGG.
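
As an illustration, the single-round payoff rule just described can be sketched as follows (a minimal sketch of our own, assuming c = 1 for concreteness alongside the r = 3 and σ = 1 used in the simulations; the function name and list representation are not from the paper):

```python
def opgg_payoffs(actions, r=3.0, c=1.0, sigma=1.0):
    """Payoffs for one round of the optional public goods game.

    actions: one of 'C' (cooperate), 'D' (defect) or 'L' (loner)
    per group member. With fewer than two participants the game is
    cancelled and every player receives the loner's payoff sigma.
    """
    participants = [a for a in actions if a != 'L']
    if len(participants) < 2:
        return [sigma] * len(actions)
    pot = r * c * participants.count('C')   # contributions, multiplied by r
    share = pot / len(participants)         # split among participants only
    return [share - c if a == 'C'           # cooperators paid the cost c
            else share if a == 'D'          # defectors free-ride
            else sigma                      # loners take the safe payoff
            for a in actions]
```

For example, in a group playing ('C', 'C', 'D', 'L', 'L'), the two cooperators each net 1, the defector earns 2 and the loners keep σ = 1, illustrating the free-riding temptation.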

Following each round of the game, players have the opportunity to punish any or all of the other n − 1 members of the group based on their played action. Players pay a cost of γ for each player they choose to punish, while they are subjected to a penalty of β for each punishment they receive.

Once the interaction and punishment rounds end, players change strategy based on the distribution of payoffs in the population. At each time-step an agent i is randomly paired with another agent j outside (within) its group with probability m (respectively, 1 − m). Player i then imitates player j with probability exp(u_j)/(exp(u_i) + exp(u_j)), where u_x is the payoff of player x and the exponential payoff function is used to account for negative utilities. Additionally, we allow mutations to occur in each period with probability ε.
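
The imitation step can be written directly from the formula in the text; the max-subtraction below is a standard numerical-stability trick of ours, not part of the model (it leaves the ratio unchanged):

```python
import math

def imitation_probability(u_i, u_j):
    """Probability that player i imitates player j, i.e.
    exp(u_j) / (exp(u_i) + exp(u_j)); the exponential form
    handles negative payoffs gracefully."""
    m = max(u_i, u_j)          # subtract the max to avoid overflow
    e_i = math.exp(u_i - m)
    e_j = math.exp(u_j - m)
    return e_j / (e_i + e_j)
```

Equal payoffs give probability 1/2, and the probability approaches 1 as u_j exceeds u_i by a wide margin.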

Extending the OPGG with the classical reputation mechanism of Brandt et al. [37], we implement a reputational system adopted by the entire population. After a player cooperates, defects, or abstains from the OPGG, he is assigned a reputation according to the population’s social norm. A reputation system requires the definition of a social norm, or reputation dynamics [42], that identifies good and bad behaviour. The simplest possible reputational mechanisms are those that assign reputation on the basis of the mere observation of past actions, such as image scoring [39]. Arguably, most functional human and animal societies associate the highest reputation to cooperative behaviour. For this reason, we focus our attention on the reputation dynamics that prize cooperation above other actions. Like the binary image score, we simplify the reputation values so that players can have only a good (1), intermediate (0) or bad (−1) reputation, always assigning a good reputation to cooperators. The Anti-Defector (AD) norm assigns reputations of −1 to defectors and 0 to loners. Similarly, the Anti-Loner (AL) norm assigns loners a reputation of −1 and defectors a 0. Finally, it is possible to conceptualize norms that do not distinguish between acting as a loner or as a defector. We label these as the Anti-Neither (AN) social norm, which assigns 0 to both actions, and the Anti-Both (AB) social norm, which assigns −1 to both actions. Similar to punishment, reputations update after every OPGG round. We assume that actions are either observed without error or communicated honestly to everyone so that players’ resulting reputations are common knowledge within the population.
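
The four norms can be summarized as a lookup table from actions to reputations (our own encoding; the norm labels follow the text, while the dictionary layout and function name are illustrative):

```python
# Reputation assigned to each action under the four first-order norms.
# Cooperation always earns the top reputation (+1); the norms differ
# only in how harshly they judge defecting versus opting out.
SOCIAL_NORMS = {
    'AD': {'C': 1, 'D': -1, 'L': 0},   # Anti-Defector
    'AL': {'C': 1, 'D': 0,  'L': -1},  # Anti-Loner
    'AN': {'C': 1, 'D': 0,  'L': 0},   # Anti-Neither
    'AB': {'C': 1, 'D': -1, 'L': -1},  # Anti-Both
}

def update_reputation(action, norm='AD'):
    """Reputation after one publicly observed action, assuming
    error-free observation as in the text."""
    return SOCIAL_NORMS[norm][action]
```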

To model agents adopting a social norm, we extend the set of behavioural strategies to allow players to condition their actions on the average reputation of the other players within their group. We condition actions on the average reputation firstly because a player’s payoff is dependent on the actions of his group members and secondly because directly observing the actions of others can be difficult in sizeable groups, whereas observing the overall sentiment is arguably easier. Conditioning actions on the average reputation is also a key difference between our set-up and Boyd & Richerson [48], where actions depend on levels of cooperation, and Takezawa & Price [49], where they depend on the amounts contributed. Indeed, given that there are three types of actions, each receiving a reputation on the basis of a shared social norm, reputational information can only be used as a signal of the cooperativeness of the environment a player is facing, similar to the case of image scoring. Furthermore, in our model, reputation contains elements of both direct and indirect reciprocity, as reputations are carried over when groups are reshuffled. Finally, we significantly extend the strategy set by considering strategies whose primary action in a benign environment is not cooperation. The strategy X^{Z_C Z_D Z_L}_{k_min, Y} defines an agent taking action X ∈ {C, D, L} if the average reputation of the other members of the group strictly exceeds k_min ∈ [−1, 1), and Y ∈ {C, D, L} otherwise. The indices Z_C, Z_D and Z_L describe the agent’s punishment decisions against cooperators, defectors and loners, respectively, where Z_C, Z_D, Z_L ∈ {P, N} denotes the choice to punish (P) or not to punish (N).
In addition to the pure strategies of being an unconditional cooperator, defector or loner, this adds a further eight strategies: C−1,D, C0,D, C−1,L, C0,L, D−1,L, D0,L, L−1,D, L0,D, each of which has eight punishment variants, prescribing punishment towards any subset of cooperators, defectors and loners. In an effort to minimize the size of the strategy space, and therefore also some of the effects of random drift, we remove counterintuitive strategies, i.e. those that would cooperate only in groups of defectors or loners, while defecting or going loner when surrounded by cooperators. For the same reason, strategies of higher complexity, i.e. those implying multiple behavioural shifts and/or more than two possible actions, are excluded from the present analysis and left to future studies.
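
A conditional strategy's action choice reduces to a threshold test on the group's average reputation; the helper below is a hypothetical sketch of ours (punishment variants are omitted for brevity):

```python
def conditional_action(primary, fallback, k_min, peer_reputations):
    """Action of a conditional strategy: play the primary action X
    when the average reputation of the other group members strictly
    exceeds k_min, and the fallback action Y otherwise."""
    avg = sum(peer_reputations) / len(peer_reputations)
    return primary if avg > k_min else fallback
```

For instance, C0,L ('C' primary, 'L' fallback, k_min = 0) cooperates when peers average a strictly positive reputation and opts out otherwise.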

Combining the payoffs of the game and the subsequent punishments, the utility of a player i that decides not to opt out after a single round of the OPGG is

u_i = w_i − x_i + r(∑_j x_j)/n_p − γP − βP′,

where the first two terms represent the initial endowment (w_i) and the contribution (x_i ∈ {0, c}) of the player. The third term represents the return from the OPGG (where n_p is the number of agents that decided to participate in the game, i.e. not to opt out), and the final two terms represent the losses incurred from punishing (γ) and being punished (β). In broad terms, P represents the number of people in the group who exhibited the actions that player i punishes, and P′ represents the number of people within the group that punish player i’s most recent action. For agents that decide to opt out, the utility reduces to u_i = w_i + σ − γP − βP′. In summary, if we ignore reputation and punishment, we have three strategies; if we only have punishment, we have 24; if we only have reputation, we have 11; and if we have both reputation and punishment, we have 88 strategies.
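
Putting the pieces together, a participant's per-round utility follows the expression just described (a sketch of ours; γ = 1 and β = 2 default to the simulation values):

```python
def round_utility(w, game_payoff, n_punished, n_punishers,
                  gamma=1.0, beta=2.0):
    """u_i = w_i + (game payoff) - gamma * P - beta * P', where P is
    the number of group members player i punished and P' the number
    of group members who punished player i."""
    return w + game_payoff - gamma * n_punished - beta * n_punishers
```

For a loner the game payoff is simply σ, recovering u_i = w_i + σ − γP − βP′.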

3. Results

Investigating the dynamics of the OPGG (figure 1 and electronic supplementary material, figure S11), in the absence of both punishment and any reputational element, we reproduce the rock–paper–scissors dynamics between the three unconditional strategies as in Hauert [11] and Hauert et al. [15]. When aggregated over time, this results in agents cooperating or defecting about 20% of the time each, while operating as loners about half of the time. Introducing the full set of punishing strategies as in Rand & Nowak [25], we successfully reproduce their main result: the proportion of defectors and cooperators further diminishes with respect to the baseline, while the proportion of agents acting as loners increases. Having successfully replicated the key results of the extant literature, we investigate two models benchmarked, respectively, against the traditional OPGG and the OPGG with the full spectrum of punishment. In the first manipulation, we introduce a reputational system to the OPGG, while in the second we add it to the OPGG with punishment. In both cases, we report results averaged over the second half of 100 simulations, each comprising 200 000 time-steps.

Figure 1.

Cooperation is highest when being a defector is strictly less reputable than being a loner, with and without punishment. Punishment and reputation together sustain high levels of cooperation as long as being a loner is not the strictly least reputable action. Without punishment, only the AD social norm sustains cooperation, with all other social norms providing levels of (low) cooperation comparable to the traditional OPGG in the absence of reputation and punishment. All simulations shown use N = 1000, n = 5, r = 3, σ = 1, m = 0.95, ε = 0.1, Ω = 10/11, T = 2 × 10^5. Those including punishment additionally have γ = 1, β = 2. Results are calculated by averaging across the second half of each simulation, and then across 100 repeated simulations.

(a) . Reputation without punishment

We begin by analysing how the introduction of a simple reputational system influences the dynamics of these populations. The AD norm increases cooperation, so that in this circumstance, about 40% of the OPGG actions are cooperative, roughly a 20-percentage-point increase from the baseline (figure 1). Interestingly, the increase in cooperators does not come from the defector ranks, but rather from the loner ranks, which decrease the most with respect to the baseline. The increase in cooperation follows from a change in the patterns of strategy adoption caused by the introduction of reputation. Strategy adoption patterns are visualized, for each social norm studied, in figure 2, which also reports the reduced transition matrices (including only the strategies that are—on average—played by more than 10% of the population). Results are computed averaging across time and then across 100 repetitions of each set-up. The full population breakdown as well as the complete transition matrices are reported in the left panels of electronic supplementary material, figures S1–S4. As noted in electronic supplementary material, figures S8 and S9, the average population composition is remarkably stable across simulations, ensuring that the results are generalizable beyond a single simulation.

Figure 2.

In the absence of punishment, cooperation is sustained only when defection results in a strictly worse reputation than opting out. In this set-up, a conditionally cooperating and a conditionally defecting strategy dominate the population.

In the case of the AD social norm (which generates the highest levels of cooperation), pure strategies are largely abandoned in favour of conditional strategies. Among the eight conditional strategies, only two are adopted by a sizeable part of the population: C0,L and D0,L. These are strategies that share a common goal: to participate in interactions with a population of agents who are sufficiently cooperative (since they both set the reputational threshold for participation in their group to 0). However, while the former strategy aims to engage with and enjoy the collaboration of peers, the latter aims to exploit them. When this is not possible, both strategies behave as a loner for a lower but guaranteed payoff. From the transition table (electronic supplementary material, figure S1, left panel), it emerges that most strategies turn into either C0,L or D0,L, and that these two strategies are much more likely to turn into each other than anything else (leftmost panel of figure 2). This is a consequence of the interaction between these strategies and the level of cooperation in the population. This effect can be clearly discerned by studying the temporal evolution of the proportion of agents playing these two strategies and of the associated actions played (electronic supplementary material, figure S14, panel C). While the proportions of D0,L and C0,L in the population move systematically in opposite directions, studying the actual actions played we observe oscillations in the proportion of opt-out and cooperative actions, while defection is played by fewer individuals in a stable manner. In other terms, the C0,L strategy uses the reputational information to conditionally cooperate. When a group happens to contain too many defectors (most of whom are D0,L given their relative prevalence), the strategy resorts to opting out, thus avoiding being exploited.
The strategy D0,L acts similarly, thus avoiding being driven out of the population entirely, but it is unable to spread defection through the population. Importantly, this dynamic relies on the fact that under the AD social norm, agents playing defection are assigned a strictly worse reputation than loners, thus allowing C0,L to discriminate between the two behaviours. In other terms, the cyclic behaviour of the baseline model—described by Rand & Nowak [25]—re-emerges, but with the loner strategy being part of two more complex conditional strategies. Note that the cycles are—in this set-up—quite noisy owing to the occasional attractiveness (electronic supplementary material, figure S1, left panel) of other strategies, which allows them to be the destination of some transitions and thus to survive, albeit in low numbers.

The peculiarity of AD is immediately evident when studying the other social norms. While the anti-defector social norm results in both the greatest population fitness (electronic supplementary material, figure S10) and the greatest increase in cooperation, all other social norms are unable to sustain levels of cooperation higher than the baseline, resulting in populations largely dominated by loners (figure 1) with a cyclical pattern similar to those of Rand & Nowak [25]. The reason for this divergence lies in the reputational damage inflicted by these norms on non-cooperators. Indeed, while reputation helps agents condition their behaviour to opponents’ types, attributing loners a bad reputation does not inflict on them any direct damage. Since they are already prone to not participating in the game, their payoffs remain independent from any reputational effect. However, attributing loners too low a reputation (in AL and AB) hampers the role of loners in providing a backup option against invading defectors. Furthermore, not attributing defectors a strictly lower reputation (in AN) allows them to more easily exploit cooperators by impeding the latter's ability to discriminate between loners and defectors (the full transition matrices are reported in electronic supplementary material, figures S1–S4). In all these cases the effectiveness of C0,L against defection is more limited; hence the strategy has limited evolutionary success (figure 2(b–d) and electronic supplementary material, figure S15). For this reason, in the following we will focus our attention mainly on the AD social norm.

(b) . Reputation with punishment

We now analyse the impact of introducing the AD norm to the OPGG with punishment. Owing to the substantial empirical and theoretical evidence for the presence of anti-social punishment in the OPGG, we allow all possible types of punishment to occur. In the presence of the AD norm, the level of cooperation in the population increases significantly, with cooperation becoming the most played strategy, at the expense again of loners, whose role becomes marginal in this set-up. The level of cooperation is higher not only than when punishment alone is used, but also than in the case where only reputation is available. Given this success, one could expect punishment and the reputation system to interact in some form, coevolving to produce successful strategies that both punish pro-socially and cooperate conditionally. However, this is largely not the case. Analysing the prevalence of strategies and the transition probabilities between them (figure 3 for between-strategy transitions, and figure 4 for transitions between different punishment patterns for the same basic strategy), we note that—for each of the conditional strategies that become dominant in the population—the variants that employ punishment are quite rare. When punishers are born through mutation, they rapidly turn into their non-punishing counterparts.

Figure 3.

When punishment is available, high levels of cooperation evolve as long as being a loner is not the strictly least reputable action. Strategies that thrive in these situations conditionally cooperate, becoming indistinguishable from one another in such a cooperative environment. Under the AB norm, both defecting and exiting are the least reputable actions, which creates both the most hostile environments and the least reputable OPGG groups. This results in conditional cooperators being cautious by frequently opting out. The AL social norm is not pictured as the only long-term dominant strategy that exists in significant amounts is the loner strategy. See electronic supplementary material, figures S1–S4 for the complete transition matrices.

Figure 4.

When both punishment and reputation are available, most individuals do not punish, a few punish pro-socially, and almost no-one punishes anti-socially. Nodes represent the punishment variants of each of the five most popular strategies within the reputation and punishment model (AD social norm). P and N represent the decision to punish or not to punish cooperators, defectors and/or loners. For example, NPP represents the absence of punishment toward cooperators (N), its presence against defectors (P) and its presence against loners (P). The values within the nodes are the average proportion of the variant relative to all variants of the strategy. Edges represent the transitions solely between variants, normalized by the total number of transitions between all punishment variants of the strategy. The strategies that do not punish are always the most popular of the variations, those that punish some combination of defectors or loners are common, while those that punish cooperators never survive for long. See electronic supplementary material, figure S5 for more details.

While they represent consistently low proportions of the population, the types of punishment that show more resilience (being sometimes able to attract non-punishing counterparts) are those that punish defectors and/or loners, i.e. the pro-social types (see electronic supplementary material, figures S1–S7 for the complete transition matrices and the prevalence of each strategy within each model and social norm). Thus, when both punishment and reputation exist, the latter acts as a substitute for the former. Agents evolve toward strategies that do not punish and away from strategies that do, contributing to the demise of this kind of behaviour, except as a threat against deviators.

Regardless of punishment patterns, the conditional strategies that emerge as prevalent are those whose primary action is to cooperate, which in turn provides the ideal environment for unconditional cooperators to thrive. Indeed, the combination of punishment and reputation ensures that agents whose primary action differs from cooperation end up with lower average payoffs, since the established conditional strategies are less likely to cooperate with them. The transition matrix of figure 3 shows that the surviving strategies shift between each other without a clear pattern. This is likely due to fluctuations in group composition when playing the OPGG, as these strategies are almost equivalent in largely cooperative environments.

Besides AD, the other social norms, excluding AL, are also able to increase and stabilize cooperation albeit to a lesser extent (figure 1). These social norms produce very similar results to the AD social norm concerning both the emerging conditional strategies (figure 3) and the prevalence of pro-social punishment (electronic supplementary material, figures S12 and S13). The lower the reputation attributed to loners, the lower the frequency of cooperation and population fitness (electronic supplementary material, figure S10). Comparing AD and AB against AN social norms (electronic supplementary material, figure S12) shows that as the reputability of being a defector increases, the level of pro-social punishment increases to compensate, thus minimizing the damage to cooperation and to the fitness of the population. It follows that populations using comparatively more punishment in addition to reputation have lower average fitness.

Finally, for the same reasons as discussed in the reputation only set-up, AL fails to stabilize cooperation altogether, resulting in large fluctuations of cooperation levels, high proportions of the loner strategy in the population, and the lowest payoffs among the social norms both with and without punishment.

(c) . Parameter robustness

Our main results discussed so far are robust to an extensive set of parameter variations. These are reported in the electronic supplementary material, figures S17–S24, where the effects of changes to the group synergy factor r, the loner’s payoff σ, the punishment cost γ and penalty β, probability of repeated interactions Ω, OPGG group size n, the degree of evolutionary group mixing m, and the rate of mutation ε are systematically explored (the parameterizations used for each robustness check are reported in electronic supplementary material, table S2). The superiority of combining the reputation and punishment mechanisms remains consistent with our findings, showing higher levels of cooperation when compared with our baseline model without reputation or punishment, and with the baseline model with only punishment. While the model using solely reputation does tend to increase cooperation, its behaviour is somewhat more dependent on specific parameter ranges of the degree of mixing m (electronic supplementary material, figure S23) and of the stability of groups Ω (electronic supplementary material, figure S21). Cooperation increases when the degree of mixing m increases, likely due to the pool of available strategies for agents to choose to evolve to. When m is small, players evolve based on a randomly selected player within their own OPGG groups. The likelihood for a fruitful and cooperative player to exist within this group is lower than when m is high and players evolve probabilistically based on the payoffs of a randomly selected individual from outside of their group. Instead, considering the likelihood of further interactions Ω in the set-up in which both reputation and punishment are active, as long as Ω0, cooperation has a high and similar proportion within the population. 
This suggests that a combination of both direct (relying on direct experience within the group to condition action) and indirect reciprocity (relying on third-party information to condition action) is important in obtaining high levels of cooperation.
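As a rough sketch of the kind of update rule involved, the role of the mixing parameter m can be expressed in a few lines. The function names, the Fermi-style imitation probability and the selection intensity s are illustrative assumptions, not a transcription of the model's actual code:

```python
import math
import random

def pick_model(n_agents, group_of, focal, m, rng):
    """With probability m, compare the focal agent against a randomly
    chosen agent from OUTSIDE its OPGG group (evolutionary mixing);
    otherwise against a random member of its own group."""
    if rng.random() < m:
        pool = [i for i in range(n_agents) if group_of[i] != group_of[focal]]
    else:
        pool = [i for i in range(n_agents)
                if group_of[i] == group_of[focal] and i != focal]
    return rng.choice(pool)

def adoption_prob(payoff_focal, payoff_model, s=1.0):
    """Probability that the focal agent adopts the model's strategy:
    increasing in the payoff difference (Fermi rule; the selection
    intensity s is an assumption). The exponent is clamped to avoid
    floating-point overflow for large payoff gaps."""
    diff = max(-500.0, min(500.0, s * (payoff_model - payoff_focal)))
    return 1.0 / (1.0 + math.exp(-diff))
```

With m = 0 the comparison partner always comes from the focal agent's own OPGG group, so a rare cooperative role model is unlikely to be found locally; with m = 1 it always comes from the rest of the population.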

4. Discussion

The OPGG provides a challenging environment for cooperation. The three-way interaction between cooperators, defectors and loners produces natural cyclical dynamics, where cooperative environments favour defection, which in turn makes opting out more advantageous, which in turn leads to the return of cooperation [11,15]. Similar challenges are faced introducing the possibility of agents punishing one another based on their past actions. In this case, the cyclical dynamics involve the presence both of pro-social (punishment of defectors and loners) and of anti-social punishment (punishment of cooperators), with each strategy punishing its would-be invader [25]. In either case, the average cooperation level resulting from the cyclical behaviour is low.
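The payoff structure that generates this rock–paper–scissors cycle can be sketched directly. The parameter roles (synergy factor r, loner payoff σ, contribution cost c) follow the standard OPGG formulation; the defaults below are illustrative values satisfying the usual conditions 1 < r < n and 0 < σ < r − 1:

```python
def opgg_payoffs(n_c, n_d, sigma=1.0, r=3.0, c=1.0):
    """Payoffs in one optional public goods game with n_c cooperators,
    n_d defectors, and the remaining players acting as loners.
    Loners always receive the fixed payoff sigma. If fewer than two
    players participate, the game is cancelled and everyone receives
    the loner payoff (the standard OPGG convention).
    Returns (cooperator, defector, loner) payoffs."""
    participants = n_c + n_d
    if participants < 2:
        return sigma, sigma, sigma
    share = r * c * n_c / participants  # pooled contributions, multiplied and split
    return share - c, share, sigma
```

Evaluating this at the corners shows the cycle: in an all-cooperator group each earns r − c > σ, a lone defector earns strictly more than the cooperators it exploits, and in an all-defector group everyone earns 0 < σ, so opting out pays, which in turn re-seeds cooperation.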

Introducing into the OPGG the AD social norm (assigning cooperators a good reputation, and assigning better reputations to agents that opt out of the game than to those who participate but withhold their contribution) significantly increases cooperation levels, unlike the other reputational systems. The effectiveness of the AD norm rests on the fact that it provides a way for agents to condition their behaviour on the propensity of their group to cooperate, while reducing the attractiveness of the loner action, which remains available as a backup if the environment becomes too hostile. Accordingly, within the ecosystem of strategies generated by the introduction of this norm, over half of the population end up preferring one of two strategies, C0,L or D0,L. Players using either of these strategies behave as loners if the environment becomes too harsh, but turn to cooperation or defection in a friendlier world. The backup option of abstaining from the OPGG on the one hand prevents defectors from over-exploiting cooperators: if defectors are present in sufficient numbers, their bad reputation induces C0,L to become a loner. On the other hand, the same rule supports a conditional defector that is even evolutionarily stronger than its unconditional counterpart, as it viciously exploits reputation and cooperative behaviour while still receiving the loner's payoff when the likelihood of exploiting its group decreases. The evolutionary success and average payoffs of these two strategies are similar; agents therefore cycle between them, with the equilibrium favouring conditional cooperators, which explains the observed increase in cooperation. It should be noted that in this set-up many other strategies survive, albeit in lower numbers, so the dominance of the aforementioned strategies is never complete.
It is interesting to note that the laxer counterparts of these strategies, namely C−1,L and D−1,L, are less successful, as they propose cooperation or defection even in groups largely dominated by defectors.
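To make the notation concrete, a strategy such as C0,L can be read as a simple threshold rule: play the main action only when co-players' reputations clear the threshold (0 here, versus the laxer −1), and opt out otherwise. A minimal sketch follows; aggregating co-player reputations by their minimum is an illustrative assumption on our part:

```python
def conditional_action(main_action, threshold, coplayer_reps, fallback="L"):
    """Threshold rule behind strategies like C0,L or D-1,L: play the
    main action ('C' or 'D') only if the worst co-player reputation
    meets the threshold; otherwise opt out as a loner ('L').
    Aggregating by the minimum reputation is an illustrative choice."""
    if min(coplayer_reps) >= threshold:
        return main_action
    return fallback
```

Under this reading, C0,L cooperates in a well-reputed group but withdraws once badly-reputed defectors appear, while the laxer D−1,L keeps participating even among agents with reputation −1.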

Cooperation turns out to be much more likely when we include in the mix the interaction between costly punishment and a reputational system that assigns a better reputation to agents that opt out of the game than to those that participate but withhold their contribution. The emergence of high levels of cooperation in this set-up is surprising, as the presence of anti-social punishment has been found to reduce cooperation in the OPGG ([25] and replicated in figure 1). Given that the combination of punishment and reputation sustains cooperation far better than either mechanism alone, the two act in synergy with each other. The ecosystem of strategies that thrive in this environment is made up of strategies whose main (or sole) action is cooperation; most of them do not employ the punishment device at all, while a smaller but material proportion punish pro-socially. For each conditional strategy, there are significant mutual transitions between the non-punishing and the pro-socially punishing variants, indicating a minor but present role for the latter in preserving cooperation. This cooperative environment relies on the regular influx of mutant strategists to mitigate the second-order free-rider problem, moderating the cyclical dynamics it induces: new punishing agents introduced into the population are progressively transformed into their non-punishing variants. Their role is confirmed by electronic supplementary material, figure S24, where, in the presence of both reputation and punishment, cooperation levels decrease for lower mutation rates.

All in all, these results point to the fact that, in the OPGG, reputational information acts as a cheap substitute for costly punishment. Cooperators are rewarded with a positive reputation, which makes future opponents more likely to cooperate with them. This is in line with the experimental results of Ule et al. [50], who found that, in the presence of reputation, subjects prefer rewarding positive behaviour to punishing those who defect. Since the AD reputational mechanism can be considered an indirect method of sanctioning anti-social behaviour, our result is also in line with Balafoutas et al. [51] which experimentally finds that people prefer to punish indirectly rather than directly.

Introducing a reputational system has a differential impact on the different types of punishment. In the absence of reputation, anti-social punishment successfully prevents the invasion of defectors by cooperators (as in Rand & Nowak [25]), but it is unable to do the same against the new conditional strategies under a social norm. By contrast, pro-social punishment positively reinforces the reputational information, increasing the level of cooperation in the population. This result is in contrast to Gurerk et al. [52], which established an experimental superiority of punishment institutions over reputational mechanisms in the public goods game; our finding casts doubt on whether that superiority extends to experiments where the OPGG is played.

It should be noted that all strategies with cooperation as the main action are selected in the ecosystem for this set-up, and in remarkably similar proportions. This follows from the fact that they always act as cooperators in a strongly cooperative environment, thus becoming indistinguishable from each other (thereby avoiding selection).

Our results qualify the findings of the experimental literature on anti-social punishment [24,53,54]. Such contributions work with designs that rigidly separate individual choices about actions and punishments from reputational concerns. This is particularly true for Rand & Nowak [25], which was performed on Amazon Mechanical Turk, where professionals play many games. In recent experiments, run in-person and with samples of the general population where individuals knew they were playing with citizens from the same villages or small towns [55], anti-social punishment was not significant. In these cases, even within an experimental set-up, social norms implicitly matter and remove the scope for anti-social punishment.

Reproducing in silico a simplified version of the combination of social reputation mechanism and punishment systems that characterize complex societies, our contribution shows that these two dimensions interact, shedding light on the surprising success of reputation in a world under the contemporaneous threat of exploitation and of anti-social punishment. Finally, our results contribute to identifying the conditions that allow effective collective action in the presence of the possibility to opt-out of interactions.

Acknowledgements

The authors acknowledge the use of the UCL Myriad High Performance Computing Facility (Myriad@UCL), and associated support services, in the completion of this work.

Data accessibility

Source code used to generate our results is provided as electronic supplementary material [56].

Authors' contributions

S.R., S.P. and F.P. conceived of the presented idea. S.P. and S.R. developed the theory and the experiments. S.P. designed the computational framework and implementations and created the graphics and visualizations. S.R. and S.P. wrote the manuscript. F.P. provided feedback and comments on the manuscript.

Competing interests

We declare we have no competing interests.

Funding

S.R. gratefully acknowledges funding from ‘Decentralized Platform Economics: Optimal Incentives and Structure’, Fondi primo insediamento, Ca’Foscari University of Venice. F.P. gratefully acknowledges funding from ‘Data driven methodologies to study social capital and its role for economic growth’, University of Modena and Reggio Emilia, FAR 2019, CUP E84E19001120005. S.P. gratefully acknowledges doctoral funding from the Engineering and Physical Sciences Research Council.

References

1. Axelrod R, Hamilton W. 1981. The evolution of cooperation. Science 211, 1390-1396. (doi:10.1126/science.7466396)
2. Griffin AS, West SA, Buckling A. 2004. Cooperation and competition in pathogenic bacteria. Nature 430, 1024-1027. (doi:10.1038/nature02744)
3. Nowak MA. 2006. Five rules for the evolution of cooperation. Science 314, 1560-1563. (doi:10.1126/science.1133755)
4. Perc M, Jordan JJ, Rand DG, Wang Z, Boccaletti S, Szolnoki A. 2017. Statistical physics of human cooperation. Phys. Rep. 687, 1-51. (doi:10.1016/j.physrep.2017.05.004)
5. Rainey PB, Rainey K. 2003. Evolution of cooperation and conflict in experimental bacterial populations. Nature 425, 72-74. (doi:10.1038/nature01906)
6. Righi S, Takács K. 2018. Social closure and the evolution of cooperation via indirect reciprocity. Scient. Rep. 8, 11149. (doi:10.1038/s41598-018-29290-0)
7. Smith JM. 1964. Group selection and kin selection. Nature 201, 1145-1147. (doi:10.1038/2011145a0)
8. Hardin G. 1968. The tragedy of the commons. Science 162, 1243-1248. (doi:10.1126/science.162.3859.1243)
9. Ledyard JO. 1995. Public goods: a survey of experimental research. In Handbook of experimental economics (eds Kagel J, Roth A), pp. 111-194. Princeton, NJ: Princeton University Press.
10. Maier-Rigaud FP, Martinsson P, Staffiero G. 2010. Ostracism and the provision of a public good: experimental evidence. J. Econ. Behav. Org. 73, 387-395. (doi:10.1016/j.jebo.2009.11.001)
11. Hauert C. 2002. Volunteering as Red Queen mechanism for cooperation in public goods games. Science 296, 1129-1132. (doi:10.1126/science.1070582)
12. Scheel D, Packer C. 1991. Group hunting behaviour of lions: a search for cooperation. Anim. Behav. 41, 697-709. (doi:10.1016/s0003-3472(05)80907-8)
13. Smuts BB, Cheney DL, Seyfarth RM, Wrangham RW. 2008. Primate societies. Chicago, IL: University of Chicago Press.
14. Gell A. 1986. Newcomers to the world of goods: consumption among the Muria Gonds. In The social life of things: commodities in cultural perspective (ed. A Appadurai), pp. 110-138. Cambridge, UK: Cambridge University Press. (doi:10.1017/cbo9780511819582.006)
15. Hauert C, De Monte S, Hofbauer J, Sigmund K. 2002. Replicator dynamics for optional public good games. J. Theor. Biol. 218, 187-194. (doi:10.1006/jtbi.2002.3067)
16. Semmann D, Krambeck H-J, Milinski M. 2003. Volunteering leads to rock–paper–scissors dynamics in a public goods game. Nature 425, 390-393. (doi:10.1038/nature01986)
17. Orbell JM, Dawes RM. 1993. Social welfare, cooperators' advantage, and the option of not playing the game. Am. Sociol. Rev. 58, 787-800. (doi:10.2307/2095951)
18. Nosenzo D, Tufano F. 2017. The effect of voluntary participation on cooperation. J. Econ. Behav. Org. 142, 307-319. (doi:10.1016/j.jebo.2017.07.009)
19. Hauk E. 2003. Multiple prisoner's dilemma games with(out) an outside option: an experimental study. Theory and Decision 54, 207-229. (doi:10.1023/a:1027385819400)
20. Fehr E, Gächter S. 2002. Altruistic punishment in humans. Nature 415, 137-140. (doi:10.1038/415137a)
21. Fehr E, Gächter S. 2000. Cooperation and punishment in public goods experiments. Am. Econ. Rev. 90, 980-994. (doi:10.1257/aer.90.4.980)
22. Henrich J. 2006. Costly punishment across human societies. Science 312, 1767-1770. (doi:10.1126/science.1127333)
23. Hauert C, Traulsen A, Brandt H, Nowak MA, Sigmund K. 2007. Via freedom to coercion: the emergence of costly punishment. Science 316, 1905-1907. (doi:10.1126/science.1141588)
24. Herrmann B, Thoni C, Gächter S. 2008. Antisocial punishment across societies. Science 319, 1362-1367. (doi:10.1126/science.1153808)
25. Rand DG, Nowak MA. 2011. The evolution of antisocial punishment in optional public goods games. Nat. Commun. 2, 434. (doi:10.1038/ncomms1442)
26. Rand DG, Armao JJ IV, Nakamaru M, Ohtsuki H. 2010. Anti-social punishment can prevent the co-evolution of punishment and cooperation. J. Theor. Biol. 265, 624-632. (doi:10.1016/j.jtbi.2010.06.010)
27. Szolnoki A, Perc M. 2017. Second-order free-riding on antisocial punishment restores the effectiveness of prosocial punishment. Phys. Rev. X 7, 041027. (doi:10.1103/physrevx.7.041027)
28. García J, Traulsen A. 2012. Leaving the loners alone: evolution of cooperation in the presence of antisocial punishment. J. Theor. Biol. 307, 168-173. (doi:10.1016/j.jtbi.2012.05.011)
29. dos Santos M. 2015. The evolution of anti-social rewarding and its countermeasures in public goods games. Proc. R. Soc. B 282, 20141994. (doi:10.1098/rspb.2014.1994)
30. Szolnoki A, Perc M. 2015. Antisocial pool rewarding does not deter public cooperation. Proc. R. Soc. B 282, 20151975. (doi:10.1098/rspb.2015.1975)
31. Milinski M, Semmann D, Krambeck H-J. 2002. Reputation helps solve the 'tragedy of the commons'. Nature 415, 424-426. (doi:10.1038/415424a)
32. Számadó S, Balliet D, Giardini F, Power EA, Takács K. 2021. The language of cooperation: reputation and honest signalling. Phil. Trans. R. Soc. B 376, 20200286. (doi:10.1098/rstb.2020.0286)
33. Barrett HC, Saxe RR. 2021. Are some cultures more mind-minded in their moral judgements than others? Phil. Trans. R. Soc. B 376, 20200288. (doi:10.1098/rstb.2020.0288)
34. Samu F, Takács K. 2021. Evaluating mechanisms that could support credible reputations and cooperation: cross-checking and social bonding. Phil. Trans. R. Soc. B 376, 20200302. (doi:10.1098/rstb.2020.0302)
35. Hauert C, Haiden N, Sigmund K. 2004. The dynamics of public goods. Discrete Continuous Dyn. Syst. B 4, 575-587. (doi:10.3934/dcdsb.2004.4.575)
36. Sigmund K, Hauert C, Nowak MA. 2001. Reward and punishment. Proc. Natl Acad. Sci. USA 98, 10 757-10 762. (doi:10.1073/pnas.161155698)
37. Brandt H, Hauert C, Sigmund K. 2003. Punishment and reputation in spatial public goods games. Proc. R. Soc. Lond. B 270, 1099-1104. (doi:10.1098/rspb.2003.2336)
38. Podder S, Righi S, Takács K. 2021. Local reputation, local selection, and the leading eight norms. Scient. Rep. 11, 16560. (doi:10.1038/s41598-021-95130-3)
39. Nowak MA, Sigmund K. 1998. The dynamics of indirect reciprocity. J. Theor. Biol. 194, 561-574. (doi:10.1006/jtbi.1998.0775)
40. Leimar O, Hammerstein P. 2001. Evolution of cooperation through indirect reciprocity. Proc. R. Soc. Lond. B 268, 745-753. (doi:10.1098/rspb.2000.1573)
41. Panchanathan K, Boyd R. 2003. A tale of two defectors: the importance of standing for evolution of indirect reciprocity. J. Theor. Biol. 224, 115-126. (doi:10.1016/S0022-5193(03)00154-1)
42. Ohtsuki H, Iwasa Y. 2004. How should we define goodness? Reputation dynamics in indirect reciprocity. J. Theor. Biol. 231, 107-120. (doi:10.1016/j.jtbi.2004.06.005)
43. Ohtsuki H, Hauert C, Lieberman E, Nowak MA. 2006. A simple rule for the evolution of cooperation on graphs and social networks. Nature 441, 502-505. (doi:10.1038/nature04605)
44. Santos FP, Pacheco JM, Santos FC. 2021. The complexity of human cooperation under indirect reciprocity. Phil. Trans. R. Soc. B 376, 20200291. (doi:10.1098/rstb.2020.0291)
45. Bshary R, Grutter AS. 2006. Image scoring and cooperation in a cleaner fish mutualism. Nature 441, 975-978. (doi:10.1038/nature04755)
46. Jordan JJ, Rand DG. 2020. Signaling when no one is watching: a reputation heuristics account of outrage and punishment in one-shot anonymous interactions. J. Pers. Social Psychol. 118, 57-88. (doi:10.1037/pspi0000186)
47. Panchanathan K, Boyd R. 2004. Indirect reciprocity can stabilize cooperation without the second-order free rider problem. Nature 432, 499-502. (doi:10.1038/nature02978)
48. Boyd R, Richerson PJ. 1988. Culture and the evolutionary process. Chicago, IL: University of Chicago Press.
49. Takezawa M, Price ME. 2010. Revisiting "The evolution of reciprocity in sizable groups": continuous reciprocity in the repeated n-person prisoner's dilemma. J. Theor. Biol. 264, 188-196. (doi:10.1016/j.jtbi.2010.01.028)
50. Ule A, Schram A, Riedl A, Cason T. 2009. Indirect punishment and generosity toward strangers. Science 326, 1701-1704. (doi:10.1126/science.1178883)
51. Balafoutas L, Nikiforakis N, Rockenbach B. 2014. Direct and indirect punishment among strangers in the field. Proc. Natl Acad. Sci. USA 111, 15 924-15 927. (doi:10.1073/pnas.1413170111)
52. Gurerk O, Irlenbusch B, Rockenbach B. 2006. The competitive advantage of sanctioning institutions. Science 312, 108-111. (doi:10.1126/science.1123633)
53. Gächter S, Herrmann B. 2009. Reciprocity, culture and human cooperation: previous insights and a new cross-cultural experiment. Phil. Trans. R. Soc. B 364, 791-806. (doi:10.1098/rstb.2008.0275)
54. Gächter S, Herrmann B. 2011. The limits of self-governance when cooperators get punished: experimental evidence from urban and rural Russia. Eur. Econ. Rev. 55, 193-210. (doi:10.1016/j.euroecorev.2010.04.003)
55. Pancotto F, Righi S, Takács K. 2020. Voluntary play increases cooperation in the presence of punishment: a lab in the field experiment. SSRN. (doi:10.2139/ssrn.3908319)
56. Podder S, Righi S, Pancotto F. 2021. Supplementary material from "Reputation and punishment sustain cooperation in the optional public goods game". The Royal Society. Collection. (doi:10.6084/m9.figshare.c.5581115)
