Author manuscript; available in PMC: 2012 Feb 29.
Published in final edited form as: J Theor Biol. 2010 Jun 9;265(4):624–632. doi: 10.1016/j.jtbi.2010.06.010

Anti-social punishment can prevent the co-evolution of punishment and cooperation

David G Rand 1,2,*, Joseph J Armao IV 1, Mayuko Nakamaru 3, Hisashi Ohtsuki 3,4,*
PMCID: PMC3290516  NIHMSID: NIHMS212507  PMID: 20540952

Abstract

The evolution of cooperation is one of the great puzzles in evolutionary biology. Punishment has been suggested as one solution to this problem. Here punishment is generally defined as incurring a cost to inflict harm on a wrong-doer. In the presence of punishers, cooperators can gain higher payoffs than non-cooperators. Therefore cooperation may evolve as long as punishment is prevalent in the population. Theoretical models have revealed that spatial structure can favor the co-evolution of punishment and cooperation, by allowing individuals to only play and compete with those in their immediate neighborhood. However, those models have usually assumed that punishment is always targeted at non-cooperators. In light of recent empirical evidence of punishment targeted at cooperators, we relax this assumption and study the effect of so-called ‘anti-social punishment’. We find that evolution can favor anti-social punishment, and that when anti-social punishment is possible costly punishment no longer promotes cooperation. As there is no reason to assume that cooperators cannot be the target of punishment during evolution, our results demonstrate serious restrictions on the ability of costly punishment to allow the evolution of cooperation in spatially structured populations. Our results also help to make sense of the empirical observation that defectors will sometimes pay to punish cooperators.

Keywords: evolutionary game theory, prisoner’s dilemma, public goods game, structured populations, spite

1 Introduction

Explaining the evolution of cooperation is an issue of central importance to evolutionary biology as well as social sciences. Why does the competitive process of natural selection often lead to altruistic cooperation, in which individuals pay a cost to give a benefit to others? To answer this question, numerous mechanisms for the evolution of cooperation have been proposed (Nowak, 2006; Lehmann & Keller 2006; West et al. 2007), including kin selection (Hamilton, 1964; Frank 1998), direct reciprocity (Trivers 1971) and indirect reciprocity (Alexander 1987; Nowak & Sigmund 2005).

In direct reciprocity, my actions towards you depend on what you have done to me in the past. Axelrod (1984) found that the Tit-for-Tat (TFT) strategy was a winning strategy in his computer tournament. A TFT player cooperates in the first round of the repeated interaction. After the first round, a TFT player takes whatever action his opponent took in the previous round. For example, an ALLD player, who always defects in a repeated interaction, receives cooperation from a TFT player in the first round. However, the ALLD player is ‘punished’ by TFT with mutual defection in subsequent rounds. Therefore, cheating does not pay if the repeated interaction lasts long enough, and the TFT strategy can be evolutionarily stable against the ALLD strategy (Axelrod & Hamilton 1981). Numerous behavioral experiments have demonstrated that direct reciprocity can lead to stable cooperation in repeated games (Wedekind & Milinski 1996; Dal Bo 2005; Dreber et al. 2008; Dal Bo & Frechette 2009; Rand et al. 2009a; Fudenberg et al. 2010).

Another possible way to enforce cooperation is to rely on more explicit punishment (Clutton-Brock & Parker 1995a,b; Sigmund 2007; Rand et al. 2009b). While TFT ‘punishes’ non-cooperative partners by responding to selfishness with reciprocal selfishness, explicit punishment strategies choose to inflict harm on non-cooperative partners, often at a cost to the punisher. For example, in paper wasps, subordinates that cheat by signaling an inflated status of dominance receive more aggression from other wasps (Tibbetts & Dale 2004). Similarly, monkeys that do not share food are often punished (Hauser 1992). Punishment is also known as ‘policing’ in studies of social insects, where queens or workers sometimes attack other nestmate workers that attempt to produce offspring by themselves (Ratnieks et al. 2006; Wenseleers & Ratnieks 2006). In behavioral experiments with humans, punishment has been shown to stabilize cooperation in multi-player social dilemmas (Yamagishi 1986; Ostrom et al. 1992; Fehr & Gächter 2000, 2002), leading to higher payoffs after an initial learning period (Gächter et al. 2008; Rand et al. 2009a). As a proximate explanation of such pro-social punishment, the desire for egalitarian outcomes (Fehr & Schmidt 1999; Johnson et al. 2009) and/or anger directed at norm violators (Fehr & Gächter 2002) have recently been suggested.

Under the threat of punishment, a cooperative strategy can perform better than a non-cooperative strategy, because the latter suffers the costs of being punished. Therefore cooperation may evolve in populations where punishment is prevalent. When punishment is also costly to the punisher, however, we face another puzzle: how does punishment evolve? Cooperators who do not participate in punishment avoid the cost of punishment. Thus, they out-compete those who cooperate and punish. This puzzle is well-known as the ‘second-order free rider problem’ (Oliver 1980; Yamagishi 1986; Boyd & Richerson 1992; Panchanathan & Boyd 2004).

Such ‘costly punishment’ (Fehr & Gächter 2000, 2002) has been shown to co-evolve with cooperation when, for example, cooperation and punishment are so strongly linked that cooperators almost always participate in punishment (Axelrod 1986; Boyd & Richerson 1992; Nakamaru & Iwasa 2005; Lehmann et al. 2007). If cooperators punish defectors, the advantage of selfish individuals is greatly reduced, and cooperation can co-evolve with punishment through group selection (Boyd et al. 2003). Cooperation and punishment can co-evolve when those who punish defectors are compensated by being more likely to receive cooperation from others (Gardner & West 2004). Recent models of cultural evolution suggest that weak conformist transmission can stabilize punishment and hence promote the evolution of cooperation (Henrich & Boyd 2001; Henrich 2004). The option to abstain from the game also favors the co-evolution of cooperation and punishment (Fowler 2005; Hauert et al. 2007; Traulsen et al. 2009). In repeated two-player games, however, it has been shown that costly punishment does not promote cooperation, as traditional tit-for-tat style direct reciprocity is sufficient (Rand et al. 2009b). In contrast to such repeated games, a credible threat of costly punishment could be effective in one-shot games, for example in the situation where resources are so scarce that a long-term reciprocal interaction, and therefore use of the TFT strategy, is unlikely.

Nakamaru & Iwasa (2005, 2006) have shown that costly punishment promotes cooperation in a two-stage game played in a spatially structured population. Nakamaru & Iwasa (2005) studied two strategies: an altruist-punisher, who cooperates with others and punishes non-cooperators, and a selfish-non-punisher, who neither cooperates with others nor punishes non-cooperators. In a subsequent paper (Nakamaru & Iwasa 2006), they included two more strategies in their analysis: an altruist-non-punisher, who cooperates but does not punish, and a selfish-punisher, who does not cooperate but punishes non-cooperators (see also Sigmund et al. 2001).

These theoretical models have studied the effect of allowing players to punish non-cooperators. In addition to such pro-social punishment, however, numerous behavioral experiments have found that a significant fraction of non-cooperators will pay to punish cooperators (Shinada et al. 2004; Denant-Boemont et al. 2007; Dreber et al. 2008; Herrmann et al. 2008; Nikiforakis 2008; Gächter & Herrmann 2009; Wu et al. 2009). For example, a series of cross-cultural public goods game experiments found a great deal of variation across societies in the extent to which cooperators versus non-cooperators were targeted with punishment (Herrmann et al. 2008). In the most extreme cases, participants in countries such as Greece and Oman were as likely to punish those who contributed more than them as those who contributed less. The high level of punishment directed at cooperators in this and other experiments indicates that this behavior is not merely the result of errors or lack of comprehension, but is instead a surprising aspect of human behavior requiring explanation. Throughout this paper, we refer to this punishment targeted at cooperators as ‘anti-social punishment’, in order to distinguish it from the (usual) punishment that is targeted at non-cooperators. Anti-social punishment runs counter to the common assumptions about why people choose to punish (Johnson et al. 2009; Fehr & Gächter 2000, 2002). Strategies that pay a cost to impose a cost on cooperators are often excluded from evolutionary models of cooperation and punishment. In addition to the empirical evidence for anti-social punishment, it seems most appropriate for evolutionary models to allow the full set of possible strategies (of a given complexity) and to ask what strategies emerge via natural selection, rather than restricting the strategy set to only include particularly attractive strategies.

We ask how including anti-social punishment affects the evolution of cooperation. Is this ‘dark side’ of punishment in behavioral experiments just irrational, erroneous behavior, or can it in fact be favored by natural selection? Does punishment still promote cooperation when anti-social punishment is possible? In this paper we adopt the framework used by Nakamaru & Iwasa (2005, 2006) and study the consequences of including anti-social punishment.

2 Strategies and payoffs

We consider a two-stage game with cooperation followed by punishment (as in Boyd & Richerson 1992; Sigmund et al. 2001; Nakamaru & Iwasa 2005, 2006; Boyd et al. 2003; Brandt et al. 2003, 2006; Fowler 2005; Hauert et al. 2007; Traulsen et al. 2009). In the first stage, each player has two choices, defection (D) or cooperation (C). Defection means doing nothing, such that all players receive zero payoffs. Cooperation means paying a cost c for another to get a benefit b (b > c > 0). In the second stage, each player can choose to punish each other player or not, conditioned on the other person’s action in the first stage. By punishing, a player pays a cost α for the other person to incur a loss β (α, β > 0). Withholding punishment results in both players receiving zero payoffs in the second stage. We do not consider mixed strategies in the following analysis, and restrict our attention to pure reactive strategies.

Since the action in the second stage is conditioned on the other players’ action in the first stage, there are four possible strategies in the second stage. A non-punisher (N) punishes no one. A pro-social punisher (P) punishes defectors only. An anti-social punisher (A) punishes cooperators only. A spiteful punisher (S) punishes both cooperators and defectors. A combination of the action in the first stage (D,C) and the strategy in the second stage (N, P, A, S) defines one’s strategy in the game. Thus we have eight possible strategies: DN, DP, DA, DS, CN, CP, CA and CS. For example, a CP-strategist cooperates in the first stage and harms defectors in the second stage (therefore deemed as a ‘strong reciprocator’ (Fehr & Fischbacher 2003; Bowles & Gintis 2004)). Table 1 summarizes these eight strategies.

Table 1.

The eight strategies considered in the present paper.

Strategy   Cooperate?   Harm cooperators?   Harm defectors?
DN         no           no                  no
DP         no           no                  yes
DA         no           yes                 no
DS         no           yes                 yes
CN         yes          no                  no
CP         yes          no                  yes
CA         yes          yes                 no
CS         yes          yes                 yes

We can classify these eight strategies into two categories: strategies which do not punish other players using the same strategy (hereafter ‘self-consistent’), and strategies which do punish other players using the same strategy (hereafter ‘self-inconsistent’). The strategies DN, DA, CN and CP are self-consistent. Conversely, the strategies DP, DS, CA and CS are self-inconsistent.

The 8 × 8 payoff matrix of the pairwise game is given in Eq.(1). As can be seen, the strategies DN and DA are Nash equilibria for all payoff values, and CP is a Nash equilibrium when β > c.

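\[
\begin{array}{c|cccccccc}
 & \mathrm{DN} & \mathrm{DP} & \mathrm{DA} & \mathrm{DS} & \mathrm{CN} & \mathrm{CP} & \mathrm{CA} & \mathrm{CS} \\ \hline
\mathrm{DN} & 0 & -\beta & 0 & -\beta & b & b-\beta & b & b-\beta \\
\mathrm{DP} & -\alpha & -\alpha-\beta & -\alpha & -\alpha-\beta & b & b-\beta & b & b-\beta \\
\mathrm{DA} & 0 & -\beta & 0 & -\beta & b-\alpha & b-\alpha-\beta & b-\alpha & b-\alpha-\beta \\
\mathrm{DS} & -\alpha & -\alpha-\beta & -\alpha & -\alpha-\beta & b-\alpha & b-\alpha-\beta & b-\alpha & b-\alpha-\beta \\
\mathrm{CN} & -c & -c & -c-\beta & -c-\beta & -c+b & -c+b & -c+b-\beta & -c+b-\beta \\
\mathrm{CP} & -c-\alpha & -c-\alpha & -c-\alpha-\beta & -c-\alpha-\beta & -c+b & -c+b & -c+b-\beta & -c+b-\beta \\
\mathrm{CA} & -c & -c & -c-\beta & -c-\beta & -c+b-\alpha & -c+b-\alpha & -c+b-\alpha-\beta & -c+b-\alpha-\beta \\
\mathrm{CS} & -c-\alpha & -c-\alpha & -c-\alpha-\beta & -c-\alpha-\beta & -c+b-\alpha & -c+b-\alpha & -c+b-\alpha-\beta & -c+b-\alpha-\beta
\end{array}
\tag{1}
\]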
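The entries of Eq. (1) follow mechanically from the rules of the two-stage game, so the matrix can be rebuilt and the equilibrium statement checked numerically. The following is a minimal sketch under stated assumptions, not the code used for the paper: the names `PUNISHES`, `STRATEGIES`, `payoff` and `nash_strategies` are hypothetical, rows are taken to be the focal player, and ‘Nash equilibrium’ is read in the weak sense (no alternative strategy earns strictly more against the resident).

```python
from itertools import product

# First-stage actions that each second-stage type punishes (illustrative encoding).
PUNISHES = {"N": "", "P": "D", "A": "C", "S": "CD"}
STRATEGIES = [a + s for a, s in product("DC", "NPAS")]  # DN, DP, DA, DS, CN, ...


def payoff(row, col, b, c, alpha, beta):
    """Error-free payoff to `row` when paired with `col`."""
    total = b * (col[0] == "C") - c * (row[0] == "C")   # first stage
    total -= alpha * (col[0] in PUNISHES[row[1]])       # row punishes col
    total -= beta * (row[0] in PUNISHES[col[1]])        # col punishes row
    return total


def nash_strategies(b, c, alpha, beta):
    """Strategies that are (weak) best replies to themselves."""
    result = []
    for x in STRATEGIES:
        resident = payoff(x, x, b, c, alpha, beta)
        if all(payoff(y, x, b, c, alpha, beta) <= resident for y in STRATEGIES):
            result.append(x)
    return result


if __name__ == "__main__":
    print(nash_strategies(b=3, c=1, alpha=1, beta=2))    # beta > c
    print(nash_strategies(b=3, c=1, alpha=1, beta=0.5))  # beta < c
```

With these example values, CP appears among the equilibria only in the first call, where β exceeds c.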

For reference, consider the sub-model where punishment cannot be targeted at cooperators. In this case, the strategies A and S are not allowed in the second stage, so the payoff matrix (1) is reduced to

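\[
\begin{array}{c|cccc}
 & \mathrm{DN} & \mathrm{DP} & \mathrm{CN} & \mathrm{CP} \\ \hline
\mathrm{DN} & 0 & -\beta & b & b-\beta \\
\mathrm{DP} & -\alpha & -\alpha-\beta & b & b-\beta \\
\mathrm{CN} & -c & -c & -c+b & -c+b \\
\mathrm{CP} & -c-\alpha & -c-\alpha & -c+b & -c+b
\end{array}
\tag{2}
\]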

In the following analysis, we will concentrate on the “score-dependent viability model” (Nakamaru et al. 1997; Irwin & Taylor 2001; Nakamaru & Iwasa 2005, 2006; Sekiguchi & Nakamaru 2009) (hereafter “viability model”). We chose to study the viability model as it has previously been shown to be very effective in promoting the co-evolution of punishment and cooperation in structured populations (Nakamaru & Iwasa 2005, 2006), and therefore presents the greatest challenge to anti-social punishment. We will compare evolutionary dynamics in unstructured populations and lattice structured populations. We will investigate the effect on the evolution of cooperation of introducing anti-social punishment into the game. We do so by expanding the strategy set to include all combinatorially possible pure strategies (shown in Eq. (1)), as opposed to making any structural changes to the game (for example creating the opportunity to retaliate as in Janssen & Bushman 2008) or increasing the complexity of the possible strategies.

3 z-mixed population model

First we study viability game dynamics in populations with no spatial structure. This viability updating rule can be interpreted as either genetic evolution or social learning. In the context of genetic evolution, an organism’s probability of death is affected by its payoff, while the probability of reproduction is random. In the context of social learning, a person’s probability of abandoning her current strategy is determined by her payoff, after which she randomly picks a new strategy from the population to imitate (a type of conformism).

Groups of size z + 1 are randomly selected from the population, and each player interacts with her z other group members to obtain a game payoff, f. In each generation a random player is given a chance to update her strategy (i.e. has some chance of death). With probability

\[
d(f) = \gamma \exp[-\theta f], \tag{3}
\]

she abandons her current strategy (i.e. dies) and randomly adopts the strategy of one of the z players she just interacted with (i.e. one of the z players is randomly chosen to reproduce). This model was called the “complete mixing model” by Nakamaru et al. (1997) and Nakamaru & Iwasa (2005, 2006). However, as we will see below, this model is different from the usual “well-mixed population” model often assumed in evolutionary game theory. To avoid potential confusion, we call the above model the “z-mixed population model”, where z represents the number of players one interacts with. Unlike the traditional well-mixed population model, agents in the z-mixed population model interact with, and compete with, only a subset of the population. The z → ∞ limit recovers the traditional well-mixed population dynamics.

Here γ,θ > 0 are constants which influence the intensity of selection. The intuitive consequence of Eq.(3) is that players with lower payoffs are more likely to change their strategies.
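The update rule described above can be sketched in a few lines. This is an illustrative sketch only, resting on several assumptions that the text does not fix: the payoff f is taken to be the sum over the z interactions, execution errors are ignored, the death probability is capped at 1 so that it remains a valid probability for very negative payoffs, and all names (`payoff`, `death_probability`, `update_step`) are hypothetical.

```python
import math
import random

# First-stage actions that each second-stage type punishes (illustrative encoding).
PUNISHES = {"N": "", "P": "D", "A": "C", "S": "CD"}


def payoff(row, col, b, c, alpha, beta):
    """Error-free payoff to `row` when paired with `col`."""
    total = b * (col[0] == "C") - c * (row[0] == "C")
    total -= alpha * (col[0] in PUNISHES[row[1]])
    total -= beta * (row[0] in PUNISHES[col[1]])
    return total


def death_probability(f, gamma, theta):
    """Eq. (3), capped at 1 (the cap is an assumption of this sketch)."""
    return min(1.0, gamma * math.exp(-theta * f))


def update_step(population, z, params, gamma, theta, rng):
    """One update attempt in the z-mixed model; mutates `population` in place."""
    # Draw a focal player and her z interaction partners at random.
    focal, *partners = rng.sample(range(len(population)), z + 1)
    f = sum(payoff(population[focal], population[j], **params) for j in partners)
    if rng.random() < death_probability(f, gamma, theta):
        # The focal player adopts the strategy of one of her z partners.
        population[focal] = population[rng.choice(partners)]


if __name__ == "__main__":
    rng = random.Random(0)
    population = ["DN"] * 50 + ["CP"] * 50
    params = dict(b=3, c=1, alpha=1, beta=6)
    for _ in range(20000):
        update_step(population, 4, params, gamma=0.1, theta=0.1, rng=rng)
    print({s: population.count(s) for s in set(population)})
```

Because the replacement strategy is copied from the same z players who determined the payoff, interaction and competition are both local, which is the feature that distinguishes this model from a well-mixed population.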

3.1 When anti-social punishment is not allowed

First we consider the four-strategy game, Eq.(2). It has previously been shown that in a z-mixed population, the strategies CP and CN are selected when β is large, regardless of initial frequencies of strategies (Nakamaru & Iwasa 2006). This result is surprising given that DN is always a Nash equilibrium strategy in the game, Eq.(2). The key lies in the assumptions of the viability model. In the viability model, (i) one interacts with only a subset of the population (finite z), and (ii) those whom one mimics are the same as those whom one interacts with. Such local population regulation significantly affects stability conditions of strategies, and is a common feature of many models of evolution in structured populations (Nowak & May 1992; Wilson et al. 1992; Taylor 1992; Killingback & Doebeli 1996; Szabo & Toke 1998; Hauert & Doebeli 2004; Ohtsuki et al. 2006; Santos et al. 2006). In particular, a Nash equilibrium strategy may no longer be a stable strategy in the viability model, and non-Nash equilibrium strategies may be stable.

To see how viability updating changes the stability condition, imagine the game between two arbitrary strategies, X and Y:

\[
\begin{array}{c|cc}
 & X & Y \\ \hline
X & a_1 & a_2 \\
Y & a_3 & a_4
\end{array}
\tag{4}
\]

The condition for strategy X to be a strict Nash equilibrium is a1 > a3, namely, X performs better than Y against X. However, in the viability model it can be shown that strategy X is stable against invasion by strategy Y when

\[
(z-1)\,a_1 + a_2 > z\,a_3. \tag{5}
\]

Here z is the number of others that one interacts with. See Appendix A for its derivation. Interestingly, a2, which is the payoff of X playing against Y, appears in the stability condition. The intuitive explanation for this is that a Y-player affects the payoffs of those X players she then mimics. As z increases, the effect of a2 diminishes, but never disappears as long as z is finite.
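The difference between the Nash condition and the viability condition can be made concrete with a small check. The sketch below is illustrative; the helper names are hypothetical, and the example uses the error-free payoffs of DN (resident) against DP (invader), namely a1 = 0, a2 = −β and a3 = −α.

```python
def is_strict_nash(a1, a3):
    """Resident X earns strictly more against X than invader Y does."""
    return a1 > a3


def is_viability_stable(a1, a2, a3, z):
    """Eq. (5): resident X resists invasion by Y under viability updating."""
    return (z - 1) * a1 + a2 > z * a3


if __name__ == "__main__":
    alpha, z = 1.0, 4
    for beta in (2.0, 5.0):
        a1, a2, a3 = 0.0, -beta, -alpha  # DN vs DN, DN vs DP, DP vs DN
        print(beta, is_strict_nash(a1, a3), is_viability_stable(a1, a2, a3, z))
```

In this example DN is a strict best reply to itself for both values of β, yet it fails Eq. (5) once β exceeds zα, because the harm the invader inflicts on its neighbors (through a2) raises the invader's chance of being copied.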

Using the stability condition, Eq.(5), we obtain for general z that CP is robust against invasion by DP and DN when

\[
\beta > c + \frac{b+\alpha}{z}. \tag{6}
\]

Note that CP is always neutral to CN, because there are no defectors to punish in the population of only CP and CN.
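For completeness, Eq. (6) can be recovered by substituting the relevant entries of Eq. (2) into Eq. (5). The steps below are our own expansion, taking X = CP and Y = DN; an invading DP yields the same values of a_2 and a_3 against CP, and therefore the same condition.

```latex
\begin{align*}
a_1 &= b - c, \qquad a_2 = -c - \alpha, \qquad a_3 = b - \beta, \\
(z-1)(b-c) + (-c-\alpha) &> z\,(b-\beta) \\
zb - zc - b - \alpha &> zb - z\beta \\
z\beta &> zc + b + \alpha \\
\beta &> c + \frac{b+\alpha}{z}.
\end{align*}
```

As z grows, the condition approaches β > c, the Nash condition for CP.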

We can further extend previous analysis by introducing errors. We assume that there is a (sufficiently) small chance ε of error in execution of intended actions (Molander 1985; May 1987; Nowak & Sigmund 1989, 1992, 1993; Fudenberg & Maskin 1990; Fudenberg & Tirole 1991; Lindgren 1991; Lindgren & Nordahl 1994; Boerlijst et al. 1997; Wahl & Nowak 1999; Brandt & Sigmund 2006). More precisely, one fails to perform an intended action in the first stage and/or the second stage independently with probability ε. In the first stage, with probability ε one mistakenly chooses D though her intended action is C, and vice versa. In the second stage, with probability ε one fails to harm others though she intended to, and vice versa.

Such errors resolve the neutrality between CN and CP. In the presence of small errors, the payoff matrix between CN and CP becomes

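\[
\begin{array}{c|cc}
 & \mathrm{CN} & \mathrm{CP} \\ \hline
\mathrm{CN} & b-c+(-b+c-\alpha-\beta)\varepsilon & b-c+(-b+c-\alpha-2\beta)\varepsilon+2\beta\varepsilon^{2} \\
\mathrm{CP} & b-c+(-b+c-2\alpha-\beta)\varepsilon+2\alpha\varepsilon^{2} & b-c+(-b+c-2\alpha-2\beta)\varepsilon+(2\alpha+2\beta)\varepsilon^{2}
\end{array}
\tag{7}
\]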

Since ε is sufficiently small, it has an effect only when payoffs in Eq.(2) are equal (i.e. when two strategies are neutral). Neglecting ε² terms and using Eq.(5), we obtain that CP is stable against CN when

\[
\beta > z\alpha. \tag{8}
\]

Combining (6) and (8) leads to the ESS condition of strategy CP, as

\[
\beta > \max\left\{ c + \frac{b+\alpha}{z},\; z\alpha \right\}. \tag{9}
\]

Similarly, the ESS condition for the other three strategies can be derived. Figure 1A shows the parameter regions where each strategy is an ESS. The results are equivalent to those found in the absence of error (Nakamaru & Iwasa 2006), except that CN is never an ESS. In particular, we see that the cooperative strategy CP is the only ESS when β is sufficiently large.
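The error-adjusted payoffs in Eq. (7) can be reproduced by averaging over independent execution errors in each stage. The sketch below is an illustration under that reading of the error model; `expected_payoff` and `cp_is_ess_restricted` are hypothetical names, and the second function simply encodes Eq. (9).

```python
# First-stage actions that each second-stage type punishes (illustrative encoding).
PUNISHES = {"N": "", "P": "D", "A": "C", "S": "CD"}


def expected_payoff(row, col, b, c, alpha, beta, eps):
    """Expected payoff to `row` against `col` with execution error `eps`."""
    def p_coop(s):
        # Probability that the realized first-stage action is C.
        return 1 - eps if s[0] == "C" else eps

    def p_punish(s, action):
        # Probability that s actually punishes a partner who played `action`.
        return 1 - eps if action in PUNISHES[s[1]] else eps

    q_row, q_col = p_coop(row), p_coop(col)
    row_punishes = q_col * p_punish(row, "C") + (1 - q_col) * p_punish(row, "D")
    col_punishes = q_row * p_punish(col, "C") + (1 - q_row) * p_punish(col, "D")
    return b * q_col - c * q_row - alpha * row_punishes - beta * col_punishes


def cp_is_ess_restricted(b, c, alpha, beta, z):
    """Eq. (9): ESS condition for CP when only defectors can be punished."""
    return beta > max(c + (b + alpha) / z, z * alpha)


if __name__ == "__main__":
    b, c, alpha, beta, eps = 3.0, 1.0, 1.0, 2.0, 0.01
    # CN against CP, to be compared with the top-right entry of Eq. (7).
    formula = b - c + (-b + c - alpha - 2 * beta) * eps + 2 * beta * eps ** 2
    print(expected_payoff("CN", "CP", b, c, alpha, beta, eps), formula)
    print(cp_is_ess_restricted(b, c, alpha, beta=6.0, z=4))
```

Setting `eps` to zero recovers the error-free entries of Eq. (2).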

Figure 1.

Evolutionary stability under the z-mixed population model. The strategies that are ESS are indicated in each region of the (b, β) parameter space. We set c = 1 and α = 1. We consider small errors. (A) When anti-social punishment is not allowed, CP is ESS in regions 3, 4 and 5. (B) When anti-social punishment is allowed, CP is never ESS. Either DN or DS is the unique ESS.

3.2 When anti-social punishment is allowed

Here we allow the option of anti-social punishment, making the punishment strategies A (anti-social punisher) and S (spiteful strategy) possible in the second stage of the game. The game is thus described by the full 8 × 8 matrix given by Eq.(1).

In the absence of errors (ε = 0), we find three pairs of strategies that are neutral in addition to CN-CP. DN and DA are neutral because there are no cooperators to punish. DP and DS are neutral because they always punish themselves and each other. CA and CS are neutral because they also always punish themselves and each other. With the introduction of errors, the payoff matrices for these pairs change to

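\[
\begin{array}{c|cc}
 & \mathrm{DN} & \mathrm{DA} \\ \hline
\mathrm{DN} & (b-c-\alpha-\beta)\varepsilon & (b-c-\alpha-2\beta)\varepsilon+2\beta\varepsilon^{2} \\
\mathrm{DA} & (b-c-2\alpha-\beta)\varepsilon+2\alpha\varepsilon^{2} & (b-c-2\alpha-2\beta)\varepsilon+(2\alpha+2\beta)\varepsilon^{2}
\end{array},
\tag{10}
\]
\[
\begin{array}{c|cc}
 & \mathrm{DP} & \mathrm{DS} \\ \hline
\mathrm{DP} & -\alpha-\beta+(b-c+2\alpha+2\beta)\varepsilon+(-2\alpha-2\beta)\varepsilon^{2} & -\alpha-\beta+(b-c+2\alpha+\beta)\varepsilon-2\alpha\varepsilon^{2} \\
\mathrm{DS} & -\alpha-\beta+(b-c+\alpha+2\beta)\varepsilon-2\beta\varepsilon^{2} & -\alpha-\beta+(b-c+\alpha+\beta)\varepsilon
\end{array},
\tag{11}
\]
\[
\begin{array}{c|cc}
 & \mathrm{CA} & \mathrm{CS} \\ \hline
\mathrm{CA} & b-c-\alpha-\beta+(-b+c+2\alpha+2\beta)\varepsilon+(-2\alpha-2\beta)\varepsilon^{2} & b-c-\alpha-\beta+(-b+c+2\alpha+\beta)\varepsilon-2\alpha\varepsilon^{2} \\
\mathrm{CS} & b-c-\alpha-\beta+(-b+c+\alpha+2\beta)\varepsilon-2\beta\varepsilon^{2} & b-c-\alpha-\beta+(-b+c+\alpha+\beta)\varepsilon
\end{array}.
\tag{12}
\]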

Taking into account Eqs.(7, 10, 11, 12), the evolutionary stability of each strategy in the game Eq.(1) under the z-mixed population model is derived.

  1. DN is ESS if β < zα, and is invaded by DP, DA, and DS if β > zα.

  2. DP is invaded by DN and DA if β < zα, and invaded by DS if β > zα. Thus DP is never ESS.

  3. DA is invaded by DN if β < zα, and invaded by DP and DS if β > zα. Thus DA is never ESS.

  4. DS is invaded by DN, DP and DA if β < zα, and is ESS if β > zα.

  5. CN is always invaded by DN and DP. Thus CN is never ESS.

  6. CP is invaded by CN if β < zα, and invaded by CA and CS if β > zα. Thus CP is never ESS.

  7. CA is always invaded by DN and DP. Thus CA is never ESS.

  8. CS is always invaded by DA and DS. Thus CS is never ESS.
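The eight statements above can be checked numerically by applying the pairwise condition of Eq. (5) to error-adjusted payoffs. The sketch below is illustrative rather than the authors' procedure: all names are hypothetical, ε is set to a small positive number instead of taking a formal limit, and the threshold zα is the one implied by Eq. (5) and Eq. (8). The example parameters (b = 3, c = 1, α = 1, z = 4) are arbitrary.

```python
from itertools import product

# First-stage actions that each second-stage type punishes (illustrative encoding).
PUNISHES = {"N": "", "P": "D", "A": "C", "S": "CD"}
STRATEGIES = [a + s for a, s in product("DC", "NPAS")]


def expected_payoff(row, col, b, c, alpha, beta, eps):
    """Expected payoff to `row` against `col` with execution error `eps`."""
    def p_coop(s):
        return 1 - eps if s[0] == "C" else eps

    def p_punish(s, action):
        return 1 - eps if action in PUNISHES[s[1]] else eps

    q_row, q_col = p_coop(row), p_coop(col)
    row_punishes = q_col * p_punish(row, "C") + (1 - q_col) * p_punish(row, "D")
    col_punishes = q_row * p_punish(col, "C") + (1 - q_row) * p_punish(col, "D")
    return b * q_col - c * q_row - alpha * row_punishes - beta * col_punishes


def resists(x, y, z, eps, **p):
    """Eq. (5) for resident x against invader y."""
    a1 = expected_payoff(x, x, eps=eps, **p)
    a2 = expected_payoff(x, y, eps=eps, **p)
    a3 = expected_payoff(y, x, eps=eps, **p)
    return (z - 1) * a1 + a2 > z * a3


def ess_list(strategies, z, eps, **p):
    """Strategies that resist invasion by every other strategy in the set."""
    return [x for x in strategies
            if all(resists(x, y, z, eps, **p) for y in strategies if y != x)]


if __name__ == "__main__":
    common = dict(z=4, eps=1e-3, b=3, c=1, alpha=1)
    print(ess_list(STRATEGIES, beta=2, **common))                # beta < z*alpha
    print(ess_list(STRATEGIES, beta=6, **common))                # beta > z*alpha
    print(ess_list(["DN", "DP", "CN", "CP"], beta=6, **common))  # restricted set
```

Running the same check on the restricted set shows how the conclusion changes once strategies A and S are removed.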

Figure 1B summarizes the result. If β < zα, DN is the unique ESS. If β > zα, DS is the unique ESS. It is noteworthy that CP is never an ESS under the full strategy set. Anti-social behavior and spite destroy cooperation. As we saw above, if β > max{c + (b + α)/z, zα}, CP is not invaded by CN, DN, or DP and therefore it can be ESS in the absence of anti-social punishment. When anti-social punishment is available, however, CA and CS can invade CP if β > zα. The reason for this invasion is simple. Imagine a CA (or CS)-player in a population of CP. The CA player punishes z CP players, so he pays the total cost of zα, whereas each CP player who meets the CA player incurs the loss β. If this loss is greater than the total cost the CA player pays, CP is more likely to change to CA than vice versa, and therefore CA propagates in the population of CP. CA players maximize their relative fitness in a population of CP players by avoiding the punishment they would receive for defecting and by harming others to enhance the probability that they are mimicked. However, invasion by CA or CS players does not suggest maintenance of cooperation by anti-social punishment, as they are easily invaded by various defecting strategies. Thus evolution always leads to DS if β > zα.

To summarize, when punishment is sufficiently effective (i.e. β > zα), it is adaptive to punish others and increase one’s relative payoff no matter what strategy the others adopt in the first stage (see Appendix B). Hence the optimal target of punishment should be not only defectors but also cooperators, and the condition β > zα is simply the condition under which unconditional harming is adaptive in the z-mixed population model. When it was only possible to punish defectors, this power of punishment allowed CP to invade DN. Once anti-social punishment is available, however, cooperators no longer have an advantage, and spiteful defection, DS, fares best. The β > zα condition reveals that the success of spiteful punishers relies on agents interacting and competing with a limited number of others. In the z → ∞ limit of a well-mixed infinite population, DN is the unique ESS for all payoff values.

4 Lattice model

We now consider a structured population in which players are arranged on a square lattice. Each player interacts with the four players in her von Neumann neighborhood and obtains a game payoff. Thus this lattice model is the structured counterpart to the z-mixed population model with z = 4, and is equivalent to a series of overlapping linear public goods games (e.g. Santos et al. 2008), where each player’s cooperative actions affect her four neighbors. Other features of the model are the same as in the z-mixed population model described in the previous section. Agent-based simulations are used to explore the evolutionary dynamics in this structured population.
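A single update of such a lattice simulation might look as follows. This is a rough sketch under assumptions that the text does not specify: periodic boundaries, a payoff summed over the four neighbors, errors handled through expected (rather than realized) payoffs, and a death probability capped at 1. All function names are hypothetical.

```python
import math
import random

# First-stage actions that each second-stage type punishes (illustrative encoding).
PUNISHES = {"N": "", "P": "D", "A": "C", "S": "CD"}


def expected_payoff(row, col, b, c, alpha, beta, eps):
    """Expected payoff to `row` against `col` with execution error `eps`."""
    def p_coop(s):
        return 1 - eps if s[0] == "C" else eps

    def p_punish(s, action):
        return 1 - eps if action in PUNISHES[s[1]] else eps

    q_row, q_col = p_coop(row), p_coop(col)
    row_punishes = q_col * p_punish(row, "C") + (1 - q_col) * p_punish(row, "D")
    col_punishes = q_row * p_punish(col, "C") + (1 - q_row) * p_punish(col, "D")
    return b * q_col - c * q_row - alpha * row_punishes - beta * col_punishes


def neighbours(i, j, size):
    """Von Neumann neighborhood with periodic boundaries (an assumption)."""
    return [((i - 1) % size, j), ((i + 1) % size, j),
            (i, (j - 1) % size), (i, (j + 1) % size)]


def lattice_step(grid, params, eps, gamma, theta, rng):
    """One viability update on the lattice; mutates `grid` in place."""
    size = len(grid)
    i, j = rng.randrange(size), rng.randrange(size)
    nbrs = neighbours(i, j, size)
    f = sum(expected_payoff(grid[i][j], grid[k][l], eps=eps, **params)
            for k, l in nbrs)
    if rng.random() < min(1.0, gamma * math.exp(-theta * f)):
        k, l = rng.choice(nbrs)   # a random neighbor reproduces into the site
        grid[i][j] = grid[k][l]


if __name__ == "__main__":
    rng = random.Random(1)
    names = ["DN", "DP", "DA", "DS", "CN", "CP", "CA", "CS"]
    grid = [[rng.choice(names) for _ in range(20)] for _ in range(20)]
    params = dict(b=5, c=1, alpha=1, beta=5)
    for _ in range(200000):
        lattice_step(grid, params, eps=0.01, gamma=0.1, theta=0.1, rng=rng)
    flat = [s for row in grid for s in row]
    print({s: flat.count(s) for s in names})
```

Because offspring replace a neighbor of the parent, like strategies cluster, so a strategy's punishment is frequently aimed at copies of itself whenever the strategy is self-inconsistent.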

4.1 When anti-social punishment is not allowed

Examining the restricted four-strategy game on a lattice, we find similar results to the z-mixed population model. As Figure 2 shows, the defecting strategy, DN, is favored when β is small, and the cooperative strategies, CN and CP, are favored when β is large. Thus our inclusion of ε errors does not change results on the lattice compared to what has been found previously (Nakamaru & Iwasa 2006). In the absence of anti-social punishment, there is a large parameter region in which cooperative strategies dominate on the lattice, regardless of initial conditions.

Figure 2.

Evolutionary outcomes of the restricted strategy set without anti-social punishment on a regular square lattice. When β is small, DN wins regardless of initial frequencies of strategies; when β is large, CP wins regardless of initial frequencies of strategies. Starting from the specified initial frequency, 50 agent-based simulations are run and the winning strategy is recorded. The blue portion of each circle indicates the fraction of runs where CP wins or CP and CN coexist; the red portion, the fraction of runs where DN wins; the white portion, the fraction of runs in which there was no convergence after 125,000,000 generations. We explore the (b, β) parameter space, setting c = 1 and α = 1. We consider small errors, ε = 0.01, and a 50 × 50 lattice for a total population size N = 2500. We use viability updating with parameter values γ = 0.1 and θ = 0.1. (A) Initial frequency of strategies: DN=0.79, DP=0.07, CN=0.07, CP=0.07. (B) Initial frequency of strategies: DN=0.07, DP=0.07, CN=0.07, CP=0.79.

4.2 When anti-social punishment is allowed

We now explore the dynamics on the lattice when strategies which punish cooperators are included, Eq.(1). Simulation results are shown in Figure 3. Results are dependent on initial strategy frequencies. When DN or CP is most abundant initially, we see a similar pattern: the majority of the time DN wins when β is small, DA wins when β is intermediate, and CP wins when β is large. When DS is most abundant initially, DN again wins when β is small, DA wins when β is large, and further simulations find that CP wins when β > 10. When DA is most abundant initially, however, the outcome is very different. Regardless of b or β, DA wins in the majority of cases. DN wins occasionally. Cooperation never invades a resident population of anti-social defectors.

Figure 3.

Evolutionary outcomes of the full strategy set on a regular lattice. Cooperation cannot invade a population of anti-social defectors, DA. Starting from the specified initial condition, 50 agent-based simulations are run and the winning strategy is recorded. The blue portion of each circle indicates the fraction of runs where CP wins or CP, CN, CA and/or CS coexist; the red portion, the fraction of runs where DN wins; the yellow portion, the fraction of runs where DA wins; the white portion, the fraction of runs in which there was no convergence after 125,000,000 generations. DP and DS never win. We consider small errors, ε = 0.01, and a 50 × 50 lattice for a total population size N = 2500. We use viability updating with parameter values γ = 0.1 and θ = 0.1. We explore the (b, β) parameter space, setting c = 1 and α = 1. Additional simulations find qualitatively similar results for α = 0.5 and α = 1.5. Initial population density: (A) DN=0.79, all others 0.03; (B) CP=0.79, all others 0.03; (C) DS=0.79, all others 0.03; (D) DA=0.79, all others 0.03.

These results are qualitatively different from what we saw in the z-mixed population. First, DS never wins on the lattice, whereas DS was the sole ESS in the z-mixed population model when β was large. A possible explanation for this difference involves spatial correlations. Since offspring disperse locally on the lattice, DS players are very likely to interact with other DS players (i.e. their “relatives”). Because DS is self-inconsistent in the sense that one DS player harms other DS players, the target of the DS players’ spiteful punishment tends to be other DS players. Therefore DS, as well as the other self-inconsistent strategies DP, CA and CS, cannot propagate on the lattice (see also Appendix B). A pairwise invasion analysis based on computer simulations (Table 2) shows that across β values, each self-inconsistent strategy is invaded by the self-consistent strategies that take the same first-stage action (i.e. DP and DS are invaded by DN and DA; CA and CS are invaded by CN and CP).

Table 2.

Pairwise invasion analysis in a lattice structured population. In each cell we compare two strategies, resident and mutant. The resident’s initial frequency is 0.97, and the invader’s initial frequency is 0.03. The fraction of simulations in which the invader takes over the whole population is shown. We set b = 5, c = 1 and α = 1, and conduct 50 simulations, each lasting 125,000,000 generations. We consider small errors, ε = 0.01, and a 50 × 50 lattice for a total population size N = 2500. We use viability updating with parameter values γ = 0.1 and θ = 0.1.

β = 1
Invader \ Resident   DN     DP     DA     DS     CN     CP     CA     CS
DN                   -      1      0.24   1      1      1      1      1
DP                   0      -      0      0.35   0      0      1      1
DA                   0      1      -      1      1      1      1      1
DS                   0      0      0      -      0      0      1      1
CN                   0      0      0      0      -      0      1      1
CP                   0      0      0      0      0      -      1      1
CA                   0      0      0      0      0      0      -      0
CS                   0      0      0      0      0      0      0      -

β = 5
Invader \ Resident   DN     DP     DA     DS     CN     CP     CA     CS
DN                   -      1      0.20   1      1      0      1      1
DP                   0      -      0      0.05   0      0      0      0
DA                   0      1      -      1      1      0.17   1      1
DS                   0      0      0      -      0      0      1      1
CN                   0      1      0      0      -      0      1      1
CP                   0.95   1      0      1      0      -      1      1
CA                   0      0      0      0      0      0      -      0.30
CS                   0      0      0      0      0      0      0      -

β = 10
Invader \ Resident   DN     DP     DA     DS     CN     CP     CA     CS
DN                   -      1      0.21   1      1      0      1      1
DP                   0      -      0      0      0      0      0      0
DA                   0      1      -      1      1      0      1      1
DS                   0      0.03   0      -      0      0      1      0
CN                   0      1      0      0      -      0      1      1
CP                   1      1      0      1      0      -      1      1
CA                   0      0      0      0      0      0      -      0.08
CS                   0      1      0      0      0      0      0      -

The second major difference we observe is that on the lattice, anti-social defectors (DA) win in a large region of the parameter space, whereas DA is never an ESS in the z-mixed population. This is because DP and DS, which are potential invaders of DA in the z-mixed population model, are self-inconsistent and thus do not perform well on the lattice as discussed above. In contrast, DA is self-consistent and harms only cooperators, which protects DA from invasion by CP. The pairwise invasion analysis in Table 2 suggests that strategies other than DN cannot invade DA on the lattice. DN can occasionally invade DA, but only due to a small advantage arising from the tiny error probability ε (see Eq.(10)). Hence DN and DA are almost neutral. In the z-mixed population model DA can invade DN when β is large. In contrast, DA never invades DN on the lattice. Again this is because in the presence of errors, DA engages in self-inconsistent punishment of erroneous cooperation.

Overall, the lattice simulation results using the full strategy set are quite different from what occurs in the restricted strategy set where anti-social punishment is not possible. When anti-social punishment is excluded, there is a wide parameter region where cooperation wins regardless of initial conditions (Figure 2). However, the introduction of anti-social punishment eliminates this region, hinders cooperation, and anti-social defectors prevail (Figure 3).

5 Discussion

Here we have analyzed the effect of anti-social punishment on the evolution of cooperation. We have studied score-dependent viability dynamics (Nakamaru & Iwasa 2005, 2006). We include all eight possible pure reactive strategies as well as tiny behavioral errors, and compare the z-mixed population model and the spatially structured lattice model.

Results in the z-mixed population model clearly demonstrate that the inclusion of anti-social punishment destroys the evolutionary success of cooperation. We found that cooperation is never an ESS under the full strategy set. Spiteful defectors who always harm others (DS) are the only ESS if the punishment technology is sufficiently efficient (a sufficiently large effect-to-cost ratio). Our lattice model analysis confirms the difficulties anti-social punishment poses for cooperation. On the lattice, cooperators can never invade a population of anti-social defectors who punish cooperators (DA). This anti-social defection strategy is highly successful across the wide parameter ranges we tested. These results show that anti-social punishment can be favored by natural selection, and highlight the dangers associated with allowing the option for costly punishment. Not only is it possible for cooperators to punish defectors, but the opposite can also occur. Together with the previous results of Nakamaru & Iwasa (2006), our results suggest that punishment successfully promotes cooperation in structured populations only if it is possible to harm defectors but not cooperators. Our results suggest that there is an evolutionary imperative for defectors to seek out ways to punish cooperators, and thus models which exclude this phenomenon may give skewed results favoring the evolution of cooperation.

At first it seems counter-intuitive that successful self-interested players would pay a cost to harm others in the second round of a two-shot game. The explanation lies in the nature of evolutionary competition in settings where you learn from the same players you interact with. Here it is not one’s absolute payoff that matters, but rather how much you have relative to others (Page et al. 2000). Although punishers (P), anti-social punishers (A), and spiteful punishers (S) in our model seem to perform spite at a level of absolute payoffs because both punisher and punishee incur immediate costs, the actual effect of these behaviors is to enhance the punisher’s probability of replacing the victims of punishment: inflicting harm which reduces the others’ payoff by more than it reduces yours improves your relative payoff. Therefore, spiteful strategies which harm others can be at an advantage in a spatially structured setting, as long as they preferentially harm agents with strategies different than their own (Hamilton 1970; Wilson 1975; Nakamaru et al. 1997; Foster et al. 2000; Nakamaru & Iwasa 2005; Lehmann et al. 2006, El Mouden & Gardner 2008). In this sense, punishment in our model does not represent genuine spite but can be classified as a selfish behavior (West & Gardner 2010). Conversely, evolution has been shown to disfavor costly punishment in repeated games played in well mixed populations (Rand et al. 2009b). The effect of costly punishment in repeated games with local interaction and competition, however, is an interesting question for future study, as is the role of anti-social punishment in games with continuous (as opposed to binary) traits (Nakamaru & Dieckmann, 2009). 
Furthermore, while our analysis has demonstrated that the population structure considered in this paper can favor anti-social punishment, a more explicitly structured population, such as a population subdivided into many small groups, may not, as the balance between within-group selection and between-group selection could change (Wilson 1975, Okasha 2006, Traulsen & Nowak 2006, Wilson & Wilson 2007). Similarly, incorporating explicit inter-group conflicts into our model could increase the importance of group-level selection, and therefore could change the relative importance of anti-social and pro-social punishment (Bowles 2009; Lehmann & Feldman 2008). These issues deserve further study.
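The relative-payoff argument above can be illustrated with a small calculation. The following is a minimal sketch, not the paper's Eq.(1): it assumes that cooperating costs c and gives the partner b, that punishing costs the punisher α and the target β, and that there are no behavioral errors. The default parameter values are those quoted in Appendix B, and the function names are illustrative.

```python
def cooperates(strategy):
    """First letter of the strategy name: C cooperates, D defects."""
    return strategy[0] == "C"


def punishes(strategy, partner_cooperated):
    """Second letter: N never, P punishes defectors, A punishes cooperators, S punishes all."""
    rule = strategy[1]
    if rule == "N":
        return False
    if rule == "P":
        return not partner_cooperated
    if rule == "A":
        return partner_cooperated
    return True  # rule == "S"


def payoff(me, other, b=5.0, c=1.0, alpha=1.0, beta=10.0):
    """Payoff to `me` from one pairwise interaction (assumed form, no errors)."""
    total = 0.0
    if cooperates(me):
        total -= c          # cost of cooperating
    if cooperates(other):
        total += b          # benefit received from a cooperating partner
    if punishes(me, cooperates(other)):
        total -= alpha      # cost of punishing the partner
    if punishes(other, cooperates(me)):
        total -= beta       # harm received from the partner's punishment
    return total


def relative_payoff(me, other, **params):
    """How far `me` ends up ahead of (or behind) `other` after one interaction."""
    return payoff(me, other, **params) - payoff(other, me, **params)


print(payoff("DS", "DN"), relative_payoff("DS", "DN"))  # -1.0 9.0
print(payoff("DA", "CP"), relative_payoff("DA", "CP"))  # -6.0 6.0
```

In this sketch a DS player meeting a DN player loses 1 in absolute terms but finishes 9 ahead of its partner, which is the sense in which punishment improves relative payoff. Against a copy of itself the relative payoff is zero, so the advantage arises only when harm falls on players with a different strategy.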

We have framed the cooperative dilemma faced by players in our model as a series of pairwise interactions. However, because players are unconditional cooperators or defectors, the game we study is exactly equivalent to a (z + 1)-player public goods game followed by punishment (Boyd & Richerson 1992; Fehr & Gächter 2000, 2002; Sigmund et al. 2001; Nakamaru & Iwasa 2005, 2006; Boyd et al. 2003; Brandt et al. 2003, 2006; Fowler 2005; Hauert et al. 2007; Traulsen et al. 2009). In the z-mixed population, a cooperator pays a cost to benefit each other member of her group, which is drawn randomly each round; this corresponds to a public goods game played under the ‘stranger matching’ protocol (Fehr & Gächter 2000, 2002). In the lattice population structure, a cooperator pays a cost to benefit her z = 4 neighbors; this corresponds to a series of overlapping public goods games with fixed group compositions. Thus the results we observe regarding anti-social punishment are not unique to pairwise interactions, but rather apply to collective action problems confronted by groups of individuals.
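To make this correspondence concrete, consider the first (cooperation) stage only, ignoring punishment and errors. As an illustration, suppose k of a focal player's z partners are cooperators, and assume that each act of cooperation costs the actor c and gives the recipient b. The symbols k, \pi_C and \pi_D are introduced here for illustration only. Summing over the z pairwise interactions gives

```latex
\pi_C = k\,b - z\,c, \qquad \pi_D = k\,b .
```

These are the payoffs of a (z + 1)-player game in which a contributor pays zc in total and every other group member receives b per contributor. Because cooperation is unconditional, the total contribution does not depend on who the partners are, which is why the pairwise and group descriptions coincide.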

In our modeling framework, agents update their strategies through an evolutionary process. As opposed to prospectively calculating the strategy that would maximize one’s payoff, our agents are motivated to change strategy when their payoffs are low, and then imitate the behavior of others they observe. Thus strategies with higher payoffs tend to become more common in the population. If instead our agents picked strategies based on rational self-interest using Eq.(1) with ε errors, they would never cooperate either with or without anti-social punishment as DN is the unique Nash equilibrium. Thus in both extremes of strict imitation and strict rationality, cooperation is not stable using the full strategy set. Intermediate cases in which agents use some combination of imitation and prospective reasoning are an interesting subject for further study.

The analysis presented here raises important questions about the ability of costly punishment to promote the evolution of cooperation. Spiteful behavior, including anti-social punishment, is a well-documented aspect of human behavior (Saijo & Nakamura 1995; Denant-Boemont et al. 2007; Dreber et al. 2008; Herrmann et al. 2008; Nikiforakis 2008; Gächter & Herrmann 2009; Wu et al. 2009), and should not be ignored. Here we show that in populations with local interaction and competition, anti-social and spiteful behavior can be favored by evolution as it actually enhances the actor’s relative reproductive success, and costly punishment no longer promotes the evolution of cooperation when anti-social punishment is allowed. Nonetheless, we observe cooperation throughout the natural world and human society. While many other models for the co-evolution of punishment and cooperation have been proposed, our results suggest that the effects of anti-social punishment need to be explored in these other contexts as well. If including anti-social punishment in other models gives results similar to what we have shown here, then mechanisms other than punishment could be primarily responsible for the evolution of cooperation, and punishment must have evolved secondarily for other reasons such as asserting dominance (Clutton-Brock & Parker 1995a).

Acknowledgments

DR and JA acknowledge support from the John Templeton Foundation, the NSF/NIH joint program in mathematical biology (NIH grant R01GM078986), the Bill and Melinda Gates Foundation (Grand Challenges grant 37874), and J. Epstein. MN acknowledges Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan (No.21770016, No. 20310086, No.19046006, No.19310097, No.21247006) and support from the Inamori Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Appendix A Derivation of equation (5) in the main text

Consider the game (4). We study the stability condition of strategy X against Y. For that purpose, let ε (≪1) be the frequency of strategy Y in the population. The remaining 1 − ε is the frequency of X. A Y-strategist changes to strategy X at rate

$$\varepsilon \sum_{x=0}^{z} \binom{z}{x} (1-\varepsilon)^{x} \varepsilon^{z-x}\, d\big(x a_3 + (z-x) a_4\big)\, \frac{x}{z} = \varepsilon\, d(z a_3) + O(\varepsilon^2), \tag{A1}$$

where d(·) is given by Eq.(3). In the sum above, x represents the number of X-players with whom a Y-player interacts. With probability $\binom{z}{x}(1-\varepsilon)^{x}\varepsilon^{z-x}$ the Y-player interacts with x many X-players and gains the payoff $x a_3 + (z-x) a_4$. When he dies, he changes to strategy X with probability x/z. On the other hand, an X-strategist changes to strategy Y at rate

$$(1-\varepsilon) \sum_{y=0}^{z} \binom{z}{y} (1-\varepsilon)^{z-y} \varepsilon^{y}\, d\big[(z-y) a_1 + y a_2\big]\, \frac{y}{z} = \varepsilon\, d\big((z-1) a_1 + a_2\big) + O(\varepsilon^2). \tag{A2}$$

There y represents the number of Y-players with whom an X-player interacts. With probability $\binom{z}{y}(1-\varepsilon)^{z-y}\varepsilon^{y}$ the X-player interacts with y many Y-players and gains the payoff $(z-y) a_1 + y a_2$. When he dies, he changes to strategy Y with probability y/z.

Comparing Eqs.(A1, A2), we obtain the stability condition of X against Y as $d(z a_3) > d((z-1) a_1 + a_2)$, or equivalently,

$$(z-1) a_1 + a_2 > z a_3. \tag{A3}$$
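The leading-order approximations in (A1) and (A2) can be checked numerically. The following is a minimal sketch in which `d` is an assumed stand-in for Eq.(3), chosen only because it is positive and strictly decreasing in the score; the payoff entries a1 to a4 are arbitrary illustrative values, not taken from the paper. It requires Python 3.8 or later for `math.comb`.

```python
from math import comb, exp


def d(score):
    """Assumed death-rate function: positive and strictly decreasing (not Eq.(3) itself)."""
    return exp(-score)


def rate_y_to_x(eps, z, a3, a4):
    """Exact left-hand side of (A1): rate at which a Y-strategist switches to X."""
    total = sum(
        comb(z, x) * (1 - eps) ** x * eps ** (z - x) * d(x * a3 + (z - x) * a4) * x / z
        for x in range(z + 1)
    )
    return eps * total


def rate_x_to_y(eps, z, a1, a2):
    """Exact left-hand side of (A2): rate at which an X-strategist switches to Y."""
    total = sum(
        comb(z, y) * (1 - eps) ** (z - y) * eps ** y * d((z - y) * a1 + y * a2) * y / z
        for y in range(z + 1)
    )
    return (1 - eps) * total


# Illustrative values only.
z, a1, a2, a3, a4 = 4, 1.0, 0.5, 0.8, 0.2
eps = 1e-4

ratio_a1 = rate_y_to_x(eps, z, a3, a4) / (eps * d(z * a3))
ratio_a2 = rate_x_to_y(eps, z, a1, a2) / (eps * d((z - 1) * a1 + a2))
x_is_stable = rate_y_to_x(eps, z, a3, a4) > rate_x_to_y(eps, z, a1, a2)
print(ratio_a1, ratio_a2, x_is_stable)
```

For small ε both ratios should be close to 1, and for these values the comparison of the two rates agrees with condition (A3), since (z − 1)a1 + a2 = 3.5 exceeds za3 = 3.2.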

Appendix B Conditions for DS strategy to win

An intuitive explanation of why the DS strategy wins in the z-mixed population model when β is relatively large is twofold: (i) there are no assortment effects in the z-mixed population model, such that one does not meet others with the same strategy more often than expected by global frequencies, and (ii) one learns from the same individuals one interacts with in the z-mixed population model, such that one’s payoff relative to one’s interaction partners matters.

To see that property (i) is a necessary condition for DS to win, we compare the z-mixed population model (Section 3.1.2) with the lattice model (Section 3.2.2). Property (i) is present in the former but is absent in the latter, while property (ii) is present in both models. As we saw in the main text, DS wins in the z-mixed model but not in the lattice model, suggesting that property (i) is crucial for the propagation of the DS strategy.

To see that (ii) is also a necessary condition for DS to win, we compare the z-mixed population model to a variant with “global replacement”. In this variant, an updating player mimics the strategy of a player randomly chosen from the entire population, as opposed to a randomly selected interaction partner. Property (ii) is present in the z-mixed population model but is absent in the global replacement variant, whereas property (i) is present in both models. To compare the two models, we use agent-based simulations with z = 4, population size N = 2500, b = 5, c = 1, α = 1, β = 10 and viability updating parameter values γ = 1 and θ = 1. We examine the outcome of 25 simulations lasting 125,000,000 generations, starting from an initial frequency of DS = 0.79, with all other strategies at 0.03. Our calculations in the main text showed that DS is an ESS in the z-mixed model when β > . Consistent with this, we find that DS wins in the majority of cases in the z-mixed population agent-based simulations. Conversely, we find that DN wins 80% of the time using the global replacement model. This result suggests that property (ii) is also crucial for the propagation of the DS strategy.
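The difference between the two variants compared above comes down to where the replacement strategy is drawn from. The following is a minimal sketch of a single update step, not the authors' simulation code. It assumes a payoff form (cooperation costs c and gives b; punishment costs α and inflicts β), ignores behavioral errors, and uses exp(−θ·score), rescaled so that it never exceeds 1, as a stand-in for the death rate of Eq.(3). All function names are illustrative.

```python
import math
import random

STRATEGIES = ["DN", "DP", "DA", "DS", "CN", "CP", "CA", "CS"]
B, C, ALPHA, BETA = 5.0, 1.0, 1.0, 10.0  # values quoted in Appendix B
Z = 4                                     # number of interaction partners


def punishes(rule, target_cooperated):
    """N never punishes, P punishes defectors, A punishes cooperators, S punishes all."""
    if rule == "S":
        return True
    if rule == "P":
        return not target_cooperated
    if rule == "A":
        return target_cooperated
    return False


def payoff(me, other):
    """Payoff to `me` from one pairwise interaction (assumed form, errors ignored)."""
    me_c, other_c = me[0] == "C", other[0] == "C"
    total = (B if other_c else 0.0) - (C if me_c else 0.0)
    if punishes(me[1], other_c):
        total -= ALPHA
    if punishes(other[1], me_c):
        total -= BETA
    return total


def initial_population(n=2500, other_share=0.03):
    """Every strategy except DS starts at `other_share`; DS takes the remainder."""
    population = []
    for s in STRATEGIES:
        if s != "DS":
            population += [s] * round(other_share * n)
    population += ["DS"] * (n - len(population))
    return population


def update_step(population, rng, global_replacement=False, theta=1.0):
    """Pick a focal player, score it against Z random partners, maybe replace it."""
    n = len(population)
    focal = rng.randrange(n)
    candidates = rng.sample(range(n), Z + 1)
    partners = [i for i in candidates if i != focal][:Z]
    score = sum(payoff(population[focal], population[j]) for j in partners)
    worst = -Z * (C + ALPHA + BETA)  # lowest possible score, used for rescaling
    # Assumed death probability: decreasing in score, equal to 1 at the worst score.
    if rng.random() < math.exp(-theta * (score - worst)):
        if global_replacement:
            source = rng.randrange(n)      # copy anyone in the population
        else:
            source = rng.choice(partners)  # copy one of the interaction partners
        population[focal] = population[source]


rng = random.Random(0)
pop = initial_population()
for _ in range(500):
    update_step(pop, rng)                           # z-mixed variant
for _ in range(500):
    update_step(pop, rng, global_replacement=True)  # global replacement variant
print(len(pop), pop.count("DS"))
```

With local replacement, a dying player is replaced by a strategy that was present in the interaction that determined its score, so harming one's partners raises one's own chance of spreading. With global replacement that link is broken. This sketch is far too short to reproduce the reported outcomes and is meant only to show the structural difference.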

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  1. Alexander RD. The Biology of Moral Systems. New York: Aldine de Gruyter; 1987.
  2. Axelrod R. The Evolution of Cooperation. New York: Basic Books; 1984.
  3. Axelrod R, Hamilton WD. The evolution of cooperation. Science. 1981;211:1390–1396. doi: 10.1126/science.7466396.
  4. Axelrod R. An evolutionary approach to norms. Amer Polit Sci Rev. 1986;80:1095–1111.
  5. Boerlijst MC, Nowak MA, Sigmund K. The logic of contrition. J Theor Biol. 1997;185:281–293. doi: 10.1006/jtbi.1996.0326.
  6. Bowles S. Did Warfare Among Ancestral Hunter-Gatherers Affect the Evolution of Human Social Behaviors? Science. 2009;324:1293–1298. doi: 10.1126/science.1168112.
  7. Bowles S, Gintis H. The evolution of strong reciprocity: cooperation in heterogeneous populations. Theor Pop Biol. 2004;65:17–28. doi: 10.1016/j.tpb.2003.07.001.
  8. Boyd R, Gintis H, Bowles S, Richerson PJ. The evolution of altruistic punishment. Proc Natl Acad Sci USA. 2003;100:3531–3535. doi: 10.1073/pnas.0630443100.
  9. Boyd R, Richerson PJ. Punishment allows the evolution of cooperation (or anything else) in sizable groups. Ethol Sociobiol. 1992;13:171–195.
  10. Brandt H, Hauert C, Sigmund K. Punishment and reputation in spatial public goods games. Proc R Soc Lond B. 2003;270:1099–1104. doi: 10.1098/rspb.2003.2336.
  11. Brandt H, Hauert C, Sigmund K. Punishing and abstaining for public goods. Proc Natl Acad Sci USA. 2006;103:495–497. doi: 10.1073/pnas.0507229103.
  12. Brandt H, Sigmund K. The good, the bad and the discriminator - Errors in direct and indirect reciprocity. J Theor Biol. 2006;239:183–194. doi: 10.1016/j.jtbi.2005.08.045.
  13. Clutton-Brock TH, Parker GA. Punishment in animal societies. Nature. 1995a;373:209–216. doi: 10.1038/373209a0.
  14. Clutton-Brock TH, Parker GA. Sexual coercion in animal societies. Anim Behav. 1995b;49:1345–1365.
  15. Dal Bó P. Cooperation Under the Shadow of the Future: Experimental Evidence from Infinitely Repeated Games. American Economic Review. 2005;95:1591–1604.
  16. Dal Bó P, Fréchette G. The Evolution of Cooperation in Infinitely Repeated Games: Experimental Evidence. Forthcoming in the American Economic Review 2009
  17. Denant-Boemont L, Masclet D, Noussair C. Punishment, counterpunishment and sanction enforcement in a social dilemma experiment. Economic Theory. 2007;33:1432–1479.
  18. Dreber A, Rand DG, Fudenberg D, Nowak MA. Winners don’t punish. Nature. 2008;452:348–351. doi: 10.1038/nature06723.
  19. El Mouden C, Gardner A. Nice natives and mean migrants: the evolution of dispersal-dependent social behaviour in viscous populations. J Evol Biol. 2008;21:1480–1491. doi: 10.1111/j.1420-9101.2008.01614.x.
  20. Fehr E, Gächter S. Cooperation and Punishment in Public Goods Experiments. American Economic Review. 2000;90:980–994.
  21. Fehr E, Gächter S. Altruistic punishment in humans. Nature. 2002;415:137–140. doi: 10.1038/415137a.
  22. Fehr E, Fischbacher U. The nature of human altruism. Nature. 2003;425:785–791. doi: 10.1038/nature02043.
  23. Fehr E, Schmidt KM. A theory of fairness, competition, and cooperation. The Quarterly Journal of Economics. 1999;114:817–868.
  24. Foster KR, Wenseleers T, Ratnieks FLW. Spite in social insects. Trends Ecol Evol. 2000;15:469–470.
  25. Fowler JH. Altruistic punishment and the origin of cooperation. Proc Natl Acad Sci USA. 2005;102:7047–7049. doi: 10.1073/pnas.0500938102.
  26. Frank SA. Foundations of social evolution. Princeton: Princeton University Press; 1998.
  27. Fudenberg D, Maskin E. Evolution and cooperation in noisy repeated games. Am Econ Rev. 1990;80:274–279.
  28. Fudenberg D, Tirole J. Game theory. Cambridge, MA: MIT Press; 1991.
  29. Fudenberg D, Rand DG, Dreber A. Slow to Anger and Fast to Forgive: Cooperation in an Uncertain World. 2010 Available at SSRN: http://ssrn.com/abstract=1616396.
  30. Gächter S, Renner E, Sefton M. The Long-Run Benefits of Punishment. Science. 2008;322:1510. doi: 10.1126/science.1164744.
  31. Gächter S, Herrmann B. Reciprocity, culture and human cooperation: previous insights and a new cross-cultural experiment. Phil Trans R Soc B. 2009;364:791–806. doi: 10.1098/rstb.2008.0275.
  32. Gardner A, West SA. Cooperation and punishment, especially in humans. Am Nat. 2004;164:753–764. doi: 10.1086/425623.
  33. Hamilton WD. The genetical evolution of social behaviour. J Theor Biol. 1964;7:1–52. doi: 10.1016/0022-5193(64)90038-4.
  34. Hamilton WD. Selfish and spiteful behaviour in an evolutionary model. Nature. 1970;228:1218–1220. doi: 10.1038/2281218a0.
  35. Hauert C, Doebeli M. Spatial structure often inhibits the evolution of cooperation in the snowdrift game. Nature. 2004;428:643–646. doi: 10.1038/nature02360.
  36. Hauert C, Traulsen A, Brandt H, Nowak MA, Sigmund K. Via freedom to coercion: The emergence of costly punishment. Science. 2007;316:1905–1907. doi: 10.1126/science.1141588.
  37. Hauser MD. Cost of deception: Cheaters are punished in rhesus monkeys (Macaca mulatta) Proc Natl Acad Sci USA. 1992;89:12137–12139. doi: 10.1073/pnas.89.24.12137.
  38. Henrich J. Cultural group selection, coevolutionary processes and large-scale cooperation. J Econ Behav Organ. 2004;53:3–35.
  39. Henrich J, Boyd R. Why people punish defectors: Weak conformist transmission can stabilize costly enforcement of norms in cooperative dilemmas. J Theor Biol. 2001;208:79–89. doi: 10.1006/jtbi.2000.2202.
  40. Herrmann B, Thöni C, Gächter S. Antisocial punishment across societies. Science. 2008;319:1362–1367. doi: 10.1126/science.1153808.
  41. Irwin AJ, Taylor PD. Evolution of altruism in stepping-stone populations with overlapping generations. Theor Pop Biol. 2001;60:315–325. doi: 10.1006/tpbi.2001.1533.
  42. Janssen MA, Bushman C. Evolution of cooperation and altruistic punishment when retaliation is possible. J Theor Biol. 2008;254:541–545. doi: 10.1016/j.jtbi.2008.06.017.
  43. Johnson T, Dawes CT, Fowler JH, McElreath R, Smirnov O. The role of egalitarian motives in altruistic punishment. Economics Letters. 2009;102:192–194.
  44. Kiers ET, Rousseau RA, West SA, Denison RF. Host sanctions and the legume-rhizobium mutualism. Nature. 2003;425:78–81. doi: 10.1038/nature01931.
  45. Killingback T, Doebeli M. Spatial Evolutionary Game Theory: Hawks and Doves Revisited. Proc R Soc B. 1996;263:1135–1144.
  46. Lehmann L, Feldman M. War and the evolution of belligerence and bravery. Proceedings of the Royal Society B. 2008;275:2877–2885. doi: 10.1098/rspb.2008.0842.
  47. Lehmann L, Keller L. The evolution of cooperation and altruism - a general framework and a classification of models. J Evol Biol. 2006;19:1365–1376. doi: 10.1111/j.1420-9101.2006.01119.x.
  48. Lehmann L, Bargum K, Reuter M. An evolutionary analysis of the relationship between spite and altruism. J Evol Biol. 2006;19:1507–1516. doi: 10.1111/j.1420-9101.2006.01128.x.
  49. Lehmann L, Rousset F, Roze D, Keller L. Strong reciprocity or strong ferocity? A population genetic view of the evolution of altruistic punishment. Am Nat. 2007;170:21–36. doi: 10.1086/518568.
  50. Lindgren K. Evolutionary phenomena in simple dynamics. In: Langton C, et al., editors. Artificial Life II. Redwood City, CA: Addison-Wesley; 1991. pp. 295–312.
  51. Lindgren K, Nordahl MG. Evolutionary dynamics of spatial games. Physica D. 1994;75:292–309.
  52. May RM. More evolution of cooperation. Nature. 1987;327:15–17.
  53. Molander P. The optimal level of generosity in a selfish uncertain environment. J Conflict Resolut. 1985;29:611–618.
  54. Nakamaru M, Dieckmann U. Runaway selection for cooperation and strict-and-severe punishment. J Theor Biol. 2009;257:1–8. doi: 10.1016/j.jtbi.2008.09.004.
  55. Nakamaru M, Matsuda H, Iwasa Y. The evolution of cooperation in a lattice structured population. J Theor Biol. 1997;184:65–81. doi: 10.1006/jtbi.1996.0243.
  56. Nakamaru M, Iwasa Y. The evolution of altruism by costly punishment in lattice-structured populations: score-dependent viability versus score-dependent fertility. Evol Ecol Res. 2005;7:853–870.
  57. Nakamaru M, Iwasa Y. The coevolution of altruism and punishment: role of the selfish punisher. J Theor Biol. 2006;240:475–488. doi: 10.1016/j.jtbi.2005.10.011.
  58. Nikiforakis N. Punishment and Counter-punishment in Public Goods Games: Can we still govern ourselves? Journal of Public Economics. 2008;92:91–112.
  59. Nowak MA. Five rules for the evolution of cooperation. Science. 2006;314:1560–1563. doi: 10.1126/science.1133755.
  60. Nowak MA, May RM. Evolutionary games and spatial chaos. Nature. 1992;359:826–829.
  61. Nowak MA, Sigmund K. Game-dynamical aspects of the prisoner’s dilemma. Appl Math Comp. 1989;30:191–213.
  62. Nowak MA, Sigmund K. Tit For Tat in heterogeneous populations. Nature. 1992;355:250–253.
  63. Nowak MA, Sigmund K. A strategy of win-stay, lose-shift that outperforms tit for tat in prisoner’s dilemma. Nature. 1993;364:56–58. doi: 10.1038/364056a0.
  64. Nowak MA, Sigmund K. Evolution of indirect reciprocity. Nature. 2005;437:1291–1298. doi: 10.1038/nature04131.
  65. Ohtsuki H, Hauert C, Lieberman E, Nowak MA. A simple rule for the evolution of cooperation on graphs and social networks. Nature. 2006;441:502–505. doi: 10.1038/nature04605.
  66. Okasha S. Evolution of the levels of selection. Oxford: Oxford University Press; 2006.
  67. Oliver P. Rewards and punishment as selective incentives for collective action: theoretical investigations. Am J Sociol. 1980;85:1356–1375.
  68. Ostrom E, Walker J, Gardner R. Covenants With and Without a Sword: Self-Governance is Possible. Am Pol Sci Rev. 1992;86:404–417.
  69. Page KM, Nowak MA, Sigmund K. The spatial ultimatum game. Proc R Soc B. 2000;267:2177–2182. doi: 10.1098/rspb.2000.1266.
  70. Panchanathan K, Boyd R. Indirect reciprocity can stabilize cooperation without the second-order free rider problem. Nature. 2004;432:499–502. doi: 10.1038/nature02978.
  71. Rand DG, Dreber A, Ellingsen T, Fudenberg D, Nowak MA. Positive Interactions Promote Public Cooperation. Science. 2009a;325:1272–1275. doi: 10.1126/science.1177418.
  72. Rand DG, Ohtsuki H, Nowak MA. Direct reciprocity with costly punishment: Generous tit-for-tat prevails. J Theor Biol. 2009b;256:45–57. doi: 10.1016/j.jtbi.2008.09.015.
  73. Ratnieks FLW, Foster KR, Wenseleers T. Conflict resolution in insect societies. Annu Rev Entomol. 2006;51:581–608. doi: 10.1146/annurev.ento.51.110104.151003.
  74. Santos FC, Pacheco JM, Lenaerts T. Evolutionary dynamics of social dilemmas in structured heterogeneous populations. Proc Natl Acad Sci USA. 2006;103:3490–3494. doi: 10.1073/pnas.0508201103.
  75. Santos FC, Santos MD, Pacheco JM. Social diversity promotes the emergence of cooperation in public goods games. Nature. 2008;454:213–216. doi: 10.1038/nature06940.
  76. Sekiguchi T, Nakamaru M. Effect of the presence of empty sites on the evolution of cooperation by costly punishment in spatial games. J Theor Biol. 2009;256:297–304. doi: 10.1016/j.jtbi.2008.09.025.
  77. Shinada M, Yamagishi T, Ohmura Y. False friends are worse than bitter enemies: “Altruistic” punishment of in-group members. Evolution and Human Behavior. 2004;25:379–393.
  78. Sigmund K. Punish or perish? Retaliation and collaboration among humans. Trends Ecol Evol. 2007;22:593–600. doi: 10.1016/j.tree.2007.06.012.
  79. Sigmund K, Hauert C, Nowak MA. Reward and punishment. Proc Natl Acad Sci USA. 2001;98:10757–10762. doi: 10.1073/pnas.161155698.
  80. Szabó G, Tőke C. Evolutionary prisoner’s dilemma game on a square lattice. Phys Rev E. 1998;58:69–73.
  81. Taylor PD. Altruism in viscous populations? an inclusive fitness model. Evol Ecol. 1992;6:352–356.
  82. Tibbetts EA, Dale J. A socially enforced signal of quality in a paper wasp. Nature. 2004;432:218–222. doi: 10.1038/nature02949.
  83. Traulsen A, Hauert C, Brandt De Silva H, Nowak MA, Sigmund K. Exploration dynamics in evolutionary games. Proc Natl Acad Sci USA. 2009;106:709–712. doi: 10.1073/pnas.0808450106.
  84. Traulsen A, Nowak MA. Evolution of cooperation by multilevel selection. Proc Natl Acad Sci USA. 2006;103:10952–10955. doi: 10.1073/pnas.0602530103.
  85. Trivers RL. The evolution of reciprocal altruism. Q Rev Biol. 1971;46:35–57.
  86. Wahl LM, Nowak MA. The continuous prisoner’s dilemma: II. Linear reactive strategies with noise. J Theor Biol. 1999;200:323–338. doi: 10.1006/jtbi.1999.0997.
  87. Wedekind C, Milinski M. Human cooperation in the simultaneous and the alternating Prisoner’s Dilemma: Pavlov versus Generous Tit-for-Tat. Proc Natl Acad Sci USA. 1996;93:2686–2689. doi: 10.1073/pnas.93.7.2686.
  88. Wenseleers T, Ratnieks FLW. Enforced altruism in insect societies. Nature. 2006;444:50. doi: 10.1038/444050a.
  89. West SA, Griffin AS, Gardner A. Evolutionary explanations for cooperation. Curr Biol. 2007;17:R661–672. doi: 10.1016/j.cub.2007.06.004.
  90. West SA, Gardner A. Altruism, spite, and greenbeards. Science. 2010;327:1341–1344. doi: 10.1126/science.1178332.
  91. Wu J, Zhang B, Zhou Z, He Q, Zheng X, Cressman R, Tao Y. Costly punishment does not always increase cooperation. Proc Natl Acad Sci USA. 2009 doi: 10.1073/pnas.0905918106.
  92. Wilson DS. A theory of group selection. Proc Natl Acad Sci USA. 1975;72:143–146. doi: 10.1073/pnas.72.1.143.
  93. Wilson DS, Pollock GB, Dugatkin LA. Can altruism evolve in purely viscous populations? Evol Ecol. 1992;6:331–341.
  94. Wilson DS, Wilson EO. Rethinking the Theoretical Foundation of Sociobiology. Quarterly Review of Biology. 2007;82:327–348. doi: 10.1086/522809.
  95. Wilson EO. Sociobiology: the new synthesis. Harvard Press; Cambridge, Mass: 1975. p. 697.
  96. Yamagishi T. The provision of a sanctioning system as a public good. J Pers Soc Psychol. 1986;51:110–116.
