Proceedings of the National Academy of Sciences of the United States of America. 2009 Sep 28;106(41):17448–17451. doi: 10.1073/pnas.0905918106

Costly punishment does not always increase cooperation

Jia-Jia Wu a, Bo-Yu Zhang a,b, Zhen-Xing Zhou b, Qiao-Qiao He a, Xiu-Deng Zheng a, Ross Cressman c,1, Yi Tao a,1
PMCID: PMC2765097  PMID: 19805085

Abstract

In a pairwise interaction, an individual who uses costly punishment must pay a cost in order that the opponent incurs a cost. It has been argued that individuals will behave more cooperatively if they know that their opponent has the option of using costly punishment. We examined this hypothesis by conducting two repeated two-player Prisoner's Dilemma experiments, which differed in the payoffs associated with cooperation, with university students from Beijing as participants. In these experiments, the level of cooperation either stayed the same or actually decreased when compared with the control experiments in which costly punishment was not an option. Because similar experiments with university students from Boston found that cooperation did increase with costly punishment, we argue that this result is likely due to differences in cultural attitudes to cooperation and punishment.

Keywords: antisocial punishment, cultural effects, experimental outcome, Prisoner's Dilemma repeated game, reputation


Costly (altruistic) punishment means paying a cost for another individual to incur a cost (1) and has been advanced as a key mechanism to explain cooperative behavior in human societies (1, 2). A great deal of research, both through experimental studies and theoretical models (see also refs. 3–13), has examined whether the option of costly punishment promotes cooperative behavior in one-shot multiplayer public goods games where noncooperative free riders gain at no cost the same group benefits from the “good” as those individuals who contributed to it. For treatments that include costly punishment, players may punish other group members after they are informed what contribution each person made. Empirically, this option has consistently increased the level of cooperation as measured by average individual contribution, especially in typical experiments designed so subjects either do not interact more than once or do not know that they have. Of particular relevance for us is the additional experimental finding (1, 14) that different societal norms alter the prevalence of costly punishment and its effectiveness in promoting cooperative behavior.

To a lesser extent, costly punishment has also been examined in repeated games and in other games where players can base their behavior at later stages of an interaction on earlier actions of their opponents. For instance, players are now able to retaliate against an individual who previously punished them and reputation effects play a role (6). Experimental evidence based on public goods games with counterpunishment (12, 13) again shows that costly punishment promotes cooperative behavior. Our experiments are instead based on the repeated two-player Prisoner's Dilemma (PD) game with and without the option of costly punishment. In this regard, Dreber et al. (15) recently performed repeated PD experiments where, in each round, participants chose between cooperation, defection and possibly costly punishment. The subjects participating in their experiments were college/university students in the Boston area. Along with the result that people who gain the highest payoff in this repeated game tend not to use costly punishment (i.e., winners don't punish), they found that the prevalence of cooperation increases with the option of costly punishment.

To determine whether societal norms play a role in these results, we replicated their experiments with the same experimental design (see Methods) at the Computer Lab in the School of Life Sciences, Beijing Normal University with university students from Beijing as subjects. That is, following Dreber et al. (15), we performed two control experiments (C1 and C2) and two treatments (T1 and T2). In the control experiments, two subjects chosen at random play a standard repeated PD game without knowing how many rounds there will be in this two-player game but knowing that, after one round is complete, there is a 0.75 probability of another round. There are only two options in every round: cooperate and defect. Cooperation (C) means paying 1 unit for the other person to receive 2 units (in C1 and T1) or 3 units (in C2 and T2). Defection (D) means gaining 1 unit at a cost of 1 unit for the other person. In the treatments, people have three options in every round: cooperate, defect, or punish. Punishment (P) means paying 1 unit for the other person to lose 4 units. Subjects were told the payoff consequences of each strategy choice but not the name we associated with it to avoid any predisposition on their part to certain labels.

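That is, writing each entry as the payoff to the row player, with rows indexed by that player's own choice and columns by the opponent's choice (in the order C, D for the controls and C, D, P for the treatments), the payoff matrices for the experiments C1 and T1 are given by

$$
\begin{pmatrix} 1 & -2 \\ 3 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & -2 & -5 \\ 3 & 0 & -3 \\ 1 & -2 & -5 \end{pmatrix},
$$

respectively, and the payoff matrices for the experiments C2 and T2 are given by

$$
\begin{pmatrix} 2 & -2 \\ 4 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 2 & -2 & -5 \\ 4 & 0 & -3 \\ 2 & -2 & -5 \end{pmatrix},
$$

respectively. These payoff matrices are identical to those in Dreber et al. (15).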
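The payoff rules stated above for C, D, and P determine every matrix entry: a player's payoff in a round is the effect of their own action on themselves plus the effect of the opponent's action on them. The following is a minimal sketch of that bookkeeping; the function names and the `EXPERIMENTS` table are hypothetical helpers introduced here for illustration, not part of the original study's software.

```python
def action_effects(benefit, with_punishment):
    """Map each action to (change in own payoff, change in opponent's payoff)."""
    effects = {
        "C": (-1, benefit),  # pay 1 unit so the other person receives `benefit`
        "D": (1, -1),        # gain 1 unit at a cost of 1 unit to the other person
    }
    if with_punishment:
        effects["P"] = (-1, -4)  # pay 1 unit so the other person loses 4 units
    return effects


def payoff_matrix(benefit, with_punishment):
    """Row player's payoff for each (own action, opponent action) pair."""
    effects = action_effects(benefit, with_punishment)
    actions = list(effects)
    return {
        (own, other): effects[own][0] + effects[other][1]
        for own in actions
        for other in actions
    }


# Hypothetical table: experiment name -> (benefit of cooperation, punishment option).
EXPERIMENTS = {"C1": (2, False), "T1": (2, True), "C2": (3, False), "T2": (3, True)}

for name, (benefit, with_punishment) in EXPERIMENTS.items():
    matrix = payoff_matrix(benefit, with_punishment)
    actions = list(action_effects(benefit, with_punishment))
    print(name)
    for own in actions:
        print("  ", own, [matrix[(own, other)] for other in actions])
```

Note that in this bookkeeping the row for P equals the row for C: a punisher pays the same 1 unit as a cooperator, and the difference lies only in what the opponent receives.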

Results

Our first experimental results, depicted in Fig. 1A, show that costly punishment does not increase cooperation. Specifically, in the two control experiments, C1 and C2, the ratio of cooperation to all decisions is 24.62% in C1, and 23.95% in C2. In T2, the frequency of cooperation is 26.45%, which is slightly higher than in C2 but there is no significant difference (P = 0.2124, Fisher's two-tailed exact test). Furthermore, in T1, the frequency of cooperation, 18.25%, is substantially smaller than in C1 (P = 0.0544). That is, costly punishment decreases the frequency of cooperation in T1. In summary, comparing each control experiment with its treatment, we find that the option of costly punishment either makes no significant difference or actually decreases the amount of cooperation in our experiments.
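The comparisons above use Fisher's two-tailed exact test on 2×2 tables of decision counts (cooperative vs. other decisions, control vs. treatment). The text reports only percentages, so the counts in the sketch below are purely hypothetical toy numbers chosen to show the mechanics; they are not the study's data, and the function name is an illustrative helper rather than the authors' analysis code.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Unnormalized hypergeometric weight of each possible top-left cell value,
    # holding all row and column totals fixed.
    weights = {
        x: comb(row1, x) * comb(row2, col1 - x)
        for x in range(lowest, highest + 1)
    }
    observed = weights[a]
    # Add up every table that is no more likely than the observed one.
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / comb(total, col1)


# Hypothetical toy counts (NOT from the paper):
# rows = two conditions, columns = (cooperative decisions, other decisions).
p_value = fisher_exact_two_tailed(8, 2, 1, 5)
print(round(p_value, 4))
```

Integer weights are compared before dividing, which avoids floating-point ties when deciding which tables count as at least as extreme as the observed one.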

Fig. 1.

Average frequency of cooperation in each session. A is from our experiments, and B is from Dreber et al. (15) (see their supplementary figure 2A). Error bars represent the standard error of the mean.

These results are unexpected because previous experiments on the repeated PD game [see Fig. 1B, which is reproduced with permission from Dreber et al. (15)] and on public goods games consistently show that costly punishment increases the frequency of cooperation. In particular, our result for T1 is completely opposite to that of previous experiments. Our study seems to be the first experiment that exhibits this latter outcome, although theoretical models that analyze evolutionary dynamics under direct reciprocity (16) also suggest costly punishment may not promote cooperation.

Although Dreber et al. (15) find empirically that there is significantly more cooperation in each treatment than in its corresponding control experiment, there is no significant difference in the average payoff (Fig. 2B) because any extra payoff due to more subjects mutually cooperating is offset by lower payoffs for both the punishers and those punished. That is, in their experiments, the option of costly punishment is neutral with regard to advantage to the group. Because cooperation does not increase in our treatments, we obtain the expected result that average payoff decreases (Fig. 2A). This decrease is more significant in C1 vs. T1 (Mann–Whitney test: P = 5.0324 × 10⁻⁶ and z = −4.5634) than in C2 vs. T2 (P = 0.2092 and z = 1.2558). Thus, our experiments argue against group selection in cooperative games as a mechanism to explain the appearance of costly punishment in human societies (7). This reinforces the conclusion (15) that the use of costly punishment evolved in human societies for reasons other than to promote cooperation.

Fig. 2.

Average payoff per round in each session. A is from our experiments, and B is from Dreber et al. (15) (see their supplementary figure 2B). Also shown is the payoff per round for mutual cooperation (ALLC) in each session. Clearly, punishment does not provide any advantage for the group, and all control and treatment payoffs are lower than the ALLC payoff.

The most likely explanation for the different results in the two studies lies in cultural effects in China compared with the U.S. (see also the discussion at the end of Methods where we rule out that these differences are based on the level of monetary rewards). In particular, Chinese and Americans have different cultural attitudes toward reputation and authority. In China, a person's reputation is enhanced by establishing and maintaining a network of two-person dyadic relationships, known as “guanxi,” with more people who themselves have good reputations (17). Reputation through guanxi networks is clearly related to indirect reciprocity effects [i.e., when my behavior toward you depends on how you behave with others (11)]. Along with this emphasis on collectivism compared with individualism, Chinese society is hierarchical with deference given to authority figures (18). Because neither indirect reciprocity nor a predetermined authority figure is possible in the experimental setup of our repeated two-player PD game, these cultural characteristics do not provide a reason for our participants in Beijing to increase the level of cooperation in the presence of costly punishment. However, direct reciprocity (i.e., when my behavior toward you depends on how you have behaved to me) is relevant in our repeated games (16) and forms the basis of a player's reputation (in the sense of being able to predict how opponents will react to each other's actions) during the course of several rounds between the same two players. This type of dyadic reputation [which is more important in Western society (17)] can be used to give theoretical predictions close to the outcome of the experiments in Boston (16). Heuristically, the level of cooperation increases and that of defection decreases when individuals who have a reputation of defecting run the risk of being punished.

It also appears that societal differences are the main factor for varying attitudes to punishment in experiments based on the multiplayer public goods game (1). The variation found by Herrmann et al. (1) is most clear for antisocial punishment behavior (i.e., punishing the high contributors to the public good because these are suspected of being the primary punishers of the low contributors), which was rare for participants from democratic societies with advanced market economies and common in more traditional societies (19). Interestingly, in these experiments conducted in 16 comparable participant pools of undergraduate university students, the results on antisocial punishment for the two pools in the U.S. and China (Boston and Chengdu, respectively) are quite similar [figure 1 in Herrmann et al. (1)]. One explanation of this result is that indirect reciprocity occurs in these multiplayer games and so reputation (in either of the senses discussed in the previous paragraph) becomes a factor in people's behavior. That is, antisocial punishment is rare in the U.S. and China due to reputation even though the basis of this reputation differs in the two societies.

Our experimental results provide further evidence, based on the two-player repeated PD game, that participants' attitudes to punishment depend on their cultural background. Specifically, Fig. 3A plots the histogram of rounds in which costly punishment is first used. Of particular interest is that the frequency of first P use during round 1 in our experiments is 24.47% in T1 and 29.82% in T2. These rates are much higher than in Dreber et al. (15), where this frequency is <5% in both T1 and T2 (Panel B). Participants who use costly punishment in round 1 are indiscriminately punishing before they know anything about their opponent's behavior. Furthermore, the combined use of P and D in the first round of our treatments is much higher than in those of Dreber et al. (15) (>69% in both T1 and T2 compared with <28% in Boston). The outcome of experiments in the U.S. is consistent with participants using direct reciprocity to foster higher payoffs through mutual cooperation by initially exhibiting cooperation. They are willing to take this risk due to a belief that it will be compensated for later through a better reputation. As argued above, guanxi reputation effects are not relevant for Chinese participants in the repeated PD game. Punishment in the first round could then be an attempt to establish oneself as a dominant authority figure who is willing to punish in later rounds if dissatisfied with how the interaction proceeds. Such a strategy (summarized by the Chinese phrase “xia ma wei,” which means to deal someone a head-on blow at first encounter) may serve as a strong admonition (perhaps even intimidation) to an opponent. It is shown in the SI that there is no trend in the level of costly punishment or of defection used by our subjects in the first round over the course of either treatment session. Thus, the high frequency of first P use in round 1 as shown in Fig. 3A is a stable phenomenon that does not decrease as our subjects interact with more opponents.

Fig. 3.

Histogram of rounds in which costly punishment (P) is first used by a subject in T1 and in T2. Panel A is from our experiments where costly punishment was used 23 times in the first round of the 94 interactions where it was used at least once in T1 and 17 times in the first round of the 57 interactions where it was used at least once in T2. Panel B is from Dreber et al. (15) (see their supplementary figure 6).
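The first-round percentages quoted in the text follow directly from the counts in this caption. As a small arithmetic check (the variable and function names are illustrative only):

```python
# Counts quoted in the Fig. 3 caption:
# (interactions with first P use in round 1, interactions with any P use).
first_use_counts = {"T1": (23, 94), "T2": (17, 57)}


def first_round_share(first_round, total):
    """Percentage of P-using interactions in which P was first used in round 1."""
    return 100 * first_round / total


for treatment, (first_round, total) in first_use_counts.items():
    print(f"{treatment}: {first_round_share(first_round, total):.2f}%")
```

This reproduces the 24.47% (T1) and 29.82% (T2) reported in the Results, which confirms that those frequencies are taken relative to the interactions in which P was used at least once, not relative to all interactions.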

These empirical discrepancies, on the initial use of costly punishment and its ongoing effect on the overall prevalence of cooperation and average payoff for the repeated PD game, between experiments with subjects in the U.S. and in China using the identical experimental design raise many questions on how culture impacts strategy choice. To examine this further, we compared other results from our experiments to those reported elsewhere. These are given in the SI where we show that our experiments match other aspects of the effects of costly punishment found by Dreber et al. (15). In particular, we find that winners don't punish and that average payoff also decreases with the increased use of punishment after defection (i.e., winners were not merely lucky by being paired with people against whom punishment was unnecessary). We also find that defection increases in later rounds of this repeated game as typically occurs in such experiments (15, 20).

Methods

A total of 94 subjects (58 women, 36 men, mean age 22.8 years old) from Beijing Normal University, Beijing Jiaotong University, Beijing University of Posts and Telecommunication, and Institute of Zoology, Chinese Academy of Sciences participated voluntarily in a modified repeated Prisoner's Dilemma game at the computer lab in the School of Life Sciences, Beijing Normal University. The lab has 38 computers. We developed a computer program for the anonymous interactions of participants, which is similar to the software z-Tree used by Dreber et al. (15) in their experiments. Subjects did not know how many rounds there would be in one pairwise interaction, only that the probability of another round was 0.75. In each round of an interaction, the two participants chose simultaneously between all available options and, after the round, were shown the other person's choice and both payoff scores. At the end of each interaction, participants were told both their total scores and were then randomly rematched for another interaction. The experiments (four sessions) were conducted in October 2008, with an average of 24 participants playing an average of 22 interactions, for an average of 79 total rounds per subject.
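The stopping rule described above implies a geometric distribution for the length of an interaction. As a brief worked sketch (the symbols $L$ for the number of rounds and $\delta$ for the continuation probability are introduced here for illustration and do not appear in the original), with $\delta = 0.75$:

```latex
\Pr(L = k) = (1-\delta)\,\delta^{\,k-1}, \qquad k = 1, 2, \dots

\mathbb{E}[L] = \sum_{k=1}^{\infty} k\,(1-\delta)\,\delta^{\,k-1}
             = \frac{1}{1-\delta}
             = \frac{1}{1-0.75}
             = 4 .
```

Under this rule an interaction is expected to last 4 rounds. The reported averages (79 total rounds over 22 interactions, or about 3.6 rounds per interaction) are of the same order as this expectation.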

Each session began with the written instructions on how the repeated game is played being read to the subjects. These instructions in Chinese are a translation of those used in Dreber et al. (15). Each participant answered two test questions to verify that he/she understood the instructions and played a practice interaction with another subject. In each session, the subjects were paid a show-up fee of RMB15 (Ren Min Bi, the Chinese unit of currency) and given an initial payoff of 50 units to compensate for possible negative scores during the session. The additional income earned by each subject in a session was determined by his/her final score summed over all interactions multiplied by RMB0.2.

The average payment per subject per session was RMB28.3 and the average session length was 1 h. This compares with an average payment of U.S.$26 per session (that also lasted ≈1 h) in the experiments of Dreber et al. (15). Because the exchange rate at the time of our experiments was ≈RMB1 to U.S.$0.16, another possible explanation for the experimental differences between Beijing and Boston is that our subjects in China did not consider the consequences of their strategy choice as carefully as those in the U.S. and so were more willing to lose money to punish opponents for nonmonetary reasons. However, because the average weekly cost (including the cost of schooling, rent and food) to an undergraduate student at Beijing Normal University is between RMB250 and RMB300, our payment is ≈10% of this weekly cost, which is more than the comparable percentage for subjects in Boston. Thus, based on student living standards in Beijing and Boston, our payments provide at least as much of a monetary incentive to our subjects.
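The payment rule and the ≈10% figure can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported in Methods; it assumes (this is not stated explicitly in the text) that the 50-unit initial payoff is already included in the final score, and the function names are hypothetical.

```python
SHOW_UP_FEE_RMB = 15.0  # show-up fee per session, from Methods
RMB_PER_UNIT = 0.2      # conversion rate applied to the final score, from Methods


def session_payment(final_score_units):
    """Total payment in RMB; assumes the 50-unit initial payoff is in the score."""
    return SHOW_UP_FEE_RMB + RMB_PER_UNIT * final_score_units


def implied_final_score(payment_rmb):
    """Invert session_payment to recover the score behind a given payment."""
    return (payment_rmb - SHOW_UP_FEE_RMB) / RMB_PER_UNIT


average_payment = 28.3  # RMB, reported average per subject per session
print(f"implied average final score: {implied_final_score(average_payment):.1f} units")

for weekly_cost in (250, 300):  # reported range of weekly student costs in RMB
    share = 100 * average_payment / weekly_cost
    print(f"share of a {weekly_cost} RMB week: {share:.1f}%")
```

Across the reported cost range the average payment comes to roughly 9% to 11% of a student's weekly cost, consistent with the ≈10% stated above.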

Acknowledgments.

The authors thank Martin Nowak and David Rand for making the relevant figures of Dreber et al. (15) available for us to reproduce and for comments on our experiments; Erin Cressman and two anonymous referees for suggestions to improve earlier versions of this article; and Bai-Hua Wang, Shi-Chang Wang, Ting Ji, and Ling-Ling Deng for their contributions to our experiments.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/cgi/content/full/0905918106/DCSupplemental.

References

  • 1. Herrmann B, Thoni C, Gachter S. Antisocial punishment across societies. Science. 2008;319:1362–1366. doi: 10.1126/science.1153808.
  • 2. Fehr E, Gachter S. Altruistic punishment in humans. Nature. 2002;415:137–140. doi: 10.1038/415137a.
  • 3. Yamagishi T. The provision of a sanctioning system as a public good. J Pers Soc Psychol. 1986;51:110–116.
  • 4. Boyd R, Richerson PJ. Punishment allows the evolution of cooperation (or anything else) in sizable groups. Ethol Sociobiol. 1992;13:171–195.
  • 5. Clutton-Brock TH, Parker GA. Punishment in animal societies. Nature. 1995;373:209–216. doi: 10.1038/373209a0.
  • 6. Sigmund K, Hauert C, Nowak MA. Reward and punishment. Proc Natl Acad Sci USA. 2001;98:10757–10762. doi: 10.1073/pnas.161155698.
  • 7. Boyd R, Gintis H, Bowles S, Richerson PJ. The evolution of altruistic punishment. Proc Natl Acad Sci USA. 2003;100:3531–3535. doi: 10.1073/pnas.0630443100.
  • 8. Fowler JH. Altruistic punishment and the origin of cooperation. Proc Natl Acad Sci USA. 2005;102:7047–7049. doi: 10.1073/pnas.0500938102.
  • 9. Bochet O, Page T, Putterman L. Communication and punishment in voluntary contribution experiments. J Econ Behav Org. 2006;60:11–26.
  • 10. Gurerk O, Irlenbusch B, Rockenbach B. The competitive advantage of sanctioning institutions. Science. 2006;312:108–111. doi: 10.1126/science.1123633.
  • 11. Rockenbach B, Milinski M. The efficient interaction of indirect reciprocity and costly punishment. Nature. 2006;444:718–723. doi: 10.1038/nature05229.
  • 12. Nikiforakis N. Punishment and counter-punishment in public good games: Can we really govern ourselves? J Public Econ. 2008;92:91–112.
  • 13. Denant-Boemont L, Masclet D, Noussair CN. Punishment, counterpunishment and sanction enforcement in a social dilemma experiment. Econ Theory. 2007;33:145–167.
  • 14. Henrich J, et al. Costly punishment across human societies. Science. 2006;312:1767–1770. doi: 10.1126/science.1127333.
  • 15. Dreber A, Rand DG, Fudenberg D, Nowak MA. Winners don't punish. Nature. 2008;452:348–351. doi: 10.1038/nature06723.
  • 16. Rand DG, Ohtsuki H, Nowak MA. Direct reciprocity with costly punishment: Generous tit-for-tat prevails. J Theor Biol. 2009;256:45–57. doi: 10.1016/j.jtbi.2008.09.015.
  • 17. Standifird SS. Using guanxi to establish corporate reputation in China. Corporate Reputat Rev. 2006;9:171–178.
  • 18. Dien DS. Chinese authority-directed orientation and Japanese peer-group orientation: Questioning the notion of collectivism. Rev Gen Psychol. 1999;3:372–385.
  • 19. Gintis H. Punishment and cooperation. Science. 2008;319:1345–1346. doi: 10.1126/science.1155333.
  • 20. Selten R, Stoecker R. End behavior in sequences of finite prisoner's dilemma supergames. J Econ Behav Org. 1986;7:47–70.
