PLOS One. 2013 Nov 1;8(11):e77886. doi: 10.1371/journal.pone.0077886

Adaptive Dynamics of Extortion and Compliance

Christian Hilbe 1,*, Martin A Nowak 2, Arne Traulsen 1
Editor: Matjaz Perc
PMCID: PMC3815207  PMID: 24223739

Abstract

Direct reciprocity is a mechanism for the evolution of cooperation. For the iterated prisoner’s dilemma, a new class of strategies has recently been described, the so-called zero-determinant strategies. Using such a strategy, a player can unilaterally enforce a linear relationship between his own payoff and the co-player’s payoff. In particular the player may act in such a way that it becomes optimal for the co-player to cooperate unconditionally. In this way, a player can manipulate and extort his co-player, thereby ensuring that his own payoff never falls below the co-player’s payoff. However, using a compliant strategy instead, a player can also ensure that his own payoff never exceeds the co-player’s payoff. Here, we use adaptive dynamics to study when evolution leads to extortion and when it leads to compliance. We find remarkable cyclic dynamics: in sufficiently large populations, extortioners play a transient role, helping the population to move from selfish strategies to compliance. Compliant strategies, however, can be subverted by altruists, which in turn give rise to selfish strategies. Whether cooperative strategies are favored in the long run critically depends on the size of the population; we show that cooperation is most abundant in large populations, in which case average payoffs approach the social optimum. Our results are not restricted to the case of the prisoner’s dilemma, but can be extended to other social dilemmas, such as the snowdrift game. Iterated social dilemmas in large populations do not lead to the evolution of strategies that aim to dominate their co-player. Instead, generosity succeeds.

Introduction

Repeated games are among the best-studied objects in game theory, and the iterated prisoner’s dilemma has stimulated research on the evolution of cooperation for more than five decades [1]–[5]. The prisoner’s dilemma describes a social dilemma between two players, each having the choice whether to cooperate or to defect. When both cooperate, they each receive a mutual reward $R$, which exceeds their payoff for mutual defection, $P$. But if one player cooperates and the other defects, then the defector gets the highest payoff $T$, whereas the cooperator ends up with the lowest payoff $S$. Thus, if the game is played only once (or for a known finite number of rounds), then mutual defection is the only equilibrium. However, when players cannot anticipate how often the game will be played, cooperative solutions become feasible [3], [5], [6].

Researchers from diverse disciplines have used the iterated prisoner’s dilemma to discuss the potential of direct reciprocity for the evolution of cooperation [7]–[19]. However, recently Press and Dyson [20] discovered that the infinitely repeated prisoner’s dilemma also contains strategies that allow the manipulation and extortion of opponents [21]–[25]. To show this, they first proved that there are simple strategies, which only depend on the outcome of the previous round, such that each side can enforce a linear relationship between the payoffs of the two players. More precisely, suppose player 1 applies a memory-one strategy $\mathbf{p} = (p_R, p_S, p_T, p_P)$, where $p_i$ is the probability to cooperate after obtaining a payoff $i \in \{R, S, T, P\}$ in the previous round (such a strategy also needs to specify a move for the first round; for infinitely iterated games, however, the first round can often be neglected). Moreover, assume that there are three constants $\alpha, \beta, \gamma$ such that $\mathbf{p}$ can be written as

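$$\begin{aligned} p_R &= \alpha R + \beta R + \gamma + 1, \\ p_S &= \alpha S + \beta T + \gamma + 1, \\ p_T &= \alpha T + \beta S + \gamma, \\ p_P &= \alpha P + \beta P + \gamma. \end{aligned} \tag{1}$$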

Press and Dyson [20] showed that when player 1 applies such a strategy against an opponent with arbitrary strategy $\mathbf{q}$, then the player’s payoff $\pi(\mathbf{p},\mathbf{q})$ and the opponent’s payoff $\pi(\mathbf{q},\mathbf{p})$ fulfill the linear relation

$$\alpha\,\pi(\mathbf{p},\mathbf{q}) + \beta\,\pi(\mathbf{q},\mathbf{p}) + \gamma = 0. \tag{2}$$

Since their proof required certain determinants to vanish, Press and Dyson called such strategies $\mathbf{p}$ zero-determinant strategies. At first sight, zero-determinant strategies might seem to be a mere mathematical curiosity [26]. However, their existence has several surprising consequences. Press and Dyson [20] discovered that certain zero-determinant strategies can guarantee that a player always obtains at least the opponent’s payoff. They showed that by setting $\gamma = -(\alpha+\beta)P$, a zero-determinant strategist can enforce the relation

$$\pi(\mathbf{p},\mathbf{q}) - P = \chi\,\bigl[\pi(\mathbf{q},\mathbf{p}) - P\bigr], \tag{3}$$

where $\chi \geq 1$ is called the extortion factor [20], [23]. Such extortioner strategies $\mathbf{p}$ guarantee that the player’s own surplus (over the maximin value $P$) exceeds the co-player’s surplus by a fixed percentage. In particular, when the typical payoff relations $T > R > P > S$ hold, the payoff of an extortioner is never below the payoff of its co-player, suggesting that extortioners would dominate any evolutionary opponent [20].
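The step from the general linear relation (2) to the extortion relation (3) can be written out in two lines. The following is a short worked derivation rather than a quotation from the original: it assumes the shorthand $\pi$ and $\pi'$ for the payoffs of the focal player and the co-player, the choice $\gamma = -(\alpha+\beta)P$, and $\alpha \neq 0$. The identification $\chi = -\beta/\alpha$ is a consequence of this calculation.

```latex
\begin{align*}
\alpha\,\pi + \beta\,\pi' + \gamma &= 0, \qquad \gamma = -(\alpha+\beta)P \\
\Longrightarrow\quad \alpha\,(\pi - P) + \beta\,(\pi' - P) &= 0 \\
\Longrightarrow\quad \pi - P &= \chi\,(\pi' - P), \qquad \chi = -\beta/\alpha .
\end{align*}
```

If both payoffs are at least $P$ and $\chi \geq 1$, the last line gives $\pi - P \geq \pi' - P$, which is the statement that the extortioner's payoff is never below the co-player's payoff.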

On the other hand, Stewart and Plotkin [21], [25] considered a generous counterpart to extortioners. Starting from $\gamma = -(\alpha+\beta)R$, they investigated zero-determinant strategists that enforce the relation

$$\pi(\mathbf{p},\mathbf{q}) - R = \chi\,\bigl[\pi(\mathbf{q},\mathbf{p}) - R\bigr], \tag{4}$$

where again $\chi \geq 1$. With such a generous strategy, a player can ensure that her payoff is never above the opponent’s payoff. In [23] such players are called compliers. Although compliant strategies seem to be too generous to succeed in competitive environments, Stewart and Plotkin [21] showed that compliers do surprisingly well in round-robin tournaments, in which the compliant strategy outperformed all other strategies (including the most prominent strategies All D, Tit for Tat, Win-Stay Lose-Shift, and an extortioner strategy). Moreover, as shown in [25], a large fraction of compliant strategies is “evolutionary robust”, meaning that no mutant with another strategy can have a selective advantage over a resident population of compliers.
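The enforced relations for extortioners and compliers can be checked numerically. The sketch below is illustrative only. It assumes the payoff values T = 5, R = 3, P = 1, S = 0, the component order (p_R, p_S, p_T, p_P), a scaling constant `phi`, and the helper names `zd_strategy` and `long_run_payoffs`, none of which are taken from the text. It builds one extortionate and one compliant strategy from three constants, computes long-run payoffs from the stationary distribution of the four-state Markov chain, and prints both sides of the enforced relation for a few random memory-one opponents.

```python
import numpy as np

# Illustrative one-shot payoffs with T > R > P > S (assumed values, not from the text).
T, R, P, S = 5.0, 3.0, 1.0, 0.0

def zd_strategy(alpha, beta, gamma):
    """Cooperation probabilities (p_R, p_S, p_T, p_P) built from three constants."""
    p = np.array([
        alpha * R + beta * R + gamma + 1.0,
        alpha * S + beta * T + gamma + 1.0,
        alpha * T + beta * S + gamma,
        alpha * P + beta * P + gamma,
    ])
    if np.any(p < -1e-12) or np.any(p > 1.0 + 1e-12):
        raise ValueError("constants do not give valid probabilities")
    return np.clip(p, 0.0, 1.0)

def long_run_payoffs(p, q):
    """Stationary payoffs (player 1, player 2) for memory-one strategies p and q."""
    q_view = np.asarray(q)[[0, 2, 1, 3]]  # player 2 sees outcomes S and T swapped
    rows = []
    for pc, qc in zip(p, q_view):
        # Next state in player 1's view: (C,C), (C,D), (D,C), (D,D).
        rows.append([pc * qc, pc * (1 - qc), (1 - pc) * qc, (1 - pc) * (1 - qc)])
    M = np.array(rows)
    # Solve v M = v together with sum(v) = 1 in the least-squares sense.
    A = np.vstack([M.T - np.eye(4), np.ones(4)])
    b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
    v = np.linalg.lstsq(A, b, rcond=None)[0]
    return float(v @ np.array([R, S, T, P])), float(v @ np.array([R, T, S, P]))

chi, phi = 2.0, 0.1  # extortion factor and scaling constant (both assumed)
extortioner = zd_strategy(phi, -phi * chi, phi * (chi - 1.0) * P)
complier = zd_strategy(phi, -phi * chi, phi * (chi - 1.0) * R)

rng = np.random.default_rng(0)
for _ in range(5):
    q = rng.uniform(0.05, 0.95, size=4)  # random opponent with interior probabilities
    own, other = long_run_payoffs(extortioner, q)
    print("extortioner:", round(own - P, 6), round(chi * (other - P), 6))
    own, other = long_run_payoffs(complier, q)
    print("complier:   ", round(own - R, 6), round(chi * (other - R), 6))
```

In each printed line the two numbers should coincide up to rounding, whatever the opponent does. Opponents are drawn with interior probabilities so that the Markov chain has a unique stationary distribution; deterministic opponents may need the first-round move to be specified.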

Zero-determinant strategies thus have remarkable conceptual properties, but comparatively little is known about which of these strategies would evolve in a natural setup. It has recently been argued that extortioners are evolutionarily unstable [22]: since extortioners demand an extortionate share of any surplus, two interacting extortioners would end up with a surplus of zero. Moreover, numerical simulations indicate that zero-determinant strategies in general are disfavored by selection in sufficiently large populations [23]. However, this does not preclude certain zero-determinant strategies, such as compliers, from playing an important role, as recently demonstrated by [21], [25]. To identify such important strategies, researchers have focused on particular limiting cases of zero-determinant strategies, such as extortioners, equalizers, and compliers. Moreover, to investigate the dynamics of these strategies, previous studies either had to resort to individual-based simulations, or they needed to restrict attention to a finite subset of representative strategies [22], [23], [25].

Instead, it is the aim of this study to provide an analytical framework that allows us to study the evolutionary dynamics of all zero-determinant strategies. Constructing an analytical model for the evolutionary dynamics of the iterated prisoner’s dilemma is not straightforward. Already for simple memory-one strategies, a calculation of the resulting payoffs may become prohibitively laborious (for an example see [22]). To derive an analytical model of the dynamics, we will thus focus on an appropriate super-set of zero-determinant strategies: the set of all memory-one strategies that enforce a linear relation of the form (2), as in [25]. We show that if all players apply such strategies then the payoffs and the resulting adaptive dynamics take a remarkably simple form. In particular, we find that populations either move to the edge of compliers, or they move towards a neighborhood of unconditional defectors $AllD$. In this process, extortioners play an important role, as they can neutrally invade unconditional defectors, thereby promoting the emergence of compliance. On the other hand, altruistic strategies (such as unconditional cooperators) have the opposite effect: they can subvert a population of compliers, giving rise to the evolution of selfish strategies. Which of these strategies gets the upper hand in the long run critically depends on the population size. While small populations favor the emergence of selfish strategies, compliance succeeds as populations become sufficiently large.

Results

In the following, let us focus on the set of all memory-one strategies that enforce a linear relation between the payoffs of the two players. As players cannot set their own score [20], it is reasonable to consider only those strategies fulfilling Eq. (2) for which Inline graphic (formally this means that we exclude the strategy Inline graphic from the set of zero-determinant strategies, which is fully dependent on the initial condition). In the appendix we show that this subset of strategies is then identical to the set

graphic file with name pone.0077886.e028.jpg (5)

Instead of the three parameters Inline graphic, Inline graphic and Inline graphic, this specification only requires two free parameters, Inline graphic and Inline graphic. Both parameters allow an intuitive interpretation (see Figure 1). The parameter Inline graphic gives the correlation between both players’ payoffs. A factor Inline graphic means that a player enforces a positive linear relation between the payoffs, whereas for Inline graphic, the payoffs obey a negative linear relation. The parameter Inline graphic, on the other hand, can be considered as the payoff that a player would get against himself (see Figure 1). We thus call the parameter Inline graphic the baseline payoff, and we refer to Inline graphic as the slope of an Inline graphic–strategy (in fact, the slope Inline graphic is just the inverse of the extortion factor Inline graphic).

Figure 1. Illustration of zero-determinant strategies for an iterated prisoner’s dilemma with Inline graphic, Inline graphic, Inline graphic and Inline graphic.

All graphs show the possible payoffs of the focal player (on the horizontal axis) and the resulting payoff for the opponent (on the vertical axis) as colored areas or lines. The colored points represent the payoff pairs for Inline graphic randomly chosen opponents. (a) In general, as for example when the focal player applies the win-stay lose-shift strategy Inline graphic, the possible payoff pairs form a convex polygon. (b) However, if the focal player applies a compliant strategy, the set of all possible payoff pairs degenerates to a line with positive slope Inline graphic, which intersects the diagonal at Inline graphic. (c) An extortioner enforces payoff relations that are on a line with positive slope Inline graphic, intersecting the diagonal at Inline graphic. (d) The strategy Inline graphic enforces a linear relation between the payoffs of the two players although Inline graphic is not a zero-determinant strategy for the given parameters, as described in the Methods section.

We consider an iterated prisoner’s dilemma and make the common assumption that the payoffs of the one-shot game fulfill the relation Inline graphic, and Inline graphic, such that mutual cooperation is the best outcome and mutual defection is the worst outcome. As payoffs then need to be in the interval Inline graphic, and because memory-one strategies need to consist of four probabilities, there are restrictions on the linear relations that a player can enforce. In the Methods section, we show that a pair (Inline graphic) is enforceable if

graphic file with name pone.0077886.e059.jpg (6)

For example, the set of extortioners corresponds to the set of pairs (Inline graphic) with Inline graphic and Inline graphic. The set of compliers is given by those memory-one strategies for which Inline graphic and Inline graphic. In the following, we study the evolution of zero-determinant strategies by considering the dynamics on the (Inline graphic)-plane. That is, we assume that each player determines an enforceable pair Inline graphic and then picks a Inline graphic from the corresponding class of Inline graphic-strategies. Depending on the player’s performance in the game, the enforceable pair Inline graphic may then be adopted by others, a process that we will describe with adaptive dynamics and individual-based simulations.

Adaptive Dynamics in Infinite Populations

In order to derive the adaptive dynamics on the Inline graphic–plane, we first have to calculate the payoffs for each player. While the payoff function for general memory-one strategies is highly non-trivial, these calculations become straightforward for Inline graphic-strategies. Suppose a player wants to enforce the linear relation (Inline graphic) by choosing an appropriate Inline graphic-strategy Inline graphic, whereas the co-player enforces the pair (Inline graphic) by choosing Inline graphic. Then the payoffs are implicitly given by

graphic file with name pone.0077886.e077.jpg (7)

From this, we recover the result that a player can set the co-player’s score to a fixed value [20], [27]: by choosing Inline graphic, player 2 can guarantee that the first player’s payoff is Inline graphic (i.e., the set of so-called equalizers corresponds to all enforceable pairs (Inline graphic) with Inline graphic).

Excluding the two non-generic cases that both players enforce the most extreme payoff relations (Inline graphic or Inline graphic), this system of two linear equations has a unique solution for the payoffs

graphic file with name pone.0077886.e084.jpg (8)
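The solution of the two linear equations can be written out under explicit assumptions. The sketch below is a reconstruction, not a copy of the original formula. It assumes that a player with baseline payoff $l$ and slope $s$ enforces $\pi' = s\,\pi + (1-s)\,l$ on the co-player's payoff $\pi'$, and that the co-player with parameters $(l', s')$ enforces the mirrored relation. The symbols $l$, $s$, $\pi$ and $\pi'$ are notation chosen for this sketch.

```latex
\begin{align*}
\pi' &= s\,\pi + (1-s)\,l, &
\pi  &= s'\,\pi' + (1-s')\,l' \\[4pt]
\Longrightarrow\quad
\pi  &= \frac{s'(1-s)\,l + (1-s')\,l'}{1 - s s'}, &
\pi' &= \frac{s(1-s')\,l' + (1-s)\,l}{1 - s s'} \qquad (s s' \neq 1).
\end{align*}
```

Two checks follow directly from these expressions. Setting $s' = 0$ gives $\pi = l'$, so the co-player fixes the first player's payoff. Setting $l' = l$ makes both numerators equal to $l\,(1 - s s')$, so both payoffs equal $l$ regardless of the slopes.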

It follows that if both players have the same baseline payoff, Inline graphic, then their payoff will be Inline graphic, irrespective of their choice of the slopes Inline graphic and Inline graphic. In particular, the payoff of a homogeneous Inline graphic-population is Inline graphic. As a consequence, if we consider homogeneous populations, and if we assume that the populations move towards the direction where mutants have the highest invasion fitness, then the resulting adaptive dynamics [28]–[30] is given by

graphic file with name pone.0077886.e091.jpg (9)

The first equation implies that the slope Inline graphic remains constant under adaptive dynamics. Nevertheless, the initial value of Inline graphic determines the eventual fate of the population: if individuals enforce a positive correlation between payoffs (Inline graphic), then the baseline payoff Inline graphic increases over time. Eventually, such a population will thus obtain the maximum payoff Inline graphic, i.e. the population converges to the edge of compliers, see Fig. 2. On the other hand, for Inline graphic the population payoffs Inline graphic decrease over time, and the dynamics leads to strategies in the neighborhood of Inline graphic. Interestingly, although extortioners always outcompete their direct opponent, the edge of extortioners is unstable, as illustrated in Fig. 2. Along this edge, mutants with higher baseline payoff Inline graphic can invade. By giving in to the extortioners’ claim, they are able to obtain a payoff that exceeds the payoff Inline graphic that extortioners get against themselves. However, this argument rests on the assumption of an infinite population, such that the probability for an extortioner to interact with a rare but profitable mutant is zero. In the following section, we therefore extend our analysis to finite populations.
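The qualitative behavior described above (constant slope, baseline payoff rising for positive slopes and falling for negative ones) can be reproduced with a small numerical experiment. This is a minimal sketch under stated assumptions: a player with baseline payoff `l` and slope `s` is taken to enforce `pi_other = s * pi_own + (1 - s) * l`, and the function names `pair_payoffs` and `selection_gradient` are hypothetical. The gradient is obtained by finite differences of a rare mutant's payoff against the resident, which is the quantity adaptive dynamics follows in an infinite population.

```python
import numpy as np

def pair_payoffs(l1, s1, l2, s2):
    """Payoffs (pi1, pi2) when player i enforces pi_j = s_i * pi_i + (1 - s_i) * l_i."""
    # Row 1: -s1 * pi1 + pi2 = (1 - s1) * l1 ; Row 2: pi1 - s2 * pi2 = (1 - s2) * l2
    A = np.array([[-s1, 1.0], [1.0, -s2]])
    b = np.array([(1.0 - s1) * l1, (1.0 - s2) * l2])
    pi1, pi2 = np.linalg.solve(A, b)  # the system is singular only if s1 * s2 == 1
    return pi1, pi2

def selection_gradient(l, s, h=1e-6):
    """Central-difference derivative of a mutant's payoff with respect to (l', s')."""
    d_l = (pair_payoffs(l + h, s, l, s)[0] - pair_payoffs(l - h, s, l, s)[0]) / (2 * h)
    d_s = (pair_payoffs(l, s + h, l, s)[0] - pair_payoffs(l, s - h, l, s)[0]) / (2 * h)
    return d_l, d_s

for s in (0.5, -0.5):
    print(s, selection_gradient(2.0, s))
```

Under these assumptions the first component agrees with s / (1 + s), positive for s > 0 and negative for s < 0, while the second component vanishes up to numerical error. The resident baseline 2.0 is an arbitrary illustrative value.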

Figure 2. Adaptive dynamics in the (Inline graphic)-plane.

The grey-shaded state space represents the set of all enforceable linear relations that fulfill the inequalities (6). The corners of this state space consist of the payoff relations (Inline graphic) that correspond to the five strategies Always Cooperate (Inline graphic), Tit-for-Tat (Inline graphic, which starts with cooperation, and then repeats the opponent’s previous move), Suspicious Tit-for-Tat (Inline graphic, which starts with defection and then repeats the opponent’s previous move), Always Defect (Inline graphic), and an Anti-Tit-for-Tat strategy (Inline graphic, which always plays the opposite of the opponent’s previous move). Three special subsets of this state space are of particular interest: (i) Extortioners are strategies for which Inline graphic and Inline graphic. (ii) Equalizers are strategies with Inline graphic. (iii) Compliers correspond to the edge Inline graphic and Inline graphic. The grey line between Inline graphic and Inline graphic corresponds to the set of linear relationships that can be enforced with unconditional strategies (in particular it follows that all unconditional strategies enforce linear relationships with a negative slope, see Methods section). The adaptive dynamics for this system is surprisingly simple: orbits are parallel to the Inline graphic-axis; for Inline graphic, they converge towards the edge of compliers, whereas for Inline graphic, they converge towards the left boundary of the state space. Parameters: Inline graphic, Inline graphic, Inline graphic, Inline graphic.

Adaptive Dynamics in Finite Populations

Extortioners play a more prominent role in finite populations [23], where pairwise payoff advantages have a stronger effect (see also [14], [31]). This is most intuitive when the population only consists of two individuals; since extortioners outperform their direct co-player by definition, extortion is expected to spread. These observations suggest that a given extortionate strategy can be stable as long as the population size is below some critical threshold. To calculate this threshold analytically, let us consider a homogeneous population of size Inline graphic that enforces the pair Inline graphic. From time to time, a player may mutate to a different enforceable pair (Inline graphic). If mutation (or exploration) events are sufficiently rare, the strategy of the mutant goes extinct, or fixates, before the next mutation occurs [32], [33]. In this case, the fixation probability Inline graphic is the decisive quantity for the evolutionary dynamics. It can be shown that such a process can be described with a modified form of the adaptive dynamics equation; instead of asserting that homogeneous populations move towards the direction where mutants have the highest invasion fitness, it is assumed that the population moves towards the direction where mutants have the highest fixation probability. In Imhof and Nowak [34] it is shown that this direction can be found by calculating the adaptive dynamics for a slightly perturbed payoff matrix (called the effective payoff matrix, or modified payoff matrix, see [35], [36]),

graphic file with name pone.0077886.e127.jpg (10)

The first correction term, Inline graphic means that individuals cannot play against themselves, whereas the second correction term Inline graphic corresponds to the competition effect in finite populations. In our case, the adaptive dynamics for finite populations becomes

graphic file with name pone.0077886.e130.jpg (11)

Remarkably, the slope Inline graphic remains invariant for all population sizes. However, the dynamics for the baseline payoff Inline graphic changes for small Inline graphic: in the extreme case of Inline graphic, all trajectories in the interior of the state space lead to the lowest possible population payoff. For Inline graphic, a bistable situation emerges: if the value of Inline graphic in the initial population exceeds Inline graphic, then the population moves towards the edge of compliers (with Inline graphic), whereas for smaller values of Inline graphic populations move towards a non-cooperative equilibrium (with Inline graphic). Therefore, larger populations promote the evolution of cooperative behaviors, and in the limit of infinitely large populations, Inline graphic, we recover the original adaptive dynamics (9). The dynamical equations (11) also imply that a given extortionate strategy can only be stable if Inline graphic, or equivalently if the strategy’s extortion factor Inline graphic fulfills Inline graphic. Thus, to be stable in a finite population, extortioners need to be sufficiently demanding (Inline graphic), whereas compliers must not be too generous (Inline graphic).
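The dependence on population size can be illustrated with a direct fixation-probability calculation. The sketch below makes several assumptions that are not taken from the text. It uses a commonly used closed form for the fixation probability of a single mutant in a pairwise comparison process with a Fermi update rule. It takes payoffs from a parametrization in which a player with baseline payoff `l` and slope `s` enforces `pi_other = s * pi_own + (1 - s) * l`. It also restricts attention to mutants that share the resident's slope. The function names, the values N = 10 and beta = 1, and the threshold 1 / (N - 1) mentioned after the code are all products of this sketch, not quotations.

```python
import math

def fixation_probability(a, b, c, d, N, beta):
    """Fixation probability of a single mutant under pairwise comparison (Fermi rule).

    a, b: mutant payoff against a mutant / a resident.
    c, d: resident payoff against a mutant / a resident.
    """
    total, log_ratio = 1.0, 0.0
    for j in range(1, N):  # j = current number of mutants
        pi_mutant = ((j - 1) * a + (N - j) * b) / (N - 1)
        pi_resident = (j * c + (N - j - 1) * d) / (N - 1)
        log_ratio -= beta * (pi_mutant - pi_resident)
        total += math.exp(log_ratio)
    return 1.0 / total

def mutant_game(l_res, l_mut, s):
    """Payoffs (a, b, c, d) when mutant and resident share slope s but differ in baseline."""
    a = l_mut
    b = (s * l_mut + l_res) / (1.0 + s)
    c = (s * l_res + l_mut) / (1.0 + s)
    d = l_res
    return a, b, c, d

N, beta = 10, 1.0  # illustrative population size and selection strength
for s in (0.05, 0.5):
    rho = fixation_probability(*mutant_game(2.0, 2.1, s), N, beta)
    print(s, rho > 1.0 / N)  # is a mutant with higher baseline payoff favored?
```

In this sketch the payoff difference between mutants and residents has the same sign as (l_mut - l_res) * ((N - 1) * s - 1) for every number of mutants. A mutant with a higher baseline payoff is therefore favored over neutral drift only if s exceeds 1 / (N - 1), and this can never happen for N = 2 as long as s < 1. This matches the qualitative picture in the paragraph above, but the exact threshold should be checked against the original equations.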

In order to confirm these predictions, we have simulated the dynamics in finite populations for a pairwise comparison process, where the probability to switch to the role model’s strategy is given by a Fermi function [37], [38]. We assume that mutations follow Gaussian distributions around Inline graphic and Inline graphic and focus on the distribution of strategies and on the distribution of payoffs. For Inline graphic we find that the population clusters around the edge of low population payoffs (see Fig. 3a), and the density function for the payoffs has a single peak at Inline graphic. Increasing the population size has a two-fold effect (Fig. 3b and 3c). First, compliant strategies with Inline graphic become stable, such that the density function of the population payoffs has a second peak at Inline graphic. Second, increasing the population size reduces the stochastic noise; as a consequence almost all the mass is concentrated around the two peaks Inline graphic and Inline graphic. As predicted by adaptive dynamics, and in line with previous results [23], larger populations exhibit larger payoffs. For example, payoffs for a population size Inline graphic exceed the payoffs for Inline graphic by more than a factor of six.

Figure 3. Stochastic dynamics for different population sizes.

We consider a homogeneous population of size Inline graphic. Once a mutation occurs, the mutant strategy either takes over the whole population (with probability Inline graphic), or goes extinct before the next mutation arises. This leads to a sequence of residents in the state space, which is shown in the upper three graphs (the dashed line corresponds to the threshold Inline graphic). The lower three graphs give the distribution of the resulting payoffs in the population. (a) In the extreme case of Inline graphic, most players enforce a strategy with baseline payoff Inline graphic. In particular, extortion strategies can persist. (b) As population size increases, a bistable situation emerges: the population clusters along the edges with Inline graphic and Inline graphic. (c) For large population sizes, this implies that the edge of compliers is (neutrally) stable, whereas the edge of extortioners is unstable. As a consequence, mean payoffs increase with population size. The figure shows simulation runs for Inline graphic residents for a prisoner’s dilemma with Inline graphic, Inline graphic, Inline graphic, Inline graphic. New mutant strategies are randomly drawn from a Gaussian distribution around the parent strategy (Inline graphic). The invasion probability Inline graphic of a mutant is calculated as Inline graphic, where Inline graphic and Inline graphic are the respective payoffs of mutants and residents, and where Inline graphic is the strength of selection.

Although extortioners seem to apply a fully selfish strategy, they are important as they can act as a catalyst for cooperation, by helping the population to escape from states with low payoffs [23]. Our adaptive dynamics formalism allows us to give an intuitive explanation for this effect: under a local mutation scheme, a population of Inline graphic players can only be invaded by neutral drift, by moving along the vertical line of strategies with Inline graphic. For cooperative strategies to have a selective advantage, the new resident population needs to have a positive slope Inline graphic (i.e., only when the new resident applies an extortionate strategy, cooperation can evolve). In order to confirm this catalytic effect of extortionate strategies, we have removed a Inline graphic-neighborhood around the edge of extortioners from the set of enforceable pairs (see Fig. 4a; in [34] this method is called a knock-out experiment). That is, only those mutants are permitted that are sufficiently different from extortioners. The result is surprising: although extortioners are defined as strategies with the lowest payoff against themselves, their exclusion reduces the average payoff of the population for all population sizes Inline graphic (Fig. 4b). This effect is especially pronounced in larger populations; for Inline graphic, Fig. 4b indicates that it is almost impossible to reach a cooperative regime without extortioners.

Figure 4. Extortioners facilitate cooperation.

In order to study the impact of extortioners on the evolutionary dynamics, we have excluded all mutant strategies that are Inline graphic-close to the set of extortioners. (a) For the simulations we have used Inline graphic, represented by the white area in the upper left corner of the panel. (b) As a result, we find for all population sizes Inline graphic that the removal of extortioners decreases the average payoff. This decrease is particularly dramatic in large populations, Inline graphic. Parameters are the same as in Fig. 3.

So far, we have assumed that a mutant’s strategy is close to the parent’s strategy (which allowed us to use derivatives to approximate the dynamics), and that mutations are rare (which allowed us to focus on games between a resident and one mutant strategy). Let us now weaken these assumptions and numerically explore the impact of non-local mutations, and of different mutation rates, respectively. In Fig. 5, we distinguish four simulations, according to whether the mutation rate is high or low (Inline graphic vs. Inline graphic), and whether mutations occur on a local or on a global level (mutant strategies are drawn from a normal distribution around the parent’s strategy, vs. mutant strategies are uniformly distributed over the set of enforceable pairs). These simulations indicate that all treatments follow the same pattern: average payoffs are close to the minimum Inline graphic in small populations, and they increase with population size. However, there is a clear difference between treatments with local mutations and treatments with non-local mutations. If mutations are local, populations can be trapped in regions with a low payoff for a considerable time, although distant mutant strategies would offer an immediate escape. For example, we have seen that any strategy of the form Inline graphic forms a stable fixed point of the adaptive dynamics. However, once we allow mutants to adopt any strategy of the state space, mutants with Inline graphic close to one and Inline graphic can easily invade (in fact, in Stewart and Plotkin [25] it is shown that in sufficiently large populations, compliant strategies with Inline graphic can replace any noncooperative zero-determinant strategy). Overall, non-local mutations thus lead to a shift of the invariant distribution towards more cooperative strategies.

Figure 5. Average payoffs for the four different mutation treatments.

In rare-mutation treatments, the mutation rate is set to Inline graphic, whereas in frequent-mutation treatments the mutation rate is Inline graphic. Local mutations are randomly drawn from a Gaussian distribution around the parent strategy, non-local mutations are randomly drawn from the entire state space. The rare local mutations correspond to the previous simulations in Figs. 3 and 4. All other parameters are the same as before.

Discussion

The set of zero-determinant strategies exhibits a fascinating variety of possible behaviors, ranging from extortioners to compliant strategies, and from selfish strategies to altruists. To evaluate the evolutionary relevance of these different possible behaviors, previous studies focused on particular subsets. Adami and Hintze [22] demonstrated that neither extortioners nor equalizers are evolutionarily stable, and Hilbe et al. [23] confirmed numerically that these two subsets are only favored by selection if the population is sufficiently small. In contrast, as shown by Stewart and Plotkin [25], large population sizes favor the emergence of compliant strategies, which are evolutionary robust (they can only be invaded by neutral drift), and which in turn are quite successful in invading other strategies. However, this focus on specific subsets of zero-determinant strategies comes at the risk of neglecting other important subsets. Thus, here we have systematically explored the space of all zero-determinant strategies.

To this end, we have derived the adaptive dynamics for all strategies that enforce a linear relation between the payoffs of the two players. This set of strategies includes all zero-determinant strategies [20] and all unconditional strategies such as Inline graphic or Inline graphic (see Methods section), but not all memory-one strategies (for example, it does not contain the win-stay lose-shift rule depicted in Figure 1a). The focus on this strategy space allows us to describe the evolutionary dynamics with an analytically tractable model. The resulting dynamics in large populations is bistable and the state space contains two neutrally stable sets. When the initial population enforces a positive relation between payoffs (Inline graphic), the population is most likely to end up at the edge of compliers. This subset of strategies shares the following three properties: (i) compliers enforce a linear relation between the payoffs of the two players, (ii) a population of compliers yields the maximum possible payoff Inline graphic, and (iii) compliers play a best response to themselves (no strategy can yield a payoff higher than Inline graphic when playing against a complier, see also [24] for a characterization of such strategies). However, compliers have one shortcoming: they can be neutrally invaded by altruistic strategies (strategies that accept a decrease of their own payoff to increase the opponent’s payoff, such as Inline graphic with Inline graphic). Such altruistic strategies give rise to selfish behaviors, leading the population to a neighborhood of Inline graphic. To escape from that neighborhood, extortioners play an important role [23]: they can invade Inline graphic by neutral drift and promote the emergence of compliant strategies. Thus, the route from cooperation to defection goes via altruism, whereas the route from defection to cooperation goes via extortion.

It is natural to ask which of these dynamical results on the space of all zero-determinant strategies are robust when we consider evolution in more general strategy spaces, such as memory-one strategies, or strategies encoded by a finite automaton (see, for example, [5]). Further simulations suggest that our results hold more generally: for Fig. 6 we consider the adaptive dynamics on the space of all memory-one strategies (similar simulations are also presented in [23], [25]). The numerical results confirm our analytical predictions based on the adaptive dynamics framework: extortioners are strongest in small populations, whereas compliers succeed in large populations. Note, however, that zero-determinant strategists in general are disfavored by selection as the population size increases. In fact, as our analysis suggests, a large proportion of zero-determinant strategies only play a transient role in the evolutionary dynamics. For most of the time, the population applies a strategy that is close to one of the boundaries Inline graphic and Inline graphic, whereas interior states are hardly visited. The dynamics is centered around the edge of selfish strategies and extortioners, and around the edge of compliers and altruists, whereas the evolutionary importance of other zero-determinant strategies seems negligible.

Figure 6. Statistics for the stochastic dynamics on the space of all memory-one strategies.


Instead of taking the enforceable pairs (Inline graphic) as the evolving traits, we consider the adaptive dynamics on the space of memory-one strategies Inline graphic, see also [23], [25]. (a) To assess the impact of zero-determinant strategies, extortioners, and compliers, we record how often the evolving population is in a Inline graphic-neighborhood of these strategy sets, and compare this to their expected abundance in a neutral process. A given strategy set is thus favored by selection if its relative abundance exceeds one. Our simulations indicate that in small populations extortioners are favored by selection, whereas in large populations compliers are favored. (b) As a consequence, average payoffs increase with population size. Simulations are run for a sequence of Inline graphic mutants. We assume that mutant strategies are uniformly distributed over the space of memory-one strategies, and use the parameters Inline graphic and Inline graphic. The other parameters of the evolutionary process are the same as in the previous figures.

Our results on the adaptive dynamics of zero-determinant strategies resemble the results for the evolution of reactive strategies (i.e., memory-one strategies with Inline graphic and Inline graphic, [5], [28], [34]). In both models, there are two regimes. There is a cooperation-rewarding zone where populations evolve towards an edge of fully cooperative strategies (the edge of compliers, or the edge between tit-for-tat and generous tit-for-tat, respectively). Outside of this cooperation-rewarding zone, populations move towards lower population payoffs (ending up in a neighborhood of Inline graphic). These similarities are not a mere coincidence. Instead, for games with equal gains from switching (when Inline graphic), every reactive strategy is a zero-determinant strategy [23] and thus reactive strategies form a subset of Inline graphic. Conversely, we show in the Methods section that any enforceable payoff relation (Inline graphic) can be enforced by a reactive strategy in this case. Thus, for games with equal gains from switching, the space Inline graphic is essentially equivalent to the space of reactive strategies.

Throughout this manuscript, we have focused on the dynamics of an iterated prisoner’s dilemma. However, only a few of our results actually depend on the characteristic order of payoffs, Inline graphic. In fact, the only result specific to the prisoner’s dilemma concerns the characterization of enforceable (Inline graphic) pairs in Eq. (6). For games that are different from the prisoner’s dilemma, the geometry of the state space may thus be different, but the dynamics on the respective state space remains unchanged. In Figure 7, we illustrate this observation by considering the dynamics of an iterated snowdrift game (which is defined by the payoff relations Inline graphic, Inline graphic, Inline graphic, Inline graphic with Inline graphic such that Inline graphic, see [39], [40]). For snowdrift games we observe that only a subset of extortionate strategies is feasible [41]: extortionate strategies with Inline graphic need to fulfill the requirement Inline graphic (i.e. the maximum extortion factor is Inline graphic). Moreover, only strategies that yield a baseline payoff higher than Inline graphic can enforce a payoff relation with negative slope, Inline graphic. As a consequence, any sufficiently large initial population that yields a payoff less than Inline graphic against itself can be replaced by more cooperative mutant strategies with higher baseline payoffs. As in the prisoner’s dilemma, this dynamics leads to the edge of compliers, which can only be left by neutral invasion of altruists.

Figure 7. Zero-determinant strategies for the iterated snowdrift game (with Inline graphic, Inline graphic, and Inline graphic, Inline graphic, Inline graphic, Inline graphic).


(a) The grey shaded area gives the space of feasible payoff pairs in the snowdrift game. The three colored lines give three examples of possible payoff combinations if the focal player uses a strategy that enforces a linear relation between payoffs. Unlike in the iterated prisoner’s dilemma, the slope of Inline graphic is positive, Inline graphic. (b) The grey-shaded area depicts the space of possible combinations of baseline payoff Inline graphic and slopes Inline graphic that are enforceable in the snowdrift game. A comparison with Fig. 2 shows that the state space differs considerably from the state space of a prisoner’s dilemma game. However, the qualitative dynamics within the state space remains unchanged.

Similar results may be feasible for social dilemmas with a continuous action space, as for example considered in [42]–[46]. However, transferring our findings to the continuous case is not straightforward. First, the existing literature on zero-determinant strategies exclusively deals with games where the players can only choose between two actions (either to cooperate or to defect), and it is not obvious how the corresponding proofs can be generalized to iterated games with continuous action spaces. Moreover, even if continuous games admit zero-determinant strategies, one may wonder which linear relations (Inline graphic) these strategies can enforce. Is there an upper bound on the extortion factor? Which payoffs can be enforced by an equalizer strategy? The answers to these questions are likely to depend on specific details of the benefit and cost functions, representing an interesting topic for future research.

Our results confirm that extortionate behaviors can only prevail in small populations. In large populations, the evolutionary steady state is increasingly biased in favor of cooperative strategies. This may come as a surprise, as it has been shown that intermediate population sizes are optimal for the fixation of rare cooperative mutants in a population of defectors [14]. However, compliant strategies do not need to invade defectors directly. Instead, in sufficiently large populations extortioners always provide an escape path to leave non-cooperative populations. More importantly, once compliant strategies are common, they are evolutionarily robust [25], with the neutral invasion of overly altruistic strategies as their only weak spot. Overall, compliance succeeds.

Methods

The Geometry of the State Space

Let us first show that the set of all strategies that fulfill condition (2) coincides with the set Inline graphic, as defined by (5). If we multiply the condition

[Equation (12): image not available]

with some Inline graphic, then we can relate (2) and (5) by the following transform of coordinates

[Equation (13): image not available]

It then follows from (1) that a zero-determinant strategy Inline graphic enforces the pair (Inline graphic) if and only if there is a Inline graphic such that Inline graphic has the form

[Equation (14): image not available]

Since all entries Inline graphic need to be in the interval Inline graphic, there are restrictions on the pairs (Inline graphic) that can be enforced by zero-determinant strategies. For the parameters of the prisoner’s dilemma, it follows by Inline graphic and Inline graphic that baseline payoffs Inline graphic need to fulfill the condition Inline graphic. Again because Inline graphic and Inline graphic, we may then conclude that Inline graphic. As a consequence, the requirement Inline graphic yields Inline graphic and Inline graphic. Then Inline graphic leads to the restriction Inline graphic, whereas Inline graphic implies Inline graphic. In summary, we conclude that for all pairs (Inline graphic) that fulfill

[Equation (15): image not available]

there is a corresponding zero-determinant strategy Inline graphic of the form (14) such that Inline graphic (we only have to choose a Inline graphic that is sufficiently small). Conversely, the linear relations Inline graphic that can be enforced by zero-determinant strategies are in fact all possible linear relations that can be enforced in an iterated prisoner’s dilemma with Inline graphic. To see this, we note that for any memory-one strategy Inline graphic we have:

  1. The payoff pair Inline graphic is on the line between Inline graphic and Inline graphic, whereas

  2. the payoff pair Inline graphic is on the line between Inline graphic and Inline graphic.

Thus, any linear payoff relation (Inline graphic) enforced by some Inline graphic connects the line segment between Inline graphic and Inline graphic with the line segment between Inline graphic and Inline graphic (see also Figs. 1b–1d). A straightforward computation verifies that any such linear payoff relation (Inline graphic) needs to meet the conditions (15).

The set Inline graphic is a proper superset of the zero-determinant strategies. For example, the strategy Inline graphic is not a zero-determinant strategy in the general prisoner’s dilemma (it is only a zero-determinant strategy in games with equal gains from switching, i.e. when Inline graphic). However, Inline graphic holds true in all prisoner’s dilemma games. In fact, every unconditional strategy Inline graphic is an element of Inline graphic, with parameters

[Equation (16): image not available]

In particular, it follows that unconditional strategies can only enforce linear payoff relations with negative slopes Inline graphic. As previously suggested, these values of Inline graphic and Inline graphic satisfy the inequalities (15) for all Inline graphic; any linear relation (Inline graphic, Inline graphic) that can be enforced by an unconditional strategy can also be enforced by a zero-determinant strategy.

Given a triplet (Inline graphic), the corresponding zero-determinant strategy Inline graphic is uniquely determined by (1). However, for a given pair (Inline graphic) there will generally be many zero-determinant strategies Inline graphic that enforce the corresponding linear relationship in (5), one for every Inline graphic in (14). We call two strategies Inline graphic equivalent, and write Inline graphic, if they give rise to the same pair (Inline graphic). To study the evolutionary dynamics of Inline graphic-strategies, we consider the dynamics on the space of equivalence classes Inline graphic. That is, we assume that each player determines a pair (Inline graphic) and then picks a Inline graphic from the corresponding class of Inline graphic-strategies. The dynamics is well-defined in the sense that the adaptive dynamics does not depend on the choice of the class representative Inline graphic.
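The following Python sketch illustrates the point made above numerically. It is not the authors' code: the symbols `l` (baseline payoff), `s` (slope) and `phi` (free scaling parameter), the payoff values T = 5, R = 3, P = 1, S = 0, and all function names are assumptions introduced here for illustration. The strategy is built from the Press and Dyson construction [20], under which a memory-one strategy whose shifted entries are a linear combination of the two payoff vectors and the constant vector enforces a linear payoff relation. The sketch then checks that the co-player's long-run payoff satisfies pi2 - l = s (pi1 - l) regardless of the co-player's memory-one strategy, and that changing `phi` changes the strategy but not the enforced relation.

```python
import random

# Illustrative prisoner's dilemma payoffs (assumption): T > R > P > S.
T, R, P, S = 5.0, 3.0, 1.0, 0.0

def zd_strategy(l, s, phi):
    """Memory-one strategy (p_CC, p_CD, p_DC, p_DD), own move listed first,
    constructed so that the co-player's payoff obeys pi2 - l = s * (pi1 - l)."""
    p = (
        1 - phi * (1 - s) * (R - l),
        1 - phi * ((T - l) + s * (l - S)),
        phi * (s * (T - l) + (l - S)),
        phi * (1 - s) * (l - P),
    )
    if not all(0.0 <= x <= 1.0 for x in p):
        raise ValueError("(l, s, phi) does not yield valid probabilities")
    return p

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def payoffs(p, q):
    """Long-run payoffs (pi1, pi2) when memory-one strategy p meets q.
    States are ordered CC, CD, DC, DD from player 1's point of view."""
    swap = (0, 2, 1, 3)  # player 2 sees the states CD and DC exchanged
    trans = []
    for i in range(4):
        a, b = p[i], q[swap[i]]  # cooperation probabilities of players 1 and 2
        trans.append([a * b, a * (1 - b), (1 - a) * b, (1 - a) * (1 - b)])
    # Stationary distribution v: v = v * trans, last equation replaced by sum(v) = 1.
    A = [[trans[j][i] - (1.0 if i == j else 0.0) for j in range(4)] for i in range(4)]
    A[3] = [1.0] * 4
    v = solve(A, [0.0, 0.0, 0.0, 1.0])
    pi1 = v[0] * R + v[1] * S + v[2] * T + v[3] * P
    pi2 = v[0] * R + v[1] * T + v[2] * S + v[3] * P
    return pi1, pi2

if __name__ == "__main__":
    random.seed(1)
    q = tuple(random.uniform(0.1, 0.9) for _ in range(4))  # arbitrary co-player
    for name, l in (("extortioner (l = P)", P), ("complier (l = R)", R)):
        for phi in (0.05, 0.1):
            pi1, pi2 = payoffs(zd_strategy(l, 1 / 3, phi), q)
            print(name, "phi =", phi, "pi1 =", round(pi1, 4), "pi2 =", round(pi2, 4))
```

The stationary distribution is obtained by solving a small linear system rather than by simulation, so the check is exact up to floating-point error. The sketch assumes the Markov chain has a unique stationary distribution, which holds when the co-player's probabilities lie strictly between 0 and 1.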

Inline graphic-strategies Versus Reactive Strategies

When payoffs fulfill equal gains from switching, Inline graphic, we can choose Inline graphic such that the zero-determinant strategies according to Eq. (14) are given by

[Equation (17): image not available]

In particular, Inline graphic and Inline graphic, i.e. all resulting zero-determinant strategies are reactive strategies. For such reactive strategies it follows that for Inline graphic the conditions Inline graphic and Inline graphic are equivalent to the conditions

[Equation (18): image not available]

respectively. From this, we conclude that for games with equal gains from switching, all payoff relations Inline graphic that can be enforced by zero-determinant strategies (given by the conditions (15)) can already be enforced by reactive strategies.
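As a worked illustration of why reactive strategies enforce linear payoff relations under equal gains from switching, consider the donation game, a standard special case. The parametrization (benefit b, cost c, with b > c > 0) and the symbols p, q, r, x_1, x_2 are assumptions introduced here and need not match the notation of the equations above.

```latex
% Illustrative assumption: donation game with T = b, R = b - c, P = 0, S = -c,
% so that R + P = S + T (equal gains from switching).
% Reactive strategy of player 1: cooperate with probability p after the
% co-player cooperated, and with probability q after the co-player defected.
% x_1, x_2 denote the long-run cooperation rates of players 1 and 2.
\begin{align*}
r &:= p - q, \qquad x_1 = q + r\,x_2, \\
\pi_1 &= b\,x_2 - c\,x_1 = (b - c\,r)\,x_2 - c\,q, \\
\pi_2 &= b\,x_1 - c\,x_2 = (b\,r - c)\,x_2 + b\,q, \\
\Longrightarrow \quad
\pi_2 &= \frac{b\,r - c}{\,b - c\,r\,}\,\bigl(\pi_1 + c\,q\bigr) + b\,q .
\end{align*}
```

Since |r| ≤ 1 and b > c, the denominator b - cr is positive, so the relation between the two payoffs is linear whatever the co-player does. The slope equals 1 for tit-for-tat (r = 1) and -c/b for any unconditional strategy (r = 0), which is in line with the earlier remark that unconditional strategies can only enforce negative slopes.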

Acknowledgments

We thank Karl Sigmund for inspiring comments and suggestions.

Funding Statement

There were no external funders.

References

  • 1. Rapoport A, Chammah AM (1965) Prisoner’s Dilemma. University of Michigan Press, Ann Arbor.
  • 2. Trivers RL (1971) The evolution of reciprocal altruism. The Quarterly Review of Biology 46: 35–57.
  • 3. Axelrod R (1984) The Evolution of Cooperation. New York, NY: Basic Books.
  • 4. Nowak MA (2006) Evolutionary Dynamics. Harvard University Press, Cambridge.
  • 5. Sigmund K (2010) The calculus of selfishness. Princeton Univ. Press.
  • 6. Friedman J (1971) A non-cooperative equilibrium for supergames. Review of Economic Studies 38: 1–12.
  • 7. Molander P (1985) The optimal level of generosity in a selfish, uncertain environment. Journal of Conflict Resolution 29: 611–618.
  • 8. Fudenberg D, Maskin E (1986) The folk theorem in repeated games with discounting or with incomplete information. Econometrica 54: 533–554.
  • 9. Milinski M (1987) Tit For Tat in sticklebacks and the evolution of cooperation. Nature 325: 433–435.
  • 10. Boyd R, Lorberbaum J (1987) No pure strategy is evolutionary stable in the iterated prisoner’s dilemma game. Nature 327: 58–59.
  • 11. Nowak MA, Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s Dilemma game. Nature 364: 56–58.
  • 12. Frean MR (1994) The prisoner’s dilemma without synchrony. Proceedings of the Royal Society B 257: 75–79.
  • 13. Hauert C, Schuster HG (1997) Effects of increasing the number of players and memory size in the iterated prisoner’s dilemma: a numerical approach. Proceedings of the Royal Society B 264: 513–519.
  • 14. Nowak MA, Sasaki A, Taylor C, Fudenberg D (2004) Emergence of cooperation and evolutionary stability in finite populations. Nature 428: 646–650.
  • 15. Imhof LA, Fudenberg D, Nowak MA (2005) Evolutionary cycles of cooperation and defection. Proceedings of the National Academy of Sciences USA 102: 10797–10800.
  • 16. Perc M, Szolnoki A, Szabó G (2008) Restricted connections among distinguished players support cooperation. Physical Review E 78: 066101.
  • 17. Grujic J, Fosco C, Araujo L, Cuesta J, Sanchez A (2010) Social experiments in the mesoscale: Humans playing a spatial prisoner’s dilemma. PLoS One 5: e13749.
  • 18. Perc M, Wang Z (2010) Heterogeneous aspirations promote cooperation in the prisoner’s dilemma game. PLoS One 5: e15117.
  • 19. van Veelen M, García J, Rand DG, Nowak MA (2012) Direct reciprocity in structured populations. Proceedings of the National Academy of Sciences USA 109: 9929–9934.
  • 20. Press WH, Dyson FJ (2012) Iterated prisoner’s dilemma contains strategies that dominate any evolutionary opponent. Proceedings of the National Academy of Sciences USA 109: 10409–10413.
  • 21. Stewart AJ, Plotkin JB (2012) Extortion and cooperation in the prisoner’s dilemma. Proceedings of the National Academy of Sciences USA 109: 10134–10135.
  • 22. Adami C, Hintze A (2013) Evolutionary instability of zero-determinant strategies demonstrates that winning is not everything. Nature Communications 4: 2193.
  • 23. Hilbe C, Nowak MA, Sigmund K (2013) The evolution of extortion in iterated prisoner’s dilemma games. Proceedings of the National Academy of Sciences USA 110: 6913–6918.
  • 24. Akin E (2013) Stable cooperative solutions for the iterated prisoner’s dilemma. arXiv:1211.0969v2.
  • 25. Stewart AJ, Plotkin JB (2013) From extortion to generosity, evolution in the iterated prisoner’s dilemma. Proceedings of the National Academy of Sciences USA (in press).
  • 26. Ball P (2012) Physicists suggest selfishness can pay. Nature, doi:10.1038/nature.2012.11254.
  • 27. Boerlijst MC, Nowak MA, Sigmund K (1997) Equal pay for all prisoners. American Mathematical Monthly 104: 303–307.
  • 28. Nowak MA, Sigmund K (1990) The evolution of stochastic strategies in the prisoner’s dilemma. Acta Applicandae Mathematicae 20: 247–265.
  • 29. Metz JAJ, Geritz SAH, Meszena G, Jacobs FJA, van Heerwaarden JS (1996) Adaptive dynamics: a geometrical study of the consequences of nearly faithful replication. In: van Strien SJ, Verduyn Lunel SM, editors, Stochastic and Spatial Structures of Dynamical Systems, Amsterdam: North Holland. 183–231.
  • 30. Geritz SAH, Kisdi E, Meszéna G, Metz JAJ (1998) Evolutionarily singular strategies and the adaptive growth and branching of the evolutionary tree. Evolutionary Ecology Research 12: 35–57.
  • 31. Hilbe C, Traulsen A (2012) Emergence of responsible sanctions without second order free riders, antisocial punishment or spite. Nature Scientific Reports 2: 458.
  • 32. Antal T, Scheuring I (2006) Fixation of strategies for an evolutionary game in finite populations. Bulletin of Mathematical Biology 68: 1923–1944.
  • 33. Wu B, Gokhale CS, Wang L, Traulsen A (2012) How small are small mutation rates? Journal of Mathematical Biology 64: 803–827.
  • 34. Imhof LA, Nowak MA (2010) Stochastic evolutionary dynamics of direct reciprocity. Proceedings of the Royal Society B 277: 463–468.
  • 35. Lessard S (2011) Effective game matrix and inclusive payoff in group-structured populations. Dynamic Games and Applications 1: 301–318.
  • 36. Hilbe C (2011) Local replicator dynamics: A simple link between deterministic and stochastic models of evolutionary game theory. Bulletin of Mathematical Biology 73: 2068–2087.
  • 37. Blume LE (1993) The statistical mechanics of strategic interaction. Games and Economic Behavior 5: 387–424.
  • 38. Traulsen A, Nowak MA, Pacheco JM (2006) Stochastic dynamics of invasion and fixation. Physical Review E 74: 011909.
  • 39. Sugden R (1986) The Economics of Rights, Co-operation and Welfare. Oxford and New York: Blackwell.
  • 40. Doebeli M, Hauert C (2005) Models of cooperation based on the prisoner’s dilemma and the snowdrift game. Ecology Letters 8: 748–766.
  • 41. Roemheld L (2013) Evolutionary extortion and mischief - zero determinant strategies in 2×2 games. arXiv:1308.2576.
  • 42. Roberts G, Sherratt TN (1998) Development of cooperative relationships through increasing investment. Nature 394: 175–179.
  • 43. Killingback T, Doebeli M (1999) ‘Raise the stakes’ evolves into a defector. Nature 400: 518.
  • 44. Wahl LM, Nowak MA (1999) The continuous prisoner’s dilemma: I. Linear reactive strategies. Journal of Theoretical Biology 200: 307–321.
  • 45. Wahl LM, Nowak MA (1999) The continuous prisoner’s dilemma: II. Linear reactive strategies with noise. Journal of Theoretical Biology 200: 323–338.
  • 46. Killingback T, Doebeli M (2002) The continuous Prisoner’s Dilemma and the evolution of cooperation through reciprocal altruism with variable investment. The American Naturalist 160: 421–438.
