Abstract
Because punishment is scarce, costly, and painful, optimal enforcement strategies will minimize the amount of actual punishment required to effectuate deterrence. If potential offenders are sufficiently deterrable, increasing the conditional probability of punishment (given violation) can reduce the amount of punishment actually inflicted, by “tipping” a situation from its high-violation equilibrium to its low-violation equilibrium. Compared to random or “equal opportunity” enforcement, dynamically concentrated sanctions can reduce the punishment level necessary to tip the system, especially if preceded by warnings. Game theory and some simple and robust Monte Carlo simulations demonstrate these results, which, in addition to their potential for reducing crime and incarceration, may have implications for both management and regulation.
Keywords: crime, enforcement, game theory, positive feedback, tipping
Humans cooperate; the extent of cooperation among humans marks them out from all other species (1, 2). But cooperation is vulnerable to exploitation through aggression, deception, opportunistic defection from agreements, and free-riding. Where cooperative strategies are suboptimal for individuals, mutually beneficial arrangements will fail to arise, or will degenerate (3–5).
Under experimental conditions, some individuals will voluntarily incur costs to themselves to punish noncooperative behavior; doing so can facilitate cooperation by reducing the potential gains from exploitation (1, 4, 6). Thus punishment is a basic element of human social interaction.
As punishment is always costly, both to the punisher and (obviously) to those punished, a well-designed enforcement system should combine high efficacy in discouraging exploitative behavior with low actual infliction of sanctions. At first blush, it might seem that these 2 objectives are in fundamental tension; that more compliance requires more punishment. But 3 law-enforcement examples seem to support the contrary proposition.
First, when the New York City Police Department implemented a “zero tolerance” policy toward “squeegeeing”—penny-ante extortion involving wiping the windshields of cars stuck in traffic and then “requesting” payment from the drivers—by arresting every squeegee-man observed plying his trade, the number of actual arrests for squeegeeing went down, not up (7). The same happened when the New York City Transit Police cracked down on “turnstile-jumping.”
Second, police in High Point, NC, who had been sporadically arresting drug dealers in a long-established crack market for 2 decades and finding that every arrestee was quickly replaced, changed strategies. They identified all of the currently active dealers in that market and developed felony cases against them, but arrested and prosecuted only 3 dealers who had been involved in violence. They then held a meeting with the remaining dealers and announced that anyone who continued to sell drugs would face certain prison time. Soon after, the market disappeared, and the number of crack-dealing arrests in the area immediately fell to near 0—and stayed there (8).
Third, a judge in Honolulu, HI, selected a group of probationers with such high rates of noncompliance (primarily missed or “dirty” drug tests) that they faced possible probation revocation and incarceration. Instead of sending them to prison at once, he warned them that they would be subject to increased drug-testing frequency and that any violation would lead to an immediate and certain, although short (measured in days, not months) jail sentence. The absolute number (not just the rate) of detected violations decreased; the majority of those who received warnings never needed to be sanctioned at all (9).†
These observations help illustrate the potential interactions among 3 variables: The rate at which a rule is broken, the probability that any given incident of rule-breaking leads to punishment, and the total quantity of punishment actually administered. They can be explained by a common-sense observation known to everyone who has successfully raised a child or trained a pet; holding sanction intensity constant, the more credible the commitment to punish rule-breaking is, the more likely it is to succeed in discouraging the targeted behavior, and therefore the less likely it is that the threatened punishment will actually take place (10). This recalls the chess maxim that “the threat is stronger than its execution.”‡
Those enforcing rules on multiple subjects face an additional complexity due to positive feedbacks in violation rates. When several persons are subject to some rule and sanctions capacity is constrained, the subjects face interdependent choices: The higher the prevalence of violation, the less the risk of sanction for any given violator. This idea is referred to as “enforcement swamping” (11) or the “overload theory” (12) and is well known in the deterrence (11–17) and urban riot (18) literatures. The result can be a 2-equilibrium “tipping” situation, in which both high and low violation rates are self-sustaining, and temporary interventions can therefore have lasting consequences if they push the system across the tipping point (19, 20).§
The application to criminal justice policy is transparent, but the same thinking can be usefully applied to any interaction between a group of individuals who might violate some rule and an authority with limited capacity to punish such violations: A teacher facing a classroom, a manager dealing with many subordinates, a regulatory agency attempting to control the behavior of many firms, a tax collection agency trying to minimize cheating. Despite the ubiquity of situations that feature such interdependency, the problem of finding enforcement strategies to minimize violations and sanctions in a dynamic context remains largely unstudied.¶
We begin with a simple 1-player compliance game to show that increasing the severity or probability of punishment can lead to less actual punishment. We then show that the same holds for 2 potential offenders, and moreover that assigning 1 of 2 potential offenders priority for punishment, and making that assignment common knowledge, can reduce the punishment capacity needed to secure complete compliance with no actual punishment use. These results generalize to n potential offenders, where n need not even be finite; under the standard rational-actor, common-knowledge assumptions of game theory, a single threatened sanction can deter countably many potential violators, without ever needing to be actually imposed, as long as the sanction is greater than the cost of compliance and the order of priority for punishment is fixed and known to all.
From there, we use a simulation model to show that the inverse relationship between punishment capacity and punishment utilization and the superiority of prioritized over equal-probability sanctioning both hold for repeated games where behavior is stochastic and where individualistic Bayesian-updating rational agents evaluate the probability of being sanctioned based on personal experience. These games display tipping behavior: relaxing the punishment-capacity constraint can reduce the amount of punishment actually inflicted, as the system “tips” from a high-violation to a low-violation equilibrium, so that a temporary increment to enforcement capacity may have lasting benefits. Prioritization (“dynamic concentration”) can be made to do the work of that temporary increment while reducing the volume of actual punishment required to “tip” the system to its high-compliance equilibrium.
Results
A Compliance Game with 1 Potential Offender.
Assume an economically rational actor subject to some rule. If he breaks the rule, he pays a penalty P. If he complies, he pays compliance cost C. For instance, a firm subject to an environmental regulation might have to decide between obeying it and paying some cost in the form of higher production costs, or violating it and paying a regulatory penalty.
A rational subject will never violate if the penalty for breaking the rule is above the cost of compliance or gain from violation. Therefore as the penalty imposed in case of a violation crosses the threshold from C - ε to C + ε, the violation rate goes from unity (a violation every time) to 0, and the total penalty imposed falls from C - ε to 0, because the rule is not violated, and therefore no penalty is incurred. Thus increasing the severity of punishment can lead to less punishment being actually used.
If, instead of being certain, the penalty is applied stochastically, and the potential violator is risk-neutral, then the rule will be broken just in case the expected value of punishment (the probability of punishment conditional on violation, p, times the penalty P) is less than C. That is, violation occurs if and only if pP < C. The critical value of the probability of punishment conditional on violation is therefore p* = C/P. Below p*, the violation rate will be unity. Above that probability, the violation rate will be 0; increasing the probability is equivalent to increasing the penalty.
As the probability of punishment p grows from 0 to p* = C/P, the subject's behavior doesn't change, and the expected punishment per turn, pP, grows from 0 to C. But as soon as p is high enough that pP > C, our imagined perfectly rational subject will stop violating, and the amount of penalty collected will fall back to 0. Thus increasing the probability of punishment, like increasing its severity, can economize on actual punishment utilization.
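The threshold behavior in the 1-player game can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the example values C = 1 and P = 4, and the tie-breaking rule (comply when exactly indifferent) are ours and do not come from the paper.

```python
def expected_punishment(p: float, penalty: float, cost: float) -> float:
    """Expected penalty per turn paid by a risk-neutral subject.

    Assumption (illustrative): the subject violates iff p * penalty < cost
    and complies when indifferent or when the expected penalty is higher.
    """
    violates = p * penalty < cost
    return p * penalty if violates else 0.0


if __name__ == "__main__":
    C, P = 1.0, 4.0  # hypothetical values, so the critical probability is C / P = 0.25
    for p in (0.0, 0.1, 0.2, 0.3, 0.5):
        # Expected punishment rises with p below the threshold, then drops to 0.
        print(p, expected_punishment(p, P, C))
```

Under these assumed values, expected punishment climbs as p approaches 0.25 and falls to zero once p passes it, which is the non-monotonic pattern the argument relies on.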
It is worth noting that under the conditions as hypothesized, an empirical investigation of the relationship between severity or probability of punishment and violation rates will show no benefit of enforcement as long as the range of empirical experience remains below the critical values.
A 1-Shot Sequential Compliance Game with n Potential Offenders.
When there are multiple potential offenders and sanctions capacity is constrained, the potential offenders may find themselves in a situation of interdependent choice, where the optimal play for each depends on the play of the others.
Once again, assume that each subject will violate the rule just in case the cost of compliance is greater than the expected sanction (24). Assume also an enforcement agency with perfect information about violations (see SI Text for a formal exposition of the enforcement agency's problem). The game has 7 rules:
1. The game is played once and not repeated.
2. The game is played by n subjects, A1, A2, … Ai, … An, each strategic, self-interested, and risk-neutral, and known to be so by the other players. (The enforcement agency is not modeled as a player.)
3. Each subject either complies with the rule, incurring a compliance cost (or opportunity cost of forgone chance to violate) C, or violates the rule. Each violator either pays a penalty, P, where C < P < nC, or escapes without penalty, depending on the strategy adopted by the enforcer. P and C are identical across players and that fact is common knowledge.
4. Moves are sequential. A1 moves before A2, A2 moves before A3, and so on. Each player is aware of the rules and of the decisions made by the previous players.
5. There is only 1 sanction available, and it must be used when there is a violation.
6. The sanction is not assigned until all subjects have made their choice.
7. Players cannot communicate directly and cannot make side-payments. (If this were not true, then 2 players could agree to both violate, with those who escape punishment compensating the one who incurs punishment, and that agreement would be robust to any enforcement strategy.)
Proposition 1.
In an n-player game when C < P < nC and the sanction is randomly assigned, if any player violates, “all violate” is a Nash equilibrium.
Consider first the sequential 2-player game. If only 1 player violates, he is punished with certainty; if both players violate, each has a 0.5 chance of being punished. Fig. 1 displays the sequential game and the expected payoffs. If A1 has complied, then A2 pays P if A2 violates. Because P > C, A2 chooses to pay C to comply. That vindicates A1's decision to comply, because with A2 complying, A1 would have faced P (> C) for violation. Thus comply-comply is a Nash equilibrium.
Fig. 1.
Sequential game with a coin-flip to assign 1 sanction if there are 2 violators. Players pay C to comply and P if they are sanctioned.
But if A1 violates, the expected cost of violation by A2 is P/2, by assumption less than C. Because A2 is risk-neutral, A2 violates. This again vindicates A1's choice, because he now faces an expected cost of P/2 rather than P for his violation. Thus, violate-violate is also a Nash equilibrium.
Therefore A1, in reasoning strategically about his move, faces in effect a choice between complying at cost C and not complying at cost P/2 < C. Thus he rationally chooses not to comply, and A2 follows suit. Strategically rational play by both players thus leads both to violate.‖ The result trivially generalizes to n players under the assumption of perfect rationality as common knowledge; as long as P/n < C, all violate.††
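The backward-induction reasoning for the sequential game with one randomly assigned sanction can be checked mechanically for small n. The sketch below is a hedged illustration, not the authors' code: the function name `solve_sequential` and the tie-breaking rule (comply when indifferent) are assumptions, and the recursion visits all 2^n histories, so it is only practical for small n.

```python
def solve_sequential(n, penalty, cost):
    """Return the play path (True = violate) found by backward induction.

    Assumptions (illustrative): one sanction is assigned uniformly at random
    among violators, so each of V violators expects to pay penalty / V;
    players are risk-neutral and comply when indifferent.
    """
    def expected_cost(i, profile):
        if not profile[i]:
            return cost
        return penalty / sum(profile)  # sum counts the violators

    def outcome(history):
        # Final profile reached if every remaining player moves optimally.
        if len(history) == n:
            return history
        i = len(history)
        if_comply = outcome(history + (False,))
        if_violate = outcome(history + (True,))
        if expected_cost(i, if_violate) < expected_cost(i, if_comply):
            return if_violate
        return if_comply

    return outcome(())


if __name__ == "__main__":
    print(solve_sequential(2, 1.5, 1.0))  # P/2 < C
    print(solve_sequential(3, 3.5, 1.0))  # P/3 > C
```

With the hypothetical values C = 1 and P = 1.5, the two-player path is violate-violate; raising P above nC restores universal compliance.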
If we amend Rule 4 to make this a simultaneous game instead of a sequential game, the participants are in a “stag hunt” (25). If both violate, both are better off than if they both comply. But if only 1 violates, he is worse off than he would have been had he complied. Thus neither has a dominant strategy, and the outcome is indeterminate.
Proposition 2.
Increasing the capacity to punish can decrease the amount of punishment actually inflicted.
Proposition 1 shows that with 2 players moving sequentially and 1 randomly assigned punishment available, punishment will always be inflicted because both players will violate the rule. But if the punishment constraint is publicly relaxed so that both players know that both will be punished if both violate, then compliance is the dominant strategy for each player as long as P > C. If both comply, neither is punished. Therefore relaxing the punishment-capacity constraint from 1 to 2 (in an n-player game, to any value x such that P(x/n)>C) reduces the number of punishments actually inflicted from 1 to 0.
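The capacity threshold mentioned in the parenthesis, the smallest x with P(x/n) > C, is easy to compute. The helper below is a minimal sketch with a hypothetical name (`min_capacity`) and hypothetical example values; it only evaluates the stated inequality.

```python
def min_capacity(n, penalty, cost):
    """Smallest number of sanctions x (1..n) with penalty * x / n > cost.

    Returns None if no x up to n satisfies the inequality, which can only
    happen when penalty <= cost.
    """
    for x in range(1, n + 1):
        if penalty * x / n > cost:
            return x
    return None


if __name__ == "__main__":
    print(min_capacity(2, 1.5, 1.0))    # two-player example from the text
    print(min_capacity(100, 4.0, 1.0))  # hypothetical 100-player example
```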
Proposition 3.
Even with a punishment-capacity constraint, establishing a priority order for punishment among potential violators makes universal compliance the only Nash equilibrium as long as the sanction cost is greater than the cost of compliance.
The “equal-opportunity” enforcement strategy under which each violator has the same probability of punishment is not the optimal strategy for an enforcement agent interested in reducing the violation rate and in economizing the use of actual punishment. Departing from that rule by announcing a priority order for punishment can substitute for relaxing the punishment-capacity constraint.
In the 2-player case, let the enforcer assign priority for punishment to A1. Now A1 complies, because otherwise he is certain to be punished. Therefore A2 also complies. Instead of 2 violations and 1 punishment, the result is 0 violations and 0 punishments.
The same holds if A2 has priority. Now A2 faces certain punishment if he violates, no matter what A1 does. Therefore he will always comply. A1, knowing that, knows that he faces certain punishment if he violates; therefore he will also comply. Threatening either player with certain punishment in case of violation therefore deters both.
That result holds if moves are simultaneous rather than sequential. The directly threatened player will always comply, and therefore the other player can never hope to violate with impunity because the single sanction will always remain available for him.
This result generalizes directly to the n-player case, whether the moves are sequential or simultaneous. Let the enforcer assign to each of the n players a unique and immutable priority number, the numbers ranging from 1 to n. (Because the players are identical, we can assume without loss of generality that each of the Ai has priority i). The players then move, thus partitioning themselves into 2 subsets: Compliers and Violators. Compliers pay C. The member of the Violator subset with the highest priority (lowest priority number), Vmin, pays the sanction P > C. All other players in the Violator subset pay 0.
Vmin has made a mistake. He pays P, when he would have paid only C < P by complying. Therefore, Vmin would prefer to change strategies and move to the Complier subset. But that leaves a new Vmin, similarly discontented. As long as there is at least 1 player in the Violator subset, 1 player will have an incentive to change strategies. Thus, only universal compliance is a Nash equilibrium.
To put the argument formally: It is the condition of a Nash equilibrium that the Violator subset have no lowest-numbered member, but every non-empty subset of the natural numbers has a least element, therefore in equilibrium the Violator subset must be empty. This is equivalent to a proof by mathematical induction: the proposition “The player with priority number n would regret violating” is true if n = 1 (because the first person on the punishment-priority list can never get away with a violation) and always true of n + 1 if it is true of n (because if player n would regret violating, he will not violate, in which case player n + 1 would regret violating). The proposition is thus true for all of the natural numbers, so every player would regret violating; therefore all will comply.
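For small n, the contrast between random assignment and a fixed priority order can be verified by brute force over all pure-strategy profiles of the simultaneous-move game. The following is a sketch under stated assumptions: the function names are ours, player index 0 is taken to have the highest priority, and a profile counts as an equilibrium when no player can strictly lower his expected cost by switching unilaterally.

```python
from itertools import product


def cost_random(i, profile, penalty, cost):
    # One sanction assigned uniformly at random among the violators.
    if not profile[i]:
        return cost
    return penalty / sum(profile)


def cost_priority(i, profile, penalty, cost):
    # One sanction always assigned to the lowest-numbered violator.
    if not profile[i]:
        return cost
    return penalty if profile.index(True) == i else 0.0


def nash_equilibria(n, cost_fn, penalty, cost):
    """All pure profiles (True = violate) with no profitable unilateral deviation."""
    equilibria = []
    for profile in product((False, True), repeat=n):
        stable = True
        for i in range(n):
            deviation = profile[:i] + (not profile[i],) + profile[i + 1:]
            if cost_fn(i, deviation, penalty, cost) < cost_fn(i, profile, penalty, cost):
                stable = False
                break
        if stable:
            equilibria.append(profile)
    return equilibria


if __name__ == "__main__":
    print(nash_equilibria(3, cost_random, 1.5, 1.0))
    print(nash_equilibria(3, cost_priority, 1.5, 1.0))
```

With the hypothetical values C = 1 and P = 1.5, random assignment leaves both "all comply" and "all violate" as equilibria, while the priority rule leaves only "all comply".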
As long as players are rational and their rationality is common knowledge, all potential violators can thus be deterred with a single threat, which never needs to be made good. This is the strategy of the proverbial Texas Ranger with just 1 bullet in his gun who prevents an angry mob from rushing the jail when he says—and, crucially, is believed when he says—that he will shoot the first person who steps forward. No member of the mob wants to be first. But if no one is first, then no one ever steps forward, and no one is shot.
The claim that it is possible to deter not merely any finite number but any countable number of potential offenders with a single threat, and therefore never to have to deliver on the threat, seems implausible if made about the empirical world. While it would be true for a hypothetical group of perfectly rational actors if the rationality of all were common knowledge among them, the “Texas Ranger” strategy is not robust to mistakes by members of the target group. For example, it will fail if a member of the mob is too drunk to understand the warning, too angry to heed it, accidentally stumbles forward, or is pushed from behind by another member of the mob who thinks strategically. In an imperfect world, the larger the number of participants the higher the probability that 1 of them will violate in error, thus making it rational for all those below him on the priority ordering list to violate as well.
That raises the question of how rational participants would respond to a world of uncertainty about one another's behavior, and how an enforcement agency could use any given level of punishment capacity to maximum advantage. The resulting problem in interdependent decision-making under uncertainty is best explored by simulation.
Simulation Results.
Now consider a multi-round compliance/enforcement game with n potential offenders, each of whom, acting simultaneously, violates or complies on each round. Each violator is or is not punished by an enforcer with a limited supply of sanctions, and updates his subjective probability of punishment (conditional on violation) accordingly, starting with a universal prior probability and using Bayesian updating. (They do not, in this model, know about or learn from the experience of other subjects, although the model could be adapted to allow such vicarious learning.) When the number of potential offenders exceeds the number of available sanctions, the enforcement agency must decide how to allocate the sanctions.
Consider 2 sanctioning strategies: Random sanctioning and dynamic concentration. With random sanctioning, the enforcer randomly assigns up to S sanctions among V violators on every trial. If V ≤ S, the probability of punishment given violation is unity. If V > S, each violator is punished with probability S/V, which falls as the number of violators rises. (If S is greater than V in a trial, the excess sanctions go unused and cannot be saved for future trials.) With dynamic concentration, each subject is assigned an immutable priority number, ranging from 1 to n. If V > S, the S violators with the highest priority (lowest priority numbers) are punished with certainty; others are not punished. Thus dynamic concentration is the stochastic analog of the priority-order system in the deterministic case.
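The two allocation rules can be written as small functions. This is a minimal sketch, assuming each subject is identified by an integer priority number (lower number means higher priority); the function names and that representation are our assumptions, not the paper's implementation.

```python
import random


def random_sanctioning(violators, capacity, rng):
    """Punish all violators if V <= S, else S of them chosen uniformly at random."""
    violators = list(violators)
    if len(violators) <= capacity:
        return set(violators)
    return set(rng.sample(violators, capacity))


def dynamic_concentration(violators, capacity):
    """Punish the (up to) S violators with the lowest priority numbers."""
    return set(sorted(violators)[:capacity])


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    current_violators = [7, 3, 9, 1]
    print(random_sanctioning(current_violators, 2, rng))
    print(dynamic_concentration(current_violators, 2))
```

In both functions unused sanctions are simply discarded when V is smaller than S, matching the rule that excess capacity cannot be saved for future trials.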
For 2 potential offenders and a single sanction available each round, there are 2 equilibria, 1 with both violating and 1 with both complying (see Figs. S1 and S2). Which equilibrium is reached depends on the offenders' initial beliefs about the probability of punishment conditional on violation; if both offenders start out with sufficiently low subjective probabilities of punishment, they will always reach the violate-violate equilibrium; with sufficiently high priors, they will always reach the comply-comply equilibrium.
From the high-violation equilibrium, adding a second sanction forces the system to the low-violation equilibrium, which remains stable if the second sanction is then removed.
The enforcer can “tip” the system from the high-violation to the low-violation equilibrium without adding a second sanction by giving 1 of the offenders priority in sanctioning (i.e., using the strategy of dynamic concentration). The number of rounds before reaching the low-violation equilibrium, and the number of sanctions assigned, fall if some exogenous shock (which in practice might be a warning) causes the offenders to raise their prior probability estimates or to update their beliefs more quickly.
As in the deterministic analysis, the simulation results for 2 potential offenders generalize to n potential offenders: There are 2 equilibria; which is reached depends on the initial subjective probabilities; increasing sanctions capacity can reduce the level of sanctions actually imposed; a temporary increment to sanctions capacity can “tip” the system to its low-violation equilibrium; and dynamic concentration will outperform equal-probability sanctioning. The same outcomes result if we relax the assumption that each violator has a constant cost of compliance C.
Consider a situation where there are 100 potential offenders and the enforcer has more than 1 sanction available. Begin by assuming that everyone has a prior probability of sanction given violation equal to 5% (α = 1 and β = 19).
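The parameters α = 1 and β = 19 describe a Beta prior whose mean, α/(α + β), is 5%. A hedged sketch of the belief update follows. It assumes standard Beta-Bernoulli conjugate updating (a punished violation adds 1 to α, an unpunished one adds 1 to β); the class name is hypothetical, and the exact updating rule used in the paper's simulations is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    """Subjective Beta(alpha, beta) belief about P(sanction | violation)."""
    alpha: float = 1.0
    beta: float = 19.0

    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def update(self, punished: bool) -> None:
        # Assumed conjugate update: each violation is one Bernoulli observation.
        if punished:
            self.alpha += 1
        else:
            self.beta += 1


if __name__ == "__main__":
    belief = Belief()
    print(belief.mean())   # prior mean for alpha = 1, beta = 19
    belief.update(punished=True)
    print(belief.mean())   # rises after one punished violation
```

A warning could be represented in this sketch by starting from a prior with a higher mean, for example α = 10 and β = 10 for the 50% case.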
Fig. 2A displays simulation results that compare the number of violations committed under random sanctioning (solid line) and dynamic concentration (dashed). The y-axis measures the total number of violations committed in a simulation with 500 trials and the x-axis represents S. For both strategies, a simulation is performed at 100 different levels of S and the total number of violations is recorded for each increment of S. The mean of the total number of violations is based on 20 runs of the simulation (for a total of 4,000 simulations).
Fig. 2.
Dynamic concentration reduces violations and the critical value of the sanctions constraint. [Prior probability of sanction given violation: 5% (α = 1 and β = 19)]. The y-axes measure the total number of violations (A) and sanctions delivered (B) in a simulation with 500 trials and the x-axis represents the enforcement agency's sanction capacity, S. The dashed line represents dynamic concentration and the solid line represents random sanctioning. The mean number of violations and sanctions is based on 20 simulations for each increment of S.
Increasing sanctions capacity eventually creates a low-offending equilibrium in both cases, but the 2 strategies follow different pathways to this equilibrium. The area under the violation curve is considerably smaller for dynamic concentration than for random sanctioning.
Fig. 2B is based on the same simulations used to generate Fig. 2A, but now we plot the number of sanctions delivered as a function of sanctions capacity. For both strategies there is a linear increase in total sanctions delivered as sanctions capacity increases up to the tipping point value of S. Once that critical value is reached, the system moves to a high-expected-sanction, low-violation equilibrium with lower use of actual sanctions. Dynamic concentration reduces the sanctions capacity required to get to the low-offending/high-expected sanction equilibrium.
Now assume that all offenders have a prior probability of sanction given violation equal to 50%, rather than the 5% assumed in the previous simulation (Fig. S3). That change decreases the time it takes to get to the tipping point for both random sanctioning and dynamic concentration, and also increases the advantage of dynamic concentration.‡‡ Conversely, when the prior probability is 1% (Fig. S4), the curves for random sanctioning and dynamic concentration are closer to each other. While dynamic concentration still leads to fewer violations than random sanctioning in the 1% scenario, the differences are not quite as dramatic as they are in the other scenarios. These sensitivity analyses suggest that dynamic concentration can lead to more spectacular results when the prior probability of sanction given violation is increased at baseline, for example with a warning, consistent with actual practice in both the High Point intervention (8) and the Hawaii probation project (9).
Thus, although in a stochastic world it is not possible to make threats do all of the work, and therefore not possible to control infinite numbers of subjects with a single threatened penalty, distributing sanctions according to a fixed priority order greatly improves the terms of the tradeoff between reducing violation rates and economizing on sanctions.
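The repeated game described in this section can be sketched as a short simulation. This is a simplified illustration, not the authors' model: it assumes deterministic threshold behavior (an agent violates iff his posterior mean times P is below C, whereas the paper's agents behave stochastically), Beta-Bernoulli updating from personal experience only, and hypothetical values P = 5 and C = 1. The function name `simulate` and all defaults are ours.

```python
import random


def simulate(n, capacity, rounds, strategy, penalty=5.0, cost=1.0,
             alpha=1.0, beta=19.0, seed=0):
    """Return (total violations, total sanctions) over the given rounds.

    strategy is "random" or "dynamic"; agent index doubles as priority
    number (0 = highest priority). All parameter values are illustrative.
    """
    if strategy not in ("random", "dynamic"):
        raise ValueError("strategy must be 'random' or 'dynamic'")
    rng = random.Random(seed)
    a = [alpha] * n  # per-agent Beta parameters
    b = [beta] * n
    total_violations = 0
    total_sanctions = 0
    for _ in range(rounds):
        # Assumed decision rule: violate iff expected penalty is below cost.
        violators = [i for i in range(n)
                     if a[i] / (a[i] + b[i]) * penalty < cost]
        if len(violators) <= capacity:
            punished = set(violators)
        elif strategy == "dynamic":
            punished = set(violators[:capacity])  # list is in priority order
        else:
            punished = set(rng.sample(violators, capacity))
        for i in violators:  # only violators learn about the sanction risk
            if i in punished:
                a[i] += 1
            else:
                b[i] += 1
        total_violations += len(violators)
        total_sanctions += len(punished)
    return total_violations, total_sanctions


if __name__ == "__main__":
    for s in (5, 10, 30):
        print(s, simulate(100, s, 500, "random"), simulate(100, s, 500, "dynamic"))
```

Sweeping `capacity` from 1 to n and averaging several seeded runs of the random strategy would give curves comparable in spirit to Fig. 2, although the numbers depend entirely on the assumed parameters and decision rule.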
Discussion
When punishment capacity is constrained and offenders' behavior responds to changes in the probability of punishment, a dual-equilibrium “tipping” situation can result. In that case, temporary increases in punishment capacity can lead to lasting changes in violation rates. A strategy of dynamically concentrating sanctions on a subset of violators can reduce violation rates and the total amount of punishment actually delivered. When the capacity to punish is constrained, dynamic concentration can be more effective and less costly than randomly assigning sanctions to offenders.
These findings help us explain the spectacular results of some criminal-justice interventions using focused deterrence (7–9, 26), which seen in this light are not as surprising as experience with equi-probability enforcement made them appear. Dynamically concentrating sanctions is an attractive alternative to the current system of more or less random sanctioning.
Finally, these results may also point the way to continuing the past decade's decrease in crime rates while reversing the 20-year-long prison-building boom, which has given the United States, with 2.2 million people behind bars at any 1 time, the dubious distinction of having the world's highest rate of incarceration per capita (27, 28).§§
The applicability of these models in any given situation is an empirical question, not one that can be answered by game trees or simulation modeling. But the strategy of dynamic concentration is potentially applicable whenever those subject to a rule are somewhat deterrable and where sanctions capacity is scarce compared to the number of detected violations. Advances in monitoring technology such as GPS tracking and remote and continuous alcohol testing increase the range of circumstances in which monitoring is easy, making punishment capacity the binding constraint, and putting a premium on strategies such as dynamic concentration that maintain deterrence while economizing on actual punishment.
These monitoring technologies are becoming popular in criminal justice settings, and they can generate a tremendous amount of information (29). In situations when probation officers in a particular office are overwhelmed by the violations and cannot apprehend and sanction all of the offenders, dynamic concentration would be a better strategy than randomly targeting violators. In practice, priority for sanctions with dynamic concentration would be established by group rather than by individual. A subset of potential violators—perhaps those probationers who committed the most violations in the past—is selected for “zero-tolerance” enforcement and is sanctioned for every violation. The size of that initial group is limited by the capacity of the system to impose sanctions. Once the violation rate within that group falls to the point where the available sanctions capacity (in the form of court time for hearings and jail cells for punishment) is no longer being fully absorbed, the range of people subject to “zero-tolerance” enforcement can be expanded without adding sanctions capacity.
Situations in which the constrained resource is detection capacity rather than sanctions capacity require different strategies, because in that circumstance success in reducing the violation rate does not generate resource savings to the same extent; thus maintaining low violation rates will in general be more costly than in the case considered here. Drunk-driving roadblocks provide an example; a roadblock quickly reduces violation rates, but the fixed cost of the roadblock remains even as violation rates fall, and is thus hard to sustain over time. The problem of optimal monitoring and punishment strategy when both monitoring and punishment are costly is a target for future research.
The model presented here could also be usefully extended in at least 6 different directions. First, the assumption that each offender learns only from his own experience could be relaxed by allowing learning from the experience of others (to model general, as opposed to specific, deterrence). As a further extension, the model might allow each offender to learn more readily from the experience of some of his peers than from others; the optimal sanctions policy is likely to vary in complex ways with the structure of that sociogram (see SI Text for thoughts about collusion). Intuition suggests that more-central individuals are more worth concentrating on early than others, but that might not always be the case. Modeling the internal dynamics of these subgroups would also allow one to introduce collective punishments. A further complication would come from relaxing the assumption that the participants cannot collude. If they can coordinate perfectly, they can defeat the strategy of dynamic concentration; if their capacity to coordinate is limited, then the enforcement agency will need to craft strategies to make successful collusion harder.
Second, the enforcement agency could be allowed to communicate warnings to offenders, as in the High Point and Honolulu examples (8, 9). Because the game between each offender and the enforcer is not zero-sum, credible communication is possible, reaching a low-violation equilibrium by sparing potential offenders the cost of “finding out the hard way” and sparing enforcers the cost of administering sanctions. The problem of finding an optimal warning strategy will not always have an obvious solution; too aggressive a warning strategy risks making threats that cannot be delivered on, thereby weakening the credibility of all such threats for the future.
Third, the enforcement agency could be allowed to vary the severity of punishment instead of its probability. Our intuition is that dynamic concentration would greatly reduce the severity level required to “tip” from the high-violation equilibrium to the low-violation equilibrium.
Fourth, the assumption of perfect rationality could be relaxed, making some or all offenders subject to temporal myopia (30), prospect-theoretic behavior (31), or optimistic bias (32, 33). In general, these will increase the importance of certainty and swiftness of punishment compared to severity (34), but the problem of finding an optimal strategy in a population with mixed decision styles, and especially in a population whose decision styles have to be deduced from behavior, remains to be addressed.
Fifth, the closed nature of the process, with no subjects entering and none leaving, could be relaxed, and removing a subject from the process (e.g., by imprisonment in the law enforcement context, expulsion in the classroom context, or firing in the managerial context) could be added to the enforcement agency's repertoire.
Sixth, another extension could allow for heterogeneity of offenses as well as offenders, in terms both of severity and of deterrability.
As noted, the potential usefulness of dynamic concentration extends well beyond criminal-justice settings. The principles apply to anyone trying to enforce rules where monitoring is relatively easy but the stock of sanctions is limited. Managers, teachers, and parents all face analogous problems. So do tax collectors and those charged with enforcing environmental, workplace-safety, and product-safety regulations. Because the model assumes that those subject to the rule act strategically but not in coordinated fashion, it may have only limited applicability to armed conflict or organized insurgency, as opposed to crowd control (20). Sometimes the relevant concentration will be on individuals, but it might just as well be on organizations, geographic regions, or violation types.
In every case the principle is the same: Better to actually control someone or somewhere or something than to fail in an attempt to control everyone and everywhere and everything. That casts doubt on the optimality of “zero-tolerance” policies unless the range of what will not be tolerated and the size of the population to be controlled are calibrated to the available sanctions capacity and clearly announced in advance.
The applicability of this principle is limited to circumstances in which it is the capacity to punish rather than the capacity to detect violations that is the binding constraint; to open drug dealing, for example, rather than burglary. But the alligators-and-swamp problem of law enforcement in a high-violation context may be more tractable than it looked at first glance.
Materials and Methods
For every trial of a simulation, let each subject face a cost of compliance C with a mean equal to one-half P (changing the value of C within the range P > C > P/n would change the numerical results but leave the qualitative results intact). This cost of compliance will vary by offender and vary over time. Because the subjects are rational, the decision to comply or violate depends not only on C and P, but also on that subject's estimate of the probability of punishment given violation, which changes with that player's experience.
The enforcer has sanctions capacity S and can choose which violators to sanction, up to S violators per trial. As in the simple compliance game, the enforcer has perfect information about which subjects violate and chooses a sanctioning strategy that determines how this information is used.
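The enforcer's choice of whom to sanction can be sketched in a few lines of Python. This is a minimal illustration, not the simulation code used for the results reported here. The function name `allocate_sanctions`, the integer subject ids, and the reading of dynamic concentration as a fixed priority ordering (lowest id first) are assumptions made for the example.

```python
import random


def allocate_sanctions(violators, capacity, strategy="dynamic", rng=None):
    """Pick which violators are sanctioned in one trial.

    violators: list of integer subject ids that violated this trial.
    capacity:  S, the maximum number of sanctions available per trial.
    strategy:  "random"  -> every violator is equally likely to be chosen;
               "dynamic" -> violators are sanctioned in a fixed priority
                            order, lowest id first (an assumed reading of
                            dynamic concentration, for illustration only).
    """
    if capacity <= 0 or not violators:
        return []
    # With spare capacity, every violator is sanctioned under either strategy.
    if len(violators) <= capacity:
        return sorted(violators)
    if strategy == "random":
        rng = rng or random.Random()
        return sorted(rng.sample(violators, capacity))
    if strategy == "dynamic":
        return sorted(violators)[:capacity]
    raise ValueError(f"unknown strategy: {strategy!r}")


# Four violators, two sanctions: the two highest-priority violators are chosen.
print(allocate_sanctions([5, 2, 9, 1], capacity=2))  # → [1, 2]
```

Under the random strategy each violator in this example faces a sanction probability of S divided by the number of violators, whereas under the priority ordering the top-ranked violators face certainty and the rest face none until those ahead of them comply.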
Assume that all offenders begin with the same prior probability of sanction given violation and generate a posterior probability using this formula: (α + number of sanctions)/(α + β + number of violations), where α and β are both fixed positive constants determined exogenously (35). A trial is any unit of time; in modeling probation enforcement, a trial might be, for example, the period between drug tests. See SI Text for more information about the simulation.
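The updating formula can be illustrated with a short Python sketch. The function names and the values α = β = 1 are chosen only for the example. The decision rule shown (violate when the cost of compliance exceeds the expected punishment) is an assumed reading of rational choice in this model and is not taken from the SI Text.

```python
def posterior_sanction_probability(alpha, beta, n_sanctions, n_violations):
    """Subjective probability of sanction given violation:
    (alpha + sanctions) / (alpha + beta + violations)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if not 0 <= n_sanctions <= n_violations:
        raise ValueError("sanctions must lie between 0 and the violation count")
    return (alpha + n_sanctions) / (alpha + beta + n_violations)


def violates(cost_of_compliance, penalty, p_hat):
    """Assumed decision rule: violate when complying costs more than the
    expected punishment p_hat * penalty."""
    return cost_of_compliance > p_hat * penalty


# Illustrative values only: alpha = beta = 1 gives a prior of 0.5.
prior = posterior_sanction_probability(1, 1, 0, 0)        # 0.5
unpunished = posterior_sanction_probability(1, 1, 0, 3)   # 1/5 = 0.2
punished = posterior_sanction_probability(1, 1, 3, 3)     # 4/5 = 0.8

# With C = 0.5 and P = 1, three unpunished violations encourage violation,
# while three punished violations deter it.
print(violates(0.5, 1.0, unpunished))  # → True
print(violates(0.5, 1.0, punished))    # → False
```

Small values of α and β make the estimate respond quickly to each subject's own experience; larger values anchor it more firmly to the common prior α/(α + β).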
Acknowledgments.
We thank H. Bernhardt; J. Caulkins; P. Cook; R. Frank; A. Hawken; D. Hsia; T. Miles; B. O'Flaherty; M. O'Hare; G. Ridgeway; T. Schelling; 2 anonymous reviewers; participants at the 2006 Association for Public Policy Analysis and Management Meeting; participants at seminars and workshops at the University of Maryland School of Public Policy and Department of Criminology, George Mason University Law School, Yale University Department of Political Science, John Jay College, and Harvard Law School for useful comments and suggestions; and A. Morral for early and significant contributions. This work was supported by the National Institute of Justice and the Smith-Richardson Foundation. Sabbatical leave was provided to M.K. by the University of California Los Angeles School of Public Affairs, in conjunction with the Thomas C. Schelling Visiting Professorship at the University of Maryland.
Footnotes
The authors declare no conflict of interest.
This article contains supporting information online at www.pnas.org/cgi/content/full/0905513106/DCSupplemental.
A randomized controlled trial of this program is underway, and the preliminary results are promising (9).
Variously attributed to Nimzowitsch, Tartakower, and Eisenbach.
There are alternative explanations for differential crime rates in seemingly similar communities. See SI Text for a review of these explanations.
Previously published theoretical models suggest that focused enforcement with announcement can outperform a strategy of random enforcement (21). Using computational modeling, this paper builds on these efforts by (i) focusing on repeated game settings, (ii) relaxing the assumption that potential offenders have perfect information about the probability of being punished, (iii) allowing those subject to a rule to update their subjective probability of being punished in Bayesian fashion, and (iv) showing that dynamic concentration as a punishment-allocation strategy can reduce both offense levels and punishment levels, compared to equal-probability punishment. The same game has also been approached from the other side, focusing on the problem of political dissenters who need to coordinate the time of their dissent to minimize the punishment risk faced by each one of them (22). van Baal's book on computer simulations of criminal deterrence (23) does not address our main concern about how to choose whom to sanction when the number of offenders exceeds the number of available sanctions.
If P < C, then violation is a dominant strategy for both players; if P > 2C, then compliance is a dominant strategy for both players, because in that case P/2, the expected value of punishment, is greater than the cost of compliance even if both players were to violate. So only the intermediate case where P is between C and 2C (more generally, between C and nC) is strategically complex.
However, if there is any player who attributes to each other player a nonzero probability of defection, then that player's subjective estimate of the probability that someone else will defect, and therefore that player's incentive to defect, tends toward unity as n grows.
In Fig. 2, the tipping point for random sanctioning is ≈20% higher than it is for dynamic concentration; the comparable number for Fig. S3 is ≈33%.
That adding sanctions capacity can, mathematically, reduce actual sanctions use does not imply that any actual increase in sanctions capacity, e.g., building more prisons, will lead to less punishment. That depends on circumstances and on how the additional capacity is used; using more prison cells to impose longer terms does not increase certainty and therefore will increase, rather than decrease, total punishment actually inflicted.
References
- 1. Fehr E, Fischbacher U. The nature of human altruism. Nature. 2003;425:785–791. doi: 10.1038/nature02043.
- 2. Henrich J, et al. In: Genetic and Cultural Evolution of Cooperation. Hammerstein P, editor. Cambridge, MA: MIT Press; 2003. pp. 125–152.
- 3. Hobbes T. Leviathan. London, UK: Andrew Crooke; 1651.
- 4. Fehr E, Gächter S. Cooperation and punishment in public goods experiments. Am Econ Rev. 2000;90:980–994.
- 5. Hume D. Treatise on Human Nature. London, UK: Longmans, Green & Co.; 1898.
- 6. Gächter S, Renner E, Sefton M. The long-run benefits of punishment. Science. 2008;322:1510. doi: 10.1126/science.1164744.
- 7. Kleiman M. When Brute Force Fails: Strategic Thinking for Crime Control. Princeton, NJ: Princeton University Press; 2009.
- 8. Kennedy D. Deterrence and Crime Prevention. London, UK, and New York, NY: Routledge; 2008.
- 9. Kleiman M, Hawken A. Fixing the Parole System. Issues Sci Tech. 2008 Summer.
- 10. Schelling T. The Strategy of Conflict. Cambridge, MA: Harvard University Press; 1960.
- 11. Kleiman M. Enforcement swamping: A positive-feedback mechanism in rates of illicit activity. Math Comp Mod. 1993;17:65–75.
- 12. Rasmusen E. Stigma and self-fulfilling expectations of criminality. J Law Econ. 1996;39:519–543.
- 13. Ehrlich I. Participation in illegitimate activities: A theoretical and empirical investigation. J Pol Econ. 1973;81:521–565.
- 14. Lui F. A dynamic model of corruption deterrence. J Pub Econ. 1986;31:215–236.
- 15. Andvig J, Moene K. How corruption may corrupt. J Econ Behav Org. 1990;13:63–76.
- 16. Sah R. Social osmosis and patterns of crime. J Pol Econ. 1991;99:1272–1295.
- 17. Schrag J, Scotchmer S. The self-reinforcing nature of crime. Intl Rev Law Econ. 1997;17:325–335.
- 18. Banfield E. The Unheavenly City: The Nature and Future of Our Urban Crisis. Boston, MA: Little, Brown and Company; 1970.
- 19. Grodzins M. Metropolitan segregation. Sci Am. 1957;4:33–41.
- 20. Schelling T. Micromotives and Macrobehavior. New York, NY: Norton; 1978.
- 21. Lando H, Shavell S. The advantage of focusing law enforcement effort. Intl Rev Law Econ. 2004;24:209–218.
- 22. DeNardo J. Power in Numbers. Princeton, NJ: Princeton University Press; 1985.
- 23. van Baal P. Computer Simulations of Criminal Deterrence. The Hague, The Netherlands: Boom Juridische uitgevers; 2004.
- 24. Becker G. Crime and punishment: An economic approach. J Pol Econ. 1968;76:169–217.
- 25. Skyrms B. The Stag Hunt and Evolution of Social Structure. Cambridge, UK: Cambridge University Press; 2004.
- 26. Piehl A, Kennedy D, Braga A. Problem solving and youth violence: An evaluation of the Boston gun project. Am Law Econ Rev. 2000;2:58–106.
- 27. Harrison P, Beck A. “Prison and jail inmates at midyear 2005”. Washington, DC: NCJ 213133, Bureau of Justice Statistics; 2006.
- 28. International Centre for Prison Studies. “World prison brief online”. London, UK: King's College; [accessed on June 18, 2007].
- 29. Kilmer B. The Future of DIRECT surveillance: Drug and alcohol use information from remote and continuous testing. J Drug Policy Anal. 2008;1:1–10.
- 30. Strotz R. Myopia and inconsistency in dynamic utility maximization. Rev Econ Stud. 1956;23:165–180.
- 31. Kahneman D, Tversky A. Prospect theory: An analysis of decision under risk. Econometrica. 1979;47:263–291.
- 32. Tversky A, Kahneman D. Judgment under uncertainty: Heuristics and biases. Science. 1974;185:1124–1131. doi: 10.1126/science.185.4157.1124.
- 33. Weinstein N. Unrealistic optimism about future life events. J Pers Soc Psychol. 1980;39:806–820.
- 34. Beccaria C. Essay on Crimes and Punishments. Indianapolis, IN: Bobbs-Merrill; 1764/1963.
- 35. Gelman A, Carlin J, Stern H, Rubin D. Bayesian Data Analysis. Boca Raton, FL: Chapman and Hall; 1995.