Author manuscript; available in PMC: 2016 Mar 12.
Published in final edited form as: Cell. 2015 Feb 26;160(6):1233–1245. doi: 10.1016/j.cell.2015.01.045

Neuronal Prediction of Opponent’s Behavior during Cooperative Social Interchange in Primates

Keren Haroush 1,2,*, Ziv M Williams 1,2,*
PMCID: PMC4364450  NIHMSID: NIHMS668054  PMID: 25728667

SUMMARY

A cornerstone of successful social interchange is the ability to anticipate each other’s intentions or actions. While generating these internal predictions is essential for constructive social behavior, their single-neuronal basis and causal underpinnings are unknown. Here, we discover specific neurons in the primate dorsal anterior cingulate that selectively predict an opponent’s yet unknown decision to invest in their common good or defect and distinct neurons that encode the monkey’s own current decision based on prior outcomes. Mixed population predictions of the other were remarkably near optimal compared to behavioral decoders. Moreover, disrupting cingulate activity selectively biased mutually beneficial interactions between the monkeys but, surprisingly, had no influence on their decisions when no net-positive outcome was possible. These findings identify a group of other-predictive neurons in the primate anterior cingulate essential for enacting cooperative interactions and may pave a way toward the targeted treatment of social behavioral disorders.

INTRODUCTION

Social interactions differ from other behaviors in that they inherently require individuals to anticipate each other’s unknown intentions and actions. Accordingly, individuals need to consider not only how their decisions affect their own personal outcomes but also how they may affect the outcomes of other individuals in a group and how these individuals may consequently respond. Such interactions, therefore, are not simply governed by the learned sensorimotor contingencies between action and outcome but are rather based on the ability to predict the unknown intentions or “state of mind” of others.

Whether and what neurons encode another’s unknown actions and what role these signals play during joint decisions, made independently by two interacting individuals, remain unknown. Prior studies have demonstrated that frontal canonical cells, termed mirror neurons, encode another’s known, observable actions, as well as actions performed by the individual himself (di Pellegrino et al., 1992; Rizzolatti and Sinigaglia, 2010). More recently, neurons have been similarly found to encode another’s observed receipt of reward (Azzi et al., 2012; Chang et al., 2013; Hosokawa and Watanabe, 2012), as well as monitoring of another’s errors (Yoshida et al., 2012, see Discussion). These findings have therefore provided a critical understanding of how another’s known and observable actions may be represented at the neuronal level. However, they are distinct from those that may represent another’s imminent decisions or intentions, which are fundamentally unobservable and unknown. While cells that predict another’s unobservable intended actions have been widely hypothesized, and are a cornerstone of many theories on animal social behavior (Frith and Frith, 1999; Gallese and Goldman, 1998; Rilling et al., 2004; Sanfey et al., 2006; Vogeley et al., 2001), their existence has never been demonstrated.

A second unresolved question is how putative neural signals related to self and other’s decisions may affect achieving mutual goals. Mutually beneficial interactions are ubiquitous among social animals (Bshary et al., 2008; Clutton-Brock, 2009; de Waal, 2000; Stephens et al., 2002; Warneken and Tomasello, 2006) and are cardinal to our understanding of socially-guided decisions. While competitive interactions, which allow an individual to profit at the expense of the other, have been previously investigated (Donahue et al., 2013; Hosokawa and Watanabe, 2012; Lee et al., 2005; Seo et al., 2014), the single-neuronal basis of mutually beneficial interactions, favorable to both individuals, has not been explored.

Finally, whereas certain areas may harbor signals that encode elements of social decision-making (Abe and Lee, 2011; Apps et al., 2012; Apps and Ramnani, 2014; Azzi et al., 2012; Behrens et al., 2008; Carter et al., 2012; Chang et al., 2013; Delgado et al., 2005; Donahue et al., 2013; Hampton et al., 2008; Lee et al., 2005; Rilling et al., 2002; Rudebeck et al., 2006; Sanfey et al., 2003; Tomlin et al., 2006; Yoshida et al., 2012), it has not yet been determined what causal contribution neurons in these areas may play in modulating mutual decisions.

A formal framework for studying mutually beneficial joint decisions is provided by the iterated prisoner’s-dilemma (iPD) game (Clutton-Brock, 2009; Rilling et al., 2002; Stephens et al., 2002). This task incorporates two crucial properties: one is that the outcome is contingent upon the mutual concurrent decisions of both individuals, and therefore no one decision guarantees an individual’s outcome, and the other is that both decisions can be either concordant or discordant (Camerer, 2003). Therefore, the key to succeeding in the game relies on one’s ability to anticipate the other’s concurrent, yet unknown intentions. Moreover, this dissociation of self and other decisions, concordant and discordant interactions, and the dissociation between one’s decision and reward, allows one to identify neuronal signals within the population that specifically encode another’s yet unknown decisions and importantly dissociate them from those that reflect one’s own planned decision and expected reward.

Here, we used a joint-decision paradigm to study mutual decisions in primates and provide evidence of neurons that predict another agent’s intentions and modes of cooperation. We specifically focused on the dorsal region of the anterior cingulate cortex (dACC) because of its broad connectivity with frontal and temporal-parietal areas known to be involved in interactive behavior (Behrens et al., 2009; Paus, 2001) as well as its role in encoding social interest in other individuals based on functional imaging (Behrens et al., 2008) and ablative studies (Rudebeck et al., 2006). We find that many dACC neurons encoded the monkey’s own decision to cooperate. Furthermore, a substantial and largely distinct group of neurons encoded the opponent monkey’s decisions when they were yet unknown. These other-predictive neurons were uniquely sensitive to social context compared to other population cells and encoded no information about the monkey’s own decisions or expected reward. At the population level, dACC neurons reliably predicted the other’s decisions with an accuracy that remarkably approached that of behavioral decoders when based on prior selections. Finally, transient disruption of dACC activity directly and specifically inhibited mutually beneficial interactions based on prior decisions, but did not affect other decisions based on receipt of reward.

These findings together provide direct examination of how individual neurons represent another’s unknown intentions or covert “state of mind,” demonstrate the distinct encoding of other decisions from self-decisions and reward, ascertain the distinct roles that self- and other-encoding cells play in enacting joint decisions between simultaneously interacting animals, and demonstrate a causal link between cingulate activity and the specific enactment of mutually beneficial decisions.

RESULTS

Increased Cooperation following Mutual Cooperation

Four pairs of adult male Rhesus monkeys (Macaca mulatta) performed an iPD game whereby each animal chose on each trial between two response options over multiple successive trials (Figure 1A). The choice terms, cooperation and defection, were derived from iPD literature (Camerer, 2003). These were defined operationally by the payoff matrix illustrated in Figure 1B and are not referred to here in an anthropomorphic way. If both animals selected cooperation, both received the highest mutual reward whereas if one of the animals defected, that animal received the highest individual reward. The lynchpin of this game, however, was that if neither monkey cooperated, they would both receive a lower reward than if they both chose to cooperate. Accordingly, each individual decision could result in either high or low reward depending on the other’s choice, and reward could not be predicted solely from any individual decision. Moreover, since the monkeys performed multiple trials, the decision of an individual to cooperate or defect on one trial may influence the other’s subsequent decisions and, therefore, affect the future potential for mutual benefit. Here, we used this setup to differentiate between potential neuronal signals that encoded self-decisions, other-decisions, and expected reward as both monkeys jointly, simultaneously made their own choices.
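
The payoff structure described above can be sketched as a small lookup table. The reward magnitudes below are hypothetical placeholders (the actual values appear in Figure 1B and are not reproduced here); only their ordering follows the text, and assigning the lowest reward to a lone cooperator is an assumption taken from the standard prisoner's dilemma.

```python
# Hypothetical reward units; only the ordering is taken from the text.
# Keys are (own_choice, other_choice); values are (own_reward, other_reward).
COOPERATE, DEFECT = "C", "D"

PAYOFF = {
    (COOPERATE, COOPERATE): (4, 4),  # highest mutual reward
    (DEFECT, COOPERATE): (6, 1),     # lone defector: highest individual reward
    (COOPERATE, DEFECT): (1, 6),     # lone cooperator: assumed lowest reward
    (DEFECT, DEFECT): (2, 2),        # mutual defection: below mutual cooperation
}


def own_reward(own_choice, other_choice):
    """Return the reward of the player making `own_choice`."""
    return PAYOFF[(own_choice, other_choice)][0]


def reward_range(own_choice):
    """Rewards a choice can yield, depending on the other's choice."""
    return sorted(own_reward(own_choice, other) for other in (COOPERATE, DEFECT))


# Neither choice fixes the outcome: each maps to a low and a high reward.
for choice in (COOPERATE, DEFECT):
    print(choice, reward_range(choice))
```

With any values that respect this ordering, each choice can lead to either a low or a high reward, which is why reward alone cannot be read off from an individual decision.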

Figure 1. Task Design.

(A) Experimental set-up. The monkeys sat side-by-side, facing a screen. On each trial, they covertly chose, in succession, to cooperate (orange hexagon) or defect (blue triangle). Following delay, both choices were revealed on screen and reward was delivered.

(B) Payoff matrix. Reward outcome for all possible choice combinations. Cooperation and defection were defined operationally by whether mutual benefit or loss is incurred.

(C) Trial timeline. The order in which the monkeys made their selections was randomized on each trial.

The monkeys sat side-by-side, facing a screen that displayed different targets representing the choice to cooperate or defect (note that facial expression observations or eye contact were not possible here by design). Neither monkey saw the other monkey’s selection until after they made their own selection plus an additional blank screen delay. Then both selections were revealed on-screen followed by reward (Figure 1C). To further rule out implicit signals such as auditory cues that may contribute to predictions of the other’s decisions, we randomly alternated the order in which monkeys made their selections (see below).

Behaviorally, we find that the monkeys were more likely to select defection than cooperation. The monkeys performed 1,346 trials over seven sessions; they chose defection in 65.3% of trials and cooperation in 34.7% of trials (chi-square = 123.7, df = 1, p < 10^−29). They selected cooperation simultaneously on 17.1% of trials, significantly higher than chance level (chi-square = 44.07, df = 1, p < 10^−11) and both defected on 37.6% of trials, significantly less than chance level (chi-square = 22.27, df = 1, p < 10^−6). Similar to prior observations in humans (Kuhlman and Marshello, 1975; Rapoport and Chammah, 1965), the monkeys were less likely to cooperate if the other previously defected (26% ± 6%; 2 × 2 chi-square = 56.89, df = 1, p < 10^−13) (Figure 2A), indicating their understanding of the task by taking into account the other’s past action when selecting their own. Moreover, the monkeys were most likely to cooperate if both monkeys cooperated on the preceding trial (62.1% ± 7.0%; chi-square = 76.7, df = 1, p < 10^−18) (Figure 2B), despite the fact that individual reward is maximized if a monkey defects when his opponent continues cooperating (note these choices did not reflect a simple tit-for-tat response; see Supplemental Information and Figure S1). In other words, the monkeys reciprocated mutual cooperation for continued mutual benefit. Finally, we examined the behavioral strategy followed by the monkeys by analyzing specific choice sequences and found that they were significantly different from chance (Figures 2C and 2D; see Supplemental Information).
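
As a worked check of the chance levels referred to above, the sketch below assumes that "chance" means both monkeys choosing independently at the pooled cooperation rate. That definition is an assumption here (the exact tests are described in the Supplemental Information), so the snippet reproduces only the expected joint frequencies, not the reported chi-square values.

```python
# Pooled choice rates reported in the text.
p_cooperate = 0.347
p_defect = 0.653

# Assumption: chance = independent choices by the two monkeys.
chance_mutual_cooperation = p_cooperate ** 2
chance_mutual_defection = p_defect ** 2

# Observed joint frequencies reported in the text.
observed_mutual_cooperation = 0.171
observed_mutual_defection = 0.376

print(f"mutual cooperation: chance {chance_mutual_cooperation:.3f}, "
      f"observed {observed_mutual_cooperation:.3f}")
print(f"mutual defection:   chance {chance_mutual_defection:.3f}, "
      f"observed {observed_mutual_defection:.3f}")
```

Under this assumption, mutual cooperation would be expected on roughly 12.0% of trials and mutual defection on roughly 42.6%, which agrees with the direction of the reported effects (17.1% and 37.6%, respectively).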

Figure 2. Mutually Beneficial Interactions Increase Cooperation.

(A) Conditional probability of a monkey cooperating given that it cooperated or defected on the preceding trial (left) and conditional probability of a monkey cooperating given the opponent cooperated or defected on the preceding trial (right). Error bars represent SEM.

(B) Probability of selecting cooperation following both monkeys’ prior mutual selections. Red bar denotes overall cooperation probability. Mutually beneficial interactions led to an increase in subsequent cooperation (this was not evident when playing a computer opponent or in separate rooms, see text).

(C) Probability of following tit-for-tat (TFT) strategy. Histogram shows probability for 5,000 control Monte Carlo realizations of surrogate behavioral data. Red dashed line indicates experimental data value.

(D) Probability of following win-stay-lose-switch (WSLS) strategy. Red dashed line indicates experimental data value. Inset denotes observed data values of both strategies (blue bars), error bars represent SEM, white bars denote mean of surrogate control values.

See also Figure S1.
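
The strategy analysis summarized in panels (C) and (D) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the WSLS rule follows the standard "Pavlov" convention, the surrogate data are built by shuffling one player's choices, and the sequences are toy data. The procedure actually used is described in the Supplemental Information.

```python
import random


def tft_fraction(own, other):
    """Fraction of trials on which `own` copies the other's previous choice."""
    matches = [own[t] == other[t - 1] for t in range(1, len(own))]
    return sum(matches) / len(matches)


def wsls_fraction(own, other):
    """Fraction of trials consistent with win-stay-lose-switch.

    Assumed convention (standard "Pavlov" rule): cooperate after the two
    players made the same choice, defect after they made different choices.
    """
    matches = [
        (own[t] == "C") == (own[t - 1] == other[t - 1])
        for t in range(1, len(own))
    ]
    return sum(matches) / len(matches)


def surrogate_distribution(statistic, own, other, n_surrogates=5000, seed=0):
    """Recompute `statistic` after shuffling the player's own choices.

    Shuffling is one simple way to build surrogate data; it is an assumption
    here, not necessarily the procedure used for Figure 2.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_surrogates):
        shuffled = own[:]
        rng.shuffle(shuffled)
        values.append(statistic(shuffled, other))
    return values


# Toy sequences (hypothetical, not experimental data).
own = list("CDCCDDCDCCDDDCDC")
other = list("DCCDDCDCCDDDCDCC")

observed = tft_fraction(own, other)
surrogates = surrogate_distribution(tft_fraction, own, other, n_surrogates=1000)
p_value = sum(v >= observed for v in surrogates) / len(surrogates)
print(f"TFT fraction {observed:.2f}, surrogate p = {p_value:.3f}")
```

Comparing the observed fraction with the surrogate distribution mirrors the comparison between the red dashed line and the histogram in the figure.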

Behavioral Controls

To determine whether the monkeys’ choices were affected by social context, i.e., their interaction with another monkey, we repeated the task in the exact same set-up, only now replacing a monkey with a computer opponent (Chang et al., 2013; Hosokawa and Watanabe, 2012). The computer’s choices were determined by the statistics of monkeys’ choices on the previous sessions, described above (see Supplemental Information). We find that the monkeys were less likely to cooperate overall (19.1% ± 3.9% versus 34.7%; chi-square = 161.73, df = 1, p < 10^−36). Moreover, they were less likely to reciprocate cooperation following mutual cooperation (14.5% ± 3.0% versus 62.1%; chi-square = 73.25, df = 1, p < 10^−17) when playing a computer opponent, therefore leading to fewer mutually beneficial interactions.

To eliminate the possibility that the reduced cooperation resulted from differences in choice selection between the computer model and the behaving monkey, we performed an additional set of social control experiments. Here, the monkeys were placed in two separate rooms so that they could not see the other player or hear each other’s licking sounds. In addition, the monkeys’ juicers were placed outside the experiment room to eliminate any cues from juicer clicks. Under these conditions, the monkeys performed the same task as before with each other. The monkeys performed a total of 2,344 trials in five experimental sessions. By and large, we find the behavior of the monkeys in this control to be similar to the behavior found in the computer opponent control. Namely, the overall probability that the monkeys cooperated under these conditions dropped significantly to 14.2%, compared with 34.7% when playing together (chi-square = 432.08, df = 1, p < 10^−95). Furthermore, we did not observe the increased cooperation following mutual cooperation that was a signature of the monkeys’ behavior when playing each other in the same room. Specifically, the probability of cooperating following a mutual cooperation trial dropped to 17.4% compared with 62% when playing in the same room (chi-square = 38.76, df = 1, p < 10^−9). This value closely matches the computer control value of 14.5% (not significant [n.s.] difference). Therefore, the effect of social context on the behavior of the monkeys is corroborated by these two independent control experiments (i.e., computer control and other room control).

As noted above, the monkeys demonstrated their understanding of the task by taking into account past joint decisions when selecting their own. However, to further confirm that the monkeys understood the relationship between their choices and payoff, the monkeys performed an additional control version of the task in which they were presented with the same choices as before, but could now see the other’s selection before responding (see Supplemental Information). We find that, on trials in which the other monkey first defected, the monkey maximized reward by subsequently selecting defection on 90.7% ± 2.2% of trials (i.e., within the same trial when no mutually beneficial outcome was possible). This held true even if the other monkey cooperated on the preceding trial (95.0% ± 3.0%). In other words, the monkeys did not reciprocate a prior offer of cooperation if they knew their opponent defected on the present trial. This did not reflect a simple reward maximization behavior (see Supplemental Information).

Single Neuronal Encoding of Another Individual’s Unknown Decisions

We recorded 363 neurons in the dACC in two of the four monkeys during task performance. Of these, 185 neurons significantly responded to the task (stepwise linear regression of neuronal firing rate with both monkeys’ current and past decisions as predictor variables, corrected for comparisons across pre- and post-selection periods) (Figures 3A–3D and S2; Table S1; Experimental Procedures; Supplemental Information). In total, 24.3% of neurons encoded the monkey’s own choices on the current trial; 15.7% responded differentially to choosing cooperation versus defection during the pre-selection period (immediately before the monkey’s selection) while 11.4% responded differentially during the post-selection period (immediately after the monkey’s selection; p < 0.05) (Figure 3A). There was a 2.33-fold ± 0.26-fold change in absolute activity between cooperation and defection when considered across all such neurons (p < 0.05). While the sign of the modulation of neural activity was similar in most neurons when the monkeys chose to defect, responses were more variable across neurons when the monkeys chose cooperation. Approximately half of these neurons (54.7%) had an increase of activity whereas the other half presented a decrease in activity (Figure 3C, left panel). In other words, many dACC neurons encoded the monkey’s decision to cooperate or defect.

Figure 3. Distinct dACC Neurons Encode Self and Other’s Decisions.

Peristimulus histograms as mean firing activity ± SEM and raster plots for individual neurons. Cooperation trials are denoted in red and defection in blue. Time zero denotes monkey’s own selection.

(A) Left: an example of a neuron that encoded the monkey’s own current decision to cooperate or defect. Right: the same neuron did not encode the opponent’s yet unknown decision. Gray bar indicates the time when both decisions were revealed to the monkeys (on half of trials; see text).

(B) Example of a neuron that encoded the opponent monkey’s yet unknown decision to cooperate or defect (right), but did not encode the monkey’s own current decision (left).

(C) Population responses based on the monkey’s own current decisions for neurons that had a significantly higher activity during self-cooperation versus self-defection (top left) and significantly lower activity during self-cooperation versus self-defection (bottom left); and population responses for neurons that had significantly higher activity during other-cooperation versus other-defection (top right) and significantly lower activity during other-cooperation versus other-defection (bottom right).

(D) Functional partitioning within the population between neurons encoding the monkey’s own current decisions and the opponent’s yet unknown decisions. Log-log-scale scatter plots of individual neurons p values obtained from the regression analysis during pre- (left) and post-selection (right) periods (only significant neurons are shown). Dashed lines denote significance thresholds. Gray points denote neurons that significantly encoded both the monkey’s own decisions and the opponent’s decisions.

(E) Neurons with significant modulation based on choice probability (CP) analysis. Top row: pre-decision time period, bottom row: post-decision time period. Columns from left to right correspond to different behavioral variables (SC, self-current; SP, self-past; OC, other-current; OP, other-past). Red bars indicate significant neurons as obtained by bootstrap estimate.

See also Figures S2, S3, S4, S5, and S6 and Tables S1, S2A, S2B, and S3.

The key for succeeding in this game was the ability to anticipate the other monkey’s concurrent decisions. Analyzing neural activity during the time when monkeys were still unaware of the other’s concurrent selection, we found that the activity of many neurons was modulated by the other monkey’s yet unknown upcoming choice. A total of 32.4% of neurons demonstrated significant differences in activity when the other monkey concurrently selected cooperation versus defection. Most of these (27.6%) encoded the opponent’s unknown choice during the post-selection period (but prior to being informed of the other’s response) and 7% during pre-selection period (p < 0.05) (Figure 3B). There was a 1.81-fold ± 0.07-fold change in absolute activity between other’s cooperation and defection when considered across all such neurons (p < 0.05) (Figure 3C, right panel; note that the total number of neurons encoding current decisions was larger when considering past responses; see Supplemental Information and further below).

Neurons encoding the opponent monkey’s choices and neurons encoding the monkey’s own choices demonstrated little overlap with each other (Figure 3D). Only 4.3% of neurons responded to both the monkey’s own decisions as well as the opponent’s planned decisions. This was significantly lower than chance level, i.e., that expected by a product of the individual probabilities of encoding self and other (expected: 7.9%, chi-square = 4.97, df = 1, p < 0.026). This suggests that self and other related computations were carried out by largely distinct neuronal populations (Figures S3 and S4; Supplemental Information).
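
The chance-level overlap quoted above follows directly from the product of the two reported fractions, as this short check shows. It reproduces only the expected value; the chi-square test itself is not recomputed here.

```python
# Fractions of task-responsive neurons reported in the text.
p_self = 0.243    # encode the monkey's own current choice
p_other = 0.324   # encode the opponent's yet unknown choice
observed_overlap = 0.043

# Chance overlap if the two properties were independent across neurons.
expected_overlap = p_self * p_other

print(f"expected overlap: {expected_overlap:.1%}")
print(f"observed overlap: {observed_overlap:.1%}")
```

The product, 0.243 × 0.324 ≈ 0.079, matches the expected 7.9% given in the text, against an observed overlap of 4.3%.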

To further delineate and confirm the response characteristics of these neurons, we applied three additional approaches to re-analyze the data. First, we performed a choice probability (CP) index analysis examining the trial-by-trial encoding of single neuronal responses. CP index analysis results closely matched the stepwise regression results (35.7% of task responsive neurons had a significant CP index for encoding the other’s choice post-selection, and 21.6% had a significant CP index for encoding self-decision pre-selection) (Figures 3E and S5A; Supplemental Information). Second, we performed an Akaike Information Criterion (AIC) analysis, which penalizes models containing multiple terms, to complement the term selection process in the stepwise linear regression (Figure S5B–S5E; Tables S2A and S2B). Finally, we performed an unsupervised population analysis in the form of a mixture of linear regression models to test in a more unbiased fashion the behavioral factors to which neurons responded at the population level (Figure S6A–S6F). These analyses confirmed the existence of self and other encoding neurons and the prominence of other-predictive neurons in the dACC and further demonstrate that our findings based on the neuronal data were reproducible across statistical methods (see Supplemental Information).
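
To make the choice probability (CP) index concrete, the sketch below computes it as the area under the ROC curve separating firing rates on two trial types, with 0.5 indicating no discrimination. The label-shuffling significance test and the spike counts are assumptions for illustration; the bootstrap procedure used in the study is described in the Supplemental Information.

```python
import random


def choice_probability(rates_a, rates_b):
    """ROC area: P(rate_a > rate_b) + 0.5 * P(rate_a == rate_b)."""
    wins = 0.0
    for a in rates_a:
        for b in rates_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(rates_a) * len(rates_b))


def permutation_p_value(rates_a, rates_b, n_permutations=2000, seed=0):
    """Two-sided p value for CP != 0.5 obtained by shuffling trial labels."""
    rng = random.Random(seed)
    observed = abs(choice_probability(rates_a, rates_b) - 0.5)
    pooled = list(rates_a) + list(rates_b)
    n_a = len(rates_a)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        null = abs(choice_probability(pooled[:n_a], pooled[n_a:]) - 0.5)
        if null >= observed:
            exceed += 1
    return (exceed + 1) / (n_permutations + 1)


# Hypothetical spike counts on other-cooperation and other-defection trials.
coop_rates = [12, 15, 11, 14, 16, 13, 15, 17]
defect_rates = [9, 10, 12, 8, 11, 10, 9, 13]

cp = choice_probability(coop_rates, defect_rates)
p = permutation_p_value(coop_rates, defect_rates)
print(f"CP = {cp:.2f}, permutation p = {p:.4f}")
```

A neuron would be counted as significant for a given variable when its CP departs reliably from 0.5 under the chosen resampling test.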

Neurons Predicting the Other’s Unknown Decision Are Sensitive to Social Context

To test the direct effect of social context on neural encoding, we recorded a total of 164 additional neurons from the dACC during the social control experiment in which the monkeys played together but in separate rooms. Of these, 84 neurons were found task-responsive using the same stepwise regression analysis as above (p < 0.025 for any main or interaction effect, either during the pre- or post-selection period; see Table S3). We found that only 14.3% of task-responsive cells predicted the other’s choice, significantly less than the 27.6% observed in the main task (chi-square = 7.42, df = 1, p < 0.006; post-decision). In contrast, a larger fraction of task-responsive neurons encoded the monkey’s own decision in the separate room control, significantly so during the post-selection period (21.4% during the pre-selection period and 26.2% during the post-selection period, compared to 15.7% and 11.4% respectively in the main task; pre-selection: chi-square = 2.083, df = 1, p = 0.149; post-selection: chi-square = 18.193, df = 1, p < 0.00002). One possible explanation for the higher number of neurons encoding the monkey’s own decisions is that there were more trials recorded per session during the separate room control. However, if this was the only factor, we would also expect to have a concurrent increase in the number of other-predictive neurons, which was not the case. Moreover, the increase in neurons encoding self-decisions indicates that the drop in other-predictive neurons was not simply due to a difference in the raw number of overall cooperation/defection trials. Therefore, this considerable reduction in the fraction of other-predictive neurons indicates that other-predictive neurons are significantly and selectively sensitive to social context.

Neurons Encoding the Other’s Unknown Decisions Do Not Encode Expected Reward

While certain cingulate cells are known to encode received and expected reward (Seo and Lee, 2007; Sheth et al., 2012; Williams and Eskandar, 2006), cells encoding self or other decisions were largely distinct from those that encoded expected reward. An important feature of the iPD game is that it enables one to dissociate neuronal signals encoding self and other decisions from those related to expected reward. Specifically, the monkey’s own choice alone cannot guarantee a high or low reward. Therefore, predicting one’s own reward inherently requires an accurate prediction of the opponent’s yet unknown selection. Nonetheless, to demonstrate more directly that the activity of cells predicting other-decisions is not explained by encoding of expected reward, we provide four lines of evidence based on examining the neuronal responses across multiple behavioral outcome contingencies.

First, we directly examined the encoding of expected reward during the decision period. We found that none of the other-predictive neurons was significantly modulated by self-reward across all four reward contingencies determined by the payoff matrix (see Supplemental Information for statistical tests). Second, we examined the differences in firing rate modulation between encoding of other decision and encoding of self-reward across the recorded population. We found that the firing rate modulation of other-predictive neurons was strong and significantly different from the general population when considering differences in the other’s choice to cooperate or defect (Figure 4A), but not when aligning trials according to differences in the monkey’s own expected reward, i.e., comparing trials in which the monkey cooperated or defected when the other chose to defect (Figure 4B) and when the other chose to cooperate (Figure 4C). Note that while we did find neurons in the dACC that showed strong modulation to self and other reward (as previously reported by Azzi et al., 2012; Chang et al., 2013; Hosokawa and Watanabe, 2012), these were distinct from the other-predictive neurons (Figure S7A; Supplemental Information). Third, we examined the reward feedback period itself, as it may have been possible that other-predictive neurons only encode reward weakly during the decision period when outcome is uncertain, but are more strongly modulated by reward when it is certain or known. However, we found that this was not the case (Figure 4D). In fact, compared to other cingulate cells, which overall demonstrated an enhanced modulation to expected reward during feedback, other-predictive neurons demonstrated a slight, non-significant reduction in modulation (Figure 4E). Finally, to test whether other-predictive neurons could be simply sensitive to raw difference in amount of reward irrespective of choice, we repeated the comparison between feedback time modulation and decision time modulation, but for the contingency that yielded the maximal difference in reward, and found no difference in modulation of the other-predictive neurons (Figure 4F).

Figure 4. Other-Predictive Neurons Do Not Encode the Monkey’s Own Expected Reward as Shown across Multiple Reward Contingencies.

(A) Histogram of normalized difference in firing rate between trials in which the other monkey defected versus cooperated. Red bars indicate other-predictive neurons. Blue bars indicate the full population. The distributions were statistically different.

(B) Histogram of normalized difference in firing rate between trials in which the monkey chose defection versus cooperation, conditioned on the other choosing defection. Red bars indicate other-predictive neurons. Blue bars indicate the full population. No significant difference was found between distributions.

(C) Histogram of normalized difference in firing rate between trials in which the monkey chose defection versus cooperation, conditioned on the other choosing cooperation. Red bars indicate other-predictive neurons. Blue bars indicate full population. No significant difference was found between distributions.

(D) Scatter plot of firing rate difference between trials in which the other defected versus cooperated, for firing rate during decision time (x axis) and feedback time (y axis) in other-predictive neurons. There is no increase in differential activity when reward is known. Crosses represent mean ± SEM.

(E) Scatter plot of firing rate difference between trials in which other defected versus cooperated, for firing rate during decision time (x axis) and feedback time (y axis) in the full population. Here, there was a significant increase in differential activity when reward is known.

(F) Scatter plot of firing rate difference between trials in which the monkey chose defection versus cooperation, conditioned on other’s defection, for firing rate during decision time (x axis) and feedback time (y axis) in other-predictive neurons. Here, there is no increase in differential activity when reward is known.

See also Figure S7.

In summary, we demonstrate that the response properties of other-predictive neurons were not explained by simple encoding of the monkey’s own expected reward (see Supplemental Information). These results are further bolstered by the finding above that other-predictive neurons encoded no significant information about self-decisions and that they were highly sensitive to social context compared to other population cells.

dACC Populations Accurately Predict the Other’s Decisions on a Trial-by-Trial Basis

Activity in the dACC was significantly predictive of self and other’s choices on a trial-by-trial basis when considered across the entire population (Figures 5A and 5B). We constructed a linear decoder to predict the monkeys’ selections based on population activity (see Supplemental Information). Evaluating model performance on validation trials not used for model training, we find that cingulate populations predicted up to 66.1% ± 0.9% of the recorded monkey’s own current choices (multivariate analysis of variance [MANOVA], p < 10^−4), with predictions being most pronounced in the pre-selection period (Figure 5C). Surprisingly, population activity correctly predicted the other monkey’s yet unknown choices on up to 79.4% ± 1.1% of trials (MANOVA, p < 10^−5), with predictions being most pronounced in the post-selection period (Figure 5D). Prediction of other’s unknown choices was significantly more accurate than prediction of monkey’s own current choices (paired t test, p < 10^−5).
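
The decoding logic described above (fit a linear readout on some trials, then score it on held-out trials) can be sketched as follows. This is a simplified stand-in rather than the decoder used in the study: it assumes a linear discriminant with per-neuron variance scaling (diagonal covariance), uses simulated Gaussian activity, and omits the MANOVA. All function names and parameters are hypothetical.

```python
import random


def simulate_trials(n_trials, n_neurons, n_informative, rng):
    """Hypothetical population activity: Gaussian rates, a few tuned neurons."""
    trials, labels = [], []
    for _ in range(n_trials):
        label = rng.choice([0, 1])  # 0 = defection, 1 = cooperation
        rates = [
            rng.gauss(1.0 if (label == 1 and i < n_informative) else 0.0, 1.0)
            for i in range(n_neurons)
        ]
        trials.append(rates)
        labels.append(label)
    return trials, labels


def mean_vector(rows):
    """Per-neuron mean across a list of trials."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_linear_decoder(trials, labels):
    """Linear discriminant with per-neuron variance scaling (diagonal covariance)."""
    coop = [x for x, y in zip(trials, labels) if y == 1]
    defect = [x for x, y in zip(trials, labels) if y == 0]
    mu_c, mu_d = mean_vector(coop), mean_vector(defect)
    var = []
    for i in range(len(mu_c)):
        ss = sum((x[i] - mu_c[i]) ** 2 for x in coop)
        ss += sum((x[i] - mu_d[i]) ** 2 for x in defect)
        var.append(ss / (len(trials) - 2))  # pooled within-class variance
    weights = [(c - d) / v for c, d, v in zip(mu_c, mu_d, var)]
    midpoint = [(c + d) / 2 for c, d in zip(mu_c, mu_d)]
    bias = -sum(w * m for w, m in zip(weights, midpoint))
    return weights, bias


def predict(weights, bias, rates):
    """Positive projection predicts cooperation, negative predicts defection."""
    projection = sum(w * r for w, r in zip(weights, rates)) + bias
    return 1 if projection > 0 else 0


rng = random.Random(0)
trials, labels = simulate_trials(300, n_neurons=20, n_informative=5, rng=rng)
train_x, train_y = trials[:200], labels[:200]
test_x, test_y = trials[200:], labels[200:]  # validation trials not used for fitting

weights, bias = fit_linear_decoder(train_x, train_y)
correct = sum(predict(weights, bias, x) == y for x, y in zip(test_x, test_y))
accuracy = correct / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

Scoring only on trials excluded from fitting is what keeps the reported accuracy from being inflated by overfitting.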

Figure 5. Trial-by-Trial Population Prediction of the Other’s Yet Unknown Decision.

(A and B) Principal component (PC) analysis over a sample session. Plotted in first three PC space, each circle represents the activity of all neurons recorded simultaneously on a single cooperation (red) or defection (blue) trial (see Supplemental Information).

(A) Self-current pre-decision activity.

(B) Other’s-concurrent (yet unknown) post-decision activity.

(C and D) Linear decoding model. Each bar represents projection of the activity of all simultaneously recorded neurons during a single trial on first discriminant component (color code above). Positive values predict cooperation and negative defection. Insets (top right) plot distribution of projection values for cooperation (red) and defection (blue).

(C) Self-current pre-selection projection.

(D) Other’s-concurrent projection during post-selection.

(E) Peristimulus histograms as mean firing activity ± SEM (top) and raster plots of a neuron encoding the monkey’s own current decision during the pre-selection period and modulated by the other’s past decision. Trials separated according to monkey’s own current decision to defect (left) or cooperate (right) and opponent’s decision on a preceding trial to cooperate (red) or defect (blue; see text). Time zero denotes monkey’s own selection. Gray bar indicates feedback period.

To more directly examine the role that the cells selected as other-predictive neurons by the regression analysis play in population decoding of the other’s yet unknown decision, we next ran the decoder using only this subset of the neuronal population. We find that the accuracy of predicting the other monkey’s decision was not affected and remained up to 78.1% ± 0.8% (MANOVA, p < 10⁻⁹) correct, despite the fact that the decoder had access to far fewer cells. However, the accuracy of decoding the monkey’s own decisions dropped drastically, to at most 54.7% ± 0.9% (MANOVA, p = 0.37, n.s.). These specific effects of restricting the analysis to this subset of neurons further support the above ascribed role of other-predictive neurons, as well as the functional distinction between these cells and those that encode the monkey’s own selections.

Finally, we considered whether implicit cues between the two monkeys could explain these predictions. Note that an important aspect of the task design was that the monkeys made their selections in random temporal order before their responses were revealed. Accordingly, we tested the population predictions when considering only trials in which the monkey played first, i.e., when the other monkey had not yet made his selection. We found that predictions of the other’s unknown choices maintained high accuracy (up to 70.7% ± 0.8%), and similar accuracies were found when considering only trials in which the monkey played second (68.5% ± 7.2%), ruling out the possibility that prediction is an artifact of an implicit signal disclosing the other monkey’s choice. Note that somewhat lower accuracy was expected because each of these analyses used only half the number of trials.

Behavioral Trial-by-Trial Decoders

To search for a possible basis for neural prediction of the other’s concurrent selections, we examined predictions based on both monkeys’ prior behavioral history. Using a locally-optimal classification model considering the monkeys’ selections four trials back, we estimated on validation trial data the accuracy of predicting the opponent monkey’s unknown concurrent choices. We find that model prediction accuracy was up to 79.8%, similar to neuronal decoding (similar accuracies were found for predicting self-selections, see Supplemental Information). To further explore the behavioral basis of the neuronal predictions of other’s decisions, we tested trial-by-trial correlation between the behavioral and population-activity predictors, revealing significant correlations based on both monkeys’ past selections (r = 0.31, p < 0.0003). These correlations of other’s predictions were not evident when behavioral predictions were based on only a single monkey’s past decisions or reward (see Supplemental Information). This suggests that population predictions were based on the prior choices of the two monkeys rather than any individual’s past response or reward.
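A history-based behavioral decoder of this kind can be sketched as a lookup table. The sketch below is one simple reading of the description above, not the model used in the study (which is detailed in the Supplemental Information): for each joint pattern of both players' choices over the preceding four trials, it predicts the opponent's most frequent next choice seen in training trials. The function names, the fallback rule for unseen histories, and the simulated noisy tit-for-tat opponent are hypothetical.

```python
from collections import Counter, defaultdict
import numpy as np

def history_key(self_c, other_c, t, k):
    """Joint choices of both players on the k trials preceding trial t."""
    return tuple(self_c[t - k:t]) + tuple(other_c[t - k:t])

def fit_history_table(self_c, other_c, trials, k=4):
    """Map each observed joint history to the opponent's most common next choice."""
    counts = defaultdict(Counter)
    for t in trials:
        counts[history_key(self_c, other_c, t, k)][other_c[t]] += 1
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    # Fallback for histories never seen in training: the overall majority choice.
    default = Counter(other_c[t] for t in trials).most_common(1)[0][0]
    return table, default

def predict(table, default, self_c, other_c, t, k=4):
    """Predict the opponent's choice on trial t from the preceding k trials."""
    return table.get(history_key(self_c, other_c, t, k), default)

# Simulated session (hypothetical): 1 = cooperate, 0 = defect.
# The opponent copies the player's previous choice on about 90% of trials.
rng = np.random.default_rng(0)
n, k = 2000, 4
self_c = rng.integers(0, 2, n).tolist()
other_c = [0]
for t in range(1, n):
    copies = rng.random() < 0.9
    other_c.append(self_c[t - 1] if copies else 1 - self_c[t - 1])

split = int(0.7 * n)  # train on early trials, validate on later trials
table, default = fit_history_table(self_c, other_c, range(k, split), k)
hits = [predict(table, default, self_c, other_c, t, k) == other_c[t]
        for t in range(split, n)]
acc = sum(hits) / len(hits)
print(f"validation accuracy: {acc:.2f}")
```

Restricting `history_key` to a single player's past choices gives the single-monkey variant mentioned above, which allows the two kinds of behavioral predictor to be compared on the same validation trials.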

Neurons Keeping Track of Past Interactions

Consistent with the above findings, we find that many neurons within the population kept a dynamic record of the monkeys’ prior selections. Figure 5E illustrates such a neuron; when the monkey chose to currently defect (left panel), responses did not differ when, on the preceding trial, the opponent chose to defect versus cooperate. In contrast, when the monkey himself cooperated (right panel), neuronal activity was significantly inhibited on trials in which the opponent previously defected (i.e., the monkey cooperated despite the opponent previously defecting) compared to those in which the opponent cooperated (i.e., reciprocating opponent’s preceding cooperation). In addition, we found neurons that differentially encoded the joint outcomes on preceding trials (see Figure S7B and Supplemental Information for further details).

Cingulate Disruption Selectively Inhibits Mutually Beneficial Interactions

Given the above physiological findings, we next investigated whether disruption of the dACC may influence the monkeys’ mutual choices. A series of electrical pulses was delivered to the dACC on half of 3,026 randomly selected trials in blocks (1,000 ms triggered at image presentation; 100 µA, 200-µs biphasic pulse durations with cathodal phase leading; see Supplemental Information).

Stimulation had a significant and selective effect on the monkeys’ decisions. Here, we defined the “decision-ratio” as the number of trials in which the monkey selected cooperation divided by the number of trials in which it selected defection (i.e., a ratio of 1 indicates equal selection of cooperation versus defection). When no stimulation was given, the decision-ratio was 0.53 (corresponding to 34.7% cooperation, as also found in the main task). When stimulation was administered, the decision-ratio dropped to 0.43, i.e., monkeys were less likely to cooperate when stimulated (t(6) = 3.18, p < 0.01) (Figure 6A). This effect was highly dependent on the opponent monkey’s preceding selection. When the opponent previously cooperated and no stimulation was given, the decision-ratio was 0.74, meaning that monkeys were more likely to choose cooperation if the opponent previously chose cooperation. However, during stimulation, following the opponent’s cooperation, the decision-ratio significantly dropped to 0.43 (t(6) = −5.57, p < 0.0007) (Figure 6B). In contrast, following the opponent’s defection, the decision-ratio was 0.48 when no stimulation was given and 0.43 when stimulation was given (t(6) = −1.12, p = 0.15). In other words, stimulation had no effect on the monkey’s current decision if the opponent previously defected, but when the opponent previously cooperated, stimulation reduced the decision-ratio to a level equal to that observed when the opponent previously chose defection. Moreover, stimulation had no effect on risk behavior, since the rate of cooperation when the other monkey defected on the preceding trial was not affected by stimulation (even though the risk of cooperation under such a condition is higher; i.e., the probability that the opponent defects following defection is twice as high as following cooperation).
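As a quick check of the arithmetic, and assuming the decision-ratio is the count of cooperation trials divided by the count of defection trials, a ratio r corresponds to a cooperation fraction of r / (1 + r). The short sketch below applies this conversion to the ratios reported above; the helper names are hypothetical. Under this assumption, a ratio of 0.53 gives a cooperation fraction of about 34.6%, which matches the reported 34.7% up to rounding of the ratio.

```python
def ratio_to_coop_fraction(ratio):
    """Convert a cooperation:defection ratio into the fraction of cooperation trials."""
    return ratio / (1.0 + ratio)

def coop_fraction_to_ratio(fraction):
    """Inverse conversion: fraction of cooperation trials to cooperation:defection ratio."""
    return fraction / (1.0 - fraction)

# Decision-ratios reported in the text for each condition.
reported = {
    "no stimulation (overall)": 0.53,
    "stimulation (overall)": 0.43,
    "no stimulation, opponent previously cooperated": 0.74,
    "stimulation, opponent previously cooperated": 0.43,
    "no stimulation, opponent previously defected": 0.48,
    "stimulation, opponent previously defected": 0.43,
}

for condition, ratio in reported.items():
    print(f"{condition}: ratio {ratio:.2f} -> "
          f"{100 * ratio_to_coop_fraction(ratio):.1f}% cooperation")
```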

Figure 6. dACC Stimulation Selectively Inhibits Mutually Beneficial Interactions.

Figure 6

White bars represent stimulation trials.

(A) Proportion in which the monkeys chose cooperation over defection ± SEM (decision-ratio of 1 indicates equal proportion of selecting either).

(B) Decision-ratio given the opponent’s past decisions to cooperate (left), or defect (right).

Finally, to further confirm that stimulation did not simply affect decisions based on past reward, we employed a zero-sum game task in which each monkey’s reward was contingent on the other’s response, but individual profit was always at the expense of the other and no mutual positive outcome was possible (i.e., playing under Pareto optimality conditions) (Nash, 1950). We found no effect of stimulation on the monkeys’ choices during the zero-sum game, based either on the monkeys’ preceding selection or preceding receipt of reward (Figure 7; Supplemental Information). Taken together, we conclude that stimulation in the dACC specifically abolished the incorporation of recent positive interactions, rather than any past interaction, into the monkey’s own current decision, resulting in fewer mutually beneficial interactions.

Figure 7. Stimulation Has No Effect when No Mutually Beneficial Interactions Were Possible.

Figure 7

(A) Zero-sum game payoff matrix.

(B–D) Bars represent the decision-ratio on stimulated (white) and non-stimulated (colored) trials during the zero-sum game (see Supplemental Information). Error bars represent SEM. (B) Overall decision-ratio. (C) Decision-ratio was not affected by opponent’s selection of choice A (left) or choice B (right) on the preceding trial. (D) Decision-ratio was not affected by the monkey’s own past reward. Left bars: the monkey previously received a high reward. Right bars: the monkey previously received a low reward.

DISCUSSION

Identifying neurons that reflect another individual’s covert intentions or “state of mind” has been a long-sought goal in neuroscience and a central proposed tenet of social decision making (Frith and Frith, 1999; Rilling et al., 2004; Sanfey et al., 2006; Vogeley et al., 2001). Here, we discover neurons that selectively encode another individual’s yet unknown decisions during joint interactions. We confirmed that no explicit cues were relayed between the two monkeys during the task by using alternating trials in half of which the monkey from which we obtained recordings played first. We also demonstrated reliable population predictions of the other’s decisions even on trials in which the other monkey had not yet made his selection. Remarkably, other-predictive cells during joint interactions constituted over a third of the cingulate task-responsive population and were more prevalent than cells encoding the monkey’s own present selections. Notably, other-predictive neurons were highly sensitive to social context and were not modulated by self-decisions or expected reward. Consistently, population predictions of the opponent’s selections were more accurate than those reflecting the monkey’s own selections and, in fact, predicted the other monkey’s decisions with accuracies that were near optimal compared to behavioral decoders that considered both monkeys’ past behaviors. Taken together, these findings provide understanding of the population partitioning by which individual neurons in the primate cingulate cortex encode information about other social agents.

Game theory provides a framework for dissecting specific aspects of joint decision making, namely the contributions of self and other choices to shared outcome. Signals related to another’s yet unobservable actions, in particular, are a distinct feature of mutual interactions in that one participant’s concurrent decision affects the other’s outcome and therefore inherently requires each participant to anticipate the other’s intentions or state of mind.

These predictive signals are fundamentally distinct from previously reported neurons which reflect another animal’s known and observable actions. These include canonical mirror neurons that reflect one’s observed behavior and do not distinguish between self and other (di Pellegrino et al., 1992; Rizzolatti and Sinigaglia, 2010), neurons that encode another’s observed receipt of reward (Azzi et al., 2012; Chang et al., 2013), and neurons that monitor other’s observable errors (Yoshida et al., 2012). Importantly, the prediction neurons reported here are distinct from the findings of the latter study, in which neurons monitored the other’s errors while the monkeys explicitly observed each other’s selections on the same shared task (with each monkey alternating between actor and observer every other trial) (Yoshida et al., 2012). Moreover, encoding of the other’s error occurred within the monkeys’ movement time window (<200 ms before other’s response) and in a setup which allowed them to directly observe each other’s movement-preparatory cues. Here, decisions were made jointly, the other’s decisions were inherently unobservable and unknown, and their neural encoding could be found many seconds before the other monkey made a selection.

A central feature of non-competitive games such as iPD is that no particular decision guarantees a high or low reward and different outcomes can be experienced either mutually or individually. This dissociation enabled us to examine the computations that contributed to self and other predictions and differentiate them from those that contribute to the encoding of reward outcome. More importantly, it allowed us to examine what particular computations were associated with interactions that were mutually beneficial compared with those that were not. For instance, the monkeys were almost twice as likely to cooperate if they both cooperated on the preceding trial, indicating an intention to reciprocate mutual cooperation. Here, we find that neurons that encoded a monkey’s decisions largely did not encode his past or future receipt of reward even though, in combination, these neural signals could be used to predict the monkey’s shared outcome. Many neurons, however, were also highly modulated by the two monkeys’ prior selections. For example, certain neurons differentially encoded the monkey’s present decision to cooperate, based on the other monkey’s preceding selection of cooperation or defection. Similarly, at the population level, neuronal predictions strongly correlated with predictions made by the behavioral decoder when considering both monkeys’ past selections, indicating that neural predictions were based on the past interaction of both individuals.

Consistent with these physiological findings, we observed that disruption of the dACC by stimulation reduced the monkey’s likelihood of cooperation, an effect which was most evident when the opponent cooperated on the preceding trial. Stimulation therefore affected reciprocation of the other’s cooperation, but did not affect the animal’s ability to incorporate any past decision or outcome since no effect was observed when the opponent defected on the previous trial, or when testing the monkey’s decisions in a zero-sum game. This is consistent with previous studies employing a computer opponent in zero-sum games that showed that the dACC does not differentially encode the monkey’s decisions during such interactions (Donahue et al., 2013; Seo and Lee, 2007). Therefore, during joint interactions, the dACC specifically mediated mutually beneficial decisions based on the recent history of the interaction.

The monkeys were clearly affected by the social context of their interaction, as they significantly changed their behavior when playing either against a computer opponent or in separate rooms, consistent with prior reports (Carter et al., 2012; Chang et al., 2013; Hosokawa and Watanabe, 2012). Moreover, other-predictive neurons were selectively influenced by social context compared to other population cells, suggesting that these cells encoded information that was specific to other social agents rather than any information about the environment which affected outcome. The monkeys also selected the appropriate responses when their opponent’s decisions were known, suggesting that they understood the consequent payoff. While the joint nature of the task precludes the possibility of identifying “involuntary errors” by the individual animals, we find that the monkeys made incorrect selections on <10% of sequential control trials, making such rare occurrences highly unlikely to qualitatively affect the study’s results. This conclusion is also supported by the finding that the population prediction of the opponent’s decisions was robust to substantial deletion of trials. However, as with any animal or human study that investigates interactive behavior, what internal thought process truly motivates these different behaviors can only be speculated upon. On this point, we note that cooperation is based on the observable action of two interacting individuals, rather than its hidden motivation, and is defined explicitly as the selection of actions capable of leading to joint benefit but which can also lead to loss if the action is not mutual.

Taken together, the present findings support the proposed role of the dACC in encoding a dynamic model of the environment (Adolphs, 2009; Karlsson et al., 2012; Sheth et al., 2012) but considerably expand it to include mutual interactions which require an explicit representation of another’s yet unknown behavior. The two distinct groups of neurons found in the dACC, encoding the self versus predicting the other’s decisions, may therefore be uniquely suitable to allow the soon-available actual decision of the opponent and known decision of the acting monkey to update the internal model of their joint decisions in a way analogous to delta-learning (Pouget and Snyder, 2000) or an actor-critic (Parush et al., 2011; Williams and Eskandar, 2006; Witten, 1977) framework. Given the broad anatomical connectivity of the dACC to areas that encode aspects of socially-guided interactions, including the temporal-parietal junction, superior temporal sulcus, amygdala and orbitofrontal cortex, the dACC is likely to be part of a wider network of areas, sometimes referred to as the “social brain.” The observed role of the dACC in predicting another’s intentions contributes to our understanding of this proposed network. For instance, disruption of its activity markedly degraded cooperative behavior, suggesting that dACC activity may be necessary for constructive interaction between individuals and social learning. Such deficits are particularly prominent in individuals with autism-spectrum disorders or antisocial behavior in which anticipating another’s intentions or state of mind and incorporating them into one’s actions are severely affected (Frith and Frith, 1999; Lombardo and Baron-Cohen, 2011). Our neuronal findings in combination with the behavioral effects observed with stimulation may therefore pave the way toward targeted treatment in the dACC for these or similar disorders in which dysfunctional social behavior is a predominant feature.

EXPERIMENTAL PROCEDURES

Task Design

Four adult male rhesus monkeys (Macaca mulatta) across four paired combinations were trained to play an iterated prisoner’s dilemma (iPD) game. On successive trials, two images (an orange hexagon and a blue triangle) were randomly displayed on the left and right of the screen (Figure 1A). Each monkey selected one of the two images using a joystick and was not shown the other monkey’s concurrent selection. The outcome of each monkey’s selection depended on both of their concurrent choices, according to the payoff matrix shown in Figure 1B. Based on these payoffs, the orange hexagon was operationally defined as “cooperation” since mutual cooperation led to the highest mutual reward (Camerer, 2003). The blue triangle was operationally defined as “defection” since unilateral defection led to the highest individual reward. However, if both monkeys defected, they each received less reward than if they both cooperated. Note, importantly, that the terms cooperation and defection are used here solely to indicate the potential for mutual benefit or loss dependent on the opponent’s selection. Mutual cooperation and mutual defection indicate that both monkeys made the same choice. See Supplemental Information for trial structure details.
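The payoff structure described above can be made concrete with a small sketch. The reward amounts below are hypothetical placeholders, not the values in Figure 1B; they are chosen only to satisfy the standard textbook prisoner's dilemma ordering (temptation > mutual cooperation > mutual defection > unilateral cooperation), which is assumed here to apply to the task. The constant and function names are likewise illustrative.

```python
COOPERATE, DEFECT = "C", "D"

# Hypothetical reward units, NOT the values shown in Figure 1B.
# Keys are (own choice, opponent's choice); values are (own reward, opponent's reward).
PAYOFF = {
    (COOPERATE, COOPERATE): (4, 4),  # mutual cooperation: highest mutual reward
    (COOPERATE, DEFECT): (1, 6),     # unilateral cooperation: lowest own reward
    (DEFECT, COOPERATE): (6, 1),     # unilateral defection: highest own reward
    (DEFECT, DEFECT): (2, 2),        # mutual defection: less than mutual cooperation
}

def is_prisoners_dilemma(payoff):
    """Check the standard ordering T > R > P > S and the condition 2R > T + S."""
    R = payoff[(COOPERATE, COOPERATE)][0]  # reward for mutual cooperation
    S = payoff[(COOPERATE, DEFECT)][0]     # payoff for cooperating alone
    T = payoff[(DEFECT, COOPERATE)][0]     # temptation to defect alone
    P = payoff[(DEFECT, DEFECT)][0]        # payoff for mutual defection
    return T > R > P > S and 2 * R > T + S

def best_response(payoff, opponent_choice):
    """Own choice that maximizes own reward on a single trial, given the opponent's choice."""
    return max((COOPERATE, DEFECT), key=lambda own: payoff[(own, opponent_choice)][0])

print(is_prisoners_dilemma(PAYOFF))
for opp in (COOPERATE, DEFECT):
    print(f"opponent plays {opp}: single-trial best response is {best_response(PAYOFF, opp)}")
```

Under this ordering, defection maximizes the single-trial reward whatever the opponent chooses, even though mutual cooperation pays both players more than mutual defection. This tension is what makes the opponent's concurrent, unobserved choice informative for the player's own decision.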

Neuronal Recording and Stimulation

Single-Unit Isolation and Recordings

All procedures were performed under approval by the Massachusetts General Hospital institutional review board and were conducted in accordance with Institutional Animal Care and Use Committee (IACUC) guidelines. Prior to recordings, floating micro-electrode arrays (MicroProbes for Life Sciences) were surgically implanted in each monkey. The electrodes were implanted in the dACC through a wide craniotomy under stereotactic guidance (David Kopf Instruments). The location of the arrays was confirmed by direct visual inspection of the sulcal and gyral anatomy with the electrode tips located 8 mm from the cortical surface. Each array had 36 microelectrodes spaced horizontally 400 µm apart. Electrode leads were secured to the skull and attached to connectors with the aid of titanium miniscrews and dental acrylic.

Recordings began 2 weeks following surgical recovery. A Plexon multichannel acquisition processor was used to amplify and band-pass filter the neuronal signals (150 Hz–8 kHz; 1-pole low-cut and 3-pole high-cut with 1,000× gain; Plexon). Shielded cabling carried the signals from the electrode array to a set of six 16-channel amplifiers. Neural signals were then digitized at 40 kHz and processed to extract action potentials by the Plexon workstation. Classification of the waveforms was performed using template matching and principal component analysis based on waveform parameters. Only single, well-isolated units with identifiable waveform shapes and adequate refractory periods were used. When an individual electrode recorded more than one neuron, a high degree of isolation was required in order to include each as a single unit (p < 0.01, multivariate ANOVA across the first two principal components). We did not include multi-unit activity.

Electrical Stimulation Protocol

During stimulation trials, the monkeys performed the iPD and zero-sum games in separate sessions. Each session was composed of randomly selected 30–40 stimulated trials followed by another 30–40 trials in which no stimulation was delivered. Stimulation was administered as a brief series of alternating rectangular positive to negative voltage pulses. Stimulation parameters were 100 µA and 200 Hz biphasic pulses, with cathodal phase leading. Average impedance at the time of the stimulation experiments was 100–500 kΩ. Here, all 32 electrode contacts were simultaneously stimulated per array. Stimulation was given for 1,000 ms and included the baseline and image presentation periods. Stimulation ended prior to presentation of the go cue and prior to the monkey’s selection.

Statistical Analysis

A stepwise linear regression was conducted in order to determine how the different task parameters modulated the neuronal activity. In this analysis, parameters are incrementally added to the model, starting with the parameter that explains the most variance and continuing on to the parameters that most explain the remaining variance, terminating when parameters no longer significantly explain the residual variance. The model included the four main effect parameters, as described below (self-current, other-current, self-past and other-past) as well as their pairwise interactions (see Equation 1),

$$r(t) = a + \sum_{i=1}^{4} \beta_i^{\mathrm{Main}} M_i^{\mathrm{Main}} + \sum_{i=1}^{6} \beta_i^{\mathrm{Inter}} M_i^{\mathrm{Inter}} \qquad \text{(Equation 1)}$$

where $r(t)$ is the current-trial firing rate, $M^{\mathrm{Main}} = \{s(t),\, s(t-1),\, o(t),\, o(t-1)\}$ are the four main effects, and $M^{\mathrm{Inter}} = \{s(t)s(t-1),\, s(t)o(t),\, s(t)o(t-1),\, s(t-1)o(t),\, s(t-1)o(t-1),\, o(t)o(t-1)\}$ are the six second-order interaction terms; $s(t)$ is the current self selection, $o(t)$ is the current other selection, and $(t-1)$ indicates the preceding trial.

For brevity, “self” refers here to the selections of the monkey in which neural recordings were performed and “other” refers to the selections of the opponent (i.e., selecting to cooperate or defect). In addition, “current” refers to the two monkeys’ current selection (i.e., the trial from which neuronal activity was being evaluated) and “past” refers to the two monkeys’ selections on the previous trial. The dependent variable is the averaged neuronal firing in the 500 ms period before response selection (i.e., choosing cooperation versus defection) and during the 500 ms period after selection, referred to as “pre-selection” and “post-selection,” respectively. Note that we chose to use a stepwise linear regression for this analysis since the task parameters and samples were neither balanced nor independent (see further details in Supplemental Information). Multiple complementary analyses, including a four-way analysis of variance, AIC analysis, and mixture of regressions analysis, yielded qualitatively similar results.
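The regression described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: choices are coded as +1 (cooperate) and −1 (defect), terms enter by forward selection on a partial F-statistic, and a fixed entry threshold of 4.0 stands in for the significance test. The function names, the coding scheme, the threshold, and the simulated firing rates are all hypothetical.

```python
import itertools
import numpy as np

MAIN = ["s(t)", "s(t-1)", "o(t)", "o(t-1)"]

def design_matrix(s, o):
    """Build the 4 main-effect and 6 pairwise-interaction regressors.

    s, o: per-trial choices of self and other, coded +1 / -1 (an assumed coding).
    The first trial is dropped because it has no preceding trial.
    """
    main = np.column_stack([s[1:], s[:-1], o[1:], o[:-1]])
    cols, names = [main[:, i] for i in range(4)], list(MAIN)
    for i, j in itertools.combinations(range(4), 2):
        cols.append(main[:, i] * main[:, j])
        names.append(MAIN[i] + "*" + MAIN[j])
    return np.column_stack(cols), names

def _rss(cols, X, r):
    """Residual sum of squares and parameter count for an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(r))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, r, rcond=None)
    resid = r - A @ beta
    return float(resid @ resid), A.shape[1]

def forward_stepwise(X, r, f_enter=4.0):
    """Add the term with the largest partial F-statistic until none exceeds f_enter."""
    selected, remaining = [], list(range(X.shape[1]))
    rss_cur, _ = _rss(selected, X, r)
    while remaining:
        best = None
        for j in remaining:
            rss_new, n_par = _rss(selected + [j], X, r)
            f_stat = (rss_cur - rss_new) / (rss_new / (len(r) - n_par))
            if best is None or f_stat > best[0]:
                best = (f_stat, j, rss_new)
        if best[0] < f_enter:
            break
        selected.append(best[1])
        remaining.remove(best[1])
        rss_cur = best[2]
    return selected

# Simulated neuron (hypothetical): modulated by o(t) and by the s(t)*o(t-1) interaction.
rng = np.random.default_rng(0)
n = 500
s = rng.choice([-1, 1], n)
o = rng.choice([-1, 1], n)
X, names = design_matrix(s, o)
rate = (5.0 + 2.0 * X[:, names.index("o(t)")]
        + 1.5 * X[:, names.index("s(t)*o(t-1)")]
        + rng.normal(0.0, 1.0, len(X)))
selected = [names[j] for j in forward_stepwise(X, rate)]
print(selected)
```

In this sketch, a neuron would count as other-predictive if `o(t)` is among the selected terms. With ten candidate terms and a fixed threshold, an unrelated term can occasionally enter by chance, which is one reason a proper significance criterion would be preferable in practice.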

Highlights.

  • Cingulate neurons predict another agent’s unknown decisions during social interaction

  • Other-predictive neurons are sensitive to social context, but not to expected reward

  • Distinct cingulate neurons encode the individual’s own decisions to cooperate or defect

  • Disrupting cingulate activity selectively inhibits mutually beneficial interactions

ACKNOWLEDGMENTS

This project was funded by NIH 5R01-HD059852, the Presidential Early Career Award for Scientists and Engineers, and the Whitehall Foundation. Data are available as Supplemental Information. We thank John Assad, Wael Asaad, Shaul Druckmann, Shaul Hochstein, Daeyeol Lee, Israel Nelken, and Sameer Sheth for insightful discussions, Caitlin Commins, Christine Emmanuel, Rebecca Gwaltney, Morgan Jamiel, and Kaitlin Sodon for technical assistance, and Katie Ris-Vicari for the graphical abstract.

Footnotes

SUPPLEMENTAL INFORMATION

Supplemental Information includes Extended Experimental Procedures, seven figures, and three tables and can be found with this article online at http://dx.doi.org/10.1016/j.cell.2015.01.045.

REFERENCES

  1. Abe H, Lee D. Distributed coding of actual and hypothetical outcomes in the orbital and dorsolateral prefrontal cortex. Neuron. 2011;70:731–741. doi: 10.1016/j.neuron.2011.03.026.
  2. Adolphs R. The social brain: neural basis of social knowledge. Annu. Rev. Psychol. 2009;60:693–716. doi: 10.1146/annurev.psych.60.110707.163514.
  3. Apps MAJ, Ramnani N. The anterior cingulate gyrus signals the net value of others’ rewards. J. Neurosci. 2014;34:6190–6200. doi: 10.1523/JNEUROSCI.2701-13.2014.
  4. Apps MAJ, Balsters JH, Ramnani N. The anterior cingulate cortex: monitoring the outcomes of others’ decisions. Soc. Neurosci. 2012;7:424–435. doi: 10.1080/17470919.2011.638799.
  5. Azzi JCB, Sirigu A, Duhamel JR. Modulation of value representation by social context in the primate orbitofrontal cortex. Proc. Natl. Acad. Sci. USA. 2012;109:2126–2131. doi: 10.1073/pnas.1111715109.
  6. Barraclough DJ, Conroy ML, Lee D. Prefrontal cortex and decision making in a mixed-strategy game. Nat. Neurosci. 2004;7:404–410. doi: 10.1038/nn1209.
  7. Behrens TEJ, Hunt LT, Woolrich MW, Rushworth MFS. Associative learning of social value. Nature. 2008;456:245–249. doi: 10.1038/nature07538.
  8. Behrens TEJ, Hunt LT, Rushworth MFS. The computation of social behavior. Science. 2009;324:1160–1164. doi: 10.1126/science.1169694.
  9. Britten KH, Newsome WT, Shadlen MN, Celebrini S, Movshon JA. A relationship between behavioral choice and the visual responses of neurons in macaque MT. Vis. Neurosci. 1996;13:87–100. doi: 10.1017/s095252380000715x.
  10. Bshary R, Grutter AS, Willener AST, Leimar O. Pairs of cooperating cleaner fish provide better service quality than singletons. Nature. 2008;455:964–966. doi: 10.1038/nature07184.
  11. Camerer C. Behavioral Game Theory: Experiments in Strategic Interaction. Princeton University Press; 2003.
  12. Carter RM, Bowling DL, Reeck C, Huettel SA. A distinct role of the temporal-parietal junction in predicting socially guided decisions. Science. 2012;337:109–111. doi: 10.1126/science.1219681.
  13. Chang SWC, Gariépy JF, Platt ML. Neuronal reference frames for social decisions in primate frontal cortex. Nat. Neurosci. 2013;16:243–250. doi: 10.1038/nn.3287.
  14. Clutton-Brock T. Cooperation between non-kin in animal societies. Nature. 2009;462:51–57. doi: 10.1038/nature08366.
  15. de Waal FBM. Primates—a natural heritage of conflict resolution. Science. 2000;289:586–590. doi: 10.1126/science.289.5479.586.
  16. Delgado MR, Frank RH, Phelps EA. Perceptions of moral character modulate the neural systems of reward during the trust game. Nat. Neurosci. 2005;8:1611–1618. doi: 10.1038/nn1575.
  17. di Pellegrino G, Fadiga L, Fogassi L, Gallese V, Rizzolatti G. Understanding motor events: a neurophysiological study. Exp. Brain Res. 1992;91:176–180. doi: 10.1007/BF00230027.
  18. Donahue CH, Seo H, Lee D. Cortical signals for rewarded actions and strategic exploration. Neuron. 2013;80:223–234. doi: 10.1016/j.neuron.2013.07.040.
  19. Frith CD, Frith U. Interacting minds—a biological basis. Science. 1999;286:1692–1695. doi: 10.1126/science.286.5445.1692.
  20. Gallese V, Goldman A. Mirror neurons and the simulation theory of mind-reading. Trends Cogn. Sci. 1998;2:493–501. doi: 10.1016/s1364-6613(98)01262-5.
  21. Hampton AN, Bossaerts P, O’Doherty JP. Neural correlates of mentalizing-related computations during strategic interactions in humans. Proc. Natl. Acad. Sci. USA. 2008;105:6741–6746. doi: 10.1073/pnas.0711099105.
  22. Hosokawa T, Watanabe M. Prefrontal neurons represent winning and losing during competitive video shooting games between monkeys. J. Neurosci. 2012;32:7662–7671. doi: 10.1523/JNEUROSCI.6479-11.2012.
  23. Karlsson MP, Tervo DGR, Karpova AY. Network resets in medial prefrontal cortex mark the onset of behavioral uncertainty. Science. 2012;338:135–139. doi: 10.1126/science.1226518.
  24. Kuhlman DM, Marshello AFJ. Individual differences in game motivation as moderators of preprogrammed strategy effects in prisoner’s dilemma. J. Pers. Soc. Psychol. 1975;32:922–931. doi: 10.1037//0022-3514.32.5.922.
  25. Lee D. Game theory and neural basis of social decision making. Nat. Neurosci. 2008;11:404–409. doi: 10.1038/nn2065.
  26. Lee D, McGreevy BP, Barraclough DJ. Learning and decision making in monkeys during a rock-paper-scissors game. Brain Res. Cogn. Brain Res. 2005;25:416–430. doi: 10.1016/j.cogbrainres.2005.07.003.
  27. Lombardo MV, Baron-Cohen S. The role of the self in mindblindness in autism. Conscious. Cogn. 2011;20:130–140. doi: 10.1016/j.concog.2010.09.006.
  28. Nash JF. Equilibrium points in N-person games. Proc. Natl. Acad. Sci. USA. 1950;36:48–49. doi: 10.1073/pnas.36.1.48.
  29. Parush N, Tishby N, Bergman H. Dopaminergic balance between reward maximization and policy complexity. Front. Syst. Neurosci. 2011;5:22. doi: 10.3389/fnsys.2011.00022.
  30. Paus T. Primate anterior cingulate cortex: where motor control, drive and cognition interface. Nat. Rev. Neurosci. 2001;2:417–424. doi: 10.1038/35077500.
  31. Pouget A, Snyder LH. Computational approaches to sensorimotor transformations. Nat. Neurosci. 2000;3(Suppl):1192–1198. doi: 10.1038/81469.
  32. Rapoport A, Chammah AM. Prisoner’s Dilemma; A Study in Conflict and Cooperation. Ann Arbor: University of Michigan Press; 1965.
  33. Rilling J, Gutman D, Zeh T, Pagnoni G, Berns G, Kilts C. A neural basis for social cooperation. Neuron. 2002;35:395–405. doi: 10.1016/s0896-6273(02)00755-9.
  34. Rilling JK, Sanfey AG, Aronson JA, Nystrom LE, Cohen JD. The neural correlates of theory of mind within interpersonal interactions. Neuroimage. 2004;22:1694–1703. doi: 10.1016/j.neuroimage.2004.04.015.
  35. Rizzolatti G, Sinigaglia C. The functional role of the parieto-frontal mirror circuit: interpretations and misinterpretations. Nat. Rev. Neurosci. 2010;11:264–274. doi: 10.1038/nrn2805.
  36. Rudebeck PH, Buckley MJ, Walton ME, Rushworth MFS. A role for the macaque anterior cingulate gyrus in social valuation. Science. 2006;313:1310–1312. doi: 10.1126/science.1128197.
  37. Sanfey AG, Rilling JK, Aronson JA, Nystrom LE, Cohen JD. The neural basis of economic decision-making in the Ultimatum Game. Science. 2003;300:1755–1758. doi: 10.1126/science.1082976.
  38. Sanfey AG, Loewenstein G, McClure SM, Cohen JD. Neuroeconomics: cross-currents in research on decision-making. Trends Cogn. Sci. 2006;10:108–116. doi: 10.1016/j.tics.2006.01.009.
  39. Seo H, Lee D. Temporal filtering of reward signals in the dorsal anterior cingulate cortex during a mixed-strategy game. J. Neurosci. 2007;27:8366–8377. doi: 10.1523/JNEUROSCI.2369-07.2007.
  40. Seo H, Cai X, Donahue CH, Lee D. Neural correlates of strategic reasoning during competitive games. Science. 2014;346:340–343. doi: 10.1126/science.1256254.
  41. Shadlen MN, Britten KH, Newsome WT, Movshon JA. A computational analysis of the relationship between neuronal and behavioral responses to visual motion. J. Neurosci. 1996;16:1486–1510. doi: 10.1523/JNEUROSCI.16-04-01486.1996.
  42. Sheth SA, Mian MK, Patel SR, Asaad WF, Williams ZM, Dougherty DD, Bush G, Eskandar EN. Human dorsal anterior cingulate cortex neurons mediate ongoing behavioural adaptation. Nature. 2012;488:218–221. doi: 10.1038/nature11239.
  43. Stephens DW, McLinn CM, Stevens JR. Discounting and reciprocity in an iterated prisoner’s dilemma. Science. 2002;298:2216–2218. doi: 10.1126/science.1078498.
  44. Tomlin D, Kayali MA, King-Casas B, Anen C, Camerer CF, Quartz SR, Montague PR. Agent-specific responses in the cingulate cortex during economic exchanges. Science. 2006;312:1047–1050. doi: 10.1126/science.1125596.
  45. Vickery TJ, Chun MM, Lee D. Ubiquity and specificity of reinforcement signals throughout the human brain. Neuron. 2011;72:166–177. doi: 10.1016/j.neuron.2011.08.011. [DOI] [PubMed] [Google Scholar]
  46. Vogeley K, Bussfeld P, Newen A, Herrmann S, Happé F, Falkai P, Maier W, Shah NJ, Fink GR, Zilles K. Mind reading: neural mechanisms of theory of mind and self-perspective. Neuroimage. 2001;14:170–181. doi: 10.1006/nimg.2001.0789. [DOI] [PubMed] [Google Scholar]
  47. Warneken F, Tomasello M. Altruistic helping in human infants and young chimpanzees. Science. 2006;311:1301–1303. doi: 10.1126/science.1121448. [DOI] [PubMed] [Google Scholar]
  48. Williams ZM, Eskandar EN. Selective enhancement of associative learning by microstimulation of the anterior caudate. Nat. Neurosci. 2006;9:562–568. doi: 10.1038/nn1662. [DOI] [PubMed] [Google Scholar]
  49. Witten IH. Adaptive optimal controller for discrete-time Markov environments. Inf. Control. 1977;34:286–295. [Google Scholar]
  50. Yoshida K, Saito N, Iriki A, Isoda M. Social error monitoring in macaque frontal cortex. Nat. Neurosci. 2012;15:1307–1312. doi: 10.1038/nn.3180. [DOI] [PubMed] [Google Scholar]
