Author manuscript; available in PMC: 2026 Apr 5.
Published in final edited form as: Neuron. 2022 Jun 14;110(17):2743–2770. doi: 10.1016/j.neuron.2022.05.022

Medial and orbital frontal cortex in decision making and flexible behavior

Miriam C Klein-Flügge 1,2,*, Alessandro Bongioanni 1,2, Matthew FS Rushworth 1,2
PMCID: PMC7618973  EMSID: EMS212946  PMID: 35705077

Abstract

The medial frontal cortex and adjacent orbitofrontal cortex have been the focus of investigations of decision making, behavioral flexibility, and social behavior. We review studies conducted in humans, macaques, and rodents and argue that several regions with different functional roles can be identified in dorsal anterior cingulate cortex, perigenual anterior cingulate cortex, anterior medial frontal cortex, ventromedial prefrontal cortex, and medial and lateral parts of orbitofrontal cortex. There is increasing evidence that the manner in which these areas represent the value of the environment and specific choices differs from that in subcortical brain regions and is more complex than previously thought. While activity in some regions reflects distributions of rewards and opportunities across the environment, in other cases, activity reflects the structural relationships between features of the environment that animals can use to infer what decision to take even if they have not encountered identical opportunities in the past.

Introduction

It is well-established that frontal cortex guides decision making and flexible behavior. This conviction is based on more than half a century of investigations into how animals and people adapt their behavior so that it is appropriate to changing circumstances. For example, the most recent investigations of orbitofrontal cortex (OFC) employing the very latest techniques (Banerjee et al., 2020) assess aspects of behavior – the ability to switch and change which choice is made – using a behavior reversal task with elements that would have been familiar to researchers investigating OFC more than fifty years earlier (Mishkin et al., 1969). Similarly, the idea that OFC and brain areas on the medial surface of the frontal cortex such as anterior cingulate cortex (ACC) and ventromedial prefrontal cortex (vmPFC) guide decision making is bolstered by a series of investigations showing that their activity reflects the value of choices, the process of decision making, and the value of the course of action pursued (Cai and Padoa-Schioppa, 2014; Kable and Glimcher, 2009; Rudebeck and Murray, 2014; Rushworth et al., 2011; Soltani and Koechlin, 2022; Wallis, 2012).

It is, however, clear that frontal cortex is far from the only brain region concerned with behavioral flexibility and reward-guided decision making. There are other, very different types of brain system that control decision making and behavioral flexibility, for example in the striatum, dopaminergic midbrain, and serotonergic brainstem, and identifying the special, additional contribution made by areas such as OFC and ACC is not always straightforward. The ACC and parts of OFC are found in many mammals, but they are especially extensive in primates. On the other hand, animals lacking OFC and ACC still exhibit changing patterns of decision making and behavioral flexibility. For example, larval zebrafish do not possess frontal cortical areas and yet they switch between exploiting opportunities for predation and exploring for new sources of food. In humans, the balancing of exploitation against exploration has been associated with frontal cortex (Badre et al., 2012; Daw et al., 2006; Trudel et al., 2021; Zajkowski et al., 2017). In zebrafish, however, this pattern of reward-guided decision making depends on the dorsal raphe nucleus (DRN) (Marques et al., 2020). In rodents and even in primates, DRN activity also tracks aspects of the reward environment and indicates when behavioral change might be necessary (Grossman et al., 2022; Hayashi et al., 2015; Khalighinejad et al., 2022; Wittmann et al., 2020). Neurons in other subcortical nuclei, for example midbrain dopaminergic neurons, also have activity that reflects the value of potential choices and the process of decision making (Wang et al., 2021; Yun et al., 2020). If subcortical systems carry such signals, can we specify how frontal cortical contributions differ? In order to do this, might we need to distinguish between the types of behavioral flexibility observed in zebrafish and humans?

Not only is it important to distinguish frontal cortical contributions from subcortical ones, but some recent results suggest a need to rethink the precise nature of the frontal cortical contributions. For example, when lesions are made to primate OFC in such a way that adjacent white matter is spared then behavior reversal is actually uncompromised (Rudebeck et al., 2013). Not only might frontal cortex not have the role that we thought it had in behavioral flexibility but, in addition, other scientists have argued that OFC and other frontal areas lack the representations of value that have, for the last two decades, been thought to guide decision making (Hayden and Niv, 2021). We may then need to conceive the specific contributions of frontal cortex in a more differentiated way.

In this review we summarize recent evidence regarding the nature of the representations found across a set of frontal cortical regions (figures 1, 2). First, we discuss how dorsal ACC represents the distribution of opportunities across the environment, computes their value at multiple time scales, decides whether to engage with a present opportunity or continue exploring, and influences information-seeking behavior. Second, we turn to anterior medial frontal cortex, describing its role in representing the structure of the environment, even when this is not immediately relevant for reward-guided decisions. We also discuss the relation between rodent and primate OFC. Next, we review evidence showing that vmPFC translates values into choices. We consider work on perigenual ACC showing a role in integrating costs and benefits to drive behavior and conclude by discussing how dorsomedial prefrontal cortex (dmPFC) organizes relationships in social contexts. Throughout the review, we also consider whether and, if so, how such representations might differ from and complement those present in subcortical brain systems previously associated with reward-guided choices and behavioral change. We compare the frontal cortical areas described with components of the subcortical circuit comprising dopaminergic and serotonergic nuclei under the control of a pathway running through the striatum, pallidum, and habenula that controls reward-guided behavior in many vertebrates (figure 2A). In mammals this circuit is, itself, partly under the control of frontal cortex as well as being influenced by dopamine and serotonin.

Figure 1. Medial and orbital frontal cortex in rodents, macaques, and humans.


Five functional regions, dorsal anterior cingulate (dACC), perigenual anterior cingulate (pgACC), dorsomedial prefrontal cortex (dmPFC), anterior medial frontal cortex (amFC), and ventromedial prefrontal cortex/medial orbitofrontal cortex (vmPFC/mOFC) are shown in relation to cytoarchitectonic maps of rat (left), macaque (center), and human (right) medial and orbital frontal cortex (based on Wise, 2008). All five regions are identifiable in humans and macaques. The color scheme indicates that the orbital region of rodents has some functional features shared with primate dmPFC, amFC, and vmPFC/mOFC, but that it does not correspond in a simple way to any of them. Thus, while the extent to which regions are homologous across humans and other primates, such as macaques, is relatively clear, correspondences between primates and rodents are more contentious and unlikely to be one-to-one in nature. This is important to bear in mind when evaluating evidence from different species about the regions’ functional contributions.

Figure 2. Multiple systems for representing the value of the environment in the vertebrate brain.


(A) A subcortical circuit for identifying rewarding environments uses an estimate of the value of an animal’s environment – how good is it currently – and detects prediction errors – occasions when the environment turns out to be better or worse than previously estimated. This circuit is present in rodents and primates but also many other chordate animals (Freudenmacher et al., 2020; Hong and Hikosaka, 2008; Matsumoto and Hikosaka, 2007; Stephenson-Jones et al., 2013, 2016). Some of its component elements are identifiable even in cyclostomes – such as the lamprey – which diverged from other chordates 550 million years ago. This includes inhibitory GABA-ergic mediated control (rostromedial tegmental nucleus, RMTg) of dopaminergic and serotonergic regions (DA/5HT) by the lateral habenula (LHb), which is in turn innervated by a habenula-projecting pallidal region (GPh) and the striatum (str). A second pathway runs via the dorsal pallidum/globus pallidus (GP) to brainstem motor areas such as the subthalamic nucleus (STN) (adapted from Stephenson-Jones et al., 2013). Neuromodulatory systems such as the dopaminergic (DA) midbrain nuclei – the ventral tegmental area (VTA) and substantia nigra pars compacta (SNc) – and serotonergic (5HT) dorsal raphe nucleus (DRN) allow mammals and birds to follow value gradients (blue-to-red gradients indicate low-to-high reward) so that they find and stay in the best locations in an environment. (B) Cingulate areas such as dACC (area 24 in primates and Cg1 and Cg2 in rodents) are identifiable in most mammals including monotremes and marsupials (Ashwell et al., 2008; Mayner, 1989; Suárez et al., 2018) suggesting an origin over 200 million years ago. In addition to the current value of the environment and prediction errors, dACC enables mammals to represent the distribution of opportunities across the environment and over time across multiple scales. 
In this example, an animal might learn about food availability over an intermediate time scale – time within a day – and a long time scale – time over weeks – allowing it to make predictions into the future. (C) Rodents and primates construct representations of the relationships between elements and features of the environment. For example, they might learn that if reward is available at one location (here indicated by one type of tree), it may be present or absent from another and vice versa. In the four examples, each illustrated by a pair of trees, a monkey has learned two types of relationships between rewards in two locations. In the first two cases, a positive correlation implies that when reward is found in location A (far left) another reward is likely to be available in location B, hidden from view, but when it is absent in A (second from left) it will also be absent in B. In the third and fourth cases (second from right and far right), the monkey has learned a negative correlation between reward in the two locations. This is the type of situation that is investigated in reversal tasks. Such cognitive maps of environmental contingencies depend on multiple brain systems, such as medial temporal lobe areas, and they are found in rodents. However, they may be an especially prominent feature of granular prefrontal areas in primates.

Anterior cingulate cortex and opportunities in the environment

An animal can manage well in many cases if it can represent the value of its current environment (whether and to what degree it is rewarding), whether the actions it takes will make its experience of the environment better or worse (positive and negative prediction errors), and its uncertainty about the estimates it is making about its environment. Such representations can guide an animal with limited environmental knowledge to find and stay in good foraging areas and to retreat from those that are not so good. For example, it is possible that a fish with such representations might manage well even if it were living in murky water that precludes remote sensing of the distribution of opportunities in the environment.

Subcortical systems provide important information in such environments (figure 2A). For example, in addition to the DRN-centered serotonergic system described above (Marques et al., 2020), dopaminergic neurons in monkeys encode prediction errors and report whether the environment is getting better or worse (Hart et al., 2014; Schultz, 2013; Schultz et al., 1997). As each opportunity is encountered, changes in dopaminergic neuron activity reflect whether the monkey will pursue it or wait for a better opportunity (Yun et al., 2020). Subcortical circuits that allow an animal to follow value gradients and find and stay in the best possible locations in their environment, or avoid dangerous ones, are present not only in rodents and monkeys but also in many other chordates, such as the lamprey (figure 2A) and other fish (Agetsuma et al., 2010; Amo et al., 2014).
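The prediction-error signal attributed to dopaminergic neurons is commonly modeled with a simple delta rule. The sketch below is a toy illustration of that standard model, not code from any of the cited studies, and the learning rate `alpha` is an arbitrary choice:

```python
# Toy delta-rule model of a reward prediction error (RPE): V is a
# running estimate of the environment's value; delta signals whether an
# outcome is better (+) or worse (-) than expected.

def update_value(V, reward, alpha=0.1):
    """Return (prediction_error, updated_value) after one outcome."""
    delta = reward - V              # RPE: surprise relative to expectation
    return delta, V + alpha * delta

V = 0.0
for r in [1.0, 1.0, 1.0, 0.0]:      # a run of rewards, then a miss
    delta, V = update_value(V, r)
# the final, unexpected miss yields a negative delta even though V > 0
```

In this scheme, the sign of `delta` reports whether the environment has just turned out better or worse than estimated, which is the sense in which such neurons are said to track whether things are improving or deteriorating.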

It has been pointed out, however, that not all environments are the same. For example, the terrestrial environment is a very different one which animals with distance receptors, such as mammals, are able to survey from afar (Hunt et al., 2021; MacIver et al., 2017; Mugan and MacIver, 2020). If an animal can survey its environment, then, in addition to representing its average reward value and reward prediction errors, it becomes possible to represent the distribution of opportunities across the environment and the changes and sequences of behavior by which the opportunities might be pursued (figure 2B). Converging experimental evidence shows that dorsal ACC (dACC) activity (1) encodes the distribution of opportunities across time as well as space (figure 3A), (2) assesses the value of disengaging from the present course of action (figure 3B), and (3) regulates switching between periods of exploiting such knowledge and seeking more information (figure 3C).

Figure 3. Exploring and navigating the distribution of opportunities in the environment via dACC.


(A) (left) Activity in dACC reflects a person’s experience of success on a simple task over multiple time scales simultaneously. Activity in the lighter, yellow regions in dACC is dominated by the most recent experience while activity in the orange regions reflects experience over a more extended time scale (adapted from Meder et al., 2017). (center) Neurons in ACC have the longest intrinsic time scale compared to other regions (e.g., middle temporal=MT, lateral intraparietal=LIP, or lateral or orbital prefrontal=LPFC/OFC) (adapted from Murray et al., 2014). (right) It is possible to work out whether one is on an upward or downward reward trajectory, and thus make predictions into the future, by comparing reward rates experienced over the short term and the longer term. Activity in human dACC reflects this comparison: recent reward experience is encoded with a positive sign (shown in yellow) and reward experienced over a longer time scale is encoded with a negative sign (shown in blue) (adapted from Wittmann et al., 2016a). (B) (left) Activity in dACC reflects the value of alternative courses of action in macaques. The better and worse alternatives are associated with significantly positive and no activity (illustrated in black and grey, respectively), as opposed to the option currently pursued, which is associated with a negative activity change (shown in yellow) (adapted from Fouragnan et al., 2019). (center) Activity in dACC reflects prospective value – the value that a course of action, such as leaving a job and looking for better employment, might lead to in the future – taking into account not just the mean value of opportunities but also their distribution and the time horizon available to explore them (adapted from Kolling et al., 2018). (right) Activity in dACC also reflects the sequence of actions needed to acquire a goal (adapted from Holroyd et al., 2018).
(C) (left) In rats, two anatomical projections from dACC mediate, on the one hand, exploring and evaluating behavioral change and, on the other, commitment to a course of behavior (adapted from Tervo et al., 2021). (center) Monkeys were taught to alternate between task performance (working) and exploratory behavior (checking) in a paradigm that captures something of how a person – a scientist, for example – might oscillate between working on a manuscript and checking their email. Activity recorded in dACC between the end of one trial (time 0) and a lever press could be decoded to predict whether monkeys would work or check: the color code indicates percentage correct linear decoding (blue indicates the threshold for significance at 70%, yellow indicates 90% correct decoding) (adapted from Stoll et al., 2016). (right) Exploration of a potential choice is often guided by uncertainty about its consequences. Neurons in macaque dACC are most active when cues are associated with uncertain outcomes (50% chance of outcome) as opposed to certain outcomes (0 or 100%; adapted from Monosov, 2017).

dACC and the distribution of experiences over time

One way in which resources in an environment might be unevenly distributed is across time. For example, one foraging location, such as a particular fruit tree, may have held a high value over the last couple of days since fruits have ripened, but a low average value over the last few months, when there were no fruits at all. On the other hand, another location populated by edible insects may have lower value right now, but higher long-term value, because the presence of insects is more regular across seasons. Both neuroimaging studies in humans and single neuron recording studies in macaques demonstrate that dACC simultaneously holds multiple representations of value with different time constants (Bernacchia et al., 2011; Cavanagh et al., 2016; Meder et al., 2017; Murray et al., 2014; Seo and Lee, 2009; Spitmaan et al., 2020; Wittmann et al., 2016a). For example, Meder and colleagues (2017) used fMRI to examine neural activity while people decided whether to repeat a choice or switch to an alternative. They reported that variation in dACC activity was related to variation in choice value. Importantly, however, dACC voxels carried estimates of choice value that were constructed over different time scales (figure 3A, left). For example, activity in one ACC voxel might reflect whether a choice had been successful and delivered reward on average over the course of many previous trials. However, another voxel’s activity might reflect whether the choice had been successful on just the most recent trials. This means that ACC constructs multiple estimates of choice value over different time scales.
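One way to picture such multi-time-scale coding is as a bank of running averages updated with different learning rates, where a small learning rate corresponds to a long time scale. The following is a toy sketch of that general idea, not the analysis used by Meder and colleagues:

```python
# Toy sketch: parallel value traces with different learning rates stand
# in for dACC voxels/neurons operating on different time scales.

def update_traces(traces, reward, alphas):
    """One outcome updates every trace; smaller alpha = longer time scale."""
    return [v + a * (reward - v) for v, a in zip(traces, alphas)]

alphas = [0.8, 0.1]               # fast (recent trials) vs slow (long run)
traces = [0.0, 0.0]
for r in [1.0] * 20 + [0.0] * 3:  # a long rewarded run, then a dry spell
    traces = update_traces(traces, r, alphas)
fast, slow = traces
# the fast trace collapses after three misses, while the slow trace
# still reflects the long history of reward (fast < slow)
```

The same outcome history thus yields divergent value estimates, mirroring how one voxel may reflect only the most recent trials while another reflects the average over many previous trials.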

Several features of such representations are worth noting (figure 3A). First, although present in dACC, they are not ubiquitous. For example, neuroimaging suggests they are not prominent in OFC, vmPFC, or most of lateral prefrontal cortex although they are found in anterior lateral prefrontal cortex and anterior insula (Fischer et al., 2017; Meder et al., 2017; Wittmann et al., 2016a). Second, there is a degree of topographic organization; choice values that depend on shorter timescales are, on average, anterior to choice value representations reflecting longer time scales (Meder et al., 2017). Moreover, interactions between dACC and other brain regions with the same properties are organized in a temporally structured manner: activity in dACC voxels operating on short and long time scales is, respectively, correlated with activity in voxels operating on the same time scales in other areas such as inferior parietal lobule. Third, there is, nevertheless, flexibility in how choice values are represented; all dACC voxels in fMRI studies (Meder et al., 2017) and neurons in single neuron studies (Spitmaan et al., 2020) tend to encode choice value over shorter time scales when the environment is changing quickly so that only the recent past is a good guide to the future. Conversely, the opposite happens in more stable environments when the long-term average might be more informative. This is consistent with the idea that these timescale representations might endow dACC with a form of metaplasticity that allows an animal to integrate feedback according to the currently relevant environmental rate of change (Farashahi et al., 2017). Such patterns of activity that occur in dACC during decision making may also underlie the way in which post-decision, reward-related dACC activity changes depending on whether weight is to be given to just recent or also longer time scales (Behrens et al., 2007).

As well as reflecting choice value over different time scales, dACC neuron activity, even at rest, shows patterns of autocorrelation over long time scales, meaning that activity fluctuations are slower than in the rest of the brain (figure 3A, center). Thus, compared with other frontal areas, ACC neurons both have long autocorrelation time constants and compute value with longer time constants (Cavanagh et al., 2016; Murray et al., 2014; Soltani et al., 2021; Spitmaan et al., 2020). Moreover, the two features appear to be linked: long autocorrelation time constants exist in neurons with long reward history time constants (Spitmaan et al., 2020). When lesions are made in ACC, macaques can adjust their behavior only in response to the most recent outcome; the influence of the longer history of reward and choice is lost (Kennerley et al., 2006).

If an animal can represent choice value over multiple time scales, it can represent its position in the environment with respect to the distribution of opportunities within it in ways not possible for an animal that represents one instantaneous rate of reward or reward prediction errors at a single time scale (compare figure 2A and B). For example, by comparing longer-term and shorter-term value representations an animal can estimate its reward trajectory, i.e., whether the environment is getting better or worse and, if so, how quickly it is changing. An insectivorous predator may estimate that the number of prey insects on a particular tree is higher today than the average of the last week, but at the same time notice that this is much lower than the annual average; this indicates a short-term peak within a long-term decline. Such information can be used to guide decisions about whether to persist in the current environment or to switch to an alternative, as we discuss in the next section (Wittmann et al., 2016a), and it may also enable meta-learning (Schweighofer and Doya, 2003). Value information at different time scales is not just important for animals; it may also be an important determinant of mood in humans (Eldar et al., 2016; Keren et al., 2021).
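The insectivore example reduces to comparisons between value estimates held at different time scales. The function below is a deliberately minimal toy, with invented numbers, illustrating how such comparisons classify a reward trajectory:

```python
# Toy classification of a reward trajectory from value estimates at
# three time scales (today's rate, weekly average, annual average).

def trajectory(day, week, year):
    short_term = "peak" if day > week else "dip"
    long_term = "decline" if week < year else "growth"
    return short_term, long_term

# prey counts: better than this week's average, far below the annual
# norm -- a short-term peak within a long-term decline
trajectory(day=8, week=5, year=12)   # -> ("peak", "decline")
```

An animal holding only a single instantaneous estimate (here, `day`) could not distinguish this situation from a genuinely improving environment.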

Careful analysis of dACC activity patterns (figure 3A, right) suggests subjects do indeed use the information contained in dACC representations to guide decisions about whether to keep foraging in one environment or to explore an alternative environment (Wittmann et al., 2016a). Moreover, individual variation in dACC activations reflecting short time scales predicts individual variation in the influence that value estimates constructed over short time scale will have on behavior. Similarly, individual variation in longer time scale neural representation strength predicts individual variation in the impact of longer term reward history on behavior (Wittmann et al., 2016a).

The representation of the environment borne by dACC differs from that found beyond cortex, for example in brainstem regions such as DRN. DRN activity reflects very broad aspects of the environment, such as whether it is good or bad and what its average value might be (Cohen et al., 2015; Hayashi et al., 2015; Khalighinejad et al., 2022; Wittmann et al., 2020). A dACC-possessing animal can represent the distribution of opportunities across its environment and not just its mean value and prediction errors about that mean. This is important if an animal is to identify and pursue a distant reward goal that lies beyond what might otherwise be a barrier – a region associated with minimal rewards or even a cost. Lesions in the Cg1/Cg2 regions of the rat (areas with some similarities to primate dACC) disrupt the ability to climb over a barrier to reach a more valuable outcome (Rudebeck et al., 2006; Walton et al., 2002, 2003).

Nevertheless, recent analyses of neurophysiological recordings from one subcortical region, the dopaminergic midbrain, demonstrate some distributional coding of values (Dabney et al., 2020). In neuroimaging studies, such representations have not yet been identified in the dopaminergic midbrain. This might mean that they are not as prominent as those in cortical areas such as dACC, or it may simply reflect the limits of spatial resolution in neuroimaging studies. It is also possible that representations at multiple timescales exist in both subcortex and dACC, but that their location reflects other aspects of task complexity. An important challenge will be to compare distributional coding and the degree of reward encoding over different time scales directly between cortical regions such as dACC and the dopaminergic midbrain, to determine whether they are similar or different, and to establish the degree to which they are mutually interdependent or independent.
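The flavor of the distributional coding reported by Dabney and colleagues can be conveyed with a toy population of units whose asymmetric learning rates make them converge on different expectiles of the reward distribution. This sketch is our own illustration of the general principle, not their model, and all parameters are invented:

```python
import random

# Toy distributional value code: each unit scales positive and negative
# prediction errors asymmetrically (parameter tau), so pessimistic and
# optimistic units come to bracket the mean and jointly encode the
# shape of the reward distribution, not just its average.

def train_expectile_units(rewards, taus, lr=0.05):
    units = [0.0] * len(taus)
    for r in rewards:
        for i, tau in enumerate(taus):
            err = r - units[i]
            rate = lr * (tau if err > 0 else 1 - tau)  # asymmetric update
            units[i] += rate * err
    return units

random.seed(0)
rewards = [random.choice([0.0, 1.0]) for _ in range(5000)]  # bimodal
low, mid, high = train_expectile_units(rewards, [0.1, 0.5, 0.9])
# low < mid < high: the population spans the distribution, whereas a
# single mean-coding unit would settle near 0.5 and discard its shape
```

A population like this distinguishes a guaranteed 0.5 reward from a 50/50 gamble between 0 and 1, something a single mean estimate cannot do.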

dACC and behavioral change

How might such representations guide adaptive and flexible decision making? We often think of decisions as being between one well-defined option and another; for example, a monkey choosing between an apple and an orange. Animals can and do make binary decisions but this scenario may not be representative of situations that foraging animals regularly encounter (Pearson et al., 2014). It is a lucky monkey that finds itself near and equidistant to fruiting orange and apple trees, but this is the situation that many of our laboratory decision making tasks simulate. Instead, foraging animals often encounter one opportunity at a time and the question is whether to engage with that opportunity or whether doing so represents an opportunity cost in terms of what else they are forgoing (Charnov, 1976; Freidin and Kacelnik, 2011; Stephens and Krebs, 1986). In other words, the foraging macaque may encounter an apple tree and, when it does, it needs to decide how much time and effort it should devote to foraging in the tree as opposed to continuing to explore for other opportunities. In a similar way, when one is already engaging with an option, the question is whether and when to leave it. For example, a human “foraging” in the job market might consider the value of alternative jobs to the one in which they are currently employed. Apparent idiosyncrasies in binary decision making behavior can be explained if animals are evaluating options against the context in which they occur, as would be expected of a sequential forager evaluating each opportunity encountered against the opportunity cost it entails in the current environment (Freidin and Kacelnik, 2011).
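The engage-or-move-on logic described above can be made concrete with the rule at the heart of foraging theory: an encountered option is worth pursuing only if its local reward rate beats the background rate of the environment. This is a minimal sketch with invented numbers, not a model fitted to any of the cited data:

```python
# Toy opportunity-cost rule in the spirit of Charnov's (1976) analysis:
# engage with an encountered option only if its reward rate exceeds the
# average rate obtainable from the environment as a whole.

def should_engage(option_reward, option_time, env_reward_rate):
    """True if exploiting the option beats continuing to explore."""
    return option_reward / option_time > env_reward_rate

# a patch yielding 10 units of reward over 4 time steps (rate 2.5):
should_engage(10, 4, env_reward_rate=1.5)  # True in a poor environment
should_engage(10, 4, env_reward_rate=4.0)  # False: better options forgone
```

Note that the same option is accepted or rejected depending only on the environment's average rate, which is exactly the context dependence that makes sequential foragers look idiosyncratic in binary-choice terms.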

Activity across several frontal lobe areas reflects aspects of such decision variables but it is notable that ACC activity prominently reflects the value of switching and the range of alternatives (Blanchard and Hayden, 2014; Fouragnan et al., 2019; Hayden et al., 2011; Kolling et al., 2012, 2018; Lopez-Persem et al., 2016; Mehta et al., 2019). It has been suggested that such activity might simply reflect the difficulty or response conflict involved in deciding which choice to take. However, it is now clear, from careful examination of data from both early (Kolling et al., 2012) and more recent studies, that such signals are decorrelated from difficulty and cannot be explained by it; instead, it is the value of the potential alternatives that the environment furnishes that is represented in dACC in both macaques and humans (Fouragnan et al., 2019; Kolling et al., 2016a, 2016b, 2018; Vassena et al., 2020; Wittmann et al., 2016a). Closely related activity patterns have been reported in interconnected regions of posterior cingulate cortex (Barack and Platt, 2021; Barack et al., 2017).

For example, Fouragnan and colleagues used fMRI to examine activity across the brains of monkeys that tracked the values of three possible choices. The values of the choices gradually changed over the testing session so that a choice that was good at one time was not so good at another. In addition, on any given trial only two of the three options were available for the monkey to choose between. Both task manipulations made it advantageous for the animals to switch choices from trial to trial and this is indeed what they did. Different types of representations of the values of alternative choices were found in dACC and hippocampus. While hippocampus activity reflected the value of currently unavailable options, dACC activity reflected the best alternative to the current choice, regardless of whether that option was available but rejected on the current trial, or not presented at all but liable to reappear on a future trial (figure 3B, left). Such a pattern of activity is not predicted by accounts of dACC emphasizing only the difficulty of making a decision. This is because alternative options that cannot be chosen during the current decision, but which might be chosen at some point in the future, should not change the difficulty of the current decision. dACC activity not only reflected the value of alternative choices, but it also reflected how likely animals were to switch to them if they were offered on a future occasion. dACC is a crucial node in the network for changing behavior and switching between choices; when its activity was disrupted by transcranial ultrasound stimulation (TUS), there was a reduction in adaptive patterns of switching behavior: switching was no longer driven primarily by the value of the best alternative choice but instead, maladaptively, was more influenced by the value of the other, less good, alternative (Fouragnan et al., 2019).

When people are in a similar situation and decide whether to engage with the most immediate opportunity or to explore alternatives, then again ACC activity reflects not just one but many aspects of the environment of alternative possibilities that contribute to improving decision making (Kolling et al., 2018). If we reflect on how a person might be constrained to move through and explore their environment, it becomes clear that not only the best alternative opportunity should be represented, but other environmental features as well (figure 3B, center). Imagine a scientist deciding whether to continue in their current job or to leave to invest time seeking better prospects elsewhere. The scientist should obviously consider the value of their current position and compare it with the job that they hope to find if they leave. However, in addition to the value of the coveted “dream job”, they also need to consider how likely they are to obtain it, or any other job, given their time horizon – the time period they have in which to explore the job market. In other words, the average value of alternative options in the job market, as well as their variance in value and ease of access, need to be considered. If the applicant has plenty of time to explore alternatives, then they might plan to apply and re-apply for the best jobs until they end up with one of them. However, if limited resources mean they only have limited time in which to land a new job, then they must be prepared to settle for more mediocre alternatives. All these factors determining the environment’s value for the job seeker – the average value of alternatives, their variance, and the time horizon for exploration – are encoded in ACC activity (Kolling et al., 2018).
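Why the time horizon matters, over and above the market's mean value, can be shown with a toy job market: the longer the searcher can keep sampling, the higher the expected value of the best offer found. The distribution and numbers below are invented purely for illustration:

```python
# Toy search-horizon calculation: expected value of the best of
# `horizon` independent draws from a discrete distribution of job
# values, using P(max == v) = F(v)^H - F(v_prev)^H.

def expected_best(values, probs, horizon):
    ev, cdf = 0.0, 0.0
    for v, p in sorted(zip(values, probs)):   # ascending by value
        ev += v * ((cdf + p) ** horizon - cdf ** horizon)
        cdf += p
    return ev

market_values, market_probs = [1.0, 2.0, 5.0], [0.5, 0.4, 0.1]
expected_best(market_values, market_probs, horizon=1)   # 1.8, the mean
expected_best(market_values, market_probs, horizon=10)  # ~3.95: with
# time to keep applying, it pays to hold out for the rare good job
```

Widening the spread of job values while holding the mean fixed has the same effect for a long-horizon searcher, which is why the variance of alternatives, and not only their average, is relevant to the decision.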

ACC activity does not just encode the opportunities available in the environment (figure 3B, right). In addition, it encodes what a person or animal might do to get to them. Individual neuron activity patterns and the multivariate pattern of activity in dACC reflect how rats, monkeys, and people progress through sequences of actions towards a goal (Holroyd et al., 2018; Ma et al., 2014, 2016; Procyk et al., 2000; Shahnazian and Holroyd, 2018; Shidara and Richmond, 2002) and the occurrence of unexpected events as they make their progress (Ribas-Fernandes et al., 2011). Again, such knowledge may be the product of interactions between dACC and hippocampus (Remondes and Wilson, 2013).

dACC and information seeking

So far, we have considered situations in which people and animals change and redirect their behavior to exploit opportunities they know the environment contains. However, in many cases such knowledge is absent or incomplete. When this is the case, dACC activity reflects the process of information seeking and the subsequent updating of the animal’s model of the world. For example, Tervo et al. trained rats to accept or reject choices signaled by two distinct tones. Each tone was paired with distinct reward probabilities which could change at unpredictable and uncued times. This meant that rats usually developed a preferred option they accepted and an unpreferred option they tended to reject. Nevertheless, rats periodically changed away from making a preferred choice to find out more about the value of an alternative (Tervo et al., 2021). Although they were more likely to do this when their preferred option had recently been unrewarded, they were also guided by expectations about the duration of periods in which either option might be the better one, and they also spontaneously moved in and out of exploratory phases of behavior (figure 3C, left).

The experiment performed by Tervo and colleagues is important not only because of the way in which it examines a naturalistic behavior in a carefully controlled setting but because it provides a way of reconciling insights into the role of ACC in, on the one hand, switching and exploring and, on the other hand, persistence and effort investment (Croxson et al., 2009; Kennerley et al., 2009; Klein-Flugge et al., 2016; Parvizi et al., 2013; Rudebeck et al., 2006; Walton et al., 2002, 2003). This is because dACC activity makes different contributions to behavior at different points in time during decision sequences, and it does so via different microcircuits linking it to distinct subcortical circuits. One microcircuit, in rats, runs from dACC to the substantia nigra pars reticulata (SNpr) and is important for initiating exploratory behavior at the time that the choice is made. Optogenetic silencing of either dACC (area Cg1/24b) itself or the dACC-SNpr pathway selectively reduced the frequency of exploratory choices of an alternative option as opposed to the preferred one. By contrast, another dACC microcircuit in rats runs from dACC to striatum and it is important for persistence in a behavior after a decision is taken and no reward is received. Optogenetic silencing of either dACC or the dACC-striatum pathway reduced the frequency with which animals persisted with a choice after non-reward (Tervo et al., 2021).

A range of approaches have been employed to address the question of information seeking but all have converged in highlighting dACC (figure 3C). Stoll and colleagues (Stoll et al., 2016) trained macaques to perform a perceptual discrimination task for food rewards. The monkeys could opt to perform the task or opt out to seek information about another visual object that gradually changed in appearance until it indicated a large bonus reward would be delivered. Many neurons in dACC changed activity at the time when the macaques sought information about the growth of the bonus indicator (figure 3C, center). Monosov, White, and colleagues trained macaques in a very different behavioral paradigm in which macaques did not have to actively seek information themselves but nevertheless they encountered cues providing information about what was to happen next and other cues confirming prior expectations (figure 3C, right). Activity in dACC neurons ramped up when macaques expected an information-bearing cue (Monosov, 2017; White et al., 2019). Hunt and colleagues trained macaques to choose between multicomponent visual objects (Hunt et al., 2018). The component features of the objects provided information about the magnitude or probability of rewards that would be received if chosen. The features were sometimes obscured but animals could seek the information they contained by saccading towards them. ACC activity was linked to the usage of information that had been sought by making a saccade; ACC activity reflected the degree to which the information revealed by a feature confirmed that the monkey would be making a good decision by choosing that object. Neuroimaging studies confirm that dACC has a preeminent role in information seeking; its activity reflects a person’s uncertainty about the choice that they are taking when they are actively exploring options to obtain information rather than when they are simply randomly responding (Trudel et al., 2021). 
Often in these studies, as in real life, seeking information can be beneficial in the future. How this differs from curiosity – the mere desire to know – and whether curiosity relies on dACC representations remains to be fully determined (Bromberg-Martin and Hikosaka, 2009; Kidd and Hayden, 2015; Kobayashi et al., 2019; Monosov et al., 2020; Wang and Hayden, 2021).

Like Tervo and colleagues, who investigated rodents, White and colleagues (White et al., 2019) confirm the importance in primates of a circuit spanning dACC and striatum and, in addition, provide evidence that the circuit extends into pallidum. Striatal regions near the internal capsule are strongly connected with ACC and these in turn are strongly connected with anterior globus pallidus and ventral pallidum. Neurons in all three areas show similar activity related to information seeking.

In sum, dACC represents the distribution of opportunities in the environment, it computes recent and long-term value, and on these bases, it determines whether a person or other animal should engage with a current option or explore the environment, including driving specific information-seeking activity (figure 3). However, important questions remain. Uncertainty-related activity has been reported in subcortical systems including the noradrenergic system originating in the locus coeruleus and the raphe nucleus. Our picture of how these regions interact with one another and dACC is currently changing rapidly (Joshi and Gold, 2022; Joshi et al., 2016; Muller et al., 2019; Soltani and Izquierdo, 2019; Tervo et al., 2014) but there is much that we still do not know. Perhaps most critically, we do not know why they interact, whether they encode identical indices of uncertainty, whether they have similar influences on behavior, and how they compare with influences from other neuromodulatory systems (Danielmeier et al., 2015; Fischer et al., 2015; Khalighinejad et al., 2020a, 2022).

Learning about environmental structure versus reward distribution in prefrontal cortex

In addition to the ability to organize behavior in time as a function of the distribution of opportunities in the environment, many mammals know about the structure and causal nature of relationships between these opportunities and other features of their environments. Such structural knowledge may suggest that the best choice to take next is not the one in which the animal has experienced the best reward distribution. For example, consider the case of a monkey that, one morning, finds a ripe fruit in a tree that yesterday’s search had revealed only to contain unripe fruit. If the monkey only considered the distribution of reward in the environment that it had experienced, then the discovery of one fruit in this previously unrewarding tree might not be enough to deter the monkey from going somewhere else. However, if the monkey knows about the way in which ripe fruit appears on a tree – in other words, if it has knowledge of the structural organization of its environment – then the presence of one ripe fruit in a tree might be sufficient to signify that all the fruit in the tree is ripening now and that the new location might now be the ideal one in which to forage. Such knowledge about the structure of the environment does not, however, depend on dACC.

One demonstration of this, for example, was recently provided by Vertechi and colleagues (2020). They devised a task for mice with features reminiscent of the foraging environment discussed above. The mice learned about an environment containing two locations at which they could nose poke. Importantly, water reward was probabilistically available at just one of the two locations, but its position varied over time. If the mice understood that water was available at only one location at any given time, then this meant that, once they had discovered water at one location, they should no longer visit the other location even if water had been found there on many previous occasions. After a training period, mice learned to switch quickly; once they had discovered water at one location on just a single occasion, they focused on that location regardless of whether the other had been rewarding on many previous occasions. Vertechi and colleagues argue that knowledge of environment structure allows mice to infer that the location of the water has changed as soon as they receive a reward for the first time at a new location. Regardless of the previous distribution of reward, they should now switch and focus on this new location. In line with what we have argued about ACC above, Vertechi and colleagues reported that ACC inactivation delayed switching regardless of the previous distribution of rewards. However, after OFC inactivation a different pattern of behavioral change was observed; mice were slower to switch, but their impairment was a function of how much water had been delivered at the other location previously. In other words, after OFC inactivation the mice’s behavior was guided by their experience of the distribution of reward in their environment, but it was no longer guided by their knowledge of the structure of opportunities in the environment.
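The contrast between inference-based and experience-driven switching can be caricatured with a delta-rule agent. In the hypothetical sketch below (function names and the learning rate are ours, not Vertechi and colleagues' model), an experience-driven agent needs a number of unrewarded visits to the old location that scales with that location's learned value before abandoning it, whereas a structure-aware agent, knowing that only one port pays at a time, can switch after a single reward at the new location.

```python
ALPHA = 0.3  # illustrative learning rate; our choice, not fitted to data

def visits_before_switch(old_value, alpha=ALPHA):
    # Experience-driven caricature: after a single reward at the new port
    # (so q_new = alpha * 1), count the unrewarded visits to the old port
    # needed before its value estimate drops below q_new.
    q_old, q_new, visits = old_value, alpha * 1.0, 0
    while q_old >= q_new:
        q_old += alpha * (0.0 - q_old)  # delta-rule decay on non-reward
        visits += 1
    return visits

def inference_switch():
    # Structure-aware agent: only one port pays at a time, so one reward
    # at the new port implies the state has changed; switch immediately,
    # with no further visits to the old port needed.
    return 0
```

The experience-driven agent's switch latency grows with the old location's reward history, mirroring the OFC-inactivation pattern, while the inference-based agent's latency is history-independent, mirroring intact behavior.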

This is just one of several studies emphasizing the importance of rodent OFC for the construction of models of the relationships between features of the environment (Wikenheiser and Schoenbaum, 2016). For example, OFC neuron activity reflects the learning of associations between cues that occurs incidentally even when the accumulation of this knowledge is not immediately reinforced (Sadacca et al., 2018) and other studies implicate human OFC in similar inference processes (Wang et al., 2020a, 2020b).

If the ability to make inferences based on knowledge of the structure of the environment rather than just experience of the distribution of reward in an environment is present in a mouse, then it may be present in a number of mammals. Nevertheless, it has received special attention in primates. Primates have highly developed visual systems and spend a large part of their time engaged in visual exploration of their environments (Graziano, 2009), often appearing to survey them from a much greater distance than does a rodent (Rudebeck and Izquierdo, 2021). They may range over large areas of many thousands of hectares as they forage (Passingham and Wise, 2012; Rudebeck and Izquierdo, 2021). Despite their impressive range, their feeding is often focused on the fruit, young leaves, and insects found in angiosperm trees. As a result, one difficulty primates face is in identifying the best places in which to forage within their range, because only a small proportion of the trees within it, perhaps as low as 4%, bear fruit at any point in time (Zuberbühler and Janmaat, 2010). The strategies that primates exploit to estimate where they might find food suggest that, in addition to using knowledge of spatial relationships, like those examined in the rodent study by Vertechi et al., they also frequently use knowledge of non-spatial, structural relationships between environmental elements to guide their choices. For example, they can learn patterns of correlation between visual aspects of the trees’ appearances, which can be discerned from afar, and food; they spontaneously search for and inspect visual cues that have previously been seen near food (Menzel, 1996). When they see that one tree is in fruit, they are more likely to visit other similar trees, suggesting that they infer that similar trees might be fruiting too (Janmaat et al., 2012).
When the weather has been warm, they return to trees that they have recently seen with unripe fruit, suggesting they have inferred that the fruit may now be ripe (Janmaat et al., 2006). And when persimmons are left in their home range, Japanese macaques appear to infer that persimmon trees may be fruiting and are more likely to visit persimmon trees (Menzel, 1991). Finally, primates are known to live in large social groups, and they excel in adjusting their behavior based on structural knowledge of the social dynamics and hierarchies present in their troop (a topic that we return to in more detail below in the section about dmPFC).

In summary, many animals not only learn and exploit relationships between sensory cues and their reinforcement consequences; they also understand the structure of relationships and contingencies between cues. Primates are particularly good at this, and in humans, such relationships may be quite abstract (Donoso et al., 2014). They often involve outcomes that will only unfold much later, or choices made on behalf of others, for example when predicting how the choice of our children’s school may impact their future prospects. We consider these ideas further in the following sections on anterior medial and dorsomedial PFC.

Anterior medial prefrontal cortex and representation of structure

Animals possess cognitive maps of the world around them (Tolman, 1948). Importantly, such maps make it possible not just to follow previously taken routes; they allow “vector navigation” – the ability to move directly from the current position to a goal location, even when it is not directly observable, by a novel route that might never have been taken before. Several medial temporal lobe (MTL) areas, such as hippocampus and entorhinal cortex, and interconnected areas have long been recognized as preeminent in such computations (Bush et al., 2015; Hartley et al., 2003, 2014). Neuroimaging experiments, however, suggest that a region on the medial surface of the frontal lobe is also a component of this circuit (Doeller et al., 2010). Moreover, in both human and non-human primates, the area’s role is not confined to representing spatial information; it also represents the structure of arbitrary associations between non-spatial items (Bao et al., 2019; Baram et al., 2021; Bongioanni et al., 2021; Constantinescu et al., 2016; Gerraty et al., 2018; Schuck et al., 2015). It is possible that the effects of lesions of medial frontal cortex that have previously been attributed to inflexibility or perseveration are better understood in terms of an animal’s beliefs about task structure (Jang et al., 2015).

It is difficult to identify the precise location of the key medial frontal region highlighted in neuroimaging experiments because our knowledge of the anatomy of this region is still evolving (Glasser et al., 2016; Neubert et al., 2015). Its posterior boundary is in or near the part of the cingulate sulcus anterior to the genu of the corpus callosum and it extends anteriorly to reach the medial aspect of the frontal pole. It therefore seems likely to include prefrontal areas such as medial area 10 (10v and 10r), but it may also extend ventrally into prefrontal area 14 and posteriorly into anterior perigenual cingulate cortex, p32. We refer to it here simply as anterior medial frontal cortex (amFC; figure 4). Areas 10 and 14 have only been identified in primates such as macaques and humans, and similarities in resting state connectivity patterns suggest the areas are components of similar circuits in the two species (Neubert et al., 2014, 2015). By contrast, area 32 also bears resemblance to regions found in non-primates, such as the prelimbic region of rodents (Vogt, 2009). Even if its boundaries remain to be precisely defined, recognizing that there is a specialized sub-region within this area that is different from adjacent dACC, perigenual ACC, and vmPFC is necessary if we are to account for the diversity of frontal cortical contributions to decision making and behavioral flexibility.

Figure 4. Task structure and amFC.

Figure 4

(A) (top left) Human participants experienced two different four-element sequences, one of which led to reward and one of which did not. (bottom center) The four elements in each sequence corresponded to locations on a 3 × 4 grid. (top center) Activity in amFC reflected the structure of the sequences that participants learned. (bottom right) In addition, initially amFC activity reflected the spatial distance between any two elements on the grid. Spatial distance was important during training but became irrelevant during the scan session. Over time, amFC activity began reflecting the now relevant transition frequencies between one location and the next. (B) (left) By contrast, pOFC and temporal pole activity reflected the precise order of reward-reinforced sequences. (center) Activity in pOFC and temporal pole ramped up as increasingly more rewarded sequence elements were present in the correct order. It was thus greatest for the final element D of the sequence but only if D was preceded by the correct stimuli (A, B, and C) in the correct order. (right) These representations did not change over time – they were robust and inflexible (adapted from Klein-Flügge et al., 2019). (C) (left) Human participants were asked to simulate a new experience (a novel combination of tea and jelly) on the basis of past experiences (of tea and of jelly). (center) amFC held representations of the novel simulations (light blue) and showed that representations of the previously experienced component elements became more similar to one another (dark blue). (right) Once participants actually experienced the novel combination (familiar group versus unfamiliar group), their ability to represent it was no longer a simulation based on experience of the component elements and so amFC representations of the combination and component elements became unlinked from one another (adapted from Barron et al., 2013).
(D) (left) Macaques chose between novel stimuli they had not previously experienced or had only limited experience of. Prior to the main part of the experiment, monkeys were, however, extensively trained, but only on a subset of stimuli (yellow cross, familiar options). Different visual features of the stimuli – their color and the density of dots with which they were covered – indicated the amount and probability of juice rewards that would follow if they were chosen. However, during the critical test phase, when fMRI data were acquired, the monkeys demonstrated that they could draw on their knowledge of these visual features to accurately estimate the values of novel options (white). (center left) During choice, the value difference reflects the key decision variable – how much better is one option than the other. When decisions were made between novel rather than familiar options, which required evaluating novel stimuli based on their magnitude and probability dimensions, the amFC decision signal was significantly stronger. (center right) Consistent with the idea that a process of online simulation and integration of the reward probability and magnitude features was occurring, there was evidence of grid-cell-like encoding of the probability and magnitude space that defined the task; a hexagonal pattern of activity modulation when animals encountered single options on different trajectories with respect to one another. Notably, no such preferential signaling of novel choice computations, nor any grid-like representation format, was observed in OFC or dACC. (right) The ability to base decisions on integrated choice values (combinations of magnitude and probability information) was disrupted by application of TUS to amFC; lines denote individual monkeys (adapted from Bongioanni et al., 2021).

Just as maps of the spatial world allow animals to perform vector navigation, the models of abstract relationships held by amFC mean that, even without direct, prior experience of all possible states of the world, amFC can make predictions about the nature of unobserved states and the consequences that will follow for the animal if they are entered. Thus, mammals possessing amFC can flexibly adjust their behavior by simulating the consequences of potential courses of action even before experiencing them and they can generalize and apply known relationships to new situations (Behrens et al., 2018). For example, a monkey might infer that a fruit tree in which it has never previously foraged is now likely to be of high value if it has observed that another similar tree at another location has fruited or if it has recently observed appropriate weather conditions. Similarly, humans can make informed choices between potential holiday destinations even if they have not visited any of them before, based on information gathered from places with some shared features. This requires abstract representations of relationships and the simulation of potential consequences.

AmFC and adjacent cortex are often active during value-based decision making (Bartra et al., 2013; Clithero and Rangel, 2014; Rushworth et al., 2011), but it is currently debated whether this should be attributed to their most fundamental role being one of valuation and reward prediction or to some other cognitive process such as the identification of task structure or determination of behavioral policy (Hayden and Niv, 2021). While some frontal areas we return to below may be especially concerned with representations of value, there is increasing evidence that amFC is particularly important for representing the structure of the environment. Klein-Flügge and colleagues (2019) provided one demonstration of this in an experiment in which they taught human participants to navigate through an artificial task environment comprising a 3 × 4 array of abstract shape stimuli (figure 4A). Their goal was to learn which sequence of stimuli led to a reward, but participants also showed learning of incidental statistical relationships governing transitions through the task environment. During initial training, participants’ movements through the stimulus space were constrained by spatial distances, allowing only movements between adjacent stimuli. In line with this, upon entering the scanner, amFC BOLD activity was strongly modulated by the spatial distance separating two successive stimuli encountered. However, during scanning, movements through the 3 × 4 task environment were no longer governed by spatial constraints and the amFC signature of spatial distance faded over time. Instead, participants’ attention during scanning was guided by sequentially highlighted stimuli which were no longer necessarily adjacent in space. Highlights could “jump”, but their sequence followed a particular transition pattern. AmFC BOLD signals at the time of a highlighted stimulus were modulated by the likelihood of the experienced transition, as tracked by a simple learning model. 
The ability to predict such transitions provided a behavioral advantage in the scan task, and the amFC signature of transition frequency became stronger as scanning progressed. Thus, amFC’s model of the task flexibly changed in line with the most advantageous behavioral policy and reflected both spatial and non-spatial aspects of the task structure. Importantly, the structural knowledge reflected in amFC activity was present independent of whether the transitions led to reward or not. Thus, amFC was not only concerned with representing value. AmFC’s activity pattern was also distinct from that seen in another area, posterior lateral OFC. In posterior lateral OFC, activity only reflected knowledge of specific reward-reinforced stimulus sequences and, unlike in amFC, these OFC activity patterns were robust and inflexible; posterior lateral OFC activity continued to hold the same rewarded sequence representations even when they were no longer relevant for the task in hand (figure 4B).
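A simple learning model of the kind used to track transition frequencies can be sketched as a delta-rule update over a row of transition probabilities, with the surprise term (one minus the predicted probability of the observed transition) being the quantity that would modulate a BOLD response. This is an illustrative reconstruction under our own assumptions, not the authors' exact model; the function names are hypothetical.

```python
def update_transitions(T, prev, cur, alpha=0.1):
    # Delta-rule estimate of transition frequencies: after observing a
    # transition from `prev` to `cur`, nudge that row of probabilities
    # toward the observed outcome; each row keeps summing to one.
    for state in T[prev]:
        target = 1.0 if state == cur else 0.0
        T[prev][state] += alpha * (target - T[prev][state])
    return T

def surprise(T, prev, cur):
    # How unexpected was the observed transition under the current model?
    return 1.0 - T[prev][cur]
```

Repeatedly observed transitions become progressively less surprising, which is the property that lets such a model track the "transition frequency" signature that strengthened over the scan session.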

Knowledge of abstract relationships can be particularly useful when we need to simulate new experiences to make predictions; several studies demonstrate that this is the case. For instance, Barron and colleagues asked participants to imagine new reward experiences based on novel combinations of previously experienced foods (Barron et al., 2013). Using fMRI in humans, they showed that amFC held representations of the novel experience and that these amFC representations of novel food combinations were linked to representations of the previously experienced component elements (figure 4C). In another study, Bongioanni et al. (Bongioanni et al., 2021) trained macaques to choose between pairs of two-dimensional stimuli for liquid rewards (figure 4D). They created new stimuli for the monkeys by presenting new combinations of amount and probability, but, importantly, the component features of the new stimuli were familiar, thus allowing the monkeys to infer the value of the new stimuli. fMRI and TUS revealed that amFC held multidimensional representations of the new options that were needed for optimal decision making between them. This supports the idea that amFC performs feature integration and represents knowledge of abstract relationships to help simulate novel experiences (Fellows, 2006; Kahnt et al., 2011; Spalding et al., 2018).

A recent proposal suggests that making novel inferences in abstract task spaces may depend on neurons with “grid”-like activity patterns (Behrens et al., 2018; Whittington et al., 2020). Such neurons, first reported in rat entorhinal cortex, have activity covering the spatial arena. Each neuron covers the space by possessing an array of place fields arranged on a triangular grid. If a rat takes some paths through the testing arena, then the neuron will fire repeatedly as the animal moves from one of the neuron’s place fields to the next. If different movement trajectories are tested, the neural response will vary depending on how well the trajectory aligns with the cell’s grid fields. Because there are six ways to align to a triangular grid, when examining the neural activity measured during transitions along all possible directions, the signal will increase and decrease six times per cycle, i.e., every 60º, and because the grid fields of neurons are aligned within a given animal, the aggregate activity of the population can be measured with fMRI. This pattern is especially prominent in amFC in both humans and macaques when they navigate through non-spatial task environments (Bao et al., 2019; Baram et al., 2021; Bongioanni et al., 2021; Constantinescu et al., 2016), just as when they navigate in a spatial arena (Doeller et al., 2010). Therefore, abstraction in non-physical space may rely on a similar grid cell-based coding scheme as that first discovered for physical space. In their study, Bongioanni and colleagues (Bongioanni et al., 2021) provided evidence that amFC activity is consistent with grid-like encoding of an abstract value space for novel choices. Crucially, in addition, the authors showed that disrupting this activity with TUS impaired the integration of information across this space in the guidance of decision making (figure 4D).
This demonstrates amFC’s causal role in representing abstract task relationships and in simulating novel experiences along dimensions of this task space.
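At its core, the hexadirectional analysis behind such fMRI results is a regression of the signal on cos(6θ) and sin(6θ), where θ is the movement direction through the (abstract) space; the fitted coefficients give the amplitude and preferred orientation of the six-fold modulation. The sketch below is a minimal illustration under the assumption that directions are sampled uniformly around the circle (so the two regressors are orthogonal); it is not the published analysis code, and the function name is ours.

```python
import math

def hexadirectional_fit(angles, signal):
    # Regress the signal on cos(6*theta) and sin(6*theta) to estimate the
    # amplitude and preferred phase of six-fold (60-degree periodic)
    # modulation. The per-regressor least-squares solution used here is
    # valid because uniformly sampled directions make the regressors
    # orthogonal.
    c = [math.cos(6 * a) for a in angles]
    s = [math.sin(6 * a) for a in angles]
    beta_c = sum(ci * yi for ci, yi in zip(c, signal)) / sum(ci * ci for ci in c)
    beta_s = sum(si * yi for si, yi in zip(s, signal)) / sum(si * si for si in s)
    amplitude = math.hypot(beta_c, beta_s)
    phase = math.atan2(beta_s, beta_c) / 6.0  # preferred grid orientation
    return amplitude, phase
```

Because the grid fields of different neurons share an orientation within an animal, a single such fit to the aggregate signal can recover both the strength of the grid-like code and its orientation.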

AmFC is interconnected with the MTL, where grid cells were first observed. This raises the question of whether one region’s activity is driven by the other, or whether they play complementary but independent roles. A recent intracranial EEG study of spatial navigation in humans (Chen et al., 2021) suggests that the medial frontal grid signal may precede the one observed in MTL. Similarly, activity in the dopaminergic midbrain and ventral striatum has sometimes been thought to resemble activity linked to model construction and inference typically observed in prefrontal cortex (Daw et al., 2011). When directly probed, amFC represents the full structure of participants’ task model while ventral striatum does not contain this information even if it reflects prediction errors that are contingent on such models (Klein-Flügge et al., 2019). Nevertheless, the precise contributions of, and communication between, subcortical and PFC representations remain to be fully determined.

Orbitofrontal cortex in rodents and primates

The previous section focused on amFC in primates. Rodents, however, also make inferences. What frontal brain structures do they use when they do so? When Vertechi and colleagues (Vertechi et al., 2020) investigated how mice learned that only one of two locations in an environment was associated with reward and used this knowledge to make inferences about where to forage, they focused on an area that they referred to as OFC. The same region has also been emphasized by Liu and colleagues (Liu et al., 2021) who examined how mice made inferences about auditory stimuli with respect to a shifting criterion. The mice learned to respond in one of two directions depending on whether an auditory tone had a frequency above or below a criterion level. From time to time in the task, however, the criterion shifted upwards or downwards. Initially the mouse had to learn by trial and error that a given tone was now, for example, higher than the new criterion even if it had not been higher than the previous criterion. However, such an experience should allow it to infer, for example, that other even higher tones are also above the new criterion. If the mouse understands the task’s structure, then it does not have to learn what to do when it hears each tone simply by accumulating experience with that tone alone; instead, it can make inferences from one tone to another. Again, OFC disruption compromises the mouse’s ability to do so. Other aspects of task structure, such as sequential organization, are also encoded in rodent OFC (Zhou et al., 2019, 2021).
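The inference described in the criterion task can be captured by interval-tightening over the hidden criterion: each trial-and-error outcome narrows the range in which the criterion can lie, and any tone outside that range can then be classified without ever being experienced. The sketch below is a toy formalization of that logic; the function names are hypothetical and it is not Liu and colleagues' model.

```python
def update_bounds(lo, hi, tone, is_above):
    # Trial-and-error feedback tightens the interval containing the hidden
    # criterion: a tone that turns out to be "above" caps the criterion
    # from above; one that turns out to be "below" raises its floor.
    if is_above:
        return lo, min(hi, tone)
    return max(lo, tone), hi

def classify(lo, hi, tone):
    # Tones outside the interval can be classified by inference alone,
    # without any direct experience of that particular tone.
    if tone >= hi:
        return "above"
    if tone <= lo:
        return "below"
    return "unknown"  # still ambiguous; requires trial and error
```

A single piece of feedback about one tone thus immediately determines the correct response to every tone further from the criterion, which is the structure-based shortcut that OFC disruption abolishes.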

The designation of the OFC area in rodents follows from the work of Uylings and van Eden (Uylings and van Eden, 1990) who attempted to establish rodent-primate similarities in prefrontal areas on the basis of their thalamic connection patterns rather than their intrinsic cytoarchitecture. However, whether rodent OFC corresponds in a simple and direct way to primate OFC has been long debated (Preuss, 1995; Wise, 2008). A notable difference between rodent and primate OFC is that OFC lesions in macaques do not cause switching deficits in reversal tasks (Rudebeck et al., 2013). Understanding how these OFC areas in primates and rodents relate to one another is far from straightforward.

There are also functional similarities between rodent and primate OFC areas. Even though OFC lesions in primates do not disrupt reversal task performance, Rudebeck and colleagues (2013, 2017) report that they do disrupt decision making in reward devaluation tasks, just as such lesions do in rodents (Lichtenberg et al., 2021; Malvaez et al., 2019; Pickens et al., 2003; Sias et al., 2021). Disruption of OFC leads to other patterns of change in reward-guided decision making that are similar in mice and macaques (Ballesta et al., 2020; Kuwabara et al., 2020) and there is evidence that human OFC plays a similar role (Howard et al., 2020; Wang et al., 2020a, 2020b). In such tasks, different choices lead to different rewards. If one reward is devalued (for example, by feeding an animal to satiety on that reward prior to testing), then the animal should infer that it is no longer optimal to pick the choice that leads to the devalued option, and so it should refrain from taking it and choose the alternative one. The fact that the reversal task and the devaluation task are dissociable in primates suggests they tap into at least partially dissociable cognitive abilities. Succeeding in a reversal task involves, in addition to cognitive flexibility, an understanding of the structure of the task, for which primate OFC may not be required, while succeeding in a devaluation test requires anticipation of future outcomes and their value for the self, which appears to be enabled by OFC across species.

Additionally, there are also similarities between primate amFC and rodent OFC. Both primate amFC and rodent OFC (Stalnaker et al., 2015) represent structural knowledge and mediate inference, and both have an approximately similar topographical relationship with other areas such as dACC and perigenual ACC/prelimbic cortex. OFC activity in rats reflects goal locations and not just current location, making it reminiscent of amFC activity in humans (Klein-Flügge et al., 2019). It is also intriguing to see the similarity between the knowledge of sequential task structure decodable from human amFC (Klein-Flügge et al., 2019) and rat OFC (Zhou et al., 2021). In addition, just as human amFC is concerned not only with the learning of reward-related associations but also associations between non-rewarding task features, so is rodent OFC (Lopatina et al., 2015, 2016; Sadacca et al., 2018). On the other hand, however, it is notable how readily humans learn task structure. For example, human participants performing a version of Vertechi and colleagues’ (2020) inference test learn the underlying task structure an order of magnitude more quickly than mice. Such findings remind us that we should not expect learning and inference to be identical in every respect in rodents and primates.

It is also important to remember that Klein-Flügge and colleagues (Klein-Flügge et al., 2019) found evidence of encoding of sequence structure in a second frontal area – a posterior lateral orbitofrontal area. As already noted, sequence knowledge in this area is focused on reward prediction and, compared with amFC, is updated slowly but is also more robust and stable. Its position near the border between granular and agranular cortex means that it is cytoarchitecturally more similar to rodent OFC. The speed with which its representations are updated is reminiscent of the slow speed with which rodents learn task structure.

In both macaques and humans there is a region on the lateral border of the orbitofrontal cortex and ventral border of the ventrolateral prefrontal cortex that is important for learning specific choice-outcome contingencies (Boorman et al., 2016; Chau et al., 2015; Folloni et al., 2021; Jocham et al., 2016; Noonan et al., 2010, 2011, 2017; Rudebeck et al., 2017; Walton et al., 2010). The critical region is not in areas 11 and 13 between the lateral and medial orbitofrontal sulci (Rudebeck et al., 2017) but instead it lies in and lateral to the lateral orbitofrontal sulcus in area 47/12o (Chau et al., 2015; Folloni et al., 2021). Normally, monkeys’ decisions between choices reflect the history of reward received immediately after taking such choices in the past. When a lesion is made that includes 47/12o or ultrasound is focused to disrupt 47/12o, outcomes are not correctly credited to the choices that caused them and, consequently, choices are simply repeated if they were made in the context of a high global reward state even if the choice itself was not causally responsible for a reward. Activity in other brain regions, such as DRN and insula, reflects global reward state regardless of choice taken (Folloni et al., 2021; Wittmann et al., 2020). Therefore, animals lacking a prefrontal cortex and relying on older neural circuits may, like primates with lesions in 47/12o, learn on the basis of this global signal rather than on the basis of fine-grained and specific choice-outcome contingencies. In humans, adjacent but even more lateral prefrontal cortical regions mediate other related cognitive processes, for example, prospective, metacognitive estimation of the impact that choices will have even before they are taken (Miyamoto et al., 2021).

In summary, rodent OFC may hold representations that are similar to those present in both OFC and amFC in primates but not identical to either. In some regards it might resemble very posterior OFC areas on the boundary with insula in the primate brain that often receive less attention in human and macaque investigations. Moreover, some features of activity in rodent OFC, such as the encoding of alternating sequence elements (Zhou et al., 2021) or the relationships between task states (Bartolo and Averbeck, 2020; Tang et al., 2021) are reminiscent of patterns reported in yet other primate prefrontal regions (Shima et al., 2007). Rather than trying to link rodent OFC with any one of these primate areas – amFC, 47/12o, or some of the areas that we discuss below such as vmPFC – it may be better to think of rodent OFC as bearing a general similarity to all of them; we have illustrated this idea in figure 1. Such a view suggests that while rodents have a frontal cortex that equips them to make inferences (Vertechi et al., 2020), primates are ready to employ many different and specialized circuits for inferential processes (Wise, 2008).

Ventromedial prefrontal cortex and decisions but not just about reward

The amFC region described above is extensive and may contain different component subregions. When we focus on the decision process itself, however, it is noticeable how frequently decision-related activity appears in or beyond the ventral border of this region. It is possible that another region here can be distinguished by both location and function. It is often referred to as ventromedial prefrontal cortex (vmPFC; area 14m) in humans but it corresponds to areas sometimes referred to as medial orbitofrontal cortex (mOFC) in macaques. We also note, however, that the label vmPFC has sometimes been used to refer to activations extending dorsally beyond area 14m, into amFC and perigenual ACC. Here we refer to this region as vmPFC/mOFC and argue that it is particularly important for turning representations of choice options into actual decisions (figure 5).

Figure 5. vmPFC and decisions about not just rewards.

Figure 5

(A) Decision making processes can be simulated in neural networks in which pools of neurons represent choices of one option or another. Neurons within each pool have recurrent excitatory connections but inhibitory interneurons connect the two populations. This pattern of connectivity ensures that if one population ends up in a high firing state, the other population’s activity is suppressed; the choice associated with the first population is taken while the second is not. The population that is most likely to reach the high firing state is the population with the strongest input (the choice with the highest associated evidence). (B) Emergence of positive and negative value differences in BOLD fMRI data may be explained by the time spent in the final attractor state before the network is reset. Top: The firing rate of A units is plotted against the firing rate of B units for a situation where A ends up being chosen (dark blue) and where B ends up being chosen (red). A key difference is the time spent in the high- and low-firing state, which is shown as larger (longer) and smaller (shorter) grey circles and which may change how much the speed of the competition process, relative to the final steady state, influences the activity measured using fMRI (or similar techniques with limited temporal resolution). Scenario A (left): If the steady state has the strongest influence on the measured signal, it will be positively related to the value of the option chosen and negatively to the option rejected, as typically observed in human fMRI studies. Scenario B (right): By contrast, if the speed of the competition process predominates in the signal, the observed modulation will be negative for the option chosen and positive for the option rejected, as typically observed in macaque fMRI experiments. Bottom: The expected BOLD signal is illustrated schematically for the two scenarios. See also Supplementary Fig. 1 for a more detailed description of the attractor model.
(C) vmPFC activity relates to the comparison of the value of the chosen option versus another option in macaques (left; adapted from Wittmann et al., 2020) and across a number of studies in humans (right; adapted from Boorman et al., 2009, 2013; Chau et al., 2014; Iigaya et al., 2020; Park et al., 2021). (D) By comparison, a more dorsal region in amFC shows activity related to task structure in macaques (left; adapted from Bongioanni et al., 2021) and across multiple studies in humans (right; adapted from Constantinescu et al., 2016; Doeller et al., 2010; Klein-Flügge et al., 2019; Park et al., 2021).

During reward-guided decision-making, vmPFC/mOFC acts as a choice option comparator. VmPFC/mOFC signals match predictions from a biophysically plausible network model (Wang, 2002, 2008; Wong and Wang, 2006) where two competing pools of neurons, each representing one option, mutually inhibit each other and compete for choice. Predictions from these models show that in measures of bulk activity such as those obtained with BOLD-fMRI or MEG, the signature of a choice computation amounts to a difference between the value of the chosen and the value of the unchosen option (Hunt et al., 2012). This is precisely the signature of activity found in vmPFC/mOFC (Boorman et al., 2009; De Martino et al., 2013; FitzGerald et al., 2009; Hunt et al., 2012; Trudel et al., 2021). As already noted, other brain areas, such as dACC, have activity that reflects key decision variables such as the difference in value between potential choices. However, closer inspection reveals important differences in activity patterns in vmPFC/mOFC and dACC. As we have seen, dACC activity reflects the value of switching away from a current opportunity to explore alternatives, potentially over the course of an extended series of sequential decisions. Consistent with this observation, Boorman and colleagues (Boorman et al., 2013) reported that dACC activity tracked the longer-term value of a choice. However, vmPFC/mOFC reflected the choices’ current value on the present trial. Boorman and colleagues were able to tease apart the representations because they employed a task in which choice values reflected two different features, one that remained relatively constant over several trials (tracked by dACC) and one that changed on every trial; vmPFC reflected the difference in value between options once both features had been integrated.

One intriguing observation pertains to the direction of the value difference signal observed in vmPFC/mOFC. While it is consistently positive in humans (larger BOLD signal changes are associated with larger differences between the chosen and unchosen option values), it has the opposite sign in macaque monkeys (Bongioanni et al., 2021; Fouragnan et al., 2019; Papageorgiou et al., 2017; Wittmann et al., 2020). At first, these observations might seem incompatible. However, both types of patterns could reflect the output of the same biophysical attractor network, albeit with small modifications in the behavior of the neural populations. For example, allowing variation in the time spent in the high-firing attractor state before returning to baseline firing could produce two opposing predictions (figure 5A,B). If the neural population remains in the high-firing state for some time before returning to baseline, the observed BOLD difference would be most influenced by this sustained activation, which would scale with the value difference and thus lead to a positive BOLD value difference signal (figure 5B, left). By contrast, if the neural populations only briefly transitioned through the decisive high-attractor state (e.g., because the decision is immediately passed on to another region), then the BOLD signal would be most strongly influenced by the duration of the competition process, which would be shorter in a trial with a large value difference and longer when the value difference is small. Thus, in this situation, we would expect a negative relationship between value difference and the measured BOLD signal (figure 5B, right). A similar argument has been put forward elsewhere (Hunt and Hayden, 2017). It is notable that slight differences in the type of stimulus material about which macaques make decisions lead to different patterns of positive and negative modulation in single-neuron recordings (Okazawa et al., 2021). Again, these differences might relate to the way in which the dimension of stimulus discrimination is transformed into the dimension of response selection and, therefore, in how information about choice selection is passed to subsequent brain areas.
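The competition dynamics described above can be illustrated with a minimal rate model of two mutually inhibiting pools. This is a toy sketch, not the full biophysical model of Wang (2002); all parameter values are illustrative, but it reproduces the key qualitative prediction that a larger value difference resolves the competition faster:

```python
import numpy as np

def simulate_choice(value_a, value_b, dt=1e-3, t_max=2.0,
                    w_exc=0.6, w_inh=0.8, tau=0.1, noise=0.02,
                    threshold=20.0, seed=0):
    """Toy rate-based attractor model of value comparison: two pools
    with recurrent self-excitation and mutual inhibition receive inputs
    proportional to the two option values. Returns the winning option
    and the time taken to reach the high-firing (decision) threshold."""
    rng = np.random.default_rng(seed)
    r = np.zeros(2)                       # firing rates of pools A and B
    inputs = np.array([value_a, value_b], dtype=float)
    for step in range(int(t_max / dt)):
        # self-excitation within a pool, inhibition from the other pool
        drive = inputs + w_exc * r - w_inh * r[::-1]
        drive = np.maximum(drive, 0.0)    # rectify: no negative drive
        r += dt / tau * (-r + drive) + noise * np.sqrt(dt) * rng.standard_normal(2)
        r = np.maximum(r, 0.0)
        if r.max() > threshold:           # one pool has won the competition
            return ('A' if r[0] > r[1] else 'B'), (step + 1) * dt
    return None, t_max                    # no decision reached in time
```

With these settings the symmetric state is unstable, so the pool with the stronger input wins, and the time to threshold shrinks as the value difference grows, which is the quantity that, on the account above, could dominate a bulk signal such as BOLD when the high-firing state is only transient.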

Direct electrophysiological recordings from macaque and human vmPFC/mOFC support its role in converting relative values into choices (Lopez-Persem et al., 2020; Strait et al., 2014). As in human neuroimaging experiments, vmPFC/mOFC activity in macaques reflects reward value integrated across dimensions and shows anti-correlated tuning for the two options’ values during decision making, indicative of value comparison, followed by coding of the chosen option’s value, indicative of encoding of just the final choice (Strait et al., 2014). Similar signals are observed in intracranial electroencephalography recordings taken from vmPFC/mOFC in human epilepsy patients, albeit with a positive modulation by subjective value (Lopez-Persem et al., 2020).

Not only does vmPFC/mOFC carry signals that reflect a translation of values into choices, but these vmPFC/mOFC signals are necessary for decision making. Lesions in vmPFC/mOFC, both in macaques and humans (Camille et al., 2011; Noonan et al., 2010, 2017) and manipulation of human vmPFC/mOFC activity using tDCS (Hämmerer et al., 2016) increase choice stochasticity and reduce the accuracy of the choice comparison process. This is in line with work showing that individual variation in the excitation-inhibition balance in vmPFC/mOFC’s activity is directly related to individual variation in choice stochasticity (Jocham et al., 2012). By contrast, ultrasonic disruption of activity in amFC does not affect choice stochasticity (Bongioanni et al., 2021). Instead, it alters abstract value space representations which suggests differences in the functional roles of amFC and vmPFC/mOFC. Nevertheless, vmPFC/mOFC activity may not always be required to select choices. As choices become more familiar, and as a result rely less on online comparison processes and more on simpler heuristics and precomputed values, choice value signals become weaker or disappear entirely from vmPFC/mOFC in monkeys and humans (Bongioanni et al., 2021; Hunt et al., 2012).

In macaques, perhaps the best characterized population of neurons with a role in decision making is situated even more laterally on the orbital surface, in area 13. Here individual neurons have been identified with activity that is selective for specific options in reward-guided decisions. In an important series of studies, Padoa-Schioppa and colleagues examined decisions between visual stimuli associated with different types of juice. For example, in one experiment blue and yellow squares indicated water or unsweetened kool-aid. Increments in the number of stimulus elements – i.e., the number of squares – indicated increments in the amount of that juice type available. “Offer value” neurons were selective for particular stimuli/juice types but their firing rates changed with the amount available (Cai and Padoa-Schioppa, 2014; Padoa-Schioppa and Conen, 2017). The activity distributed across the population of neurons in and near this area encodes the identities of potential choice options and, during the course of decisions, it is possible to track the relative strengths of the representations of the potential choices. The relative strengths of representations may change repeatedly during the course of decision, especially when the options are close in value, but eventually one comes to predominate and the choice is taken (Bongioanni et al., 2021; Hunt et al., 2018; Klein-Flügge et al., 2013; Rich and Wallis, 2016). Selective and focused inactivation or stimulation of area 13 alone is sufficient to interfere with the way in which value-guided decisions are made (Ballesta et al., 2020; Murray et al., 2015). Understanding how area 13 and the more medial vmPFC/mOFC area discussed above collaborate or specialize during decision making remains an ongoing topic of discussion.

VmPFC/mOFC value comparison signals reflect many other influences that impact the way that options are valued during decision making. For instance, the presence of a third option may impact the way that two other options are valued and compared and thus affects which choice is likely to be taken (Chau et al., 2020; Dumbalska et al., 2020; Louie et al., 2015; Webb et al., 2020). In addition, the presence of a less valuable item within a compound option is known to reduce the estimated value of the compound relative to the more valuable item alone, in the “less-is-more” effect displayed by both human and non-human primates (Kralik et al., 2012; List, 2002). Both of these phenomena are reflected in vmPFC/mOFC’s choice comparison signal (Chau et al., 2014; Fouragnan et al., 2019; Lim et al., 2011; Papageorgiou et al., 2017; Suzuki et al., 2017) and both phenomena are disrupted by lesions in vmPFC/OFC (Noonan et al., 2010, 2017; Papageorgiou et al., 2017). When satiety or background context change the way that options are valued, then vmPFC activity reflects changes in the way that the options will be valued even before any decision is made (Abitbol et al., 2015).

However, many factors other than the reward value of a choice influence whether it will be taken. Such influences include, for example, the recent reward rate regardless of which choice was taken, choice traces (the history of which choices were taken recently regardless of whether they were rewarded), the number of offers viewed, the attended location, the sense of social controllability, the confidence and uncertainty related to a choice, or the value of upcoming information. All these variables have been shown to affect vmPFC activity at the time of decision-making (Kaanders et al., 2021; Leong et al., 2017; Mehta et al., 2019; Na et al., 2021; Trudel et al., 2021; Wittmann et al., 2020). One important consideration seems to be the policy that is currently guiding behavior. In a recent study, Trudel and colleagues (2021) demonstrated that vmPFC/mOFC activity displays an impressive degree of flexibility and that the same variable can be associated with either a positive or a negative change in vmPFC/mOFC activity depending on the goal of the decision. In their study, optimal decisions required periods of exploration and periods of exploitation. Not only did vmPFC/mOFC BOLD reflect the uncertainty as well as the value of choices, but vmPFC/mOFC BOLD signatures of choice uncertainty flipped sign depending on context, with negative uncertainty coding during exploitation (when participants were selecting options that they were certain were high in value) and positive uncertainty coding during exploration (when participants were selecting options that they were uncertain about in order to find out more about their value). Such a change is consistent with the existence of not only neural circuits specialized for exploration but also neural systems mediating both reward exploration and exploitation (Costa et al., 2019).

In summary, vmPFC/mOFC represents potential choice options, computes their comparison, and turns them into actual choices in the frame of reference currently relevant for guiding actions (Grueschow et al., 2015). More generally, and in contrast to dACC, vmPFC/mOFC activity reflects the relative evidence for taking one choice over another, along the multiple dimensions and in the frame of reference relevant for the choice at hand. In many cases this might mean that it reflects choice values, but other variables might be represented depending on current action policies (Hayden and Niv, 2021).

The boundary between amFC and vmPFC/mOFC remains to be precisely defined (figure 5C, D). Many studies on reward-guided decision making find activations in both locations (Bartra et al., 2013; Clithero and Rangel, 2014) or at the border (Schuck et al., 2016). This might be because these tasks often require simulations of novel option values or states as well as choice computations, meaning the two processes occur simultaneously and may not be easily teased apart. More evidence to support a dissociation between abstract structure representations in amFC and the turning of such representations into decisions in vmPFC/mOFC will therefore be needed. However, a recent study by Park and colleagues (2021) provided a first compelling test of this. In their task, human participants were required to represent abstract relationships between different individuals along two dimensions (popularity and competence). When participants had been trained on these relationships, they were asked to make binary choices about which of two individuals would be a better partner for a third individual while undergoing fMRI. Making such a choice required representing the popularity and competence of each of the three individuals on the one hand and, on the other, computing the combined strength or ‘growth potential’ of each two-person team. A hexagonal modulation of the BOLD signal in amFC indicated grid-like coding of the growth potential in the abstract social space spanned by all possible individuals, while the BOLD signal related to the decision variable – the difference between teams’ growth potential – was located more ventrally in vmPFC/mOFC (figure 5C, D). This provides compelling evidence, within the same task, that abstract grid-like representations of relevant relationships are represented in amFC and converted into decisions in vmPFC/mOFC (Park et al., 2021).
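The hexagonal-modulation analyses cited here test whether the signal varies with six-fold periodicity as a function of the direction of movement through the abstract two-dimensional space. A minimal sketch of the standard quadrature regression is given below; the function and variable names are ours, and this omits the cross-validation across runs used in the actual studies:

```python
import numpy as np

def hexagonal_regressors(theta):
    """Design-matrix columns for six-fold (grid-like) modulation:
    cos(6*theta) and sin(6*theta), where theta is the direction of the
    trajectory through the abstract 2-D space on each trial."""
    return np.column_stack([np.cos(6 * theta), np.sin(6 * theta)])

def estimate_grid_orientation(theta, signal):
    """Least-squares fit of the six-fold model to a signal time series.
    Returns the putative grid orientation phi (radians, unique only up
    to 60-degree rotations) and the six-fold modulation amplitude."""
    X = np.column_stack([hexagonal_regressors(theta), np.ones_like(theta)])
    beta, *_ = np.linalg.lstsq(X, signal, rcond=None)
    b_cos, b_sin = beta[0], beta[1]
    amplitude = np.hypot(b_cos, b_sin)          # strength of six-fold effect
    phi = np.arctan2(b_sin, b_cos) / 6.0        # grid orientation
    return phi % (np.pi / 3), amplitude
```

Because cos(6(theta - phi)) expands into a weighted sum of cos(6*theta) and sin(6*theta), the two regression weights recover both the orientation of the putative grid and the size of the hexagonal modulation, which is the quantity compared against zero (or against control periodicities such as four- or five-fold) in these studies.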

As is the case for other frontal cortex regions, it is not always clear how the activity patterns that emerge in vmPFC/mOFC during decision making can be distinguished from the activity patterns seen in subcortical structures such as the ventral striatum and dopaminergic midbrain. Both of these structures show activity related to choice selection that resembles that seen in vmPFC/mOFC and precedes it in time (Strait et al., 2015; Yun et al., 2020). So far, however, at least within the dopaminergic midbrain, such activity has been recorded in very simple situations in which monkeys are presented with a single opportunity and the decision is whether or not to engage with that option rather than a process of comparison between two or more options. It is possible that representations in vmPFC/mOFC as opposed to the dopaminergic midbrain are especially important when the decision to be made is new and linked to inferential processes in adjacent amFC.

Perigenual cingulate cortex and cost-benefit arbitration

While amFC encodes task structure even when this does not involve value or reward, a region anterior to the genu of the corpus callosum and slightly posterior to amFC, the peri- or pregenual anterior cingulate cortex (pgACC), integrates costs and benefits to evaluate the overall value of initiating an action. Amemori and colleagues (Amemori and Graybiel, 2012) recorded neural activity from pgACC of macaque monkeys while they evaluated cues that were simultaneously associated with varying levels of air-puff (cost) and liquid food reward (benefit). When monkeys chose to approach the cue, they received both the associated air-puff and liquid reward; when they avoided the cue, they received neither outcome (figure 6A, left). The monkeys’ choices indicated that they were influenced by both the cost and benefit associated with an offer. Simultaneous recordings from pgACC neurons revealed heterogeneous activity patterns but, in summary, firing rates were best explained as reflecting the overall utility of the chosen outcome. In other words, firing rates showed an integration across the cost and benefit dimensions of the cue. Furthermore, microstimulation of pgACC produced changes in the animals’ cost-benefit decisions such that they were more likely, on average, to avoid rather than approach the cue. Importantly, microstimulation was most effective on trials where the choice required a trade-off between costs and benefits, and thus when the positive and negative motivational aspects of the cue competed to drive behavior in opposite directions.

Figure 6. pgACC and the costs and benefits of initiating a course of action.

Figure 6

(A) pgACC evaluates the costs and benefits of initiating a course of action. (left) Neurons in macaque pgACC reflect the benefit (juice) and cost (air puff) of taking a choice. Microstimulation in the area with a predominance of aversive cost-related neurons led to a change in the decision boundary for taking the action (adapted from Amemori and Graybiel, 2012). (center) Activity in human pgACC reflects each individual’s subjective valuation of an opportunity composed of both a monetary reward and a temporal delay cost (adapted from Kable and Glimcher, 2007). (right) Rats decided between pursuing a large food reward associated with an aversive cost (a bright light) or a small reward in a less aversive, darker environment. Optogenetic activation and inhibition of a pathway from the pgACC-like PL area to inhibitory interneurons in the striosomes of the striatum led to decrements and increments in high value/high cost choices (adapted from Friedman et al., 2015). (B) pgACC is related to action initiation. (left) In humans, individual variation in pgACC activity is predictive of whether or not a choice will be pursued (adapted from Kolling et al., 2018). (center) Activity in human pgACC reflects the relative average preference for a default choice type; for example, people may be a priori more likely to accept website cookies than to manage them, or to take a sweet rather than a savory snack regardless of the specific sweet and savory snacks they are offered (adapted from Lopez-Persem et al., 2016). (right) pgACC tracks expected performance on a perceptual task and this effect scales with subjective decision confidence (adapted from Bang and Fleming, 2018).

The idea that pgACC is crucial for integrating costs and benefits to decide whether it is worth initiating an action is consistent with work in other species, including humans and rodents, all of which share this agranular part of PFC. In humans, pgACC BOLD signals were shown to reflect integrated cost-benefit value in a delay-based decision making task where a larger delayed reward offer could be accepted or forgone for a small reward received immediately (Kable and Glimcher, 2007). In this task, pgACC BOLD was better explained by the integrated subjective value of the delayed option than by reward amount or delay considered separately (figure 6A, center). Consistent with this, causal evidence from lesion experiments in rats shows that cost-benefit integration is impaired following lesions that include pgACC. Walton et al. (Walton et al., 2002) trained rats to choose between two arms of a T-maze, one of which was associated with a higher reward and a cost, climbing a barrier, while the other resulted in a smaller reward but did not require climbing a barrier. While rewards in the high-effort arm could be adjusted such that healthy rats generally preferred the high-effort/high-reward arm, following a lesion that included pgACC, the same rats were less likely to choose the high-effort arm, even though they had no problem with climbing a barrier or with choosing the high-reward option when both arms of the T-maze included a barrier (Walton et al., 2002).
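The integrated subjective value of a delayed offer in tasks like Kable and Glimcher's is standardly modeled with hyperbolic discounting, V = A / (1 + kD), where A is the reward amount, D the delay, and k an individual's discount rate. A minimal sketch, with illustrative parameter names:

```python
def subjective_value(amount, delay, k):
    """Hyperbolic discounting: the subjective value of a delayed reward
    falls with delay, and falls more steeply for larger discount rates k.
    `delay` and `k` must use consistent units (e.g., days and 1/days)."""
    return amount / (1.0 + k * delay)

def prefers_delayed(immediate_amount, delayed_amount, delay, k):
    """True if the discounted delayed offer beats the immediate one,
    i.e., the model predicts the decision maker waits."""
    return subjective_value(delayed_amount, delay, k) > immediate_amount
```

A patient discounter (small k) accepts the delayed offer while a steep discounter (large k) takes the immediate one; it is this integrated quantity, rather than amount or delay alone, that best explained the pgACC BOLD signal in that study.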

In work by Friedman and colleagues (Friedman et al., 2015), the trade-off between costs and benefits was shown to rely on pgACC’s projections to specialized regions in the striatum, the striosomes (a component of the subcortical circuit for reward-guided behavior illustrated in figure 2A). Friedman et al. optogenetically targeted pgACC cells projecting to the striosomes in a mouse T-maze task like that used by Walton and colleagues, except that the cost in this task was the overcoming of the mice’s instinctual aversion to a bright light instead of a barrier (figure 6A, right). Friedman and colleagues found that inhibition and excitation of the pgACC-striosome pathway induced shifts in the mice’s behavior, leading to an increase or decrease in choosing the high-cost/high-reward option, respectively. This effect was pathway-specific and specific to the cost-benefit condition and thus situations where the net outcome entailed motivationally conflicting positive and negative components that had to be integrated to make a choice. It is notable that in primates too, pgACC is very unusual in having a projection to the striosomes; within frontal cortex only pgACC and a posterior OFC region, on the border with the insula, project to the striosomes (Eblen and Graybiel, 1995).

Whether the precise type of cost determines pgACC’s involvement in decision making remains to be clarified. In the above, we have discussed work linking pgACC to action initiation and cost-benefit integration across a wide range of costs: aversive bright lights (Friedman et al., 2015) and air-puffs (Amemori and Graybiel, 2012), effort costs of climbing a barrier (Walton et al., 2002), as well as delays (Kable and Glimcher, 2007). By contrast, computations reflecting the direct comparison of effort- and reward-linked choice options have been associated with a more dorsal posterior cingulate area in humans (Klein-Flügge et al., 2016), and the processing of different types of costs occurs in dissociable neural circuits some of which are distinct from pgACC (Bonnelle et al., 2016; Burke et al., 2013; Croxson et al., 2009; Kennerley et al., 2009; Kurniawan et al., 2013; Prevost et al., 2010; Rudebeck et al., 2006; Scholl et al., 2015; Walton et al., 2003). While specific types of costs may be processed in separate subregions of PFC, pgACC seems to be crucial for integrating the motivational value of an outcome across costs and benefits to initiate or avoid initiating an approach behavior (figure 6B). In ecological environments where opportunities typically arise sequentially, foraging animals frequently encounter this type of decision about whether to engage with a particular opportunity given its costs and benefits. In laboratory-based tasks for humans that capture aspects of such scenarios (figure 6B, left), individual variation in pgACC activity and pgACC connectivity with striosome-rich parts of the striatum predict individual variation in whether behavior will be determined by the potential benefits that might ensue from the course of action, despite increasing costs, and thus how likely participants are to proceed with taking the course of action (Kolling et al., 2012, 2018). 
PgACC’s anatomical position and connections to regions beyond the striosome, such as with the subgenual ACC, amygdala, habenula, neuromodulatory systems and periaqueductal grey (An et al., 1998; Chiba et al., 2001) place it in an ideal position to provide motivational regulation of the initiation of actions, perhaps especially approach and avoidance choices (Khalighinejad et al., 2020a, 2020b, 2021). In humans, pgACC activity reflects the value of default options that people are most likely to go ahead and choose (figure 6B, center) (Lopez-Persem et al., 2016). When pgACC is absent or not providing an input to the striosomes, then animals still initiate actions, but they do not always initiate them in the situations in which they had judged them to be worthwhile in the control condition. While the subcortical circuitry that pgACC projects to is sufficient for action initiation it may not be sufficient for determining when the balance of costs and benefits suggests it is best to initiate action.

In order to decide whether to engage with an opportunity, it is important to track the success of recent engagements with opportunities, or in other words, to track one’s own recent performance (figure 6B, right). In humans, pgACC carries signals consistent with the monitoring of one’s own performance over both shorter and more extended time frames (Bang and Fleming, 2018; Wittmann et al., 2016b). For example, Wittmann and colleagues reported that pgACC activity reflected the feedback human participants received about their performance levels on a variety of arbitrary games and it predicted the influence that the feedback would have on the participants’ estimations of their ability levels. One recent hypothesis is that this type of self-awareness may be altered in mood and anxiety disorders known to implicate pgACC (Amemori et al., 2021).

The pgACC area that we discuss in this section is close to amFC and vmPFC, and determining the precise border in relation to landmarks such as the cingulate sulcus is important but still a matter of debate. Moreover, it is possible that these regions co-activate when multidimensional features need to be integrated in order to derive a choice value estimate (Bongioanni et al., 2021), and of course all these areas and dACC might be expected to interact with one another and with areas beyond frontal cortex (Klein et al., 2017; Korn and Bach, 2018, 2019; Maier et al., 2015). However, when this integration concerns more abstract features and inferential processes, activity peaks more anteriorly, in amFC, whereas when the integration is between the costs and benefits of a specific action that might or might not be taken, activity peaks more posteriorly, in pgACC (Amemori and Graybiel, 2012; Khalighinejad et al., 2020a). Other studies and reviews also confirm a functional difference between pgACC and more anterior brain areas (Grabenhorst and Rolls, 2011).

Dorsomedial frontal cortex and the organization of interpersonal relationships

So far, we have seen that activity in amFC encodes the structure and organization of the task environment. Alongside their ability to learn about a multitude of arbitrary task environments, humans and many other primates spend a large part of their time navigating one particular type of environment – the social environment. The most dorsal part of medial frontal cortex – dorsomedial frontal cortex (dmPFC) – may have a specialized role in encoding key features of the structure of the social environment and the position of the decision maker within it.

Social environments share many features with other arbitrary task environments. The diverse and ever-changing patterns of the social structures in which humans dwell attest to their arbitrariness. At the same time, however, there are features of social environments, such as competition and cooperation, that are consistently present. Competition and cooperation have an important impact on an individual animal’s or person’s fitness, health, and longevity. For example, in macaques, competition and collaboration are important predictors of social dominance and, in turn, these are predictors of breeding success (Schulke et al., 2010). In humans, loneliness – social isolation and the absence of cooperation – has a major impact on mortality (Holt-Lunstad et al., 2010, 2015).

Patterns of competition and collaboration are rarely static for long. Despite this temporal complexity, an important feature of many social environments is that, regardless of their arbitrariness, the position of the self within the environment is a key anchor or origin for the task space. Wittmann and colleagues (Wittmann et al., 2016b, 2021) investigated how changing patterns of competition and cooperation between a participant and other pre-programmed “players” are tracked over time during a series of simple games. On each trial, participants performed a simple task and received feedback about how well they and two other players had done. On some trials the participants cooperated with one of the other players – the sum of their performances determined the payoff. On other trials, however, they competed – now the difference in their performances determined the payoff. Not surprisingly, the participants’ assessments of their own performances reflected the feedback they received about their performances (figure 7A, left and center). Moreover, as noted above, such feedback was associated with pgACC activity, and individual variation in the feedback’s impact on pgACC was associated with individual variation in its impact on self-assessment.

Figure 7. dmPFC and the organization of inter-individual relationships.

Figure 7

(A) (Left) In a game of multi-player competition and cooperation, people track not just their own performances in order to form an estimate of their ability; these estimates are also influenced by the performances of the people around them. In a complementary fashion, people’s estimates of others’ abilities are also influenced by their own performances. These two ways of inappropriately ascribing value to oneself or another person are referred to as self-other mergence. (Center) The direction of self-other-mergence effects depends on context (cooperation vs competition). For example, a good partner (high other-performance) boosts self-value in cooperation but diminishes self-value in competition. (Right) While pgACC tracks the self’s own performance levels (see also figure 6B), dmPFC tracks both the influence that other players have on the self-performance estimate and the influence that the self’s own performance has on estimates of the other players (adapted from Wittmann et al., 2016b, 2018, 2021). (B) When one monkey watches another monkey in order to work out which choice is the better one to take, neural activity in dmPFC tracks the partner monkey’s actions and distinguishes between situations in which the partner makes erroneous versus correct actions, allowing the observing animal to learn which actions not to repeat (adapted from Yoshida et al., 2011, 2012).

Their evaluations of themselves, however, also varied as a function of the other players’ performances. When they cooperated with good players, they rated their own performances as stronger; when they competed with good players, they rated their own performances as weaker; and vice versa for weak players (figure 7A, left and center). Wittmann and colleagues called this phenomenon self-other mergence and linked it to dmPFC: activity there tracked the performance of the other player and predicted its impact on self-performance estimates. There was also a complementary effect: dmPFC activity tracked self-performance and the impact that one’s own performance had on the estimation of the other player (figure 7A, right). However, self-other mergence is not a simple consequence of dmPFC activity; when dmPFC is disrupted by transcranial magnetic stimulation (TMS), self-other mergence is augmented, even though its correlation with neural activity is abolished. This pattern of change suggests that dmPFC may be critical for disentangling and tracking what each agent is doing. When this is disrupted by dmPFC TMS, people appear to track the aggregate consequence of the competitive or cooperative interaction at the expense of the individual performances.
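A minimal way to express self-other mergence is to let the other player's performance enter the self-estimate with a sign set by the social context: positive under cooperation, negative under competition. The function and weights below are hypothetical illustrations, not the regression model of the original study.

```python
def merged_self_estimate(own_perf, other_perf, cooperation,
                         w_self=1.0, w_other=0.3):
    """Self-performance estimate contaminated by another agent.

    Under cooperation a strong partner inflates the self-estimate;
    under competition the same strong partner deflates it.
    w_other = 0 would correspond to no mergence at all.
    """
    context_sign = 1.0 if cooperation else -1.0
    return w_self * own_perf + w_other * context_sign * other_perf

# Identical own performance, identical partner, opposite contexts:
coop = merged_self_estimate(0.6, 0.9, cooperation=True)   # inflated
comp = merged_self_estimate(0.6, 0.9, cooperation=False)  # deflated
```

On this sketch, the TMS result described above would correspond to an increase in the effective `w_other`, blurring the boundary between the agents' individual contributions.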

In monkeys, dmPFC activity also tracks the performances of other individuals; it reflects both the choices that other animals make and the rewards that they receive for making them (Ninomiya et al., 2020; Noritake et al., 2018; Yoshida et al., 2011, 2012; figure 7B). In a manner reminiscent of human self-other mergence, the social context has an impact on the way in which macaques evaluate the choices that are available and the consequences that will follow. For example, anticipatory licking measures suggest macaques value stimuli more if they are associated with more reward for the macaque itself, but less if they are associated with more reward for other animals. While some dmPFC neurons track a stimulus’ association with reward for the individual itself, others track the stimulus’ association with reward for the other monkey. Such activity in dmPFC neurons precedes activity in other brain structures, such as the dopaminergic midbrain, that reflects the monkey’s evaluation of its own reward prospects given the social context (Noritake et al., 2018). In competitive games, dmPFC activity also reflects the monkeys’ adoption of response selection strategies that make each decision difficult to predict from previous decisions (Seo et al., 2014). As a result, the monkey’s decisions are difficult for other individuals in the group to predict and pre-empt. Recordings of human dmPFC neurons suggest that the region tracks not only the behaviors of other individuals but also, at least in humans, their beliefs (Jamali et al., 2021).

It is difficult to determine the degree to which dmPFC is exclusively concerned with social structure and social decision making. On the one hand, the amFC activity recorded in tasks lacking any simple social component extends into dmPFC (Barron et al., 2013; Constantinescu et al., 2016). On the other hand, social tasks that involve tracking other individuals’ thoughts or behaviors typically activate only dmPFC (Behrens et al., 2008; Frith and Frith, 2012; Konovalov et al., 2021; Wittmann et al., 2016b, 2021).

Summary

Neural systems that represent the value of the environment exist in many vertebrates. In mammals, an extended subcortical circuit spanning the striatum and midbrain and brainstem nuclei corresponds to these ancient systems. In addition, however, mammals possess several frontal cortical regions concerned with the guidance of decision making and adaptive, flexible behavior. While these frontal systems interact extensively with the subcortical circuits, they make specific contributions to behavior, and they also influence behavior via other cortical routes. While some areas, such as ACC, present in a broad range of mammals, represent the distribution of opportunities in an environment over space and time, other regions, such as amFC and dmPFC, have roles in representing structural associations and causal links between environmental features, including aspects of the social environment (Figure 8). Although the origins of these areas and their functions are traceable to rodents, they are especially prominent in primates. They make it possible not just to select choices on the basis of past experience of identical situations, but to make inferences that guide decisions in new scenarios.

Figure 8. Summary of functional specializations.

Figure 8

Schematic overview of the functional contributions of different subregions of medial prefrontal cortex discussed in this review.

Supplementary Material

SupplFig1

In Brief.

Prefrontal cortex (PFC) provides high level coordination of behavior but is not a homogenous structure. In this review, Klein-Flügge et al. compare the function of several distinct PFC regions and connected subcortical nuclei in decision-making, behavioral flexibility, and social behavior.

Acknowledgements

MCKF was funded by a Wellcome Trust/Royal Society Sir Henry Dale Fellowship (223263/Z/21/Z). MFSR was supported by a Wellcome Investigator Award (221794/Z/20/Z). All authors were supported by a Medical Research Council grant awarded to MFSR (MR/P024955/1). We would like to thank Nadescha Trudel and Laurence Hunt for helpful comments on an earlier version of the manuscript.

Footnotes

Declaration of Interest

The authors declare no competing interests.

References

  1. Abitbol R, Lebreton M, Hollard G, Richmond BJ, Bouret S, Pessiglione M. Neural mechanisms underlying contextual dependency of subjective values: converging evidence from monkeys and humans. J Neurosci. 2015;35:2308–2320. doi: 10.1523/JNEUROSCI.1878-14.2015.
  2. Agetsuma M, Aizawa H, Aoki T, Nakayama R, Takahoko M, Goto M, Sassa T, Amo R, Shiraki T, Kawakami K, et al. The habenula is crucial for experience-dependent modification of fear responses in zebrafish. Nat Neurosci. 2010;13:1354–1356. doi: 10.1038/nn.2654.
  3. Amemori K, Graybiel AM. Localized microstimulation of primate pregenual cingulate cortex induces negative decision-making. Nat Neurosci. 2012;15:776–785. doi: 10.1038/nn.3088.
  4. Amemori S, Graybiel AM, Amemori K. Causal Evidence for Induction of Pessimistic Decision-Making in Primates by the Network of Frontal Cortex and Striosomes. Front Neurosci. 2021;15:741. doi: 10.3389/fnins.2021.649167.
  5. Amo R, Fredes F, Kinoshita M, Aoki R, Aizawa H, Agetsuma M, Aoki T, Shiraki T, Kakinuma H, Matsuda M, et al. The habenulo-raphe serotonergic circuit encodes an aversive expectation value essential for adaptive active avoidance of danger. Neuron. 2014;84:1034–1048. doi: 10.1016/j.neuron.2014.10.035.
  6. An X, Bandler R, Ongur D, Price JL. Prefrontal cortical projections to longitudinal columns in the midbrain periaqueductal gray in macaque monkeys. J Comp Neurol. 1998;401:455–479.
  7. Ashwell KWS, McAllan BM, Mai JK, Paxinos G. Cortical cyto- and chemoarchitecture in three small Australian marsupial carnivores: Sminthopsis macroura, Antechinus stuartii and Phascogale calura. Brain Behav Evol. 2008;72:215–232. doi: 10.1159/000165101.
  8. Badre D, Doll BB, Long NM, Frank MJ. Rostrolateral prefrontal cortex and individual differences in uncertainty-driven exploration. Neuron. 2012;73:595–607. doi: 10.1016/j.neuron.2011.12.025.
  9. Ballesta S, Shi W, Conen KE, Padoa-Schioppa C. Values encoded in orbitofrontal cortex are causally related to economic choices. Nature. 2020;588:450–453. doi: 10.1038/s41586-020-2880-x.
  10. Banerjee A, Parente G, Teutsch J, Lewis C, Voigt FF, Helmchen F. Value-guided remapping of sensory cortex by lateral orbitofrontal cortex. Nature. 2020;585:245–250. doi: 10.1038/s41586-020-2704-z.
  11. Bang D, Fleming SM. Distinct encoding of decision confidence in human medial prefrontal cortex. Proc Natl Acad Sci. 2018;115:6082–6087. doi: 10.1073/pnas.1800795115.
  12. Bao X, Gjorgieva E, Shanahan LK, Howard JD, Kahnt T, Gottfried JA. Grid-like Neural Representations Support Olfactory Navigation of a Two-Dimensional Odor Space. Neuron. 2019;102:1066–1075.e5. doi: 10.1016/j.neuron.2019.03.034.
  13. Barack DL, Platt ML. Neuronal Activity in the Posterior Cingulate Cortex Signals Environmental Information and Predicts Behavioral Variability during Trapline Foraging. J Neurosci Off J Soc Neurosci. 2021;41:2703–2712. doi: 10.1523/JNEUROSCI.0305-20.2020.
  14. Barack DL, Chang SWC, Platt ML. Posterior Cingulate Neurons Dynamically Signal Decisions to Disengage during Foraging. Neuron. 2017;96:339–347.e5. doi: 10.1016/j.neuron.2017.09.048.
  15. Baram AB, Muller TH, Nili H, Garvert MM, Behrens TEJ. Entorhinal and ventromedial prefrontal cortices abstract and generalize the structure of reinforcement learning problems. Neuron. 2021;109:713–723.e7. doi: 10.1016/j.neuron.2020.11.024.
  16. Barron HC, Dolan RJ, Behrens TE. Online evaluation of novel choices by simultaneous representation of multiple memories. Nat Neurosci. 2013;16:1492–1498. doi: 10.1038/nn.3515.
  17. Bartolo R, Averbeck BB. Prefrontal Cortex Predicts State Switches during Reversal Learning. Neuron. 2020;106:1044–1054.e4. doi: 10.1016/j.neuron.2020.03.024.
  18. Bartra O, McGuire JT, Kable JW. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage. 2013;76:412–427. doi: 10.1016/j.neuroimage.2013.02.063.
  19. Behrens TE, Woolrich MW, Walton ME, Rushworth MF. Learning the value of information in an uncertain world. Nat Neurosci. 2007;10:1214–1221. doi: 10.1038/nn1954.
  20. Behrens TE, Hunt LT, Woolrich MW, Rushworth MF. Associative learning of social value. Nature. 2008;456:245–249. doi: 10.1038/nature07538.
  21. Behrens TEJ, Muller TH, Whittington JCR, Mark S, Baram AB, Stachenfeld KL, Kurth-Nelson Z. What Is a Cognitive Map? Organizing Knowledge for Flexible Behavior. Neuron. 2018;100:490–509. doi: 10.1016/j.neuron.2018.10.002.
  22. Bernacchia A, Seo H, Lee D, Wang XJ. A reservoir of time constants for memory traces in cortical neurons. Nat Neurosci. 2011;14:366–372. doi: 10.1038/nn.2752.
  23. Blanchard TC, Hayden BY. Neurons in dorsal anterior cingulate cortex signal postdecisional variables in a foraging task. J Neurosci. 2014;34:646–655. doi: 10.1523/JNEUROSCI.3151-13.2014.
  24. Bongioanni A, Folloni D, Verhagen L, Sallet J, Klein-Flügge MC, Rushworth MFS. Activation and disruption of a neural mechanism for novel choice in monkeys. Nature. 2021;591:270–274. doi: 10.1038/s41586-020-03115-5.
  25. Bonnelle V, Manohar S, Behrens T, Husain M. Individual Differences in Premotor Brain Systems Underlie Behavioral Apathy. Cereb Cortex. 2016;26:807–819. doi: 10.1093/cercor/bhv247.
  26. Boorman ED, Behrens TE, Woolrich MW, Rushworth MF. How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron. 2009;62:733–743. doi: 10.1016/j.neuron.2009.05.014.
  27. Boorman ED, Rushworth MF, Behrens T. Ventromedial prefrontal and anterior cingulate cortex adopt choice and default reference frames during sequential multialternative choice. J Neurosci. 2013;33:2242–2253. doi: 10.1523/JNEUROSCI.3022-12.2013.
  28. Boorman ED, Rajendran VG, O’Reilly JX, Behrens TE. Two Anatomically and Computationally Distinct Learning Signals Predict Changes to Stimulus-Outcome Associations in Hippocampus. Neuron. 2016;89:1343–1354. doi: 10.1016/j.neuron.2016.02.014.
  29. Bromberg-Martin ES, Hikosaka O. Midbrain dopamine neurons signal preference for advance information about upcoming rewards. Neuron. 2009;63:119–126. doi: 10.1016/j.neuron.2009.06.009.
  30. Burke CJ, Brünger C, Kahnt T, Park SQ, Tobler PN. Neural integration of risk and effort costs by the frontal pole: only upon request. J Neurosci Off J Soc Neurosci. 2013;33:1706–1713a. doi: 10.1523/JNEUROSCI.3662-12.2013.
  31. Bush D, Barry C, Manson D, Burgess N. Using Grid Cells for Navigation. Neuron. 2015;87:507–520. doi: 10.1016/j.neuron.2015.07.006.
  32. Cai X, Padoa-Schioppa C. Contributions of orbitofrontal and lateral prefrontal cortices to economic choice and the good-to-action transformation. Neuron. 2014;81:1140–1151. doi: 10.1016/j.neuron.2014.01.008.
  33. Camille N, Griffiths CA, Vo K, Fellows LK, Kable JW. Ventromedial frontal lobe damage disrupts value maximization in humans. J Neurosci. 2011;31:7527–7532. doi: 10.1523/JNEUROSCI.6527-10.2011.
  34. Cavanagh SE, Wallis JD, Kennerley SW, Hunt LT. Autocorrelation structure at rest predicts value correlates of single neurons during reward-guided choice. eLife. 2016;5:e18937. doi: 10.7554/eLife.18937.
  35. Charnov EL. Optimal foraging: the marginal value theorem. Theor Popul Biol. 1976;9:129–136. doi: 10.1016/0040-5809(76)90040-x.
  36. Chau BK, Kolling N, Hunt LT, Walton ME, Rushworth MF. A neural mechanism underlying failure of optimal choice with multiple alternatives. Nat Neurosci. 2014;17:463–470. doi: 10.1038/nn.3649.
  37. Chau BK, Sallet J, Papageorgiou GK, Noonan MP, Bell AH, Walton ME, Rushworth MF. Contrasting Roles for Orbitofrontal Cortex and Amygdala in Credit Assignment and Learning in Macaques. Neuron. 2015;87:1106–1118. doi: 10.1016/j.neuron.2015.08.018.
  38. Chau BK, Law C-K, Lopez-Persem A, Klein-Flügge MC, Rushworth MF. Consistent patterns of distractor effects during decision making. eLife. 2020;9:e53850. doi: 10.7554/eLife.53850.
  39. Chen D, Kunz L, Lv P, Zhang H, Zhou W, Liang S, Axmacher N, Wang L. Theta oscillations coordinate grid-like representations between ventromedial prefrontal and entorhinal cortex. Sci Adv. 2021;7:eabj0200. doi: 10.1126/sciadv.abj0200.
  40. Chiba T, Kayahara T, Nakano K. Efferent projections of infralimbic and prelimbic areas of the medial prefrontal cortex in the Japanese monkey, Macaca fuscata. Brain Res. 2001;888:83–101. doi: 10.1016/s0006-8993(00)03013-4.
  41. Clithero JA, Rangel A. Informatic parcellation of the network involved in the computation of subjective value. Soc Cogn Affect Neurosci. 2014;9:1289–1302. doi: 10.1093/scan/nst106.
  42. Cohen JY, Amoroso MW, Uchida N. Serotonergic neurons signal reward and punishment on multiple timescales. eLife. 2015;4. doi: 10.7554/eLife.06346.
  43. Constantinescu AO, O’Reilly JX, Behrens TE. Organizing conceptual knowledge in humans with a gridlike code. Science. 2016;352:1464–1468. doi: 10.1126/science.aaf0941.
  44. Costa VD, Mitz AR, Averbeck BB. Subcortical Substrates of Explore-Exploit Decisions in Primates. Neuron. 2019;103:533–545.e5. doi: 10.1016/j.neuron.2019.05.017.
  45. Croxson PL, Walton ME, O’Reilly JX, Behrens TE, Rushworth MF. Effort-based cost-benefit valuation and the human brain. J Neurosci. 2009;29:4531–4541. doi: 10.1523/JNEUROSCI.4515-08.2009.
  46. Dabney W, Kurth-Nelson Z, Uchida N, Starkweather CK, Hassabis D, Munos R, Botvinick M. A distributional code for value in dopamine-based reinforcement learning. Nature. 2020;577:671–675. doi: 10.1038/s41586-019-1924-6.
  47. Danielmeier C, Allen EA, Jocham G, Onur OA, Eichele T, Ullsperger M. Acetylcholine mediates behavioral and neural post-error control. Curr Biol CB. 2015;25:1461–1468. doi: 10.1016/j.cub.2015.04.022.
  48. Daw ND, O’Doherty JP, Dayan P, Seymour B, Dolan RJ. Cortical substrates for exploratory decisions in humans. Nature. 2006;441:876–879. doi: 10.1038/nature04766.
  49. Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ. Model-based influences on humans’ choices and striatal prediction errors. Neuron. 2011;69:1204–1215. doi: 10.1016/j.neuron.2011.02.027.
  50. De Martino B, Fleming SM, Garret N, Dolan RJ. Knowing what you want: confidence in value-based choice. Nat Neurosci. 2013;16:105–110. doi: 10.1038/nn.3279.
  51. Doeller CF, Barry C, Burgess N. Evidence for grid cells in a human memory network. Nature. 2010;463:657–661. doi: 10.1038/nature08704.
  52. Donoso M, Collins AG, Koechlin E. Human cognition. Foundations of human reasoning in the prefrontal cortex. Science. 2014;344:1481–1486. doi: 10.1126/science.1252254.
  53. Dumbalska T, Li V, Tsetsos K, Summerfield C. A map of decoy influence in human multialternative choice. Proc Natl Acad Sci U S A. 2020;117:25169–25178. doi: 10.1073/pnas.2005058117.
  54. Eblen F, Graybiel AM. Highly restricted origin of prefrontal cortical inputs to striosomes in the macaque monkey. J Neurosci. 1995;15:5999–6013. doi: 10.1523/JNEUROSCI.15-09-05999.1995.
  55. Eldar E, Rutledge RB, Dolan RJ, Niv Y. Mood as Representation of Momentum. Trends Cogn Sci. 2016;20:15–24. doi: 10.1016/j.tics.2015.07.010.
  56. Farashahi S, Donahue CH, Khorsand P, Seo H, Lee D, Soltani A. Metaplasticity as a Neural Substrate for Adaptive Learning and Choice under Uncertainty. Neuron. 2017;94:401–414.e6. doi: 10.1016/j.neuron.2017.03.044.
  57. Fellows LK. Deciding how to decide: ventromedial frontal lobe damage affects information acquisition in multi-attribute decision making. Brain. 2006;129:944–952. doi: 10.1093/brain/awl017.
  58. Fischer AG, Endrass T, Reuter M, Kubisch C, Ullsperger M. Serotonin reuptake inhibitors and serotonin transporter genotype modulate performance monitoring functions but not their electrophysiological correlates. J Neurosci Off J Soc Neurosci. 2015;35:8181–8190. doi: 10.1523/JNEUROSCI.5124-14.2015.
  59. Fischer AG, Bourgeois-Gironde S, Ullsperger M. Short-term reward experience biases inference despite dissociable neural correlates. Nat Commun. 2017;8:1690. doi: 10.1038/s41467-017-01703-0.
  60. FitzGerald TH, Seymour B, Dolan RJ. The role of human orbitofrontal cortex in value comparison for incommensurable objects. J Neurosci. 2009;29:8388–8395. doi: 10.1523/JNEUROSCI.0717-09.2009.
  61. Folloni D, Fouragnan E, Wittmann MK, Roumazeilles L, Tankelevitch L, Verhagen L, Attali D, Aubry J-F, Sallet J, Rushworth MFS. Ultrasound modulation of macaque prefrontal cortex selectively alters credit assignment-related activity and behavior. Sci Adv. 2021;7:eabg7700. doi: 10.1126/sciadv.abg7700.
  62. Fouragnan EF, Chau BKH, Folloni D, Kolling N, Verhagen L, Klein-Flugge M, Tankelevitch L, Papageorgiou GK, Aubry JF, Sallet J, et al. The macaque anterior cingulate cortex translates counterfactual choice value into actual behavioral change. Nat Neurosci. 2019;22:797–808. doi: 10.1038/s41593-019-0375-6.
  63. Freidin E, Kacelnik A. Rational choice, context dependence, and the value of information in European Starlings (Sturnus vulgaris). Science. 2011;334:1000–1002. doi: 10.1126/science.1209626.
  64. Freudenmacher L, Schauer M, Walkowiak W, von Twickel A. Refinement of the dopaminergic system of anuran amphibians based on connectivity with habenula, basal ganglia, limbic system, pallium, and spinal cord. J Comp Neurol. 2020;528:972–988. doi: 10.1002/cne.24793.
  65. Friedman A, Homma D, Gibb LG, Amemori K-I, Rubin SJ, Hood AS, Riad MH, Graybiel AM. A Corticostriatal Path Targeting Striosomes Controls Decision-Making under Conflict. Cell. 2015;161:1320–1333. doi: 10.1016/j.cell.2015.04.049.
  66. Frith CD, Frith U. Mechanisms of social cognition. Annu Rev Psychol. 2012;63:287–313. doi: 10.1146/annurev-psych-120710-100449.
  67. Gerraty RT, Davidow JY, Foerde K, Galvan A, Bassett DS, Shohamy D. Dynamic Flexibility in Striatal-Cortical Circuits Supports Reinforcement Learning. J Neurosci Off J Soc Neurosci. 2018;38:2442–2453. doi: 10.1523/JNEUROSCI.2084-17.2018.
  68. Glasser MF, Coalson TS, Robinson EC, Hacker CD, Harwell J, Yacoub E, Ugurbil K, Andersson J, Beckmann CF, Jenkinson M, et al. A multi-modal parcellation of human cerebral cortex. Nature. 2016;536:171–178. doi: 10.1038/nature18933.
  69. Grabenhorst F, Rolls ET. Value, pleasure and choice in the ventral prefrontal cortex. Trends Cogn Sci. 2011;15:56–67. doi: 10.1016/j.tics.2010.12.004.
  70. Graziano MSA. The intelligent movement machine: an ethological perspective on the primate motor system. Oxford; New York: Oxford University Press; 2009.
  71. Grossman CD, Bari BA, Cohen JY. Serotonin neurons modulate learning rate through uncertainty. Curr Biol CB. 2022;32:586–599.e7. doi: 10.1016/j.cub.2021.12.006.
  72. Grueschow M, Polania R, Hare TA, Ruff CC. Automatic versus Choice-Dependent Value Representations in the Human Brain. Neuron. 2015;85:874–885. doi: 10.1016/j.neuron.2014.12.054.
  73. Hämmerer D, Bonaiuto J, Klein-Flügge M, Bikson M, Bestmann S. Selective alteration of human value decisions with medial frontal tDCS is predicted by changes in attractor dynamics. Sci Rep. 2016;6. doi: 10.1038/srep25160.
  74. Hart AS, Rutledge RB, Glimcher PW, Phillips PE. Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J Neurosci. 2014;34:698–704. doi: 10.1523/JNEUROSCI.2489-13.2014.
  75. Hartley T, Maguire EA, Spiers HJ, Burgess N. The well-worn route and the path less traveled: distinct neural bases of route following and wayfinding in humans. Neuron. 2003;37:877–888. doi: 10.1016/s0896-6273(03)00095-3.
  76. Hayashi K, Nakao K, Nakamura K. Appetitive and aversive information coding in the primate dorsal raphé nucleus. J Neurosci Off J Soc Neurosci. 2015;35:6195–6208. doi: 10.1523/JNEUROSCI.2860-14.2015.
  77. Hayden BY, Niv Y. The case against economic values in the orbitofrontal cortex (or anywhere else in the brain). Behav Neurosci. 2021;135:192–201. doi: 10.1037/bne0000448.
  78. Hayden BY, Pearson JM, Platt ML. Neuronal basis of sequential foraging decisions in a patchy environment. Nat Neurosci. 2011;14:933–939. doi: 10.1038/nn.2856.
  79. Holroyd CB, Ribas-Fernandes JJF, Shahnazian D, Silvetti M, Verguts T. Human midcingulate cortex encodes distributed representations of task progress. Proc Natl Acad Sci. 2018;115:6398–6403. doi: 10.1073/pnas.1803650115.
  80. Holt-Lunstad J, Smith TB, Layton JB. Social relationships and mortality risk: a meta-analytic review. PLoS Med. 2010;7:e1000316. doi: 10.1371/journal.pmed.1000316.
  81. Holt-Lunstad J, Smith TB, Baker M, Harris T, Stephenson D. Loneliness and social isolation as risk factors for mortality: a meta-analytic review. Perspect Psychol Sci J Assoc Psychol Sci. 2015;10:227–237. doi: 10.1177/1745691614568352.
  82. Hong S, Hikosaka O. The globus pallidus sends reward-related signals to the lateral habenula. Neuron. 2008;60:720–729. doi: 10.1016/j.neuron.2008.09.035.
  83. Howard JD, Reynolds R, Smith DE, Voss JL, Schoenbaum G, Kahnt T. Targeted Stimulation of Human Orbitofrontal Networks Disrupts Outcome-Guided Behavior. Curr Biol CB. 2020;30:490–498.e4. doi: 10.1016/j.cub.2019.12.007.
  84. Hunt LT, Hayden BY. A distributed, hierarchical and recurrent framework for reward-based choice. Nat Rev Neurosci. 2017;18:172–182. doi: 10.1038/nrn.2017.7.
  85. Hunt LT, Kolling N, Soltani A, Woolrich MW, Rushworth MF, Behrens TE. Mechanisms underlying cortical activity during value-guided choice. Nat Neurosci. 2012. doi: 10.1038/nn.3017.
  86. Hunt LT, Malalasekera WMN, de Berker AO, Miranda B, Farmer SF, Behrens TEJ, Kennerley SW. Triple dissociation of attention and decision computations across prefrontal cortex. Nat Neurosci. 2018;21:1471–1481. doi: 10.1038/s41593-018-0239-5.
  87. Hunt LT, Daw ND, Kaanders P, MacIver MA, Mugan U, Procyk E, Redish AD, Russo E, Scholl J, Stachenfeld K, et al. Formalizing planning and information search in naturalistic decision-making. Nat Neurosci. 2021;24:1051–1064. doi: 10.1038/s41593-021-00866-w.
  88. Iigaya K, Hauser TU, Kurth-Nelson Z, O’Doherty JP, Dayan P, Dolan RJ. The value of what’s to come: Neural mechanisms coupling prediction error and the utility of anticipation. Sci Adv. 2020;6:eaba3828. doi: 10.1126/sciadv.aba3828.
  89. Jamali M, Grannan BL, Fedorenko E, Saxe R, Báez-Mendoza R, Williams ZM. Single-neuronal predictions of others’ beliefs in humans. Nature. 2021;591:610–614. doi: 10.1038/s41586-021-03184-0.
  90. Jang AI, Costa VD, Rudebeck PH, Chudasama Y, Murray EA, Averbeck BB. The Role of Frontal Cortical and Medial-Temporal Lobe Brain Areas in Learning a Bayesian Prior Belief on Reversals. J Neurosci. 2015;35:11751–11760. doi: 10.1523/JNEUROSCI.1594-15.2015.
  91. Janmaat KRL, Byrne RW, Zuberbühler K. Primates take weather into account when searching for fruits. Curr Biol CB. 2006;16:1232–1237. doi: 10.1016/j.cub.2006.04.031.
  92. Janmaat KRL, Chapman CA, Meijer R, Zuberbühler K. The use of fruiting synchrony by foraging mangabey monkeys: a “simple tool” to find fruit. Anim Cogn. 2012;15:83–96. doi: 10.1007/s10071-011-0435-0. [DOI] [PubMed] [Google Scholar]
  93. Jocham G, Hunt LT, Near J, Behrens TE. A mechanism for value-guided choice based on the excitation-inhibition balance in prefrontal cortex. Nat Neurosci. 2012;15:960–961. doi: 10.1038/nn.3140. https://doi.org/nn.3140[pii] [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Jocham G, Brodersen KH, Constantinescu AO, Kahn MC, Ianni AM, Walton ME, Rushworth MF, Behrens TE. Reward-Guided Learning with and without Causal Attribution. Neuron. 2016;90:177–190. doi: 10.1016/j.neuron.2016.02.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Joshi S, Gold JI. Context-dependent relationships between locus coeruleus firing patterns and coordinated neural activity in the anterior cingulate cortex. ELife. 2022;11:e63490. doi: 10.7554/eLife.63490. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Joshi S, Li Y, Kalwani RM, Gold JI. Relationships between Pupil Diameter and Neuronal Activity in the Locus Coeruleus, Colliculi, and Cingulate Cortex. Neuron. 2016;89:221–234. doi: 10.1016/j.neuron.2015.11.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Kaanders P, Nili H, O’Reilly JX, Hunt L. Medial Frontal Cortex Activity Predicts Information Sampling in Economic Choice. J Neurosci. 2021;41:8403–8413. doi: 10.1523/JNEUROSCI.0392-21.2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Kable JW, Glimcher PW. The neural correlates of subjective value during intertemporal choice. Nat Neurosci. 2007;10:1625–1633. doi: 10.1038/nn2007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Kable JW, Glimcher PW. The neurobiology of decision: consensus and controversy. Neuron. 2009;63:733–745. doi: 10.1016/j.neuron.2009.09.003. https://doi.org/S0896-6273(09)00681-3[pii] [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Kahnt T, Heinzle J, Park SQ, Haynes J-D. Decoding different roles for vmPFC and dlPFC in multi-attribute decision making. NeuroImage. 2011;56:709–715. doi: 10.1016/j.neuroimage.2010.05.058. [DOI] [PubMed] [Google Scholar]
  101. Kennerley SW, Walton ME, Behrens TE, Buckley MJ, Rushworth MF. Optimal decision making and the anterior cingulate cortex. Nat Neurosci. 2006;9:940–947. doi: 10.1038/nn1724. [DOI] [PubMed] [Google Scholar]
  102. Kennerley SW, Dahmubed AF, Lara AH, Wallis JD. Neurons in the frontal lobe encode the value of multiple decision variables. J Cogn Neurosci. 2009;21:1162–1178. doi: 10.1162/jocn.2009.21100. [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Keren H, Zheng C, Jangraw DC, Chang K, Vitale A, Rutledge RB, Pereira F, Nielson DM, Stringaris A. The temporal representation of experience in subjective mood. ELife. 2021;10:e62051. doi: 10.7554/eLife.62051. [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Khalighinejad N, Bongioanni A, Verhagen L, Folloni D, Attali D, Aubry J-F, Sallet J, Rushworth MFS. A Basal Forebrain-Cingulate Circuit in Macaques Decides It Is Time to Act. Neuron. 2020a;105:370–384.:e8. doi: 10.1016/j.neuron.2019.10.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Khalighinejad N, Priestley L, Jbabdi S, Rushworth MFS. Human decisions about when to act originate within a basal forebrain-nigral circuit. Proc Natl Acad Sci U S A. 2020b;117:11799–11810. doi: 10.1073/pnas.1921211117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Khalighinejad N, Garrett N, Priestley L, Lockwood P, Rushworth MFS. A habenula-insular circuit encodes the willingness to act. Nat Commun. 2021;12:6329. doi: 10.1038/s41467-021-26569-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Khalighinejad N, Manohar S, Husain M, Rushworth MFS. Complementary roles of serotonergic and cholinergic systems in decisions about when to act. Curr Biol CB. 2022:S0960-9822(22)00104-X. doi: 10.1016/j.cub.2022.01.042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Kidd C, Hayden BY. The Psychology and Neuroscience of Curiosity. Neuron. 2015;88:449–460. doi: 10.1016/j.neuron.2015.09.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Klein TA, Ullsperger M, Jocham G. Learning relative values in the striatum induces violations of normative decision making. Nat Commun. 2017;8:16033. doi: 10.1038/ncomms16033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Klein-Flugge MC, Barron HC, Brodersen KH, Dolan RJ, Behrens TE. Segregated encoding of reward-identity and stimulus-reward associations in human orbitofrontal cortex. J Neurosci. 2013;33:3202–3211. doi: 10.1523/JNEUROSCI.2532-12.2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Klein-Flugge MC, Kennerley SW, Friston K, Bestmann S. Neural Signatures of Value Comparison in Human Cingulate Cortex during Decisions Requiring an Effort-Reward Trade-off. J Neurosci. 2016;36:10002–10015. doi: 10.1523/JNEUROSCI.0292-16.2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
113. Klein-Flügge MC, Wittmann MK, Shpektor A, Jensen DEA, Rushworth MFS. Multiple associative structures created by reinforcement and incidental statistical learning mechanisms. Nat Commun. 2019;10:4835. doi: 10.1038/s41467-019-12557-z.
114. Kobayashi K, Ravaioli S, Baranès A, Woodford M, Gottlieb J. Diverse motives for human curiosity. Nat Hum Behav. 2019;3:587–595. doi: 10.1038/s41562-019-0589-3.
115. Kolling N, Behrens TE, Mars RB, Rushworth MF. Neural mechanisms of foraging. Science. 2012;336:95–98. doi: 10.1126/science.1216930.
116. Kolling N, Behrens T, Wittmann MK, Rushworth M. Multiple signals in anterior cingulate cortex. Curr Opin Neurobiol. 2016a;37:36–43. doi: 10.1016/j.conb.2015.12.007.
117. Kolling N, Wittmann MK, Behrens TE, Boorman ED, Mars RB, Rushworth MF. Value, search, persistence and model updating in anterior cingulate cortex. Nat Neurosci. 2016b;19:1280–1285. doi: 10.1038/nn.4382.
118. Kolling N, Scholl J, Chekroud A, Trier HA, Rushworth MFS. Prospection, Perseverance, and Insight in Sequential Behavior. Neuron. 2018;99:1069–1082.e7. doi: 10.1016/j.neuron.2018.08.018.
119. Konovalov A, Hill C, Daunizeau J, Ruff CC. Dissecting functional contributions of the social brain to strategic behavior. Neuron. 2021;109:3323–3337.e5. doi: 10.1016/j.neuron.2021.07.025.
120. Korn CW, Bach DR. Heuristic and optimal policy computations in the human brain during sequential decision-making. Nat Commun. 2018;9:325. doi: 10.1038/s41467-017-02750-3.
121. Korn CW, Bach DR. Minimizing threat via heuristic and optimal policies recruits hippocampus and medial prefrontal cortex. Nat Hum Behav. 2019;3:733–745. doi: 10.1038/s41562-019-0603-9.
122. Kralik JD, Xu ER, Knight EJ, Khan SA, Levine WJ. When less is more: evolutionary origins of the affect heuristic. PLoS One. 2012;7:e46240. doi: 10.1371/journal.pone.0046240.
123. Kurniawan IT, Guitart-Masip M, Dayan P, Dolan RJ. Effort and valuation in the brain: the effects of anticipation and execution. J Neurosci. 2013;33:6160–6169. doi: 10.1523/JNEUROSCI.4777-12.2013.
124. Kuwabara M, Kang N, Holy TE, Padoa-Schioppa C. Neural mechanisms of economic choices in mice. eLife. 2020;9:e49669. doi: 10.7554/eLife.49669.
125. Leong YC, Radulescu A, Daniel R, DeWoskin V, Niv Y. Dynamic Interaction between Reinforcement Learning and Attention in Multidimensional Environments. Neuron. 2017;93:451–463. doi: 10.1016/j.neuron.2016.12.040.
126. Lichtenberg NT, Sepe-Forrest L, Pennington ZT, Lamparelli AC, Greenfield VY, Wassum KM. The Medial Orbitofrontal Cortex-Basolateral Amygdala Circuit Regulates the Influence of Reward Cues on Adaptive Behavior and Choice. J Neurosci. 2021;41:7267–7277. doi: 10.1523/JNEUROSCI.0901-21.2021.
127. Lim SL, O’Doherty JP, Rangel A. The Decision Value Computations in the vmPFC and Striatum Use a Relative Value Code That is Guided by Visual Attention. J Neurosci. 2011;31:13214–13223. doi: 10.1523/JNEUROSCI.1246-11.2011.
128. List JA. Preference Reversals of a Different Kind: The “More Is Less” Phenomenon. Am Econ Rev. 2002;92:1636–1643.
129. Liu Y, Xin Y, Xu N. A cortical circuit mechanism for structural knowledge-based flexible sensorimotor decision-making. Neuron. 2021;109:2009–2024.e6. doi: 10.1016/j.neuron.2021.04.014.
130. Lopatina N, McDannald MA, Styer CV, Sadacca BF, Cheer JF, Schoenbaum G. Lateral orbitofrontal neurons acquire responses to upshifted, downshifted, or blocked cues during unblocking. eLife. 2015;4. doi: 10.7554/eLife.11299.
131. Lopatina N, McDannald MA, Styer CV, Peterson JF, Sadacca BF, Cheer JF, Schoenbaum G. Medial Orbitofrontal Neurons Preferentially Signal Cues Predicting Changes in Reward during Unblocking. J Neurosci. 2016;36:8416–8424. doi: 10.1523/JNEUROSCI.1101-16.2016.
132. Lopez-Persem A, Domenech P, Pessiglione M. How prior preferences determine decision-making frames and biases in the human brain. eLife. 2016;5. doi: 10.7554/eLife.20317.
133. Lopez-Persem A, Bastin J, Petton M, Abitbol R, Lehongre K, Adam C, Navarro V, Rheims S, Kahane P, Domenech P, et al. Four core properties of the human brain valuation system demonstrated in intracranial signals. Nat Neurosci. 2020;23:664–675. doi: 10.1038/s41593-020-0615-9.
134. Louie K, Glimcher PW, Webb R. Adaptive neural coding: from biological to behavioral decision-making. Curr Opin Behav Sci. 2015;5:91–99. doi: 10.1016/j.cobeha.2015.08.008.
135. Ma L, Hyman JM, Phillips AG, Seamans JK. Tracking Progress toward a Goal in Corticostriatal Ensembles. J Neurosci. 2014;34:2244–2253. doi: 10.1523/JNEUROSCI.3834-13.2014.
136. Ma L, Hyman JM, Durstewitz D, Phillips AG, Seamans JK. A Quantitative Analysis of Context-Dependent Remapping of Medial Frontal Cortex Neurons and Ensembles. J Neurosci. 2016;36:8258–8272. doi: 10.1523/JNEUROSCI.3176-15.2016.
137. MacIver MA, Schmitz L, Mugan U, Murphey TD, Mobley CD. Massive increase in visual range preceded the origin of terrestrial vertebrates. Proc Natl Acad Sci U S A. 2017;114:E2375–E2384. doi: 10.1073/pnas.1615563114.
138. Maier SU, Makwana AB, Hare TA. Acute Stress Impairs Self-Control in Goal-Directed Choice by Altering Multiple Functional Connections within the Brain’s Decision Circuits. Neuron. 2015;87:621–631. doi: 10.1016/j.neuron.2015.07.005.
139. Malvaez M, Shieh C, Murphy MD, Greenfield VY, Wassum KM. Distinct cortical-amygdala projections drive reward value encoding and retrieval. Nat Neurosci. 2019;22:762–769. doi: 10.1038/s41593-019-0374-7.
140. Marques JC, Li M, Schaak D, Robson DN, Li JM. Internal state dynamics shape brainwide activity and foraging behaviour. Nature. 2020;577:239–243. doi: 10.1038/s41586-019-1858-z.
141. Matsumoto M, Hikosaka O. Lateral habenula as a source of negative reward signals in dopamine neurons. Nature. 2007;447:1111–1115. doi: 10.1038/nature05860.
142. Mayner L. A cyto-architectonic study of the cortex of the tammar wallaby, Macropus eugenii. Brain Behav Evol. 1989;33:303–316. doi: 10.1159/000115938.
143. Meder D, Kolling N, Verhagen L, Wittmann MK, Scholl J, Madsen KH, Hulme OJ, Behrens TEJ, Rushworth MFS. Simultaneous representation of a spectrum of dynamically changing value estimates during decision making. Nat Commun. 2017;8:1942. doi: 10.1038/s41467-017-02169-w.
144. Mehta PS, Tu JC, LoConte GA, Pesce MC, Hayden BY. Ventromedial Prefrontal Cortex Tracks Multiple Environmental Variables during Search. J Neurosci. 2019;39:5336–5350. doi: 10.1523/JNEUROSCI.2365-18.2019.
145. Menzel CR. Cognitive aspects of foraging in Japanese monkeys. Anim Behav. 1991;41:397–402. doi: 10.1016/S0003-3472(05)80840-1.
146. Menzel CR. Spontaneous use of matching visual cues during foraging by long-tailed macaques (Macaca fascicularis). J Comp Psychol. 1996;110:370–376. doi: 10.1037/0735-7036.110.4.370.
147. Mishkin M, Vest B, Waxler M, Rosvold HE. A re-examination of the effects of frontal lesions on object alternation. Neuropsychologia. 1969;7:357–363. doi: 10.1016/0028-3932(69)90060-8.
148. Miyamoto K, Trudel N, Kamermans K, Lim MC, Lazari A, Verhagen L, Wittmann MK, Rushworth MFS. Identification and disruption of a neural mechanism for accumulating prospective metacognitive information prior to decision-making. Neuron. 2021;109:1396–1408.e7. doi: 10.1016/j.neuron.2021.02.024.
149. Monosov IE. Anterior cingulate is a source of valence-specific information about value and uncertainty. Nat Commun. 2017;8:134. doi: 10.1038/s41467-017-00072-y.
150. Monosov IE, Haber SN, Leuthardt EC, Jezzini A. Anterior Cingulate Cortex and the Control of Dynamic Behavior in Primates. Curr Biol. 2020;30:R1442–R1454. doi: 10.1016/j.cub.2020.10.009.
151. Mugan U, MacIver MA. Spatial planning with long visual range benefits escape from visual predators in complex naturalistic environments. Nat Commun. 2020;11:3057. doi: 10.1038/s41467-020-16102-1.
152. Muller TH, Mars RB, Behrens TE, O’Reilly JX. Control of entropy in neural models of environmental state. eLife. 2019;8. doi: 10.7554/eLife.39404.
153. Murray EA, Moylan EJ, Saleem KS, Basile BM, Turchi J. Specialized areas for value updating and goal selection in the primate orbitofrontal cortex. eLife. 2015;4. doi: 10.7554/eLife.11695.
154. Murray JD, Bernacchia A, Freedman DJ, Romo R, Wallis JD, Cai X, Padoa-Schioppa C, Pasternak T, Seo H, Lee D, et al. A hierarchy of intrinsic timescales across primate cortex. Nat Neurosci. 2014;17:1661–1663. doi: 10.1038/nn.3862.
155. Na S, Chung D, Hula A, Perl O, Jung J, Heflin M, Blackmore S, Fiore VG, Dayan P, Gu X. Humans use forward thinking to exploit social controllability. eLife. 2021;10:e64983. doi: 10.7554/eLife.64983.
156. Neubert FX, Mars RB, Thomas AG, Sallet J, Rushworth MF. Comparison of Human Ventral Frontal Cortex Areas for Cognitive Control and Language with Areas in Monkey Frontal Cortex. Neuron. 2014. doi: 10.1016/j.neuron.2013.11.012.
157. Neubert FX, Mars RB, Sallet J, Rushworth MF. Connectivity reveals relationship of brain areas for reward-guided learning and decision making in human and monkey frontal cortex. Proc Natl Acad Sci U S A. 2015. doi: 10.1073/pnas.1410767112.
158. Ninomiya T, Noritake A, Kobayashi K, Isoda M. A causal role for frontal cortico-cortical coordination in social action monitoring. Nat Commun. 2020;11:5233. doi: 10.1038/s41467-020-19026-y.
159. Noonan MP, Walton ME, Behrens TE, Sallet J, Buckley MJ, Rushworth MF. Separate value comparison and learning mechanisms in macaque medial and lateral orbitofrontal cortex. Proc Natl Acad Sci U S A. 2010;107:20547–20552. doi: 10.1073/pnas.1012246107.
160. Noonan MP, Mars RB, Rushworth MF. Distinct Roles of Three Frontal Cortical Areas in Reward-Guided Behavior. J Neurosci. 2011. doi: 10.1523/JNEUROSCI.6456-10.2011.
161. Noonan MP, Chau BKH, Rushworth MFS, Fellows LK. Contrasting Effects of Medial and Lateral Orbitofrontal Cortex Lesions on Credit Assignment and Decision-Making in Humans. J Neurosci. 2017;37:7023–7035. doi: 10.1523/JNEUROSCI.0692-17.2017.
162. Noritake A, Ninomiya T, Isoda M. Social reward monitoring and valuation in the macaque brain. Nat Neurosci. 2018;21:1452–1462. doi: 10.1038/s41593-018-0229-7.
163. Okazawa G, Hatch CE, Mancoo A, Machens CK, Kiani R. Representational geometry of perceptual decisions in the monkey parietal cortex. Cell. 2021;184:3748–3761.e18. doi: 10.1016/j.cell.2021.05.022.
164. Padoa-Schioppa C, Conen KE. Orbitofrontal Cortex: A Neural Circuit for Economic Decisions. Neuron. 2017;96:736–754. doi: 10.1016/j.neuron.2017.09.031.
165. Papageorgiou GK, Sallet J, Wittmann MK, Chau BKH, Schuffelgen U, Buckley MJ, Rushworth MFS. Inverted activity patterns in ventromedial prefrontal cortex during value-guided decision-making in a less-is-more task. Nat Commun. 2017;8:1886. doi: 10.1038/s41467-017-01833-5.
166. Park SA, Miller DS, Boorman ED. Inferences on a multidimensional social hierarchy use a grid-like code. Nat Neurosci. 2021;24:1292–1301. doi: 10.1038/s41593-021-00916-3.
167. Parvizi J, Rangarajan V, Shirer WR, Desai N, Greicius MD. The will to persevere induced by electrical stimulation of the human cingulate gyrus. Neuron. 2013;80:1359–1367. doi: 10.1016/j.neuron.2013.10.057.
168. Passingham RE, Wise SP. The Neurobiology of the Prefrontal Cortex: Anatomy, Evolution, and the Origin of Insight. Oxford University Press; Oxford: 2012.
169. Pearson JM, Watson KK, Platt ML. Decision making: the neuroethological turn. Neuron. 2014;82:950–965. doi: 10.1016/j.neuron.2014.04.037.
170. Pickens CL, Saddoris MP, Setlow B, Gallagher M, Holland PC, Schoenbaum G. Different roles for orbitofrontal cortex and basolateral amygdala in a reinforcer devaluation task. J Neurosci. 2003;23:11078–11084. doi: 10.1523/JNEUROSCI.23-35-11078.2003.
171. Preuss TM. Do rats have prefrontal cortex? The Rose-Woolsey-Akert program reconsidered. J Cogn Neurosci. 1995;7:1–24. doi: 10.1162/jocn.1995.7.1.1.
172. Prevost C, Pessiglione M, Metereau E, Clery-Melin ML, Dreher JC. Separate valuation subsystems for delay and effort decision costs. J Neurosci. 2010;30:14080–14090. doi: 10.1523/JNEUROSCI.2752-10.2010.
173. Procyk E, Tanaka YL, Joseph JP. Anterior cingulate activity during routine and non-routine sequential behaviors in macaques. Nat Neurosci. 2000;3:502–508. doi: 10.1038/74880.
174. Remondes M, Wilson MA. Cingulate-hippocampus coherence and trajectory coding in a sequential choice task. Neuron. 2013;80:1277–1289. doi: 10.1016/j.neuron.2013.08.037.
175. Ribas-Fernandes JJF, Solway A, Diuk C, McGuire JT, Barto AG, Niv Y, Botvinick MM. A Neural Signature of Hierarchical Reinforcement Learning. Neuron. 2011;71:370–379. doi: 10.1016/j.neuron.2011.05.042.
176. Rich EL, Wallis JD. Decoding subjective decisions from orbitofrontal cortex. Nat Neurosci. 2016;19:973–980. doi: 10.1038/nn.4320.
177. Rudebeck PH, Izquierdo A. Foraging with the frontal cortex: A cross-species evaluation of reward-guided behavior. Neuropsychopharmacology. 2021. doi: 10.1038/s41386-021-01140-0.
178. Rudebeck PH, Murray EA. The orbitofrontal oracle: cortical mechanisms for the prediction and evaluation of specific behavioral outcomes. Neuron. 2014;84:1143–1156. doi: 10.1016/j.neuron.2014.10.049.
179. Rudebeck PH, Walton ME, Smyth AN, Bannerman DM, Rushworth MF. Separate neural pathways process different decision costs. Nat Neurosci. 2006;9:1161–1168. doi: 10.1038/nn1756.
180. Rudebeck PH, Saunders RC, Prescott AT, Chau LS, Murray EA. Prefrontal mechanisms of behavioral flexibility, emotion regulation and value updating. Nat Neurosci. 2013;16:1140–1145. doi: 10.1038/nn.3440.
181. Rudebeck PH, Saunders RC, Lundgren DA, Murray EA. Specialized Representations of Value in the Orbital and Ventrolateral Prefrontal Cortex: Desirability versus Availability of Outcomes. Neuron. 2017;95:1208–1220.e5. doi: 10.1016/j.neuron.2017.07.042.
182. Rushworth MF, Noonan MP, Boorman ED, Walton ME, Behrens TE. Frontal cortex and reward-guided learning and decision-making. Neuron. 2011;70:1054–1069. doi: 10.1016/j.neuron.2011.05.014.
183. Sadacca BF, Wied HM, Lopatina N, Saini GK, Nemirovsky D, Schoenbaum G. Orbitofrontal neurons signal sensory associations underlying model-based inference in a sensory preconditioning task. eLife. 2018;7:e30373. doi: 10.7554/eLife.30373.
184. Scholl J, Kolling N, Nelissen N, Wittmann MK, Harmer CJ, Rushworth MF. The Good, the Bad, and the Irrelevant: Neural Mechanisms of Learning Real and Hypothetical Rewards and Effort. J Neurosci. 2015;35:11233–11251. doi: 10.1523/JNEUROSCI.0396-15.2015.
185. Schuck NW, Gaschler R, Wenke D, Heinzle J, Frensch PA, Haynes J-D, Reverberi C. Medial prefrontal cortex predicts internally driven strategy shifts. Neuron. 2015;86:331–340. doi: 10.1016/j.neuron.2015.03.015.
186. Schuck NW, Cai MB, Wilson RC, Niv Y. Human Orbitofrontal Cortex Represents a Cognitive Map of State Space. Neuron. 2016;91:1402–1412. doi: 10.1016/j.neuron.2016.08.019.
187. Schülke O, Bhagavatula J, Vigilant L, Ostner J. Social bonds enhance reproductive success in male macaques. Curr Biol. 2010;20:2207–2210. doi: 10.1016/j.cub.2010.10.058.
188. Schultz W. Updating dopamine reward signals. Curr Opin Neurobiol. 2013;23:229–238. doi: 10.1016/j.conb.2012.11.012.
189. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593.
190. Schweighofer N, Doya K. Meta-learning in reinforcement learning. Neural Netw. 2003;16:5–9. doi: 10.1016/s0893-6080(02)00228-9.
191. Seo H, Lee D. Behavioral and neural changes after gains and losses of conditioned reinforcers. J Neurosci. 2009;29:3627–3641. doi: 10.1523/JNEUROSCI.4726-08.2009.
192. Seo H, Cai X, Donahue CH, Lee D. Neural correlates of strategic reasoning during competitive games. Science. 2014;346:340–343. doi: 10.1126/science.1256254.
193. Shahnazian D, Holroyd CB. Distributed representations of action sequences in anterior cingulate cortex: A recurrent neural network approach. Psychon Bull Rev. 2018;25:302–321. doi: 10.3758/s13423-017-1280-1.
194. Shidara M, Richmond BJ. Anterior cingulate: single neuronal signals related to degree of reward expectancy. Science. 2002;296:1709–1711. doi: 10.1126/science.1069504.
195. Shima K, Isoda M, Mushiake H, Tanji J. Categorization of behavioural sequences in the prefrontal cortex. Nature. 2007;445:315–318. doi: 10.1038/nature05470.
196. Sias AC, Morse AK, Wang S, Greenfield VY, Goodpaster CM, Wrenn TM, Wikenheiser AM, Holley SM, Cepeda C, Levine MS, et al. A bidirectional corticoamygdala circuit for the encoding and retrieval of detailed reward memories. eLife. 2021;10:e68617. doi: 10.7554/eLife.68617.
197. Soltani A, Izquierdo A. Adaptive learning under expected and unexpected uncertainty. Nat Rev Neurosci. 2019;20:635–644. doi: 10.1038/s41583-019-0180-y.
198. Soltani A, Koechlin E. Computational models of adaptive behavior and prefrontal cortex. Neuropsychopharmacology. 2022;47:58–71. doi: 10.1038/s41386-021-01123-1.
199. Soltani A, Murray JD, Seo H, Lee D. Timescales of Cognition in the Brain. Curr Opin Behav Sci. 2021;41:30–37. doi: 10.1016/j.cobeha.2021.03.003.
200. Spalding KN, Schlichting ML, Zeithamova D, Preston AR, Tranel D, Duff MC, Warren DE. Ventromedial Prefrontal Cortex Is Necessary for Normal Associative Inference and Memory Integration. J Neurosci. 2018;38:3767–3775. doi: 10.1523/JNEUROSCI.2501-17.2018.
201. Spitmaan M, Seo H, Lee D, Soltani A. Multiple timescales of neural dynamics and integration of task-relevant signals across cortex. Proc Natl Acad Sci U S A. 2020;117:22522–22531. doi: 10.1073/pnas.2005993117.
202. Stalnaker TA, Cooch NK, Schoenbaum G. What the orbitofrontal cortex does not do. Nat Neurosci. 2015;18:620–627. doi: 10.1038/nn.3982.
203. Stephens DW, Krebs JR. Foraging Theory. Princeton University Press; Princeton, NJ: 1986.
204. Stephenson-Jones M, Kardamakis AA, Robertson B, Grillner S. Independent circuits in the basal ganglia for the evaluation and selection of actions. Proc Natl Acad Sci U S A. 2013;110:E3670–E3679. doi: 10.1073/pnas.1314815110.
205. Stephenson-Jones M, Yu K, Ahrens S, Tucciarone JM, van Huijstee AN, Mejia LA, Penzo MA, Tai L-H, Wilbrecht L, Li B. A basal ganglia circuit for evaluating action outcomes. Nature. 2016;539:289–293. doi: 10.1038/nature19845.
206. Stoll FM, Fontanier V, Procyk E. Specific frontal neural dynamics contribute to decisions to check. Nat Commun. 2016;7:11990. doi: 10.1038/ncomms11990.
207. Strait CE, Blanchard TC, Hayden BY. Reward value comparison via mutual inhibition in ventromedial prefrontal cortex. Neuron. 2014;82:1357–1366. doi: 10.1016/j.neuron.2014.04.032.
208. Strait CE, Sleezer BJ, Hayden BY. Signatures of Value Comparison in Ventral Striatum Neurons. PLoS Biol. 2015;13:e1002173. doi: 10.1371/journal.pbio.1002173.
209. Suárez R, Paolino A, Fenlon LR, Morcom LR, Kozulin P, Kurniawan ND, Richards LJ. A pan-mammalian map of interhemispheric brain connections predates the evolution of the corpus callosum. Proc Natl Acad Sci U S A. 2018;115:9622–9627. doi: 10.1073/pnas.1808262115.
210. Suzuki S, Cross L, O’Doherty JP. Elucidating the underlying components of food valuation in the human orbitofrontal cortex. Nat Neurosci. 2017;20:1780–1786. doi: 10.1038/s41593-017-0008-x.
211. Tang H, Bartolo R, Averbeck BB. Reward-related choices determine information timing and flow across macaque lateral prefrontal cortex. Nat Commun. 2021;12:894. doi: 10.1038/s41467-021-20943-9.
212. Tervo DG, Proskurin M, Manakov M, Kabra M, Vollmer A, Branson K, Karpova AY. Behavioral variability through stochastic choice and its gating by anterior cingulate cortex. Cell. 2014;159:21–32. doi: 10.1016/j.cell.2014.08.037.
213. Tervo DGR, Kuleshova E, Manakov M, Proskurin M, Karlsson M, Lustig A, Behnam R, Karpova AY. The anterior cingulate cortex directs exploration of alternative strategies. Neuron. 2021;109:1876–1887.e6. doi: 10.1016/j.neuron.2021.03.028.
214. Tolman EC. Cognitive maps in rats and men. Psychol Rev. 1948;55:189–208. doi: 10.1037/h0061626.
215. Trudel N, Scholl J, Klein-Flügge MC, Fouragnan E, Tankelevitch L, Wittmann MK, Rushworth MFS. Polarity of uncertainty representation during exploration and exploitation in ventromedial prefrontal cortex. Nat Hum Behav. 2021;5:83–98. doi: 10.1038/s41562-020-0929-3.
216. Uylings HB, van Eden CG. Qualitative and quantitative comparison of the prefrontal cortex in rat and in primates, including humans. Prog Brain Res. 1990;85:31–62. doi: 10.1016/s0079-6123(08)62675-8.
217. Vassena E, Deraeve J, Alexander WH. Surprise, value and control in anterior cingulate cortex during speeded decision-making. Nat Hum Behav. 2020;4:412–422. doi: 10.1038/s41562-019-0801-5.
218. Vertechi P, Lottem E, Sarra D, Godinho B, Treves I, Quendera T, Oude Lohuis MN, Mainen ZF. Inference-Based Decisions in a Hidden State Foraging Task: Differential Contributions of Prefrontal Cortical Areas. Neuron. 2020;106:166–176.e6. doi: 10.1016/j.neuron.2020.01.017.
219. Vogt BA. Architecture, neurocytology, and comparative organization of monkey and human cingulate cortices. In: Vogt BA, editor. Cingulate Neurobiology and Disease. Oxford University Press; New York: 2009.
220. Wallis JD. Cross-species studies of orbitofrontal cortex and value-based decision-making. Nat Neurosci. 2012;15:13–19. doi: 10.1038/nn.2956.
221. Walton ME, Bannerman DM, Rushworth MFS. The role of rat medial frontal cortex in effort-based decision making. J Neurosci. 2002;22:10996–11003. doi: 10.1523/JNEUROSCI.22-24-10996.2002.
  222. Walton ME, Bannerman DM, Alterescu K, Rushworth MFS. Functional specialization within medial frontal cortex of the anterior cingulate for evaluating effort-related decisions. J Neurosci. 2003;23:6475–6479. doi: 10.1523/JNEUROSCI.23-16-06475.2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  223. Walton ME, Behrens TE, Buckley MJ, Rudebeck PH, Rushworth MF. Separable learning systems in the macaque brain and the role of orbitofrontal cortex in contingent learning. Neuron. 2010;65:927–939. doi: 10.1016/j.neuron.2010.02.027. https://doi.org/S0896-6273(10)00144-3[pii] [DOI] [PMC free article] [PubMed] [Google Scholar]
  224. Wang XJ. Probabilistic decision making by slow reverberation in cortical circuits. Neuron. 2002;36:955–968. doi: 10.1016/s0896-6273(02)01092-9. https://doi.org/S0896627302010929[pii] [DOI] [PubMed] [Google Scholar]
  225. Wang XJ. Decision making in recurrent neuronal circuits. Neuron. 2008;60:215–234. doi: 10.1016/j.neuron.2008.09.034. [DOI] [PMC free article] [PubMed] [Google Scholar]
  226. Wang MZ, Hayden BY. Latent learning, cognitive maps, and curiosity. Curr Opin Behav Sci. 2021;38:1–7. doi: 10.1016/j.cobeha.2020.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  227. Wang F, Schoenbaum G, Kahnt T. Interactions between human orbitofrontal cortex and hippocampus support model-based inference. PLoS Biol. 2020a;18:e3000578. doi: 10.1371/journal.pbio.3000578. [DOI] [PMC free article] [PubMed] [Google Scholar]
  228. Wang F, Howard JD, Voss JL, Schoenbaum G, Kahnt T. Targeted Stimulation of an Orbitofrontal Network Disrupts Decisions Based on Inferred, Not Experienced Outcomes. J Neurosci Off J Soc Neurosci. 2020b;40:8726–8733. doi: 10.1523/JNEUROSCI.1680-20.2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  229. Wang Y, Toyoshima O, Kunimatsu J, Yamada H, Matsumoto M. Tonic firing mode of midbrain dopamine neurons continuously tracks reward values changing moment-by-moment. ELife. 2021;10:e63166. doi: 10.7554/eLife.63166. [DOI] [PMC free article] [PubMed] [Google Scholar]
  230. Webb R, Glimcher PW, Louie K. Divisive normalization does influence decisions with multiple alternatives. Nat Hum Behav. 2020;4:1118–1120. doi: 10.1038/s41562-020-00941-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  231. White JK, Bromberg-Martin ES, Heilbronner SR, Zhang K, Pai J, Haber SN, Monosov IE. A neural network for information seeking. Nat Commun. 2019;10:5168. doi: 10.1038/s41467-019-13135-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  232. Whittington JCR, Muller TH, Mark S, Chen G, Barry C, Burgess N, Behrens TEJ. The Tolman-Eichenbaum Machine: Unifying Space and Relational Memory through Generalization in the Hippocampal Formation. Cell. 2020;183:1249–1263.:e23. doi: 10.1016/j.cell.2020.10.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  233. Wikenheiser AM, Schoenbaum G. Over the river, through the woods: cognitive maps in the hippocampus and orbitofrontal cortex. Nat Rev Neurosci. 2016;17:513–523. doi: 10.1038/nrn.2016.56. [DOI] [PMC free article] [PubMed] [Google Scholar]
  234. Wise SP. Forward frontal fields: phylogeny and fundamental function. Trends Neurosci. 2008;31:599–608. doi: 10.1016/j.tins.2008.08.008. https://doi.org/S0166-2236(08)00207-5[pii] [DOI] [PMC free article] [PubMed] [Google Scholar]
  235. Wittmann MK, Kolling N, Akaishi R, Chau BK, Brown JW, Nelissen N, Rushworth MF. Predictive decision making driven by multiple time-linked reward representations in the anterior cingulate cortex. Nat Commun. 2016a;7:12327. doi: 10.1038/ncomms12327. [DOI] [PMC free article] [PubMed] [Google Scholar]
  236. Wittmann MK, Kolling N, Faber NS, Scholl J, Nelissen N, Rushworth MF. Self-Other Mergence in the Frontal Cortex during Cooperation and Competition. Neuron. 2016b;91:482–493. doi: 10.1016/j.neuron.2016.06.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  237. Wittmann MK, Lockwood PL, Rushworth MFS. Neural Mechanisms of Social Cognition in Primates. Annu Rev Neurosci. 2018;41:99–118. doi: 10.1146/annurev-neuro-080317-061450. [DOI] [PMC free article] [PubMed] [Google Scholar]
  238. Wittmann MK, Fouragnan E, Folloni D, Klein-Flügge MC, Chau BKH, Khamassi M, Rushworth MFS. Global reward state affects learning and activity in raphe nucleus and anterior insula in monkeys. Nat Commun. 2020;11:3771. doi: 10.1038/s41467-020-17343-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  239. Wittmann MK, Trudel N, Trier HA, Klein-Flügge MC, Sel A, Verhagen L, Rushworth MFS. Causal manipulation of self-other mergence in the dorsomedial prefrontal cortex. Neuron. 2021;109:2353–2361.:e11. doi: 10.1016/j.neuron.2021.05.027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  240. Wong KF, Wang XJ. A recurrent network mechanism of time integration in perceptual decisions. J Neurosci. 2006;26:1314–1328. doi: 10.1523/JNEUROSCI.3733-05.2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  241. Yoshida K, Saito N, Iriki A, Isoda M. Representation of others’ action by neurons in monkey medial frontal cortex. Curr Biol. 2011;21:249–253. doi: 10.1016/j.cub.2011.01.004. https://doi.org/S0960-9822(11)00027-3[pii] [DOI] [PubMed] [Google Scholar]
  242. Yoshida K, Saito N, Iriki A, Isoda M. Social error monitoring in macaque frontal cortex. Nat Neurosci. 2012;15:1307–1312. doi: 10.1038/nn.3180. https://doi.org/nn.3180[pii] [DOI] [PubMed] [Google Scholar]
  243. Yun M, Kawai T, Nejime M, Yamada H, Matsumoto M. Signal dynamics of midbrain dopamine neurons during economic decision-making in monkeys. Sci Adv. 2020;6:eaba4962. doi: 10.1126/sciadv.aba4962. [DOI] [PMC free article] [PubMed] [Google Scholar]
  244. Zajkowski WK, Kossut M, Wilson RC. A causal role for right frontopolar cortex in directed, but not random, exploration. Elife. 2017;6 doi: 10.7554/eLife.27430. [DOI] [PMC free article] [PubMed] [Google Scholar]
  245. Zhou J, Gardner MPH, Stalnaker TA, Ramus SJ, Wikenheiser AM, Niv Y, Schoenbaum G. Rat Orbitofrontal Ensemble Activity Contains Multiplexed but Dissociable Representations of Value and Task Structure in an Odor Sequence Task. Curr Biol CB. 2019;29:897–907.:e3. doi: 10.1016/j.cub.2019.01.048. [DOI] [PMC free article] [PubMed] [Google Scholar]
  246. Zhou J, Jia C, Montesinos-Cartagena M, Gardner MPH, Zong W, Schoenbaum G. Evolving schema representations in orbitofrontal ensembles during learning. Nature. 2021;590:606–611. doi: 10.1038/s41586-020-03061-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  247. Zuberbühler K, Janmaat K. In: Primate Neuroethology. Platt M, Ghazanfar A, editors. Oxford University Press; 2010. Foraging Cognition in Nonhuman Primates; pp. 64–83. [Google Scholar]
