Abstract
During decision-making, neurons in the orbitofrontal cortex (OFC) sequentially represent the value of each option in turn, but it is unclear how these dynamics are translated into a choice response. One brain region that may be implicated in this process is the anterior cingulate cortex (ACC), which strongly connects with OFC and contains many neurons that encode the choice response. We investigated how OFC value signals interacted with ACC neurons encoding the choice response by performing simultaneous high-channel count recordings from the two areas in nonhuman primates. ACC neurons encoding the choice response steadily increased their firing rate throughout the decision-making process, peaking shortly before the time of the choice response. Furthermore, the value dynamics in OFC affected ACC ramping—when OFC represented the more valuable option, ACC ramping accelerated. Because OFC tended to represent the more valuable option more frequently and for a longer duration, this interaction could explain how ACC selects the more valuable response.
A wealth of evidence demonstrates the necessity of the orbitofrontal cortex (OFC) for value-based decision-making. Patients with OFC damage show specific deficits in value-based decision-making1, while electrical microstimulation of OFC in humans2 and monkeys3,4 selectively impairs value-based decision-making. Despite this, there is less evidence that OFC is involved in selecting the correct response to realize the decision. OFC only weakly connects with motor areas5 and its neurons only weakly encode the choice response6–8. At the population level, although OFC alternately represents the value of each available option9,10, it does not appear to represent a specific option at the time that the choice response occurs.
One area that could have an important role in translating value-based decisions into actions is the anterior cingulate cortex (ACC). Like OFC, ACC neurons strongly encode the value of anticipated outcomes, but unlike OFC, they also often encode the choice response7,11–15. In addition, ACC strongly connects with both OFC5,16 and motor areas in the medial frontal cortex, such as the cingulate motor area17,18. Stimulation of ACC in humans evokes movements19 and an urgency to act20. ACC seems to be particularly important when the cost of action must be factored into the decision. Lesions of ACC impair effort-based decisions21,22 and neuronal tuning in ACC reflects the value of anticipated outcomes, discounted by the effort necessary to obtain them23.
We, therefore, aimed to determine whether the dynamics of OFC value signals influenced neurons in ACC that encoded the choice response. One clue as to how this might occur is that more valuable options tend to be represented more frequently and for longer duration9,10. Consequently, a downstream area that integrated the OFC value dynamics would be able to select the more valuable option. ACC neurons that encoded the choice response tended to increase their firing rate throughout the decision, peaking shortly before the choice response. To determine whether this ramping was affected by the value dynamics in OFC, we carried out high-channel count recordings simultaneously from OFC and ACC while monkeys performed a value-based decision-making task.
Results
We taught two monkeys (subjects C and G) to use a bidirectional lever to select either one (forced choice trials) or between two (free choice trials) available pictures for the corresponding juice outcome (Fig. 1a,b). Both subjects preferred larger juice amounts and more certain rewards associated with the 16 pictures (Fig. 1c). We modeled each subject’s choice behavior on free trials with a soft-max decision function that estimated the subjective value as the expected value (equations (1) and (2)). The subjects selected the higher value option on 92% (subject C) and 91% (subject G) of free trials. The pattern of errors revealed that the subjects had a slight tendency to overvalue low-probability options relative to high probabilities (Extended Data Fig. 1), consistent with prior results24. Our decoding analyses required the same number of trials for each experimental condition on which the decoder was trained (Methods). To make this tractable, we binned the 16 pictures into groups of four from the lowest to the highest value (Fig. 1d). For all other analyses, we used continuous values.
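The value-binning step can be illustrated with a minimal sketch. It assumes that the 16 pictures correspond to every pairing of the four juice amounts with the four reward probabilities (subject C's values from the Methods), that value is probability × amount, and that bins are formed by rank. The function name is hypothetical and this is not the authors' code.

```python
from itertools import product

def bin_pictures_by_value(amounts, probabilities, n_bins=4):
    """Rank pictures by expected value and split them into equal-sized bins."""
    # Each picture is (expected value, probability, amount); assumed 4 x 4 design.
    pictures = [(p * m, p, m) for m, p in product(amounts, probabilities)]
    pictures.sort()  # ascending expected value
    size = len(pictures) // n_bins
    return [pictures[i * size:(i + 1) * size] for i in range(n_bins)]

bins = bin_pictures_by_value([0.15, 0.3, 0.45, 0.6], [0.15, 0.4, 0.65, 0.9])
for rank, group in enumerate(bins, start=1):
    print(rank, [round(ev, 4) for ev, _, _ in group])
```

Rank-based binning guarantees four pictures per bin, which is what makes equal trial counts per decoder condition tractable.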
Subjects’ lever response times were relatively slow (subject C: free trials’ median = 649 ms, forced trials’ median = 666 ms; subject G: free trials’ median = 439 ms, forced trials’ median = 448 ms) and broadly distributed (Fig. 1e). To examine the relationship between picture value and response times, we modeled response times as a linear function of the best picture (maximum value) and the choice difficulty (value difference). Subjects responded significantly faster when the maximum value increased (subject C: β = −0.35, coefficient of partial determination (CPD) = 5.3%, P < 1 × 10⁻¹⁵; subject G: β = −0.26, CPD = 2.4%, P < 1 × 10⁻¹⁵; Fig. 1f). There was also a small but significant decrease in response times with larger value difference (subject C: β = −0.12, CPD = 0.6%, P < 1 × 10⁻¹⁵; subject G: β = −0.11, CPD = 0.4%, P < 1 × 10⁻⁵; Fig. 1f). Because maximum value, not value difference, was predominantly correlated with both subjects’ behavior, we focused on this metric of picture value in the neural data.
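The coefficient of partial determination is not defined in this section. A common definition, assumed here rather than taken from the authors, compares the residual sum of squares (SSE) of the full regression with that of a reduced model that omits predictor i:

```latex
\mathrm{CPD}_i = \frac{\mathrm{SSE}_{\text{reduced}(-i)} - \mathrm{SSE}_{\text{full}}}{\mathrm{SSE}_{\text{reduced}(-i)}}
```

Under this definition, a CPD of 5.3% for maximum value would mean that adding maximum value to a model already containing value difference removes 5.3% of the otherwise unexplained response-time variance.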
During free choice trials, subject C made a mean of 2.0 saccades and subject G 1.4 saccades before moving the lever. The first saccade tended to be very fast (subject C: median = 166 ms, subject G: median = 184 ms; Extended Data Fig. 2) and, on free choice trials, was highly predictive of the option ultimately selected with the lever (subject C: 82% of first saccades predicted the lever direction; subject G: 88% of first saccades predicted the lever direction). Both animals had a slight rightward bias with their lever movements on free choice trials (subject C: 3.5% right bias, P < 0.001, binomial test; subject G: 1.5% right bias, P < 0.001, binomial test) and first saccades (subject C: 1.2% right bias, P < 0.001, binomial test; subject G: 1.8% right bias, P < 0.001, binomial test).
ACC neurons ramp during the preparation of the choice response
We recorded single neurons from OFC and ACC using up to eight acute multisite probes per session (Extended Data Fig. 3 and Supplementary Table 1). To investigate the relationship between subject behavior and neuronal firing rates leading to the decision, we performed linear regressions on neuronal firing rates in 100-ms overlapping time windows with maximum value, choice direction (ipsilateral or contralateral to the recording location) and trial type (forced or free) as predictors (equation (3); Methods). Around half the neurons in both regions encoded at least one parameter (Fig. 2). Encoding of value was most common, followed by encoding of choice direction. In both brain areas and both subjects, neurons were more likely to have a positive rather than a negative relationship with value. In subject C, 140 of 193 (73%) value-selective neurons in OFC (P < 0.001, binomial test) and 190 of 275 (69%) ACC neurons (P < 0.0001) had a positive relationship with value. In subject G, these numbers were 106 of 172 (62%) in OFC (P < 0.005) and 120 of 210 (57%) in ACC (P < 0.05). The small behavioral side biases did not appear to affect the prevalence of right-preferring, direction-encoding neurons. In subject C, 56 of 99 (57%) direction-selective neurons in OFC (P > 0.1) and 72 of 157 (46%) ACC neurons (P > 0.1) preferred leftward choices. In subject G, 46 of 71 (65%) in OFC (P < 0.01) and 50 of 107 (47%) in ACC (P > 0.1) preferred leftward responses. Because we recorded bilaterally in subject G, we were also able to see whether there would be a direction bias if we sorted neurons according to the hemisphere from which they were recorded. Resorting the data in this way made little difference to the results. Forty-seven of 71 (66%) preferred contralateral responses in OFC (P < 0.01), while 58 of 107 (54%) preferred contralateral responses in ACC (P > 0.1).
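The proportions above can be checked with an exact binomial test. The sketch below assumes a two-sided test against a null proportion of 0.5; the text does not state the sidedness, so this is an illustrative assumption and the function name is hypothetical.

```python
from math import comb

def binomial_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under the null proportion p.
    """
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small tolerance so outcomes tied with the observed one are included.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))

# Subject C, OFC: 140 of 193 value-selective neurons with positive value coding.
print(binomial_two_sided(140, 193))
# Subject C, OFC: 56 of 99 direction-selective neurons preferring leftward choices.
print(binomial_two_sided(56, 99))
```

Under these assumptions the first count falls well below the reported P < 0.001 threshold and the second stays above 0.1, in line with the reported results.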
To better understand the temporal evolution of information leading to the decision, we analyzed population activity in individual trials. For each session, we trained linear discriminant analysis (LDA) decoders to predict the choice direction (left or right) from neuronal firing rates in overlapping 20-ms windows (Methods). We used a subset of free trials to assess decoder performance with a bootstrapping procedure. We applied the trained decoder weights to held-out free trials and used the posterior probability for the chosen direction as a proxy for the strength of direction selectivity. We observed a gradual rise in this measure over approximately 300 ms before the lever movement, and it was much stronger in ACC compared to OFC (Fig. 3a). We refer to this measure in future analyses as DIRACC.
To better understand the ramping of direction information in ACC, we examined the firing rate dynamics of ACC direction-selective neurons. Most of these neurons showed similar dynamics, steadily increasing their firing rate and peaking shortly before the choice response (Fig. 3b). For each neuron, we measured three parameters of these dynamics: the ramp onset time, the time of peak activation and the magnitude of peak activation. We then determined how correlated these measures were across ipsilateral and contralateral movements (Fig. 3c). For both subjects, we found that ramp onset times and peak times were highly correlated. In contrast, the magnitude of the ramp peak was not significantly correlated across movement types, consistent with these being direction-discriminating neurons (that is, neurons whose overall firing rate was significantly different for the two different movements). Together, these results indicate that most ACC direction-encoding neurons homogeneously ramp in the run-up to a choice and that the peak of that ramp determines their preferred movement direction.
Value signals in ACC and OFC vacillate between representing either offer
We have previously observed that during decision-making, OFC alternately represents the value of each option and that these dynamics can predict the optimality and speed of the choice response9. We have proposed that stimulus–outcome associations are stored in OFC, with the synaptic strength of the neuronal ensemble proportional to the value of the outcome25. When a choice is required, the two neuronal ensembles that represent the relevant stimulus–outcome associations are excited. This can be conceived as a bistable attractor26, in which the depths of the attractor basins are proportional to the value of the outcome. This model predicts that more valuable options should be represented more frequently and for a longer duration than less valuable options because deeper basins are easier to fall into and more difficult to escape. We replicated all the main findings of that earlier study in the current dataset and show that they also extend to ACC.
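The qualitative prediction of the attractor account can be illustrated with a toy stochastic process. This is not the authors' network model: it is a hypothetical three-state Markov sketch in which the probability of entering a basin is assumed to grow with its depth and the per-step probability of escaping is assumed to shrink exponentially with depth.

```python
import math
import random

def simulate_states(depths, n_steps=200_000, seed=0):
    """Toy switching between a neutral state (None) and value basins.

    depths: dict mapping option name -> basin depth (assumed proportional to value).
    Returns (visits, mean_dwell), each a dict keyed by option name.
    """
    rng = random.Random(seed)
    options = list(depths)
    weights = [depths[o] for o in options]
    visits = {o: 0 for o in options}
    dwell = {o: 0 for o in options}
    state = None
    for _ in range(n_steps):
        if state is None:
            # Deeper basins are easier to fall into.
            state = rng.choices(options, weights=weights)[0]
            visits[state] += 1
        dwell[state] += 1
        # Deeper basins are harder to escape.
        if rng.random() < math.exp(-depths[state]):
            state = None
    mean_dwell = {o: dwell[o] / visits[o] for o in options}
    return visits, mean_dwell

visits, mean_dwell = simulate_states({"high": 2.0, "low": 1.0})
print(visits, mean_dwell)
```

With these assumed parameters, the deeper basin is both entered more often and occupied for longer per visit, which is the pattern the model predicts for more valuable options.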
First, we decoded value states, or sustained periods of confident decoding, on individual free trials from neural firing rates (Methods). In both OFC and ACC, multiple value states occurred during a choice (Fig. 4a), and states associated with the chosen (more valuable) option were more frequent (Fig. 4b) and of longer duration (Fig. 4c) than states associated with the unchosen or unavailable options. The strength of value decoding was predictive of choice response times (Fig. 4d). Chosen states had a negative relationship with response times—when they were represented more strongly, they predicted faster choices. In contrast, unchosen states had a positive relationship, predicting slower choices. The value dynamics decoded simultaneously from OFC and ACC were significantly correlated (Fig. 4e).
We also examined how value states varied both as a function of the maximum value on offer and the difference in value between the two options. We fit general linear models with parameters of maximum value and value difference and examined how well they could predict the total number of decoded value states and the duration of each state (Supplementary Table 2). More valuable offers were associated with more frequent and longer chosen states and less frequent and shorter unchosen states, which is consistent with value modifying the depth of basins in a bistable attractor. The effect of value difference was more specific and less clearly interpretable—bigger differences predicted shorter unchosen states and had no effect on chosen states.
Finally, our previous study differentiated between two possibilities with respect to the population value dynamics, specifically whether different neurons represented the value of the alternate options or whether it was the same neurons. Our results were consistent with the majority of value-encoding neurons in OFC dynamically shifting from encoding the value of one choice option to another9. We replicated this result with the current OFC recordings and examined whether it also applied to the ACC data. We assessed whether decoded states were a result of a unitary population of value-encoding neurons by removing individual neurons from the population decoder. Then, we assessed whether the held-out neuron was more likely to encode the value associated with the state decoded from the rest of the population, or the alternative value on offer. Neurons in both ACC and OFC were significantly more likely to encode values associated with decoded states from the rest of the population (Extended Data Fig. 4), showing that, in both areas, most value-encoding neurons dynamically shift from encoding the value of one choice option to the other.
Effect of value signals on ACC ramping
We next investigated whether ACC ramping before the lever movement was affected by the dynamics of the value signal decoded simultaneously from OFC. We hypothesized that OFC value signals could accelerate or decelerate the ACC ramp to bias activity in favor of one or other choice response. Consequently, we would expect ACC ramping to be affected by the value of the picture currently being represented by OFC (Fig. 5a). To examine whether this was the case, we compared ACC ramping at the onset of chosen and unchosen OFC value states in sessions with simultaneous ACC and OFC recordings (subject C: n = 5 sessions; subject G: n = 5 sessions). We restricted the analysis to value states in the 350-ms period of rising DIRACC before lever movement (Fig. 3a). In addition, it was important to ensure that there were no differences between the chosen and unchosen states in terms of the trials from which they were drawn or where they occurred within the trial. To account for these potential confounds, we performed a bootstrapping procedure, in which we selectively downsampled chosen states (which were more frequent than unchosen states) so that they were drawn from the same trials as the unchosen states and matched with respect to their timing within the trial (Methods). We found that DIRACC was significantly stronger at 100 ms following a chosen OFC value state compared to an unchosen value state (Fig. 5b).
We next examined whether there was a relationship between ACC value states and DIRACC. Similar to OFC value states, when ACC represented the chosen value, we saw significantly stronger DIRACC (Fig. 5c). However, in contrast to when we aligned to OFC value states, the differential effects of states on DIRACC were apparent before the onset of the ACC value state. This suggests that DIRACC is more tightly aligned to OFC value states than ACC value states. We tested this by comparing the differences between DIRACC aligned to chosen and unchosen value states in 100 ms before the state onset when states were derived from either the ACC or OFC. To guard against the confounds of value, state frequency and onset times, we calculated the differences from each of the 10,000 bootstraps. Positive values of this measure indicated greater (chosen–unchosen) differences when aligned to ACC as compared to OFC value states. Significance was assessed via the percentiles of the resulting distributions of differences. The first percentile of these distributions was greater than 0 (P < 0.01) in both subjects, indicating that DIRACC is better aligned to OFC than ACC value states (Fig. 5d).
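The percentile-based significance test can be sketched as follows. The bootstrap values below are synthetic draws used only to exercise the function; the helper names and the linear-interpolation percentile are assumptions, not the authors' implementation.

```python
import random

def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a list of numbers."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def significant_above_zero(boot_diffs, alpha=0.01):
    """True if the alpha-level percentile of the bootstrap distribution exceeds 0."""
    return percentile(boot_diffs, 100 * alpha) > 0

rng = random.Random(1)
# Synthetic stand-ins for 10,000 bootstrapped (ACC-aligned minus OFC-aligned) differences.
shifted = [rng.gauss(0.05, 0.01) for _ in range(10_000)]
centered = [rng.gauss(0.0, 0.01) for _ in range(10_000)]
print(significant_above_zero(shifted), significant_above_zero(centered))
```

Requiring the first percentile to exceed zero corresponds to a one-sided test at P < 0.01 on the bootstrap distribution.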
Given these results, we wondered whether it was possible to observe the effects of OFC value states on individual direction-selective ACC neurons. We focused on 184 direction-selective ACC neurons that were recorded simultaneously with OFC recordings. We grouped neuronal firing rates according to the direction of the subject’s choice and whether a given OFC value state corresponded to the left or right option. Figure 6a illustrates two examples of ACC neurons that showed significant effects of OFC value states on their firing rates. Both neurons preferred rightward choices (purple lines) and exhibited elevated activity when OFC represented the value of right options (dark purple) as compared to left options (light purple). These results suggest that OFC value states may influence ACC direction coding by selectively enhancing or otherwise accelerating the ramping of neurons associated with the side of the chosen state. Overall, we identified 18 of 184 (10%) direction-selective ACC neurons that significantly increased their firing rate when OFC value states were congruent with their preferred direction (Fig. 6b). Only 2 of 184 (1%) increased their firing rate when OFC value states were incongruent, which did not exceed the proportion expected by chance (binomial test, P > 0.1). Note that, because of the small number of neurons in which we were able to detect this effect, we pooled the results across the subjects.
Discussion
Our results showed that during value-based decision-making, ramping signals in ACC were affected by which value the OFC was currently representing. We found that there was a unitary ramp in ACC with direction selectivity related to the choice response riding on the top of the ramp. In other words, response-selective neurons in ACC increased their firing rate around the time of response preparation, irrespective of whether they did so more for left or right responses. The ramp accelerated depending on which option was currently represented in OFC. Because OFC represents higher values more frequently and for a longer duration than lower values, this interaction between the two areas could contribute to an ACC bias toward encoding the choice response associated with more valuable options.
Evidence accumulation models may be useful to understand the interaction between ACC and OFC27,28. These models have typically been used to describe the processes underlying sensory decision-making, whereby sensory information in favor of one or other option is gradually accumulated until a threshold is reached that elicits the favored response. These models have been used to explain the build-up of activity in a broad network of brain areas29,30. Regarding the interaction between OFC and ACC, one could consider ACC the accumulator, integrating evidence for one or other choice response based on the dynamics in OFC.
One difference between traditional evidence accumulation models and the activity that we observed in ACC is that the models typically posit that evidence in favor of one option or another fluctuates around zero. A problem with such models is that they predict infinite response times if there is insufficient evidence with which to decide. To address this problem, several modifications of these models have been proposed31–34, which essentially impose some cost on the accumulation process. Perhaps most relevant is the notion of an urgency signal35, whereby evidence has a multiplicative effect on an urgency signal that consists of a ramp to the threshold. This model predicts both acceleration and deceleration of the ramp, consistent with our data.
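How a multiplicative interaction can accelerate or decelerate a ramp can be shown with a deliberately simple toy. It is not the cited urgency model: the gains, slope, threshold and state labels are all hypothetical, and the only assumption carried over from the text is that the currently represented option scales the rate at which a ramp approaches a fixed threshold.

```python
def time_to_threshold(states, base_slope=1.0, gains=None, threshold=300.0, dt=1.0):
    """Integrate a ramp whose slope is scaled by the currently represented state.

    states: sequence of labels per time step ('chosen', 'unchosen' or None).
    Returns the elapsed time at which the ramp first reaches threshold, or None.
    """
    gains = gains or {"chosen": 1.5, "unchosen": 0.5, None: 1.0}
    level = 0.0
    for step, state in enumerate(states, start=1):
        level += base_slope * gains[state] * dt  # multiplicative gain on the ramp
        if level >= threshold:
            return step * dt
    return None

mostly_chosen = (["chosen"] * 3 + ["unchosen"]) * 200
mostly_unchosen = (["unchosen"] * 3 + ["chosen"]) * 200
print(time_to_threshold(mostly_chosen), time_to_threshold(mostly_unchosen))
```

Because the ramp always has a positive slope, the threshold is reached even when the gain is low, which avoids the unbounded response times of accumulators whose evidence fluctuates around zero.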
While we have demonstrated an effect of OFC value signals on ramping activity in ACC, the precise function of the ramping remains unclear. Neurons throughout the frontal cortex that are engaged in motor control show similar directionally selective ramping that peaks shortly before the choice response36, and many investigators have favored a motoric explanation of ACC neural activity15,37–39. Electrical stimulation of ACC in humans produces goal-directed movements, such as reaching toward objects or grasping movements, with a topographic organization in the type of evoked movement19. However, other investigators have favored explanations of ACC function that emphasize a role in processing aspects of the expected reward, including prediction errors40–42, volatility of the reward environment43, value of exerting cognitive control44 and information seeking and reduction of uncertainty45. Even the movements evoked by electrical stimulation have an affective component to them, with patients reporting a sense of urgency or compulsion to act19,20, consistent with computational models of ACC ramping that incorporate an urgency signal35.
This raises the question as to whether ACC ramping reflects the selection of the choice response. Other researchers have argued that the selection of the choice response relies on interactions between OFC and the lateral prefrontal cortex8,46. It is unlikely that there is a single system responsible for preparing actions. There are many distinct regions involved in motor control in the frontal lobe, including at least two parallel pathways through medial and lateral areas18,47, whose differential contribution to motor preparation is poorly understood. Although the signals in ACC could be used to select the correct motor response, they are also compatible with other processes, such as determining the vigor or urgency of the action. These processes could be particularly relevant during effort-based decision-making. Indeed, there is increased synchrony between ACC and lateral prefrontal cortex, specifically during effort-based decisions versus delay-based decisions48, consistent with a role for ACC in computing the effective component of action and lateral prefrontal cortex determining the choice response.
Irrespective of how these ACC computations are used in motor control, our data show that they are calculated on a moment-by-moment basis that depends on the choice option that is currently represented by OFC. This is consistent with previous findings in ACC. For example, ACC ramping can also be affected by the arrival of new information about the value of a choice option11. Furthermore, ACC may not be the only area involved in this accumulation of evidence; similar signals have been observed in the dorsolateral prefrontal cortex49.
We have previously described in detail how we think OFC value dynamics arise and contribute to decision-making25, drawing heavily on similar mechanisms that have been proposed to underlie the ‘flickering’ of competing spatial representations in the hippocampus50,51. We argue that stimulus–outcome associations are stored in OFC, with the synaptic strength of the neuronal ensemble proportional to the value of the outcome. The differential strengths would arise during learning, putatively through more valuable outcomes triggering larger prediction errors and dopamine transients. When the subject is presented with a choice, two neuronal ensembles are activated, creating a bistable attractor26, in which more valuable options have deeper attractor basins and are consequently represented more frequently and for longer duration than less valuable options. A downstream area integrating these OFC dynamics could potentially use the information to determine the action associated with the more valuable option, a process consistent with the influence of fluctuating OFC value representations on ACC direction ramping. One caveat to our results is that we cannot rule out a potential role for attention in driving the OFC fluctuations. Such an explanation seems unlikely, as OFC states are unrelated to eye movements9 and the dynamics of the eye movements during the choice do not map onto the dynamics of OFC states. The average duration of a state in OFC (70–80 ms) is much shorter than the average fixation during free viewing (~200 ms). Nevertheless, it is possible that covert attentional shifts might operate on a faster timescale and drive the OFC fluctuations.
This view of OFC operation contrasts with models which propose that distinct populations of neurons encode the values that are on offer and compute their comparison to determine the optimal choice response52–54. However, these models were developed from empirical data that characterize single-neuron coding, which requires averaging neural activity across many trials. This averaging can misrepresent the tuning of a single neuron9. For example, the tendency for single OFC neurons to encode chosen value arises because chosen value states are represented more often and for longer duration, so this type of coding dominates neural tuning when activity is averaged across trials. Indeed, as shown previously9 and replicated in the current results, OFC neurons represent the value of either option, and which option is represented fluctuates across the course of the trial. However, population-level decoding also has drawbacks. It can potentially obscure the contribution of different subpopulations of neurons, as typically all neurons are included to achieve adequate decoding. Furthermore, the brain may use a combination of single-neuron and population-level codes. Recent work has shown that different cognitive strategies applied to the same task can cause information to be more strongly represented at either the single neuron or population level55. Consequently, it seems prudent to develop testable models of prefrontal function at both levels of explanation.
In conclusion, our results are consistent with ACC accumulating value evidence from OFC during the preparation of a choice response. Our results also demonstrate the advantages of using high-channel count recordings combined with population-level analysis to decode the dynamics of cognitive processes25. We used the encoding of value in OFC to understand the dynamics of decision-making, which then enabled us to better understand neural signals in ACC.
Methods
Experimental model and subject details
All procedures were carried out as specified in the National Research Council guidelines and approved by the Animal Care and Use Committee at the University of California, Berkeley. Two male rhesus macaques (subjects C and G), aged 6 and 4 years and weighing 10 and 7 kg, respectively, at the time of recording, were used in the current study. Subjects sat head-fixed in a primate chair (Crist Instrument) and manipulated a bidirectional lever located on the front of the chair. Eye movements were tracked with an infrared system (SR Research). Stimulus presentation and behavioral conditions were controlled using the MonkeyLogic toolbox56. Subjects had unilateral (subject C) or bilateral (subject G) recording chambers implanted, centered over the frontal lobe.
Task design
Subjects performed a task in which they were required to choose between pairs of pictures or single pictures, presented in random order throughout the session. Subjects fixated continuously for 750 ms on a central 0.5° cue to initiate the presentation of one (forced choice trials, 33%) or two (free choice trials, 67%) 2.5° × 2.5° pictures, presented 6° to either side of the fixation cue. We sampled gaze position at 500 Hz. Subjects used a bidirectional lever to indicate a left or right choice, and the selected picture remained on the screen, while the corresponding juice amount was delivered probabilistically. The juice amounts (subject C: 0.15, 0.3, 0.45 and 0.6 ml; subject G: 0.1, 0.2, 0.3 and 0.4 ml) and reward probabilities (subject C: 0.15, 0.4, 0.65 and 0.9; subject G: 0.1, 0.37, 0.63 and 0.9) uniquely associated with each picture were titrated for each subject so that the subjects considered both dimensions during their choices. Subject C completed 12,033 free and 6,307 forced trials over 13 sessions; subject G completed 4,696 free and 2,577 forced trials over five sessions.
Neurophysiological recordings
Subjects were fitted with head positioners and imaged in a 3 T MRI scanner. From the MR images, we constructed 3D models of each subject’s skull and target brain areas. Subjects were implanted with custom radiolucent recording chambers fabricated from polyether ether ketone. During each recording session, up to eight multisite linear probes (16- or 32-channel V probes with 75, 100 or 200 μm contact spacing, Plexon) were lowered into ACC (AP +21 to +37 mm) and OFC (areas 11 and 13, AP +30 to +40 mm). Electrode trajectories were defined in custom software, and the appropriate microdrives were 3D printed (form 2 and 3, Formlabs)57. Lowering depths were derived from the MR images and verified from neurophysiological signals via gray/white matter transitions. Neural signals were digitized using a Plexon OmniPlex system, with continuous spike-filtered signals (200 Hz to 6 kHz) acquired at 40 kHz.
We recorded the neuronal activity over the course of 13 sessions for subject C and five sessions for subject G (Supplementary Table 1). Units were manually sorted using 1400 μs waveforms and thresholded at four standard deviations above noise (Offline Sorter, Plexon). We restricted our analysis to neurons with a mean firing rate of >1 Hz across the session. To ensure adequate isolation of neurons, we excluded neurons where >0.2% of spikes were separated by <1100 μs. Sorting quality was subjectively ranked on a 1–5 scale to separate well-isolated single neurons from possible multiunits. None of the results reported in the manuscript depended on the isolation quality of our neurons, so we included all neurons in our analysis. In total, we recorded 453 and 517 neurons from ACC in subjects C and G, respectively, and 454 and 429 neurons from OFC.
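The two inclusion criteria can be expressed as a short check on a unit's spike times. This sketch assumes that the violation percentage is computed as the number of short inter-spike intervals divided by the total spike count; the function and parameter names are hypothetical.

```python
def passes_inclusion(spike_times_s, session_duration_s,
                     min_rate_hz=1.0, refractory_s=0.0011, max_violation_frac=0.002):
    """Apply the firing-rate and refractory-violation criteria to one unit.

    spike_times_s: sorted spike times in seconds.
    """
    n = len(spike_times_s)
    # Criterion 1: mean firing rate must exceed 1 Hz across the session.
    if n < 2 or n / session_duration_s <= min_rate_hz:
        return False
    # Criterion 2: at most 0.2% of spikes may follow the previous spike by < 1100 us.
    short_isis = sum(
        1 for a, b in zip(spike_times_s, spike_times_s[1:]) if b - a < refractory_s
    )
    return short_isis / n <= max_violation_frac

regular = [i * 0.1 for i in range(1000)]                 # 10 Hz, no violations
sparse = [i * 2.0 for i in range(50)]                    # 0.5 Hz
contaminated = sorted(regular + [t + 0.0005 for t in regular[:10]])
print(passes_inclusion(regular, 100.0),
      passes_inclusion(sparse, 100.0),
      passes_inclusion(contaminated, 100.0))
```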
Behavioral analysis
We estimated the subjective value for each picture as the expected value, $EV$:
$$EV = p \times m \qquad (1)$$
where $p$ is the reward probability and $m$ is the juice amount associated with the picture.
We then fit choice behavior using a soft-max decision rule:
$$P_R = \frac{1}{1 + e^{-\left(w_1 (EV_R - EV_L) + w_2 S + w_3\right)}} \qquad (2)$$
This modeled the probability that the subject will select the picture on the right, $P_R$, with the following three free parameters: the inverse temperature, $w_1$, which determines the stochasticity as a function of the value difference between the right and left picture; a saccade bias term, $w_2$, which accounts for the influence of the first gaze location, $S$, after picture onset58; and a side bias term, $w_3$, which accounts for any preference for one choice direction. $S$ consisted of a dummy variable that indicated the direction of the first saccade after the pictures were displayed (+1 = right, −1 = left). We estimated all free parameters by maximizing the log-likelihood of the full model. Our model outperformed other discount functions59.
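A minimal sketch of this choice model and its log-likelihood follows. It assumes that the value difference, saccade bias and side bias combine additively inside a logistic function; the exact functional form is an assumption, and all names are illustrative rather than the authors' code.

```python
import math

def p_choose_right(ev_right, ev_left, first_saccade, inv_temp, saccade_bias, side_bias):
    """Logistic (two-option soft-max) probability of choosing the right picture.

    first_saccade is +1 (right) or -1 (left).
    """
    drive = inv_temp * (ev_right - ev_left) + saccade_bias * first_saccade + side_bias
    return 1.0 / (1.0 + math.exp(-drive))

def log_likelihood(trials, inv_temp, saccade_bias, side_bias):
    """Sum of log-probabilities of the observed choices.

    trials: iterable of (ev_right, ev_left, first_saccade, chose_right) tuples.
    """
    total = 0.0
    for ev_r, ev_l, sacc, chose_right in trials:
        p = p_choose_right(ev_r, ev_l, sacc, inv_temp, saccade_bias, side_bias)
        total += math.log(p if chose_right else 1.0 - p)
    return total

# Hypothetical trials: (EV right, EV left, first saccade, chose right).
trials = [(0.54, 0.09, +1, True), (0.06, 0.39, -1, False), (0.27, 0.24, -1, True)]
print(log_likelihood(trials, inv_temp=10.0, saccade_bias=0.5, side_bias=0.1))
```

Fitting would then consist of passing the negative of `log_likelihood` to a numerical optimizer over the three free parameters.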
We defined saccades as eye movements whose velocity exceeded six standard deviations from the mean velocity during fixation. The saccade time was defined by the peak velocity within each movement, and the direction was identified by the subsequent eye position.
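The saccade criterion can be sketched as a threshold-crossing detector on a velocity trace. The sketch assumes that velocity is an unsigned speed, that the threshold is the fixation mean plus six standard deviations, and that samples arrive at the 500 Hz gaze sampling rate; the function name and the example traces are hypothetical.

```python
from statistics import mean, stdev

def detect_saccades(velocity, fixation_velocity, n_sd=6.0, sample_rate_hz=500.0):
    """Return the time (s) of peak velocity for each supra-threshold movement."""
    threshold = mean(fixation_velocity) + n_sd * stdev(fixation_velocity)
    saccade_times = []
    start = None
    # A trailing sentinel closes any movement still open at the end of the trace.
    for i, v in enumerate(list(velocity) + [float("-inf")]):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            peak = max(range(start, i), key=lambda j: velocity[j])
            saccade_times.append(peak / sample_rate_hz)
            start = None
    return saccade_times

fixation = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
trace = [1, 2, 1, 50, 200, 80, 2, 1, 1, 30, 90, 40, 1]
print(detect_saccades(trace, fixation))
```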
Single-neuron regression analysis
For each neuron, we examined the relationship between firing rate (FR) and task variables in linear regression:
$$FR = \beta_0 + \beta_1\,\mathrm{TYPE} + \beta_2\,\mathrm{DIR} + \beta_3\,\mathrm{Max} + \beta_4\,N \qquad (3)$$
with binary variables for trial type (TYPE: +1 free and −1 forced) and choice direction (DIR: +1 contralateral to the neuron and −1 ipsilateral), continuous variables for the maximum value of the presented pictures (Max, calculated using equation (1)) and trial number, $N$, a nuisance parameter to absorb potential variance due to neuronal drift over the recording session. We repeated this regression in overlapping 100-ms windows shifted by 25 ms and defined significance as P < 0.01 for at least 100 ms (four consecutive time bins). We ensured that this criterion produced an acceptable false discovery rate by confirming that <1% of neurons reached significance during the fixation epoch. We performed this analysis for the 500-ms period immediately before choice.
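The windowing and the consecutive-bin significance criterion can be sketched as below. The sketch assumes that windows lie wholly within the 500-ms analysis period; the function names and example P values are illustrative.

```python
def window_starts(period_ms=500, width_ms=100, step_ms=25):
    """Start times (ms from the start of the analysis period) of overlapping windows."""
    return list(range(0, period_ms - width_ms + 1, step_ms))

def is_selective(p_values, alpha=0.01, min_consecutive=4):
    """True if at least `min_consecutive` successive windows have p < alpha."""
    run = 0
    for p in p_values:
        run = run + 1 if p < alpha else 0
        if run >= min_consecutive:
            return True
    return False

print(len(window_starts()))
print(is_selective([0.5, 0.001, 0.002, 0.003, 0.004, 0.2]))
print(is_selective([0.001, 0.002, 0.003, 0.5, 0.004]))
```

Requiring a run of significant bins, rather than any single bin, is what keeps the false discovery rate low despite testing many overlapping windows per neuron.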
Population analysis of choice response dynamics
For each session, we trained linear discriminant analysis (LDA) decoders to predict the choice direction (left or right) from neural activity. To reduce the dimensionality of the input features to our decoders, we first performed principal components analysis (PCA). This was necessary to prevent the decoders from overfitting to the training set, as we had many neurons going into the decoder relative to the number of training trials. PCA was performed using the pca function in MATLAB’s Statistics and Machine Learning Toolbox. We only included the principal components that cumulatively explained 95% of the variance in neuronal firing rates. The LDA decoders were then trained to discriminate the classes within the principal component space. This approach allowed us to examine how ensemble activity converged into different subregions of principal component space, which the LDA decoders could then systematically identify as corresponding to specific experimental conditions. The direction decoder was trained on neural activity during free trials in overlapping windows of 20 ms stepped by 5 ms. We aligned the trials to the choice response.
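The component-selection rule can be illustrated independently of any particular PCA implementation. The sketch below assumes the per-component variances are already available (for example, from a PCA routine) and returns the smallest number of leading components whose cumulative share reaches 95%; the function name is hypothetical and this is not the authors' MATLAB code.

```python
def n_components_for_variance(explained_variances, target=0.95):
    """Smallest number of leading components whose cumulative variance share >= target."""
    ordered = sorted(explained_variances, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, variance in enumerate(ordered, start=1):
        cumulative += variance
        if cumulative / total >= target:
            return k
    return len(ordered)
```

The decoder would then be trained on trial scores projected onto only those leading components.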
To guard against overfitting and assess decoder performance, we carried out a k-fold validation procedure. We split all correctly performed free choice trials into separate training and held-out sets. The training set was constrained such that there was an equal number of each offer pair (12 trials in total, consisting of each combination of binned values with the higher offer on either the left or right). The held-out set contained the remaining trials. We then used the training set to obtain decoder weights to compute the posterior probability for the left versus right direction on each trial in the held-out set. This procedure was repeated exhaustively until the maximum number of trials had served as training data. We then repeated the entire procedure 25 times, and for each trial, we averaged the computed posterior probabilities across all instances when it was in a held-out set. We relabeled the decoder class (left and right) as chosen direction and unchosen direction as appropriate for each trial. Only sessions with good decoding accuracy on the training set of free trials were included in the analyses, defined as >60% overall accuracy (chance = 50%). This criterion excluded two ACC sessions from subject C and all the OFC sessions from both subjects.
Single-neuron analysis of ACC choice response dynamics
To examine the dynamics of direction-selective neurons in ACC, we first normalized (z score) the activity of each neuron to that of its baseline, fixation-period activity. We then obtained the mean normalized activities of each neuron in the time leading up to and just after choices involving contralateral and ipsilateral movements. Contralateral/ipsilateral movements were defined relative to the hemisphere from which a neuron was recorded. We characterized neuronal ramping during both contralateral and ipsilateral movements according to the following three parameters: the ramp onset time, the time of peak activation and the magnitude of peak activation. We defined ramp onsets as the first time bin where a neuron’s normalized mean activity exceeded one standard deviation (that is, z > 1). We selected this threshold because 99% of neurons had activities less than this threshold during the baseline fixation period (we note that other arbitrarily chosen thresholds yielded qualitatively similar results). Ramp peaks were defined as a given neuron’s maximum activity that occurred after the ramp onset and before the choice itself. If a neuron failed to reach the ramp threshold for one of the movements, we set its onset and peak time to zero for that movement. Of the 276 direction-selective neurons detected across both animals (using all ACC recording sessions), only one neuron recorded from animal C did not reach the ramp detection threshold in either a contralateral or ipsilateral movement. This neuron was excluded from subsequent analysis.
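A minimal sketch of the three ramp parameters follows. It assumes the input contains only pre-choice time bins of a neuron's mean z-scored activity, and it treats the onset bin itself as eligible to be the peak; both are assumptions, as are the function and argument names.

```python
def ramp_parameters(times_ms, z_activity, threshold=1.0):
    """Return (onset_time, peak_time, peak_magnitude) for one neuron and movement.

    times_ms: bin times relative to choice (negative = before choice).
    z_activity: mean z-scored activity in each bin, same length as times_ms.
    If the threshold is never exceeded, onset and peak time are set to zero.
    """
    for i, z in enumerate(z_activity):
        if z > threshold:  # first bin exceeding the threshold = ramp onset
            peak_idx = max(range(i, len(z_activity)), key=lambda k: z_activity[k])
            return times_ms[i], times_ms[peak_idx], z_activity[peak_idx]
    return 0, 0, None
```

For example, `ramp_parameters([-500, -400, -300, -200, -100], [0.2, 0.8, 1.4, 2.6, 2.1])` gives an onset at −300 ms and a peak of 2.6 at −200 ms.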
Value decoding with a single trial resolution
For each session, we trained an LDA decoder to predict the value (1–4) from either OFC or ACC neural activity during forced trials9. We averaged neuron firing rates following picture onset (100–400 ms) or preceding choice (−300 to 0 ms). We performed the same dimensionality reduction as detailed for direction decoding. To assess decoder performance, we used a k-fold validation procedure on the forced trials. A single fold consisted of 32 trials—each value presented on either the left or the right. We then used the training set to obtain decoder weights to compute the posterior probability for each value on all free trials in overlapping windows of 20 ms stepped by 5 ms. We repeated the procedure with 50 random samples of forced training set trials and averaged the posterior probabilities for free trials. Only sessions with good decoding accuracy on forced trials were included in the analyses, defined as >40% overall accuracy (subject C: 54 ± 4%; subject G: 52 ± 4%; chance = 25%).
We observed that chance rates of decoding value on free trials deviated from the expected 25%, likely due to global differences in population firing rates during different phases of the task. To correct this, we defined a unique baseline rate for each value level by averaging the posterior probability corresponding to that level across trials when the picture value was not available. We reported posterior probabilities as a percentage of these baselines; we refer to this metric as decoding strength. We could then label the decoded value as either the chosen value, unchosen value or unavailable as appropriate for each trial. We could also label the value as that associated with either the left or right option. Only free trials where the chosen value was greater than the unchosen value (that is, correct trials) were analyzed. For all visualizations and analyses, one of the two unavailable value levels was randomly selected for each trial.
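The baseline correction can be sketched as below for a single time bin. The example assumes decoding strength is expressed as a percentage of baseline, so that a posterior equal to twice the baseline gives 200%; the data layout and function names are hypothetical.

```python
def baseline_rates(posteriors, available_values, n_values=4):
    """Per-level baseline: mean posterior for a value level on trials where it was not on offer.

    posteriors: per-trial lists of posterior probabilities for levels 1..n_values.
    available_values: per-trial sets of the value levels on offer.
    """
    baselines = []
    for level in range(1, n_values + 1):
        samples = [post[level - 1]
                   for post, offered in zip(posteriors, available_values)
                   if level not in offered]
        baselines.append(sum(samples) / len(samples))
    return baselines

def decoding_strength(posterior, baseline):
    """Posterior probability expressed as a percentage of its baseline rate."""
    return 100.0 * posterior / baseline
```

A level that is on offer in every trial would have no baseline samples; a real analysis would need to handle that case explicitly.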
We identified sustained periods of confident decoding as states. On each trial, the decoding strength for a given value level needed to surpass 200% (that is, double the baseline rate) for at least four consecutive time bins (spanning 35 ms) to be considered a state.
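State detection reduces to finding runs of supra-threshold decoding strength. The sketch below is an assumption-labeled illustration (hypothetical function name) that returns the start and end bin indices of every run lasting at least four bins; with 20-ms windows stepped by 5 ms, four bins span 20 + 3 × 5 = 35 ms.

```python
def find_states(strength, threshold=200.0, min_bins=4):
    """Return (start, end) bin indices (end exclusive) of sustained supra-threshold runs."""
    states = []
    start = None
    for i, value in enumerate(strength):
        if value > threshold:
            if start is None:
                start = i  # a new candidate run begins
        else:
            if start is not None and i - start >= min_bins:
                states.append((start, i))
            start = None
    # Close a run that extends to the end of the trial
    if start is not None and len(strength) - start >= min_bins:
        states.append((start, len(strength)))
    return states
```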
Change in ACC direction strength following OFC value states
We first restricted our analysis to trials in which at least one chosen and unchosen value state occurred. In addition, we restricted the analysis to value states in the 350-ms period of rising DIRACC before lever movement (Fig. 3a). We then conducted a bootstrapping analysis and, within each of 10,000 bootstraps, we ensured that chosen and unchosen OFC states were drawn from the matched sets of trials with respect to the values being compared. Because chosen states were more frequent than unchosen states, we did this by selectively downsampling the chosen states. We identified the unchosen states on each trial and then identified the nearest chosen state that occurred before and/or after each unchosen state. To confirm this procedure generated sets of chosen and unchosen states with similar onset times, we performed paired t-tests of the randomly selected, temporally adjacent chosen and unchosen states identified on each trial. We found no significant difference in any bootstrap in either animal. Thus, our bootstrapping procedure provided sets of temporally adjacent chosen and unchosen states drawn from the same value-matched trials, ruling out any explanation for the effects of OFC states on ACC direction representations in terms of the value of the options under consideration or the onset times of the states.
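One way to picture the downsampling step is as a nearest-neighbor match on state onset times within a trial. The sketch below is an assumption about how such matching could be done, not the authors' procedure: for each unchosen state it collects the nearest chosen state before and the nearest after (whichever exist), and then draws one at random per bootstrap. All names are hypothetical.

```python
import random

def adjacent_chosen_states(unchosen_onset, chosen_onsets):
    """Nearest chosen-state onsets before and/or after an unchosen-state onset."""
    before = [t for t in chosen_onsets if t < unchosen_onset]
    after = [t for t in chosen_onsets if t > unchosen_onset]
    candidates = []
    if before:
        candidates.append(max(before))  # closest preceding chosen state
    if after:
        candidates.append(min(after))   # closest following chosen state
    return candidates

def sample_matched_state(unchosen_onset, chosen_onsets, rng):
    """Randomly pick one temporally adjacent chosen state, or None if there is none."""
    candidates = adjacent_chosen_states(unchosen_onset, chosen_onsets)
    return rng.choice(candidates) if candidates else None
```

Repeating `sample_matched_state` with a fresh draw on every bootstrap iteration yields paired chosen and unchosen onsets that can be compared with a paired test.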
To examine the relationship between OFC value states and ACC direction neurons, we focused on ACC neurons with significant direction-encoding in the 500 ms preceding choice that were recorded simultaneously with OFC neurons. To guard against the confounds of the values on offer and the timing of the states, we implemented the same bootstrapping procedure as for our analysis at the population level. To test whether individual direction-selective ACC neurons were affected by the value state in OFC, we aligned each direction-selective ACC neuron’s z-scored firing rates to the onset of each OFC value state and computed mean firing rates for 100 ms immediately after state onset. Next, we grouped the firing rates according to the direction of the subject’s choice and whether a given OFC value state corresponded to the option on the side of the screen congruent with that choice. We then performed a t-test to determine whether an ACC neuron’s activity significantly differed when OFC value states were congruent versus incongruent with the subject’s choice direction. We retained the t-statistic from each iteration of the bootstrapping procedure. Significant neurons were defined as those where the mean t-statistic exceeded the value corresponding to an α level of 0.01 in the t-distribution.
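The per-neuron statistic can be sketched as a two-sample t-statistic computed on each bootstrap iteration and then averaged. The text does not state which t-test variant was used, so the example below assumes Welch's unequal-variance form; the function names and data layout are hypothetical.

```python
import math
import statistics

def welch_t(congruent, incongruent):
    """Welch's t-statistic for two independent samples (an assumed test variant)."""
    se = math.sqrt(statistics.variance(congruent) / len(congruent)
                   + statistics.variance(incongruent) / len(incongruent))
    return (statistics.fmean(congruent) - statistics.fmean(incongruent)) / se

def mean_bootstrap_t(bootstrap_samples):
    """Average t-statistic across bootstraps.

    bootstrap_samples: list of (congruent_rates, incongruent_rates) pairs,
    one pair of firing-rate lists per bootstrap iteration.
    """
    return statistics.fmean([welch_t(c, i) for c, i in bootstrap_samples])
```

The averaged statistic would then be compared against the critical value of the t-distribution at the chosen α level.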
Statistics and reproducibility
All statistical tests are described in the main text or the corresponding figure legends. Error bars and shading indicate s.e.m. unless otherwise specified. Probabilities and response times were transformed with logit and log10 functions, respectively. All terms in regression models were normalized and had maximum variance inflation factors of 1.7. All comparisons were two-sided.
No statistical method was used to predetermine sample size, and neurons were randomly sampled from the ACC and OFC. No blinding was relevant and thus data collection and analysis were not performed blind to the conditions of the experiment. In many cases, data were assumed to be normally distributed but were not formally tested.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
We thank C. Ford, E. Hu, W. Liberti, L. Meckler and N. Munet for useful feedback on the manuscript. This work was funded by NIMH (R01-MH117763 and R01-MH121448 to J.D.W.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Footnotes
Code availability
The analysis code supporting the current work is available on T.W.E.’s GitHub: https://github.com/t-elston/OFCvalue-to-ACCresponse.
Competing interests
The authors declare no competing interests.
Additional information
Extended data is available for this paper at https://doi.org/10.1038/s41593-023-01407-3.
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41593-023-01407-3.
Peer review information Nature Neuroscience thanks R Becket Ebitz, Camillo Padoa-Schioppa and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Data availability
The dataset supporting the current work is available from the corresponding author upon request.
References
- 1. Camille N, Griffiths CA, Vo K, Fellows LK & Kable JW Ventromedial frontal lobe damage disrupts value maximization in humans. J. Neurosci 31, 7527–7532 (2011).
- 2. Howard JD et al. Targeted stimulation of human orbitofrontal networks disrupts outcome-guided behavior. Curr. Biol 30, 490–498 (2020).
- 3. Ballesta S, Shi W, Conen KE & Padoa-Schioppa C Values encoded in orbitofrontal cortex are causally related to economic choices. Nature 588, 450–453 (2020).
- 4. Knudsen EB & Wallis JD Closed-loop theta stimulation in the orbitofrontal cortex prevents reward-based learning. Neuron 106, 537–547 (2020).
- 5. Carmichael ST & Price JL Connectional networks within the orbital and medial prefrontal cortex of macaque monkeys. J. Comp. Neurol 371, 179–207 (1996).
- 6. Padoa-Schioppa C & Assad JA Neurons in the orbitofrontal cortex encode economic value. Nature 441, 223–226 (2006).
- 7. Kennerley SW, Dahmubed AF, Lara AH & Wallis JD Neurons in the frontal lobe encode the value of multiple decision variables. J. Cogn. Neurosci 21, 1162–1178 (2009).
- 8. Cai X & Padoa-Schioppa C Contributions of orbitofrontal and lateral prefrontal cortices to economic choice and the good-to-action transformation. Neuron 81, 1140–1151 (2014).
- 9. Rich EL & Wallis JD Decoding subjective decisions from orbitofrontal cortex. Nat. Neurosci 19, 973–980 (2016).
- 10. Balewski ZZ, Knudsen EB & Wallis JD Fast and slow contributions to decision-making in corticostriatal circuits. Neuron 110, 2170–2182 (2022).
- 11. Hunt LT et al. Triple dissociation of attention and decision computations across prefrontal cortex. Nat. Neurosci 21, 1471–1481 (2018).
- 12. Cai X & Padoa-Schioppa C Neuronal encoding of subjective value in dorsal and ventral anterior cingulate cortex. J. Neurosci 32, 3791–3808 (2012).
- 13. Luk CH & Wallis JD Choice coding in frontal cortex during stimulus-guided or action-guided decision-making. J. Neurosci 33, 1864–1871 (2013).
- 14. Hayden BY & Platt ML Neurons in anterior cingulate cortex multiplex information about reward and action. J. Neurosci 30, 3339–3346 (2010).
- 15. Matsumoto K, Suzuki W & Tanaka K Neuronal correlates of goal-based motor selection in the prefrontal cortex. Science 301, 229–232 (2003).
- 16. Barbas H & Pandya DN Architecture and intrinsic connections of the prefrontal cortex in the rhesus monkey. J. Comp. Neurol 286, 353–375 (1989).
- 17. Vogt BA & Gabriel M (eds) in Neurobiology of Cingulate Cortex and Limbic Thalamus: A Comprehensive Handbook 249–284 (Birkhaeuser, 1993).
- 18. Dum RP & Strick PL Motor areas in the frontal lobe of the primate. Physiol. Behav 77, 677–682 (2002).
- 19. Caruana F et al. Motor and emotional behaviours elicited by electrical stimulation of the human cingulate cortex. Brain 141, 3035–3051 (2018).
- 20. Parvizi J, Rangarajan V, Shirer WR, Desai N & Greicius MD The will to persevere induced by electrical stimulation of the human cingulate gyrus. Neuron 80, 1359–1367 (2013).
- 21. Walton ME, Bannerman DM & Rushworth MF The role of rat medial frontal cortex in effort-based decision making. J. Neurosci 22, 10996–11003 (2002).
- 22. Rudebeck PH, Walton ME, Smyth AN, Bannerman DM & Rushworth MF Separate neural pathways process different decision costs. Nat. Neurosci 9, 1161–1168 (2006).
- 23. Cai X & Padoa-Schioppa C Neuronal activity in dorsal anterior cingulate cortex during economic choices under variable action costs. eLife 10, e71695 (2021).
- 24. Heilbronner SR & Hayden BY Dorsal anterior cingulate cortex: a bottom-up view. Annu. Rev. Neurosci 39, 149–170 (2016).
- 25. Wallis JD Decoding cognitive processes from neural ensembles. Trends Cogn. Sci 22, 1091–1102 (2018).
- 26. Piet AT, Erlich JC, Kopec CD & Brody CD Rat prefrontal cortex inactivations during decision making are explained by bistable attractor dynamics. Neural Comput 29, 2861–2886 (2017).
- 27. Smith PL & Ratcliff R Psychology and neurobiology of simple decisions. Trends Neurosci 27, 161–168 (2004).
- 28. Usher M & McClelland JL The time course of perceptual choice: the leaky, competing accumulator model. Psychol. Rev 108, 550–592 (2001).
- 29. Gold JI & Shadlen MN Representation of a perceptual decision in developing oculomotor commands. Nature 404, 390–394 (2000).
- 30. Roitman JD & Shadlen MN Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. J. Neurosci 22, 9475–9489 (2002).
- 31. Hanks TD, Mazurek ME, Kiani R, Hopp E & Shadlen MN Elapsed decision time affects the weighting of prior probability in a perceptual decision task. J. Neurosci 31, 6339–6352 (2011).
- 32. Cisek P, Puskas GA & El-Murr S Decisions in changing conditions: the urgency-gating model. J. Neurosci 29, 11560–11571 (2009).
- 33. Ditterich J Evidence for time-variant decision making. Eur. J. Neurosci 24, 3628–3641 (2006).
- 34. Drugowitsch J, Moreno-Bote R, Churchland AK, Shadlen MN & Pouget A The cost of accumulating evidence in perceptual decision making. J. Neurosci 32, 3612–3628 (2012).
- 35. Thura D, Beauregard-Racine J, Fradet CW & Cisek P Decision making by urgency gating: theory and experimental support. J. Neurophysiol 108, 2912–2930 (2012).
- 36. Roesch MR & Olson CR Impact of expected reward on neuronal activity in prefrontal cortex, frontal and supplementary eye fields and premotor cortex. J. Neurophysiol 90, 1766–1789 (2003).
- 37. Rushworth MF, Behrens TE, Rudebeck PH & Walton ME Contrasting roles for cingulate and orbitofrontal cortex in decisions and social behaviour. Trends Cogn. Sci 11, 168–176 (2007).
- 38. Kennerley SW, Walton ME, Behrens TE, Buckley MJ & Rushworth MF Optimal decision making and the anterior cingulate cortex. Nat. Neurosci 9, 940–947 (2006).
- 39. Sallet J et al. Expectations, gains, and losses in the anterior cingulate cortex. Cogn. Affect. Behav. Neurosci 7, 327–336 (2007).
- 40. Kennerley SW, Behrens TE & Wallis JD Double dissociation of value computations in orbitofrontal and anterior cingulate neurons. Nat. Neurosci 14, 1581–1589 (2011).
- 41. Matsumoto M, Matsumoto K, Abe H & Tanaka K Medial prefrontal cell activity signaling prediction errors of action values. Nat. Neurosci 10, 647–656 (2007).
- 42. Amiez C, Joseph JP & Procyk E Anterior cingulate error-related activity is modulated by predicted reward. Eur. J. Neurosci 21, 3447–3452 (2005).
- 43. Behrens TE, Woolrich MW, Walton ME & Rushworth MF Learning the value of information in an uncertain world. Nat. Neurosci 10, 1214–1221 (2007).
- 44. Shenhav A, Botvinick MM & Cohen JD The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron 79, 217–240 (2013).
- 45. White JK et al. A neural network for information seeking. Nat. Commun 10, 5168 (2019).
- 46. Yim MY, Cai X & Wang XJ Transforming the choice outcome to an action plan in monkey lateral prefrontal cortex: a neural circuit model. Neuron 103, 520–532 (2019).
- 47. Passingham RE & Wise SP The Neurobiology of the Prefrontal Cortex: Anatomy, Evolution, and the Origin of Insight (Oxford Univ. Press, 2014).
- 48. Hunt LT, Behrens TE, Hosokawa T, Wallis JD & Kennerley SW Capturing the temporal evolution of choice across prefrontal cortex. eLife 4, e11945 (2015).
- 49. Lin Z, Nie C, Zhang Y, Chen Y & Yang T Evidence accumulation for value computation in the prefrontal cortex during decision making. Proc. Natl Acad. Sci. USA 117, 30728–30737 (2020).
- 50. Jezek K, Henriksen EJ, Treves A, Moser EI & Moser MB Theta-paced flickering between place-cell maps in the hippocampus. Nature 478, 246–249 (2011).
- 51. Mark S, Romani S, Jezek K & Tsodyks M Theta-paced flickering between place-cell maps in the hippocampus: a model based on short-term synaptic plasticity. Hippocampus 27, 959–970 (2017).
- 52. Strait CE, Blanchard TC & Hayden BY Reward value comparison via mutual inhibition in ventromedial prefrontal cortex. Neuron 82, 1357–1366 (2014).
- 53. Rustichini A & Padoa-Schioppa C A neuro-computational model of economic decisions. J. Neurophysiol 114, 1382–1398 (2015).
- 54. Hunt LT et al. Mechanisms underlying cortical activity during value-guided choice. Nat. Neurosci 15, 470–476 (2012).
- 55. Chiang FK, Wallis JD & Rich EL Cognitive strategies shift information from single neurons to populations in prefrontal cortex. Neuron 110, 709–721 (2022).
- 56. Hwang J, Mitz AR & Murray EA NIMH MonkeyLogic: behavioral control and data acquisition in MATLAB. J. Neurosci. Methods 323, 13–21 (2019).
- 57. Knudsen EB, Balewski ZZ & Wallis JD A model-based approach for targeted neurophysiology in the behaving non-human primate. In Proc. 9th International IEEE/EMBS Conference on Neural Engineering (NER) 195–198 (IEEE, 2019).
- 58. Cavanagh SE, Malalasekera WMN, Miranda B, Hunt LT & Kennerley SW Visual fixation patterns during economic choice reflect covert valuation processes that emerge with learning. Proc. Natl Acad. Sci. USA 116, 22795–22801 (2019).
- 59. Hosokawa T, Kennerley SW, Sloan J & Wallis JD Single-neuron mechanisms underlying cost-benefit analysis in frontal cortex. J. Neurosci 33, 17385–17397 (2013).